This section describes some of the concepts of population genetics that are used by this program.
If you are knowledgeable in this field, you probably don't need to read this section.
But if terms such as 'mean kinship' and 'founder equivalents' are unfamiliar to you, then you do need to read this section, as an understanding of these concepts is necessary in order to get the most out of a product such as PedScope.
A founder within a set of pedigree data is defined as an animal where neither the sire nor dam is known. Such an animal may truly be a 'founding ancestor' of a breed or 'population' in the sense that it is not related to any other founder, or it may be related - possibly closely - to other members of the group but the details of its lineage are not known. Within PedScope founders are treated as unrelated and non-inbred.
Many features of PedScope involve determining the ancestors of an animal or group, and then computing various metrics or reports based on them. In most situations PedScope gives you a way to artificially restrict the depth of ancestry that is considered. E.g. you may choose to limit inbreeding calculations to 16 generations. Whenever such an artificial cutoff is applied, 'founder' takes on a slightly different meaning. The subset under study comprises the animal(s) of interest plus the selected number of generations of their ancestors. A founder is then an animal within that subset whose sire and/or dam is either unknown or, although known in the entire population, falls outside the subset. It is as if the program 'pretends' that the deeper ancestry is simply unknown.
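The effect of a depth cutoff on who counts as a founder can be sketched in a few lines of Python. This is only an illustration, not PedScope's implementation: the function name `founders_within_depth`, the dictionary layout and the animal names are all hypothetical. It also assumes the stricter reading in which an animal is a founder only when neither parent lies inside the subset.

```python
from collections import deque

def founders_within_depth(pedigree, subjects, max_generations=None):
    """Return the founders among `subjects` and their ancestors.

    pedigree: dict mapping animal -> (sire, dam); None marks an unknown parent.
    max_generations: how many generations of ancestors to include
                     (None means no cutoff).
    """
    depth = {s: 0 for s in subjects}      # animals in the subset, with depth
    queue = deque(subjects)
    while queue:
        animal = queue.popleft()
        if max_generations is not None and depth[animal] >= max_generations:
            continue                      # pretend deeper ancestry is unknown
        for parent in pedigree.get(animal, (None, None)):
            if parent is not None and parent not in depth:
                depth[parent] = depth[animal] + 1
                queue.append(parent)

    founders = set()
    for animal in depth:
        sire, dam = pedigree.get(animal, (None, None))
        # Assumption: founder = neither parent is inside the subset.
        if sire not in depth and dam not in depth:
            founders.add(animal)
    return founders

# Hypothetical pedigree: A and B are true founders, E is the animal under study.
pedigree = {
    "A": (None, None), "B": (None, None),
    "C": ("A", "B"), "D": ("A", "B"),
    "E": ("C", "D"),
}
print(founders_within_depth(pedigree, ["E"]))                     # A and B
print(founders_within_depth(pedigree, ["E"], max_generations=1))  # C and D
```

With no cutoff the founders are A and B. With a one-generation cutoff, C and D become the founders because their parents fall outside the subset.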
Most animals have two sets of chromosomes, one inherited from each parent, and so carry one 'copy' of each gene in each set. Such genes are said to be autosomal. The particular location on a chromosome where a gene is found is called its locus. An animal therefore has 2 'copies' of a gene for any particular locus. An allele is one of two or more forms of a gene. The presence of different combinations of alleles can affect physical traits in the animal, such as coat color in some animals. This description is a deliberate simplification, but sufficient for our purposes. E.g. we ignore sex chromosomes.
If both alleles for a given locus are the same, the animal is said to be homozygous for that gene. The phrase identical by state (IBS) means the same thing: the alleles are functionally the same. If the alleles are different, the animal is said to be heterozygous for that gene.
A related term is identical by descent (IBD). Two alleles are said to be IBD if one of them is a physical copy of the other, or if they are both physical copies of the same allele in a particular ancestor. IBD implies IBS but not vice versa.
The kinship coefficient (KC) (also known as coancestry) between any two animals is the probability, for any particular locus, that an allele selected randomly from one animal is IBD to an allele selected randomly from the other.
An animal gets half its genes from its father, half from its mother. So that means that the 'kinship coefficient' between an animal and either of its parents must be one half (i.e. 50%, or 0.5), right? Wrong. The KC between an offspring and either parent is one quarter i.e. 25% (when the parents are unrelated and not inbred). Why is this? First, there is a 50% chance that any allele chosen at random in an animal is from a particular parent (the sire, say). And secondly there is then a 50% chance that it is the same as any particular allele chosen randomly in the sire. So, there is a 0.5 * 0.5 chance i.e. 0.25 or 25% that they are IBD.
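The same reasoning extends to any pair of animals through a standard recursion on the pedigree. The Python sketch below is illustrative only: the names `PEDIGREE`, `ORDER` and `kinship` and the four animals are hypothetical. It assumes the pedigree is listed with parents before offspring and that founders are unrelated and non-inbred, as described earlier.

```python
from functools import lru_cache

# Hypothetical pedigree, listed so that parents appear before offspring.
PEDIGREE = {
    "Sire": (None, None),
    "Dam": (None, None),
    "Pup": ("Sire", "Dam"),
    "Sib": ("Sire", "Dam"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}

@lru_cache(maxsize=None)
def kinship(a, b):
    """Probability that random alleles drawn from a and b are IBD."""
    if a is None or b is None:
        return 0.0                       # unknown parents are unrelated
    if a == b:
        sire, dam = PEDIGREE[a]
        # Same allele drawn twice (1/2), or the two different alleles,
        # which are IBD with probability kinship(sire, dam).
        return 0.5 * (1.0 + kinship(sire, dam))
    if ORDER[a] > ORDER[b]:
        a, b = b, a                      # make b the later-listed animal
    sire, dam = PEDIGREE[b]
    # The allele drawn from b came from its sire or its dam with equal chance.
    return 0.5 * (kinship(a, sire) + kinship(a, dam))

print(kinship("Pup", "Sire"))  # 0.25, parent and offspring
print(kinship("Pup", "Sib"))   # 0.25, full siblings
print(kinship("Sire", "Dam"))  # 0.0, unrelated founders
```

Expanding the later-listed animal guarantees the recursion never expands an ancestor before its descendant.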
Inbreeding occurs when an animal has one or more common ancestors.
The term 'common ancestor' has a very precise meaning in population genetics. It means an ancestor that is present at least once on both sides of an animal's pedigree. It does not mean an ancestor that just 'occurs a lot'.
When an animal has common ancestors, this raises the possibility that for any particular gene locus, both alleles present are in fact physical copies of the exact same allele from one of its common ancestors. The inbreeding coefficient is the value of this probability. Or, put another way, it is the probability that both alleles are IBD.
The inbreeding coefficient in common use today was first defined in 1922 in a paper by Dr Sewall Wright [1]. For this reason it is often known as 'Wright's Inbreeding Coefficient'.
When an animal has no common ancestors, its inbreeding coefficient is zero. The symbol 'F' is commonly used in scientific literature to mean an animal's inbreeding coefficient.
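Wright's path-counting formula makes the link between common ancestors and F explicit. It is shown here only as a worked illustration. The symbols n_1, n_2 and F_A are the usual textbook notation rather than anything specific to PedScope.

```latex
F_X \;=\; \sum_{\text{paths via } A} \left(\tfrac{1}{2}\right)^{n_1 + n_2 + 1} \left(1 + F_A\right)
% The sum runs over every distinct path that links the sire and the dam
% of X through a common ancestor A.
% n_1 = generations from the sire back to A
% n_2 = generations from the dam back to A
% F_A = inbreeding coefficient of A itself
%
% Half-sib mating (one shared, non-inbred parent A): n_1 = n_2 = 1
%   F_X = (1/2)^{3} = 0.125
% Full-sib mating (two shared, non-inbred parents): two such paths
%   F_X = 2 \times (1/2)^{3} = 0.25
```

Equivalently, F for an animal equals the kinship coefficient between its sire and dam.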
The inbreeding coefficient can be computed by hand only for relatively simple pedigrees. For deep pedigrees it can only realistically be done by computer. Calculating inbreeding for very large pedigrees is computationally intensive and, if not implemented very carefully, can take a long time, if it is correct at all. The computation of inbreeding coefficients in PedScope is amazingly fast. E.g. to compute all inbreeding coefficients to the maximum depth possible for the UK Golden Retriever pedigree - some 500,000+ dogs going back up to 65 generations - took just 7 seconds [2].
A partial inbreeding coefficient of an animal with respect to a specific founder is the probability that the animal is IBD for an allele descended from the specific founder. It is that part of the animal's inbreeding that is due to that founder.
PedScope can tabulate partial inbreeding coefficients when you list ancestors. For further information see Reports - Relations.
The ancestral inbreeding coefficient of an animal is the cumulative proportion of an animal's genome that has been previously exposed to inbreeding in its ancestors [3]. Ancestral inbreeding coefficients can be tabulated in the main record table.
A relationship matrix is used in population genetics to express the genetic relations within a set of animals and their ancestors. Such matrices are always square: they have one row and one column for each animal. It is accepted practice that the rows/columns are ordered such that offspring are always preceded by their parents. Thus the first row/column will always hold a founder of the group under study.
The most commonly used matrix is the matrix of additive genetic relationships. This is sometimes referred to in population genetics literature as the 'A' matrix. Each cell in this matrix, say the intersection of row I and column J, gives the additive genetic relationship between animals I and J. When I is not equal to J, i.e. for off-diagonal values, this is equal to twice the kinship coefficient between I and J. When I is equal to J, i.e. for the diagonal, this is equal to the inbreeding coefficient of animal I, plus one.
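Because parents precede their offspring, the 'A' matrix can be filled in one row at a time with the well-known tabular method. The Python sketch below is a minimal illustration, not PedScope's algorithm. The function name `relationship_matrix` and the five-animal pedigree are hypothetical, and animals are identified simply by their row index.

```python
def relationship_matrix(pedigree):
    """Build the additive relationship ('A') matrix by the tabular method.

    pedigree: list of (sire_index, dam_index) tuples, one per animal,
              ordered so that parents come before offspring.
              None marks an unknown parent.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        # Diagonal: 1 + F_i, where F_i is half the relationship
        # between the two parents (zero if either is unknown).
        f_i = 0.5 * A[s][d] if s is not None and d is not None else 0.0
        A[i][i] = 1.0 + f_i
        # Off-diagonal: average of the relationships of j with i's parents.
        for j in range(i):
            a_js = A[j][s] if s is not None else 0.0
            a_jd = A[j][d] if d is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
    return A

# Hypothetical pedigree: 0 and 1 are founders, 2 and 3 are full sibs,
# 4 is the offspring of the full-sib mating 2 x 3.
A = relationship_matrix([(None, None), (None, None), (0, 1), (0, 1), (2, 3)])
print(A[2][0])      # 0.5  = twice the parent-offspring kinship of 0.25
print(A[4][4] - 1)  # 0.25 = inbreeding coefficient of animal 4
```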
The founders of a population under study contain all the genetic diversity available to be inherited by their descendants. However, not necessarily all of the genetic variation present in the founders may have 'made it' to the current population, due to genetic drift, selection and inbreeding.
Genetic drift refers to the loss, by chance, of genetic diversity present in the founders. E.g. when a founder has only one offspring, but that offspring is then the ancestor of one or more members of the current population, then by definition at least half of the genetic diversity that was present in that founder must have been lost. That is because its sole offspring only inherited half of it. Even when a founder has many offspring and from which there may be large numbers of descendants in the current population, it is not likely that all its genes will still be present.
Larger populations, with larger numbers of founders, large families and little or no selection, i.e. truly random mating, usually maintain genetic diversity. Genetic diversity can be increased through mutation and migration i.e. the introduction of truly unrelated 'new' founders. But within smaller populations, such as 'closed' populations of fancy animals or with endangered species where there are very limited numbers of animals remaining, the extent of retained genetic diversity can be limited, and the extent to which different founders are represented within the current population can vary, sometimes greatly.
The number of founder equivalents (Lacy [4]) is a measure of the genetic diversity of a current population. It is the number of equally contributing founders that would be expected to produce the same level of genetic diversity as the current population. It has a standard symbol in population genetics literature: 'fe'. In a large population with no selection and in the absence of genetic drift fe remains relatively constant because founder contributions do not vary much from one generation to another. When there is extensive selection fe can fall, as with fancy animals, where popular sires can give rise to huge numbers of offspring and so magnify the representation of their founders within the current population at the expense of those of unpopular sires.
The number of founder genome equivalents (Lacy [4]) is a related measure which, unlike the number of founder equivalents, takes genetic drift into account. It is the number of equally contributing founders with no random loss of alleles in the offspring that would be expected to produce the same level of genetic diversity as the current population. It has the standard symbol 'fg'. The calculation of fg is more involved than fe because it requires knowledge of the extent to which each founder's alleles are present in the current population i.e. the degree of genetic drift. This can be calculated using a gene drop analysis. fg will always be less than or equal to fe.
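In the literature fe is usually computed as one over the sum of squared founder contributions, where each contribution is the expected share of the current population's genes that traces back to that founder. The Python sketch below illustrates that formula. The function name `founder_equivalents` and the example figures are hypothetical.

```python
def founder_equivalents(contributions):
    """fe = 1 / sum(p_i ** 2) over the founders.

    contributions: each founder's expected contribution to the current
                   population, in any consistent unit. Values are
                   normalised here so that they sum to 1.
    """
    total = sum(contributions)
    shares = [c / total for c in contributions]
    return 1.0 / sum(p * p for p in shares)

# Four founders contributing equally behave like four founders.
print(founder_equivalents([1, 1, 1, 1]))   # 4.0
# One dominant founder (e.g. behind a popular sire) pulls fe well below 4.
print(founder_equivalents([7, 1, 1, 1]))
```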
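Lacy's formula for fg divides each squared founder contribution by the proportion of that founder's alleles retained in the current population. The sketch below assumes those retention values are already available, for example as estimates from a gene drop analysis. The function name `founder_genome_equivalents` and the figures are hypothetical.

```python
def founder_genome_equivalents(contributions, retentions):
    """fg = 1 / sum(p_i ** 2 / r_i) over the founders.

    contributions: each founder's expected contribution (normalised here).
    retentions:    r_i, the proportion (0..1) of founder i's alleles that
                   survive in the current population.
    """
    total = sum(contributions)
    return 1.0 / sum(
        (c / total) ** 2 / r
        for c, r in zip(contributions, retentions)
        if c > 0   # a founder with no contribution adds nothing to the sum
    )

# With no allele loss (all r_i = 1) fg equals fe.
print(founder_genome_equivalents([1, 1, 1, 1], [1, 1, 1, 1]))          # 4.0
# If every founder has lost half its alleles, fg halves.
print(founder_genome_equivalents([1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5]))  # 2.0
```

Because every r_i is at most 1, each term in the sum is at least as large as the corresponding term for fe, which is why fg can never exceed fe.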
Another useful metric is the number of effective ancestors (Boichard et al. [5]). It is similar to the number of founder equivalents except that it also takes into account bottlenecks in the pedigree.
It does this by working out the marginal contribution of each ancestor - not each founder - to the current population to find the ancestor with the most influence, and repeats the process taking care not to consider the contribution of any previously identified influential ancestor more than once.
The result is a list of the most influential ancestors, and a metric, the number of effective ancestors. It has the standard symbol 'fa' in population genetics. This number will always be less than or equal to fe.
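Once the marginal contributions are known, the metric itself has the same form as fe. The formula below is shown for illustration, using q_k as an assumed symbol for the marginal contribution of the k-th selected ancestor.

```latex
f_a \;=\; \frac{1}{\sum_{k} q_k^{2}}
% q_k = marginal contribution of ancestor k, i.e. the share of the current
%       population's genes it explains that is not already explained by
%       the ancestors selected before it.
% The q_k of all selected ancestors sum to 1.
```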
The mean kinship (symbol: MK) of an animal within a group of animals is the mean of its kinship coefficient with every other member of the group, including itself. If an animal's MK is low, this means it is less related, on the whole, to the rest of the population than an animal with a higher MK.
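Given a table of pairwise kinship coefficients, MK is simply a row average. The Python sketch below uses a hypothetical three-animal group (two unrelated founders and their offspring). The function name `mean_kinships` and the layout of the kinship table are assumptions made for the example.

```python
def mean_kinships(kinship_matrix):
    """Return each animal's MK: the mean of its row, self-kinship included."""
    n = len(kinship_matrix)
    return [sum(row) / n for row in kinship_matrix]

# Rows/columns: Sire, Dam, Pup. Self-kinship of a non-inbred animal is 0.5.
K = [
    [0.50, 0.00, 0.25],
    [0.00, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
for name, mk in zip(["Sire", "Dam", "Pup"], mean_kinships(K)):
    print(name, round(mk, 4))
```

Here the pup has the highest MK, because it is related to both of the other animals while each parent is related to only one.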
MK plays an important part in breeding decisions in programmes designed to maintain genetic diversity in small populations. Leaving aside other factors, if you had to choose between one sire and another for a mating decision, it would be better to choose the one with lower MK. But, it is inferior to a related measure, kinship value.
The kinship value (symbol: KV) is a weighted variant of mean kinship. The weighting used is the reproductive value of the animal concerned; it is age-specific and defined as the expected future lifetime reproduction (Fisher's reproductive value 'Vx' [6]).
Clearly, an animal in the current population that is no longer reproductive e.g. a female nearing the end of her life, is of no value in maintaining future genetic diversity (because she can no longer produce offspring to add to the population), and yet calculations based on MK do not take this into account. Likewise it follows that if you have two animals with the same MK, but one is older than the other, then - and ignoring any other factors - a greater priority ought to be given to breeding from the older animal first, to lessen the risk that its genes are lost for good.
In order to compute KV, additional information is required. Firstly the current age of each animal is needed. Which means, in practice, that the input data to PedScope must include the date/year of birth. Secondly, the program needs the weights to be assigned to each animal. PedScope provides 2 ways in which the weights can be entered for KV calculation. For further information see Vx Data.
The gene diversity (symbol: GD) of a population is defined as 1 minus the 'mean MK' (i.e. the mean of the MKs of every animal in the current population, where each MK includes the animal's kinship with itself). The lower the mean MK of all animals in a population, the less related, on average, they are to each other. GD is simply a different way to view the same number.
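As a small illustration, GD can be read straight off a kinship table, since the mean of all MKs is just the mean of every cell. The function name `gene_diversity` and the three-animal table (two unrelated founders and their offspring) are hypothetical.

```python
def gene_diversity(kinship_matrix):
    """GD = 1 - mean MK, i.e. 1 minus the mean of every cell in the table."""
    n = len(kinship_matrix)
    mean_mk = sum(sum(row) for row in kinship_matrix) / (n * n)
    return 1.0 - mean_mk

# Rows/columns: Sire, Dam, Pup (diagonal holds each animal's self-kinship).
K = [
    [0.50, 0.00, 0.25],
    [0.00, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
print(round(gene_diversity(K), 4))
```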
Maximizing GD can be a good basis for a breeding programme e.g. for an endangered species with a small population but with known ancestry, though it is inferior to GV.
The gene value (symbol: GV) of a population is defined as 1 minus the 'mean KV' (i.e. the mean of the kinship values of every animal in the current population). GV is to KV what GD is to MK. Since KVs are, in effect, an improved form of MK, so GV is an improvement upon GD.
If you have age-structure data for the current population along with reproductive values, GV should normally be preferred over GD as the method of ranking breeding decisions.
The genome uniqueness (GU) of an animal with respect to a current population of which it is part is the probability that it contains founder alleles not present in any other single (normally) animal in that current population. GU can also be computed with respect to a specific founder in which case it is the probability that the genes from that founder are inherited 'uniquely'. GU can be used as a factor when making breeding/mate recommendations.
Informally, if an animal is the only one carrying a particular founder's genes, then you'd likely want to breed from that animal so as not to lose those genes from the population forever.
We say 'normally' here because PedScope lets you configure what is meant by genome uniqueness. You can set a threshold number of animals: if a given founder's alleles are present in that number of animals or fewer within the current population, they are considered 'unique'. For further information see Customization. A better term might be 'genome rarity' or 'founder rarity' but the term 'genome uniqueness' is already established in the literature. Genome uniqueness is determined using gene drop analysis.
A gene drop analysis (MacCluer et al. [7]) is a computer simulation technique for analyzing a group of animals comprising a current population and their ancestors.
The founders of the group are first identified. The computer then simulates Mendelian inheritance by 'dropping' a gene, comprising a pair of unique alleles for each founder, 'through' the pedigree. The actual alleles that the current descendants have 'received' are then counted up. The whole process is then repeated, usually many thousands of times. By counting the frequencies of the alleles that the descendants end up with, it is possible to work out good approximations for various metrics that otherwise would be difficult to work out exactly.
PedScope uses gene drop analysis for the computation of genome uniqueness, and the proportion of alleles retained in the current population from any given founder (which in turn is required for the calculation of the number of founder genome equivalents).
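The gene drop procedure described above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not PedScope's implementation. The function name `gene_drop`, the pedigree layout and the animal names are hypothetical. Every animal is assumed to have either both parents known or neither, and the dictionary must list parents before offspring.

```python
import random

def gene_drop(pedigree, current, iterations=10000, seed=0):
    """Estimate the proportion of each founder's alleles retained.

    pedigree: dict animal -> (sire, dam), parents listed before offspring;
              founders are the animals with (None, None).
    current:  list of animals forming the current population.
    Returns a dict founder -> estimated retention (0..1).
    """
    rng = random.Random(seed)
    founders = [a for a, (s, d) in pedigree.items() if s is None and d is None]
    retained = {f: 0 for f in founders}

    for _ in range(iterations):
        genotype = {}
        for animal, (sire, dam) in pedigree.items():
            if sire is None and dam is None:
                # Each founder starts with two unique alleles.
                genotype[animal] = ((animal, 0), (animal, 1))
            else:
                # Mendelian inheritance: one random allele from each parent.
                genotype[animal] = (rng.choice(genotype[sire]),
                                    rng.choice(genotype[dam]))
        # Distinct founder alleles still present in the current population.
        surviving = {allele for a in current for allele in genotype[a]}
        for founder, _copy in surviving:
            retained[founder] += 1

    return {f: retained[f] / (2 * iterations) for f in founders}

# Hypothetical pedigree: two founders and two full-sib offspring.
pedigree = {
    "A": (None, None), "B": (None, None),
    "C": ("A", "B"), "D": ("A", "B"),
}
print(gene_drop(pedigree, ["C"]))        # exactly 0.5 for each founder
print(gene_drop(pedigree, ["C", "D"]))   # close to 0.75 for each founder
```

With a single offspring exactly one of each founder's two alleles survives in every run, so the retention is 0.5. With two offspring the expected retention is 1 - (1/2)^2 = 0.75, which the simulation approximates.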
1. Wright, S. Coefficients of inbreeding and relationship. American Naturalist 56: 330-338, 1922.
2. Using Windows 7, 64 bit, AMD Phenom II X6 1035T processor 2.6 GHz, 8 GB memory.
3. Ballou, JD. Ancestral Inbreeding Only Minimally Affects Inbreeding Depression in Mammalian Populations. Journal of Heredity 88:169-178, 1997.
4. Lacy, RC. Analysis of founder representation in pedigrees: Founder equivalents and founder genome equivalents. Zoo Biology 8:111-124, 1989.
5. Boichard, D, Maignel L, and Verrier, E. The value of using probabilities of gene origin to measure genetic variability in a population. Genetics Selection Evolution, 1997; 29(1): 5-23.
6. Fisher, R. The Genetical Theory of Natural Selection, 1930.
7. MacCluer, JW, VandeBerg JL, Read B and Ryder OA. Pedigree analysis by computer simulation. Zoo Biology 5:147-160, 1986.