In genetics, coalescent theory is a retrospective model of population genetics that traces all alleles of a gene in a sample from a population to a single ancestral copy shared by all members of the population, known as the most recent common ancestor (MRCA; sometimes also termed the coancestor to emphasize the coalescent relationship). The inheritance relationships between alleles are typically represented as a gene genealogy, similar in form to a phylogenetic tree. This gene genealogy is also known as the coalescent; understanding the statistical properties of the coalescent under different assumptions forms the basis of coalescent theory. In the simplest case, coalescent theory assumes no recombination, no natural selection, and no gene flow or population structure. Advances in coalescent theory, however, allow extensions of the basic coalescent that can include recombination, selection, and virtually any arbitrarily complex evolutionary or demographic model in population genetic analysis. The mathematical theory of the coalescent was originally developed in the early 1980s by John Kingman.
Consider two distinct haploid organisms that differ at a single nucleotide. Tracing the ancestry of these two individuals backwards in time eventually reaches a point at which their most recent common ancestor (MRCA) is encountered and the two lineages have coalesced.
Probability of fixation
Under conditions of genetic drift alone, every finite set of genes or alleles has a "coalescent point" at which all descendants converge to a single ancestor (i.e. they 'coalesce'). This fact can be used to derive the rate of gene fixation of a neutral allele (that is, one not under any form of selection) for a population of varying size (provided that it is finite and nonzero). Because the effect of natural selection is stipulated to be negligible, the probability at any given time that an allele will ultimately become fixed at its locus is simply its frequency p in the population at that time. For example, if a population includes allele A with frequency equal to 20% and allele a with frequency equal to 80%, there is an 80% chance that after an infinite number of generations a will be fixed at the locus (assuming genetic drift is the only operating evolutionary force).
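The claim that a neutral allele fixes with probability equal to its current frequency can be checked numerically. The following is a minimal sketch, assuming a simple Wright-Fisher style resampling scheme in which every gene copy picks its parent uniformly at random from the previous generation; the function name, parameter names, and population size are illustrative choices, not part of the original article.

```python
import random

def fixation_fraction(num_copies, initial_count, replicates, seed=0):
    """Fraction of replicate populations in which the tracked allele fixes.

    Each generation, every one of the num_copies gene copies picks its
    parent uniformly at random from the previous generation (pure drift,
    no selection and no mutation).
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        count = initial_count
        # Run until the allele is either lost (0) or fixed (num_copies).
        while 0 < count < num_copies:
            freq = count / num_copies
            count = sum(1 for _ in range(num_copies) if rng.random() < freq)
        if count == num_copies:
            fixed += 1
    return fixed / replicates

if __name__ == "__main__":
    # Allele a starts at 80% frequency (16 of 20 copies), matching the example.
    print(fixation_fraction(num_copies=20, initial_count=16, replicates=2000))
```

With enough replicates the printed fraction should lie close to the starting frequency of 0.8, although any single run shows sampling noise.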
For a diploid population of size N and (neutral) mutation rate μ, the initial frequency of a novel mutation is simply $\frac{1}{2N}$ and the number of new mutations per generation is $2N\mu$. Since the fixation rate is the rate of novel neutral mutation multiplied by their probability of fixation, the overall fixation rate is $2N\mu \times \frac{1}{2N} = \mu$. Thus the rate of fixation for a mutation not subject to selection is simply the rate of introduction of such mutations.
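The cancellation of the population size in this argument can be made explicit with exact arithmetic. This is a small sketch, assuming a new mutation starts as one copy among 2N (initial frequency 1/(2N)) and that 2Nμ new mutations arise per generation; the function name and the mutation rate used are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def neutral_fixation_rate(population_size, mutation_rate):
    """Rate at which new neutral mutations arise and go on to fix, per generation."""
    gene_copies = 2 * population_size                # diploid: 2N copies per locus
    new_mutations = gene_copies * mutation_rate      # 2N * mu new mutations per generation
    fixation_probability = Fraction(1, gene_copies)  # initial frequency 1/(2N)
    return new_mutations * fixation_probability

if __name__ == "__main__":
    mu = Fraction(1, 10**6)  # hypothetical mutation rate, for illustration only
    for n in (100, 10_000, 1_000_000):
        # The rate is compared with mu for each population size.
        print(n, neutral_fixation_rate(n, mu) == mu)
```

Using `Fraction` avoids floating-point rounding, so the comparison with μ is exact regardless of the population size passed in.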
Time to coalescence
A useful analysis based on coalescence theory seeks to predict the amount of time elapsed between the introduction of a mutation and a particular allele or gene distribution in a population. This time period is equal to how long ago the most recent common ancestor existed.
The probability that two lineages coalesce in the immediately preceding generation is the probability that they share a parent. In a diploid population of constant size with 2N copies of each locus, there are 2N "potential parents" in the previous generation, so the probability that two alleles share a parent is $\frac{1}{2N}$ and correspondingly, the probability that they do not coalesce is $1 - \frac{1}{2N}$.
Going back through successive preceding generations, the time to coalescence is geometrically distributed - that is, the probability that coalescence occurs exactly t generations back is the probability of noncoalescence at the t − 1 preceding generations multiplied by the probability of coalescence at the generation of interest:
$$P_c(t) = \left(1 - \frac{1}{2N}\right)^{t-1} \left(\frac{1}{2N}\right)$$
For sufficiently large values of N, this distribution is well approximated by the continuously defined exponential distribution
$$P_c(t) = \frac{1}{2N} e^{-\frac{t-1}{2N}}$$
This exponential distribution has both its expected value and its standard deviation equal to 2N. Therefore, although the expected time to coalescence is 2N generations, actual coalescence times vary widely.
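The waiting time for two lineages to coalesce can be simulated directly from the "shared parent" argument. The sketch below assumes a constant population with 2N gene copies in which each lineage picks its parent copy uniformly at random in every generation; the function names and the value N = 25 are illustrative assumptions.

```python
import random
import statistics

def coalescence_time(num_copies, rng):
    """Generations back until two lineages pick the same parent copy."""
    t = 1
    # In each generation back, the two lineages share a parent
    # with probability 1 / num_copies.
    while rng.randrange(num_copies) != rng.randrange(num_copies):
        t += 1
    return t

def simulate(num_copies, replicates, seed=0):
    """Draw independent pairwise coalescence times."""
    rng = random.Random(seed)
    return [coalescence_time(num_copies, rng) for _ in range(replicates)]

if __name__ == "__main__":
    N = 25  # hypothetical diploid population size, so 2N = 50 gene copies
    times = simulate(2 * N, replicates=10000)
    print("mean:", statistics.mean(times))     # expected to be near 2N
    print("stdev:", statistics.pstdev(times))  # expected to be of similar size
```

Both the sample mean and the sample standard deviation should come out close to 2N, which illustrates how widely individual coalescence times scatter around their expectation.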
Coalescent theory can also be used to model the amount of variation in DNA sequences expected from genetic drift alone. This value is termed the mean heterozygosity, represented as $\bar{H}$. Mean heterozygosity is calculated as the probability of a mutation occurring at a given generation divided by the probability of any "event" at that generation (either a mutation or a coalescence). The probability of a mutation is the probability of a mutation in either of the two lineages, 2μ, and the probability of a coalescence is $\frac{1}{2N}$. Thus the mean heterozygosity is equal to
$$\bar{H} = \frac{2\mu}{2\mu + \frac{1}{2N}} = \frac{4N\mu}{1 + 4N\mu}$$
For $4N\mu \gg 1$, the vast majority of allele pairs have at least one difference in nucleotide sequence.
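The mean heterozygosity can be evaluated for a few parameter combinations to see how it behaves as 4Nμ grows. This is a minimal sketch, assuming a per-generation mutation probability of 2μ for the pair of lineages and a coalescence probability of 1/(2N); the function name and the parameter values are hypothetical.

```python
def mean_heterozygosity(population_size, mutation_rate):
    """Probability that the next event on a pair of lineages is a mutation."""
    p_mutation = 2 * mutation_rate             # mutation in either of the two lineages
    p_coalescence = 1 / (2 * population_size)  # the two lineages share a parent
    return p_mutation / (p_mutation + p_coalescence)

if __name__ == "__main__":
    # Hypothetical (N, mu) pairs chosen to span small and large values of 4*N*mu.
    for n, mu in [(10_000, 2.5e-8), (10_000, 2.5e-5), (1_000_000, 1e-4)]:
        print(f"4Nmu = {4 * n * mu:g}  H = {mean_heterozygosity(n, mu):.4f}")
```

When 4Nμ is much smaller than 1 the heterozygosity stays near zero, and when it is much larger than 1 the value approaches 1, in line with the statement that most allele pairs then differ.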
Coalescents can be visualised using dendrograms which show the relationship of branches of the population to each other. The point where two branches meet indicates the Most Recent Common Ancestor (MRCA).
Disease gene mapping
The utility of coalescent theory in the mapping of disease is slowly gaining appreciation. Although the application of the theory is still in its infancy, a number of researchers are actively developing algorithms for the analysis of human genetic data that utilise coalescent theory.
Coalescent theory is a natural extension of the more classical population genetics concept of neutral evolution and is an approximation to the Fisher-Wright (or Wright-Fisher) model for large populations. It was discovered independently by several researchers in the 1980s, but the definitive formalisation is attributed to Kingman. Major contributions to the development of coalescent theory have been made by Peter Donnelly, Robert Griffiths, Richard R Hudson and Simon Tavaré. These have included incorporating variations in population size, recombination and selection. In 1999 Jim Pitman and Serik Sagitov independently introduced coalescent processes with multiple collisions of ancestral lineages. Shortly afterwards the full class of exchangeable coalescent processes with simultaneous multiple mergers of ancestral lineages was discovered by Martin Möhle and Serik Sagitov and Jason Schweinsberg.
A large body of software exists for simulating data sets under the coalescent process, and gradually software is emerging that allows the analysis of human genetics data for the mapping of disease susceptibility loci.
|This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Coalescent_theory". A list of authors is available in Wikipedia.|