## Tajima's D
According to Motoo Kimura's neutral theory of molecular evolution, the majority of mutations in the human genome are neutral, i.e. they have no effect on fitness and survival. Across the human population as a whole, the frequency of a neutral mutation fluctuates randomly through genetic drift: the percentage of people carrying the mutation changes from one generation to the next, and it is equally likely to go up or down. The strength of genetic drift depends on the population size. If a population stays at a constant size with a constant mutation rate, it will reach an equilibrium of gene frequencies. This equilibrium has important properties, including the number of segregating sites $S$.

The purpose of Tajima's test is to identify sequences which do not fit the neutral theory model at equilibrium between mutation and genetic drift. To perform the test on a DNA sequence or gene, you need to sequence homologous DNA from at least 3 individuals. Tajima's test compares a standardized measure of the total number of segregating sites (DNA sites that are polymorphic) in the sampled DNA with the average number of mutations between pairs of sequences in the sample. If these two numbers are the same, or close enough that the difference between them is less than two standard deviations, the null hypothesis of neutrality cannot be rejected. Otherwise, the null hypothesis of neutrality is rejected.
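The two quantities that the test compares can be computed directly from aligned sequences. The following is a minimal Python sketch, assuming each sequence is an equal-length string in which "0" means "same as the reference" and "1" means "different". The function names and the sample data are hypothetical illustrations, not part of the original article.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns in which at least two sequences differ."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all n-choose-2 sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


# Hypothetical aligned sample (made-up data for illustration only).
sample = [
    "0000000000",
    "0010000010",
    "0000000010",
    "0000010010",
    "0000010110",
]

S = segregating_sites(sample)
k_hat = mean_pairwise_differences(sample)
print(S, k_hat)
```

For this made-up sample there are 4 segregating sites and an average of 1.8 differences per pair. Tajima's test asks whether these two numbers, once put on the same scale, agree as closely as neutrality predicts.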
## Hypothetical example
Each chromosome in the human genome can be represented as a long DNA sequence. Genes represent about 2% of this sequence, while much of the other 98% is "junk DNA" that does not code for a functional protein. In this simplified picture, genes are under selection, while the junk evolves randomly.

Let's say that you are a genetic researcher who finds two mutations: a mutation in a gene which causes pre-natal death, and a mutation in junk DNA which has no effect on human health or survival. You publish your findings in a scientific journal, identifying the first mutation as "under negative selection" and the second as "neutral". The neutral mutation gets passed on from one generation to the next, while the mutation under negative selection disappears, since anyone with the mutation cannot reproduce and pass it on to the next generation.

To back your discovery with more scientific evidence, you gather DNA samples from 100 people and determine the exact DNA sequence of both the gene and the junk DNA region in each of them. Using all 100 DNA samples as input, you calculate Tajima's D for both the gene and the junk DNA. If your hypothesis is correct, Tajima's test will report "neutral" for the junk DNA and "non-neutral" for the gene.

## Scientific explanation
Under the neutral theory model, for a population at constant size at equilibrium:

$$E[\hat{k}] = \theta = \frac{E[S]}{a_1} = 4N\mu$$

for diploid DNA, and

$$E[\hat{k}] = \theta = \frac{E[S]}{a_1} = 2N\mu$$

for haploid. Here $\hat{k}$ is the average number of differences between pairs of sequences in the sample, $S$ is the number of segregating sites, $n$ is the number of sequences sampled, $a_1 = \sum_{i=1}^{n-1} \frac{1}{i}$, $N$ is the effective population size, and $\mu$ is the mutation rate at the locus examined.

But selection, demographic fluctuations and other violations of the neutral model (including rate heterogeneity and introgression) will change the expected values of $S$ and $\hat{k}$, so that the two estimates of $\theta$ no longer agree.

$D$ is calculated by taking the difference between the two estimates of the population genetics parameter $\theta$. This difference is called $d$, and $D$ is calculated by dividing $d$ by the square root of its variance (its standard deviation, by definition).

Fumio Tajima demonstrated by computer simulation that the $D$ statistic described above could be modeled using a beta distribution.
If the $D$ value for a sample of sequences is outside the confidence interval, then one can reject the null hypothesis of neutral mutation for the sequence in question.

## Statistical test
When performing a statistical test such as Tajima's D, you have a null hypothesis, an alternative hypothesis, a distribution and a confidence interval. The uppercase $D$ represents how many standard deviations your lowercase $d$ lies from the mean. Suppose the answer is 2 standard deviations. To keep matters simple, let's just say that this is "really far" from the mean, so that the chance of getting a $D$ of 2 or greater at random is very small (let's say 1%). The value of $D$ obtained is then outside the 99% confidence interval for the null hypothesis, and the conclusion of the experiment is to reject the null hypothesis. In Tajima's test, the null hypothesis is that the sequence evolves neutrally in a population at equilibrium between mutation and genetic drift.

## Mathematical details

$$D = \frac{d}{\sqrt{\hat{V}(d)}} = \frac{\hat{k} - \frac{S}{a_1}}{\sqrt{e_1 S + e_2 S (S - 1)}}$$

where $\hat{k}$ and $\frac{S}{a_1}$ are two estimates of the expected number of single nucleotide polymorphisms (SNPs) between two DNA sequences under the neutral mutation model, in a sample of size $n$ from an effective population size $N$. The coefficients $e_1$ and $e_2$ depend only on $n$ and are given in Tajima's paper [1].

The first estimate is the average number of SNPs found in the $\binom{n}{2}$ pairwise comparisons of sequences $(i, j)$ in the sample:

$$\hat{k} = \frac{\sum_{i<j} k_{ij}}{\binom{n}{2}}$$

The second estimate is derived from the expected value of $S$, the total number of polymorphisms in the sample:

$$E(S) = a_1 M, \qquad a_1 = \sum_{i=1}^{n-1} \frac{1}{i}$$

Tajima defines $M = 4N\mu$, whereas Hartl & Clark use a different symbol to define the same parameter, $\theta = 4N\mu$.

## Historical example
The genetic mutation which causes sickle-cell anemia is non-neutral because it affects survival and fitness. People homozygous for the mutation have sickle-cell disease, while those who are heterozygous do not have the disease and are instead resistant to malaria. People without the mutation (homozygous for the wild-type allele) do not have the disease, but are at higher risk of malaria. Thus in Africa, where the malaria parasite is prevalent, carrying one copy of the mutation confers a survival advantage, so the mutation is shaped by selection rather than by drift alone.

## Example
Let's say you are a geneticist studying an unknown gene.
As part of your research you get DNA samples from four random people (plus yourself). To make things simple, you label your sequence as a string of zeros, and for the other four people you put a zero when their DNA is the same as yours and a one when it is different.

Lowercase $d$ is the difference between the two estimates: the average number of polymorphisms found in pairwise comparisons, minus the number of segregating sites divided by $a_1$. Since this is a statistical test, you need to divide lowercase $d$ by its standard deviation to obtain uppercase $D$.

## References
[1] Fumio Tajima (1989). Statistical Method for Testing the Neutral Mutation Hypothesis by DNA Polymorphism. Genetics, 123: 585-595.

## Computational tools for Tajima's D test
- DnaSP (Windows)
- VariScan (OS X, Linux, Windows)
- Online view of Tajima's D values in human genome
- MEGA4
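
For small data sets the statistic can also be computed by hand. The following Python sketch implements the formula from the Mathematical details section, taking the sample size, the number of segregating sites and the mean number of pairwise differences as inputs. The function name `tajimas_d` and the input numbers are hypothetical. The intermediate constants (`a2`, `b1`, `b2`, `c1`, `c2`) and the resulting `e1` and `e2` follow the definitions in Tajima's paper [1] as best recalled here; they are not spelled out in this article, so check them against the reference before relying on the result.

```python
import math


def tajimas_d(n, S, k_hat):
    """Tajima's D from sample size n, segregating sites S,
    and mean number of pairwise differences k_hat."""
    if n < 3:
        raise ValueError("the test needs at least 3 sequences")
    if S == 0:
        return float("nan")  # no polymorphism, so D is undefined

    # Harmonic-type sums over 1..n-1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))

    # Constants for the variance of d (assumed from Tajima's paper).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    d = k_hat - S / a1                   # lowercase d
    variance = e1 * S + e2 * S * (S - 1)  # estimated variance of d
    return d / math.sqrt(variance)        # uppercase D


# Hypothetical inputs: 5 sequences, 4 segregating sites,
# 1.8 differences per pair on average.
print(round(tajimas_d(n=5, S=4, k_hat=1.8), 2))
```

With these made-up inputs the sketch gives a D of about -0.41, well within two standard deviations of zero, so neutrality would not be rejected. The sketch does not compute critical values from the beta distribution; the tools listed above handle that step.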

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Tajima's_D". A list of authors is available in Wikipedia.