Ti/Tv Ratio Confirms SNP Discovery. Is This A General Rule?
5
21
14.0 years ago
Yahan ▴ 210

The transition/transversion ratio in humans is observed to be around 2.1, and this can be used to confirm the filtering in a SNP discovery project.

If, after filtering, you get a comparable ratio for a high-throughput experiment, this indicates that the result set is accurate.

Is this ratio a universal one or is it expected to be different in different organisms?

If the ti/tv ratio is unknown for your organism, can you still use it as a confirmation of the applied filter?

Thanks for the input.

snp snp • 42k views
30
14.0 years ago

Recent human studies, particularly from the 1000 Genomes Project, have been showing that for the whole human genome a ts/tv of around 2-2.1 is generally correct. This holds only when assessing the genome as a whole.

Heng Li mentions in this thread that different specific genetic regions will display different ts/tv ratios. Looking at human exomes, it appears that the ratio increases to a ts/tv of 2.8-3.0 or higher.

There is a biological premise for this, however. In the case of exomes, the theory is that the increased presence of methylated cytosine in CpG dinucleotides in exonic regions leads to an increased ts/tv ratio. This is because methylated cytosine can very easily undergo deamination and transition to a thymine.

Interestingly, this does suggest that metrics such as GC content are linked to ts/tv. However, because ts/tv is a measure of sequence changes, it is a metric that inclusively accounts for numerous other factors, be they GC content, radiation exposure, or intra-species variation. For example, when looking at YRI versus CEU individuals in the 1000 genomes, we see that ts/tv is different between the two groups.

Ultimately, ts/tv is going to differ from species to species, and even among populations and individuals in the same species. It is therefore important to estimate accuracy another way prior to using ts/tv (for example, by SNP chip or genotyping comparison), then tweak your variant calling parameters for the highest accuracy and determine what the optimal ts/tv of your genome/exome/genetic region is. You can then use that optimal ts/tv ratio as your metric to aim for.

12
14.0 years ago

there is a very useful reading from the Molecular Evolutionary Genetics Analysis software manual that states the following: "the ratio of the number of transitions to the number of transversions for a pair of sequences becomes 0.5 when there is no bias towards either transitional or transversional substitution because, when the two kinds of substitution are equally probable, there are twice as many possible transversions as transitions". this is probably something you will always have to keep in mind, because it's a basic statistical expectation.
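
The 0.5 figure in the quote can be checked by simply enumerating every possible single-base substitution. The snippet below is only a minimal sketch; the helper names `is_transition` and `titv_ratio` are made up for illustration and do not come from MEGA or any other tool.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def titv_ratio(changes):
    """Count-based ts/tv for an iterable of (ref, alt) pairs."""
    changes = list(changes)
    ts = sum(1 for ref, alt in changes if is_transition(ref, alt))
    tv = len(changes) - ts
    # With no transversions the ratio is undefined; report infinity.
    return ts / tv if tv else float("inf")


# All 12 ordered substitutions between different bases, each counted once,
# i.e. every substitution treated as equally likely.
all_substitutions = list(permutations("ACGT", 2))
unbiased_ratio = titv_ratio(all_substitutions)  # 4 transitions / 8 transversions
```

Four of the twelve substitutions are transitions and eight are transversions, which is where the 1:2 expectation comes from.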

but maybe some more reading will help to understand why this bias may not always exist: Transition-Transversion Bias Is Not Universal: A Counter Example from Grasshopper Pseudogenes

it describes how the authors found an organism where the transitions were not "favoured" over the transversions, showing that this bias shouldn't be assumed in general. also, it talks about that bias estimation in the discussion section: "It is generally assumed that the ratio of transitions (ts) to transversions (tv) is higher in animal nuclear genomes than the 1:2 ratio expected if all substitutions were equally likely, while the relative transition rate is even higher in their mitochondrial DNA".

using the approximate 2:1 ratio for evaluating a SNP discovery experiment's results is definitely a fast and valid methodology that gives the researcher a quick idea of what has happened, especially in well-known organisms such as humans where that ratio has been confirmed (although it shouldn't be used as the sole confirmation method).
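
As a rough illustration of how such a quick check could be turned into a number, one can treat the observed call set as a mixture of true variants (at the expected ts/tv) and errors (at the unbiased 0.5). This is a back-of-the-envelope sketch, not an established method from this thread: the function name `estimate_false_fraction` is hypothetical, and both the expected ratio of 2.1 and the assumption that false calls behave like random substitutions are assumptions you would need to justify for your own data.

```python
def estimate_false_fraction(observed, expected=2.1, error_ratio=0.5):
    """Estimate the fraction of false calls from a count-based ts/tv.

    Assumes (hypothetically) that true variants have ts/tv == expected
    and that erroneous calls have ts/tv == error_ratio.
    """
    if expected == error_ratio:
        raise ValueError("expected and error_ratio must differ")

    def ts_fraction(ratio):
        # Convert a ts/tv ratio into the fraction of calls that are transitions.
        return ratio / (1.0 + ratio)

    p_true = ts_fraction(expected)
    p_error = ts_fraction(error_ratio)
    p_obs = ts_fraction(observed)
    # Solve p_obs = (1 - f) * p_true + f * p_error for f.
    return (p_true - p_obs) / (p_true - p_error)


# Hypothetical example: a call set with ts/tv of 1.8 against a target of 2.1.
fraction = estimate_false_fraction(1.8)
```

A value near 0 means the call set matches the target ratio; a value near 1 means it looks like random substitutions. Because the true ts/tv varies by region and population, the estimate is only as good as the chosen target.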

1

Jorge, it looks like we are thinking with one mind on this one!

0

ha! definitely!

7
14.0 years ago

Yes, it is generally assumed that the spontaneous neutral mutation process leads to a 2:1 Ts:Tv ratio, but this is mainly derived from studies on nuclear DNA in mammals and flies or on mitochondrial DNA. However, recent work by Keller et al. in grasshoppers shows that the 2:1 Ts:Tv ratio may not be universal.

Note also there is a well-trodden gotcha in the calculation of Ts:Tv ratios that leads to an apparent non-2:1 Ts:Tv ratio: since there are twice as many possible Tv mutation types as Ts mutation types, it is necessary to rescale observed Tv counts by a factor of two before comparing ratios.

4

No, since the Ts:Tv ratio is a ratio of rates, not observed events. Imagine observing 100 sites with transitions and 100 sites with transversions. Your method would say that the Ts:Tv rate ratio is 1. But since there are 4 possible Tv mutation types and only 2 possible Ts mutation types, in this example there is actually a 2-fold higher rate of Ts mutations than Tv mutations per site. Thus, the Ts:Tv (rate) ratio is 2:1.
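
The arithmetic in this example can be written out as a small sketch. The function name `rate_ratio` and its default arguments are illustrative assumptions; the point is only that the count-based and rate-based conventions differ by a factor of two, so it matters which one a given tool or paper reports.

```python
def count_ratio(ts_count, tv_count):
    """Plain ratio of observed transitions to observed transversions."""
    return ts_count / tv_count


def rate_ratio(ts_count, tv_count, ts_types=2, tv_types=4):
    """Ratio of per-type rates: each count is divided by the number of
    possible mutation types of that kind before taking the ratio."""
    return (ts_count / ts_types) / (tv_count / tv_types)


# The example from the post: 100 transition sites and 100 transversion sites.
by_counts = count_ratio(100, 100)  # 1.0
by_rates = rate_ratio(100, 100)    # 2.0
```

When comparing your own number with a published value, check which of the two conventions the published value uses.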

0

Could you clarify your 'gotcha' statement? Isn't Ti/Tv calculated just as the number of transitions divided by the number of transversions?

0

Great, thank you for clarifying! One of the answers actually states that ti/tv is "the ratio of the number of transitions to the number of transversions", so I was even more confused.

7
14.0 years ago
lh3 33k

Although I haven't studied other species, a ts/tv of 2.1 is certainly not universal. Even in humans it varies with regions: mitochondrial DNA has a much higher ts/tv (something around 10, as I remember); ts/tv in the exome is also higher; ts/tv also varies slightly between chromosomes. Due to this regional variability, ts/tv=2.1 is only a genome-wide approximation, and you do not really know the exact ts/tv in the regions where SNPs can be called.

On the other hand, in SNP calling, you know which SNPs tend to be right and which tend to be wrong. If you choose the reliable set of SNPs to compute ts/tv, it will approach the true ts/tv. This is not a perfect estimate because there are other biases, too, but usually good enough to evaluate SNP accuracy.

EDIT: When used carefully, the ts/tv ratio is not only for a fast check, but also for distinguishing subtle differences in accuracy which no other methods can detect. In the latter use case, it is important to get a precise estimate of the "desired ts/tv" (not necessarily the true ts/tv) with the method I described above. Testing ts/tv is one of the most sensitive computational methods for evaluating SNP accuracy.
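
One way to picture the approach described above is to compute ts/tv on progressively more reliable subsets of calls and treat the value from the most reliable subset as the "desired ts/tv". The sketch below uses a made-up list of `(ref, alt, quality)` tuples and hypothetical thresholds; real data would come from your own variant caller, and quality is only one possible proxy for reliability.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv(calls):
    """Count-based ts/tv for a list of (ref, alt, quality) tuples."""
    ts = sum(1 for ref, alt, _ in calls if (ref, alt) in TRANSITIONS)
    tv = len(calls) - ts
    # With no transversions the ratio is undefined; report infinity.
    return ts / tv if tv else float("inf")


def titv_by_threshold(calls, thresholds):
    """ts/tv of the calls passing each minimum-quality threshold."""
    return {q: titv([c for c in calls if c[2] >= q]) for q in thresholds}


# Toy data, invented for illustration only.
calls = [
    ("A", "G", 60), ("C", "T", 55), ("G", "A", 50), ("T", "C", 45),
    ("A", "C", 40), ("C", "T", 30), ("G", "T", 25), ("A", "G", 20),
    ("C", "G", 15), ("T", "A", 10), ("A", "T", 5), ("G", "C", 5),
]

profile = titv_by_threshold(calls, thresholds=[0, 20, 40])
# profile == {0: 1.0, 20: 3.0, 40: 4.0}
```

In this toy set the ratio rises as low-quality calls are dropped, which is the pattern you would expect if the low-quality calls are enriched for errors.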

0

sure it varies with regions, and of course it does vary among organisms, but the ts/tv ratio is like the GC content: an estimate that helps describe a genotyping result so you can do a very fast check that everything has gone fine. sure it is not a perfect estimate, and it shouldn't be used alone, but I think it is definitely useful.

0

ts/tv is much more powerful than the GC content. It works even when most other metrics are not sensitive enough.

0

Ts/tv is much more powerful than the GC content. It works fine even when most other computational metrics fail.

1
14.0 years ago

I am not an expert on this, but I am quite certain that a transition:transversion ratio of 2.1:1 is not universal. I believe that the ratio is generally close to 2:1, but that it differs between clades. I would guess that the transition:transversion ratio is affected by the nucleotide composition of the genome, but this is speculation on my part.
