Estimating Allele Frequency From DNA Pools (Non-Uniquely Mappable Reads)
12.7 years ago
Wen.Huang ★ 1.2k

Hi everyone,

I want to ask for opinions on dealing with non-uniquely mappable reads in pooled DNA sequencing.

Instead of sequencing individual samples, DNA from multiple samples is mixed and sequenced in order to identify variants and/or estimate allele frequencies. My question is: how do we properly handle non-uniquely mapped reads in this application? If they are discarded, the allele frequency estimate will almost certainly be biased, because one allele may be uniquely mappable while the other is not. What would be the potential problems if such reads are instead assigned to a random alignment, as BWA does?
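To make the bias concern concrete, here is a minimal sketch of the expected allele frequency among retained reads when the two alleles are kept at different rates. The function name and the `keep_ref` / `keep_alt` parameters are hypothetical, introduced only for illustration; the sketch assumes every read carries exactly one of two alleles and that retention depends only on which allele it carries.

```python
def expected_observed_alt_freq(p_alt, keep_ref, keep_alt):
    """Expected ALT frequency among reads that survive a uniqueness filter.

    p_alt    -- true ALT allele frequency in the pool (0..1)
    keep_ref -- assumed probability that a REF-carrying read maps uniquely
    keep_alt -- assumed probability that an ALT-carrying read maps uniquely
    """
    kept_alt = p_alt * keep_alt            # fraction of all reads: ALT and retained
    kept_ref = (1.0 - p_alt) * keep_ref    # fraction of all reads: REF and retained
    total = kept_alt + kept_ref
    if total == 0:
        raise ValueError("no reads are retained under these settings")
    return kept_alt / total


if __name__ == "__main__":
    # True frequency 0.30; ALT reads are retained only half as often as REF reads.
    print(expected_observed_alt_freq(0.30, keep_ref=1.0, keep_alt=0.5))
```

With a true frequency of 0.30 and ALT reads retained half as often as REF reads, the expected estimate drops to roughly 0.18. When both alleles are retained at the same rate, the estimate is unbiased no matter how many reads are discarded.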

Thanks!

sequencing non read • 3.0k views
12.7 years ago

NGS data is like a tightrope act: you are balancing false positives against false negatives. Deciding which matters more will make your course of action clear. For example, I am working with pooled data and I am going to use FST to find signatures of selection. I want as few false positives as possible, so I toss reads that do not map uniquely. Also keep in mind that linkage works in your favor: if you throw out a read carrying a true SNV because it doesn't map uniquely, a linked downstream SNP that does map uniquely will give you signal as well.
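For readers unfamiliar with FST, here is a minimal sketch of one common textbook form for a biallelic site, FST = (H_T - H_S) / H_T, computed from the allele frequencies of two pools. Which estimator is actually used in the analysis above is not stated, so this is an assumption; the function name is hypothetical, the two pools are weighted equally, and no correction for pool size or read depth is applied.

```python
def fst_two_pools(p1, p2):
    """Wright-style FST at one biallelic site from two pool allele frequencies.

    p1, p2 -- frequency of the same allele in pool 1 and pool 2 (0..1)
    Assumes equal weighting of the pools and no sampling correction.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)      # expected heterozygosity, pools combined
    if h_t == 0.0:
        # Site is fixed for the same allele in both pools; FST is undefined,
        # so return 0.0 by convention (an assumption of this sketch).
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2.0   # mean within-pool value
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_two_pools(0.2, 0.8))   # strongly differentiated site
    print(fst_two_pools(0.3, 0.3))   # identical frequencies
```

Because FST depends directly on the per-pool frequencies, any filtering step that shifts those frequencies also shifts FST.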


Throwing out non-unique reads does not necessarily guarantee a smaller false positive rate. In fact, since looking only at unique reads generates bias, it may increase false positives. I look at this problem from two angles. If the purpose is to identify variants, then keeping only uniquely mapped reads is probably the right thing to do. But if the purpose is to estimate allele frequency, non-uniquely mapped reads may help reduce bias.
