12.9 years ago
Bioscientist
I see lots of people discussing using Picard MarkDuplicates to remove PCR duplicates when analyzing NGS reads.
I'm wondering what these PCR duplicates are. Are they duplicates created when preparing the libraries, or are they produced during the bridge amplification step of Illumina sequencing?
Thanks
Thanks. Actually, I'm wondering how much the samples are amplified by PCR during library construction. PCR is really powerful: say it amplifies 1000000-fold, which means that for each unique DNA fragment we get 1000000 exactly identical copies for sequencing, resulting in 1000000 "reads" mapping to the same location. So only 1 out of every 1000000 of our reads is really valuable? This seems absurd...
Only 1 of the 1000000 copies is really useful. (Strictly speaking, PCR does not amplify 1000000-fold in one step: each cycle at most doubles the template, so about 20 cycles give you roughly 1000000 copies, since 2^20 = 1048576.)
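To make the cycle arithmetic concrete, here is a minimal sketch. It assumes idealized PCR in which every molecule is copied in every cycle; the function names and the `efficiency` parameter are made up for illustration and do not come from any library. Real reactions fall short of perfect doubling, so treat the numbers as an upper bound.

```python
import math


def copies_after_cycles(cycles, efficiency=1.0):
    """Copies per starting molecule after `cycles` PCR cycles.

    efficiency=1.0 is the idealized case of perfect doubling per cycle
    (an assumption; real reactions are less efficient).
    """
    return (1.0 + efficiency) ** cycles


def cycles_for_fold(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least `fold` copies."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


print(copies_after_cycles(20))     # 1048576.0
print(cycles_for_fold(1_000_000))  # 20
```

With perfect doubling, 20 cycles give 2^20 = 1048576 copies per starting fragment, which is where the "about 20 cycles" figure comes from.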