2.4 years ago
BioStar22
Hey Biostars, I'm asking for your expertise: I am handling 90x WGS data (with PCR) and I observe many different reads starting at exactly the same positions in the genome. Interestingly, these reads show different variants. Can these variants be true (= biological duplicates), or should I assume they are false positives (= technical duplicates)?
In other words: how likely is it that reads from different cells share the same start and end positions? Is that common? It seems unlikely to me.
Any opinions on this will be appreciated!! Thanks!
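As a rough way to put a number on "how likely", here is a minimal sketch that treats read starts as a Poisson process along the genome. The read length of 150 bp, the even split between strands, and the uniform spread over `n_fragment_lengths` are all assumptions for illustration, not values from my data, and `prob_shared_start` is a hypothetical helper name.

```python
import math


def prob_shared_start(coverage, read_length, min_reads=2, n_fragment_lengths=1):
    """Probability that at least `min_reads` reads start at one given
    position on one strand, under a simple Poisson model.

    Assumptions (illustrative only): read starts are uniform along the
    genome, half of them fall on each strand, and fragment lengths are
    spread evenly over `n_fragment_lengths` values. Setting
    n_fragment_lengths > 1 approximates "same start AND same end".
    """
    # Expected number of read starts per position, per strand,
    # per distinct fragment length.
    lam = coverage / read_length / 2 / n_fragment_lengths
    # P(fewer than min_reads) summed from the Poisson probability mass function.
    p_fewer = sum(
        math.exp(-lam) * lam ** k / math.factorial(k) for k in range(min_reads)
    )
    return 1.0 - p_fewer


# Same start only, at 90x with an assumed 150 bp read length.
p_start = prob_shared_start(90, 150)
# Same start and same end, assuming 200 equally likely fragment lengths.
p_start_and_end = prob_shared_start(90, 150, n_fragment_lengths=200)
print(p_start, p_start_and_end)
```

Under these assumptions a shared start alone is not rare at 90x, while a shared start and end is much rarer per site. Real libraries are not uniform (GC bias, repeats, fragmentation hotspots), so this is only a baseline to compare against.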
It is common practice to remove duplicate reads after alignment. Did you do that?
Thanks. Yes, I used gatk MarkDuplicates, but I think it doesn't recognize those reads as duplicates: they still show up in IGV when I choose not to show duplicates. Do you think something went wrong with this step in my case?
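One way to check whether the reads were actually flagged is to look at the SAM FLAG field, where bit 0x400 marks a PCR or optical duplicate. Below is a minimal sketch that groups alignments by start site and counts how many carry that bit. It works on plain SAM text lines (for example from `samtools view`); `summarize_start_sites` is a hypothetical helper, and it uses the leftmost POS column as the start site, which is a simplification for reverse-strand or soft-clipped reads.

```python
from collections import defaultdict

UNMAPPED_FLAG = 0x4   # SAM flag bit: segment unmapped
REVERSE_FLAG = 0x10   # SAM flag bit: read mapped to the reverse strand
DUP_FLAG = 0x400      # SAM flag bit: PCR or optical duplicate


def summarize_start_sites(sam_lines):
    """Group alignments by (chrom, pos, strand).

    Returns {site: (n_reads, n_flagged_as_duplicate)}.
    Simplification: POS is the leftmost aligned base, not the
    unclipped 5' end of the read.
    """
    counts = defaultdict(lambda: [0, 0])
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            continue
        strand = "-" if flag & REVERSE_FLAG else "+"
        site = (fields[2], int(fields[3]), strand)
        counts[site][0] += 1
        if flag & DUP_FLAG:
            counts[site][1] += 1
    return {site: tuple(c) for site, c in counts.items()}


# Made-up example records, not real data.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t1024\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTT\tIIIII",
    "r3\t16\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
]
summary = summarize_start_sites(example)
print(summary)
```

If sites with many reads show few or no flagged duplicates, that does not necessarily mean the step failed: for paired-end data, MarkDuplicates compares the positions of both mates, so reads that share a start but have mates at different positions are not treated as duplicates.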