How can a duplicate read be interchromosomal?
3.7 years ago
maxrwjones ▴ 60

Hi all,

I've seen many Q&As here and elsewhere stating that the main advantage of Picard MarkDuplicates over samtools rmdup is that the former removes interchromosomal duplicates while the latter does not.

My question is: if duplicate reads are identified by sharing the same 5' start coordinate, how can reads mapping to different chromosomes (interchromosomal) ever be considered duplicates? Their coordinates would differ.

Cheers!

interchromosomal duplicate NGS samtools picard • 1.1k views
3.7 years ago
JC 13k

If the coordinates point to identical regions (duplicated chromosomal regions are common), then yes, the reads can be considered duplicates.

Thanks for your answer :)

This makes sense, and I understand that there can be duplicated regions of sequence. It was my understanding, though, that these tools do not detect sequence identity: they simply flag reads whose 5' mapped coordinate is the same as that of another read. Even duplicated regions on different chromosomes would have different coordinates, since they will be millions of bases apart in the concatenated genomic reference.

But maybe the tools are sorting by both coordinates and sequence?
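
One possible reading of "interchromosomal" here is that it refers to read pairs whose two mates map to different chromosomes, rather than to single reads on different chromosomes being duplicates of each other. The samtools documentation notes that rmdup does not work for pairs whose two ends map to different chromosomes, whereas Picard MarkDuplicates compares the 5' positions of both reads in a pair. Below is a minimal sketch of that idea, assuming a duplicate key built from the chromosome, 5' position, and strand of both mates. The names `End`, `pair_key`, and `mark_duplicates` are hypothetical, and this is not the actual implementation of either tool.

```python
from collections import defaultdict
from typing import NamedTuple


class End(NamedTuple):
    """One mapped end of a read pair (hypothetical, simplified record)."""
    chrom: str
    five_prime: int  # 5' mapped coordinate of this end
    strand: str      # "+" or "-"


def pair_key(a: End, b: End):
    # Sort the two ends so the key does not depend on which mate is listed first.
    return tuple(sorted((a, b)))


def mark_duplicates(pairs):
    """Group pairs by key and flag every pair after the first in each group.

    Real tools choose which pair to keep by a quality score; keeping the
    first one seen is a simplification made for this sketch.
    """
    groups = defaultdict(list)
    for name, (a, b) in pairs.items():
        groups[pair_key(a, b)].append(name)
    flagged = set()
    for names in groups.values():
        flagged.update(names[1:])
    return flagged


pairs = {
    # p1 and p2: mates on chr1 and chr7, same 5' ends and strands
    "p1": (End("chr1", 1000, "+"), End("chr7", 52000, "-")),
    "p2": (End("chr7", 52000, "-"), End("chr1", 1000, "+")),
    # p3 and p4: same positions but on different chromosomes, so different keys
    "p3": (End("chr1", 1000, "+"), End("chr1", 1300, "-")),
    "p4": (End("chr2", 1000, "+"), End("chr2", 1300, "-")),
}
duplicates = mark_duplicates(pairs)
print(sorted(duplicates))
```

In this sketch, `p1` and `p2` share a key even though each pair spans two chromosomes, so one of them is flagged. `p3` and `p4` never collide because the chromosome is part of the key, which matches the point that identical positions on different chromosomes are still different coordinates. No sequence comparison is involved.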
