Which Short Read Mappers Can Handle Overlapping Paired-End Reads?
13.8 years ago
Ryan Thompson ★ 3.6k

Suppose I have a paired-end data set from Illumina with 100 base pairs on each end. If any fragment is shorter than 200 base pairs, the ends of the two sequences will overlap when mapped to the genome. For example, if a particular fragment is 150 base pairs long, then the last 50 base pairs of read 1 will be the reverse complement of the last 50 base pairs of read 2.
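To make the arithmetic concrete, here is a minimal Python sketch that simulates one read pair from a fragment and checks the overlap. The function names (`revcomp`, `overlap_length`, `simulate_pair`) and the randomly generated fragment are illustrative assumptions, not part of any mapper, and the simulation ignores sequencing errors and adapters.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def overlap_length(fragment_len, read_len):
    """Number of fragment bases covered by both reads of a pair.

    For fragments at least as long as one read this is
    2 * read_len - fragment_len (floored at zero). For shorter
    fragments the whole fragment is covered by both reads.
    """
    return max(0, min(2 * read_len - fragment_len, fragment_len))

def simulate_pair(fragment, read_len):
    """Error-free paired-end reads: read 1 from the forward strand,
    read 2 from the reverse strand (hypothetical helper)."""
    read1 = fragment[:read_len]
    read2 = revcomp(fragment)[:read_len]
    return read1, read2

# Made-up 150 bp fragment, sequenced with 100 bp reads from each end.
rng = random.Random(0)
fragment = "".join(rng.choice("ACGT") for _ in range(150))
read1, read2 = simulate_pair(fragment, 100)
n = overlap_length(len(fragment), 100)  # 2 * 100 - 150 = 50
```

With these inputs, `read1[-n:]` equals `revcomp(read2[-n:])`, which is the 50 bp overlap described above.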

So, which short-read mapping programs can handle such a case? And for ones that don't, how can I work around this problem?

paired short short aligner • 5.3k views
13.8 years ago
lh3 33k

Most paired-end mappers work (maq, novoalign and bwa for sure). Overlapping ends already occurred three years ago, so this is not a new problem at all.


bowtie works too


Just to clarify: these mappers will successfully map the reads, but do they get the coverage right in the overlapping part for subsequent SNP calling? The two overlapping reads come from a single molecule, so the overlap should count as a single observation at each position. If you map a single pair of overlapping reads, do you get a coverage of two in the overlap and one in the non-overlapping portions?


No, but that's not really the aligner's job. The subsequent analysis tools for finding SNPs and such would have to explicitly consider the overlap. Many do not.
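The difference between counting reads and counting molecules can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each mate is reduced to a half-open reference interval `(start, end)` with no indels or clipping, and the function names are hypothetical rather than taken from any SNP caller.

```python
from collections import Counter

def read_depth(pairs):
    """Per-position depth when every read is counted separately."""
    depth = Counter()
    for mate1, mate2 in pairs:
        for start, end in (mate1, mate2):
            for pos in range(start, end):
                depth[pos] += 1
    return depth

def fragment_depth(pairs):
    """Per-position depth when each pair counts at most once,
    i.e. one observation per sequenced molecule."""
    depth = Counter()
    for (s1, e1), (s2, e2) in pairs:
        covered = set(range(s1, e1)) | set(range(s2, e2))
        for pos in covered:
            depth[pos] += 1
    return depth

# One 150 bp fragment sequenced with 100 bp reads from each end:
# mate 1 covers positions 0-99, mate 2 covers positions 50-149.
pairs = [((0, 100), (50, 150))]
per_read = read_depth(pairs)
per_fragment = fragment_depth(pairs)
```

Here `per_read` reports a depth of two inside the overlap (positions 50 to 99) and one elsewhere, while `per_fragment` reports one everywhere. A downstream tool that considers the overlap explicitly would need to do something like the second count.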

11.0 years ago
Adrian Pelin ★ 2.6k

I believe a completely different approach is necessary here.

The question is not which mapper can handle overlapping reads; rather, you should merge the overlapping reads together before doing any mapping. See comments #2 and #3 on the first answer.

EDIT: Tools for this include FLASH and SeqPrep. I use the latter, but there are others as well.

EDIT2: wow this thread had been dead for 3 years, nice bump:)


SeqPrep is nice, but it does not get around the problem. Your reads could overlap by only one nucleotide, in which case you cannot merge them, but your mapper still has to deal with the fact that they are overlapping.
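To illustrate why a very short overlap cannot be merged, here is a naive Python sketch of overlap-based merging. It is not how SeqPrep or FLASH actually work: it requires an exact match with no mismatches, and `merge_pair`, its `min_overlap` threshold of 10, and the random test fragments are all assumptions made for the example.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def merge_pair(read1, read2, min_overlap=10):
    """Merge a pair if the end of read 1 exactly matches the start of
    the reverse-complemented read 2 over at least min_overlap bases.

    Returns the merged sequence, or None when no such overlap exists.
    Assumes min_overlap >= 1 and a fragment at least one read long.
    """
    r2 = revcomp(read2)
    longest = min(len(read1), len(r2))
    # Try the longest candidate overlap first.
    for olap in range(longest, min_overlap - 1, -1):
        if read1[-olap:] == r2[:olap]:
            return read1 + r2[olap:]
    return None

def make_pair(fragment, read_len=100):
    """Error-free reads from both ends of a fragment (hypothetical helper)."""
    return fragment[:read_len], revcomp(fragment)[:read_len]

rng = random.Random(1)
frag_150 = "".join(rng.choice("ACGT") for _ in range(150))  # 50 bp overlap
frag_199 = "".join(rng.choice("ACGT") for _ in range(199))  # 1 bp overlap

merged_150 = merge_pair(*make_pair(frag_150))
merged_199 = merge_pair(*make_pair(frag_199))
```

The 150 bp fragment is reconstructed, but for the 199 bp fragment the one-base overlap is below the threshold, so the pair stays unmerged and both reads still map to overlapping positions. Lowering `min_overlap` to 1 would not help in practice, because a single matching base occurs by chance far too often to trust.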
