How to merge paired-end reads from sam files?
7.8 years ago
aquaq ▴ 40

Hi,

I have paired-end sequencing data. I have aligned the forward and reverse reads with bwa mem. Both reads are 120 nucleotides long and together they cover a 180-nucleotide region of the genome, so they overlap.

bwa mem -t 20 $REF $file1 $file2 > $sam

When I open the sam output file, the first lines begin like this:

M00135:404:HBJFESJSN:2:1101:2016:1297   53      ref   ...
M00135:404:HBJFESJSN:2:1101:2016:1297   133     ref   ...
M00135:404:HBJFESJSN:2:1101:2646:1297   53      ref   ...
M00135:404:HBJFESJSN:2:1101:2646:1297   133     ref   ...

For every pair, I get two lines, one for each mate aligned to the reference (I know this is the normal output). Is it possible to combine the forward and reverse reads into one sequence, giving a single 180-nucleotide alignment for each pair?
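For anyone reading the second column of SAM lines like the ones quoted above: that field is the bitwise FLAG, and it is worth decoding before trying to combine mates, because only pairs in which both mates are mapped can be combined by reference coordinates. Below is a minimal sketch of a decoder; the bit meanings come from the SAM specification, while the function and table names are just illustrative.

```python
# Bit meanings of the SAM FLAG field, as listed in the SAM specification.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG integer."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    # The two flag values that appear in the lines quoted in the question.
    for flag in (53, 133):
        print(flag, decode_flag(flag))
```

Running it on the FLAG of each record in a pair shows quickly whether the pair is a candidate for coordinate-based merging at all.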

Many thanks!

EDIT: Sorry for not being clear; I would like to merge the pairs after the alignment is done.
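To make the idea of merging after alignment concrete, here is a rough sketch of how two mapped mates could be collapsed into one sequence using their reference positions. It is not how BBMerge or pandaseq work (those merge before alignment), and it rests on a strong assumption: both CIGAR strings are a plain match such as 120M, with no indels or clipping. The function name `merge_pair` and the rule of keeping the higher-quality base in the overlap are illustrative choices, not an established method.

```python
def merge_pair(pos1, seq1, qual1, pos2, seq2, qual2):
    """Merge two aligned mates into one sequence in reference coordinates.

    Assumption: both CIGAR strings are a plain match (e.g. 120M), so base i
    of a read sits at reference position POS + i. pos1/pos2 are the 1-based
    POS fields; seq/qual are the SEQ and QUAL fields exactly as written in
    the SAM file (SEQ is already on the forward reference strand).
    """
    if len(seq1) != len(qual1) or len(seq2) != len(qual2):
        raise ValueError("SEQ and QUAL must have equal length")

    start = min(pos1, pos2)
    end = max(pos1 + len(seq1), pos2 + len(seq2))  # exclusive

    merged_seq, merged_qual = [], []
    for ref_pos in range(start, end):
        candidates = []
        for pos, seq, qual in ((pos1, seq1, qual1), (pos2, seq2, qual2)):
            offset = ref_pos - pos
            if 0 <= offset < len(seq):
                candidates.append((qual[offset], seq[offset]))
        if not candidates:
            # Position between two non-overlapping mates: base unknown.
            merged_seq.append("N")
            merged_qual.append("!")
        else:
            # Keep the base with the higher quality character; on a tie the
            # first mate wins because max() returns the first maximum.
            best_qual, best_base = max(candidates, key=lambda c: c[0])
            merged_seq.append(best_base)
            merged_qual.append(best_qual)
    return start, "".join(merged_seq), "".join(merged_qual)


if __name__ == "__main__":
    # Two 120-nt mates offset by 60 positions span 180 reference positions.
    start, seq, qual = merge_pair(1, "A" * 120, "I" * 120, 61, "A" * 120, "I" * 120)
    print(start, len(seq))
```

Reads with insertions, deletions, or soft clipping would need the CIGAR string walked explicitly (for example with a SAM/BAM library such as pysam) before positions can be compared this way.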

7.8 years ago

BBMerge can do this :)


Thanks. I have used pandaseq for this problem as well, but I would like to merge sequences after alignment, not before... I am sorry, I was not clear on this.


I'm not sure what your biological motivation is for this, but I'm completely against tampering with alignment data. Which problem are you trying to solve?


It would just be a trial. In a specific part of the sequence that we are interested in, there is a large number of mutations/sequencing errors (it was a random sequence, but it was not supposed to be that random). Before continuing with further analysis, I just wanted to be sure that this is not caused by some weird behaviour of pandaseq that I am not aware of. But I can totally accept that this is unusual; I will find another way to confirm it (e.g. by running bbmerge and comparing the results). Thanks for the help!
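The cross-check mentioned above (merging with two tools and comparing) could be sketched roughly as follows. This assumes both merged outputs are available as four-line FASTQ and that both tools keep the original read name as the first token of the header; neither assumption has been verified for pandaseq or BBMerge here, and the function names are made up for illustration.

```python
def read_fastq(lines):
    """Yield (read_name, sequence) from an iterable of 4-line FASTQ records."""
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header at line {i + 1}")
        # Use the first whitespace-delimited token as the read name.
        yield header[1:].split()[0], seq


def compare_merged(a_records, b_records):
    """Count reads that are unique to one tool, identical, or different."""
    a, b = dict(a_records), dict(b_records)
    shared = a.keys() & b.keys()
    return {
        "only_a": len(a.keys() - b.keys()),
        "only_b": len(b.keys() - a.keys()),
        "identical": sum(1 for name in shared if a[name] == b[name]),
        "different": sum(1 for name in shared if a[name] != b[name]),
    }


if __name__ == "__main__":
    tool_a = ["@r1 extra", "ACGT", "+", "IIII", "@r2", "GGGG", "+", "IIII"]
    tool_b = ["@r1", "ACGT", "+", "IIII", "@r2", "GGGA", "+", "IIII"]
    print(compare_merged(read_fastq(tool_a), read_fastq(tool_b)))
```

If the "different" count is concentrated in the region of interest, the merger is a plausible culprit; if the two tools agree there, the variation more likely comes from the reads themselves.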


I would also like to do this, and yes, after alignment, because I am using a downstream application that needs a merged PE format, but the alignments contain < 1% of the total original fastq reads, and it will be much more efficient to merge only the aligned reads. Did you try using aftermerge? How did it go?
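One way to get the efficiency gain described above without touching the alignments is to pull the names of mapped reads out of the SAM file and run the merger only on those FASTQ records. The sketch below is one possible approach, not a tested pipeline: it treats a record as mapped when the 0x4 FLAG bit is unset, keeps a pair if either mate mapped, and assumes that any trailing /1 or /2 on FASTQ names is absent from the SAM read names. The function names are hypothetical.

```python
def mapped_read_names(sam_lines):
    """Collect QNAMEs of SAM records whose FLAG lacks the 0x4 (unmapped) bit."""
    names = set()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if (int(fields[1]) & 0x4) == 0:
            names.add(fields[0])
    return names


def filter_fastq(fastq_lines, keep_names):
    """Return the lines of 4-line FASTQ records whose name is in keep_names."""
    lines = [line.rstrip("\n") for line in fastq_lines]
    kept = []
    for i in range(0, len(lines) - 3, 4):
        name = lines[i][1:].split()[0]
        # Assumption: the aligner dropped a trailing /1 or /2 from the name.
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        if name in keep_names:
            kept.extend(lines[i:i + 4])
    return kept


if __name__ == "__main__":
    sam = ["@SQ\tSN:ref\tLN:1000", "r1\t99\tref\t1", "r2\t77\t*\t0"]
    fastq = ["@r1/1", "ACGT", "+", "IIII", "@r2/1", "GGGG", "+", "IIII"]
    print(filter_fastq(fastq, mapped_read_names(sam)))
```

Applying the same filter to both the forward and the reverse FASTQ file keeps the two files in sync, so a pre-alignment merger can then be run on the reduced pair of files.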

7.8 years ago

I just saw this tool by chance, but obviously I have no idea how well it works.
