Help with BWA: how to align paired as single and then combine again
0
0
Entering edit mode
7.0 years ago
simplitia ▴ 130

Hi I'm trying to follow a work flow and I'm confused by instruction.

This is what the instruction says,

- Map reads with bwa as single end sequences then combine single reads into one file

How does one combine single reads into one file? What tried so far is this.

bwa mem -t 4 hg19 Kf_1.fq.gz  > Kf_1.sam;
bwa mem -t 4 hg19 Kf_2.fq.gz > Kf_2.sam;

I then tried to combine the sam file with picard,

java -jar picard.jar MergeSamFiles I=Kf_1.sam I=Kf_2.sam O=/sc/snpir.out/KF.sam

but this does not work because I assume headers are completely the same for both files? I get this error here,

 Cannot add sequence that already exists in SAMSequenceDictionary:

Does anyone have an idea how I can go about this? perhaps I'm not reading the instruction correctly??

thanks in advance!

RNA-Seq alignment bwa • 3.3k views
ADD COMMENT
0
Entering edit mode

Map reads with bwa as single end sequences then combine single reads into one file

As written this does not make sense. If you aligned SE reads separately then you would have to combine the alignment files (not reads) into one as you are trying to do. Is this a published protocol somewhere that you can link?

ADD REPLY
0
Entering edit mode

yes its weird I have a hard time following it; I think the authors meant merge alingments. its from a package here; https://github.com/rpiskol/SNPiR the instructions are on the bottom steps 3 and 4

ADD REPLY
0
Entering edit mode

This is an odd workflow, why you shouldn't map the pairs as pairs? Is this from some paper, some software?

bwa mem -t 4 hg19 Kf_1.fq.gz Kf_2.fq.gz > Kf.sam;
ADD REPLY
0
Entering edit mode

I have the same question so I'm bit at odd but I'm sure the authors had a good reason. Its from a paper and here is the github: https://github.com/rpiskol/SNPiR

ADD REPLY
0
Entering edit mode

Try merging the .sam files with samtools:

bwa mem -t 4 hg19 Kf_1.fq.gz | samtools view -h -b -o - - | samtools sort -o Kf_1.bam;
bwa mem -t 4 hg19 Kf_2.fq.gz | samtools view -h -b -o - - | samtools sort -o Kf_2.bam;
samtools merge Kf.bam Kf_1.bam Kf_2.bam

And, if needed, convert bam to sam:

samtools view -h -o Kf.sam Kf.bam
ADD REPLY
0
Entering edit mode

thanks this looks good. One worry I have is would the merge bam still contain duplicate header info? If that is the case other tools from picard like MarkDuplicates is going to crash with the same error.

ADD REPLY
0
Entering edit mode

would the merge bam still contain duplicate header info?

I don't know, but if it does, try the suggestion from this SAMtools mail list message.

ADD REPLY
0
Entering edit mode

Thanks. This is going to be more complicated than I expected. I will try your method out first since its the most straightforward. I will post back here on an update.

ADD REPLY

Login before adding your answer.

Traffic: 1896 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6