Question

Bowtie2 and BWA-MEM giving very different results in metagenomic data

1

5.0 years ago

Antonio Camargo ▴ 160

I've assembled a metagenome using MEGAHIT and begun testing different mapping options for binning the contigs. However, I've noticed that Bowtie2 and BWA-MEM give very different mapping rates against the assembly:

Bowtie2:

182783328 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
55100932 + 0 mapped (30.15% : N/A)
182783328 + 0 paired in sequencing
91391664 + 0 read1
91391664 + 0 read2
46385654 + 0 properly paired (25.38% : N/A)
49135024 + 0 with itself and mate mapped
5965908 + 0 singletons (3.26% : N/A)
2364578 + 0 with mate mapped to a different chr
1755482 + 0 with mate mapped to a different chr (mapQ>=5)

BWA-MEM:

184716120 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
1932792 + 0 supplementary
0 + 0 duplicates
116629821 + 0 mapped (63.14% : N/A)
182783328 + 0 paired in sequencing
91391664 + 0 read1
91391664 + 0 read2
93117324 + 0 properly paired (50.94% : N/A)
107719404 + 0 with itself and mate mapped
6977625 + 0 singletons (3.82% : N/A)
14496150 + 0 with mate mapped to a different chr
11129016 + 0 with mate mapped to a different chr (mapQ>=5)

BWA-MEM mapped about twice as many reads as Bowtie2 (63.14% vs. 30.15%). Since the metagenome was assembled from these same reads, it seems strange that Bowtie2 maps only 30% of them.

What may be causing this difference? As the .bam file will be used for binning, using the output of one tool or the other will greatly affect downstream analysis.

Thanks!

metagenomics bwa bowtie genomics alignment • 4.1k views

ADD COMMENT • link 5.0 years ago by Antonio Camargo ▴ 160

0

Can you post the bowtie2 and bwa commands used?

ADD REPLY • link 5.0 years ago by h.mon 35k

0

Sure! They were both executed with the default parameters.

bwa mem -t 80 bwa_index reads_1.fastq.gz reads_2.fastq.gz | samtools view -bS - > bwa.bam
bowtie2 --threads 80 -x bowtie_index -1 reads_1.fastq.gz -2 reads_2.fastq.gz | samtools view -bS - > bowtie2.bam
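One thing to keep in mind when comparing the two flagstat outputs: flagstat counts alignment records, and the BWA-MEM total (184716120) is the 182783328 input reads plus 1932792 supplementary records. A minimal sketch for counting mapped reads on the same basis in both files is below; the helper name `count_primary_mapped` is made up for illustration, and the BAM names are the ones from the commands above.

```shell
# Hypothetical helper: count mapped primary alignments in a BAM file.
# -c prints only the record count; -F 0x904 excludes unmapped (0x4),
# secondary (0x100), and supplementary (0x800) records, so each mapped
# read is counted at most once.
count_primary_mapped() {
    samtools view -c -F 0x904 "$1"
}

# Usage (file names as in the commands above):
#   count_primary_mapped bwa.bam
#   count_primary_mapped bowtie2.bam
```

Dividing each count by the number of input reads gives a mapping rate that is not inflated by split alignments.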

ADD REPLY • link 5.0 years ago by Antonio Camargo ▴ 160

0

This might be an ignorant question, but why assemble before you map? Are the two not mutually exclusive?

ADD REPLY • link 4.2 years ago by robert.murphy ▴ 90

0

Because I didn't have a metagenome to map to :)

ADD REPLY • link 4.1 years ago by Antonio Camargo ▴ 160

score 2 · Answer 1 · 2019-11-24

2

5.0 years ago

ATpoint 85k

I guess a fair comparison would require running bowtie2 in --local mode, as its default is end-to-end, whereas bwa mem defaults (afaik) to local / soft-clipped alignments.
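A sketch of that rerun, assuming the same index prefix and read files as in the commands posted in the question thread. The wrapper name `map_bowtie2_local` and the output file name are hypothetical; the only change to the mapping itself is the added `--local` flag.

```shell
# Hypothetical wrapper: the original Bowtie2 command with --local added,
# so reads may be soft-clipped instead of aligned end to end.
# Arguments: index prefix, read 1 file, read 2 file, output BAM.
map_bowtie2_local() {
    bowtie2 --local --threads 80 -x "$1" -1 "$2" -2 "$3" \
        | samtools view -b - > "$4"
}

# Usage (output name is an assumption):
#   map_bowtie2_local bowtie_index reads_1.fastq.gz reads_2.fastq.gz bowtie2_local.bam
#   samtools flagstat bowtie2_local.bam
```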

ADD COMMENT • link 5.0 years ago by ATpoint 85k

2

I think you are right. I knew that BWA-MEM uses soft-clipping, but I've never checked whether that's the case for Bowtie2. Indeed, Bowtie2 is end-to-end.

I'll do a test and post the result here.

ADD REPLY • link 5.0 years ago by Antonio Camargo ▴ 160

0

I don't think this is the main source of the problem (even though it certainly contributes to it). When I use local alignment, the percentage of mapped reads rises from 30.15% to 42.56%. That is a large increase, but it is still far from BWA-MEM's 63.14%.

182783328 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
77797217 + 0 mapped (42.56% : N/A)
182783328 + 0 paired in sequencing
91391664 + 0 read1
91391664 + 0 read2
66792824 + 0 properly paired (36.54% : N/A)
70257200 + 0 with itself and mate mapped
7540017 + 0 singletons (4.13% : N/A)
3182446 + 0 with mate mapped to a different chr
2437169 + 0 with mate mapped to a different chr (mapQ>=5)

I'm getting similar results for other metagenomic datasets that I'm analyzing.
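For reference, the three runs can be put on the same footing with a little arithmetic. The BWA-MEM flagstat counts 1932792 supplementary records in both its total and its mapped figure, while the Bowtie2 runs have none. The sketch below uses only numbers copied from the flagstat outputs in this thread; the variable names and the `pct` helper are made up for illustration.

```shell
# Numbers copied from the flagstat outputs in this thread.
reads=182783328            # input reads (read1 + read2)
bt2_e2e_mapped=55100932    # Bowtie2 end-to-end (default)
bt2_local_mapped=77797217  # Bowtie2 --local
bwa_mapped=116629821       # BWA-MEM, includes supplementary records
bwa_supp=1932792           # BWA-MEM supplementary records

# Hypothetical helper: percentage of n over d, two decimal places.
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f", 100 * n / d }'
}

# Remove supplementary records so every run is measured per input read.
bwa_primary=$((bwa_mapped - bwa_supp))

echo "Bowtie2 end-to-end: $(pct "$bt2_e2e_mapped" "$reads")%"
echo "Bowtie2 local:      $(pct "$bt2_local_mapped" "$reads")%"
echo "BWA-MEM (primary):  $(pct "$bwa_primary" "$reads")%"
```

With supplementary records excluded, the BWA-MEM rate drops only slightly, to about 62.75%, so split alignments account for very little of the gap.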

ADD REPLY • link 5.0 years ago by Antonio Camargo ▴ 160