Question

How to align to multiple reference genomes and discard multiply mapped reads?

5.7 years ago

nattzy94 ▴ 60

Hi,

I have reads containing E. coli, K. pneumoniae and GAPDH spike-in. I would like to align these reads to the 3 genomes and then discard reads that map to more than one of the references.

So far, I have concatenated all 3 FASTA files into 1 composite genome and then run:

bwa mem -c 1 <composite_genome.fasta> <Sample_x_R1> <Sample_x_R2>

Is this the correct way to do it, or have I gone wrong somewhere?
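One possible way to discard multiply mapped reads after aligning to a composite genome is to filter on mapping quality. This is only a sketch. It assumes that bwa mem reports MAPQ 0 for reads that align equally well to more than one location, which is its usual behaviour but is worth checking on your own data. The function name `filter_by_mapq` and the file names in the usage comment are made up for illustration. Note that this filter also drops unmapped reads and reads that are ambiguous within a single reference, not only those ambiguous between references.

```shell
# Keep SAM header lines plus alignments whose MAPQ (column 5) is >= min_mapq.
# Assumption: multiply mapped reads carry MAPQ 0, as is typical for bwa mem.
filter_by_mapq() {
    min_mapq="${1:-1}"   # default threshold of 1 drops MAPQ 0 records
    awk -v q="$min_mapq" 'BEGIN { FS = "\t" }
        /^@/    { print; next }   # header lines pass through unchanged
        $5 >= q { print }         # alignment records at or above the threshold'
}

# Hypothetical usage (file names are placeholders):
#   bwa mem composite_genome.fasta Sample_x_R1.fastq Sample_x_R2.fastq \
#       | filter_by_mapq 1 > Sample_x.unique.sam
```

If samtools is available, `samtools view -q 1` applies the same kind of MAPQ threshold.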

bwa cmd alignment • 1.9k views

score 0 · Answer 1 · 2019-03-11

5.7 years ago

h.mon 35k

BBSplit is a tool specifically designed with your goal in mind. See the tool announcement and the online documentation.
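As a rough illustration of what a BBSplit run for this case might look like, here is a small helper that only prints a command line and does not run anything. The function name `bbsplit_cmd`, the output prefix `out_`, the stats file name and all input file names are hypothetical. The parameters (`in1`, `in2`, `ref`, `basename`, `ambiguous2`, `refstats`) are written from memory of the BBSplit help text, so please check them against `bbsplit.sh` on your installed version before relying on them.

```shell
# Print (not run) a BBSplit command for paired reads and a comma-separated reference list.
# ambiguous2=toss is intended to discard reads that map ambiguously to more than one reference.
# In basename, % stands for the reference name and # for the read number (1 or 2).
bbsplit_cmd() {
    r1="$1"; r2="$2"; refs="$3"
    printf 'bbsplit.sh in1=%s in2=%s ref=%s basename=out_%%_#.fq ambiguous2=toss refstats=refstats.txt\n' \
        "$r1" "$r2" "$refs"
}

# Hypothetical usage (file names are placeholders):
#   bbsplit_cmd Sample_x_R1.fastq Sample_x_R2.fastq ecoli.fasta,kpneumoniae.fasta,gapdh.fasta
```

The `refstats` file should give per-reference read counts directly, which may save a separate counting step.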

Hey h.mon,

Thanks for the suggestion. I have used BBSplit to generate FASTQ files of the reads that mapped to each genome. Just to check: to get the number of mapped reads, do I just convert the .fastq files to .bam files and use samtools?

ADD REPLY • link 5.7 years ago by nattzy94 ▴ 60
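For counting, converting to BAM may not be needed: the reads in a per-reference FASTQ file can be counted directly. A minimal sketch, assuming uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines). The function name `count_fastq_reads` and the file name in the usage comment are made up for illustration.

```shell
# Count reads in an uncompressed FASTQ file.
# Assumption: each record occupies exactly 4 lines (header, sequence, '+', qualities).
count_fastq_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Hypothetical usage (file name is a placeholder):
#   count_fastq_reads out_ecoli_1.fq
# For gzipped input, something like `gzip -dc file.fq.gz | wc -l` divided by 4 works the same way.
```

For paired-end data this counts reads in one file, so the number of pairs equals the count from either the R1 or the R2 file.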