Read filtering strategy and BWA
7.9 years ago
Picasa ▴ 650

Hi,

I have a list of contaminant sequences. I would like to map my reads to these contaminants and then extract the unmapped reads.

I want to use BWA for this, but I am unsure about one option.

Should I use the -M option (mark shorter split hits as secondary)? I use the flags -f 4 and -f 12 to extract unmapped reads, so I am wondering whether reads flagged as secondary will be treated as unmapped.

bwa contamination • 3.0k views

A read with a secondary alignment is mapped more than once, so the secondary flag and the unmapped flag are by definition incompatible: a record marked secondary will never be selected by -f 4.
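To make the flag logic concrete, here is a minimal sketch in plain bash (no samtools needed). The bit values 4, 8 and 256 come from the SAM specification; the helper name has_bits and the three example flags are only illustrative. It mimics the test that -f applies, namely that every bit of the mask must be set in the record's FLAG.

```shell
#!/usr/bin/env bash
# SAM FLAG bits (from the SAM specification):
#   4   (0x4)   the read itself is unmapped
#   8   (0x8)   the mate is unmapped
#   256 (0x100) secondary alignment

# has_bits FLAG MASK (hypothetical helper): succeeds only when every bit
# in MASK is set in FLAG, which is how `samtools view -f MASK` selects records.
has_bits() {
  [ $(( $1 & $2 )) -eq "$2" ]
}

# 77  = 1+4+8+64    paired, read unmapped, mate unmapped, first in pair
# 73  = 1+8+64      paired, read mapped, mate unmapped, first in pair
# 353 = 1+32+64+256 paired, mapped, secondary alignment
for flag in 77 73 353; do
  if has_bits "$flag" 12; then
    verdict="selected by -f 12"
  else
    verdict="not selected by -f 12"
  fi
  echo "flag $flag: $verdict"
done
```

Because the secondary bit (256) and the unmapped bit (4) are separate, adding -M changes nothing about which records -f 4 or -f 12 pick up.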

7.9 years ago

For contaminant filtering, I find it beneficial to remove both reads in a pair if either one maps to the contaminant reference. For that you can use BBMap like this (assuming the reads are interleaved in a single file; if not, you can use "in1" and "in2"):

bbmap.sh in=reads.fq ref=contam.fasta outm=bad.sam outu=good.fq

Here, good.fq will contain all the read pairs in which neither read mapped to the reference. They will still be properly interleaved and keep their original names, which will not be the case if you use samtools to do the filtering.
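If you already have SAM output from another aligner, the same "drop the pair if either read maps" rule can be sketched with awk. This is not how BBMap does it internally; it is only an illustration of the rule, and the function name clean_pair_names and the toy records are made up for the example. It assumes tab-separated SAM text on stdin with the read name in column 1 and the FLAG in column 2.

```shell
#!/usr/bin/env bash
# clean_pair_names (hypothetical helper): read SAM text on stdin and print
# the names of reads for which no record is aligned (bit 0x4 set on all of them).
clean_pair_names() {
  awk -F'\t' '
    /^@/ { next }                          # skip SAM header lines
    { seen[$1] = 1 }                       # remember every read name
    int($2 / 4) % 2 == 0 { hit[$1] = 1 }   # bit 0x4 clear: this record is aligned
    END { for (n in seen) if (!(n in hit)) print n }
  ' | sort
}

# Toy records (name, FLAG, reference), not real data:
#   pairA: both reads unmapped      (77, 141)
#   pairB: read 1 maps, read 2 not  (73, 133)
#   pairC: both reads map           (99, 147)
sample_sam=$(printf '%s\t%s\t%s\n' \
  pairA 77 '*' \
  pairA 141 '*' \
  pairB 73 contig1 \
  pairB 133 contig1 \
  pairC 99 contig1 \
  pairC 147 contig1)

printf '%s\n' "$sample_sam" | clean_pair_names
```

Only pairA survives: pairB is dropped because one of its reads hits the contaminant, which a plain -f 4 filter would not do for the unmapped mate.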

7.9 years ago
vmicrobio ▴ 290

First align your reads and keep the aligned ones:

samtools view -b -F4 in.bam > out_mapped.bam

Then convert the BAM file to FASTQ (with bamtools, for example),

re-align against your list of contaminants, and keep only the reads that do not align:

samtools view -b -f4 in.bam > out_unmapped.bam