Exclude Reads With Xa Tag
2
0
Entering edit mode
11.7 years ago
fo3c ▴ 450

Hello there,

Does anyone know of a tool to remove reads that have alternative alignments? Similarly to filtering for specific flags, I would like to filter out all reads that have more than one alignment.

I am aware that depending on the aligner the XA tag might not be set. I used bwa, which by default does not set an XA tag if there are more than 3 alternative alignments. However, this should not be a problem since multiple alignments would result in a lower mapping quality (and can thus be removed on this basis).

A simple grep would work, but that could leave me with broken flags (filter out one read in a pair and not the other without adjusting the flag).

Any suggestions?

filtering samtools sam bam bwa • 8.0k views
ADD COMMENT
0
Entering edit mode

I don't remember if BWA reports it, but the NH tag reports the total hits, so NH:i:1 means unique hits.

ADD REPLY
4
Entering edit mode
11.7 years ago
matted 7.8k

It's a bit ugly as well, but for these types of tasks I usually identify the read names I want to filter via grep, then use grep again to remove both read pairs that match the name. This assumes that paired reads have the same name, which I think is usually the case.

For example:

samtools view in.bam | fgrep XA | cut -f 1 > bad_names.txt
samtools view -h in.bam | fgrep -vf bad_names.txt | samtools view -Sb - > out.bam

This would filter out a complete pair if either (or both) read has an XA tag, and leave the BAM consistent. If you'd prefer to keep a read whose mate has an XA tag but fix the flags, I think you'd have to do it yourself with something like pysam or Bio-Samtools (or just hack the flags and text on your own).

ADD COMMENT
1
Entering edit mode
11.7 years ago

The following java program will filter a BAM using javascript (rhino) and the picard library. It reads a BAM file/stdin and applies the javascript script to accept/reject each SamRecord. The samRecord is injected as "record" and the SamFileHeader is injected as "header" in the js context. To answer your question you can use the script:

record.getAttribute("XA")==null;

The java program, how to compile and a test file :

ADD COMMENT

Login before adding your answer.

Traffic: 2435 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6