Question

Is There A Tool (Samtools, Picard ..) To Filter Certain Flags In Bam File ?

4

Entering edit mode

13.1 years ago

madkitty ▴ 690

Is there a Picard Tool to filter one or more reads through the following :

FLAG (second column)
Mapping Quality (MAPQ 5th Column)
TLEN (9th Column)

??

samtools bam • 9.8k views

ADD COMMENT • link updated 9.4 years ago by Biostar 20 • written 13.1 years ago by madkitty ▴ 690

0

Entering edit mode

11.4 years ago

mathieu.bahin ▴ 90

How can I filter on a string in a "subtag" ?

I have an annotated BAM file (from CRAC) with an 'XE' tag. For example, the tags are:

XE:Z:0:0:Error:Unknown:1:15.0049

or

XE:Z:0:0:chimera:32:12|-1,35194979:12|-1,34872998

I would like to get only the chimeras but I can't figure out how to use 'bamtools filter' correctly (in order to be quicker than a 'grep').

Anyone knows how to do ?

Cheers

ADD COMMENT • link 11.4 years ago by mathieu.bahin ▴ 90

score 7 · Accepted Answer · 2012-06-25

Hypothetical scenario: Suppose you'd want to get all reads with flag 83=paired reads|first in pair|read reverse mapped| proper pair reads with mapping quality > 30 from a bam file and return output to a bam file:

samtools view -h input.bam | awk 'substr($0,1,1) == "@" || $2 == 83 || $5 >= 30 {print}' | samtools view -bS - > output.bam

How it works?:

samtools view -h input.bam - decodes bam and prints sam file with header to screen
awk 'substr($0,1,1) == "@" - keep all header lines
|| $2 == 83 || $5 >= 30 {print}        - the actual condition you would want to check. for unmapped reads you could use GAWK with bitwise operation and($2, 4) == 4 || and($2, 16) == 16, or something like that.
samtools view -bS          - pipe input S=sam to output b=bam and the "-" mediates samtools to take input from the screen
> output.bam               - simple unix redirection to output file

Basically, you can use the awk line to filter anything you want.

score 7 · Accepted Answer · 2012-06-25

7

Entering edit mode

13.1 years ago

Fedor Gusev ▴ 210

You can use BamTools with its filter utility. You need to write a .json file like this one:

{
"isMapped" : "true",
"mapQuality" : ">50",
"isPaired" : "true"
}

And run command

bamtools filter -in in.bam -out out.bam -script filter.json

See a tutorial here.

I don't see a way to extract TLEN with it, though.

ADD COMMENT • link 13.1 years ago by Fedor Gusev ▴ 210

1

Entering edit mode

(+1) haven't seen that. thanks.

ADD REPLY • link 13.1 years ago by Arun 2.4k

0

Entering edit mode

Will try it out. Thanks a lot :)

ADD REPLY • link 11.1 years ago by madkitty ▴ 690