Ideal samtools view f/F flags
9.6 years ago
by0 ▴ 110

I'm filtering my sam file using the following command:

samtools view -Sb -f 2 -F 256

After doing some research, I think I've figured out what the flags mean (correct me if I'm wrong).

-f 2 is keeping only reads that are "properly aligned according to the aligner"

-F 256 is discarding "secondary alignments"
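
For reference, a minimal sketch of the bit logic behind these two options: -f keeps an alignment only if every bit in its argument is set in the FLAG field, and -F drops an alignment if any bit in its argument is set. The `keeps_read` helper below is hypothetical (it is not part of samtools) and the three FLAG values are just illustrative examples.

```shell
#!/bin/sh
# Hypothetical helper, not part of samtools: mimics how -f / -F test the FLAG field.
# Usage: keeps_read FLAG REQUIRED_BITS EXCLUDED_BITS
keeps_read() {
    # -f: every bit in REQUIRED_BITS must be set in FLAG
    # -F: no bit in EXCLUDED_BITS may be set in FLAG
    [ $(( $1 & $2 )) -eq "$2" ] && [ $(( $1 & $3 )) -eq 0 ]
}

# 99  = 1 + 2 + 32 + 64  (paired, proper pair, mate reverse, first in pair)
# 355 = 99 + 256         (same bits, plus "secondary alignment")
# 65  = 1 + 64           (paired, first in pair, but not a proper pair)
for flag in 99 355 65; do
    if keeps_read "$flag" 2 256; then
        echo "$flag kept"
    else
        echo "$flag dropped"
    fi
done
```

With `-f 2 -F 256`, only FLAG 99 survives: 355 is dropped for being secondary and 65 for lacking the proper-pair bit.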

  1. I'm not really sure what "properly aligned" entails, but is it maybe too stringent? Typically I've seen people use only -F 4 to discard unmapped reads.
  2. If I'm correct, a secondary alignment is a second-tier alignment in which the read matches with a lower score than its best alignment. If so, is discarding these reasonable, or again too stringent?

I'm new to this, so I'd really value some input on how to select these two flags. I've seen a wide variety of combinations used. My reads are paired-end reads from WGS studies in yeast and E. coli, and I'm using them to call SNPs/indels with freebayes.

Thank you

alignment samtools wgs next-gen-sequencing • 5.8k views

What do you actually want to do with the resulting BAM files? It's quite possible that you don't need to do any filtering at all.


Ah, forgot to mention: I'm using the BAM for SNP calling with freebayes.

9.6 years ago

By default, freeBayes ignores unmapped reads and secondary alignments, so there's no point in filtering them. I see little benefit in filtering improperly paired alignments either: if there are a bunch of them, then they're likely correct.

9.6 years ago
alolex ▴ 960

Please see the Biostars post What Does The "Proper Pair" Bitwise Flag Mean In A Sam File? to answer your first question. With regard to the second, as I understand it, whether you have secondary alignments in your SAM file depends on the settings used by the aligner. If the aligner was set to report only the top alignment, you may not have any secondary alignments present. In general, before filtering your input, you should make sure you understand the tool you are using and the input it requires. As @Devon Ryan said, freeBayes ignores secondary alignments anyway, so there is no need for this possibly time-consuming step if that's the program you are using.
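
If it helps when checking what is actually in your file, here is a rough sketch of how a FLAG integer breaks down into its individual bits. The `describe_flag` function is a hypothetical helper written for illustration; the bit order follows the SAM specification, and the short names are the ones samtools uses. With samtools installed, `samtools flags 99` should give you the same kind of breakdown.

```shell
#!/bin/sh
# Hypothetical helper: list the SAM FLAG bits that are set in an integer.
# Names are in bit order: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048.
describe_flag() {
    flag=$1
    bit=1
    out=""
    for name in PAIRED PROPER_PAIR UNMAP MUNMAP REVERSE MREVERSE \
                READ1 READ2 SECONDARY QCFAIL DUP SUPPLEMENTARY; do
        # Append the name if this bit is set in the flag
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit * 2 ))
    done
    echo "${out:-none}"
}

describe_flag 99     # PAIRED,PROPER_PAIR,MREVERSE,READ1
describe_flag 256    # SECONDARY
```

If no alignment in your file has the SECONDARY bit (256) set, then -F 256 removes nothing.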
