Hi all,
I have a general question about whether it is recommended to process a BAM file before variant calling. I'm going to use samtools for the calling, and I'd like to know whether I should filter the BAM file first. For example: remove non-primary alignments (flag 256), remove supplementary alignments (flag 2048), keep only reads mapped in a proper pair (flag 2), or remove reads with quality below 30...
Are these steps recommended before variant calling? I've read that PCR duplicates must be removed before the call, but I'm wondering whether I should apply these other "filters" as well.
Thanks in advance
FYI, samtools mpileup will already ignore marked duplicates and secondary (not primary) alignments. Supplementary alignments are not part of the default exclusion mask, but that, like the remainder of what you mentioned, can simply be passed in as an option, rather than needing to explicitly preprocess things.
Hi Devon, do you mean that mpileup will intrinsically discard (or not take into account) reads with the not-primary flag or the supplementary flag, or reads that are not in a proper pair? Or should I pass these conditions to mpileup as arguments?
Thanks
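It ignores secondary (not primary) alignments and marked duplicates by default, unless you explicitly instruct it otherwise (see the --ff option). Supplementary alignments are not in the default --ff mask, so add flag 2048 there if you want them excluded. Beyond that, there's no need to do any of the preprocessing you mentioned with samtools mpileup. For the MAPQ score, just specify -q 30. You can also tell samtools to only use properly paired reads by specifying --rf 2.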
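In case it helps to see how these flag masks combine, here is a minimal Python sketch of the filtering logic, not samtools' actual implementation. The flag bit values come from the SAM specification and the default exclusion mask (UNMAP,SECONDARY,QCFAIL,DUP) is the one documented for --ff. The function names `passes` and `mpileup_args` and the file names `ref.fa` and `sample.bam` are made up for illustration.

```python
# SAM flag bits as defined in the SAM specification.
PROPER_PAIR = 2
UNMAP = 4
SECONDARY = 256        # "not primary alignment"
QCFAIL = 512
DUP = 1024
SUPPLEMENTARY = 2048

# Default --ff mask documented for samtools mpileup: UNMAP,SECONDARY,QCFAIL,DUP
DEFAULT_EXCLUDE = UNMAP | SECONDARY | QCFAIL | DUP   # 1796


def passes(flag, mapq, exclude=DEFAULT_EXCLUDE, require=0, min_mapq=0):
    """Hypothetical helper: would a read with this flag and MAPQ be kept?"""
    if flag & exclude:                 # any excluded bit set -> drop (--ff)
        return False
    # For a single-bit mask such as 2, "any bit set" and "all bits set"
    # are the same test, so this matches --rf 2 under either reading.
    if require and not (flag & require):
        return False
    return mapq >= min_mapq            # -q threshold


def mpileup_args(ref, bam, exclude=DEFAULT_EXCLUDE, require=0, min_mapq=0):
    """Hypothetical helper: build (but do not run) an mpileup command line."""
    args = ["samtools", "mpileup", "-f", ref, "-q", str(min_mapq),
            "--ff", str(exclude)]
    if require:
        args += ["--rf", str(require)]
    return args + [bam]


# Stricter mask that also drops supplementary alignments: 1796 + 2048 = 3844
STRICT_EXCLUDE = DEFAULT_EXCLUDE | SUPPLEMENTARY

if __name__ == "__main__":
    cmd = mpileup_args("ref.fa", "sample.bam",
                       exclude=STRICT_EXCLUDE, require=PROPER_PAIR, min_mapq=30)
    print(" ".join(cmd))
```

The printed command passes everything as options to mpileup, so the BAM file itself does not need to be filtered beforehand.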