BAM processing before variant calling
9.6 years ago
user230613 ▴ 380

Hi all,

I have a general question about whether it is recommended to process a BAM file before variant calling. I'm going to use samtools for the calling, and I'd like to know whether I should apply any filters to the BAM file beforehand. For example: remove non-primary alignments (flag 256), remove supplementary alignments (flag 2048), keep only reads mapped in a proper pair (flag 2), or remove reads with quality below 30...
Are these steps recommended before variant calling? I've read that PCR duplicates must be removed before calling, but I'm curious whether I should also apply these other "filters".
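
For readers less familiar with the numbers in the question, here is a minimal Python sketch of how those SAM flag values decode. The bit values come from the SAM specification; the names `FLAG_BITS` and `decode_flag` are made up for illustration, and the duplicate bit (1024) is included only because PCR duplicates are mentioned.

```python
# SAM FLAG bits mentioned in the question (values from the SAM specification).
# FLAG_BITS and decode_flag are illustrative names, not part of any library.
FLAG_BITS = {
    0x2: "PROPER_PAIR",      # 2: read mapped in a proper pair
    0x100: "SECONDARY",      # 256: not primary alignment
    0x400: "DUP",            # 1024: PCR or optical duplicate
    0x800: "SUPPLEMENTARY",  # 2048: supplementary alignment
}


def decode_flag(flag):
    """Return the names of the bits listed above that are set in a FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


print(decode_flag(99))    # a typical first-in-pair read mapped in a proper pair
print(decode_flag(2304))  # 256 + 2048
```

A read is "filtered by flag" simply by testing these bits, which is why the same filters can be expressed either as a preprocessing step or as options to the caller.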

Thanks in advance

variant calling • 5.1k views

FYI, samtools mpileup will already ignore marked duplicates and supplementary alignments. The remainder of what you mentioned can simply be passed in as options, rather than needing to explicitly preprocess things.

Hi Devon, do you mean that mpileup will by itself discard (or not take into account) reads with the not-primary flag or the supplementary flag, or without the proper-pair flag? Or should I pass these conditions to mpileup as arguments?

Thanks

It defaults to ignoring those, unless you explicitly instruct it otherwise (see the --ff option). There's no need to do any of the preprocessing you mentioned with samtools mpileup. For the mapq score, just specify -q 30. You can also tell samtools to only use properly paired reads by specifying --rf 2.
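
As a rough sketch of the options mentioned above, the Python snippet below only builds (and prints) a `samtools mpileup` argument list; it does not run anything. The file names `sample.bam` and `ref.fa` and the helper `mpileup_args` are hypothetical. One caveat worth checking against `samtools mpileup` help for your version: the default documented for `--ff` in the samtools manual is UNMAP,SECONDARY,QCFAIL,DUP, which does not list the supplementary bit, so the sketch sets it explicitly rather than relying on a default.

```python
import shlex

# SAM FLAG bits (values from the SAM specification).
PROPER_PAIR = 0x2
UNMAP, SECONDARY, QCFAIL, DUP, SUPPLEMENTARY = 0x4, 0x100, 0x200, 0x400, 0x800


def mpileup_args(bam, reference, min_mapq=30):
    """Build, but do not run, a samtools mpileup argument list (hypothetical helper)."""
    exclude = UNMAP | SECONDARY | QCFAIL | DUP | SUPPLEMENTARY
    return [
        "samtools", "mpileup",
        "-q", str(min_mapq),       # skip alignments with MAPQ below this value
        "--rf", str(PROPER_PAIR),  # only use reads with the proper-pair bit set
        "--ff", str(exclude),      # skip reads with any of these bits set
        "-f", reference,           # reference FASTA (placeholder path)
        bam,
    ]


# Placeholder file names; substitute your own.
print(shlex.join(mpileup_args("sample.bam", "ref.fa")))
```

Passing the exclusion mask as an integer keeps the sketch independent of whether a given version accepts flag names for `--ff`.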

9.6 years ago
Fabio Marroni ★ 3.0k

It depends on what kind of variants you want to call.

I will give you my opinion for SNPs and PAVs, assuming you have very high coverage. For CNVs I usually use the same rules as SNPs.

SNPs:

  • remove non-primary alignments: YES
  • remove supplementary alignments: YES
  • keep only reads mapped in a proper pair: YES
  • remove reads by quality: YES (but I think most SNP callers would do that anyway)

PAVs:

  • remove non-primary alignments: YES
  • remove supplementary alignments: NO
  • keep only reads mapped in a proper pair: NO
  • remove reads by quality: YES (but I think most SNP callers would do that anyway)
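
The two lists above can be sketched as per-read predicates. This is only an illustration in plain Python: `Profile`, `SNP`, `PAV` and `keep` are hypothetical names, and the quality cutoff of 30 is borrowed from the question rather than stated in this answer.

```python
from dataclasses import dataclass

# SAM FLAG bits (values from the SAM specification).
PROPER_PAIR, SECONDARY, SUPPLEMENTARY = 0x2, 0x100, 0x800


@dataclass(frozen=True)
class Profile:
    """Which of the four filters from the lists above are switched on."""
    drop_secondary: bool
    drop_supplementary: bool
    require_proper_pair: bool
    min_mapq: int


# Assumption: a cutoff of 30, taken from the question, not from this answer.
SNP = Profile(drop_secondary=True, drop_supplementary=True,
              require_proper_pair=True, min_mapq=30)
PAV = Profile(drop_secondary=True, drop_supplementary=False,
              require_proper_pair=False, min_mapq=30)


def keep(flag, mapq, profile):
    """Return True if a read with this FLAG and MAPQ passes the profile."""
    if profile.drop_secondary and flag & SECONDARY:
        return False
    if profile.drop_supplementary and flag & SUPPLEMENTARY:
        return False
    if profile.require_proper_pair and not flag & PROPER_PAIR:
        return False
    return mapq >= profile.min_mapq


# 2145 = 2048 + 64 + 32 + 1: a supplementary alignment that is not in a proper pair.
print(keep(2145, 60, SNP), keep(2145, 60, PAV))
```

The example read is rejected under the SNP profile but kept under the PAV profile, which is the only place the two lists differ.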

Hope this helps

Thank you Fabio. Sorry, I didn't mention it in the question: I'm interested in calling SNPs and indels. Are the steps for indels the same as for SNPs?

For small indels (a few bp), yes.
