How To Filter The SNP/Indel Results From Mpileup?
13.3 years ago
Haiping ▴ 110

Hi. I tried to use mpileup to identify SNPs/indels. The commands I used were:

samtools fillmd -bAr sample.sorted.bam ref.fa > sample.sorted.baq.bam

samtools mpileup -uf ref.fa aln.bam | bcftools view -bvcg - > var.raw.bcf

bcftools view var.raw.bcf | vcfutils.pl varFilter -D 100 > var.flt.vcf

I see some SNPs with QUAL values lower than 20 and some indels with QUAL values lower than 50. The problem is that I don't know whether I can trust these output results as they are, or whether I still need to filter them to make them reliable. If I need to filter, what kind of rules should I use? Thanks a lot.

Here are some of the results:

   1. 1102 . C T 3.01 . DP=10;AF1=0.4997;AC1=1;DP4=1,7,2,0;MQ=60;FQ=4.77;PV4=0.067,1,1,1 GT:PL:GQ 0/1:30,0,147:28
   2. 14689 . A G 18.1 . DP=10;AF1=0.5;AC1=1;DP4=0,6,0,4;MQ=60;FQ=21;PV4=1,1.1e-05,1,1 GT:PL:GQ 0/1:48,0,116:51
   3. 9373 . C A 44 . DP=10;AF1=0.5;AC1=1;DP4=0,6,4,0;MQ=60;FQ=47;PV4=0.0048,0.014,1,1 GT:PL:GQ 0/1:74,0,115:77
   4. 6427 . T TT 14.6 . INDEL;DP=9;AF1=0.5025;AC1=1;DP4=0,1,0,2;MQ=56;FQ=-14.7;PV4=1,1,0.12,1 GT:PL:GQ 0/1:52,0,20:23
   5. 314 . AA A 18.5 . INDEL;DP=9;AF1=0.5;AC1=1;DP4=4,0,4,0;MQ=60;FQ=18.5;PV4=1,0.0011,1,1 GT:PL:GQ 0/1:56,0,56:56
   6. 6068 . GATTAG G 214 . INDEL;DP=9;AF1=1;AC1=2;DP4=0,0,2,7;MQ=60;FQ=-61.5 GT:PL:GQ 1/1:255,27,0:51
13.3 years ago
Swbarnes2 ★ 1.6k

Your depth of coverage is kind of low. Also, most of the SNPs shown are mixed. So it's possible they are real, but I'd be skeptical. (I don't have much empirical Sanger data to back that claim up, though.) The last one has a decent quality score because it has 9 reads that all agree it's an indel.
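To make the "mixed" versus "all reads agree" distinction concrete, here is a minimal sketch that reads the DP4 tag from the INFO column. In samtools/bcftools output, DP4 holds the counts of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases. The function names are made up for illustration, and the two INFO strings are copied from records 1 and 6 in the question.

```python
def dp4_counts(info):
    """Return (ref_fwd, ref_rev, alt_fwd, alt_rev) from a VCF INFO string."""
    for entry in info.split(";"):
        if entry.startswith("DP4="):
            return tuple(int(n) for n in entry[len("DP4="):].split(","))
    raise ValueError("no DP4 tag in INFO field")


def alt_support(info):
    """Return (alt_reads, alt_fraction) among the reads counted in DP4."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = dp4_counts(info)
    alt = alt_fwd + alt_rev
    total = ref_fwd + ref_rev + alt
    return alt, (alt / total if total else 0.0)


# INFO fields of records 1 and 6 from the question
snp_1102 = "DP=10;AF1=0.4997;AC1=1;DP4=1,7,2,0;MQ=60;FQ=4.77;PV4=0.067,1,1,1"
indel_6068 = "INDEL;DP=9;AF1=1;AC1=2;DP4=0,0,2,7;MQ=60;FQ=-61.5"

print(alt_support(snp_1102))    # (2, 0.2)
print(alt_support(indel_6068))  # (9, 1.0)
```

Only 2 of the 10 counted reads support the first SNP, while all 9 support the last indel, which matches the difference in their QUAL values.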

Thanks for your response. The average depth of my data was close to 30; these are just some of the results. I am wondering whether I need to filter out calls such as 1, 2, 4 and 5, since their QUAL values are below 20 (SNPs) or below 50 (indels).
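The rule being asked about (drop SNPs with QUAL below 20 and indels with QUAL below 50) could be sketched roughly as follows. The cut-offs are simply the numbers from this thread, not a validated recommendation. The chromosome name "chr1" is an assumption, because the records in the question are shown without a CHROM column; the FORMAT and sample columns are omitted for brevity.

```python
# Cut-offs taken from the question; adjust to your own data.
SNP_MIN_QUAL = 20.0
INDEL_MIN_QUAL = 50.0


def passes_qual(vcf_line):
    """Return True if the record meets the QUAL cut-off for its variant type."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = float(fields[5])            # column 6 of a VCF record is QUAL
    info_keys = fields[7].split(";")   # column 8 is INFO
    is_indel = "INDEL" in info_keys    # mpileup/bcftools flag indels with INDEL
    threshold = INDEL_MIN_QUAL if is_indel else SNP_MIN_QUAL
    return qual >= threshold


# Records 1, 3, 5 and 6 from the question ("chr1" is a placeholder CHROM)
records = [
    "chr1\t1102\t.\tC\tT\t3.01\t.\tDP=10;AF1=0.4997;AC1=1;DP4=1,7,2,0;MQ=60;FQ=4.77;PV4=0.067,1,1,1",
    "chr1\t9373\t.\tC\tA\t44\t.\tDP=10;AF1=0.5;AC1=1;DP4=0,6,4,0;MQ=60;FQ=47;PV4=0.0048,0.014,1,1",
    "chr1\t314\t.\tAA\tA\t18.5\t.\tINDEL;DP=9;AF1=0.5;AC1=1;DP4=4,0,4,0;MQ=60;FQ=18.5;PV4=1,0.0011,1,1",
    "chr1\t6068\t.\tGATTAG\tG\t214\t.\tINDEL;DP=9;AF1=1;AC1=2;DP4=0,0,2,7;MQ=60;FQ=-61.5",
]

kept = [r.split("\t")[1] for r in records if passes_qual(r)]
print(kept)  # ['9373', '6068']
```

On a real file you would skip header lines starting with "#" and write the surviving records back out unchanged.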

13.2 years ago
Travis ★ 2.8k

The Broad Institute would probably recommend filtering based on recalibrated variant scores for a relatively low coverage experiment like this. Have a look here.

12.7 years ago
Leszek 4.2k

Besides the filtering already mentioned, I often discard calls that are supported by alignments from one strand only, as these are likely due to sequencing errors. Have a look at this discussion.
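One way to sketch such a strand check is to use the DP4 tag (ref-forward, ref-reverse, alt-forward, alt-reverse counts) and require alternate-allele reads on both strands. The function name and the `min_per_strand` parameter are hypothetical, and the INFO strings come from records 2, 3 and 6 in the question.

```python
def alt_on_both_strands(info, min_per_strand=1):
    """Return True if the alternate allele is seen on both strands (per DP4)."""
    for entry in info.split(";"):
        if entry.startswith("DP4="):
            _, _, alt_fwd, alt_rev = (int(n) for n in entry[4:].split(","))
            return alt_fwd >= min_per_strand and alt_rev >= min_per_strand
    return False  # no DP4 tag, so strand support cannot be confirmed


# INFO fields keyed by position, taken from the question
calls = {
    "14689": "DP=10;AF1=0.5;AC1=1;DP4=0,6,0,4;MQ=60;FQ=21;PV4=1,1.1e-05,1,1",
    "9373": "DP=10;AF1=0.5;AC1=1;DP4=0,6,4,0;MQ=60;FQ=47;PV4=0.0048,0.014,1,1",
    "6068": "INDEL;DP=9;AF1=1;AC1=2;DP4=0,0,2,7;MQ=60;FQ=-61.5",
}

kept = [pos for pos, info in calls.items() if alt_on_both_strands(info)]
print(kept)  # ['6068']
```

Note that the SNP at 9373 passes a QUAL cut-off of 20 but fails this check, because all 4 of its alternate reads are on the forward strand.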
