How to interpret genotypes with DP=1 for a vcf file
7.6 years ago

Dear all, I used samtools for SNP calling and vcftools for SNP filtering, and I got a VCF file with a lot of SNPs. How should I interpret the genotypes when DP=1? As I understand it, DP=1 means that the site is covered by only one read, so it could be homozygous 0/0 or 1/1, but how can it be heterozygous 0/1?

The following is what I have observed in my vcf file.

Thanks for your attention! Please help!

GT:PL:DP:SP:GQ 0/0:0,3,36:1:0:4

GT:PL:DP:SP:GQ 0/1:0,3,36:1:0:4

GT:PL:DP:SP:GQ 1/1:0,3,36:1:0:4

SNP sequencing next-gen • 3.9k views

A site can be called heterozygous if a read carries the alternative base instead of the reference base. But do you trust these DP=1 calls? A single read is very little evidence.

Edit: I see your point. You mean that with a single read carrying the alt allele, the genotype could be either homozygous or heterozygous.
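To illustrate why a single read cannot separate these cases cleanly, here is a minimal sketch of a textbook diploid genotype-likelihood model. It is not the exact samtools computation, and the base quality of 36 is an assumption chosen for illustration. Under this model a read shows the alt base with probability e (the error rate) for 0/0, 0.5 for 0/1, and 1 - e for 1/1.

```python
import math

def single_read_pl(base_is_alt, base_quality):
    """Normalized Phred-scaled likelihoods for 0/0, 0/1, 1/1 given one read.

    Simplified model (an assumption, not the samtools implementation):
    only the base-calling error rate is considered.
    """
    err = 10 ** (-base_quality / 10)
    # Probability of observing the ALT base under each genotype.
    p_alt = [err, 0.5, 1 - err]
    likelihoods = p_alt if base_is_alt else [1 - p for p in p_alt]
    raw = [-10 * math.log10(p) for p in likelihoods]
    best = min(raw)
    # Shift so the most likely genotype gets PL = 0, as in a VCF PL field.
    return [round(x - best) for x in raw]

ref_read = single_read_pl(base_is_alt=False, base_quality=36)
alt_read = single_read_pl(base_is_alt=True, base_quality=36)
print(ref_read)  # [0, 3, 36]
print(alt_read)  # [36, 3, 0]
```

In both cases the heterozygous genotype is only about 3 Phred units (a factor of two in likelihood) behind the best homozygous genotype, so one read leaves 0/1 far from ruled out.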


Thanks for your reply. I will not trust the low-DP SNPs. How can a single read carry an alternative base? I also have another question: how can we find the depth of each allele at a heterozygous site?


I am not sure about your method; I have never used vcftools for this. When I run VarScan after samtools, I get more information per sample than you do, e.g.:

GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR
1/1:117:31:21:0:21:100%:1.8578E-12:0:25:0:0:5:16

Note that DP has a different meaning in this output as well:

##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=SDP,Number=1,Type=Integer,Description="Raw Read Depth as reported by SAMtools">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Quality Read Depth of bases with Phred score >= 15">
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)">
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)">
##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency">
##FORMAT=<ID=PVAL,Number=1,Type=String,Description="P-value from Fisher's Exact Test">
##FORMAT=<ID=RBQ,Number=1,Type=Integer,Description="Average quality of reference-supporting bases (qual1)">
##FORMAT=<ID=ABQ,Number=1,Type=Integer,Description="Average quality of variant-supporting bases (qual2)">
##FORMAT=<ID=RDF,Number=1,Type=Integer,Description="Depth of reference-supporting bases on forward strand (reads1plus)">
##FORMAT=<ID=RDR,Number=1,Type=Integer,Description="Depth of reference-supporting bases on reverse strand (reads1minus)">
##FORMAT=<ID=ADF,Number=1,Type=Integer,Description="Depth of variant-supporting bases on forward strand (reads2plus)">
##FORMAT=<ID=ADR,Number=1,Type=Integer,Description="Depth of variant-supporting bases on reverse strand (reads2minus)">
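As a rough sketch of how the per-allele depths could be read from such a record, the FORMAT keys can be zipped with the sample values. The function name `parse_sample` is made up for this example; the FORMAT string and sample values are the ones quoted above.

```python
def parse_sample(format_field, sample_field):
    """Pair each FORMAT key with the matching value of one sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    return dict(zip(keys, values))

fmt = "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR"
sample = "1/1:117:31:21:0:21:100%:1.8578E-12:0:25:0:0:5:16"

call = parse_sample(fmt, sample)
ref_depth = int(call["RD"])  # reference-supporting bases
alt_depth = int(call["AD"])  # variant-supporting bases
total = ref_depth + alt_depth
# Guard against division by zero at sites with no quality-filtered bases.
alt_fraction = alt_depth / total if total else float("nan")

print(call["GT"], ref_depth, alt_depth, alt_fraction)
```

For this record the reference depth is 0 and the variant depth is 21, which agrees with the 1/1 genotype and with ADF + ADR = 5 + 16.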

The DP4 field gives the number of reads supporting, in order: the reference on the forward strand, the reference on the reverse strand, the alternative on the forward strand, and the alternative on the reverse strand.

But check the documentation to confirm that it is in that order.
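A small sketch of pulling per-allele depths out of DP4 might look like the following. It assumes DP4 sits in the INFO column and uses the order described above (ref-forward, ref-reverse, alt-forward, alt-reverse). The helper name `parse_dp4` and the INFO string are invented for illustration and do not come from the original VCF.

```python
def parse_dp4(info_field):
    """Return ref and alt read counts from a VCF INFO string, or None if DP4 is absent."""
    for entry in info_field.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP4":
            # Assumed order: ref-forward, ref-reverse, alt-forward, alt-reverse.
            ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in value.split(","))
            return {"ref": ref_fwd + ref_rev, "alt": alt_fwd + alt_rev}
    return None

example_info = "DP=12;DP4=3,4,2,3;MQ=50"  # made-up INFO string
depths = parse_dp4(example_info)
print(depths)
```

Summing the two strands per allele gives the reference and alternative depths that the question asks about for heterozygous sites.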
