Difference between "./." and "0/0" in VCF files
12.6 years ago
michealsmith ▴ 800

I used GATK to call SNPs/indels, and I called five family members (a trio plus relatives) together.

Result is like below:

G    A    89.43    PASS    AB=0.427;AC=2;AF=0.33;AN=6;BaseQRankSum=-1.468;DP=12;Dels=0.00;FS=0.000;HRun=1;HaplotypeScore=0.6373;MQ=60.00;MQ0=0;MQRankSum=0.536;QD=12.78;ReadPosRankSum=1.134;SB=-26.82;set=variant2    GT:AD:DP:GQ:PL    ./.    0/1:2,3:5:67.41:102,0,67    ./.    0/1:1,1:2:25.53:26,0,36    0/0:5,0:5:15.05:0,15,200

I'm just curious, what's the difference between "./." and "0/0" ?

They both indicate there's no alternative allele, right? Then why use two different symbols?

genotyping vcf • 10k views
12.6 years ago

They are very different. 0/0 means that sufficient data was available in the alignments to conclude the absence of alternate alleles. That is, the alignments made it clear to the caller that only reference alleles were present. In contrast, ./. means that there was insufficient data to make any conclusions about that individual's genotype. This typically happens when there are very few (if any) reads aligned for that individual.

It may seem to be a trivial difference, but the absence of data is very different from the absence of alternate alleles. For example, if you wanted to compute the minor allele frequency for this variant, each allele in the 0/0 genotype should be added to the denominator, while the alleles from the missing (./.) genotypes should be omitted, because those genotypes are unknown. Clearly, this changes the resulting frequency calculation.
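To make the frequency point concrete, here is a minimal sketch using the five GT values from the record in the question. The function name `allele_counts` is just made up for illustration, and it only looks at bare GT strings, not full sample fields. Counting called alleles only reproduces the AC=2, AN=6, AF=0.33 that GATK wrote in the INFO column, while treating ./. as 0/0 gives a different answer.

```python
def allele_counts(genotypes, alt_index=1):
    """Return (AC, AN) for one site, skipping missing alleles.

    genotypes: list of GT strings such as "0/1" or "./.".
    alt_index: which ALT allele to count (1 = first ALT).
    """
    ac = 0  # number of alleles matching alt_index
    an = 0  # number of called (non-missing) alleles
    for gt in genotypes:
        # GT alleles are separated by '/' (unphased) or '|' (phased)
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing allele: counts toward neither AC nor AN
            an += 1
            if int(allele) == alt_index:
                ac += 1
    return ac, an


# GT values of the five samples in the record from the question
gts = ["./.", "0/1", "./.", "0/1", "0/0"]

ac, an = allele_counts(gts)
af = ac / an
print(ac, an, round(af, 2))  # 2 6 0.33

# For contrast: (wrongly) treating every ./. as 0/0
as_ref = ["0/0" if gt == "./." else gt for gt in gts]
wrong_ac, wrong_an = allele_counts(as_ref)
wrong_af = wrong_ac / wrong_an
print(wrong_ac, wrong_an, round(wrong_af, 2))  # 2 10 0.2
```

So the same site has an alternate allele frequency of 0.33 or 0.2 depending on whether missing genotypes are skipped or silently counted as reference.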

This is addressed in the VCF spec: see the "Genotype fields" section.


This is very helpful, thanks. Then how should I deal with these "./." genotypes? Just discard them? I'm doing a trio analysis, comparing, say, five family members.


They are just missing information, so you just need to handle that in a manner appropriate to the interpretation of your experiment. Sorry if that is vague, but I am not sure what else to say.


One option is to use one of the imputation pipelines to fill in the missing SNPs based on the local haplotype. Have a look at Beagle, IMPUTE2 and other similar pipelines. Beagle at least should be able to use the trio/pedigree structure, which could help a lot.

Also, it won't work for filling in de novo variants, but for your family it should do a pretty good job of filling in shared missing variants.

I would note, though, that in the example you're showing, all the individuals have pretty low read depth at that site. If you've only got half a dozen individuals and very low coverage, even imputation will have issues if you're missing too many variants.

10.5 years ago
stat.1405 ▴ 30

I faced a similar problem and found that every ./. corresponded to DP=0.

So the genotype is missing because no reads cover that site in that sample.
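One way to check this on your own data is to pair each sample's GT with its per-sample DP. Below is a rough sketch on the record from the question; `parse_sample` and `sample_depth` are hypothetical helper names, not part of any library. Note that in that record the ./. samples carry no DP field at all (trailing fields are dropped), so the sketch reports their depth as None rather than 0. Whether a caller writes DP=0 or omits the field may depend on the tool and version.

```python
def parse_sample(format_keys, sample_field):
    """Map FORMAT keys to one sample's values.

    Trailing fields may be omitted in VCF, so the result can have
    fewer keys than format_keys (zip stops at the shorter list).
    """
    return dict(zip(format_keys, sample_field.split(":")))


def sample_depth(parsed):
    """Return the sample's DP as an int, or None if absent or '.'."""
    dp = parsed.get("DP", ".")
    return None if dp == "." else int(dp)


# FORMAT column and the five sample columns from the question's record
fmt = "GT:AD:DP:GQ:PL".split(":")
samples = [
    "./.",
    "0/1:2,3:5:67.41:102,0,67",
    "./.",
    "0/1:1,1:2:25.53:26,0,36",
    "0/0:5,0:5:15.05:0,15,200",
]

report = []
for i, field in enumerate(samples):
    parsed = parse_sample(fmt, field)
    gt = parsed["GT"]
    is_missing = "." in gt  # any '.' allele means the call is (partly) missing
    report.append((i, gt, sample_depth(parsed), is_missing))

for row in report:
    print(row)
```

Here samples 0 and 2 come out as missing with no depth recorded, while the called samples have DP of 5, 2 and 5.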


In my VCF files, some genotypes are coded as 0/3 or 0/5 instead of 0/1.

What does it mean? Does it mean something other than just another heterozygote call?


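This happens when there is more than one alternative allele. The numbers in the genotype are allele indices: 0 is the reference allele, and 1, 2, 3, ... are the alleles listed in the ALT column, in order.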

Suppose your reference is A and the most common alt. allele is C, but G and T can also occur.

Then,

AC -> 0/1
AG -> 0/2
AT -> 0/3

I believe this is common in indels.
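If it helps, the mapping above can be written as a tiny decoder. This is only a sketch: `decode_genotype` is a made-up name, and the REF/ALT values are the hypothetical A and C,G,T from the example, not taken from a real file.

```python
def decode_genotype(gt, ref, alt):
    """Translate GT allele indices into allele strings.

    gt:  genotype string such as "0/3" or "1|2"
    ref: REF column value
    alt: ALT column value (comma-separated if multi-allelic)
    """
    # Index 0 is REF; indices 1..n follow the order of the ALT column
    alleles = [ref] + alt.split(",")
    sep = "|" if "|" in gt else "/"
    decoded = []
    for idx in gt.split(sep):
        # '.' marks a missing allele and is passed through unchanged
        decoded.append("." if idx == "." else alleles[int(idx)])
    return sep.join(decoded)


# Hypothetical site from the example above: REF=A, ALT=C,G,T
ref, alt = "A", "C,G,T"
for gt in ["0/0", "0/1", "0/2", "0/3", "1/2", "./."]:
    print(gt, "->", decode_genotype(gt, ref, alt))
```

So 0/3 is still a heterozygous call; it is just heterozygous for the reference and the third ALT allele rather than the first.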


What organism are you working with?
