Question About Vcf4.1 File
10.9 years ago
Mathew Bunj ▴ 40

I have a VCF version 4.1 file which contains:

CHROM  POS    ID  REF  ALT  QUAL  FILTER  INFO       FORMAT  Normal  Tumor
1      11049  .   G    .    .     .       NS=2;AN=2  GT:PS   0/0:.   ./.:.

Where can I find the meaning of these: 0/0:. and ./.:. ?

I read http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 and could not work it out. Any pointers would be helpful.

vcf • 3.3k views
10.9 years ago

0/0 means both alleles match the reference allele, while ./. means the genotype is unknown (no call was made). The PS value "." means no phase set is assigned, i.e. the genotype is unphased (which is also why the genotype is written 0/0 instead of 0|0 or something like that).
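
To make the layout concrete, here is a minimal Python sketch (not taken from any VCF library; the function name `split_sample_field` is made up for this example). It shows how the keys in the FORMAT column line up with the colon-separated values in each sample column, using the record from the question. Padding short sample columns with "." is an assumption of the sketch.

```python
def split_sample_field(format_field, sample_field):
    """Pair FORMAT keys with the colon-separated values of one sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # Assumption of this sketch: a sample column with fewer values than
    # FORMAT keys is padded with the missing value ".".
    values += ["."] * (len(keys) - len(values))
    return dict(zip(keys, values))


# Values taken from the record in the question (FORMAT is GT:PS).
normal = split_sample_field("GT:PS", "0/0:.")
tumor = split_sample_field("GT:PS", "./.:.")
print(normal)  # {'GT': '0/0', 'PS': '.'}
print(tumor)   # {'GT': './.', 'PS': '.'}
```

So for the Normal sample, GT is 0/0 and PS is missing; for the Tumor sample, both GT and PS are missing.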


Thanks, that is helpful. Is there any documentation I can refer to in the future?


I actually got all of that information from the link you provided :oP Having said that, the specification is now on GitHub, so you might as well look at it there in the future. If you're not that comfortable reading these sorts of specs (understandable, depending on your background), you might be best off searching Biostars for other examples with explanations.


What do "unphased" and "phased" mean?

9.1 years ago

On that page, click the link to the PDF version of the 4.1 spec (http://samtools.github.io/hts-specs/VCFv4.1.pdf). In the PDF, go to section 1.4.2 (Genotype fields).

0/0 means the diploid sample has been genotyped, the genotype is unphased, and both alleles match the reference.

./. means the genotype of the diploid sample is missing (i.e. no call could be made for either allele).

Copied from the PDF:

GT : genotype, encoded as allele values separated by either of / or |. The allele values are 0 for the reference allele (what is in the REF field), 1 for the first allele listed in ALT, 2 for the second allele list in ALT and so on. For diploid calls examples could be 0/1, 1 | 0, or 1/2, etc. For haploid calls, e.g. on Y, male non-pseudoautosomal X, or mitochondrion, only one allele value should be given; a triploid call might look like 0/0/1. If a call cannot be made for a sample at a given locus, '.' should be specified for each missing allele in the GT field (for example './.' for a diploid genotype and '.' for haploid genotype). The meanings of the separators are as follows (see the PS field below for more details on incorporating phasing information into the genotypes):

/ : genotype unphased
| : genotype phased
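
As a rough illustration of the rules quoted above, here is a small Python sketch that decodes a GT string into allele bases plus a phased flag. The function name `parse_gt` and its return shape are invented for this example and do not come from the spec or any library. A haploid call has no separator, so reporting it as unphased is just a simplification of the sketch.

```python
import re


def parse_gt(gt, ref, alts):
    """Decode a GT string into (alleles, phased).

    A missing allele ('.') is returned as None.
    """
    # '|' marks a phased genotype, '/' an unphased one.
    phased = "|" in gt
    # Index 0 is the REF allele, 1 is the first ALT allele, and so on.
    alleles_by_index = [ref] + list(alts)
    alleles = []
    for code in re.split(r"[/|]", gt):
        alleles.append(None if code == "." else alleles_by_index[int(code)])
    return alleles, phased


print(parse_gt("0/0", "G", []))       # (['G', 'G'], False)
print(parse_gt("./.", "G", []))       # ([None, None], False)
print(parse_gt("1|0", "G", ["A"]))    # (['A', 'G'], True)
print(parse_gt("0/0/1", "G", ["A"]))  # (['G', 'G', 'A'], False)
```

The first two calls use the genotypes from the question; the last two use example genotypes mentioned in the quoted text, with a hypothetical ALT allele "A".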
