Hi!
This might sound like a completely stupid, lazy, and silly question, but please help me understand the acronyms in a VCF file. I did have a look at the VCF format PDF and only got more confused. My data is in a VCF file, version 4.2:
chr1 10043 . T C 21 PASS 3 GT:DP:FT:GQ:GL 1/0:8:PASS:20:-2.0739,-0.00379451,-27.336
chr1 10055 . T G 4 PASS 3 GT:DP:FT:GQ:GL 0/0:13:PASS:31:-0.00031839,-3.6121,-54.5012
chr1 10105 . A C 7 PASS 3 GT:DP:FT:GQ:GL 1/0:45:PASS:13:-1.58548,-0.0193946,-153.729
Now, what does GT represent here? I understand it stands for genotype, but what do the different GT values mean, and what consequences can they have for a SNP call? Can I also determine whether the mutation is in one copy or in both copies of the gene? If yes, how?
For GQ, I guess it is a Phred-scaled score, -10 * log10(p), where higher is better?
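For reference, the Phred relationship in the GQ question can be turned around to recover the probability that a genotype call is wrong. This is a minimal sketch; the function name `phred_to_error_prob` is made up for illustration, and the GQ values are the ones from the three records above.

```python
def phred_to_error_prob(q: float) -> float:
    """Invert Q = -10 * log10(p): return p, the probability the call is wrong."""
    return 10 ** (-q / 10)


# GQ values taken from the three example records (20, 31, 13).
for gq in (20, 31, 13):
    print(f"GQ={gq}: P(wrong genotype) ~ {phred_to_error_prob(gq):.4f}")
```

So a higher GQ does mean a more confident genotype call: every 10 points corresponds to a tenfold drop in the error probability.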
The genome used is hg19.
Please help.
Thank you
Hi,
Thank you for the answer. I am making rules to identify homozygous and heterozygous calls in my data, and I have the following combinations in my GT field.
I know the first is a homozygous call and the second is a heterozygous call, but what about the others? Could you please help?
Hi,
I have some GT values that are ./. Do you know what this means? I am assuming no reads align to this position, so we do not know what the genotype is there. Best, Thanks. C.
./. means that there is not enough information to call a genotype. It depends on the thresholds you set when you call variants: say you require a coverage of at least 10 to call a variant and you have only 8 reads, all on one strand. That is hard for the algorithm to judge, so it places a ./. there (or at least this is my understanding of the issue).
I remember it's written somewhere here: https://samtools.github.io/hts-specs/VCFv4.2.pdf
Hi, sometimes I see 1/. or ./1 or 0/. or ./0. What does it mean?
In your case you called variants for 3 different lines against 1 reference, thus you have numbers ranging from 0 to 3 (0 is the reference allele; 1 and up are the alternate alleles in the order they are listed in the ALT column). Whenever the two numbers are equal you have a homozygous call; whenever they are not, you have a heterozygous call.
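The equal-versus-different rule can be sketched as a small Python function. This is only an illustration, not what any particular caller does: the function name and the category labels (`hom_ref`, `hom_alt`, `het`, `missing`, `partial`) are invented here, and treating calls such as ./1 as their own "partial" category is an assumption about how you might want to handle them.

```python
import re


def classify_gt(gt: str) -> str:
    """Classify a VCF GT string such as '0/0', '1/0', '1|1' or './.'."""
    # Alleles are separated by '/' (unphased) or '|' (phased).
    alleles = re.split(r"[/|]", gt)
    if all(a == "." for a in alleles):
        return "missing"   # no allele could be called, e.g. './.'
    if any(a == "." for a in alleles):
        return "partial"   # only some alleles called, e.g. './1'
    if len(set(alleles)) > 1:
        return "het"       # different allele indices, e.g. '1/0' or '2/3'
    # All allele indices are equal: homozygous for reference (0) or an ALT allele.
    return "hom_ref" if alleles[0] == "0" else "hom_alt"


for gt in ("0/0", "1/0", "1/1", "2/3", "./.", "./1"):
    print(gt, classify_gt(gt))
```

Under this rule a variant is in one copy when the call is `het` with one allele equal to 0, and in both copies when it is `hom_alt`.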
If the type is GT, does it mean it's only SNPs? How can I know that?
If you called the variants, you will see which kind of variant that is by looking in the field number 8 of your VCF file (INFO field).
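Whether INFO carries a variant-type tag depends on the caller, so as a caller-independent check you can also compare the lengths of the REF and ALT alleles (columns 4 and 5). The sketch below assumes a single-sample record split on whitespace, as in the lines pasted in the question (real VCF files are tab-separated). The function name `variant_type` and its labels are made up for illustration, and symbolic or breakend alleles are simply lumped together as "other".

```python
def variant_type(ref: str, alt: str) -> str:
    """Rough variant type from REF and a (possibly comma-separated) ALT column."""
    types = set()
    for a in alt.split(","):
        if a.startswith("<") or a in {"*", "."} or "[" in a or "]" in a:
            types.add("other")   # symbolic, spanning-deletion, missing or breakend allele
        elif len(ref) == 1 and len(a) == 1:
            types.add("SNP")     # single base replaced by a single base
        elif len(ref) == len(a):
            types.add("MNP")     # same length, more than one base
        else:
            types.add("indel")   # lengths differ: insertion or deletion
    return "/".join(sorted(types))


# First record from the question, split on whitespace.
line = "chr1 10043 . T C 21 PASS 3 GT:DP:FT:GQ:GL 1/0:8:PASS:20:-2.0739,-0.00379451,-27.336"
chrom, pos, _id, ref, alt, qual, flt, info, fmt, sample = line.split()[:10]

# Pair each FORMAT key (column 9) with the matching value in the sample column.
sample_values = dict(zip(fmt.split(":"), sample.split(":")))

print(variant_type(ref, alt), sample_values["GT"], sample_values["GQ"])
```

GT itself only stores allele indices, so it says nothing about the variant type; the same 1/0 can describe a SNP or an indel depending on what REF and ALT contain.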