4.3 years ago
bernardmcnamee
I need to get up to speed as the IT person for a new project, and I think it will help if I understand what's behind the data, since all the resources are bound up in genetics concepts and language.
bcftools query -f '%ID\t%REF\t%ALT[\t%GT][\t%TGT]\n' ID_0298249.vcf.gz | head -1
rs9701055 C T 0/0 C/C
If the C is the ref genome and the T is the sample genome, what is the 0/0 and C/C significance?
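Not sure what you mean by "significance"; it is just notation for the genotype (you have 2 copies of DNA in each cell, so there is one allele per copy). In the GT field the numbers are allele indexes: 0 is the REF allele and 1 is the first ALT allele. TGT is the same genotype translated into the actual bases. So 0/0 and C/C say the same thing: both copies of the DNA in your example carry the reference allele, which is C. Note that T is only the alternate allele listed for this site, not the sample's genome; this sample does not carry it.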
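To make the notation concrete, here is a minimal Python sketch of how a GT value maps onto REF and ALT to give the translated genotype. The function name `translate_genotype` is made up for illustration; it is not part of bcftools, and it only mirrors what `%TGT` shows for simple cases.

```python
def translate_genotype(ref, alt, gt):
    """Translate an index-based genotype (e.g. "0/1") into bases (e.g. "C/T").

    Hypothetical helper for illustration only.
    ref: REF allele, alt: comma-separated ALT alleles, gt: GT string.
    """
    # Index 0 is REF, index 1 is the first ALT, index 2 the second ALT, ...
    alleles = [ref] + alt.split(",")
    # "/" means unphased, "|" means phased; keep whichever the input uses.
    sep = "|" if "|" in gt else "/"
    # "." marks a missing allele call and is passed through unchanged.
    return sep.join("." if idx == "." else alleles[int(idx)]
                    for idx in gt.split(sep))


print(translate_genotype("C", "T", "0/0"))  # the record from the question
print(translate_genotype("C", "T", "0/1"))  # one REF copy, one ALT copy
```

With the record from the question (REF C, ALT T, GT 0/0) this gives C/C, matching the bcftools output; a heterozygous 0/1 would come out as C/T.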
If the question was how sure you can be that you do not have a T at this position: you cannot tell from the current VCF output. The reason is that it shows no statistics from the mapping and variant calling steps (whether such fields are present depends on the previous data processing).
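Which quality fields a given VCF carries is declared in its header, in the `##INFO` and `##FORMAT` lines. A rough Python sketch for listing them follows; the helper name `declared_fields` and the example header lines are invented for illustration and are not taken from the file in the question.

```python
import re


def declared_fields(header_lines):
    """Collect the IDs declared in ##INFO and ##FORMAT header lines.

    Hypothetical helper for illustration only.
    """
    fields = {"INFO": [], "FORMAT": []}
    pattern = re.compile(r"^##(INFO|FORMAT)=<ID=([^,>]+)")
    for line in header_lines:
        match = pattern.match(line)
        if match:
            fields[match.group(1)].append(match.group(2))
    return fields


# Made-up header lines, NOT from the poster's file.
example_header = [
    "##fileformat=VCFv4.2",
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">',
]
print(declared_fields(example_header))
```

If fields such as GQ (genotype quality) or DP (read depth) are declared, they can be added to the query format string to see how well supported a call is; if only GT is present, the file carries no such evidence.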
Source: https://samtools.github.io/hts-specs/VCFv4.2.pdf