Hi All,
Please could someone help me interpret the allele values in the VCF file below?
# [1]CHROM [2]POS [3]REF [4]ALT [5]ALT [6]QUAL [7]DP [8]RO [9]AO [10]Par-1_DHT02696-8_L6:GT [11]Par-1_DHT02696-8_L6:DP [12]Par-1_DHT02696-8_L6:RO [13]Par-1_DHT02696-8_L6:AO [14]Par-1_DHT02696-8_L6:AO [15]Par-2_DHT02696-9_L6:GT [16]Par-2_DHT02696-9_L6:DP [17]Par-2_DHT02696-9_L6:RO [18]Par-2_DHT02696-9_L6:AO [19]Par-2_DHT02696-9_L6:AO
1.Chr1A 61556 G A . 26.7544 4 2 2 0/1 4 2 2 . . . . . .
2.Chr1A 95880 C T . 57.0319 2 0 2 1/1 2 0 2 . . . . . .
3.Chr1A 1156169 G T . 1.59189e-14 90 88 2 0/0 35 33 2 . 0/0 55 55 0 .
4.Chr1A 1159646 G A . 0.0185916 162 149 13 0/0 67 67 0 . 0/1 95 82 13 .
5.Chr1A 1940398 TG CG CA 306.879 27 12 8 1/2 8 0 1 7 0/1 19 12 7 0
Here, I called variants on two BAM files (one from Par-1 and the second from Par-2) against the reference using freebayes. The initial aim was to have a VCF file with three columns for REF, Par-1 and Par-2, but as you can see the third column is mostly empty (why?). Anyway, I tried to understand it, so I wonder if someone could help me with the allele values 0/0, 0/1, 1/1 and 1/2.
Does line-1 say that REF is "G" and Par-1 (ALT-1) is "A"? If so, what about Par-2, which is represented by many dots? Why is the allele value 1/1 in line-2? Does line-3 say that REF is "G" and Par-1 (ALT-1) is "T"? And what about Par-2? The same question applies to line-4 and line-5.
Thanks a lot, Kanat
Thanks for your quick response.
So, it means line-1 should look like "Chr1A 61556 G A G" (with G instead of the dot). What about 1/2 in line-5? Sorry for the stupid question, but can you give an example of a substitution in both alleles (1/1)? In my understanding, 0/1 is GG vs AA (line-1). Par-1 and Par-2 are different samples and refer to Parent-1 and Parent-2.
Unless you are dealing with bacteria, you have two copies of each gene (plants can have more, see "ploidy"). Thinking of humans: one copy from mum and one from dad, unless you are studying something like the X chromosome, where only females have two copies. In the genotype, 0 stands for the REF allele and 1 for the ALT allele, so for line-1 (REF G, ALT A):
0/0 == G | G (G in copy one and G in copy two)
0/1 == G | A
1/1 == A | A
1/2 confused me at first, but the numbers index the alleles listed for the site: 0 is REF, 1 is the first ALT and 2 is the second ALT. Line-5 has two ALT alleles (CG and CA), so 1/2 in Par-1 means one copy carries CG and the other carries CA. It does not refer to depth of coverage; the read support is in the RO (reads supporting REF) and AO (reads supporting each ALT) columns.
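In case it helps, here is a minimal Python sketch of how the genotype numbers map back to alleles. It assumes the standard VCF convention (0 = REF, 1 = first ALT, 2 = second ALT, "." = no call), and `decode_gt` is just a hypothetical helper name for illustration, not something freebayes or bcftools provides. The inputs are copied from the table in the question.

```python
def decode_gt(gt, ref, alts):
    """Translate a VCF GT string such as '0/1' into allele sequences."""
    alleles = [ref] + list(alts)        # index 0 = REF, 1.. = ALT alleles
    sep = "|" if "|" in gt else "/"     # '|' = phased, '/' = unphased
    decoded = []
    for idx in gt.split(sep):
        # '.' means the genotype was not called (e.g. no reads at this site)
        decoded.append("." if idx == "." else alleles[int(idx)])
    return sep.join(decoded)


# Genotypes taken from the rows in the question
print(decode_gt("0/1", "G", ["A"]))           # line-1, Par-1 -> G/A
print(decode_gt("1/1", "C", ["T"]))           # line-2, Par-1 -> T/T
print(decode_gt("1/2", "TG", ["CG", "CA"]))   # line-5, Par-1 -> CG/CA
print(decode_gt("0/1", "TG", ["CG", "CA"]))   # line-5, Par-2 -> TG/CG
print(decode_gt(".", "G", ["A"]))             # line-1, Par-2 -> .
```

The last call shows the "many dots" case: a "." genotype is a missing call for that sample, not a reference allele. In line-1 the total DP (4) equals the Par-1 DP (4), which suggests Par-2 simply had no reads covering that position.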