I have an Excel spreadsheet that I'm trying to convert to a VCF file. I've got most fields right, but I can't figure out how to convert the homozygous / heterozygous SNPs to one of these:
What is the difference between these?
- 1/0 vs 0/1 ?? (I'm guessing no difference ???)
- 1/1 vs 0/0 ??
- 0|1 vs 0/1 ?? (1st method is haplotype specific, 2nd method is not? No linkage?)
I'm trying to read the manual but it's Greek to me. Any help?
This is my input
Current version of handmade VCF
Given that all the information I have is "hom" or "het", what do I enter into that column? Any advice? My biologist PhD and I are having problems interpreting the VCF manual.
Note that in the input you showed above, you also have a variant at chr7:128846328 that is not biallelic: there are three alleles in total, the REF ("GA", which gets GT index 0), ALT1 ("CT", which gets GT index 1), and ALT2 ("CA[obscured]", which gets GT index 2). So, on the assumption that "het" in your column means that the sample has both of the ALTs and not the REF, you would need to use GT = "1/2".
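If it helps, here is a minimal Python sketch of that mapping. The function name `zygosity_to_gt` is made up for illustration, and it assumes your spreadsheet column contains only "hom" or "het" and that ALT is a comma-separated string as in a VCF. Treating "het" at a two-ALT site as 1/2 is the same assumption as above, not something the spreadsheet itself confirms.

```python
def zygosity_to_gt(zygosity, alt):
    """Map a 'hom'/'het' label to an unphased VCF GT string.

    Assumptions (not confirmed by the input data):
      - 'hom' at a biallelic site means two copies of the ALT -> 1/1
      - 'het' at a biallelic site means REF + ALT             -> 0/1
      - 'het' at a site with two ALTs means ALT1 + ALT2       -> 1/2
    Anything else is ambiguous and raises ValueError.
    """
    n_alts = len([a for a in alt.split(",") if a])
    label = zygosity.strip().lower()

    if label == "hom" and n_alts == 1:
        return "1/1"
    if label == "het" and n_alts == 1:
        return "0/1"
    if label == "het" and n_alts == 2:
        return "1/2"
    raise ValueError(f"cannot infer GT for {zygosity!r} with ALT={alt!r}")


# Hypothetical rows, not taken from the spreadsheet above
print(zygosity_to_gt("het", "T"))    # 0/1
print(zygosity_to_gt("hom", "T"))    # 1/1
print(zygosity_to_gt("het", "T,C"))  # 1/2
```

A "hom" call at a multi-allelic site is deliberately rejected, because the label alone does not say which ALT the sample is homozygous for.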
Thank you!!!!!!
Why is
and not
?
They are both heterozygous
@Kevin Blighe - actually, my questions are below your response (that you answered recently). The one that you answered was posted 3.5 years ago. Can I kindly request your help with the recent one?
Hi @pierre Lindenbaum @Len Trigg @jnowacki,
May I check with you, as I am new to this domain and found this thread useful?
Yes, but you have to be aware that the 'alternate' allele is with respect to the reference genome, and there is no ideal reference genome (see A: Alternate nucleotide is more frequent than reference nucleotide. OMG I'm dizzy.)
Where have you seen these terms? homvar and homalt probably refer to the same thing, i.e., 1/1, a homozygous variant call. Any source quoting them should state which is the ref allele and which is the alt allele.
Multi-allelic site, where 2 alternate bases have been identified. This can occur in multi-sample studies or in disease conditions, like cancer. In a VCF, you'd see these listed like this:
So, this individual has genotype TC
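To make the index lookup concrete, here is a small Python sketch that resolves a GT value into the actual alleles. The helper name `decode_gt` and the REF/ALT values are made up for illustration; the only rule it relies on is that GT index 0 is the REF and indices 1, 2, ... are the ALT alleles in the order listed.

```python
import re


def decode_gt(ref, alt, gt):
    """Return (alleles, phased) for a GT string such as '1/2' or '0|1'."""
    alleles = [ref] + alt.split(",")   # index 0 = REF, 1.. = ALTs in listed order
    phased = "|" in gt                 # '|' means phased, '/' means unphased
    indices = re.split(r"[/|]", gt)
    # '.' marks a missing call for that allele
    called = [None if i == "." else alleles[int(i)] for i in indices]
    return called, phased


# Hypothetical multi-allelic site: REF=A, ALT=T,C
print(decode_gt("A", "T,C", "1/2"))  # (['T', 'C'], False)
print(decode_gt("A", "T,C", "0|1"))  # (['A', 'T'], True)
```

With REF=A and ALT=T,C, a GT of 1/2 points at the first and second ALT, so the sample carries T and C and no copy of the reference base.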
No - incorrect. See answer for q2