I'm working on encoding the genotypes in a VCF so that each variant gets a single categorical label, e.g. 0|0 becomes homozygous_ref and 1|1 becomes homozygous_alt.
I've not seen heterozygous variants subclassed by whether one of the alleles is the reference, and I wondered whether this information would be significant for a model fitted to the categorised data. Are there cases where there is likely to be a significant difference between 0|1 variants (one reference allele, one alternate) and 1|2 variants (two different alternate alleles), e.g. in tumour suppression?
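To make the scheme concrete, here is a minimal sketch of the kind of encoding I mean. It assumes diploid GT strings, either phased or unphased. The function name `classify_genotype` and the labels `heterozygous_ref_alt`, `heterozygous_alt_alt` and `missing_or_non_diploid` are made up for this example and are not any standard.

```python
import re


def classify_genotype(gt: str) -> str:
    """Map a diploid VCF GT string (e.g. '0|1' or '1/2') to one category label.

    The label names are illustrative, not a standard vocabulary.
    """
    # GT alleles are separated by '|' (phased) or '/' (unphased).
    alleles = re.split(r"[|/]", gt)

    # Treat missing calls ('.') and anything that is not diploid as one bucket.
    if len(alleles) != 2 or "." in alleles:
        return "missing_or_non_diploid"

    a, b = alleles
    if a == b:
        # Allele index 0 is the reference; any other index is an alternate.
        return "homozygous_ref" if a == "0" else "homozygous_alt"
    if "0" in (a, b):
        # One reference allele plus one alternate, e.g. 0|1 or 2|0.
        return "heterozygous_ref_alt"
    # Two different alternate alleles, e.g. 1|2.
    return "heterozygous_alt_alt"


if __name__ == "__main__":
    for gt in ["0|0", "1|1", "0|1", "1|2", "./."]:
        print(gt, classify_genotype(gt))
```

Note that this collapses every alternate allele into "alt", so 1|1 and 2|2 get the same label, and it ignores phase, so 0|1 and 1|0 are treated alike.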
Thanks in advance for any answers or opinions.
Matt