The TCGA documentation for MAFs (Mutation Annotation Format files) has Allele1 and Allele2 versions of several fields; what is the difference between these alleles?
For instance, I am interested in Tumor_Seq_Allele but am not sure whether to use Tumor_Seq_Allele1 or Tumor_Seq_Allele2. I am looking at data for ACC (adrenocortical carcinoma) tumors, and for many of the mutations, Tumor_Seq_Allele1 is actually the same as the reference allele. Which one is the actual mutation?
Because we are diploid organisms, a tumour sample can carry a somatic variant and the 'reference' allele at the same position, no? A Tumor_Seq_Allele1 that matches the reference allele is what you would expect to see for a heterozygous variant. In fact, in bulk tumour tissue, provided there is enough heterogeneity, we should see many positions that carry several different mutations.
Be careful about how you interpret the 'reference' base. Is it 'reference' because it is the base in the reference genome, or because it is the wild-type allele in the patient?
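If it helps, here is a minimal sketch of how you could check this for yourself: for each row, compare both tumour alleles against Reference_Allele and list whichever ones differ. The column names are the standard MAF ones from the question; the three rows, the gene names, and the helper functions `non_reference_alleles` and `summarise` are made up for illustration and are not real TCGA data or part of any TCGA tool. Note that 'reference' here means the value in the Reference_Allele column, which is not necessarily the patient's wild-type allele.

```python
import csv
import io

# Toy MAF-like table, invented purely for illustration (not real TCGA data).
TOY_MAF = (
    "#illustrative header comment\n"
    "Hugo_Symbol\tReference_Allele\tTumor_Seq_Allele1\tTumor_Seq_Allele2\n"
    "GENE_A\tC\tC\tT\n"   # one allele matches the reference, one does not
    "GENE_B\tG\tA\tA\n"   # both tumour alleles differ from the reference
    "GENE_C\tT\tT\tT\n"   # no non-reference allele reported
)


def non_reference_alleles(row):
    """Return the distinct tumour alleles that differ from Reference_Allele."""
    ref = row["Reference_Allele"]
    found = []
    for allele in (row["Tumor_Seq_Allele1"], row["Tumor_Seq_Allele2"]):
        if allele != ref and allele not in found:
            found.append(allele)
    return found


def summarise(maf_text):
    """Map each gene symbol to its non-reference tumour alleles."""
    # Skip leading '#' comment lines, which some MAF files contain.
    lines = (ln for ln in io.StringIO(maf_text) if not ln.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["Hugo_Symbol"]: non_reference_alleles(row) for row in reader}


summary = summarise(TOY_MAF)
for gene, alts in summary.items():
    print(gene, alts)
```

Checking both columns, rather than assuming one of them always holds the variant, means the result does not depend on how a particular pipeline chose to fill in Allele1 and Allele2.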