8.4 years ago
taylorzheng.zz
When I downloaded point mutation data for a set of genes from cBioPortal, I got several .tsv files. In each file, every amino acid change in a gene is written in a form like "R435Y". I am confused about which protein sequence was used as the reference, since each gene has many isoforms. Was the longest isoform selected as the reference? I am also not clear on the procedure that TCGA or cBioPortal uses to analyze the different kinds of mutation data. I would appreciate any help in understanding how this works so that I can resolve the problem. Many thanks.