Hi, as my question probably shows, I'm new to working with the many sequence databases out there. The situation: I have some .maf files with mutations for a given cancer, say breast cancer, and a given gene, TP53. The .maf clearly states where each mutation starts and ends. (I'm only interested in point mutations.)
I want to construct a mutated sequence from this mutation data, using the reference sequence as a template, but there are so many different transcripts that I basically just want to know which one TCGA uses as its reference. Is it the whole gene? Or just the exons?
Thanks in advance
If you want to see a mutation's effect on the protein, you have to choose the exon regions (the transcript), either from direct splicing or from alternative splicing, and check whether they contain mutations. Branch points and the intron-exon donor and acceptor sites are also crucial, since mutations in these regions can affect splicing. Read this: A: How to analysis mutations effects bioinformatically? and this: A: Allele frequency visualization
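For the sequence-building step described in the question, here is a minimal sketch in Python. It assumes the MAF is tab-separated with the standard columns `Hugo_Symbol`, `Start_Position`, `Variant_Type`, `Reference_Allele` and `Tumor_Seq_Allele2`, that `Start_Position` is a 1-based genomic coordinate, and that you already have the plus-strand reference sequence of the region along with the coordinate of its first base. The function names `read_snps` and `apply_point_mutations`, as well as the toy sequence and coordinates, are made up for illustration and are not real TP53 data.

```python
import csv
import io


def read_snps(maf_handle, gene):
    """Yield (position, ref_allele, alt_allele) for SNP rows of one gene."""
    # Skip comment lines such as "#version ..." before the header row.
    lines = (line for line in maf_handle if not line.startswith("#"))
    for row in csv.DictReader(lines, delimiter="\t"):
        if row["Hugo_Symbol"] == gene and row["Variant_Type"] == "SNP":
            yield (int(row["Start_Position"]),
                   row["Reference_Allele"],
                   row["Tumor_Seq_Allele2"])


def apply_point_mutations(reference, region_start, mutations):
    """Return a copy of `reference` with the given point mutations applied.

    reference    -- plus-strand sequence of the region (assumption)
    region_start -- 1-based genomic coordinate of reference[0] (assumption)
    mutations    -- iterable of (position, ref_allele, alt_allele)
    """
    seq = list(reference.upper())
    for pos, ref, alt in mutations:
        idx = pos - region_start
        if not 0 <= idx < len(seq):
            raise ValueError(f"position {pos} is outside the reference region")
        # A mismatch usually means the wrong genome build, strand or offset.
        if seq[idx] != ref:
            raise ValueError(
                f"reference has {seq[idx]} at {pos}, but the MAF says {ref}")
        seq[idx] = alt
    return "".join(seq)


# Toy data: invented coordinates and sequence, for illustration only.
toy_maf = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tStart_Position\tVariant_Type\tReference_Allele\tTumor_Seq_Allele2\n"
    "TP53\t103\tSNP\tG\tA\n"
    "TP53\t105\tDEL\tT\t-\n"
)
toy_reference = "ACGGTTCA"  # pretend this covers positions 100..107
mutated = apply_point_mutations(toy_reference, 100, read_snps(toy_maf, "TP53"))
print(mutated)
```

The check against `Reference_Allele` is deliberate: if the template does not match the MAF at a mutated position, the sequence is probably from a different genome build or strand than the one the MAF was called against. To get a mutated transcript or protein you would still need to pick a transcript and map the genomic position onto its exons, which this sketch does not do.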
This does not answer the original question, pltbiotech_tkarthi