Hi all
I have been on maternity leave for a year. Before that I was analysing TCGA RNA-Seq cancer data, which I downloaded, collated and analysed myself. Now that I'm back at work I have been trying to run some new analyses, but some of the gene symbols I'm looking at have changed! I tried to run GSEA in WebGestalt and found that a lot of genes couldn't be mapped, and I assume this is because their gene symbols have changed. Now I am in a conundrum. I can't simply use a current gene ID conversion file, because these contain the new gene symbols, so I can't programmatically look up my old ones.
Has anyone had this problem? And if so, how did you overcome it? I was thinking I'd need an archived gene ID conversion file so I could convert my gene symbols to something like Ensembl IDs and run those in my GSEA instead. I'd like to avoid redownloading and reprocessing the data if possible!
Many thanks in advance
Can you post examples of what you are referring to? TCGA data moved to the new portal while you were gone, and that may have something to do with this.
Yes, FAM38A is now PIEZO1, for instance. When I ran the data through WebGestalt it couldn't map 1338 of the genes, and eyeballing them it looks like their gene symbols have changed.
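For renamed symbols like FAM38A, one possible approach is a lookup table that lists previous symbols next to the current symbol and a stable ID, so old symbols can be converted without redownloading the data. Below is a minimal Python sketch of that idea. It assumes a tab-separated table with columns named `symbol`, `prev_symbol` (pipe-separated when there are several) and `ensembl_gene_id`; this layout is an assumption modelled on the kind of file HGNC distributes, so check the column names in whatever file you actually use. The toy rows are hypothetical apart from the FAM38A to PIEZO1 rename mentioned above, and the Ensembl IDs are placeholders, not real identifiers.

```python
import csv
import io

# Hypothetical toy table; only the FAM38A -> PIEZO1 rename comes from this thread.
# The Ensembl IDs are placeholders, not real identifiers.
TOY_TABLE = (
    "symbol\tprev_symbol\tensembl_gene_id\n"
    "PIEZO1\tFAM38A\tENSG_EXAMPLE_1\n"
    "GENEB\tOLDB1|OLDB2\tENSG_EXAMPLE_2\n"
    "GENEC\t\tENSG_EXAMPLE_3\n"
)


def build_lookup(handle):
    """Map every current or previous symbol to (current symbol, Ensembl ID)."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    lookup = {}
    # Register previous symbols first; keep the first hit if one is ambiguous.
    for row in rows:
        for old in filter(None, row["prev_symbol"].split("|")):
            lookup.setdefault(old.upper(), (row["symbol"], row["ensembl_gene_id"]))
    # Current symbols always win over a clashing previous symbol.
    for row in rows:
        lookup[row["symbol"].upper()] = (row["symbol"], row["ensembl_gene_id"])
    return lookup


def convert(symbols, lookup):
    """Split symbols into a mapped dict and a list of unmapped symbols."""
    mapped, unmapped = {}, []
    for sym in symbols:
        hit = lookup.get(sym.upper())
        if hit is None:
            unmapped.append(sym)
        else:
            mapped[sym] = hit
    return mapped, unmapped


if __name__ == "__main__":
    lookup = build_lookup(io.StringIO(TOY_TABLE))
    mapped, unmapped = convert(["FAM38A", "GENEC", "NOT_A_GENE"], lookup)
    print(mapped)
    print(unmapped)
```

With a real table you would pass an open file handle to `build_lookup` instead of the toy string. Anything left in `unmapped` would still need checking by hand, since some old symbols may have been withdrawn or split rather than simply renamed.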