First off, sorry if this question makes no sense, but I will try to give as much information as possible.
I am trying to analyze the differential expression of a long non-coding RNA (lncRNA).
Currently cBioPortal does not recognize the Ensembl gene name for this lncRNA at all, so that kills the quick-and-dirty method of doing this.
I know the exon coordinates of the lncRNA from Ensembl, so I decided to download the ExonQuantification text files from Firehose. Browsing through the file, I was able to locate the exons for my gene with their quantification (RPKM). There are 4 exons, each with a different RPKM value. The gene is not in the Gene_normalize files at all, so I am guessing this is because the data were aligned using GRCh37 and the gene was not annotated yet.
So I can use NCBI Remap to remap the exons from GRCh37 to GRCh38.
Is there a way to annotate the exon quantification file after I remap it to GRCh38? That is, instead of having a file with different exon coordinates and their quantification, I would end up with a file with gene names, gene IDs, or Ensembl IDs and the quantification. If such a thing is possible, how does it work for genes with multiple exons that have different RPKM values? Does it just average them, take the max, or will there be multiple entries for the gene in the file?
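To make the multiple-exon question concrete, here is a minimal sketch of one way exon values could be collapsed to a single gene value. It is not taken from any TCGA or Firehose tool. It assumes each exon is identified by a string of the form `chr:start-end:strand` with 1-based inclusive coordinates, that the exons do not overlap, and that every exon RPKM was computed against the same total read count. Under those assumptions, the RPKM of the combined exons is the length-weighted mean of the exon RPKMs, not the plain average or the max. The function names and the example coordinates are hypothetical.

```python
def parse_exon(coord):
    """Split an assumed 'chr:start-end:strand' string into its parts."""
    chrom, span, strand = coord.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end), strand


def gene_rpkm(exons):
    """Collapse (coord, rpkm) pairs into one length-weighted RPKM.

    RPKM = reads / (length_kb * total_reads_in_millions), so summing the
    reads and the lengths over all exons gives
    sum(rpkm_i * length_i) / sum(length_i).
    """
    total_length = 0
    weighted_sum = 0.0
    for coord, rpkm in exons:
        _, start, end, _ = parse_exon(coord)
        length = end - start + 1  # 1-based inclusive coordinates (assumed)
        total_length += length
        weighted_sum += rpkm * length
    if total_length == 0:
        raise ValueError("no exons supplied")
    return weighted_sum / total_length


# Hypothetical exons, not real coordinates for any gene.
example_exons = [("chr1:101-200:+", 2.0), ("chr1:301-600:+", 4.0)]
print(gene_rpkm(example_exons))  # 3.5
```

A plain average of the two example exons would give 3.0, which overweights the short exon. If the exon file also carries raw counts, summing those per gene and normalizing afterwards should give the same answer.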
The other strategy I am considering is to download the raw RNA-seq files using GeneTorrent, then try aligning and annotating them myself. That would be stretching the limits of my meager abilities.
Any help at all is much appreciated.
Hi, I know this is an old post, but I came across it as I have a similar problem. I'm dealing with level 3 exon quantification data and I'm trying to map the coordinates to transcript IDs. The problem I'm having is that some of the coordinates aren't annotated, or aren't even in the GTF file. I know that TCGA uses a GAF file, but I need to use a GTF file for further analysis. I was wondering if you had any suggestions?
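As a starting point for the coordinate-to-transcript mapping, here is a rough sketch that indexes the `exon` rows of a GTF by exact coordinates and looks each quantified exon up in that index. It assumes the exon identifiers look like `chr:start-end:strand`, that the GTF is on the same genome build as the quantification file, and that differences in chromosome naming are limited to a missing `chr` prefix. The function names and the example GTF lines are hypothetical. Exons with no exact match come back as an empty list, which at least lets you count how many are missing. If the GAF and the GTF draw exon boundaries differently, exact matching will miss those exons and an overlap-based lookup would be needed instead.

```python
import re
from collections import defaultdict

TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)"')


def index_gtf_exons(gtf_lines):
    """Map (chrom, start, end, strand) to the set of transcript IDs."""
    index = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_RE.search(fields[8])
        if match is None:
            continue
        # Assumption: add a 'chr' prefix when the GTF uses bare names like '1'.
        chrom = fields[0] if fields[0].startswith("chr") else "chr" + fields[0]
        key = (chrom, int(fields[3]), int(fields[4]), fields[6])
        index[key].add(match.group(1))
    return index


def transcripts_for_exon(index, coord):
    """Return sorted transcript IDs for an exact coordinate match, else []."""
    chrom, span, strand = coord.split(":")
    start, end = span.split("-")
    return sorted(index.get((chrom, int(start), int(end), strand), set()))


# Hypothetical GTF rows for illustration only.
example_gtf = [
    'chr1\texample\texon\t101\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    '1\texample\texon\t101\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A2";',
    'chr1\texample\tgene\t101\t600\t.\t+\t.\tgene_id "GENE_A";',
]
example_index = index_gtf_exons(example_gtf)
print(transcripts_for_exon(example_index, "chr1:101-200:+"))
print(transcripts_for_exon(example_index, "chr1:500-600:+"))
```

With a real file you would pass the open file handle in place of `example_gtf`, since the function only needs an iterable of lines.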