I'm new to NGS data analysis. Recently I have been working with various types of TCGA cancer data from the UCSC Cancer Genomics Browser. When I performed cancer survival analysis based on the expression level of a certain gene, I found some discrepancies between the TCGA data and well-known published data.
I know that the RPKM values in the TCGA data have already been normalized. Still, I wonder whether we would get more reasonable data if the RPKM value of a gene of interest were further normalized to that of a reference gene, say GAPDH, just as we do when analyzing relative quantification results from real-time PCR.
Does anyone think this is necessary?
I think that the RPKM normalization is sufficient (at least for genes that are not very short) and that you do not need further normalization. Note that GAPDH itself is altered in cancer (http://www.ncbi.nlm.nih.gov/pubmed/23620736), so in that respect RPKM normalization is better than normalizing to GAPDH. Also, an advantage of using TCGA is that you have a large number of samples, so you can validate your findings across many of them.
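To illustrate the GAPDH point, here is a minimal Python sketch. All read counts, transcript lengths, and library sizes are made-up numbers for illustration only, not TCGA values, and the helper names `rpkm` and `ratio_to_reference` are hypothetical. It uses the standard RPKM definition (reads per kilobase of transcript per million mapped reads) and shows that if GAPDH expression doubles in the tumor while the gene of interest stays the same, the GAPDH-normalized ratio falsely suggests that the gene is halved.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def ratio_to_reference(target_rpkm, reference_rpkm):
    """Express a gene relative to a reference gene (qPCR-style)."""
    return target_rpkm / reference_rpkm


# Hypothetical transcript lengths in base pairs (illustrative only).
TARGET_LEN = 2000
GAPDH_LEN = 1300

# Hypothetical read counts: the target gene is unchanged between
# samples, but GAPDH is twice as high in the tumor sample.
normal = {"target": 500, "GAPDH": 20000, "total": 20_000_000}
tumor = {"target": 500, "GAPDH": 40000, "total": 20_000_000}


def summarize(sample):
    """Return (target RPKM, target RPKM divided by GAPDH RPKM)."""
    target = rpkm(sample["target"], TARGET_LEN, sample["total"])
    gapdh = rpkm(sample["GAPDH"], GAPDH_LEN, sample["total"])
    return target, ratio_to_reference(target, gapdh)


normal_rpkm, normal_ratio = summarize(normal)
tumor_rpkm, tumor_ratio = summarize(tumor)

# RPKM alone reports no change in the target gene ...
print("target RPKM fold change:", tumor_rpkm / normal_rpkm)
# ... while the GAPDH-normalized value reports an apparent drop.
print("GAPDH-normalized fold change:", tumor_ratio / normal_ratio)
```

In this toy case the apparent change comes entirely from the reference gene, which is why a reference that is itself altered in cancer can make things worse rather than better.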
Thank you very much for your comment! I also asked a bioinformatics expert at our institute about this question, and his opinion is the same as yours.