Hi everyone,
I am using the TCGA portal to get mRNA expression data for various cancer studies (e.g. lung, liver, thyroid). I have two questions about the data:
- Some cancer studies on TCGA have "mRNA expression RNASeq V2 RSEM" values and corresponding "z-scores". I am confused about what the "mRNA expression z-Scores (RNA Seq V2 RSEM)" data consists of. How are the z-scores calculated and what do they represent?
- We have been on the lookout for control datasets for the cancer studies on TCGA. Does anyone know of a good place to find control datasets for tissues like lung, liver, thyroid etc. (basically all the foregut tissues)? We are working with control data from GTEx, but GTEx provides RPKM values while TCGA provides RSEM and RSEM z-score values, so we have to do a lot of scaling/normalization/transformation to compare these disparate datasets. We would like to know if there is any mRNA expression data (obtained via RNASeq V2 RSEM) for controls.
Thanks!
UPDATE: I have posted the second part as a separate question here.
Thanks! I will create another question for the second part!
Thanks for the information. From reading the literature and the comments, my understanding of the z-score is:
My question is whether the above protocol is correct. Please advise.
Do these z-scores really have meaning? The z-scores COSMIC provides:
If I calculate the z-score using the above approach, should I be able to find out whether a gene is over-regulated or normally regulated?
Thanks
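For illustration, a z-score of this kind can be sketched as below. This is only a minimal sketch under stated assumptions, not the portal's actual pipeline: the function name `expression_z_scores`, the example values, and the `is_diploid` mask are all hypothetical, and the choice of the sample standard deviation (`ddof=1`) is an assumption. The idea is that each sample's expression of one gene is compared against the mean and standard deviation of a reference set of samples (for example, the samples that are diploid for that gene).

```python
import numpy as np

def expression_z_scores(values, reference_mask=None):
    """Z-score one gene's expression across samples against a reference set.

    values: expression of a single gene, one entry per sample (e.g. RSEM).
    reference_mask: booleans marking the reference samples (assumed here to be
        e.g. samples diploid for this gene); if None, all samples are used.
    """
    values = np.asarray(values, dtype=float)
    if reference_mask is None:
        reference = values
    else:
        reference = values[np.asarray(reference_mask, dtype=bool)]
    if reference.size < 2:
        raise ValueError("need at least two reference samples")
    mean = reference.mean()
    sd = reference.std(ddof=1)  # sample standard deviation (an assumption)
    if sd == 0:
        raise ValueError("reference samples have zero variance")
    # Number of reference standard deviations each sample lies from the mean.
    return (values - mean) / sd

# Hypothetical example: four samples, the first three form the reference set.
rsem = [100.0, 120.0, 80.0, 400.0]
is_diploid = [True, True, True, False]
z = expression_z_scores(rsem, is_diploid)
print(z)
```

A z-score computed this way only says how far a sample sits from the chosen reference distribution, so whether it indicates over-expression depends entirely on which samples make up the reference. A cutoff such as |z| > 2 is a common convention, not something established in this thread.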
Dear sir:
Thank you for the information. I don't know how to download the diploid information from TCGA. However, I checked the TCGA wiki and found that the diploid information is in each VCF file. I guess I must download the VCF files to get the diploid information. Is that right?