6.8 years ago
bioinfo456
I have an RNA-seq data set of log2(x+1)-transformed RSEM normalized counts. Can somebody explain how I can obtain raw read counts from this, so that I can perform DEG analysis using the DESeq2 R package?
Please see Devon's and Michael's input here: RSEM Downstream Analysis
Also see the input from Michael and Simon (DESeq2 developers) on Bioconductor, here: https://support.bioconductor.org/p/51577/
Thanks for the reply Kevin. Devon suggests Limma or edgeR. DESeq2 developers recommend the option of using rounded estimated gene-level counts from RSEM as input to DESeq2. By rounded, do they mean the closest integer value?
Yes, the general idea that I get from the comments is that, if you really wish to use DESeq2, then you should:
Obviously the ideal situation is to get the raw counts (or produce them yourself). May I ask which data you are working with? TCGA?
Yes, TCGA gene expression RNAseq - IlluminaHiSeq data.
The description of the data set is as follows: The gene expression profile was measured experimentally using the Illumina HiSeq 2000 RNA Sequencing platform by the University of North Carolina TCGA genome characterization center. Level 3 data was downloaded from TCGA data coordination center. This dataset shows the gene-level transcription estimates, as in log2(x+1) transformed RSEM normalized count. Genes are mapped onto the human genome coordinates using UCSC Xena HUGO probeMap.
I don't have the resources to produce raw counts. Do you reckon I can round off these normalized counts and use DESeq2 on them? Thanks a ton for your insight.
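For reference, the arithmetic being asked about can be sketched as follows: if y = log2(x + 1), then x = 2^y - 1, and rounding x gives the nearest integer. This is only a minimal sketch in Python with NumPy; the function name `unlog_and_round` and the example values are made up for illustration. Note that it recovers the RSEM normalized values, not true raw read counts, so whether the result is suitable input for DESeq2 is exactly the open question in this thread.

```python
import numpy as np

def unlog_and_round(log_values):
    """Invert y = log2(x + 1) and round to the nearest non-negative integer.

    Hypothetical helper for illustration: the output is the rounded RSEM
    normalized value, not a raw read count.
    """
    y = np.asarray(log_values, dtype=float)
    x = np.exp2(y) - 1.0            # undo log2(x + 1)
    x = np.clip(x, 0.0, None)       # guard against tiny negatives from float error
    return np.rint(x).astype(np.int64)  # np.rint rounds exact halves to the even integer

# Made-up log2(x + 1) values (genes in rows, samples in columns)
example = np.array([[0.0, 1.0, 3.0],
                    [10.0, 5.5, 2.0]])
counts = unlog_and_round(example)
print(counts)
```

The resulting integer matrix would still need to be exported (for example as a tab-separated file) before it could be read into R.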
You could try the recommendations of Michael Love, Simon Anders, and Devon Ryan, as they are experts in this area. From the discussion, it just didn't seem convincing that it is an ideal type of data to use for DESeq2, though.
If it is TCGA data that you want to analyse, then you should be able to get the raw HTSeq counts via the GDC Legacy Archive, but it depends on the cancer of interest. I recently re-analysed all 500+ raw HTSeq count files for endometrial cancer, for example, using DESeq2.
I'm gonna go ahead with Michael Love's recommendation. Thanks a ton, Kevin :).