Hello, I took an RSEM gene-level expected-count matrix and simply rounded the values with the round() function in R so I could feed them to DESeq2, but the results are very odd (a VERY low number of differentially expressed genes). Is this really a solid way to go about differential expression analysis, or should I start again from the raw data?
I read multiple threads that suggested that the rounding method, although not optimal, should work fine.
NOTE: The study I got the data from provided it as an RSEM gene-level count matrix and an FPKM-normalized matrix. I had to use the RSEM counts since I know DESeq2 only takes non-normalized counts. The study ran an independent t-test on the FPKM values for its analysis, but I read somewhere that this is highly discouraged.
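For reference, the rounding step described above might look roughly like the sketch below. The matrix, sample names and grade labels are made-up stand-ins (the real data would be read from the study's file), and the DESeq2 call is only attempted if the package is installed.

```r
# Toy stand-in for an RSEM gene-level expected-count matrix (genes x samples).
# All names and values here are invented for illustration.
set.seed(1)
rsem_counts <- matrix(
  rgamma(6 * 4, shape = 2, scale = 50),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4))
)

# Round the fractional expected counts and store them as integers.
counts_int <- round(rsem_counts)
storage.mode(counts_int) <- "integer"

# Sample table; its rows must be in the same order as the matrix columns.
coldata <- data.frame(
  grade = factor(c("2", "2", "4", "4")),
  row.names = colnames(counts_int)
)
stopifnot(identical(rownames(coldata), colnames(counts_int)))

# Build the DESeq2 object only if the package is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts_int,
    colData   = coldata,
    design    = ~ grade
  )
}
```

If the per-sample RSEM `.genes.results` files were available, importing them with tximport (type = "rsem") and DESeqDataSetFromTximport() would avoid the manual rounding, but with only the combined matrix the approach above is the practical option.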
Hello! Thank you very much for your timely reply. I checked the code again to make sure I did not miss any important step or detail, and I got the following MA, dispersion and volcano plots in my analysis so far:
I will do a PCA and get the p-value plots as well and post them here. I posted the above plots to make sure I did not miss anything so far.
I should also add that the analysis compares three grades of glioma (2, 3 and 4) as conditions, and that the plots are for the 4 vs 2 results from DESeq2.
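For anyone following along, a three-level comparison like this is usually fitted once and the 4 vs 2 comparison pulled out as a contrast. The sketch below assumes the condition column is called `grade` (a hypothetical name) and uses simulated counts instead of the real data.

```r
library(DESeq2)

# Simulated counts: 200 genes x 9 samples, three per grade (made-up data).
set.seed(42)
n_genes <- 200
gene_means <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * 9, mu = rep(gene_means, times = 9), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", 1:9))
)
coldata <- data.frame(
  grade = factor(rep(c("2", "3", "4"), each = 3)),
  row.names = colnames(counts)
)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ grade)
dds <- DESeq(dds)

# Grade 4 vs grade 2: a positive log2 fold change means higher in grade 4.
res_4v2 <- results(dds, contrast = c("grade", "4", "2"))

# Number of genes passing a 5% FDR cutoff (NA adjusted p-values are excluded).
n_deg <- sum(res_4v2$padj < 0.05, na.rm = TRUE)
summary(res_4v2)
```

Checking that the contrast is written as c("grade", "4", "2") and that the factor levels match the sample table is a quick way to rule out a setup mistake before looking for biological explanations.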
Looks all normal to me. Be happy that you have a limited number of genes to focus on rather than thousands of DEGs and the entire transcriptome going wild with no idea which genes to start with. How many DEGs is that, about a hundred? That is OK, I'd say, nothing to worry about. If you expected many more DEGs (or altered gene functions in general) based on a strong phenotype, you might want to investigate mechanisms beyond transcription, such as posttranscriptional processes, that might drive the phenotype. If this is published data you can try recount, which is a Bioconductor package that provides raw counts for almost all RNA-seq studies in the SRA (at least from mouse/human afaik), processed by an independent method (not RSEM; I think it was called Rail-RNA and uses a custom bowtie2 version that is splice-aware). Anyway, this gives raw counts that you could use to check that the result is not a processing issue.
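A minimal sketch of pulling gene-level counts with recount is below. The project ID is a hypothetical placeholder that has to be replaced with the study's real accession, and the code needs network access, so treat it as an outline and not a tested recipe for this dataset.

```r
library(recount)

# Hypothetical placeholder: replace with the study's real SRA project ID.
project_id <- "SRPXXXXXX"

# Download the gene-level RangedSummarizedExperiment for the project.
download_study(project_id, type = "rse-gene")
load(file.path(project_id, "rse_gene.Rdata"))  # creates the object `rse_gene`

# recount stores base-pair coverage sums, so scale them to read-count-like
# values before passing them to a count-based tool such as DESeq2.
rse_scaled <- scale_counts(rse_gene)
counts_recount <- SummarizedExperiment::assay(rse_scaled, "counts")
```

Running the same DESeq2 design on `counts_recount` and comparing the number of DEGs with the rounded RSEM result would show whether the upstream processing makes a difference.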