Dear All,
I have a data set containing expression profiles (normalized RPKM data) from RNA-seq. The raw counts are not accessible. Which package is best suited for differential expression analysis in this case? I realize that DESeq2 cannot be used here. I know one can use the limma package, but are there other packages that may be more advantageous than limma when only RPKM values are available?
Thanks.
Thanks for the comment. Since comparing between groups is the basis of DE, I do not see a way to use RPKM alone for this purpose. How stringent are these limitations? Is limma-trend the only option for performing DE between groups in my case? I am comparing at least 10 different groups or conditions within the larger data set. I am also wondering whether there is a way to estimate the raw counts from RPKM alone and then use the standard DESeq package.
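You can reverse-engineer RPKM values, if you wish, provided you have all the relevant pieces of information:

RPKM = (Gene Reads x 10^9) / (Total Mapped Reads x Gene Length in bp)

So, if you want 'Gene Reads' (raw counts), then you need the RPKM value, the gene length in bp, and the total number of mapped reads for that sample:

Gene Reads = RPKM x Total Mapped Reads x Gene Length in bp / 10^9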
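As a rough illustration, the back-calculation could be sketched in Python as follows. It assumes the standard RPKM definition (reads per kilobase of gene per million mapped reads). The function names and the example numbers are hypothetical. If the RPKM values were normalized further after they were computed, the result is only an approximation of the original counts.

```python
def counts_to_rpkm(gene_reads, gene_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of gene per million mapped reads."""
    return gene_reads * 1e9 / (total_mapped_reads * gene_length_bp)


def rpkm_to_counts(rpkm, gene_length_bp, total_mapped_reads):
    """Invert the RPKM formula to estimate the read count for one gene.

    Needs the gene length (bp) and the sample's total mapped reads;
    without both, the counts cannot be recovered.
    """
    return rpkm * total_mapped_reads * gene_length_bp / 1e9


# Hypothetical example: one gene in one sample.
rpkm = counts_to_rpkm(gene_reads=500, gene_length_bp=2000,
                      total_mapped_reads=20_000_000)
estimated = rpkm_to_counts(rpkm, gene_length_bp=2000,
                           total_mapped_reads=20_000_000)
# Count-based tools expect integers, so round the estimate.
estimated_count = round(estimated)
```

The total mapped reads value is specific to each sample, so the conversion has to be done sample by sample.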
I do not have enough information to comment on the limitations of the limma-trend method used in this context - my sincere apologies.
Another possibility is to transform the RPKM values to Z-scores using the zFPKM package. On the Z-scale, you can use any parametric test to derive p-values.
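To give an idea of what such a transformation involves, here is a simplified Python sketch. It is not the code of the zFPKM package, which is an R package. It only illustrates the general idea of centering log2 expression values on the main peak of their distribution and scaling by a spread estimated from the values to the right of that peak. The function name, the rule-of-thumb bandwidth, and the use of the highest density peak are my own assumptions, and it would be applied to one sample at a time.

```python
import math
import statistics


def zscore_log2_expression(rpkm_values, grid_points=513):
    """Return a Z-like score per value (None for values <= 0).

    Simplified sketch: mu is the highest peak of a Gaussian kernel
    density of log2(RPKM); sigma is estimated from the values above mu
    by treating them as the right half of a normal distribution.
    """
    logs = [math.log2(v) for v in rpkm_values if v > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive values")
    spread = statistics.stdev(logs)
    if spread == 0:
        raise ValueError("all positive values are identical")

    # Rule-of-thumb bandwidth (an assumption, not taken from zFPKM).
    bandwidth = 1.06 * spread * len(logs) ** -0.2

    lo, hi = min(logs), max(logs)
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]

    def density(x):
        # Unnormalized Gaussian kernel density; enough to locate the peak.
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in logs)

    mu = max(grid, key=density)

    right = [v for v in logs if v > mu]
    if not right:
        raise ValueError("no values above the density peak")
    # For a half-normal, E[x - mu | x > mu] = sigma * sqrt(2 / pi).
    sigma = (statistics.fmean(right) - mu) * math.sqrt(math.pi / 2)

    return [(math.log2(v) - mu) / sigma if v > 0 else None
            for v in rpkm_values]


# Hypothetical RPKM values for one sample (0.0 = not detected).
sample = [8.0, 16.0, 16.0, 32.0, 32.0, 32.0, 64.0, 64.0, 128.0, 0.0]
z = zscore_log2_expression(sample)
```

Values at the peak of the distribution end up near zero, higher expression gives positive scores, and undetected genes are left as None rather than being assigned a score.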