Hello all,
I am trying to analyse differential expression in a dataset for which I only have RPKM values. I usually use the R/BioC environment for RNA-seq analysis, and have read in various BioC documentation that feeding RPKM values into packages such as DESeq is strongly discouraged. From my understanding, this is because the models used within the DESeq pipeline assume a negative binomial distribution of raw read counts, and RPKM values don't fit these assumptions at all (happy for anyone to correct my understanding there, though).
So, I am trying to consider my options. Has anyone attempted something similar? I wondered if the best thing to do would be to take the data out of the BioC environment entirely and analyse it with an appropriate GLM in base R.
Any ideas would be appreciated!
Many thanks, Harriet
A GLM would seem to be the most appropriate approach. Obviously this isn't ideal compared with a count-based analysis, but with only RPKM values available it seems the only option.
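As a rough sketch of what that could look like, here is the simplest case I can think of: a Gaussian linear model on log2(RPKM + 1) with a single two-group factor, fitted one gene at a time. This is only one possible choice of model, not a recommendation from the BioC documentation. The pseudo-count of 1, the function names and the example numbers are all my own assumptions. It is written in plain Python so it is self-contained; as far as I know, `lm(log2(rpkm + 1) ~ group)` in base R fits the same model.

```python
import math
from statistics import mean, variance


def log2_rpkm(values, pseudo=1.0):
    """Log-transform RPKM values; the pseudo-count (an assumption) avoids log2(0)."""
    return [math.log2(v + pseudo) for v in values]


def two_group_fit(group_a, group_b, pseudo=1.0):
    """Fit log2(RPKM + pseudo) ~ group for one gene (Gaussian, identity link).

    With a single two-level factor, the group coefficient is the difference
    in mean log2 expression, and its t-statistic uses the pooled residual
    variance, exactly as in ordinary least squares.
    """
    a = log2_rpkm(group_a, pseudo)
    b = log2_rpkm(group_b, pseudo)
    na, nb = len(a), len(b)
    df = na + nb - 2

    log2fc = mean(b) - mean(a)  # group coefficient (B relative to A)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))

    # No residual variance means the t-statistic is undefined.
    t_stat = log2fc / se if se > 0 else float("nan")
    return {"log2fc": log2fc, "t": t_stat, "df": df}


if __name__ == "__main__":
    # Hypothetical RPKM values for one gene, three replicates per condition.
    control = [0.0, 1.0, 3.0]
    treated = [3.0, 7.0, 15.0]
    print(two_group_fit(control, treated))
```

The sketch stops at the t-statistic because the Python standard library has no t distribution. To get p-values you would need a statistics package, and with thousands of genes you would also need a multiple-testing correction. Whether a Gaussian model on the log scale is adequate for a given dataset is something to check rather than assume.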