Hi,
I am using publicly available data for my research. It is increasingly common for papers to report gene expression as FPKM values rather than raw counts (I am also very curious why they do that). The raw FASTQ files are available, but obtaining raw counts from them requires a significant amount of effort, so I would prefer to use the provided FPKM values.
Are there any tutorials or recommended tools to identify differentially expressed genes from FPKM gene expression data?
I would be grateful for any assistance.
Please have a look at this link, where you can find a very concise and easy explanation of the different normalization methods for RNA-seq. Note that FPKM normalization is not appropriate for DE analysis; it is only suited to comparing gene expression within the same sample. For DE analysis, the best way to go is median of ratios, the default normalization method used by DESeq2 (or TMM normalization from edgeR). Both tools expect raw counts as input, not FPKM values.
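To make the median-of-ratios idea a bit more concrete, here is a rough Python sketch of how per-sample size factors can be computed from raw counts. It is only an illustration of the principle, not DESeq2's actual code: the function name `size_factors` and the toy count matrix are made up for this example, and a real analysis should use DESeq2 or edgeR directly.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: list of genes, each a list of raw counts, one per sample.
    """
    n_samples = len(counts[0])
    # Use only genes with a nonzero count in every sample,
    # because the log of zero is undefined.
    log_rows = [
        [math.log(c) for c in row]
        for row in counts
        if all(c > 0 for c in row)
    ]
    # Log of the geometric mean of each gene across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene in sample j to its geometric mean.
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        # The size factor is the median ratio for that sample.
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy data: 4 genes x 2 samples, where sample 2 was
# sequenced roughly twice as deeply as sample 1.
counts = [
    [100, 200],
    [50, 100],
    [10, 20],
    [0, 5],
]

factors = size_factors(counts)
# Divide each raw count by the size factor of its sample.
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
```

In this toy case the second sample's size factor comes out twice the first one's, so the first three genes end up with equal normalized values in both samples. Note that the calculation starts from raw counts, which is why FPKM values cannot simply be plugged in.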
Thank you very much for the link; it is a very good summary, and now all those normalization methods make sense.