There are efficient methods for normalizing read counts, such as the median of ratios method used by default in DESeq2. It would seem more correct to calculate FPKM (or TPM, or RPKM) from normalized read counts rather than from raw read counts. FPKM calculated from normalized read counts, I think, would be an excellent metric of gene expression, because it would account for gene length and also correct for the influence of highly expressed genes. However, people either calculate FPKM from raw read counts, or perform the median of ratios correction without converting the result to FPKM. Why is that so? Am I missing something?
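To make the idea concrete, here is a minimal sketch of what "FPKM from median-of-ratios normalized counts" could look like. It is not DESeq2's code: the function names are hypothetical, and the choice of a common "per million" scale (the geometric mean of the raw library sizes) is an assumption made for illustration only.

```python
import numpy as np

def size_factors_median_of_ratios(counts):
    """Per-sample size factors from a genes x samples matrix of raw counts."""
    counts = np.asarray(counts, dtype=float)
    # Keep only genes with a nonzero count in every sample so the log is defined.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene log geometric mean across samples acts as a pseudo-reference sample.
    log_reference = log_counts.mean(axis=1, keepdims=True)
    # Size factor = median over genes of the ratio sample / reference.
    return np.exp(np.median(log_counts - log_reference, axis=0))

def fpkm_from_normalized(counts, lengths_bp):
    """FPKM-like values computed from size-factor-normalized counts."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    normalized = counts / size_factors_median_of_ratios(counts)
    # ASSUMPTION: one shared library size for all samples, taken here as the
    # geometric mean of the raw column sums. Other choices are possible.
    common_library_size = np.exp(np.mean(np.log(counts.sum(axis=0))))
    return normalized / (common_library_size / 1e6) / lengths_kb

# Toy data: sample 2 is sample 1 sequenced twice as deeply.
counts = np.array([[10, 20], [100, 200], [50, 100]])
lengths_bp = np.array([1000, 2000, 500])
print(size_factors_median_of_ratios(counts))
print(fpkm_from_normalized(counts, lengths_bp))
```

In this toy case the two columns of the result are identical, because the only difference between the samples is sequencing depth, which the size factors absorb.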
Scientists stopped using FPKM in general because better measures are available:
FPKM not suitable for DE?
For anyone interested, this is the first paper to point out that RPKM (and, by extension, FPKM) normalization is inconsistent across samples due to variations in total RNA output and transcript composition (PMID 22872506).
The authors introduced TPM as an alternative. That was in 2012, not long after the introductions of RPKM in 2008 (PMID 18516045) and FPKM in 2010 (PMID 20436464).
Bo Li and Colin Dewey (the developers of RSEM) introduced TPM in 2009, even before the Cufflinks paper introduced FPKM. https://academic.oup.com/bioinformatics/article/26/4/493/243395
However, I think that 2012 paper you linked to formalized the relationship between RPKM and TPM.
Thank you for correcting me. I've updated the post.
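For readers following the thread, the RPKM/TPM relationship mentioned above can be illustrated with a small sketch. The function names and the toy numbers are hypothetical; the point is only that RPKM values sum to a different total in each sample when transcript composition differs, whereas TPM (RPKM rescaled so each sample sums to one million) does not.

```python
import numpy as np

def rpkm(counts, lengths_bp):
    """RPKM from a genes x samples matrix of raw counts and gene lengths in bp."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    reads_per_million = counts.sum(axis=0) / 1e6
    return counts / reads_per_million / lengths_kb

def tpm_from_rpkm(rpkm_values):
    """Rescale each sample's RPKM values so that they sum to one million."""
    rpkm_values = np.asarray(rpkm_values, dtype=float)
    return rpkm_values / rpkm_values.sum(axis=0) * 1e6

# Toy data: both samples have 200 reads in total, but a different composition
# (sample 2 has most of its reads on the long gene).
toy_counts = np.array([[100, 10], [100, 190]])
toy_lengths_bp = np.array([1000, 10000])

toy_rpkm = rpkm(toy_counts, toy_lengths_bp)
toy_tpm = tpm_from_rpkm(toy_rpkm)
print(toy_rpkm.sum(axis=0))  # per-sample RPKM totals differ
print(toy_tpm.sum(axis=0))   # per-sample TPM totals are both one million
```

Note that TPM only fixes the per-sample total; it does not by itself correct for composition effects the way the median of ratios method is meant to.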