Why do we usually use normalized values from RNA-Seq (FPKM, RPKM, etc.)?
7.2 years ago
ebrudermanver ▴ 100

I don't have much experience with RNA-Seq, but I see that data is usually published as FPKM values rather than raw counts. What is the reason for that? Is it only so that we can model the values with a log-Gaussian distribution rather than a discrete distribution such as the Poisson or negative binomial? Or does it serve to make the data more accurate and reliable?

RNA-Seq • 3.6k views
7.2 years ago
Michael 55k

The reason for FPKM is mostly historical as there are practically only disadvantages in distributing the data this way.

  • There are several posts and publications showing that FPKM is inferior to other units.
  • FPKM is not directly compatible with most DE packages.
  • Providing raw counts would instead allow anyone to compute the transformation they wanted (CPM, TPM, FPKM), while the FPKM transformation is not easily reversible.
  • FPKM carries over biases and errors from the gene prediction; it is especially unsuitable for draft genomes, where exons are often not well annotated.
  • FPKM values need to be represented as floating-point numbers, which introduces unnecessary rounding errors and possibly extra data volume, while counts can be represented as integers.
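
To illustrate the point about raw counts being the more flexible starting format, here is a minimal sketch that derives CPM, FPKM, and TPM from one vector of counts. The function names and the toy data are hypothetical, and the sketch assumes the library size is simply the sum of the counts shown (real pipelines may use the total number of mapped fragments instead).

```python
def cpm(counts):
    """Counts per million: scale each count by the library size."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]


def fpkm(counts, lengths_bp):
    """Fragments per kilobase of feature per million fragments.

    Assumption: library size is taken as the sum of the counts given.
    """
    total = sum(counts)
    return [c / (length / 1e3) / (total / 1e6)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r / rate_sum * 1e6 for r in rates]


# Hypothetical toy data: three genes in a single sample.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]

print(cpm(counts))
print(fpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

Going the other way, from FPKM back to counts, needs the library size and the exact feature lengths that were used, and these are often not published alongside the values.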
7.2 years ago

R(F)PKM/TPM values normalize read counts by library size (the total number of reads in a given RNA-Seq experiment) and by the length of the feature (gene/transcript). But remember that commonly used software for differential expression analysis (DESeq2/edgeR) takes raw counts as input rather than normalized values, because it performs its own internal normalization.
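
For reference, the usual definitions can be written as follows. The symbols are my own notation, not from the thread: $c_i$ is the number of reads (or fragments) assigned to feature $i$, $L_i$ its length in base pairs, and $N$ the total number of mapped reads (or fragments) in the library.

```latex
\mathrm{FPKM}_i = \frac{c_i}{(L_i / 10^{3}) \, (N / 10^{6})} = \frac{10^{9} \, c_i}{L_i \, N},
\qquad
\mathrm{TPM}_i = 10^{6} \cdot \frac{c_i / L_i}{\sum_j c_j / L_j}
```

RPKM has the same form as FPKM but counts reads instead of fragments.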

7.2 years ago
ebrudermanver ▴ 100

Okay, I just found a link that says FPKM makes it possible to compare Gene A to Gene B even if they have different lengths, and to compare Sample 1 to Sample 2 even if they have different library sizes.
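
A small worked example of that claim, with made-up numbers: the raw counts below differ by gene and by sample, but only because of gene length and sequencing depth, so the FPKM values come out identical. The gene lengths, counts, and library sizes are all hypothetical.

```python
def fpkm(count, length_bp, library_size):
    """FPKM for one gene: count per kilobase of gene per million fragments."""
    return count / (length_bp / 1e3) / (library_size / 1e6)


# Hypothetical library sizes: Sample 2 was sequenced twice as deeply.
lib_s1, lib_s2 = 10_000_000, 20_000_000

# Hypothetical genes: Gene B is four times longer than Gene A.
gene_a = {"length_bp": 1000, "s1": 500, "s2": 1000}
gene_b = {"length_bp": 4000, "s1": 2000, "s2": 4000}

a_s1 = fpkm(gene_a["s1"], gene_a["length_bp"], lib_s1)
a_s2 = fpkm(gene_a["s2"], gene_a["length_bp"], lib_s2)
b_s1 = fpkm(gene_b["s1"], gene_b["length_bp"], lib_s1)
b_s2 = fpkm(gene_b["s2"], gene_b["length_bp"], lib_s2)

print(a_s1, a_s2, b_s1, b_s2)
```

This shows only the idealized case. As the answers above point out, it does not make FPKM a suitable input for differential expression tools.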
