Differences between Cufflinks results and home-made FPKM values
6.9 years ago
nicolas.hipp ▴ 10

Hi everyone,

I am having trouble understanding how Cufflinks calculates FPKM.

There are a lot of posts on this issue. I read that FPKM = number of mapped fragments / (gene length / 1000) / (library size / 10^6).

I have output from featureCounts, which gives me the raw counts and the length of each gene. When I do the calculation with these values, I cannot reproduce the Cufflinks output for the same .bam file, e.g.:

Gene    Cufflinks FPKM    Raw count    Gene length    Manual FPKM
BACH2   31.6243           26315        9786           30.04520196
GAPDH   3490.14           358474       2981           1343.608215

Library size: 89500000 (same for both genes)
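For reference, the manual calculation can be written as a minimal Python sketch. It applies only the formula quoted above to the raw counts, gene lengths and library size from the table. The function and variable names are illustrative, and this is not how Cufflinks itself computes FPKM.

```python
def fpkm(fragments, gene_length_bp, library_size):
    """Fragments per kilobase of gene per million fragments in the library."""
    return fragments / (gene_length_bp / 1_000) / (library_size / 1_000_000)


# Library size from the table above
LIBRARY_SIZE = 89_500_000

# gene -> (raw count, gene length in bp), both taken from the featureCounts output
genes = {
    "BACH2": (26_315, 9_786),
    "GAPDH": (358_474, 2_981),
}

manual = {
    name: fpkm(count, length, LIBRARY_SIZE)
    for name, (count, length) in genes.items()
}

for name, value in manual.items():
    print(f"{name}: {value:.4f}")
```

This reproduces the "manual FPKM" column, so the arithmetic itself is consistent; the discrepancy with Cufflinks has to come from the inputs (counts and lengths).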

Interestingly, BACH2 has similar FPKM values, but I don't understand why the two values for GAPDH are so different. Does anybody know how this is possible? The only thing I can see is that the gene length for GAPDH is wrong... and I can't find anything about how featureCounts calculates this length (is it the ORF size? how does it deal with isoforms?).

If anybody has an idea :)

Thanks a lot, nicolas

RNA-Seq • 2.1k views
6.9 years ago

The values you use and those used by Cufflinks will be different for both "number of mapped fragments" and "length of gene". Cufflinks tries to handle multimappers and also uses more of an "effective length" for each gene, which will likely vary by sample.

If you look at a single gene in your GTF file then I expect you'll quickly realize what featureCounts is doing to get a gene length.
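As a rough illustration of that check, here is a minimal Python sketch. It assumes the gene length reported by featureCounts is the number of bases covered by at least one exon of the gene, with overlapping exons from different isoforms counted once. The exon coordinates below are hypothetical, and `union_length` is an illustrative helper, not part of featureCounts.

```python
def union_length(exons):
    """Number of bases covered by at least one exon.

    Coordinates are 1-based and inclusive, as in GTF, so a single exon
    spans end - start + 1 bases.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(exons):
        if current_end is None or start > current_end:
            # No overlap with the current block: close it and start a new one
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlap: extend the current block if this exon reaches further
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


# Hypothetical exon records for one gene, pooled across two isoforms
exons = [(100, 199), (150, 249), (400, 499), (400, 499)]

naive = sum(end - start + 1 for start, end in exons)  # counts shared bases twice
merged = union_length(exons)                          # counts each base once

print(naive, merged)
```

Summing every exon line of a gene double-counts exons shared between isoforms, and taking end minus start without the +1 undercounts each exon by one base, so either can shift a hand-computed length.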


Oh yes, thanks for this explanation, I forgot about the multi-mapper parameters... sorry for the naive question.

I am still trying to understand how the length is calculated. When I extract the information for the BACH2 gene, for example, I find that the GTF file contains the exon, UTR, gene and "transcript" entries. By summing the differences between the start and end positions, I guess that the gene length is the sum of the lengths of all transcribed exons? If that is the case, I don't get the same value as Cufflinks does... (800 bp difference between the two).

Thanks a lot nicolas


Cufflinks is taking the weighted average of expressed transcript lengths (weighted by their relative expression), or something close to that.
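A minimal Python sketch of that idea, using hypothetical isoform lengths and abundances. It only illustrates an abundance-weighted average of transcript lengths; the actual Cufflinks model is more involved and is not reproduced here.

```python
def weighted_gene_length(transcripts):
    """Average transcript length, weighted by each transcript's abundance.

    `transcripts` is a list of (length_bp, abundance) pairs.
    """
    total_abundance = sum(abundance for _, abundance in transcripts)
    if total_abundance == 0:
        raise ValueError("no expressed transcripts")
    weighted_sum = sum(length * abundance for length, abundance in transcripts)
    return weighted_sum / total_abundance


# Hypothetical gene with two isoforms: (length in bp, relative abundance)
isoforms = [(3000, 10.0), (1200, 90.0)]

gene_length = weighted_gene_length(isoforms)
print(gene_length)
```

When the short isoform dominates, the weighted length sits far below the length of all exons combined, which raises the FPKM for the same count. The length also changes from sample to sample as isoform proportions change.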


Ok thanks for the help ;)
