Hi everyone,
I'm having trouble understanding how Cufflinks calculates FPKM:
There are lots of posts on this issue. I read that FPKM = number of mapped fragments / (length of the gene / 1000) / (size of the library / 10^6)
I have output from featureCounts, which gives me raw counts and the length of the gene. When I do the calculation with these values, I cannot reproduce the Cufflinks output for the same .bam file, e.g.:
Gene    FPKM (Cufflinks)   raw.count   Gene Length   Manual FPKM
BACH2   31.6243            26315       9786          30.04520196
GAPDH   3490.14            358474      2981          1343.608215
Size of the library: 89500000
Interestingly, BACH2 has similar values for FPKM, but I don't understand why the two values for GAPDH are so different. Does anybody know how this is possible? The only thing I can see is that the gene length for GAPDH is wrong... and I can't find anything about how featureCounts calculates this Length (is it the ORF size? how does it deal with isoforms?)
If anybody has an idea :)
Thanks a lot, nicolas
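For anyone who wants to check the arithmetic, here is a minimal Python sketch of the manual calculation from the question. It only applies the formula quoted above to the featureCounts numbers in the table; the function and variable names are made up for illustration, and it does not attempt to reproduce what Cufflinks does internally.

```python
def fpkm(fragments, gene_length_bp, library_size):
    """FPKM = fragments / (gene length in kb) / (library size in millions)."""
    return fragments / (gene_length_bp / 1_000) / (library_size / 1_000_000)


# Library size as quoted in the question.
LIBRARY_SIZE = 89_500_000

# (raw count, gene length in bp) from the featureCounts columns of the table.
genes = {
    "BACH2": (26315, 9786),
    "GAPDH": (358474, 2981),
}

manual = {
    name: fpkm(count, length, LIBRARY_SIZE)
    for name, (count, length) in genes.items()
}

for name, value in manual.items():
    print(f"{name}: {value:.4f}")
```

This gives the same "Manual FPKM" values as in the table, so the gap to the Cufflinks numbers comes from the inputs (which reads are counted, which length is used) and not from the arithmetic.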
Oh yes, thanks for this explanation, I forgot the multi-mapper parameters... sorry for the naive question.
I'm still trying to understand how the length is calculated. When I extract the information for the BACH2 gene, for example, I find that the GTF file contains the length of each exon and UTR, the gene length, and "transcript" values. By summing the differences between the start and end positions, I guess that the gene length is the sum of the lengths of all transcribed exons? If that is the case, I don't get the same value as Cufflinks does... (800 bp between the two).
Thanks a lot, nicolas
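On the length question, a small sketch may help. As far as I know, featureCounts reports Length as the number of non-overlapping exonic bases per gene, so exons shared between isoforms are counted once; please treat that as an assumption to verify against the Subread documentation. The exon coordinates below are hypothetical, and the helper `union_length` is written just for this example. GTF coordinates are 1-based and inclusive, so each interval has length end - start + 1.

```python
def union_length(exons):
    """Number of bases covered by at least one exon.

    `exons` is a list of (start, end) pairs using GTF-style
    1-based, inclusive coordinates on the same chromosome.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(exons):
        if current_end is None or start > current_end + 1:
            # Close the previous merged block, if any, and start a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent exon: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


# Hypothetical exons from two isoforms of one gene (not real BACH2 data).
exons = [(100, 199), (150, 249), (400, 499), (400, 449)]

naive_sum = sum(end - start + 1 for start, end in exons)
merged = union_length(exons)

print("sum of all exon rows:", naive_sum)
print("non-overlapping exonic bases:", merged)
```

Summing every exon row of the GTF counts shared bases several times, which is one way to end up with a length that matches neither tool.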
Cufflinks is taking the weighted average of expressed transcript lengths (weighted by their relative expression), or something close to that.
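To illustrate the idea of an expression-weighted average length, here is a rough Python sketch. The isoform lengths and abundances are invented, `weighted_mean_length` is a name made up for this example, and this is only the plain weighted mean described above, not Cufflinks' actual implementation.

```python
def weighted_mean_length(transcripts):
    """Average transcript length weighted by relative expression.

    `transcripts` is a list of (length_bp, expression) pairs.
    """
    total_expression = sum(expr for _, expr in transcripts)
    if total_expression == 0:
        raise ValueError("no expressed transcripts")
    weighted_sum = sum(length * expr for length, expr in transcripts)
    return weighted_sum / total_expression


# Hypothetical gene with a dominant long isoform and a minor short one.
isoforms = [(3000, 90.0), (1000, 10.0)]

gene_length = weighted_mean_length(isoforms)
print(gene_length)
```

With these made-up numbers the result sits close to the length of the dominant isoform, so a gene-level length obtained this way changes with which isoforms are expressed in the sample, unlike a fixed length taken from the annotation.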
Ok thanks for the help ;)