Problem With FPKM In Cufflinks
11.3 years ago
Bio_ysl ▴ 20

Hello,

I'm having trouble understanding the Cufflinks FPKMs. I use Cufflinks to obtain FPKMs from a ".sam" file produced by a de novo transcriptome assembly performed with MIRA3. Cufflinks gives me as output the files "isoforms.fpkm_tracking" and "genes.fpkm_tracking"; the FPKM values are the same in both files. However, when I apply the formula FPKM = 10^9 * numreads_of_the_fragment / (total_assembled_reads * length_of_the_fragment), the values are totally different, with no apparent correlation...

Here is an example (number of reads assembled: 61725):

contig  num.reads  length  FPKM (formula applied)  FPKM (Cufflinks)
c1              7     487          232.8670171315            402.01
c2              6     446          217.9492069373            399.86
c3              5     486          166.6758338375            287.15
c4              6     489          198.7839392516         342154.00
c5              7     712          159.2784232346         223638.00

Can somebody explain these differences to me? What formula does Cufflinks use? Thanks!
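For reference, the formula above can be written as a short Python sketch. It only reproduces the "formula applied" column of the table; it is not what Cufflinks computes internally. The function name `naive_fpkm` and the variable names are made up for illustration.

```python
def naive_fpkm(num_reads: int, length_bp: int, total_reads: int) -> float:
    """Naive FPKM: reads per kilobase of contig per million assembled reads.

    Equivalent to 10^9 * num_reads / (total_reads * length_bp).
    """
    kilobases = length_bp / 1_000        # contig length in kb
    millions = total_reads / 1_000_000   # library size in millions of reads
    return num_reads / (kilobases * millions)


TOTAL_READS = 61725  # number of assembled reads, taken from the post

# (num_reads, length) per contig, taken from the table in the post
contigs = {
    "c1": (7, 487),
    "c2": (6, 446),
    "c3": (5, 486),
    "c4": (6, 489),
    "c5": (7, 712),
}

for name, (num_reads, length_bp) in contigs.items():
    print(f"{name}\t{naive_fpkm(num_reads, length_bp, TOTAL_READS):.4f}")
```

Dividing by kilobases and by millions separately makes it easier to see where the 10^9 factor (1000 * 1000000) comes from.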

cufflinks fpkm rpkm differential-expression transcriptome • 4.3k views

I think that you meant 10^6 instead of 10^9. Also, check the normalization method you used with cuffdiff. If you used the default, you actually applied a further normalization (which is better than FPKM). You can read the manual page: http://cufflinks.cbcb.umd.edu/manual.html#library_norm_meth


I think 10^9 is okay, isn't it 1000*1000000?

By the way, there should be SOME correlation between the Cufflinks RPKM and the one you calculate yourself... If not, something's wrong.
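One way to check for correlation is to compare the two columns of the table directly. The sketch below is only an illustration using the five values from the post; the helper `pearson` is written by hand here, and five contigs are far too few to draw conclusions from.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Values for c1..c5, copied from the table in the question
manual = [232.8670171315, 217.9492069373, 166.6758338375,
          198.7839392516, 159.2784232346]
cufflinks = [402.01, 399.86, 287.15, 342154.00, 223638.00]

# Ratio Cufflinks / manual for each contig
ratios = [c / m for c, m in zip(cufflinks, manual)]
print("ratios:", [round(r, 2) for r in ratios])

print("Pearson r, all five contigs:", pearson(manual, cufflinks))
print("Pearson r, c1..c3 only:   ", pearson(manual[:3], cufflinks[:3]))
```

For c1 to c3 the ratio stays between roughly 1.7 and 1.8, while the Cufflinks values for c4 and c5 are several orders of magnitude larger, so those two rows dominate any correlation computed over the whole table.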


Yes, 10^9 should be correct... If there is any correlation, I couldn't see it, and this is what leaves me with so many doubts...


Ooops. Yes, that's correct...

11.3 years ago

Did you read the cufflinks paper? It's a lot more complicated than that... Not sure it's really possible to put it into a text box, even if I did claim to understand it. Supplementary methods section 3, lots of math.

11.3 years ago

Two things that may be responsible for the discrepancy:

  1. Handling of multimapped reads. I believe that Cufflinks evenly divides a read between all places that it maps. For example, a read that is mapped to 4 locations is counted as .25 reads at each individual locus.
  2. If any of your transcripts overlap (i.e. one gene has multiple transcripts), then Cufflinks does some sort of deconvolution to best determine which transcript a read belongs to.

Your read counting program probably addresses these 2 cases differently than Cufflinks.
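To illustrate point 1, here is a minimal sketch of even splitting of multimapped reads, assuming the behaviour described above (each alignment of a read mapped to n locations counts as 1/n). The function `fractional_counts` and the toy alignments are hypothetical; this is not Cufflinks code, and it does not attempt the deconvolution mentioned in point 2.

```python
from collections import Counter, defaultdict


def fractional_counts(alignments):
    """Sum per-contig read counts, weighting each alignment by 1/n,
    where n is the number of alignments reported for that read.

    alignments: list of (read_id, contig) pairs, one pair per alignment.
    """
    hits_per_read = Counter(read_id for read_id, _ in alignments)
    counts = defaultdict(float)
    for read_id, contig in alignments:
        counts[contig] += 1.0 / hits_per_read[read_id]
    return dict(counts)


# Toy data (made up): r1 maps once, r2 maps twice, r3 maps four times
alignments = [
    ("r1", "c1"),
    ("r2", "c1"), ("r2", "c2"),
    ("r3", "c1"), ("r3", "c2"), ("r3", "c3"), ("r3", "c4"),
]

for contig, count in sorted(fractional_counts(alignments).items()):
    print(f"{contig}\t{count:.2f}")
```

A program that counts every alignment as one full read would report 3 reads for c1 here instead of 1.75, which is the kind of difference that then carries through into the FPKM.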


Thank you for the answer! But in this case, we are talking about a de novo assembly done with MIRA3, and I'm not sure whether a read can be assembled into more than one contig...
