Question

How to deal with FPKM values for the isoforms of a particular gene in RNA-seq

2

10.3 years ago

ancient_learner ▴ 680

This might be trivial, but being new to RNA-seq data, I am confused about how to assign an FPKM value to a gene that has 3 or 4 isoforms.

I have downloaded analysed RNA-seq data from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52450 and noticed that transcripts belonging to the same gene (isoforms) have different FPKM values, which is expected. For my analysis, is it OK to sum the FPKM values of all isoforms to represent that gene's expression, or should I keep the values as they are?

example:

geneA isoform1 2.98
geneA isoform2 5.98
geneA isoform3 2.43

Can I report it as geneA 11.39 (2.98 + 5.98 + 2.43)?

isoforms RNA-Seq fpkm • 6.2k views

updated 3.0 years ago by Ram 44k • written 10.3 years ago by ancient_learner ▴ 680

Ram · Answer 1 · 2014-07-01

2

10.3 years ago

Sukhi Singh 11k

In principle, summing should work, but you can divide by the number of isoforms, or by the isoform lengths, to get a normalized value, depending on what you want. That would be an averaged gene expression value. I checked the files you are using; generally, there is another file named gene_exp.diff, generated by the Tuxedo suite, which has the expression value per gene, so you don't have to calculate it yourself.
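To make the two options concrete (a plain sum versus a sum divided by the number of isoforms), here is a minimal Python sketch using the three isoform values from the question. The function name `summarize_by_gene` and the record layout are assumptions made for illustration; they are not part of any Tuxedo tool.

```python
from collections import defaultdict

# Isoform-level FPKM values taken from the example in the question.
isoform_fpkm = [
    ("geneA", "isoform1", 2.98),
    ("geneA", "isoform2", 5.98),
    ("geneA", "isoform3", 2.43),
]


def summarize_by_gene(records):
    """Return {gene: (summed FPKM, mean FPKM)} for (gene, isoform, fpkm) records.

    Hypothetical helper, written only to illustrate the arithmetic.
    """
    per_gene = defaultdict(list)
    for gene, _isoform, fpkm in records:
        per_gene[gene].append(fpkm)
    return {
        gene: (sum(values), sum(values) / len(values))
        for gene, values in per_gene.items()
    }


for gene, (total, mean) in summarize_by_gene(isoform_fpkm).items():
    # Prints: geneA  sum=11.39  mean=3.80
    print(f"{gene}\tsum={total:.2f}\tmean={mean:.2f}")
```

Which of the two numbers is appropriate depends on the downstream analysis; the sketch only shows how each is computed.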

For a more detailed answer, see the post "How do I get one FPKM value per gene?"

There is also raw code (a tar.gz archive) provided by user mgogol. Use it with discretion, after reading everything, as it assumes you have some output files from cufflinks/cuffdiff.

updated 3.0 years ago by Ram 44k • written 10.3 years ago by Sukhi Singh 11k

2

+1 for using gene_exp.diff, except I think genes.fpkm_tracking is the file you would typically look for if you ran cufflinks on your own (to get FPKM values for each sample).

updated 3.0 years ago by Ram 44k • written 10.3 years ago by Charles Warden 8.3k

0

Yes, Charles, you are right. gene_exp.diff is the output of cuffdiff from the differential expression (DE) gene analysis, though it does report raw FPKM values.

updated 3.0 years ago by Ram 44k • written 10.3 years ago by Sukhi Singh 11k

0

Hi, thank you for your suggestions. I have downloaded the differential expression testing file, in which value_1 and value_2 correspond to the expression values at 2 different stages. Again, the entries seem to be transcripts rather than genes: I annotated the RefSeq IDs with gene names and checked. Since you said it would be fine to take the average of the FPKM values, I think I will go ahead with that.
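For anyone doing the same thing, here is a rough Python sketch of collapsing transcript-level value_1/value_2 rows to one row per gene after annotating the RefSeq IDs. The RefSeq IDs, the gene names, the numbers, the `test_id` key, and the helper `collapse_to_genes` are all invented for illustration; only the value_1 and value_2 names come from the file described above.

```python
from collections import defaultdict

# Hypothetical RefSeq-to-gene annotation (IDs and names are made up).
refseq_to_gene = {
    "NM_000001": "geneA",
    "NM_000002": "geneA",
    "NM_000003": "geneB",
}

# Hypothetical transcript-level rows; "test_id" is assumed to hold the RefSeq ID.
rows = [
    {"test_id": "NM_000001", "value_1": 2.0, "value_2": 4.0},
    {"test_id": "NM_000002", "value_1": 6.0, "value_2": 1.0},
    {"test_id": "NM_000003", "value_1": 1.5, "value_2": 3.0},
]


def collapse_to_genes(rows, id_to_gene, average=False):
    """Combine transcript rows into {gene: (value_1, value_2)}.

    Values are summed per gene; with average=True the sums are divided
    by the number of transcripts annotated to that gene.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for row in rows:
        gene = id_to_gene.get(row["test_id"])
        if gene is None:
            continue  # skip transcripts without a gene annotation
        acc = totals[gene]
        acc[0] += row["value_1"]
        acc[1] += row["value_2"]
        acc[2] += 1
    collapsed = {}
    for gene, (v1, v2, n) in totals.items():
        divisor = n if average else 1
        collapsed[gene] = (v1 / divisor, v2 / divisor)
    return collapsed


print(collapse_to_genes(rows, refseq_to_gene))                # summed per gene
print(collapse_to_genes(rows, refseq_to_gene, average=True))  # averaged per gene
```

Keeping both modes behind one flag makes it easy to compare the summed and averaged values for the same genes before committing to either.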

updated 3.0 years ago by Ram 44k • written 10.3 years ago by ancient_learner ▴ 680