This may be a trivial question, but being new to RNA-seq data I am confused about how to assign an FPKM value to a gene that has 3 or 4 isoforms.
I have downloaded analysed RNA-seq data from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52450 and noticed that transcripts belonging to the same gene (isoforms) have different FPKM values, which is expected. For my analysis, is it OK to sum the FPKM values of all the isoforms to represent that gene's expression, or should I keep the values as they are?
example:
geneA isoform1 2.98
geneA isoform2 5.98
geneA isoform3 2.43
Can I make it as geneA 11.39 (2.98+5.98+2.43)?
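To make the arithmetic in the example concrete, here is a minimal Python sketch that collapses isoform-level FPKM values into one value per gene by summing. The record layout (gene, isoform, FPKM) and the variable names are assumptions made for illustration; only the three geneA values come from the example above.

```python
from collections import defaultdict

# Isoform-level records from the example above: (gene, isoform, FPKM).
isoform_fpkm = [
    ("geneA", "isoform1", 2.98),
    ("geneA", "isoform2", 5.98),
    ("geneA", "isoform3", 2.43),
]

# Sum the FPKM of every isoform that belongs to the same gene.
gene_fpkm = defaultdict(float)
for gene, _isoform, fpkm in isoform_fpkm:
    gene_fpkm[gene] += fpkm

for gene, total in gene_fpkm.items():
    print(f"{gene}\t{total:.2f}")
```

Because FPKM is already normalised for transcript length and sequencing depth, the isoform values are on a common scale, which is why a per-gene sum is a natural way to collapse them.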
+1 for using gene_exp.diff, except I think genes.fpkm_tracking is the file that you would typically look for if you ran cufflinks on your own (to get FPKM values for each sample).

Yes, Charles, you are right. gene_exp.diff is the output of cuffdiff when doing the DE gene analysis, though it does report raw FPKM values.

Hi, thank you for your suggestions. I have downloaded the differential expression testing file, in which value_1 and value_2 correspond to the expression values for genes at 2 different stages. Again, they seem to be reported per transcript. I have annotated the RefSeq IDs with gene names and checked. Since you said it would be fine to take the average of the FPKM values, I think I will go ahead with that.
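As a rough sketch of the plan described in the last comment, the snippet below reads a small tab-delimited table and averages value_1 and value_2 over all rows that share a gene name. The column names value_1 and value_2 are the ones mentioned above; the test_id and gene columns, the IDs, and all the numbers are made up for illustration and are not taken from the GEO file.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for a transcript-level differential expression table.
rows = [
    ("test_id", "gene", "value_1", "value_2"),
    ("NM_0001", "geneA", "2.0", "4.0"),
    ("NM_0002", "geneA", "4.0", "8.0"),
    ("NM_0003", "geneB", "1.5", "0.5"),
]
table = "\n".join("\t".join(r) for r in rows)

# Collect the per-transcript values under each gene name.
per_gene = defaultdict(lambda: {"value_1": [], "value_2": []})
for record in csv.DictReader(io.StringIO(table), delimiter="\t"):
    per_gene[record["gene"]]["value_1"].append(float(record["value_1"]))
    per_gene[record["gene"]]["value_2"].append(float(record["value_2"]))

# Average each stage's values across the transcripts of a gene.
means = {
    gene: (mean(vals["value_1"]), mean(vals["value_2"]))
    for gene, vals in per_gene.items()
}

for gene, (stage1, stage2) in means.items():
    print(f"{gene}\t{stage1:.2f}\t{stage2:.2f}")
```

To use a real file, the io.StringIO(table) argument could be replaced with an open file handle. Swapping mean for sum would give the summed value asked about in the question instead of the average.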