Divide isoform expression by gene expression to calculate splicing ratios
5.4 years ago

I have a large tab-separated matrix of FPKM (expression) values for known and novel genes and their transcripts. I need code that first calculates the overall FPKM for each gene (the sum of its transcript FPKMs within a sample) and then divides each isoform's FPKM by that gene-level FPKM. In the example below, gene MSTRG.1 has three transcripts (AT1G01010.1, MSTRG.1.2, MSTRG.1.3), and the transcript FPKM values are in the sample columns:

gene_id     trans           Sample1     Sample2
MSTRG.1     AT1G01010.1     3.217145    5.362317
MSTRG.1     MSTRG.1.2       0           0
MSTRG.1     MSTRG.1.3       0           1.265547
AT3G04280   AT3G06460.1     0           4.852563
AT3G04280   MSTRG.12548.1   0.099178    0.480905
AT3G04280   AT3G06470.1     4.548129    6.963614

So, for Sample1 the overall gene expression is 3.217145 for MSTRG.1 and 4.647307 for AT3G04280. Similarly, for Sample2 it is 6.627864 for MSTRG.1 and 12.297082 for AT3G04280. Dividing each transcript's expression by its gene's expression should give an output matrix something like this:

gene_id     trans           Sample1     Sample2
MSTRG.1     AT1G01010.1     1           0.809056582935317
MSTRG.1     MSTRG.1.2       0           0
MSTRG.1     MSTRG.1.3       0           0.190943417064683
AT3G04280   AT3G06460.1     0           0.3946
AT3G04280   MSTRG.12548.1   0.02134     0.039
AT3G04280   AT3G06470.1     0.9786      0.566

Any help will be highly appreciated.

RNA-Seq Assembly • 1.0k views

This reads as if you want someone to write the code for you. What have you tried so far?
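As a possible starting point, here is a minimal pandas sketch of the calculation described in the question: sum the transcript FPKMs per gene within each sample, then divide each transcript by its gene total. The function name `splicing_ratios` and its `fill_value` parameter are made up for illustration. It assumes every column other than `gene_id` and `trans` is a sample column, and that a gene with zero total expression in a sample should get ratio 0 rather than NaN; adjust that if you would rather keep such cases as missing.

```python
import io

import pandas as pd

# Example data copied from the question (whitespace-separated here;
# for the real tab-separated file, use pd.read_csv(path, sep="\t")).
EXAMPLE = """\
gene_id     trans           Sample1     Sample2
MSTRG.1     AT1G01010.1     3.217145    5.362317
MSTRG.1     MSTRG.1.2       0           0
MSTRG.1     MSTRG.1.3       0           1.265547
AT3G04280   AT3G06460.1     0           4.852563
AT3G04280   MSTRG.12548.1   0.099178    0.480905
AT3G04280   AT3G06470.1     4.548129    6.963614
"""


def splicing_ratios(df, gene_col="gene_id", id_cols=("gene_id", "trans"),
                    fill_value=0.0):
    """Divide each transcript FPKM by the summed FPKM of its gene.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Assumption: everything that is not an identifier column is a sample.
    sample_cols = [c for c in df.columns if c not in id_cols]

    # Per-gene totals, broadcast back to one row per transcript.
    gene_totals = df.groupby(gene_col, sort=False)[sample_cols].transform("sum")

    # Element-wise division; 0/0 yields NaN when a gene is not expressed.
    ratios = df[sample_cols].div(gene_totals)
    if fill_value is not None:
        ratios = ratios.fillna(fill_value)

    out = df.copy()
    out[sample_cols] = ratios
    return out


df = pd.read_csv(io.StringIO(EXAMPLE), sep=r"\s+")
result = splicing_ratios(df)
print(result.to_string(index=False))
```

Using `transform("sum")` keeps the original row order and avoids a separate merge step, since the gene totals come back aligned to the transcript rows. To save the result in the same layout as the input, something like `result.to_csv("ratios.tsv", sep="\t", index=False)` should work (the output file name is just an example).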
