Question

How to generate FPKM values for each replicate?

7.2 years ago

Karma ▴ 310

I have two conditions (control and treated) with 4 replicates each. How can I get per-gene FPKM values for each replicate?

Control Replicates: C1, C2, C3, C4

Treated Replicates: T1, T2, T3, T4

For example, for the gene TP53 I need the following:

TP53 FPKM(C1), FPKM(C2), FPKM(C3), FPKM(C4), FPKM(T1), FPKM(T2), FPKM(T3), FPKM(T4)

I tried the following command:

cuffdiff -o diff_out -b genome.fa -p 8 -L CR,TR -u merged_asm/merged.gtf \
  CR_R1_thout/accepted_hits.bam,CR_R2_thout/accepted_hits.bam,CR_R3_thout/accepted_hits.bam,CR_R4_thout/accepted_hits.bam \
  TR_R1_thout/accepted_hits.bam,TR_R2_thout/accepted_hits.bam,TR_R3_thout/accepted_hits.bam,TR_R4_thout/accepted_hits.bam

But I got a combined FPKM for each of the two conditions, not for the individual replicates.

How can I get FPKM for each replicate?

RNA-Seq NGS cufflinks Tophat cuffdiff • 3.4k views

ADD COMMENT • link updated 7.2 years ago by Sparrow_kop ▴ 260 • written 7.2 years ago by Karma ▴ 310

score 1 · Answer 1 · 2017-08-30

7.2 years ago

Sparrow_kop ▴ 260

Hi, please note that cuffdiff is used to identify differentially expressed genes based on the individual sample FPKM values and the group information. So if you want the individual sample FPKM values, simply use cufflinks instead, running it once per BAM file. For example:

cufflinks -g yourGTF -u -o output_name individual.bam

ADD COMMENT • link 7.2 years ago by Sparrow_kop ▴ 260

Though cuffdiff is for differential expression, it also generates the FPKM/RPKM values.

ADD REPLY • link 7.2 years ago by GouthamAtla 12k

The data I am using are paired-end, so it will generate FPKM values.

ADD REPLY • link 7.2 years ago by Karma ▴ 310

So, if I use this, I am going to get FPKM/RPKM for each replicate. If I create a matrix of FPKM values from the replicates and calculate the fold change, it should be equal to the fold change generated by the following command:

cuffdiff -o diff_out -b genome.fa -p 8 -L CR,TR -u merged_asm/merged.gtf \
  CR_R1_thout/accepted_hits.bam,CR_R2_thout/accepted_hits.bam,CR_R3_thout/accepted_hits.bam,CR_R4_thout/accepted_hits.bam \
  TR_R1_thout/accepted_hits.bam,TR_R2_thout/accepted_hits.bam,TR_R3_thout/accepted_hits.bam,TR_R4_thout/accepted_hits.bam

Right?

ADD REPLY • link 7.2 years ago by Karma ▴ 310

Yes, I think so. Also, once you have the individual sample expression values, you could apply another method or statistical model to identify DE genes. You should also refer to the cufflinks package manual.
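As a rough illustration of the fold-change calculation discussed here, the sketch below takes per-replicate FPKM values for one gene and computes log2(mean treated / mean control). The FPKM numbers are made up, and the function name and the pseudocount of 1 are assumptions for this example, not anything cuffdiff does. The result may not match cuffdiff's own fold change exactly, since cuffdiff normalizes across libraries before comparing conditions.

```python
import math
from statistics import mean


def log2_fold_change(control, treated, pseudocount=1.0):
    """Return log2((mean(treated) + p) / (mean(control) + p)).

    The pseudocount p is an assumption of this sketch; it only guards
    against division by zero when a gene has FPKM 0 in one condition.
    """
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


# Made-up per-replicate FPKM values, for illustration only (not real data).
fpkm = {
    "TP53": {
        "control": [10.0, 12.0, 11.0, 11.0],  # C1..C4
        "treated": [22.0, 24.0, 23.0, 23.0],  # T1..T4
    },
}

for gene, groups in fpkm.items():
    lfc = log2_fold_change(groups["control"], groups["treated"])
    print(gene, round(lfc, 3))
```

Comparing these values against the fold-change column of cuffdiff's gene_exp.diff for a handful of genes is a quick way to see how closely the two approaches agree on your data.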

ADD REPLY • link 7.2 years ago by Sparrow_kop ▴ 260

score 0 · Answer 2 · 2017-08-30

If you inspect all the cuffdiff output files, there will be files with FPKM/RPKM for each replicate.

From the cuffdiff website:

Cuffdiff calculates the expression and fragment count for each transcript, primary transcript, and gene in each replicate. The results are output in per-replicate tracking files in the format described here.

isoforms.read_group_tracking
genes.read_group_tracking
cds.read_group_tracking
tss_groups.read_group_tracking
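The per-replicate tracking files mentioned above are in long format (one row per gene per replicate), so they need pivoting to get the one-row-per-gene layout asked for in the question. Below is a minimal sketch, assuming the file is tab-separated with `tracking_id`, `condition`, `replicate` and `FPKM` columns as described in the cufflinks manual. The function name and the example rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def fpkm_matrix(handle):
    """Pivot a read_group_tracking table into {gene: {"<condition>_<replicate>": FPKM}}."""
    matrix = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        sample = f'{row["condition"]}_{row["replicate"]}'
        matrix[row["tracking_id"]][sample] = float(row["FPKM"])
    return dict(matrix)


# Tiny made-up excerpt standing in for diff_out/genes.read_group_tracking.
example = (
    "tracking_id\tcondition\treplicate\traw_frags\tinternal_scaled_frags\t"
    "external_scaled_frags\tFPKM\teffective_length\tstatus\n"
    "TP53\tCR\t0\t100\t98.0\t98.0\t10.5\t-\tOK\n"
    "TP53\tCR\t1\t120\t118.0\t118.0\t12.0\t-\tOK\n"
    "TP53\tTR\t0\t230\t228.0\t228.0\t23.5\t-\tOK\n"
)

matrix = fpkm_matrix(io.StringIO(example))
print(matrix["TP53"])
```

On real output, replace the `io.StringIO(example)` handle with `open("diff_out/genes.read_group_tracking")`. Depending on the GTF used, `tracking_id` may be an internal ID rather than a gene symbol, in which case it has to be mapped back to gene names using the other cuffdiff output files.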

Otherwise, quantify the genes using featureCounts, feed the count matrix to edgeR, and use its rpkm() function. The featureCounts output contains the gene length in one of its columns, which edgeR needs.
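To show what the count-based route computes, here is a sketch of the plain RPKM formula, count × 10^9 / (gene length in bp × library size), applied to one sample. This is not edgeR's implementation: edgeR's rpkm() can also take normalization factors into account, so its values may differ. The function name, gene names, counts and lengths are all made up for illustration.

```python
def rpkm(counts, lengths, lib_size=None):
    """Plain RPKM for one sample.

    counts   -- {gene: read (or fragment) count}
    lengths  -- {gene: gene length in base pairs}
    lib_size -- total mapped reads; defaults to the sum of the counts
                (an assumption of this sketch)
    """
    total = lib_size if lib_size is not None else sum(counts.values())
    if total <= 0:
        raise ValueError("library size must be positive")
    return {gene: count * 1e9 / (lengths[gene] * total) for gene, count in counts.items()}


# Made-up counts and lengths for a single replicate (illustration only).
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}

values = rpkm(counts, lengths)
for gene, value in values.items():
    print(gene, value)
```

In this toy sample geneB has three times the reads of geneA but is also three times as long, so both genes end up with the same RPKM. Running the function once per replicate column of the featureCounts matrix gives the per-replicate table asked for in the question.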