The Easiest/Fastest Way To Get From BAM To TPM Or RPKM
11.1 years ago
Nick ▴ 290

I have already aligned all my RNA-seq samples with TopHat2 and done the differential expression analysis with htseq-count and edgeR. Now I would like RPKM and TPM values. What is the easiest way to get them from the output of TopHat2, htseq-count and edgeR? I also have the reference GTF. Please recommend a tool or workflow that would get me to TPM or RPKM in less than a day. I have a machine with 32 GB RAM and 8 cores, and my study comprises 24 samples.

rna-seq bam rpkm tophat2 bowtie2 • 12k views
11.1 years ago

edgeR's rpkm() function?

Thanks - this would be really neat. I just checked the manual but can't quite work out how to produce RPKM values from a DGEList object. Could you give an example?

I also saw that logCPM (log counts per million) is part of the topTags object, but I don't understand how to interpret it. Is there a way to derive RPKM from it?

Did you try ?rpkm in R? The usage is:

rpkm(x, gene.length, normalized.lib.sizes=TRUE, log=FALSE, prior.count=0.25)

where x is a DGEList and gene.length is a vector of gene lengths in bases, one per row of x.
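To make the arithmetic behind an RPKM value concrete, here is a minimal sketch in plain Python. It is not edgeR code and does not reproduce edgeR's internals; the function name, the `norm_factor` argument and the toy counts and lengths are all made up for illustration. It assumes one sample, raw counts per gene, and gene lengths in bases.

```python
def rpkm(counts, gene_lengths_bp, norm_factor=1.0):
    """Reads per kilobase of gene per million reads, for a single sample.

    counts and gene_lengths_bp are dicts keyed by gene ID (hypothetical data).
    norm_factor is an illustrative scaling of the library size; 1.0 means
    the raw total count is used as the library size.
    """
    lib_size = sum(counts.values()) * norm_factor
    if lib_size <= 0:
        raise ValueError("library size must be positive")
    result = {}
    for gene, count in counts.items():
        per_million = count / (lib_size / 1e6)           # counts per million (CPM)
        length_kb = gene_lengths_bp[gene] / 1e3          # gene length in kilobases
        result[gene] = per_million / length_kb           # RPKM = CPM / length in kb
    return result


# Toy example: three genes, 10,000 reads in total.
counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 4000}
values = rpkm(counts, lengths)
```

In this toy data geneA and geneB end up with the same RPKM even though geneB has three times the reads, because geneB is also three times longer. The same relation shows that a per-sample CPM divided by the gene length in kilobases gives RPKM.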

10.7 years ago
dfernan ▴ 770

Easiest means wrong in this case, since you cannot possibly get a correct RPKM estimate in an RNA-Seq experiment without proper handling of multiple alignments. I'd say the easiest way that is still right is to use eXpress/Cufflinks/RSEM/etc. They produce RPKM estimates, and RSEM produces RPKM/TPM estimates. Anything else that does not handle multiple alignments may be easy, but it's just wrong...

You can also get RPKM estimates and convert them to TPM estimates: within each sample, divide each gene's RPKM by the sum of all RPKMs and multiply by 10^6.
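As a rough sketch of that RPKM-to-TPM conversion (plain Python, not taken from any of the tools mentioned; the function name and the numbers are made up for illustration), each gene's RPKM is rescaled so that the values in one sample sum to one million:

```python
def rpkm_to_tpm(rpkm_values):
    """Convert one sample's RPKM values (dict keyed by gene ID) to TPM."""
    total = sum(rpkm_values.values())
    if total <= 0:
        raise ValueError("sum of RPKM values must be positive")
    # TPM_i = RPKM_i / sum_j(RPKM_j) * 1e6
    return {gene: value / total * 1e6 for gene, value in rpkm_values.items()}


# Hypothetical RPKM values for three genes in a single sample.
rpkm_values = {"geneA": 50000.0, "geneB": 50000.0, "geneC": 200000.0}
tpm = rpkm_to_tpm(rpkm_values)
total_tpm = sum(tpm.values())  # sums to one million, up to floating-point error
```

The conversion has to be done per sample, over all genes in that sample, since the denominator is the sample's own RPKM total.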
