Question

Computing TPM from normalized read count from DESEQ

0

7.3 years ago

jack ▴ 980

Hey all,

I have normalised read counts from DESeq2 for miRNA. I would like to compute TPM from these normalised read counts. I was wondering whether it would be possible to do that?

RNA-Seq DEseq genomics miRNA Expression • 10k views

ADD COMMENT • link updated 7.3 years ago by GouthamAtla 12k • written 7.3 years ago by jack ▴ 980

score 2 · Answer 1 · 2017-08-20

2

7.3 years ago

GouthamAtla 12k

TPM is itself a normalisation method, so you compute it from raw counts, not from data that has already been normalised. If you want to know how to calculate TPM, check this link.

http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/

I am not sure if you could apply TPM for miRNA.
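For reference, the TPM calculation described at that link can be sketched roughly as follows. This is only a minimal illustration in plain Python, assuming you have raw counts and a length for each feature; the function name `tpm`, the feature names and all numbers are made up for the example and do not come from any package.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts for ONE sample to TPM.

    counts:     dict mapping feature name -> raw read count (hypothetical data)
    lengths_bp: dict mapping feature name -> feature length in base pairs
    """
    # Step 1: reads per kilobase (RPK) divides out the feature length.
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}

    # Step 2: rescale so that the values of the sample sum to one million.
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    return {g: value / total_rpk * 1e6 for g, value in rpk.items()}


# Made-up raw counts and lengths, for illustration only.
raw_counts = {"geneA": 100, "geneB": 200, "geneC": 300}
gene_lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1000}

tpm_values = tpm(raw_counts, gene_lengths)
for gene, value in tpm_values.items():
    print(gene, round(value, 1))
```

Note that the input here is the raw count table. Feeding counts that were already scaled by another method into the same function would stack one normalisation on top of another.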

ADD COMMENT • link 7.3 years ago by GouthamAtla 12k

0

> I am not sure if you could apply TPM for miRNA.

I second this. You calculate TPM to eliminate the length bias (longer genes get more reads), but I don't think this makes sense for miRNA sequencing.

Also: if you have normalised read counts from DESeq2 why would you want to get TPM values?

ADD REPLY • link 7.3 years ago by WouterDeCoster 47k

0

Need clarification:

> longer genes get more reads

Isn't that the biological variation you are trying to normalize (independent of its length, a gene can be poorly or highly transcribed depending on the condition)?

Does it matter for between-sample comparisons, where we compare the same gene across different samples/biological conditions, assuming equal library sizes (unlike within-sample comparisons, where length matters because different genes from the same sample/biological condition are compared)?

ADD REPLY • link 7.3 years ago by EagleEye 7.6k

0

I'm not sure what you mean to say.

> Isn't that the biological variation you are trying to normalize (independent of its length, a gene can be poorly or highly transcribed depending on the condition)?

I don't consider read length "biological" but a technical element.

> Does it matter for between-sample comparisons, where we compare the same gene across different samples/biological conditions, assuming equal library sizes (unlike within-sample comparisons, where length matters because different genes from the same sample/biological condition are compared)?

I would say gene length doesn't matter when doing differential expression analysis between genes. However, when comparing transcript usage it does play a role.

The point I wanted to make (poorly, in hindsight) is that when you have DESeq2 normalized counts (which I consider superior to TPM) there is no real need to use something else (unless OP has good reasons to do so; but even then, normalization of normalized counts is not a good idea, as pointed out by geek_y).
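To show how the two approaches differ, here is a rough sketch of the median-of-ratios idea behind DESeq2's normalized counts. It is a simplified plain-Python illustration, not the DESeq2 implementation; the function names `size_factors` and `normalized_counts` and the count values are invented for the example.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: dict mapping sample name -> list of raw counts,
            with features in the same order for every sample.
    """
    samples = list(counts)
    n_features = len(counts[samples[0]])

    # Log geometric mean of each feature across samples.
    # Features with a zero count in any sample are skipped.
    log_geo_means = []
    for i in range(n_features):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor of a sample = median ratio of its counts to the geometric means.
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][i]) - lgm
            for i, lgm in enumerate(log_geo_means)
            if lgm is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


def normalized_counts(counts):
    """Divide every raw count by the size factor of its sample."""
    sf = size_factors(counts)
    return {s: [c / sf[s] for c in counts[s]] for s in counts}


# Made-up example: sample2 was sequenced twice as deeply as sample1.
raw = {"sample1": [10, 20, 30, 0], "sample2": [20, 40, 60, 5]}
sf = size_factors(raw)
norm = normalized_counts(raw)
print(sf)
print(norm)
```

Unlike TPM, feature length does not enter this calculation at all: the scaling is one factor per sample, so it only makes the same feature comparable across samples.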

ADD REPLY • link 7.3 years ago by WouterDeCoster 47k

1

> I don't consider read length "biological" but a technical element.

I was not talking about read length but about the number of reads mapped to a transcript/gene. Sorry, I did not notice your last point.

> Also: if you have normalised read counts from DESeq2 why would you want to get TPM values?

I agree with you: for differential expression analysis it is better to use the normalization from DESeq2/edgeR (packages designed to handle RNA-seq data are always superior). I guess OP might consider converting raw counts to TPM only to check expression levels without doing a differential expression analysis.

> normalization of normalized counts is not a good idea

Very bad idea.

ADD REPLY • link 7.3 years ago by EagleEye 7.6k