Question

Transcript vs. Gene TPM counts?

4

5.1 years ago

n,n ▴ 370

I'm working with some GTEx portal data and noticed that the GTEx downloads page offers both a "Gene TPMs" file and a "Transcript TPMs" file. How exactly do these files differ in the steps used to produce them? Put another way: why are there two files if RNA-Seq is supposed to output reads for transcripts in general? I would have expected a single file with all the transcripts from GTEx, rather than separate gene and transcript files. I'm obviously missing something elementary here, but I don't know what it is.

A second, minor question: is it safe to assume that the data in these files are normalized? As I understand it, the values being TPMs implies that the read counts were normalized in the process of converting them to TPMs, but I'm not 100% sure about this. Thanks for any help.

RNA-Seq GTEx GTEX • 4.6k views

ADD COMMENT • link updated 5.1 years ago by Kristoffer Vitting-Seerup ★ 4.1k • written 5.1 years ago by n,n ▴ 370

score 3 · Answer 1 · 2019-10-10

5.1 years ago

Kristoffer Vitting-Seerup ★ 4.1k

1) Gene expression is obtained by summing the expression of all transcripts belonging to the same gene.

2) Yes, TPMs are normalised values, but you might still need to perform an inter-library normalization. You can read more about that here.
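To illustrate point 1, here is a minimal sketch of collapsing transcript-level TPMs to gene-level TPMs by summing. The transcript and gene IDs and the TPM values are made up for illustration, and this is not necessarily how GTEx produced its own gene file; it only shows the summing idea for one sample.

```python
from collections import defaultdict


def gene_tpm_from_transcripts(transcript_tpm, transcript_to_gene):
    """Sum transcript-level TPMs into one TPM value per gene."""
    gene_tpm = defaultdict(float)
    for tx_id, tpm in transcript_tpm.items():
        gene_tpm[transcript_to_gene[tx_id]] += tpm
    return dict(gene_tpm)


# Hypothetical values for a single sample (not real GTEx data).
transcript_tpm = {"txA1": 12.5, "txA2": 7.5, "txB1": 30.0}
transcript_to_gene = {"txA1": "geneA", "txA2": "geneA", "txB1": "geneB"}

gene_tpm = gene_tpm_from_transcripts(transcript_tpm, transcript_to_gene)
print(gene_tpm)  # geneA gets 12.5 + 7.5, geneB keeps its single transcript
```

Because each gene value is just a sum over its transcripts, the gene-level values add up to the same total as the transcript-level values.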

ADD COMMENT • link 5.1 years ago by Kristoffer Vitting-Seerup ★ 4.1k

score 0 · Answer 2 · 2019-10-08

5.1 years ago

MatthewP ★ 1.4k

Hey, RNA-seq can output both gene and transcript read counts. TPM is normalized data, yes.
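For what "normalized" means here, a rough sketch of the usual TPM calculation: divide each count by the feature length, then rescale so the values in a sample sum to one million. The counts and lengths below are invented, and real quantifiers typically use an effective length rather than the plain length assumed here.

```python
def counts_to_tpm(counts, lengths_kb):
    """Convert read counts to TPM for one sample.

    counts:     feature ID -> read count
    lengths_kb: feature ID -> feature length in kilobases (assumed plain length)
    """
    # Step 1: length-normalize each count (reads per kilobase).
    rates = {feature: counts[feature] / lengths_kb[feature] for feature in counts}
    # Step 2: rescale so that all values in the sample sum to one million.
    total = sum(rates.values())
    return {feature: rate / total * 1e6 for feature, rate in rates.items()}


# Hypothetical counts and lengths (not real GTEx data).
counts = {"txA1": 100, "txA2": 300, "txB1": 600}
lengths_kb = {"txA1": 1.0, "txA2": 3.0, "txB1": 2.0}

tpm = counts_to_tpm(counts, lengths_kb)
print(tpm)
print(sum(tpm.values()))  # close to 1e6 by construction
```

The rescaling makes values comparable within a sample, which is why a separate between-sample normalization can still be needed.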

ADD COMMENT • link 5.1 years ago by MatthewP ★ 1.4k

0

What exactly is being measured in "gene counts", though?

ADD REPLY • link 5.1 years ago by n,n ▴ 370

4

Mike, I am very sorry if I am being pedantic and what I say below is too simplistic.

Here, the word "transcript" does not mean the mRNA product of the gene. The "Gene" and the "Transcript" are the features defined in the gene definition file (GTF, or GFF/GFF3). For example, the human HOXA1 gene has two transcripts according to Ensembl. So if GTEx used the Ensembl gene definitions, the "transcript TPM" file will have two entries for HOXA1 (one per transcript), while the "gene TPM" file will have only one.
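As a small illustration of how a GTF file ties transcripts to genes, the sketch below reads the `gene_id` and `transcript_id` attributes from `transcript` rows. The GTF lines and IDs are invented for the example, and `transcript_to_gene_map` is a hypothetical helper, not part of any GTEx tooling.

```python
import re

# Invented GTF rows: 9 tab-separated columns, attributes in the last one.
GTF_LINES = [
    '#!example header line',
    'chr7\texample\tgene\t100\t900\t.\t-\t.\tgene_id "geneA";',
    'chr7\texample\ttranscript\t100\t900\t.\t-\t.\tgene_id "geneA"; transcript_id "txA1";',
    'chr7\texample\ttranscript\t100\t700\t.\t-\t.\tgene_id "geneA"; transcript_id "txA2";',
    'chr7\texample\ttranscript\t2000\t2500\t.\t+\t.\tgene_id "geneB"; transcript_id "txB1";',
]


def transcript_to_gene_map(gtf_lines):
    """Return {transcript_id: gene_id} from the 'transcript' rows of a GTF."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript features carry the pairing we want
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        if "gene_id" in attrs and "transcript_id" in attrs:
            mapping[attrs["transcript_id"]] = attrs["gene_id"]
    return mapping


mapping = transcript_to_gene_map(GTF_LINES)
transcripts_per_gene = {}
for tx_id, gene_id in mapping.items():
    transcripts_per_gene[gene_id] = transcripts_per_gene.get(gene_id, 0) + 1
print(mapping)
print(transcripts_per_gene)
```

In this toy annotation, geneA would appear as two rows in a transcript-level table and as one row in a gene-level table.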

ADD REPLY • link 5.1 years ago by vj ▴ 520

0

Not pedantic at all, this actually makes a lot of sense. Thank you.

ADD REPLY • link 5.0 years ago by n,n ▴ 370

0

This is super helpful and exactly the answer I was looking for, thank you!!

ADD REPLY • link 4.5 years ago by Danielle B ▴ 10

0

Reads mapped to the gene.

ADD REPLY • link 5.1 years ago by shoujun.gu ▴ 350