6.3 years ago
Folder40g
I'm using software that requires either TPM or raw counts at the gene level.
So I downloaded this data set from Xena browser: https://xenabrowser.net/datapages/?dataset=TCGA.SKCM.sampleMap%2FHiSeqV2&host=https%3A%2F%2Ftcga.xenahubs.net
This matrix seems to be the log2(RSEM output + 1)
Am I wrong in thinking that taking the antilog (base 2) of the values in this matrix and then subtracting 1 gives me TPM?
I haven't been able to find raw counts at the gene level. As far as I've seen, TCGAbiolinks only provides HTSeq counts at the transcript level.
Thanks
TPM and RPKM/FPKM are highly correlated at gene-level quantification. TPM uses transcript length to normalize, FPKM uses gene length, and transcript length is highly correlated with gene length. My take is that both TPM and RPKM/FPKM are reasonable estimates of expression level. In my experience, switching between the two rarely gives drastically different results. Nonetheless, you probably want to be consistent across all the datasets you are considering, so that you don't pull out a signal that is only due to the difference in metric.
Xena has the raw counts here: https://xenabrowser.net/datapages/?dataset=TCGA-SKCM%2FXena_Matrices%2FTCGA-SKCM.htseq_counts.tsv&host=https%3A%2F%2Fgdc.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443
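For the antilog step asked about in the question, here is a minimal sketch, assuming the downloaded matrix really is stored as log2(x + 1) with genes in rows and samples in columns. The function names and the toy numbers are made up for illustration. Whether the back-transformed values are TPM, normalized counts, or raw counts depends on what was stored before the log transform, so check the unit listed on the dataset page first. The per-million rescaling is optional and only mimics the "sums to one million per sample" property of TPM.

```python
import numpy as np

def undo_log2_plus_one(log_values):
    """Invert y = log2(x + 1), returning x = 2**y - 1."""
    values = np.exp2(np.asarray(log_values, dtype=float)) - 1.0
    # Clip tiny negative values caused by floating-point rounding.
    return np.clip(values, 0.0, None)

def rescale_to_per_million(values):
    """Rescale each sample (column) so that it sums to 1e6."""
    values = np.asarray(values, dtype=float)
    column_sums = values.sum(axis=0, keepdims=True)
    return values / column_sums * 1e6

# Toy matrix (hypothetical numbers): rows are genes, columns are samples.
log_matrix = np.log2(np.array([[0.0, 3.0], [7.0, 1.0], [1.0, 4.0]]) + 1.0)

linear = undo_log2_plus_one(log_matrix)
per_million = rescale_to_per_million(linear)
```

With a real download you would load the TSV into an array or data frame first and apply the same two steps to the numeric part of the table.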