Am I getting TPM correctly from TCGA?
5 months ago
Znow ▴ 20

Hello everyone,

I am using the TCGAbiolinks package in R to retrieve TPM data from the TCGA database. Below is the code I am using:

library("TCGAbiolinks")
library("SummarizedExperiment")  # provides assay(), used below

query_TCGA <- GDCquery(
  project = "TCGA-HNSC",
  data.category = "Transcriptome Profiling", 
  data.type = "Gene Expression Quantification",
  experimental.strategy = "RNA-Seq",
  workflow.type = "STAR - Counts"
)

GDCdownload(query = query_TCGA)

tcga_data <- GDCprepare(query_TCGA)

TPM <- assay(tcga_data, 4)

Could you please confirm if I am correctly extracting TPM values with this approach? Specifically, I'm using the assay() function to extract data from tcga_data, assuming that TPM values are in the fourth assay slot, as its name is tpm_unstrand. Is this the correct method to obtain TPM values from TCGA data using TCGAbiolinks?

Thank you for your assistance!

TPM TCGABIOLINKS TCGA • 673 views
5 months ago
bk11 ★ 3.0k

You can get TPM as follows:

library("TCGAbiolinks")

query_TCGA <- GDCquery(
  project = "TCGA-HNSC",
  data.category = "Transcriptome Profiling", 
  data.type = "Gene Expression Quantification",
  experimental.strategy = "RNA-Seq",
  workflow.type = "STAR - Counts"
)

GDCdownload(query = query_TCGA)

tcga_data <- GDCprepare(query_TCGA)

tcga_data
class: RangedSummarizedExperiment 
dim: 60660 566 
metadata(1): data_release
assays(6): unstranded stranded_first ... fpkm_unstrand fpkm_uq_unstrand
rownames(60660): ENSG00000000003.15 ENSG00000000005.6 ... ENSG00000288674.1 ENSG00000288675.1
rowData names(10): source type ... hgnc_id havana_gene
colnames(566): TCGA-CR-7370-01A-11R-2132-07 TCGA-CR-6484-01A-11R-1873-07 ... TCGA-H7-8501-01A-11R-2403-07
  TCGA-CV-7415-01A-11R-2081-07
colData names(80): barcode patient ... paper_Copy.Number paper_PARADIGM

tpm <- tcga_data@assays@data$tpm_unstrand
rownames(tpm) <- rownames(tcga_data)
colnames(tpm) <- colnames(tcga_data)
tpm[1:5, 1:5]
                   TCGA-CR-7370-01A-11R-2132-07 TCGA-CR-6484-01A-11R-1873-07 TCGA-CR-6478-01A-11R-1873-07 TCGA-CV-A45Y-01A-11R-A24Z-07 TCGA-UF-A719-01A-12R-A34R-07
ENSG00000000003.15                      59.7559                      12.9848                      13.4885                      19.7511                      56.4732
ENSG00000000005.6                        0.0000                       0.0000                       0.0295                       0.0000                       0.0000
ENSG00000000419.13                      70.1836                      91.8942                      76.2891                      59.1658                      71.4131
ENSG00000000457.14                       9.9460                       4.5738                       4.8244                       3.7459                       8.0012
ENSG00000000460.17                      14.1288                       5.4911                       8.8251                       2.4338                       9.8382
assay(tcga_data, "tpm_unstrand")

will also work


Yes, if used in the context of bk11's answer, not as a standalone function call. Point is, this should have been a comment, not an answer. I've moved it to a comment this time, please be more mindful in the future.
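
To make the point about context concrete, here is a minimal sketch of name-based extraction that runs without downloading anything. It builds a small toy SummarizedExperiment (toy_se, a hypothetical stand-in for the object returned by GDCprepare()) and assumes the six assay names follow the STAR - Counts layout discussed in this thread, with tpm_unstrand in the fourth slot. It also assumes the SummarizedExperiment package is attached, since assay() and assayNames() come from there.

```r
library(SummarizedExperiment)

# Toy stand-in for the GDCprepare() result. The assay names are an assumption
# based on the STAR - Counts layout described in this thread.
assay_names <- c("unstranded", "stranded_first", "stranded_second",
                 "tpm_unstrand", "fpkm_unstrand", "fpkm_uq_unstrand")

set.seed(1)
toy_assays <- lapply(assay_names, function(nm) {
  matrix(runif(6), nrow = 3,
         dimnames = list(paste0("gene", 1:3), paste0("sample", 1:2)))
})
names(toy_assays) <- assay_names
toy_se <- SummarizedExperiment(assays = toy_assays)

# Check the slot order before trusting a numeric index.
assayNames(toy_se)

# Extract by name and compare with the positional lookup from the question.
tpm_by_name  <- assay(toy_se, "tpm_unstrand")
tpm_by_index <- assay(toy_se, 4)
identical(tpm_by_name, tpm_by_index)
```

On real data the same two calls would be made on tcga_data. Extracting by name is the safer habit, because it does not silently return a different matrix if the assay order ever changes, and assay() already carries the gene and sample names, so they do not need to be reassigned by hand.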
