How to get FPKM of TCGA data
2.0 years ago
Maryam • 0

Hello everyone, I would appreciate some help. I am using TCGAbiolinks to get TCGA data, and I want FPKM values instead of counts. Since the TCGA data has been updated, I have to use the STAR - Counts workflow instead of HTSeq - Counts.

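library(TCGAbiolinks)

# Query STAR-based gene expression quantification for TCGA-STAD
stadquery <- GDCquery(project = "TCGA-STAD",
                      data.category = "Transcriptome Profiling",
                      data.type = "Gene Expression Quantification",
                      workflow.type = "STAR - Counts", legacy = FALSE,
                      experimental.strategy = "RNA-Seq")

# Download the queried files through the GDC API
GDCdownload(query = stadquery, method = "api")

# Build a SummarizedExperiment from the downloaded files
stadprpr <- GDCprepare(query = stadquery, summarizedExperiment = TRUE)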

But when I use Exdata <- stadprpr@assays@data$fpkm_uq_unstrand, the matrix doesn't contain the column names (samples) or row names (genes). How can I fix it? Thanks in advance.

FPKM RNA-seq TCGAbiolinks • 1.6k views
2.0 years ago
ATpoint 85k

That is because you are reading the data directly from the object's internal slots, which is not the recommended way of accessing it. The object is a SummarizedExperiment, and for this you should use the dedicated getter function assay():

library(SummarizedExperiment)

# show available assays
assayNames(stadprpr)

# get upper-quartile-normalized FPKM (check assayNames() for plain FPKM, "fpkm_unstrand")
assay(stadprpr, "fpkm_uq_unstrand")

# show sample annotations
colData(stadprpr)

# show gene annotations
rowData(stadprpr)

Accessing specialized data formats such as an SE with @ results in these types of hiccups. There are always dedicated functions (setters/getters) for subsetting and extraction operations; see https://bioconductor.org/packages/release/bioc/vignettes/SummarizedExperiment/inst/doc/SummarizedExperiment.html#assays
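
To see the getter behaviour without downloading anything, here is a minimal sketch on a toy SummarizedExperiment. The matrix values, the gene and sample identifiers, and the assay name "fpkm" are all made up for illustration and are not TCGA data. The assay matrix is deliberately built without dimnames, so the row and column names live only in the object's annotations:

```r
library(SummarizedExperiment)

# Toy expression values: 3 genes x 2 samples, deliberately without dimnames
fpkm <- matrix(c(1.5, 0, 7.2, 3.1, 0.4, 9.9), nrow = 3)

# Hypothetical gene and sample identifiers, stored as annotation row names
genes   <- DataFrame(gene_name = c("A", "B", "C"),
                     row.names = c("gene1", "gene2", "gene3"))
samples <- DataFrame(condition = c("tumor", "normal"),
                     row.names = c("sampleA", "sampleB"))

se <- SummarizedExperiment(assays = list(fpkm = fpkm),
                           rowData = genes, colData = samples)

# The getter attaches the object's row and column names to the matrix
mat <- assay(se, "fpkm")
rownames(mat)
colnames(mat)

# The names match the ones reported by the object itself
identical(rownames(mat), rownames(se))
identical(colnames(mat), colnames(se))
```

With the real object from the question, the equivalent call would be assay(stadprpr, "fpkm_uq_unstrand"), and the resulting matrix should carry the gene IDs as row names and the sample barcodes as column names.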


I appreciate your help
