2.0 years ago
Rob
Hi all,
I used the following code to download TCGA RNA-seq data (TCGA-KIRC, STAR counts). The result includes all annotated genes, but I want only protein-coding genes. Is there a way to filter the data down to protein-coding genes only? Thanks
library(TCGAbiolinks)

# Query STAR count files for TCGA-KIRC primary tumors
query <- GDCquery(
  project = "TCGA-KIRC",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  experimental.strategy = "RNA-Seq",
  sample.type = "Primary Tumor",
  workflow.type = "STAR - Counts"
)
GDCdownload(query, method = "api")

# Prepare the downloaded files as a single object
data_TCGA_STAR_KIRC <- GDCprepare(query)

# Generate the count matrix (first assay) and write it to disk
rna_STAR <- as.data.frame(SummarizedExperiment::assay(data_TCGA_STAR_KIRC))
write.csv(rna_STAR, "STAR Count_expression_mRNA.csv")
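
One possible approach, sketched here under assumptions rather than as a confirmed answer: the object returned by `GDCprepare()` usually carries per-gene annotation in its row data, and filtering on a biotype column there would keep only protein-coding genes. The helper name `filter_protein_coding` is hypothetical, and the column name `gene_type` is an assumption; it is worth checking `colnames(SummarizedExperiment::rowData(data_TCGA_STAR_KIRC))` before relying on it.

```r
# Hypothetical helper (not part of TCGAbiolinks): keep only the rows of a
# count matrix whose annotation marks the gene as protein-coding.
# `gene_info` must have one row per row of `counts`, in the same order.
# The default column name "gene_type" is an assumption; verify it first.
filter_protein_coding <- function(counts, gene_info, type_col = "gene_type") {
  stopifnot(
    nrow(counts) == nrow(gene_info),
    type_col %in% colnames(gene_info)
  )
  biotype <- as.character(gene_info[[type_col]])
  # Treat missing biotypes as "not protein-coding"
  keep <- !is.na(biotype) & biotype == "protein_coding"
  counts[keep, , drop = FALSE]
}

# Intended use with the object from the question (not run here):
# gene_info  <- as.data.frame(SummarizedExperiment::rowData(data_TCGA_STAR_KIRC))
# rna_STAR   <- as.data.frame(SummarizedExperiment::assay(data_TCGA_STAR_KIRC))
# rna_coding <- filter_protein_coding(rna_STAR, gene_info)
```

Filtering on the annotation that ships with the object avoids a separate lookup against an external gene database, but it only works if that biotype column is actually present in your version of the data.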