I need clinical data for Glioblastoma patients from TCGA, including Age at initial pathologic diagnosis, Date of death, Last follow up date, Ethnicity, Gender, Histological type, History of neoadjuvant treatment, Method of initial pathologic diagnosis, Karnofsky performance score, Performance status scale timing, Person neoplasm cancer status, Prior glioma, and Race. How can I download these data from the TCGA portal given patient or sample IDs? If anyone has this dataset, please share it as soon as possible.
You can do that in R with the TCGAbiolinks package, which can also download the actual gene expression data; the instructions are on its Bioconductor webpage. My script for the clinical data looks like this:
library("TCGAbiolinks")
DIRPREFIX <- "/PATH/"
## Download clinical data for the indicated projects
TCGA.data.sets <- c("TCGA-GBM")
for (id in TCGA.data.sets){
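  ## Query the clinical XML files. Recent TCGAbiolinks releases need data.type and
  ## data.format here (argument names may differ in older versions)
  clin.query <- GDCquery(project = id, data.category = "Clinical",
                         data.type = "Clinical Supplement", data.format = "BCR XML")
  ## Download through the GDC API; fall back to the GDC client if that fails
  tryCatch(GDCdownload(clin.query),
           error = function(e) GDCdownload(clin.query, method = "client"))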
  ## the following types of clinical data can be obtained (just pick whichever you need)
  clinical.drug <- GDCprepare_clinic(clin.query, clinical.info = "drug")
  clinical.admin <- GDCprepare_clinic(clin.query, clinical.info = "admin")
  clinical.radiation <- GDCprepare_clinic(clin.query, clinical.info = "radiation")
  clinical.patient <- GDCprepare_clinic(clin.query, clinical.info = "patient")
  clinical.stage_event <- GDCprepare_clinic(clin.query, clinical.info = "stage_event")
  clinical.new.tumor.event <- GDCprepare_clinic(clin.query, clinical.info = "new_tumor_event")
  clinical.followup <- GDCprepare_clinic(clin.query, clinical.info = "follow_up")
  clinical.index <- GDCquery_clinic(id, type = "clinical")
  clinical.biospecimen <- GDCquery_clinic(id, type = "biospecimen")
  ## Replace WHAT_YOU_NEED with the objects you want to keep, e.g. clinical.patient
  save(WHAT_YOU_NEED, file = file.path(DIRPREFIX, paste0(id, "_clinical.Rdata")), compress = "xz")
}
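To keep only the fields from the question for a given list of patient IDs, something like the following might work. It is only a sketch and has not been run against the real GBM tables: subset_clinical is a hypothetical helper, the toy data frame and its barcodes are made up, and the column names (including bcr_patient_barcode as the ID column) are assumptions that should be checked with colnames(clinical.patient) first.

```r
# Hypothetical helper: keep the chosen patients and whichever requested columns exist.
subset_clinical <- function(clin, patient.ids, wanted.cols,
                            id.col = "bcr_patient_barcode") {
  stopifnot(is.data.frame(clin), id.col %in% colnames(clin))
  # Report requested columns that are absent instead of failing on them
  missing.cols <- setdiff(wanted.cols, colnames(clin))
  if (length(missing.cols) > 0) {
    message("Columns not found: ", paste(missing.cols, collapse = ", "))
  }
  keep.cols <- union(id.col, intersect(wanted.cols, colnames(clin)))
  # Filter rows by patient ID, then drop exact duplicate rows
  out <- clin[clin[[id.col]] %in% patient.ids, keep.cols, drop = FALSE]
  unique(out)
}

# Toy data standing in for clinical.patient (all values are made up)
toy <- data.frame(
  bcr_patient_barcode = c("TCGA-XX-0001", "TCGA-XX-0002", "TCGA-XX-0003"),
  gender = c("MALE", "FEMALE", "MALE"),
  karnofsky_performance_score = c(80, 60, NA),
  stringsAsFactors = FALSE
)

res <- subset_clinical(
  toy,
  patient.ids = c("TCGA-XX-0001", "TCGA-XX-0003"),
  wanted.cols = c("gender", "karnofsky_performance_score", "race")
)
print(res)
```

With the real data you would pass clinical.patient instead of toy. Fields that live in other tables (follow-up dates, for example, may be in clinical.followup) would need the same call on that table, followed by a merge on the ID column.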