Hi!
I'm using R and TCGAbiolinks to retrieve data and clinical data from GDC. To do so, I use the following:
library(TCGAbiolinks)
patientdownload <- function(project = "TCGA-LIHC") {
  # Query, download, and parse the patient-level clinical data for the project
  clinquery <- GDCquery(project = project, data.category = "Clinical")
  GDCdownload(clinquery, chunks.per.download = 30)
  GDCprepare_clinic(clinquery, clinical.info = "patient")
}
prepatientout <- patientdownload()
However, I am finding some inconsistencies between what I get and what is in GDC. For instance, for the subject with 'bcr_patient_barcode=TCGA-DD-AADB' I retrieve the following data from 'GDCquery':
   bcr_patient_barcode gender race_list vital_status neoplasm_histologic_grade stage_event_pathologic_stage
18        TCGA-DD-AADF FEMALE     ASIAN         Dead                        G4                      Stage I
However, when you look at the subject's data in GDC (here), everything agrees except for Grade, which is not reported at all.
Why?
Could this mean that 'neoplasm_histologic_grade' is not the tumor grade? Or that 'GDCquery' is retrieving some Legacy data?
EDIT: this has now been cross-posted on GitHub.