Dear Ankita,
What exact R code did you use to download your TCGA dataset?
For example, here is a query for the COAD dataset that downloads the raw HTSeq RNA-Seq counts for the provisional data:
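library(TCGAbiolinks)

# Query the harmonized (hg38) HTSeq raw counts for tumor and normal samples
hg38.coad <- GDCquery(project = "TCGA-COAD",
                      data.category = "Transcriptome Profiling",
                      data.type = "Gene Expression Quantification",
                      workflow.type = "HTSeq - Counts",
                      experimental.strategy = "RNA-Seq",
                      sample.type = c("Primary solid Tumor", "Solid Tissue Normal"))

# Download the files in chunks of 50
GDCdownload(hg38.coad, files.per.chunk = 50)

# Assemble the downloaded files into one object and save it to disk
coad_data <- GDCprepare(query = hg38.coad,
                        save = TRUE,
                        save.filename = "hg38.coad.updated.htseqcounts.rda")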
So, this code chunk automatically saves your raw counts in a RangedSummarizedExperiment:
class: RangedSummarizedExperiment
dim: 56963 506
metadata(1): data_release
assays(1): HTSeq - Counts
rownames(56963): ENSG00000000003 ENSG00000000005 ... ENSG00000281912
rowData names(3): ensembl_gene_id external_gene_name
colnames(506): TCGA-3L-AA1B-01A-11R-A37K-07
TCGA-DM-A1D8-01A-11R-A155-07 ... TCGA-AA-3675-01A-02R-0905-07
colData names(101): sample patient ...
subtype_vascular_invasion_present subtype_vital_status
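To illustrate how the pieces of such an object are read back, here is a minimal sketch using the standard SummarizedExperiment accessors. The object `toy` is a hypothetical two-gene, two-sample stand-in for `coad_data` with made-up counts, so that the snippet runs without downloading anything.

```r
library(SummarizedExperiment)

# Hypothetical stand-in for coad_data: 2 genes x 2 samples, invented counts.
toy <- SummarizedExperiment(
  assays = list("HTSeq - Counts" = matrix(
    c(10L, 0L, 25L, 4L), nrow = 2,
    dimnames = list(c("ENSG00000000003", "ENSG00000000005"),
                    c("TCGA-3L-AA1B-01A-11R-A37K-07",
                      "TCGA-DM-A1D8-01A-11R-A155-07"))))
)

raw_counts <- assay(toy, "HTSeq - Counts")  # genes x samples count matrix
sample_annotation <- colData(toy)           # one row per sample
gene_annotation <- rowData(toy)             # one row per gene
```

With the real object, the same three calls on `coad_data` should return the count matrix, the clinical and sample annotation, and the gene annotation respectively.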
Hello Sir,
Thank you for the reply.
I know that GDCprepare automatically saves the raw counts in a SummarizedExperiment, but whenever I run the GDCprepare step it throws an error saying that it cannot connect to the biomaRt website. I have posted a question about this GDCprepare error on the Biostars and Bioconductor forums (help with GDCprepare), because the error appears every time I run it.
So I was thinking of saving the GDCprepare output as a data.frame by setting summarizedExperiment to FALSE, and then converting it to a SummarizedExperiment object myself.
I think the problem is clearer this time, and I hope this helps you understand it. Thank you very much for your response; if you know how to convert a data.frame to a SummarizedExperiment, it would really make my day.
Thank you so much.
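A minimal sketch of the conversion asked about above, under stated assumptions: `counts_df` is a hypothetical toy data.frame with invented counts, with one gene ID column followed by one column per sample barcode. The real data.frame returned by GDCprepare with summarizedExperiment = FALSE may be laid out differently, so it is worth checking with head() before adapting this.

```r
library(SummarizedExperiment)

# Hypothetical stand-in for the GDCprepare data.frame (invented counts).
counts_df <- data.frame(
  gene_id = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
  `TCGA-3L-AA1B-01A-11R-A37K-07` = c(120L, 0L, 873L),
  `TCGA-DM-A1D8-01A-11R-A155-07` = c(95L, 3L, 1024L),
  check.names = FALSE  # keep the barcodes as column names unchanged
)

# Count matrix: genes in rows, samples in columns.
count_mat <- as.matrix(counts_df[, -1])
rownames(count_mat) <- counts_df$gene_id

# Minimal sample annotation; characters 14-15 of a TCGA barcode hold
# the sample type code (01 = primary tumor, 11 = solid tissue normal).
sample_info <- DataFrame(
  barcode = colnames(count_mat),
  sample_code = substr(colnames(count_mat), 14, 15),
  row.names = colnames(count_mat)
)

# Minimal gene annotation, keyed by the same row names as the matrix.
gene_info <- DataFrame(
  ensembl_gene_id = rownames(count_mat),
  row.names = rownames(count_mat)
)

se <- SummarizedExperiment(
  assays = list(counts = count_mat),
  rowData = gene_info,
  colData = sample_info
)
```

This gives a plain SummarizedExperiment rather than a RangedSummarizedExperiment, because no gene coordinates are attached. As far as I understand, the biomaRt lookup in GDCprepare is what supplies that gene annotation, so a ranged object would need the coordinates from another source, passed as a GRanges through the rowRanges argument.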