8.7 years ago
Sean Davis
This is a short survey question meant to produce a list of client software for accessing TCGA data from R, Python, or any other language. Please consider adding to the list or commenting on your experience with each tool.
R
- http://www.liuzlab.org/TCGA2STAT/
- https://www.bioconductor.org/packages/release/bioc/html/RTCGAToolbox.html & https://github.com/link-ny/rtcgatoolbox (Bioconductor compatible objects returned)
- http://www.compgenome.org/TCGA-Assembler/
- https://www.bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html
- http://www.cbioportal.org/cgds_r.jsp
- https://github.com/isb-cgc/examples-R
- https://github.com/mariodeng/FirebrowseR to pull Firehose output into R
There is a very brief tutorial on the use of RTCGAToolbox (the plain version, not the linkNY-enhanced one) at http://genomicsclass.github.io/book/pages/tcga.html
The layout is somewhat clumsy, but it illustrates approaches to using survival, stage, mutation, expression, and methylation data. The tutorial concludes with these remarks:
TCGA is an obvious candidate for infrastructure development to support multiomic analysis. We have seen some of the challenges that arise when even a nicely developed tool like RTCGAToolbox is used to acquire the data: we must be alert to mismatched sample identifier labels, missing data, inadequate documentation of sample provenance and assay conduct, and so on. Human effort is invariably required; standards for data quality must go beyond numerical accuracy and address transparency and usability.
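As a small illustration of the "mismatched sample identifier labels" problem mentioned above, here is a minimal Python sketch of one common workaround: reducing TCGA barcodes to their first three fields (project, tissue source site, participant) before matching samples across assays. The function names and the example barcodes are made up for illustration and do not come from any of the packages listed. It also assumes the only differences are case and dots in place of hyphens (which R can introduce when barcodes become column names), so treat it as a starting point rather than a complete solution.

```python
def patient_id(barcode: str) -> str:
    """Reduce a TCGA barcode to its participant-level prefix.

    Keeps the first three fields (e.g. TCGA-02-0001) and normalizes
    case and dot-separated barcodes to the hyphenated form.
    """
    fields = barcode.strip().upper().replace(".", "-").split("-")
    return "-".join(fields[:3])


def group_by_patient(barcodes):
    """Map each participant ID to the list of full barcodes that share it."""
    groups = {}
    for barcode in barcodes:
        groups.setdefault(patient_id(barcode), []).append(barcode)
    return groups


def shared_patients(barcodes_a, barcodes_b):
    """Return participants present in both lists, with their original barcodes."""
    groups_a = group_by_patient(barcodes_a)
    groups_b = group_by_patient(barcodes_b)
    common = sorted(groups_a.keys() & groups_b.keys())
    return {p: (groups_a[p], groups_b[p]) for p in common}


# Hypothetical barcodes, for illustration only.
expr_ids = ["TCGA-02-0001-01C-01R-0177-01", "TCGA-02-0003-01A-01R-0177-01"]
meth_ids = ["TCGA.02.0001.01C.01D.0186.05", "TCGA-02-0004-01A-01D-0186-05"]

matched = shared_patients(expr_ids, meth_ids)
```

Grouping into lists rather than a one-to-one mapping is deliberate: a participant can have several samples (for example tumor and normal), and which one to pair with which still needs a human decision, which is very much the point of the quoted remarks.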