TCGA data set with both expression (RNA-seq/microarray) and exome-seq data for the same samples?
8.4 years ago

I have previously downloaded some TCGA RNA-seq data from the TCGA site, and I have also tried the TCGA-Assembler download route. However, I don't remember being able to do the following:

Does anyone have suggestions for finding data sets for a particular cancer that include both expression results (RNA-seq or microarray) and exome sequencing results for the same samples?

Edit:

I was able to use http://gdac.broadinstitute.org/ and found the data I needed.

tcga cbioportal cgdsr • 3.2k views
8.4 years ago

TCGA has both gene expression and exome/genome sequence data for nearly all samples. To get access to the actual sequencing data, you will need to apply for access, as sequencing data for human subjects is almost always controlled access. See here for instructions:

https://wiki.nci.nih.gov/display/TCGA/Application+Process


Thank you! After poking around I found http://gdac.broadinstitute.org/

However, when I downloaded data from that site, for example the MAF files and mRNAseq files from the COADREAD archives, I found only 74 samples shared between the mRNAseq data and the Mutation Annotation Format (MAF) files. I matched samples on the project, TSS, and participant ID fields of the TCGA barcodes.

When I queried cBioPortal with the R package cgdsr (version 1.2.5), I found that among the COADREAD data sets there should be at least 195 cases in one of the studies (Colorectal Adenocarcinoma (TCGA, Nature 2012)) with complete data (mutation, mRNA, etc.). The only problem with using cgdsr to query cBioPortal is that there is no bulk download; I need to specify particular genes. I am not sure why I get fewer overlapping cases through the GDAC website, though.
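For anyone repeating this comparison, here is a minimal Python sketch of the barcode matching described above. It assumes hyphen-separated TCGA barcodes and treats the first three fields (project, TSS, participant) as the case identifier. The function names and the example barcodes are made up for illustration; they are not taken from the GDAC archives.

```python
def participant_id(barcode: str) -> str:
    """Return the project-TSS-participant prefix of a TCGA barcode.

    Assumes hyphen-separated fields, e.g. TCGA-AA-0001-01A-01D-0000-10.
    """
    fields = barcode.strip().upper().split("-")
    return "-".join(fields[:3])


def overlapping_participants(maf_barcodes, rnaseq_barcodes):
    """Return the sorted participant IDs present in both barcode collections."""
    maf_ids = {participant_id(b) for b in maf_barcodes}
    rna_ids = {participant_id(b) for b in rnaseq_barcodes}
    return sorted(maf_ids & rna_ids)


# Illustrative (made-up) barcodes: one list as it might come from a MAF
# file's tumor sample column, one from the header of an mRNAseq matrix.
maf_barcodes = [
    "TCGA-AA-0001-01A-01D-0000-10",
    "TCGA-AA-0002-01A-01D-0000-10",
    "TCGA-AG-0003-01A-01D-0000-10",
]
rnaseq_barcodes = [
    "TCGA-AA-0001-01A-01R-0000-07",
    "TCGA-AG-0003-01A-01R-0000-07",
    "TCGA-AG-0004-01A-01R-0000-07",
]

shared = overlapping_participants(maf_barcodes, rnaseq_barcodes)
print(len(shared), shared)
```

Matching on sets of participant IDs also removes duplicates, so a participant with several aliquots in one file is counted once. If the column headers were read into R first, the hyphens may have been converted to dots, in which case the split character would need to change.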

8.3 years ago
nwon ▴ 60

All TCGA data has migrated to the Genomic Data Commons.

The legacy TCGA data is available within this web resource, in its legacy database.

8.3 years ago
pel ▴ 20

You can find the largest selection of Level 2 and Level 3 data (no human subjects protocol required) at the PanCancer 12 site. It covers somatic mutations, CNVs, SNPs, methylation, RNA-Seq, and chip-based expression for each tumor in TCGA, across multiple cancer sites.

Recall, as was pointed out above, that you cannot get the sequence data without approval. However, the mutation (.maf) files are Level 2 and contain most of the mutation calls.
