How Do You Retrieve The Segmented Copy-Number Scores Of The Tumor Samples And The Paired-Normal Control From The Level 3 Data?
11.0 years ago
liu4gre ▴ 210

In "Integrative eQTL-Based Analyses Reveal the Biology of Breast Cancer Risk Loci", the authors performed this analysis. Can anyone tell me how to do it? Thanks.

cnv • 4.8k views

You'll get a better response if you show that you have tried to research the question yourself: for example, link to the article, say whether you have read it, and say whether you have explored TCGA and, if so, what problems you ran into.


Thanks. I have read the paper, and I am wondering how to retrieve the segmented copy-number scores of the tumor samples and the paired-normal controls from the Level 3 data. To my knowledge, the Level 3 data gives the log2 ratio (segment mean); am I right? If so, how did the authors get copy-number scores for both tumor and control samples? Maybe they only meant the copy-number fold change?
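To make the log2-ratio reading concrete, here is a minimal sketch of the conversion, assuming the segment mean is log2(copy number / 2), i.e. a ratio against a diploid reference. That interpretation is an assumption here and should be checked against the documentation of the specific platform; the function name and the `ploidy` parameter are hypothetical and not taken from the paper.

```python
def segment_mean_to_copy_number(segment_mean: float, ploidy: int = 2) -> float:
    """Convert a log2 ratio into an estimated absolute copy number.

    Assumes segment_mean = log2(copy_number / ploidy), which is an
    assumption about the Level 3 file, not something stated in the paper.
    """
    return ploidy * 2.0 ** segment_mean


# Illustrative values only: a one-copy loss, a neutral segment, a gain.
for seg_mean in (-1.0, 0.0, 1.0):
    print(seg_mean, segment_mean_to_copy_number(seg_mean))
```

Under this assumption a segment mean of 0 corresponds to two copies, so "copy-number scores" for tumor and normal samples could simply be the per-sample segment means (or their back-transformed values) rather than a tumor-versus-normal fold change.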

11.0 years ago
pstew ▴ 50

The Cancer Genome Atlas (TCGA) "Level 3" CNV data is highly processed and provides mean copy-number estimates for segments covering the whole genome. Are you having trouble locating the URL for downloading the data, or trouble opening and interpreting the files in your download of Level 3 patient data? Here is a link to the TCGA "Data Matrix", which lets you query the database and download the requested samples.


Actually, my problem is interpreting the files in the Level 3 data. For the SNP arrays, does the segment mean represent the log2 fold change (between tumor and normal cells)? Thanks.


I can't find a mention of a log2 ratio in TCGA documentation. Level 3 data is typically processed and normalized, but I believe it would be very clear in the documentation or in the file if a transformation was performed or a ratio was taken. Here's a description of the levels of data for different data types: https://tcga-data.nci.nih.gov/tcga/tcgaDataType.jsp . If you're still unsure, why not just start from the Level 1 data? It looks like this Bioconductor package/tutorial allows you to get segment means from raw/Level 1 TCGA data: http://www.bioconductor.org/packages/release/bioc/vignettes/cghMCR/inst/doc/findMCR.pdf .

11.0 years ago
B. Arman Aksoy ★ 1.2k

If you want the latest segmentation files, you can download them from the Broad's Firehose archives (under the stddata folder):

http://gdac.broadinstitute.org/runs/

For example, for GBM:

http://gdac.broadinstitute.org/runs/stddata__2013_11_14/data/GBM/20131114/

and more specifically:

http://gdac.broadinstitute.org/runs/stddata__2013_11_14/data/GBM/20131114/gdac.broadinstitute.org_GBM.Merge_cna__hg_cgh_415k_g4124a__hms_harvard_edu__Level_3__segmentation__seg.Level_3.2013111400.0.0.tar.gz

Alternatively, you can go to http://www.broadinstitute.org/tcga and use their ready-to-go IGV webstart options.
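Once a segmentation file has been extracted from one of these archives, the tumor and paired-normal rows can be separated by the sample-type code in the TCGA barcode (01 to 09 for tumor, 10 to 19 for normal). The sketch below assumes a tab-delimited file with the columns `Sample`, `Chromosome`, `Start`, `End`, `Num_Probes` and `Segment_Mean`; the two example rows are made up for illustration, and the helper names are hypothetical.

```python
import csv
import io

# Made-up example rows in the assumed six-column .seg layout.
EXAMPLE_SEG = (
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "TCGA-02-0001-01C-01D-0182-01\t1\t61735\t1510801\t226\t0.1227\n"
    "TCGA-02-0001-10A-01D-0182-01\t1\t61735\t1510801\t226\t0.0031\n"
)


def sample_class(barcode):
    """Classify a TCGA barcode by the two-digit sample-type code."""
    code = int(barcode.split("-")[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    return "other"


def split_segments(handle):
    """Group segment rows into tumor, normal and other samples."""
    groups = {"tumor": [], "normal": [], "other": []}
    for row in csv.DictReader(handle, delimiter="\t"):
        row["Segment_Mean"] = float(row["Segment_Mean"])
        groups[sample_class(row["Sample"])].append(row)
    return groups


# For a real file, pass open(path, newline="") instead of the StringIO object.
groups = split_segments(io.StringIO(EXAMPLE_SEG))
print(len(groups["tumor"]), len(groups["normal"]))
```

Tumor and normal rows that share the first three barcode fields (for example `TCGA-02-0001`) come from the same patient, which is one way to pair them.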
