Help finding specific breast cancer datasets
5.5 years ago
oludhe ▴ 90

Hi All,

I am pretty new to bioinformatic analysis and so I apologize if this seems like a simple/redundant question. I am wondering whether specific datasets are available - or how to filter existing datasets to find specific data.

I am looking for RNA-seq data for breast cancer that has metastasized to the bone, i.e. RNA-seq of the secondary tumour. I would also like data on time to recurrence, so that I can determine how long outgrowth at the secondary site took.

Please help me work out how to find data at this level of specificity.

Thank you.

breast cancer metastasis bone • 1.4k views

Hi Kevin,

Thank you so much for this answer, I found some of the datasets I was looking for!

5.5 years ago

TCGA has 17 [breast] primary tumour samples that eventually metastasised to bone; however, the RNA-seq is of the primary tumour. In total, only 7 breast cancer samples in TCGA are actual metastatic samples, and none of these are bone mets, from what I can see. I got this information from my two answers here: TCGA metastatic samples

Then there is this study that looked at bone mets, but it is microarray data: Latent bone metastasis in breast cancer tied to Src-dependent survival signals (GSE14020)

Probably your best bet is to browse the curated datasets at the relatively new Human Cancer Metastasis Database. Go to Browse, filter for 'breast cancer', and then look for 'bone' in the Metastasis Site column.

Kevin
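The browse-and-filter step suggested above can also be done offline if the table is exported as CSV. The following is only a minimal sketch: the column names `sample_id`, `primary_site` and `metastasis_site` are hypothetical stand-ins (the real export may label them differently), and the four example rows are made up for illustration.

```python
import csv
import io


def filter_bone_mets(csv_text, primary="breast", site="bone"):
    """Return rows whose primary site matches and whose metastasis sites include `site`."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        primary_ok = primary in row["primary_site"].lower()
        # A site entry may be a comma-separated list such as "bone,liver,lung,other".
        sites = [s.strip().lower() for s in row["metastasis_site"].split(",")]
        if primary_ok and site in sites:
            matches.append(row)
    return matches


# Made-up rows; the column names are assumptions, not the database's real headers.
example = """sample_id,primary_site,metastasis_site
S1,breast,bone
S2,breast,"bone,liver,lung,other"
S3,breast,lung
S4,prostate,bone
"""

hits = filter_bone_mets(example)
print([row["sample_id"] for row in hits])
```

Splitting the site field on commas, rather than doing a plain substring search, keeps multi-site entries while avoiding accidental matches on longer words that merely contain "bone".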


Hi Kevin,

Do you happen to have an R/Bioconductor or even a Python script for querying and downloading TCGA files? I am finding it hard to download the relevant data and view it as a data frame.

Thanks!


Is it now resolved?


What do you mean by it?

I found the datasets and have an Excel sheet of their IDs, but I was wondering whether you had a script that could help me automate finding and downloading the datasets through multiple queries to the TCGA/GDC portal.
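For the TCGA part of this request, one possible approach is the public GDC REST API. This is a hedged sketch, not a script posted by anyone in this thread: it assumes that the `files` endpoint at `https://api.gdc.cancer.gov/files` accepts the filter fields `cases.samples.submitter_id`, `data_category` and `data_type`, and that sample barcodes such as `TCGA-A2-A3XU-01A` match the sample `submitter_id`. Check the field names against the current GDC documentation before relying on them. The helper names `build_gdc_params` and `fetch_file_hits` are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the public GDC API.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_gdc_params(sample_barcodes, size=100):
    """Build query parameters asking for expression files of the given samples."""
    filters = {
        "op": "and",
        "content": [
            {"op": "in", "content": {
                "field": "cases.samples.submitter_id",  # assumed field name
                "value": sorted(set(sample_barcodes)),   # drop repeated barcodes
            }},
            {"op": "=", "content": {
                "field": "data_category", "value": "Transcriptome Profiling"}},
            {"op": "=", "content": {
                "field": "data_type", "value": "Gene Expression Quantification"}},
        ],
    }
    return {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,cases.samples.submitter_id",
        "format": "JSON",
        "size": str(size),
    }


def fetch_file_hits(sample_barcodes):
    """Send the query (needs network access) and return the list of matching files."""
    query = urllib.parse.urlencode(build_gdc_params(sample_barcodes))
    with urllib.request.urlopen(GDC_FILES_ENDPOINT + "?" + query) as response:
        return json.load(response)["data"]["hits"]
```

Calling `fetch_file_hits(["TCGA-A2-A3XU-01A", "TCGA-AC-A2FE-01A"])` should return one record per matching file; each `file_id` could then be downloaded from the GDC `data` endpoint or passed to the GDC Data Transfer Tool. GEO samples (GSM accessions) are not covered by this API and need a separate route.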


Can you elaborate on which IDs you have, specifically?


Hi Kevin, I downloaded all the data as CSV, filtered for primary 'breast' and secondary 'bone' or 'bone,liver,lung,other', and got 107 matches: either microarray data from GEO (GSM accessions) or data from other sequencing platforms in TCGA.

GSM352100 GSM352103 GSM352105 GSM352109 GSM352117 GSM352119 GSM352123 GSM352124 GSM352126 GSM352131 GSM352144 GSM352149 GSM352151 GSM352154 GSM352155 GSM352159 GSM352163 GSM352167 GSM352100 GSM352103 GSM352105 GSM352109 GSM352117 GSM352119 GSM352123 GSM352124 GSM352126 GSM352131 GSM352144 GSM352149 GSM352151 GSM352154 GSM352155 GSM352159 GSM352163 GSM352167 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1312932 GSM1312938 GSM1312944 GSM1312946 GSM1312948 GSM1312953 GSM1312955 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-AR-A5QQ-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-AR-A5QQ-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-AR-A5QQ-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A

These are all of the sample IDs.
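Several IDs in the pasted list appear more than once (for example, GSM352100), so a first step before any download could be to de-duplicate the list and separate the GEO accessions from the TCGA barcodes. A minimal sketch, assuming the IDs are whitespace-separated as pasted above; the function name `split_sample_ids` is made up for illustration and only a short excerpt of the list is used:

```python
def split_sample_ids(raw_ids):
    """De-duplicate whitespace-separated IDs and split them into GEO and TCGA groups."""
    # dict.fromkeys keeps the first occurrence of each ID and preserves order.
    unique = list(dict.fromkeys(raw_ids.split()))
    geo = [i for i in unique if i.startswith("GSM")]
    tcga = [i for i in unique if i.startswith("TCGA-")]
    return geo, tcga


# Short excerpt of the list above, including repeats.
raw = (
    "GSM352100 GSM352103 GSM352100 GSM1124888 GSM1124888 "
    "TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-A2-A3XU-01A"
)

geo_ids, tcga_ids = split_sample_ids(raw)
print(len(geo_ids), geo_ids)
print(len(tcga_ids), tcga_ids)
```

The GSM accessions can then be looked up in GEO and the TCGA barcodes in the GDC portal, since the two sources are queried differently.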


You mentioned RNA-seq in the original post. Are you OK with microarray data as well?


Yes, I am okay with microarray data, as the datasets I have to choose from are very limited.
