Triple Negative Breast Cancer: Undergraduate project help? Finding publicly accessible data.
7.7 years ago
Caitlin ▴ 100

Hi all

I am enrolled in an undergraduate computational biology course and I am struggling to locate publicly accessible data for a course project. Ideally, the data would come from human patients or participants, with sequences from both control (unaffected) and experimental (affected) tissue. I have been exploring The Cancer Genome Atlas, but I find the interface confusing: I do not know where, or indeed whether, sequence data in a format I understand (e.g., FASTA) can be found. I am required to determine the secondary RNA structures, e.g., alpha helices, beta-pleated sheets, beta barrels, etc., and PSIPRED requires that all submissions be amino acid sequences. I am familiar with both Python and R, so I could quickly learn how to use a specific package if doing so would help me complete a noteworthy project.

Please note: although I mentioned a specific form of breast cancer (triple negative) in the title, I am perfectly willing to change the focus to another form of cancer as long as sequence data can be readily obtained. I would also like to obtain microarray data in order to construct a heat map illustrating differing levels of gene expression.

Thank you.

~Caitlin

cancer microarray Python R • 1.7k views

Have a look at OncoTrack and/or the eTRIKS portal. It may be worth having a look at the Open Targets Platform too, which lists the targets associated with triple-negative breast cancer based on our latest release. Try searching for other cancers as well. The associations are based on differential expression (from microarray and RNA-seq experiments) and other data sources. We also show whether expression is up- or downregulated in patients versus controls (look for 'increased' under the 'Activity' column in the table showing the evidence for the BRIP1-breast carcinoma association). Just be aware that the Platform does not store patient data. The expression data comes from Expression Atlas, and we link each study back to the original database.
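To make the "increased"/"decreased" idea concrete, here is a minimal sketch of how a patients-versus-controls comparison could be labelled from expression values. It is not how Open Targets or Expression Atlas compute their calls; the gene names, the expression values, the pseudocount, and the threshold of 1.0 are all made up for illustration, and a real analysis would add normalisation and a statistical test.

```python
import math
from statistics import mean


def log2_fold_change(patient_values, control_values, pseudocount=1.0):
    """Return log2 of (mean patient expression / mean control expression).

    The pseudocount (an illustrative choice) avoids division by zero.
    """
    patient_mean = mean(patient_values) + pseudocount
    control_mean = mean(control_values) + pseudocount
    return math.log2(patient_mean / control_mean)


def direction(lfc, threshold=1.0):
    """Label a log2 fold change; the threshold is an arbitrary assumption."""
    if lfc >= threshold:
        return "increased"
    if lfc <= -threshold:
        return "decreased"
    return "unchanged"


# Hypothetical genes with made-up expression values, for illustration only.
expression = {
    "GENE_A": {"patients": [30.0, 34.0, 32.0], "controls": [7.0, 8.0, 6.0]},
    "GENE_B": {"patients": [3.0, 3.0, 3.0], "controls": [15.0, 15.0, 15.0]},
    "GENE_C": {"patients": [10.0, 10.0, 10.0], "controls": [10.0, 10.0, 10.0]},
}

for gene, groups in expression.items():
    lfc = log2_fold_change(groups["patients"], groups["controls"])
    print(f"{gene}\t{lfc:+.2f}\t{direction(lfc)}")
```

The same per-gene log2 fold changes are also the kind of values typically shown in an expression heat map.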

7.7 years ago
Jake Warner ▴ 840

Perhaps easier than wading through the sequencing archives would be to find a recent study, download its data, and then perform your analyses. For example, putting "triple negative breast cancer sequencing" into PubMed turned up this:

http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1002193#sec034

You can find their data in the "data availability" box. This is just one example but I'm sure you can find a study and dataset that is close to what you're looking for!
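Since PSIPRED only accepts amino acids, nucleotide sequences from a downloaded dataset would need translating first. Here is a rough sketch using only the Python standard library. It assumes each FASTA record is a coding sequence that is already in frame from its first base, which raw reads or genomic sequence generally are not; the record name and sequence below are invented. Biopython's `Seq.translate()` does the same job and is probably the better choice in practice.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_fasta(text):
    """Return a dict mapping each FASTA header to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases (e.g. N) are translated as 'X'.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy record, for illustration only.
toy_fasta = """>toy_cds
ATGGCCAAA
TGGTAA
"""

for name, seq in parse_fasta(toy_fasta).items():
    print(f">{name}\n{translate(seq)}")
```

The printed protein FASTA is the sort of input that could then be pasted into PSIPRED.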
