We are interested in using the data available through The Cancer Genome Atlas (TCGA) to look at pharmacogenomic outcomes. Essentially, we want to look at how patients with a given type of cancer (e.g. breast cancer) responded, or failed to respond, to the medication they received. The SNP and CNV genotype data are available at 3 levels. The clinical treatment data, however, is apparently sparse, with a lot of variables missing or incorrectly annotated. Does anyone have experience with this dataset, or could anyone comment on its correctness and completeness and on the feasibility of examining pharmacogenomic outcomes with it?
I have had little luck getting a straight answer to this question from the source, and the application process is a bit of a pain. Is there someplace this information can be found? Or do any of you know?
I have written to them. They wrote back telling me to use the data matrix at https://tcga-data.nci.nih.gov to answer these questions. After downloading it, I was astonished to see how much data is missing. It looks like the kind of exploration we intended is not possible, or at least not easy, using TCGA.
On the plus side, their data matrix does appear quite useful. It would just be nice if they could relax their iron-clad grip on level 1/2 data so that it is a little less annoying for researchers to get to.
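For anyone who wants to put a number on the missing data before committing to an analysis, here is a minimal Python sketch that reports the fraction of missing values per column in a tab-delimited clinical table. The column names, the sample rows, and the set of placeholder strings counted as missing are all assumptions made up for illustration; check them against the file you actually download.

```python
import csv
import io

# Strings counted as "missing". This set is an assumption; adjust it to
# whatever placeholders appear in your own export.
MISSING = {"", "NA", "N/A", "[Not Available]", "[Not Applicable]", "[Unknown]"}


def missing_fraction(handle, delimiter="\t"):
    """Return {column: fraction of rows whose value is a missing marker}."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    counts = {name: 0 for name in fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in fieldnames:
            value = (row.get(name) or "").strip()
            if value in MISSING:
                counts[name] += 1
    if total == 0:
        return {name: 0.0 for name in fieldnames}
    return {name: counts[name] / total for name in fieldnames}


# Tiny made-up table standing in for a downloaded clinical file
# (hypothetical columns and values, not real TCGA records).
sample = (
    "patient_id\tdrug_name\tresponse\n"
    "P1\tdrugA\t[Not Available]\n"
    "P2\t\tcomplete\n"
    "P3\t[Unknown]\t[Not Available]\n"
    "P4\tdrugB\tpartial\n"
)

fractions = missing_fraction(io.StringIO(sample))
for column, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{column}: {frac:.0%} missing")
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` sample, e.g. `with open(path, newline="") as fh: missing_fraction(fh)`. Columns such as drug name and response that come back mostly missing would be a quick sign that the pharmacogenomic question cannot be answered from that table.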