Publicly Available Tumor/Normal Illumina Data For Evaluation Of Somatic Variant Callers
1
11
12.1 years ago

Is anyone aware of some publicly available paired-end Illumina data that is unencumbered by data use agreements so that they could be used in a teaching context?

I would need the data to correspond to a matched tumor/normal sample pair. Ideally sequenced on the Illumina HiSeq platform.

There are of course many, many papers describing analysis of these kinds of data. My own center has many such data sets. But they are all significantly encumbered by data access restrictions that prevent them from being used publicly or disseminated as part of a course/workshop. I need to be able to use the data in classes, and ideally the students should be able to access the data later to practice their skills.

Perhaps a tumor cell line that was sequenced along with an EBV-transformed lymphoblastoid 'normal' cell line from the same individual? Or perhaps a patient sample that was consented in very broad terms with the express purpose of making the data publicly available with minimal restrictions on use (including explicit approval for teaching purposes)?

Please feel free to suggest something that doesn't quite meet my criteria but that you think is likely to be the next best thing.

variant somatic • 7.0k views
0

Does anyone know of publicly available data (e.g., FASTQ files) for tumor/normal pairs from mice? Thank you in advance!

4
12.1 years ago
Irsan ★ 7.8k

You can go to the SRA at NCBI and search for entries, filtering on DNA, whole genome and public access. When I did that I came across a study with tumor-normal pairs here. You might want to install the Aspera client for faster file transfers, but you can also download the data without it.

If it is not critical to have tumor-normal pairs, you can try the data from the 1000 Genomes Project, available from its FTP site. Each directory there corresponds to an individual and provides the original reads and (exome) alignments. There are no tumor-normal pairs, but there is no restriction on data use.
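The SRA search described above can also be scripted. The following is only a rough sketch using the NCBI E-utilities `esearch` endpoint from the Python standard library. The filter strings in `DEFAULT_FILTERS` and the keyword query are my assumptions about how the "DNA, whole genome, public access" filters translate into Entrez syntax, and the helper names are hypothetical; check the terms against the SRA advanced search builder before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed Entrez filter terms; verify them in the SRA advanced search builder.
DEFAULT_FILTERS = (
    '"public"[Access]',
    '"genomic"[Source]',
    '"wgs"[Strategy]',
    '"illumina"[Platform]',
    '"paired"[Layout]',
)


def sra_term(keywords="tumor AND normal", filters=DEFAULT_FILTERS):
    """Combine free-text keywords with the filter terms into one query."""
    return " AND ".join([f"({keywords})", *filters])


def esearch_url(term, retmax=20):
    """Build the esearch URL for the SRA database, asking for JSON output."""
    params = {"db": "sra", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


def parse_ids(json_text):
    """Extract the list of record IDs from an esearch JSON response."""
    return json.loads(json_text)["esearchresult"]["idlist"]


def search_sra(term, retmax=20):
    """Run the search over the network and return matching SRA record IDs."""
    with urlopen(esearch_url(term, retmax)) as resp:
        return parse_ids(resp.read().decode("utf-8"))
```

Calling `search_sra(sra_term())` needs network access and returns internal SRA record IDs, which still have to be looked up to see whether a study really contains matched tumor-normal pairs.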

0

That link goes to cancer cell line data. Anyone know of public matched tumor/normal sample data? Cell lines tend to acquire traits that are not in the original sample. Thanks.

0

Were you able to find any such study?

2

Here's a study that fits the original request -- freely released tumor-normal pairs from 7 patients: An open access pilot freely sharing cancer genomic data from participants in Texas

0

@Eric, did you by any chance come across data for tumor-normal pairs from mice? Thank you!

2

The sequencing data for benign, primary tumor and metastatic samples from 103 mice from McCreery et al. 2015 is available through ENA: https://www.ebi.ac.uk/ena/data/view/PRJEB4767

0

This is a very nice dataset. I needed to check the Tumor ID in Supplementary Table 1 against the "Submitter's sample name" on EBI to match the sample annotation and work out which fastq files belong to which class of tumor. On the EBI website, click on "Select columns" and choose "Submitter's sample name".
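For anyone who prefers to do that matching in a script rather than in the browser, here is a minimal sketch using the ENA Portal API file report. I am assuming that the `sample_alias` field is what the website shows as "Submitter's sample name"; the function names are mine, not part of any ENA client.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"
# Assumption: sample_alias corresponds to "Submitter's sample name" on the site.
FIELDS = ("run_accession", "sample_alias", "fastq_ftp")


def filereport_url(accession, fields=FIELDS):
    """Build the file report URL for all read runs in a study."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"


def parse_filereport(tsv_text):
    """Map each sample alias to the FASTQ paths listed for its runs."""
    by_sample = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Paired-end runs list both files in one cell, separated by ';'.
        files = [path for path in row["fastq_ftp"].split(";") if path]
        by_sample.setdefault(row["sample_alias"], []).extend(files)
    return by_sample


def fetch_filereport(accession):
    """Download and parse the file report (requires network access)."""
    with urlopen(filereport_url(accession)) as resp:
        return parse_filereport(resp.read().decode("utf-8"))
```

`fetch_filereport("PRJEB4767")` should then give a dictionary keyed by sample name, which still has to be joined by hand or by script to the Tumor IDs in Supplementary Table 1.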
