Public Disease Exomes (VCF)
10.1 years ago
davmlaw ▴ 130

Hi, I am writing a webapp to help people rapidly search through exome variants. I would like some publicly available test data sets for examples/tutorials.

I know people have previously asked for something similar:

However, I would specifically like exomes that are NOT healthy, i.e. from people with confirmed disease diagnoses, and that carry no usage restrictions. For instance, anything that requires written approval is out, as I want to put the data on my public site.

Ideally I'd like:

  • Famous / well-known disorders
  • Different types, e.g. a de novo case and a pedigree with a Mendelian disorder
vcf exome • 3.6k views

As I don't think you will find these datasets, could you just create examples that show off your webapp's functionality?

For instance, you could take the 1000 Genomes exomes, randomly assign people to case / control groups, and then display that data. This would, I think, be powerful enough to show the app's functionality for complex traits, but would be limited for the Mendelian traits you mention.
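A minimal sketch of the random case / control assignment, in Python. The function names, the 50/50 split, and the fixed seed are my own assumptions for illustration, not something taken from the 1000 Genomes tooling; the sample IDs are read from the standard `#CHROM` header line of a multi-sample VCF.

```python
import random


def read_sample_ids(vcf_header_line):
    """Return the sample IDs from a VCF '#CHROM' header line.

    The first nine tab-separated columns are fixed by the VCF format
    (CHROM .. FORMAT); everything after them is a sample name.
    """
    fields = vcf_header_line.rstrip("\n").split("\t")
    return fields[9:]


def assign_case_control(sample_ids, case_fraction=0.5, seed=0):
    """Randomly label each sample as 'case' or 'control'.

    case_fraction and seed are illustrative defaults (assumptions);
    a fixed seed keeps a tutorial dataset reproducible.
    """
    rng = random.Random(seed)
    shuffled = sorted(sample_ids)  # sort first so the result depends only on the seed
    rng.shuffle(shuffled)
    n_cases = round(len(shuffled) * case_fraction)
    return {
        sample_id: ("case" if i < n_cases else "control")
        for i, sample_id in enumerate(shuffled)
    }


if __name__ == "__main__":
    # Hypothetical header line with made-up sample names.
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE_A\tSAMPLE_B\tSAMPLE_C\tSAMPLE_D"
    for sample_id, group in sorted(assign_case_control(read_sample_ids(header)).items()):
        print(sample_id, group)
```

Because the labels are random, any "association" the app displays from this data is noise by construction, which is fine for a tutorial but worth stating next to the example.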

10.1 years ago

Many of the samples in ExAC (released yesterday, to my knowledge) come from people with genetic diseases. I doubt that they're annotated as such (privacy and all that), but if you pick a few diseases (say, BRCA) and look through the exome data at the affected genes, I suspect you'll find a number of useful samples (there are >60,000 samples to begin with, so getting enough for your needs shouldn't be an issue). Those samples should be free to use.
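As a rough sketch of "looking at the affected genes", the Python below pulls the records that fall inside one genomic interval out of plain-text VCF lines. The function name is mine, and the chromosome and coordinates in the usage example are placeholders rather than real gene boundaries; you would substitute the interval of the gene you care about, on the same reference build as the VCF.

```python
def variants_in_region(vcf_lines, chrom, start, end):
    """Yield (chrom, pos, ref, alt) for VCF records inside [start, end].

    Positions are 1-based and the interval is inclusive, matching the
    POS column of the VCF format. Header lines (starting with '#') and
    lines with fewer than five columns are skipped.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        rec_chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if rec_chrom == chrom and start <= pos <= end:
            yield rec_chrom, pos, ref, alt


if __name__ == "__main__":
    # Toy records with made-up positions, for illustration only.
    toy_vcf = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "17\t100\t.\tA\tG\t50\tPASS\t.",
        "17\t250\t.\tC\tT\t50\tPASS\t.",
        "13\t150\t.\tG\tA\t50\tPASS\t.",
    ]
    # Placeholder interval: chromosome 17, positions 90 to 200.
    for record in variants_in_region(toy_vcf, "17", 90, 200):
        print(record)
```

For a real file you would pass an open file handle (for example from `gzip.open(path, "rt")` for a `.vcf.gz`) instead of the toy list, and then check the surviving records against a list of known disease mutations.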


Devon, the ExAC exomes are not annotated with phenotypes, but they do include samples from disease cohorts, so your idea of searching for samples with known disease mutations is a good one.


Thanks. I was talking to someone after work and they suggested BRCA from TCGA, but a Mendelian disorder would be nice. How many hundreds of thousands of exomes are there, and yet no public diseased datasets!


The biggest issue here is one of privacy. That's likely why you won't find many datasets from whole family exome sequencing.

7.4 years ago

Hi there, I made a public dataset of patients with Mendelian disorders:

https://github.com/raonyguimaraes/mendelmd/tree/master/examples
