Question

Relational Database For De Novo Variants

13.1 years ago

Davy ▴ 410

I have been asked to look into creating a database to store data relating to de novo variants found during a sequencing project at my institution. The database would then be used to centrally store all the information amassed on the discovered variants, and would be accessed via a website by other members of my institution to download, upload, and modify information relating to these variants. I am somewhat familiar with Python, Django, and MySQL (I've been using Python for a number of years for simple scripting, and I've been through the Django and MySQL tutorials).

I have been thinking about the database design. The database will need to store things like chromosome, bp, gene, exon, strand, cytogenetic band, individual in which the variant was discovered, validation status, GERP score, PolyPhen prediction, and perhaps more information that I haven't thought of yet. Does anyone have any ideas for an optimal DB design in this instance? Should I use separate tables for Variant, Gene, Individual, and then use relational tables (is that what they are called?) to put all the info together for the end user?

Any suggestions are welcome, and also if you know some better tools than python, django, and mysql, please let me know. Cheers, Davy.

denovo database python mysql • 3.8k views

ADD COMMENT • link updated 13.1 years ago by Leonor Palmeira 3.9k • written 13.1 years ago by Davy ▴ 410

If you want a subjective opinion, I am currently partial to Flask instead of Django and PostgreSQL rather than MySQL.

ADD REPLY • link 13.1 years ago by Sean Davis 27k

cross-posted on SO: http://stackoverflow.com/questions/11181142

ADD REPLY • link 13.1 years ago by Pierre Lindenbaum 166k

score 3 · Answer 1 · 2012-06-23

13.1 years ago

Pierre Lindenbaum 166k

In my bookmarks: LOVD:

"Leiden Open (source) Variation Database."

LOVD's purpose : To provide a flexible, freely available tool for Gene-centered collection and display of DNA variations.

http://www.lovd.nl/2.0/

score 2 · Answer 2 · 2012-06-23

Not knowing your level of expertise, here are some thoughts from my own database experience. First, you should try answering the following questions:

which fields belong to the same category (i.e. describe the characteristics of another field)?
which fields of the db will be queried by users?
which combination of fields will be queried most frequently?

These answers should lead you to the database design that is most appropriate for your situation.
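As a rough illustration of how the last question feeds into the design, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. It assumes, purely for the sake of the example, that the most frequent query is "all variants in a region of a chromosome"; the table, column, and index names are hypothetical, and the rows are toy data.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE variant (
        id         INTEGER PRIMARY KEY,
        chromosome TEXT    NOT NULL,
        position   INTEGER NOT NULL,
        gerp_score REAL
    )
    """
)

# Composite index matching the assumed most frequent lookup:
# equality on chromosome, then a range on position.
conn.execute(
    "CREATE INDEX idx_variant_chrom_pos ON variant (chromosome, position)"
)

# Toy rows, made up for illustration.
conn.executemany(
    "INSERT INTO variant (chromosome, position, gerp_score) VALUES (?, ?, ?)",
    [("1", 1000, 2.5), ("1", 5000, 4.1), ("2", 1200, -0.3)],
)

region_query = (
    "SELECT id, position FROM variant "
    "WHERE chromosome = ? AND position BETWEEN ? AND ?"
)
params = ("1", 500, 2000)
hits = conn.execute(region_query, params).fetchall()

# Ask SQLite how it plans to run the query; the last column of each
# row is a human-readable description of the step.
plan = conn.execute("EXPLAIN QUERY PLAN " + region_query, params).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(hits)
print(plan_text)
```

If the query plan mentions the index name, the database is using the index rather than scanning the whole table. A different dominant query (say, by gene or by individual) would call for a different index.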

Briefly, each table should describe the features of a given entity. For instance, if you have a 'Gene' table, each row should describe chromosome, strand, ORF position, ... It may sound obvious, but conceptually separating each entity's description into its own table will give you a sane design. Then, depending on the expected usage, try to avoid having to join tables for the most common queries. If a particular join is needed frequently, consider keeping an intermediate table that contains the result of that join.
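To make this concrete, here is a minimal sketch of one possible layout, again using Python's built-in sqlite3 module as a stand-in for MySQL. All table and column names (`gene`, `individual`, `variant`, `variant_individual`, `variant_report`) are hypothetical, and the rows are made-up toy data. It shows one table per entity, a link table between variants and individuals, and an intermediate table holding a frequently used join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
conn.executescript(
    """
    CREATE TABLE gene (
        id         INTEGER PRIMARY KEY,
        symbol     TEXT NOT NULL UNIQUE,
        chromosome TEXT NOT NULL,
        strand     TEXT NOT NULL
    );
    CREATE TABLE individual (
        id    INTEGER PRIMARY KEY,
        label TEXT NOT NULL UNIQUE
    );
    CREATE TABLE variant (
        id       INTEGER PRIMARY KEY,
        gene_id  INTEGER NOT NULL REFERENCES gene(id),
        position INTEGER NOT NULL,
        polyphen TEXT
    );
    -- Link table: one row per (variant, individual) observation.
    CREATE TABLE variant_individual (
        variant_id    INTEGER NOT NULL REFERENCES variant(id),
        individual_id INTEGER NOT NULL REFERENCES individual(id),
        validated     INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (variant_id, individual_id)
    );
    """
)

# Toy data, made up for illustration.
conn.execute(
    "INSERT INTO gene (id, symbol, chromosome, strand) VALUES (1, 'GENE_A', '1', '+')"
)
conn.executemany(
    "INSERT INTO individual (id, label) VALUES (?, ?)",
    [(1, "IND_001"), (2, "IND_002")],
)
conn.execute(
    "INSERT INTO variant (id, gene_id, position, polyphen) VALUES (1, 1, 1000, 'benign')"
)
conn.executemany(
    "INSERT INTO variant_individual (variant_id, individual_id, validated) VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 0)],
)

# The join an end user would typically want to see.
JOIN_SQL = """
    SELECT g.symbol     AS symbol,
           g.chromosome AS chromosome,
           v.position   AS position,
           i.label      AS label,
           vi.validated AS validated
    FROM variant_individual AS vi
    JOIN variant    AS v ON v.id = vi.variant_id
    JOIN gene       AS g ON g.id = v.gene_id
    JOIN individual AS i ON i.id = vi.individual_id
"""
joined = conn.execute(JOIN_SQL + " ORDER BY i.label").fetchall()

# Intermediate table: store the join result once so that frequent
# reads do not have to repeat the join.
conn.execute("CREATE TABLE variant_report AS " + JOIN_SQL)
report = conn.execute(
    "SELECT symbol, chromosome, position, label, validated "
    "FROM variant_report ORDER BY label"
).fetchall()

print(joined)
print(report)
```

The trade-off is that an intermediate table like `variant_report` is a copy: it has to be rebuilt or updated whenever the underlying tables change, so it only pays off when reads are much more frequent than writes.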

score 0 · Answer 3 · 2012-06-23

13.1 years ago

Christian ★ 3.1k

You might find this useful:

Human variation database: an open-source database template for genomic discovery

http://bioinformatics.oxfordjournals.org/content/27/8/1155.abstract

This is an old question, but I am looking at setting up a local DB for variants from local exome sequencing projects, which will allow us to see rare variants that are due to our local population and are not causal disease variants. I looked at using this tool, but following the website's instructions, it really seems like it just doesn't work. Is anyone successfully using it?

ADD REPLY • link 12.9 years ago by DG 7.3k

Have you tried contacting the author?

ADD REPLY • link 12.9 years ago by Christian ★ 3.1k

Not yet. I decided to cut my losses and just implement my own database; that was going to be quicker and easier for my needs anyway.

ADD REPLY • link 12.8 years ago by DG 7.3k

score 0 · Answer 4 · 2012-06-23

13.1 years ago

Sean Davis 27k

Before going off to build something from scratch, you should look at using BioMart. It may suit your needs and will be less work than a do-it-yourself solution, though it is also less flexible.
