Storage Of DNA Database In MongoDB
3
1
14.1 years ago
Sasikala ▴ 40

Can I use MongoDB for DNA database storage?

sequence database • 8.1k views
3

You should probably specify the use cases/scenarios that you have in mind.

0

How come people vote up such a question?

6
14.1 years ago
Neilfws 49k

Short answer is "yes": you can store anything you like in MongoDB, since it's schema-free.

Longer answer: as discussed above, the main consideration is the 4 MB size limit for documents. This constrains your document design, so you need to think about what goes into a document. As Brad suggested, you'll need GridFS to store larger objects.
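To make the size constraint concrete, here is a minimal sketch of splitting one long sequence into several smaller documents. It is not the GridFS API (GridFS does its own chunking on the server side); `chunkSequence`, the field names `seq_id`, `n` and `start`, and the chunk size are all hypothetical choices for illustration. The 4 MB figure comes from this thread, and newer MongoDB releases document a 16 MB limit, so the limit is a parameter rather than a constant.

```javascript
// Hypothetical helper: split one long sequence into chunk documents.
// maxChunkBases is an assumption; pick a value well below the document
// size limit of your server (DNA letters are ASCII, so one base = one byte).
function chunkSequence(id, seq, maxChunkBases) {
  if (!Number.isInteger(maxChunkBases) || maxChunkBases <= 0) {
    throw new RangeError("maxChunkBases must be a positive integer");
  }
  const chunks = [];
  for (let start = 0, n = 0; start < seq.length; start += maxChunkBases, n++) {
    chunks.push({
      seq_id: id,   // which sequence this chunk belongs to
      n: n,         // chunk number, so chunks can be reassembled in order
      start: start, // 0-based offset of the first base in this chunk
      seq: seq.slice(start, start + maxChunkBases),
    });
  }
  return chunks;
}

// Tiny demo with an unrealistically small chunk size.
const docs = chunkSequence("CB017399", "GGAAGGGCTGCC", 5);
console.log(docs.map((d) => d.seq)); // [ 'GGAAG', 'GGCTG', 'CC' ]
```

Each element of `docs` could then be saved as its own document, with `seq_id` and `n` as the keys to query on when reassembling.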

As with any database, it's good to think about design before you start. The temptation with MongoDB is simply to "stuff and forget" - any data that can be parsed into a hash-like structure is easy to save. But then how do you retrieve documents and what do you want to do with them? It helps to have a good idea of document structure, which keys to index, which keys to query on and so on. This is particularly the case if you intend to employ map-reduce, e.g. compound keys will not work for that case. Some people like to impose a schema using one of the many available object document mappers (ODMs), either when saving or later on for query/retrieval.
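As an illustration of deciding on document structure up front, the following sketch parses a single FASTA record into a hash-like object with a fixed set of keys. `fastaToDoc` and the key names `description` and `length` are hypothetical, and it assumes the header has the form `>id free text`; real headers vary, so treat this as a starting point rather than a parser for every FASTA flavour.

```javascript
// Hypothetical helper: turn one FASTA record into a document with known keys.
// Assumes the header looks like ">ID optional description".
function fastaToDoc(record) {
  const lines = record.trim().split(/\r?\n/);
  const header = lines[0];
  if (!header.startsWith(">")) {
    throw new Error("not a FASTA record: header must start with '>'");
  }
  // First whitespace-delimited token is the identifier, the rest is free text.
  const [id, ...rest] = header.slice(1).trim().split(/\s+/);
  // Join the sequence lines and normalise case so queries behave consistently.
  const seq = lines.slice(1).map((l) => l.trim()).join("").toUpperCase();
  return {
    _id: id,                     // key to retrieve by
    description: rest.join(" "), // free text from the header
    length: seq.length,          // cheap to store, handy to query on
    seq: seq,
  };
}

const doc = fastaToDoc(">CB017399 Gallus gallus EST\nGGAAG\nggctg\n");
console.log(doc._id, doc.length); // CB017399 10
```

The point is only that the keys you plan to query or index on (`_id` and `length` here) are chosen before anything is saved.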

MongoDB also allows relations between collections. These can be useful in certain scenarios, although some purists suggest that it's better to avoid "relational thinking" and aim for a purely key-value approach, in order to better understand MongoDB.

4
14.1 years ago

(Neil's not here? yessss ;-) ) Yes, MongoDB is a key-value data store, so, for example, if you want to save a pair (name, sequence), it is straightforward with MongoDB:

use mydb;
db.dna.save({_id:"CB017399", seq:"GGAAGGGCTGCCCCACCATTCATCCTTTTCTCGTAGTTTGTGCACGGTGCGGGAGGT..."});

and you can also add further fields, which can later be indexed:

db.dna.save(
   {_id:"CB017399",
   gi:27592135,
   organism:{name:"Gallus gallus",taxid:9031},
   seq:"GGAAGGGCTGCCCCACCATTCATCCTTTTCTCGTAGTTTGTGCACGGTGCGGGAGGT..."}
   );

However, I'm not sure it would be a good way to store large sequences.

If you really want to use a key/value datastore, have a look at BerkeleyDB. This (free) engine is interesting because it is fast (everything is binary data), it can be embedded (no network involved), and you can retrieve only a chunk of a value. So, say, if you store the human chr1 and ask for the very first bases, you won't have to load the entire chromosome into memory.
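The idea of reading only part of a value can be sketched independently of any particular engine. The code below is not the BerkeleyDB API; it assumes the sequence was stored as fixed-size chunks and that `getChunk(n)` is some hypothetical function that loads chunk number `n` from wherever it lives. Only the chunks overlapping the requested range are ever loaded.

```javascript
// Hypothetical partial read over a sequence stored as fixed-size chunks.
// getChunk(n) is assumed to return the n-th chunk as a string, or undefined
// when n is past the end. Coordinates are 0-based, half-open: [start, end).
function readRange(getChunk, chunkSize, start, end) {
  if (start < 0 || chunkSize <= 0) {
    throw new RangeError("start must be >= 0 and chunkSize must be > 0");
  }
  if (end <= start) return "";
  const first = Math.floor(start / chunkSize);
  const last = Math.floor((end - 1) / chunkSize);
  let out = "";
  for (let n = first; n <= last; n++) {
    const chunk = getChunk(n);
    if (chunk === undefined) break; // requested range runs past the data
    const offset = n * chunkSize;   // position of this chunk's first base
    out += chunk.slice(Math.max(start - offset, 0), Math.min(end - offset, chunk.length));
  }
  return out;
}

// Demo: an in-memory array stands in for the datastore, and we count loads.
const storedChunks = ["GGAAG", "GGCTG", "CC"];
let loads = 0;
const fetchChunk = (n) => { loads++; return storedChunks[n]; };

console.log(readRange(fetchChunk, 5, 0, 3)); // GGA
console.log(loads);                          // 1
```

In the demo, asking for the first three bases touches one chunk out of three, which is the same effect described above for the first bases of chr1.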

4

Individual objects in MongoDB have a 4MB size limit. For large sequences, use GridFS storage in Mongo: http://www.mongodb.org/display/DOCS/GridFS

0

Thanks for this comment, Brad. I didn't know about this 4MB limit.

0
8.2 years ago
vibes1002003 ▴ 30

Does anyone know a source where the application of the MEAN stack (http://mean.io/#!/) in NGS/bioinformatics web development is given, for instance any tutorial or website? Thanks!!

0

Sorry, your question makes no sense, just from a grammar standpoint.

Are you looking for MEAN stack tutorials? What makes you think that this framework is great for NGS/bioinformatics web development? What is 'bioinformatics web development' anyway? How is that supposed to differ from 'normal' web development? Why do you post this in a 6-year-old thread? I guess because of MongoDB, but why does it have to be MongoDB, and what do you want to accomplish anyway? Questions, questions, questions.

0

I am sorry I pose so many mysteries. I am simply looking for a web link/source with an example of web-tool development in bioinformatics (a biological tool) using the MEAN stack platform. I definitely posted here because of MongoDB, which is also part of MEAN.

0

OK. I don't know of any links on that, and I also don't think this exists. As I stated above, I don't see how bioinformatics web tools differ from 'normal' web tools. I think you should just go through the normal tutorials for the programming language you want to use. I don't think anybody has put together a tutorial for this specific topic, but if you find something, post it; it would be interesting to read.
