Biology Database Project
12.8 years ago
Mattyd ▴ 10

I'm normally an applications programmer, but I have an interest in biology.

I had the idea of creating a web-based application with complete listings of biology-related things such as:

  • Primers

  • Parts

  • Genes

  • Promoters

  • Proteins

  • Bio Programs

Where would be a good place to find lists of these, or even just parts of these lists?

Do you think Wikipedia would be a good source of this info?

PS I've posted this on reddit.com/r/biology and was pointed to http://genome.ucsc.edu/ and http://uswest.ensembl.org/index.html

Are there any legalities about using their data?

Thanks in advance

programming dataset • 3.6k views

Complete lists of "things", even if they were possible, are not very interesting or useful. Can I suggest that you discuss your idea in more detail with some biologists? Bioinformatics is very much about helping biologists get the most from their data; this means finding out what they want and implementing it, rather than just creating tools for fun.


You're going to redo UCSC genome browser?!? By yourself?


@MattyD: You do realize that UCSC and Ensembl are web applications? They have backend databases, all of which are fully open. You can download the entire site. I would suggest that you clarify for yourself what questions you want to answer. If there are existing databases and websites that answer your questions, then you are done. If not, then you will have a much clearer path forward knowing where you want to end up.


Good luck! :-)


@Matty: are you going to spider the UCSC genome browser? That's even funnier. In fact, a few years back I talked with Jim Kent and he mentioned, laughing, that there were web crawlers that endlessly followed all the form buttons on the UCSC genome browser: next, zoom in, zoom out, previous, next, etc.


@neilws: How does http://genome.ucsc.edu/ differ from complete lists? That's the sort of thing I was looking at doing...


@MattyD, you definitely need to find a biologist partner in your endeavor. Without being too glib, your last question to Neil is a bit like asking "How is python different than TCP/IP?" Two concepts that are generally in the same space, but not at all comparable...


@Madelaine: Well, if I write a nice web crawler/spider, most of the work will be done for me.


@Istvan: Maybe not that site in particular. Even so, I could point the crawler at certain pages and extract the data displayed there, rather than following all the form buttons.
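For anyone considering the targeted-crawler route, a minimal sketch of checking a site's robots.txt rules before fetching a page is shown here. It uses Python's standard `urllib.robotparser`. The host `example.org`, the rules in `EXAMPLE_ROBOTS`, and the agent name `example-bio-crawler` are all made up for illustration; they are not the rules of UCSC or any other real site. Note also that robots.txt is a crawling convention and says nothing about data licensing, so the site's terms of use would still need to be read separately.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, NOT copied from any real site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /cgi-bin/
Crawl-delay: 5
"""

# Hypothetical crawler name used only in this sketch.
AGENT = "example-bio-crawler"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str, agent: str = AGENT) -> bool:
    """Return True if the parsed rules let this agent fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_parser(EXAMPLE_ROBOTS)
    for url in (
        "https://example.org/downloads/table.txt",
        "https://example.org/cgi-bin/browser?zoom=in",
    ):
        print(url, "->", "ok" if allowed(rules, url) else "blocked")
    print("crawl delay (seconds):", rules.crawl_delay(AGENT))
```

With rules like these, the interactive form pages under `/cgi-bin/` would be off limits while static download files would not, which is one more reason to prefer bulk downloads over scraping.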

12.8 years ago

Have you tried searching on the likes of delicious or diigo?

e.g. http://delicious.com/search?p=bioinformatics+proteins

I would argue that it's pointless to collect these links on a single centralised webpage. User communities such as delicious and diigo will give you a much better, more up-to-date, and self-correcting list of links to tools.

12.8 years ago
Gmoney ▴ 220

The data is available to download as whole tables from UCSC. If you're thinking about making some nice visualizations, that's a good place to start.
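As a rough sketch of what working with a downloaded table could look like, the Python below reads a tab-separated dump and counts distinct gene symbols per chromosome. It assumes a refGene-style column layout (the `REFGENE_COLUMNS` list) and a file you have already downloaded by hand; the column list, the `open_dump` helper, and the sample rows are illustrative assumptions, so check the schema file that ships alongside whichever table you actually use.

```python
import gzip
import io
from collections import Counter

# Assumed column order for a refGene-style dump (tab-separated, no header row).
# Verify against the table's accompanying schema before relying on it.
REFGENE_COLUMNS = [
    "bin", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
    "score", "name2", "cdsStartStat", "cdsEndStat", "exonFrames",
]


def open_dump(path):
    """Open a gzipped table dump (hypothetical local path) as text."""
    return gzip.open(path, "rt")


def read_table(handle, columns=REFGENE_COLUMNS):
    """Yield one dict per non-empty row of a tab-separated table dump."""
    for line in handle:
        line = line.rstrip("\n")
        if line:
            yield dict(zip(columns, line.split("\t")))


def genes_per_chromosome(rows):
    """Count distinct gene symbols (name2) seen on each chromosome."""
    seen = set()
    counts = Counter()
    for row in rows:
        key = (row["chrom"], row["name2"])
        if key not in seen:
            seen.add(key)
            counts[row["chrom"]] += 1
    return counts


# Made-up rows in the assumed layout; not real annotation data.
SAMPLE_ROWS = [
    ["585", "NM_000001", "chr1", "+", "100", "900", "150", "850", "2",
     "100,500,", "300,900,", "0", "GENEA", "cmpl", "cmpl", "0,1,"],
    ["585", "NM_000002", "chr1", "+", "100", "900", "150", "850", "1",
     "100,", "900,", "0", "GENEA", "cmpl", "cmpl", "0,"],
    ["586", "NM_000003", "chr1", "-", "2000", "3000", "2100", "2900", "1",
     "2000,", "3000,", "0", "GENEB", "cmpl", "cmpl", "0,"],
    ["587", "NM_000004", "chr2", "+", "50", "500", "60", "400", "1",
     "50,", "500,", "0", "GENEC", "cmpl", "cmpl", "0,"],
]
SAMPLE = "".join("\t".join(row) + "\n" for row in SAMPLE_ROWS)

if __name__ == "__main__":
    counts = genes_per_chromosome(read_table(io.StringIO(SAMPLE)))
    for chrom, n in sorted(counts.items()):
        print(chrom, n)
```

For a real file, `read_table(open_dump("refGene.txt.gz"))` would replace the in-memory sample; the two transcripts of the same made-up gene above show why counting rows is not the same as counting genes.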

There is no such thing as a complete listing. The only constant is change.

Are you going to go so far as to list things like MitoCarta? Or PlasmoDB? These are domain-specific databases that carry a lot of weight in their respective fields. The rabbit hole may be deeper than you think.

17 months ago

Hey, that sounds like a cool project! If you're looking for lists of biology-related things like primers, genes, and proteins, Wikipedia can be a good starting point, but it might not have everything you need. The websites you were pointed to, genome.ucsc.edu and uswest.ensembl.org, are also great resources. As for the legalities, it's important to check the terms of use and any licenses associated with the data on those websites.
