I would argue it's pointless to collect these links on a single centralised webpage. User communities such as Delicious and Diigo will provide a much better, more up-to-date, self-correcting list of links to tools.
The data can be downloaded as whole tables from UCSC. If you're thinking about making some nice visualizations, that's a good place to start.
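As a rough sketch of what working with those table downloads could look like: the snippet below assumes the dumps are gzipped, tab-separated text files without a header row, and that they live under a URL of the shape shown in `DUMP_URL`. The URL layout, the helper names `dump_url` and `read_table`, and the column names in the usage comment are assumptions for illustration, so check the UCSC downloads page and the table schema before relying on them.

```python
import csv
import gzip
import io

# Assumption: per-table dumps are gzipped, tab-separated, headerless text
# files published under a path of this shape.
DUMP_URL = "https://hgdownload.soe.ucsc.edu/goldenPath/{assembly}/database/{table}.txt.gz"


def dump_url(assembly, table):
    """Build the download URL for one table dump (URL layout is an assumption)."""
    return DUMP_URL.format(assembly=assembly, table=table)


def read_table(gz_bytes, columns):
    """Parse a gzipped, tab-separated dump into a list of dicts.

    The caller supplies `columns`, because the dump has no header row.
    """
    text = gzip.decompress(gz_bytes).decode("utf-8")
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    # Skip blank lines, which csv.reader yields as empty lists.
    return [dict(zip(columns, row)) for row in reader if row]


# Hypothetical usage (needs network access, so it is left commented out):
# from urllib.request import urlopen
# raw = urlopen(dump_url("hg38", "cytoBand")).read()
# bands = read_table(raw, ["chrom", "chromStart", "chromEnd", "name", "gieStain"])
```

Downloading a whole table once and parsing it locally is far kinder to the site than crawling the browser pages, and it gives you every row rather than whatever one page happens to display.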
There is no such thing as a complete listing. The only constant is change.
Are you going to go so far as to list things like MitoCarta? Or PlasmoDB? These are domain-specific databases that carry a lot of weight in their respective fields. The rabbit hole may be deeper than you think.
Hey, that sounds like a cool project! If you're looking for lists of biology-related things like primers, genes, and proteins, Wikipedia can be a good starting point, but it might not have everything you need. The websites you were pointed to, genome.ucsc.edu and uswest.ensembl.org, are also great resources. As for the legalities, it's important to check the terms of use and any licenses associated with the data on those websites.
Complete lists of "things", even if they were possible, are not very interesting or useful. Can I suggest that you discuss your idea in more detail with some biologists? Bioinformatics is very much about helping biologists get the most from their data; this means finding out what they want and implementing it, rather than just creating tools for fun.
You're going to redo the UCSC genome browser?!? By yourself?
@MattyD: You do realize that UCSC and Ensembl are web applications. They have backend databases, all of which are fully open. You can download the entire site. I would suggest that you clarify for yourself what questions you want to answer. If there are existing databases and websites that answer your questions, then you are done. If not, then you will have a much clearer path forward knowing where you want to end up.
good luck ! :-)
@Matty - are you going to spider the UCSC genome browser? That's even funnier. In fact, a few years back I talked with Jim Kent, and he mentioned, laughing, that there were web crawlers that endlessly followed all the form buttons on the UCSC genome browser: next, zoom in, zoom out, previous, next, etc.
For background see #7 of http://www.judegomila.com/2012/01/challengeyourself-in-2012-ivelisted.html
@neilws: How does http://genome.ucsc.edu/ differ from complete lists? That's the sort of thing I was looking at doing...
@MattyD, you definitely need to find a biologist partner in your endeavor. Without being too glib, your last question to Neil is a bit like asking "How is Python different from TCP/IP?" The two concepts are generally in the same space, but not at all comparable...
@Madelaine: Well, if I write a nice web crawler/spider, the work will mostly be done for me.
@Istvan: Maybe not that site in particular, even if I just point it to certain pages to extract the data displayed rather than following all the form buttons, etc.