For the past few days I've been trying to gather a list of interesting open source projects where tools from machine learning are applied to biological problems. Here's the (way too small) list so far.
https://github.com/dhbradshaw/ml-for-genomics/blob/master/README.md
It's been surprisingly hard to find a lot of material. So since Google is failing me, I thought I'd come here. What are the most interesting such projects that you know of?
Or let's remove the filter! Even if they're boring: What open source projects do you know of where machine learning techniques are applied to biological questions?
Some machine learning functionality may be integrated into more generic frameworks, e.g. Biopython, BioJava, BioPerl, Cytoscape... Are you also interested in those?
Definitely.
I guess the strongest thing to do in a case like that would be to find the machine learning portions of the projects and figure out how to link to them rather than just link to the projects as a whole.
Since deep learning seems to be the hot topic of the moment, I'd point out this paper, "Deep learning for regulatory genomics" (Nat Biotech 2015), and the references therein.
If you struggle to find material about ML applied to biology, it might simply be because ML is just a set of tools after all. So a project might be using ML without explicitly "advertising" it.
Thanks for putting together this list anyway!
By the way, all the many Bioconductor packages relying on some variation of linear modelling (limma, edgeR, DESeq, ...), shouldn't they be included as well?
Thanks for posting the paper!
I'm not yet well informed enough to make a judgement on edgeR, DESeq, etc. (Hence the project :-) ) Do you think they would strengthen or dilute the list?