I'm trying to compile an overview of (academic) infrastructures that address the Next Generation Sequencing (NGS) data analysis and storage problem using HPC(-like) resources. I'm mainly interested in academic infrastructures, and I lean more towards HPC-based solutions than cloud-based ones, although the latter are interesting to include for comparison.
If you happen to know of any such infrastructure, feel free to mention it, and if you can, provide some or all of the following information:
- Website URL.
- Any publication describing the infrastructure.
- Resource size: Approximate size of the resource (number of CPU cores, nodes, RAM per node, storage available, etc.).
- Software: How is the plethora of NGS software taken care of? Who installs/maintains it, etc?
- Reference genomes etc.: How are they dealt with?
- User interfaces: What means of accessing the system are there? Command line only? Any graphical user interfaces (like "Galaxy")?
- User training: How is that dealt with?
- Any other comments.
Interesting! Indeed, they seem to have quite a number of the typical NGS tools installed: http://biowulf.nih.gov/apps/#ngs