I have access to my institution's supercomputer, and I recognize that knowing how to use a cluster-computing environment is a valuable skill for a bioinformatician, but I do not know how to go about learning to use one.
I imagine there must be some good tutorials out there that I could use to learn the basics. Can someone point me in the right direction?
Are you sure that your institution doesn't have a tutorial session or workshop series? They almost always do, because it keeps questions like these from flooding their inbox :)
What cluster software is it running? I may have some notes handy. You might be able to infer the cluster management software by typing one of the following on the command line:
Let me know if one of those commands gives you a manpage.
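A rough way to check the same thing without reading manpages is to see which submission commands are on your PATH. This is only a sketch: the three command names are the ones mentioned elsewhere in this thread (`qsub`, `bsub`, `sbatch`), the function name `detect_schedulers` is made up for illustration, and a login node may need a module loaded before any of them appear.

```shell
#!/bin/sh
# Report which common batch-submission commands are available on PATH.
# detect_schedulers is a hypothetical helper name, not part of any scheduler.
detect_schedulers() {
    found=0
    for cmd in qsub bsub sbatch; do
        if command -v "$cmd" >/dev/null 2>&1; then
            case "$cmd" in
                qsub)   echo "qsub found (PBS/Torque or Grid Engine style)" ;;
                bsub)   echo "bsub found (LSF style)" ;;
                sbatch) echo "sbatch found (Slurm style)" ;;
            esac
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no known submission command found on PATH"
    fi
}

detect_schedulers
```

If more than one shows up, the site may run several schedulers or provide compatibility wrappers, so it is still worth asking which one is the supported route.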
I'll add
man sbatch
to that list.
That would be my first suggestion too; get in touch with the IT people, ask about courses or online material. They usually provide at least some basic guides, in the hope that fewer people will break their system :)
Asking internal IT first is both necessary and the easiest route, partly to learn how LSF (or whatever platform is in use) has been set up, since some options might be mandatory at your site. For example, submitting a job might be as easy as
bsub < myscript.sh
but you will probably need to say how much memory and run time you want.
The man pages are certainly authoritative and worth referring to, but they might give the impression that submitting jobs is more complicated than it actually is in practice!
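To make the `bsub < myscript.sh` form a bit more concrete, here is a minimal sketch of what such a script might look like on LSF, with memory and run time requested through `#BSUB` lines. The job name, limits, and output file name are placeholders I made up, and the units for `-M` as well as which options are required depend on how your site configured LSF, so check with IT before copying any of it.

```shell
#!/bin/sh
# Hypothetical myscript.sh for LSF; all values below are placeholders.
# Job name:
#BSUB -J myjob
# Number of job slots (cores):
#BSUB -n 1
# Run time limit, as hours:minutes:
#BSUB -W 01:00
# Memory limit (units are set by the site configuration):
#BSUB -M 4000
# File for standard output; %J is replaced by the job ID:
#BSUB -o myjob.%J.out

# The actual work goes below. To the shell the #BSUB lines are just comments,
# so the script can also be run directly for a quick local test.
job_msg="job running on $(uname -n)"
echo "$job_msg"
```

Because `bsub` reads the script from standard input in this form, it picks up the `#BSUB` lines as if they had been given on the command line.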
man pages for LSF are horrible!
Glad to hear I'm not the only one! Between that and
man curl
it's a tough competition.
They do offer such courses, but somewhat infrequently, and I just missed the last one.
Bring the IT folks coffee and/or beer and I bet they'll give you the quick version of the course (or at the very least give you the slides).
man bsub brings up a page about "LSF jobs"
man qsub brings up a page about "PBS job"
I guess they've got both types of cluster software. Found a couple of tutorials online but would still be happy to take a look at any notes you have to share.
Do you know how to use Linux and bash? If so, then using a queuing system is a relatively small step. Usually if you can do:
then it can run as:
and you simply have to learn about 5 common flags to reserve the correct number of CPUs and amount of memory.
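The snippets that originally went with "if you can do" and "then it can run as" are missing from this thread, so the following is only my guess at the idea, using Slurm's `sbatch` as the example scheduler. It wraps a command you could run interactively in a job script with five commonly used `#SBATCH` options. The command, file name, and resource values are all made up for illustration, and the script is only written, not submitted.

```shell
#!/bin/sh
# Hypothetical command you could already run interactively.
cmd='echo "hello from the cluster"'

# Write the job script to a temporary location (path chosen for illustration).
job="${TMPDIR:-/tmp}/hello_job.sh"

# The #SBATCH lines set: job name, CPUs, memory, wallclock limit, and log file
# (%j in the log file name is replaced by the job ID).
cat > "$job" <<EOF
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:10:00
#SBATCH --output=hello_%j.log
$cmd
EOF

# Only report how to submit; nothing is sent to the queue here.
if command -v sbatch >/dev/null 2>&1; then
    echo "submit with: sbatch $job"
else
    echo "sbatch not found; $job written but not submitted"
fi
```

The same pattern carries over to `qsub` and `bsub`; mostly the directive prefix and the flag names change.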
Maybe not a technical skill, but you should learn good practices and common courtesies. Always benchmark your programs/tasks for memory usage and CPU usage efficiency. Usually your goal is to either decrease the wallclock time needed to perform some task, or utilize multiple nodes to overcome some hardware limitation (e.g. memory). You should always look for ways to achieve these goals while utilizing your hardware as efficiently as possible.
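As a starting point for benchmarking, something like the sketch below gives a wallclock figure and, where GNU time is installed at `/usr/bin/time` (an assumption; it is common on Linux but not guaranteed), peak memory and CPU utilisation as well. The `task` function is a stand-in for whatever you actually want to measure.

```shell
#!/bin/sh
# Hypothetical stand-in for the real task to be benchmarked.
task() { sleep 1; }

# Wallclock time in whole seconds using only POSIX tools.
start=$(date +%s)
task
end=$(date +%s)
elapsed=$((end - start))
echo "wallclock seconds: $elapsed"

# GNU time's -v option reports peak memory and CPU utilisation.
# On systems where /usr/bin/time is not GNU time this prints nothing.
if [ -x /usr/bin/time ]; then
    /usr/bin/time -v sleep 1 2>&1 \
        | grep -E 'Maximum resident set size|Percent of CPU' || true
fi
```

Running this on a small input first gives a basis for the memory and time you request from the scheduler, rather than guessing.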
A few pointers:
In general you want to be more prudent than usual. HPCs are great: you can do huge amounts of stuff in parallel, but remember that "stuff" can mean getting work done or it can mean "creating a disaster". Though the damage is not permanent, it isn't fun to find out that your 1200 jobs run in parallel made 1200 messes.
In addition to learning the typical batch systems, you may want to explore the various tools and means of parallelization your cluster has to offer: everything from relatively simple tools like GNU parallel, to software-specific parallelization (e.g. MATLAB), to simple code-based approaches (e.g. R's snow package), to more complex ones (e.g. MPI). You may need to use these to develop custom tools, or you may need to know how they work in order to use software from others (e.g. MrBayes).
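For the simple end of that range, here is a small sketch of running the same command over several inputs with GNU parallel, falling back to `xargs -P` (supported by GNU and BSD xargs, though not part of POSIX) when GNU parallel is not installed. The sample names and the `echo` task are placeholders.

```shell
#!/bin/sh
# Hypothetical sample names; the "task" here just prints a message.
samples="sampleA sampleB sampleC sampleD"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # GNU parallel: run up to 2 jobs at once, one per argument after :::
    results=$(parallel -j 2 echo "processed {}" ::: $samples)
else
    # Fallback: xargs runs up to 2 processes at once, one per input line.
    results=$(printf '%s\n' $samples | xargs -P 2 -I {} echo "processed {}")
fi

# Output order may differ from input order because jobs run concurrently.
echo "$results"
```

On a cluster this kind of thing usually goes inside a single submitted job that has reserved the matching number of CPUs, not on the login node.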
As others have stated, it isn't difficult to get started if you're already familiar with shell/*nix environments and can program. You may want to see if your school/company/entity has courses or classes on HPC. It can be very useful as the classes typically use whatever system you'll be working on. They'll also go more in depth into different areas of HPC/parallel computing which can be useful in the long run. I've found that they're pretty good places for networking and developing contacts you can approach with technical questions.