Okay, well then I'll go ahead and throw some info out there in the hopes that it's useful to you.
What I can tell you is that the cluster we share time on has 8-core machines with 16GB of RAM each, and they're sufficient for most of our needs. We don't do much assembly, but we do a ton of other genomic processing, ranging from mapping short reads all the way up to SNP calling and pathway inference. I also still do a fair amount of array processing.
With most cluster management tools (PBS, LSF, whatever), it should be possible to let a user reserve more than one CPU per node, effectively giving them up to 16GB for a process if they reserve the whole node. Yeah, that means some lost cycles, but I don't seem to need it that often - 2GB is still sufficient for most things I run. It'd also be good to set up a handful of machines with a whole lot of RAM - maybe 64GB? That gives users who are doing things like assembly or loading huge networks into RAM some options.
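To make that arithmetic concrete, here's a rough Python sketch of how many cores you'd need to reserve on a node to get a given amount of RAM. It assumes the scheduler effectively splits a node's memory evenly across its cores, which may not match how your queue is configured; the function name is made up, and the defaults are just our 8-core/16GB nodes.

```python
import math

def cores_to_reserve(mem_needed_gb, cores_per_node=8, ram_per_node_gb=16):
    """Return how many cores to reserve on one node so that the job's
    share of RAM covers mem_needed_gb.

    Assumption: RAM is split evenly per core (ram_per_node_gb / cores_per_node).
    """
    if mem_needed_gb > ram_per_node_gb:
        # Needs one of the big-memory machines instead.
        raise ValueError("job does not fit on a single node")
    ram_per_core = ram_per_node_gb / cores_per_node
    # Always reserve at least one core, rounding up to whole cores.
    return max(1, math.ceil(mem_needed_gb / ram_per_core))
```

So a 2GB job takes a single core, while a 5GB job ties up three cores (and leaves two of them idle unless the program is multithreaded).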
I more often run into limits on I/O. Giving each machine a reasonably sized scratch disk and encouraging your users to make smart use of it is a good idea. Network filesystems can get bogged down really quickly when a few dozen nodes are all reading and writing data. If you're going to be doing lots of really I/O-intensive stuff (and dealing with short reads, you probably will be), it's probably worth looking into faster hard drives. Certainly 7200RPM, if not 10k. Last time I looked, 15k drives were available, but not worth it in terms of price/performance. That may have changed.
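By "smart use" of scratch I mean something like the pattern below: copy the input to the node's local disk once, do all the heavy reading and writing there, and copy only the final result back to the network filesystem. This is just a minimal Python sketch; `run_on_scratch` and its arguments are hypothetical names, `process` stands in for whatever tool you actually run, and `scratch_root` would be wherever your nodes mount their local scratch.

```python
import os
import shutil
import tempfile

def run_on_scratch(input_path, output_dir, process, scratch_root=None):
    """Stage input to local scratch, run process(local_in, local_out) there,
    then copy the result back to output_dir. Returns the final output path.

    scratch_root is the node-local scratch directory (assumption: it exists);
    None falls back to the system temp directory.
    """
    workdir = tempfile.mkdtemp(prefix="job_", dir=scratch_root)
    try:
        # One sequential read from the shared filesystem.
        local_in = os.path.join(workdir, os.path.basename(input_path))
        shutil.copy(input_path, local_in)

        # All intermediate I/O happens on the local disk.
        local_out = local_in + ".out"
        process(local_in, local_out)

        # One sequential write back to the shared filesystem.
        os.makedirs(output_dir, exist_ok=True)
        final_path = os.path.join(output_dir, os.path.basename(local_out))
        shutil.copy(local_out, final_path)
        return final_path
    finally:
        # Clean up so scratch doesn't fill up with abandoned job directories.
        shutil.rmtree(workdir, ignore_errors=True)
```

The cleanup in the `finally` block matters more than it looks: scratch disks that nobody cleans up are usually the first thing to fill.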
I won't get into super-detail on the specs - you'll have to price that out and see where the sweet spot is. I also won't tell you how many nodes to get, because again, that depends on your funding. I will say that if you're talking about a small cluster for a small lab, it may make sense to just get 3 or 4 machines with 32 cores and a bunch of RAM, and not worry about trying to set up a shared filesystem, queue, etc. - it really can be a headache to maintain. If you'll be supporting a larger userbase, though, then you may find a better price point at fewer CPUs per node, and potentially have fewer problems with disk I/O (because you'll have fewer CPUs per HD).
People who know more about cluster maintenance and hardware than I do, feel free to chime in with additions or corrections.
Some recommendations on this question: Any Hardware Recommendations For A Molecular Biology Lab That's Getting Into Bioinformatics?
@Simon: I did see that thread, thanks though!
Can you provide a little more info? I ask because the answers to this question will largely depend on what you're doing with that sequence data. De novo genome assembly programs typically require huge amounts of RAM (really, the more the better). Modern algorithms for mapping reads, though, need CPU but not a lot of RAM. The final question is: how busy will these CPUs be? If they'll be idle 75% of the time or more, you might look into EC2 or other cloud-computing options.
@chris: I wish I could. On the "what", it's really a mixed bag ranging from digital gene expression to assembly. At the moment it's mapping reads, however, hence the clusterability. Currently we're doing the processing on an HP Linux machine w/ 16 CPU cores and 96GB of RAM, and the bulk of the processes take 4-8GB of RAM and as much CPU as they can get. The largest problem we have w/ the current platform's software is that it makes use of SQLite DBs and will quickly saturate the machine's I/O if we have many processes running.
A quick note: you will most likely also need substantial system administration expertise; this is particularly true when investing in cluster-computing types of solutions.
@chris part 2: I've looked at EC2. It's not really an option for the same reason CUDA isn't - the people signing the checks don't like that idea.
@Istvan: Sysadmin isn't a big deal. We already do most of the management of our servers ourselves, and have "real" sysadmins behind that if anything happens.