I would like some approximate estimates of the CPU, RAM, etc. that I would need in order to process Illumina reads. I suspect the biggest difference compared to 454 is the sheer volume of sequence generated per run (I think ~55 Gb) and the subsequent assembly of all those reads...
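For a rough sense of scale, here is a back-of-envelope sketch. All of the numbers in it are assumptions on my part (a ~55 Gb yield per run, 100 bp reads, a common ~2.5 bytes-per-base rule of thumb for uncompressed FASTQ, and a very crude per-k-mer memory guess for a de Bruijn graph assembler), so treat it as an order-of-magnitude illustration rather than a real requirement:

# Rough back-of-envelope for one Illumina run -- all figures below are
# illustrative assumptions, not vendor specifications.

gigabases_per_run = 55          # assumed total yield in gigabases
read_length_bp = 100            # assumed read length

total_bases = gigabases_per_run * 1e9
n_reads = total_bases / read_length_bp

# FASTQ stores one byte per base and one byte per quality score, plus
# header/separator lines; ~2.5 bytes per base is a common rule of thumb.
fastq_bytes = total_bases * 2.5
print(f"Reads per run:        {n_reads:,.0f}")
print(f"Uncompressed FASTQ:   {fastq_bytes / 1e9:,.0f} GB (roughly halves with gzip)")

# De novo assembly RAM is much harder to pin down; de Bruijn graph assemblers
# are often quoted at a few bytes per distinct k-mer, and diverse metagenomes
# push this up. Treat this as an order-of-magnitude guess only.
bytes_per_kmer = 8              # assumed, varies widely by assembler
distinct_kmer_fraction = 0.5    # assumed fraction of bases yielding distinct k-mers
ram_bytes = total_bases * distinct_kmer_fraction * bytes_per_kmer
print(f"Assembly RAM (guess): {ram_bytes / 1e9:,.0f} GB, order of magnitude")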
Can you provide more information, please? What throughput (number of lanes over time)? What genome(s), e.g. bacterial vs. vertebrate vs. metagenome? What type of experiment, e.g. RNA-seq vs. ChIP-seq vs. whole genome? Do you have a reference genome? Are you getting raw data, or just reads?
We want to do a metagenomic study of microbial populations in environmental samples. As a first stage I was thinking about de novo assembly, and if that doesn't work very well, maybe mapping to reference genome(s). And to tell you the truth, I don't yet know what the difference between raw data and reads is! What do they (the sequencing companies) usually give you back?