Which assembly programs would be good for doing an assembly using data from Illumina, SOLiD and Ion Torrent?
Also, it is generally not a good idea to convert from colorspace to basespace, since any error will propagate down the rest of the read. But is it okay to convert from basespace to colorspace, since that is unambiguous? I could then possibly use the colorspace assembler from Velvet.
It sounds reasonable that sequencing the same genome with multiple technologies should always be a good thing when aiming for a de novo assembly. However, that's not always the case, and it may have been better to optimise the use of a single technology. To help assess whether this is the case, there are several important considerations which should be taken into account before any sequencing has been carried out. The aim is to ensure each technology adds something useful to the final result. The main factors are accuracy, read length, insert size and, sometimes, coverage of difficult-to-sequence regions.
To help answer your question, you should post an estimate of your genome size and ploidy. Additionally, state what the read lengths are for each sequencing run of each technology you have, and the approximate depth of coverage obtained based on the estimated genome size. Are any of the runs paired-end, and if so, what is the estimated insert size?
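For the depth estimate, a back-of-the-envelope calculation is enough: total sequenced bases divided by estimated genome size. A minimal sketch follows; the function name and all the numbers in it are made up for illustration and are not taken from this thread's data.

```python
def depth_of_coverage(num_reads, read_length, genome_size):
    """Approximate average depth: total sequenced bases / estimated genome size."""
    return num_reads * read_length / genome_size


# Hypothetical run sizes, for illustration only.
genome_size = 500_000_000  # assumed 500 Mb genome

# Assumed: 150 million Illumina reads of 100 bp each.
illumina_depth = depth_of_coverage(150_000_000, 100, genome_size)

# Assumed: 200 million SOLiD reads of 50 bp each.
solid_depth = depth_of_coverage(200_000_000, 50, genome_size)

print(f"Illumina: {illumina_depth:.0f}x, SOLiD: {solid_depth:.0f}x")
```

For paired-end or mate-pair runs, count both reads of each pair in num_reads.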
Armed with that information an assembly strategy can be devised. I have to say that combining Illumina, SOLiD and Ion Torrent for de novo assembly is not a commonly seen strategy and may not be ideal, but this will depend on the exact nature of the data you have.
"not commonly seen" is a nice expression. I never saw it and would probably never think of doing it: all these technologies are more or less "short read" atm. Ion + Illumina could make sense by mixing Ion 200+bp reads with Illumina 100-150bp, one cancelling the artifacts of the other. I do not see the added value of SOLiD in there, I'd mix in something longer (454 or PacBio).
It is a haploid genome less than a Gb in size. Illumina is paired-end. I have a 300 bp insert and a 600 bp insert. The SOLiD is mate-paired with a 1.5 kb insert. The Ion Torrent data is minor in quantity compared to Illumina and SOLiD, so I don't plan to rely on it much. There is about 30x coverage with Illumina and 20x or so with SOLiD.
Is this genomic or transcriptomic? Are you doing a de novo assembly or a reference assembly? Base-space to color-space is fine.
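To illustrate why base-space to color-space is the safe direction, here is a minimal sketch of the SOLiD 2-base encoding. It assumes the usual scheme in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). The function names are my own, and for simplicity the encoded string starts with the first base of the sequence rather than a primer base as in real SOLiD reads, so adapt it before feeding anything to an assembler.

```python
# Each color encodes the transition between two adjacent bases.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def base_to_color(seq):
    """Encode a base-space sequence as its first base followed by color digits."""
    seq = seq.upper()
    colors = [str(BASE_TO_CODE[a] ^ BASE_TO_CODE[b]) for a, b in zip(seq, seq[1:])]
    return seq[0] + "".join(colors)


def color_to_base(cs):
    """Decode first base + color digits back to base space, one step at a time."""
    bases = [cs[0].upper()]
    for digit in cs[1:]:
        # Each decoded base depends on the previous decoded base,
        # so one wrong color changes every base after it.
        bases.append(CODE_TO_BASE[BASE_TO_CODE[bases[-1]] ^ int(digit)])
    return "".join(bases)


read = "ATGGCA"
encoded = base_to_color(read)          # "A31031"
print(encoded, color_to_base(encoded))  # round trip gives back ATGGCA

# A single color error (second color 1 -> 2) corrupts the rest of the decoded read.
print(color_to_base("A32031"))          # ATCCGT instead of ATGGCA
```

Encoding is deterministic and loses nothing, whereas decoding a read with one miscalled color gives wrong bases from that position to the end of the read.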
This is genomic DNA. I am trying to do a de novo assembly.