I have the following transcriptomic datasets from a lower eukaryote (no genome sequence is available yet):
- Dataset 1: An "old" assembly from Strain A, obtained from Sanger and 454 reads;
- Dataset 2: A new 454 output from Strain B;
- Dataset 3: Several new Solexa outputs from different cell cycle points, from Strain A;
- Dataset 4: Another Solexa output, from Strain C;
- Dataset 5: Several Solexa outputs from the wild type strain.
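To make the layout easier to see, here is the same inventory as a small Python sketch. The field names and the `group_by` helper are illustrative choices of mine, not part of any assembler; the values are only what is listed above.

```python
from collections import defaultdict

# Inventory of the five datasets described above.
# Field names ("strain", "platforms", "kind") are hypothetical labels.
DATASETS = [
    {"id": 1, "strain": "A", "platforms": ("Sanger", "454"), "kind": "assembly"},
    {"id": 2, "strain": "B", "platforms": ("454",), "kind": "reads"},
    {"id": 3, "strain": "A", "platforms": ("Solexa",), "kind": "reads"},
    {"id": 4, "strain": "C", "platforms": ("Solexa",), "kind": "reads"},
    {"id": 5, "strain": "wild type", "platforms": ("Solexa",), "kind": "reads"},
]


def group_by(datasets, key):
    """Map each value of `key` to the list of dataset ids that carry it.

    Tuple-valued fields (such as "platforms") contribute one entry per element.
    """
    groups = defaultdict(list)
    for dataset in datasets:
        value = dataset[key]
        values = value if isinstance(value, tuple) else (value,)
        for v in values:
            groups[v].append(dataset["id"])
    return dict(groups)


by_strain = group_by(DATASETS, "strain")
by_platform = group_by(DATASETS, "platforms")

print(by_strain)
print(by_platform)
```

Grouped this way, only datasets 1 and 3 share a strain, while the platforms split into long reads (Sanger, 454) for datasets 1 and 2 and short reads (Solexa) for datasets 3, 4 and 5.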
My question is how to proceed with the assembly:
- Is it OK to assemble datasets 1 and 2 together, even though they come from different strains?
- Should the Solexa datasets 3, 4 and 5 first be assembled de novo, and only later combined with datasets 1 and 2? If so, which program should be used to perform an assembly that uses a transcriptome sequence as the reference?