I'm currently working on a project to characterize the transcriptome of a cell line from a species without a reference genome or transcriptome. We previously used 454 sequencing to generate a de novo transcriptome from unconditioned and PMA-stimulated cells. However, we didn't net many transcripts from our PMA stimulation.
We're currently carrying out another experiment: a time course comparing two different treatments (neither of them PMA) against untreated cells. We're using the same cells, but this time we're planning on Illumina paired-end sequencing. We expect to find new transcripts from the new stimulations, but the goal of this experiment is to measure changes in gene expression.
My question is: what is the best approach to generating a single de novo transcriptome and handling the differential expression analysis when you have data from two different platforms?
My thought was to pool all of the reads from both experiments and use something like MIRA to assemble them into a single transcriptome. I could then align the reads from the second (Illumina) experiment to this transcriptome when calculating differential expression.
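To make the second step concrete, here is a rough Python sketch of the kind of comparison I have in mind. It assumes I already have per-transcript read counts from aligning the Illumina reads to the combined assembly. The contig names, the counts, and the function names are all made up for illustration, and a real analysis would use replicates and a dedicated differential expression package rather than a bare fold change.

```python
import math


def counts_per_million(counts):
    """Scale raw per-transcript read counts to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads in this sample")
    return {name: n * 1_000_000 / total for name, n in counts.items()}


def log2_fold_change(treated, untreated, pseudocount=1.0):
    """Return log2(treated / untreated) per transcript, computed on CPM values.

    The pseudocount keeps the ratio finite for transcripts that are
    absent from one of the two samples.
    """
    cpm_t = counts_per_million(treated)
    cpm_u = counts_per_million(untreated)
    names = set(cpm_t) | set(cpm_u)
    return {
        name: math.log2(
            (cpm_t.get(name, 0.0) + pseudocount)
            / (cpm_u.get(name, 0.0) + pseudocount)
        )
        for name in names
    }


# Hypothetical counts for two contigs from the combined assembly.
untreated = {"contig_1": 300, "contig_2": 700}
treated = {"contig_1": 600, "contig_2": 400}

lfc = log2_fold_change(treated, untreated)
for name in sorted(lfc):
    print(f"{name}\t{lfc[name]:+.2f}")
```

Normalizing to counts per million before taking the ratio is only meant to account for different sequencing depths between samples; it does not address transcript length or the variance between replicates.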
I don't think using the previous 454-generated transcriptome as a reference to assemble the Illumina data from the time course experiment is a good idea. I'm worried that doing so could lead to novel transcripts being missed.
However, I've never dealt with this type of situation before, and I'm worried there might be some pitfalls I'm not seeing.