Question

Transcriptomic Assembly From Different Strains

3


13.4 years ago

Ke86 ▴ 30

I have the following transcriptomic datasets from a lower eukaryote (no genomic sequence yet):

Dataset 1: An "old" assembly from Strain A, built from Sanger and 454 reads;
Dataset 2: A new 454 run from Strain B;
Dataset 3: Several new Solexa runs from different cell cycle points, from Strain A;
Dataset 4: Another Solexa run from Strain C;
Dataset 5: And finally, several Solexa runs from the wild-type strain.

The question is how to proceed with the assembly:

Is it OK to assemble datasets 1 and 2 together, even though they come from different strains?
Should the Solexa datasets 3, 4 and 5 first be assembled de novo, and then combined with 1 and 2? If so, which program can perform an assembly using a transcriptomic sequence as reference?

rna assembly transcriptome • 2.3k views

updated 13.4 years ago by Israel Barrantes ▴ 790 • written 13.4 years ago by Ke86 ▴ 30

score 1 · Answer 1 · 2011-06-27

Could you give some more details: what are the expected genome size and coverage, number of genes, ploidy, and read length? Is the transcriptome library stranded, or do you not know which strand the detected expression comes from? Are the Illumina reads paired-end?

You might consider using Velvet coupled with Columbus. Here is some info from the Columbus manual:

" Assisted transcriptome assembly
You sequenced the transcriptome of a new species, strain or individual, and you happen to know the gene sequences of a nearby species, strain or reference individual. You would then map the reads onto the reference genome, using the short-read mapper of your choice, and provide the alignments along with the known exonic sequences to Velvet. It would rebuild contigs based on the alignments, which could then be used by the Oases package."

I think it will fit your project well.
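The workflow quoted from the manual can be sketched as three commands: velveth with the reference and the mapped reads, velvetg, then oases. Below is a minimal Python sketch that only builds and prints those command lines; it does not run anything. The file names, the k-mer length and the insert length are placeholders I made up. The flags (`-reference`, `-shortPaired`, `-short`, `-sam`, `-read_trkg`, `-ins_length`) are written as I recall them from the Velvet and Oases manuals, so please check them against the versions you install. I am also assuming that the reads were already mapped to the Strain A transcripts with a short-read mapper and that the SAM file was sorted by reference and position.

```python
import shlex


def columbus_commands(reference_fasta, sorted_sam, workdir,
                      kmer=25, paired=True, ins_length=200):
    """Build (but do not run) the command lines for a reference-assisted
    Velvet/Oases assembly.

    All argument values are placeholders; the flags follow the Velvet and
    Oases manuals as recalled and should be verified before use.
    """
    # Assumption: the reads are "short" Illumina/Solexa reads.
    read_flag = "-shortPaired" if paired else "-short"

    # The reference is listed before the read alignments.
    velveth = ["velveth", workdir, str(kmer),
               "-reference", reference_fasta,
               read_flag, "-sam", sorted_sam]

    # Oases needs read tracking switched on in velvetg.
    velvetg = ["velvetg", workdir, "-read_trkg", "yes"]

    oases = ["oases", workdir]
    if paired:
        # The insert length only makes sense for paired-end libraries.
        oases += ["-ins_length", str(ins_length)]

    return [velveth, velvetg, oases]


if __name__ == "__main__":
    # Hypothetical file and directory names, for illustration only.
    for cmd in columbus_commands("strainA_transcripts.fa",
                                 "solexa_sorted.sam",
                                 "asm_k25"):
        print(shlex.join(cmd))
```

If the reads turn out to be single-end, calling the function with `paired=False` switches to `-short` and drops the insert length.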

score 1 · Answer 2 · 2011-06-27

1


13.4 years ago

Israel Barrantes ▴ 790

Although I haven't tried it this way myself, you could use MIRA. According to its manual, it can perform de novo hybrid assemblies and can also handle transcript sequences from different strains.
