Transcriptomic Assembly From Different Strains
13.4 years ago
Ke86 ▴ 30

I have the following transcriptomic datasets from a lower eukaryote (no genomic sequence yet):

  • Dataset 1: An "old" assembly from Strain A, obtained from Sanger and 454;
  • Dataset 2: A novel 454 output from Strain B;
  • Dataset 3: Several novel Solexa outputs from different cell cycle points, from Strain A;
  • Dataset 4: Another Solexa output from Strain C;
  • Dataset 5: And finally several Solexa outputs from the wild type strain.

The question is how to proceed with the assembly:

  • Is it OK to assemble datasets 1 and 2 together, even though they come from different strains?
  • Should the Solexa datasets 3, 4 and 5 first be assembled de novo, and then combined with 1 and 2? If so, which program can perform an assembly using transcript sequences as a reference?
rna assembly transcriptome • 2.3k views
13.4 years ago
Leszek 4.2k

Could you give some more details: what are the expected genome size and coverage, the number of genes, the ploidy, and the read length? Is the transcriptome data stranded, or do you not know which strand the expression comes from? Are the Illumina reads paired-end?
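To make the coverage question concrete, here is a minimal sketch of the usual back-of-the-envelope estimate: total sequenced bases divided by the size of the target. The function name and the numbers are hypothetical, chosen only for illustration. For a transcriptome the result is only an average, since real depth varies strongly with expression level.

```python
def expected_coverage(n_reads: int, read_length: int, target_size: int) -> float:
    """Average depth = total sequenced bases / size of the target (in bases)."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return n_reads * read_length / target_size


if __name__ == "__main__":
    # Hypothetical values: 1 million reads of 100 bp over a 10 Mb target.
    print(expected_coverage(n_reads=1_000_000, read_length=100, target_size=10_000_000))
```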

You might consider using Velvet with its Columbus module. Here is some information from the Columbus manual:

" Assisted transcriptome assembly
You sequenced the transcriptome of a new species, strain or individual, and you happen to know the gene sequences of a nearby species, strain or reference individual. You would then map the reads onto the reference genome, using the short-read mapper of your choice, and provide the alignments along with the known exonic sequences to Velvet. It would rebuild contigs based on the alignments, which could then be used by the Oases package."

I think it would fit your project well.
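As a rough sketch of the workflow described in that quote, the snippet below only builds the command lines and does not run anything. The flags (`-reference`, `-sam`, `-shortPaired`, `-read_trkg yes`) are written as I recall them from the Velvet/Columbus and Oases manuals, so treat them as assumptions and check them against your installed versions. The file names and the k-mer value are hypothetical, and the SAM file is assumed to come from the mapper of your choice, already sorted by reference position.

```python
from typing import List


def columbus_commands(out_dir: str, reference_fasta: str, sorted_sam: str,
                      kmer: int = 21, paired: bool = True) -> List[List[str]]:
    """Build (but do not run) a velveth/velvetg/oases command sequence.

    Flags are assumptions based on the Velvet/Columbus and Oases manuals;
    verify them against the versions you have installed.
    """
    read_type = "-shortPaired" if paired else "-short"
    return [
        # Hash the reference sequences together with the mapped reads.
        ["velveth", out_dir, str(kmer),
         "-reference", reference_fasta, read_type, "-sam", sorted_sam],
        # Build contigs; read tracking is needed by Oases afterwards.
        ["velvetg", out_dir, "-read_trkg", "yes"],
        # Turn the Velvet contigs into transcript predictions.
        ["oases", out_dir],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in columbus_commands("assembly_dir", "old_assembly.fa", "reads_sorted.sam"):
        print(" ".join(cmd))
```

Printing the commands first, rather than executing them, makes it easy to review the flags before committing to a long run.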

13.4 years ago

Although I haven't tried it this way myself, you could use MIRA. According to its manual, it can perform de novo hybrid assemblies and can also process transcript sequences from different strains.
