Question

Strategies For De Novo Assembly Of A Reference

9.9 years ago

yp.cun • 0

Dear all,

I am using Oases to assemble 15 paired-end RNA-Seq datasets (5 tissues with 3 replicates each, from a species with no reference genome) to generate:

a) a transcriptome for each sample, which means 15 xx.fa files,

b) a reference transcriptome built from all 15 xx.fa files from step a).

Let the tissue types be A, B, C, D, and E, with 1, 2, 3 as their replicate IDs. I used the following two strategies to get the reference transcriptome:

1) Use multiple k-mers (31, 41, 51, 61) to generate a transcriptome for each of the 15 samples (15 transcripts.fa files), then assemble these 15 transcripts.fa files into a reference transcriptome with a single k-mer (51). (In the latter step I chose a single k-mer because multiple k-mers did not work.)

2) Use a single k-mer (51) to generate a transcriptome for each of the 15 samples (15 transcripts.fa files), then assemble these 15 transcripts.fa files into a reference transcriptome with a single k-mer (51).
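
For readers who want to see what these steps look like on the command line, here is a minimal Python sketch that only builds the velveth/velvetg/oases command lines and does not run them. It follows the general multi-k and merge pattern described in the Oases manual, not necessarily the exact commands used here. The read file names, the insert length of 200, and the per-sample merge k of 27 are assumptions for illustration; only the k-mer values 31, 41, 51, 61 and the final merge k of 51 come from the description above.

```python
from typing import List


def per_sample_commands(sample: str, r1: str, r2: str,
                        kmers: List[int], ins_length: int) -> List[List[str]]:
    """Build one velveth/velvetg/oases run per k-mer for a single sample."""
    cmds = []
    for k in kmers:
        out_dir = f"{sample}_k{k}"
        # Hash the paired-end FASTQ reads (two separate files) at this k.
        cmds.append(["velveth", out_dir, str(k), "-fastq", "-shortPaired",
                     "-separate", r1, r2])
        # Build the graph; read tracking is required by Oases.
        cmds.append(["velvetg", out_dir, "-read_trkg", "yes"])
        # ins_length is an assumed library insert size, not from the post.
        cmds.append(["oases", out_dir, "-ins_length", str(ins_length)])
    return cmds


def merge_commands(transcript_files: List[str], merge_k: int,
                   out_dir: str) -> List[List[str]]:
    """Build the commands that merge several transcripts.fa files into one."""
    return [
        ["velveth", out_dir, str(merge_k), "-long", *transcript_files],
        ["velvetg", out_dir, "-read_trkg", "yes", "-conserveLong", "yes"],
        ["oases", out_dir, "-merge", "yes"],
    ]


if __name__ == "__main__":
    kmers = [31, 41, 51, 61]                     # strategy 1; use [51] for strategy 2
    samples = [f"{t}{r}" for t in "ABCDE" for r in (1, 2, 3)]
    per_sample_merged = []
    for s in samples:
        # Hypothetical read file names.
        for cmd in per_sample_commands(s, f"{s}_1.fq", f"{s}_2.fq", kmers, 200):
            print(" ".join(cmd))
        # Merge the per-k assemblies of this sample (merge k of 27 is an assumption).
        per_k = [f"{s}_k{k}/transcripts.fa" for k in kmers]
        for cmd in merge_commands(per_k, 27, f"{s}_merged"):
            print(" ".join(cmd))
        per_sample_merged.append(f"{s}_merged/transcripts.fa")
    # Final step: merge the 15 per-sample transcriptomes with a single k of 51.
    for cmd in merge_commands(per_sample_merged, 51, "reference_merged"):
        print(" ".join(cmd))
```

Note that k-mers above 31 require Velvet to be compiled with a sufficiently large MAXKMERLENGTH.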

In our preliminary analysis, the reference transcriptome generated by strategy 1) seems much better than the one from strategy 2). I am not sure whether strategy 1) is the right way to do this; could anyone give me some suggestions? Many thanks.

Assembly RNA-Seq Transcriptome Oases • 2.3k views

Ram · Answer 1 · 2015-07-03

9.9 years ago

mark.ziemann ★ 2.0k

The Trinity assembler is a very good tool for the job because it has very few parameters to fine-tune; it "just works". You can boost the accuracy of the assembly by first running a read corrector such as BFC.

Also, why do you want to generate individual (partial) transcriptomes for the 15 datasets? A single assembly of all 15 samples together will be the most accurate build. You can then map reads from each of the 15 datasets to the new assembly in order to quantify transcript expression.
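
As a rough sketch of that pooled workflow (an illustration, not commands taken from this thread), the Python below builds one Trinity command over all samples and one per-sample quantification command using Trinity's bundled `align_and_estimate_abundance.pl` script. The file names, CPU count, memory limit, the choice of RSEM with bowtie2, and the path to the assembled FASTA are all assumptions; check the options against your installed Trinity version.

```python
from typing import Dict, List, Tuple


def trinity_command(reads: Dict[str, Tuple[str, str]], out_dir: str,
                    cpu: int, max_memory: str) -> List[str]:
    """Build a single Trinity run that pools the reads of every sample."""
    names = sorted(reads)
    # Trinity accepts comma-separated lists of left and right read files.
    left = ",".join(reads[s][0] for s in names)
    right = ",".join(reads[s][1] for s in names)
    return ["Trinity", "--seqType", "fq",
            "--left", left, "--right", right,
            "--CPU", str(cpu), "--max_memory", max_memory,
            "--output", out_dir]


def quant_command(transcripts: str, left: str, right: str,
                  out_dir: str) -> List[str]:
    """Build a per-sample abundance estimate against the pooled assembly."""
    return ["align_and_estimate_abundance.pl",
            "--transcripts", transcripts,
            "--seqType", "fq", "--left", left, "--right", right,
            "--est_method", "RSEM", "--aln_method", "bowtie2",
            "--trinity_mode", "--prep_reference",
            "--output_dir", out_dir]


if __name__ == "__main__":
    # Hypothetical file names for 5 tissues x 3 replicates.
    reads = {f"{t}{r}": (f"{t}{r}_1.fq", f"{t}{r}_2.fq")
             for t in "ABCDE" for r in (1, 2, 3)}
    print(" ".join(trinity_command(reads, "trinity_out", 8, "50G")))
    # The assembled FASTA path is an assumption; it differs between Trinity versions.
    fasta = "trinity_out/Trinity.fasta"
    for name, (r1, r2) in sorted(reads.items()):
        print(" ".join(quant_command(fasta, r1, r2, f"quant_{name}")))
```

Each `quant_<sample>` directory then holds expression estimates for one dataset against the shared reference.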

Thanks, Mark.

We chose Oases because it is faster and more computationally efficient, but we also tried Trinity. Since Oases supports multiple k-mers, which is better than a single k-mer both in theory and in practice, we are continuing to use Oases.

Assembling all 15 paired-end datasets at the same time requires a lot of memory, which is why we use the current strategies.

written 9.8 years ago by yp.cun