Question

Merging two de novo assemblies generated with Trinity?

0

10.2 years ago

nbvasani ▴ 240

Hi Fellow Users,

In the past I generated a de novo transcriptome assembly for a plant using Trinity and annotated about 90% of the assembly. Recently I got Illumina HiSeq RNA-seq data from the same plant. This time I mapped the new RNA-seq data to the existing assembly and extracted the unmapped reads.

I used these unmapped reads to generate a new de novo assembly with Trinity, and finally merged the two Trinity assemblies. After looking thoroughly at the merged assembly, I found many transcripts with the same ID but different count numbers. I think Trinity uses the same numbering scheme for transcripts in every run.

So my next step is to use a different assembler, like Velvet/Oases, in order to get the most use out of my existing assembly.

Any suggestions on how I can make better use of my existing assembly with the new data?

Can I make some changes while running trinity?

Can I use a different assembler?

I would really appreciate any input.

Thanks in advance
naresh

:-)

transcriptome RNA-Seq Assembly • 5.7k views

updated 2.9 years ago by Ram 44k • written 10.2 years ago by nbvasani ▴ 240

Ram · Answer 1 · 2014-09-09

1

10.2 years ago

Rayan Chikhi ★ 1.5k

How about doing a single Trinity run using the old data + the new data?

The same advice was given in a related question, which is worth reading.

Another discussion here.

And a paper documented an assembly merge (using EvidentialGene).

I have some doubts that running Trinity on unmapped reads from another assembly would be good practice.
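For illustration, a single combined run could be set up roughly as below. This is only a sketch: the file names, memory limit, CPU count and output directory are made-up placeholders, and it assumes paired-end FASTQ input and a Trinity version that accepts comma-separated file lists for `--left`/`--right` and the `--max_memory` option (older releases used a different memory flag, so check `Trinity --help` for your version). The function only builds the command; it does not run Trinity.

```python
from typing import List, Sequence


def build_trinity_command(
    left_files: Sequence[str],
    right_files: Sequence[str],
    max_memory: str = "50G",               # placeholder value, adjust to your machine
    cpu: int = 8,                          # placeholder value
    output_dir: str = "trinity_combined",  # hypothetical output directory name
) -> List[str]:
    """Build (but do not run) one Trinity command over old + new read files."""
    if not left_files or len(left_files) != len(right_files):
        raise ValueError("need the same, non-zero number of left and right files")
    return [
        "Trinity",
        "--seqType", "fq",
        # Several libraries are passed as one comma-separated list per mate.
        "--left", ",".join(left_files),
        "--right", ",".join(right_files),
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--output", output_dir,
    ]


if __name__ == "__main__":
    # Hypothetical file names for the old and the new sequencing runs.
    cmd = build_trinity_command(
        ["old_R1.fastq", "new_R1.fastq"],
        ["old_R2.fastq", "new_R2.fastq"],
    )
    print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the paths and resource limits match your setup.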

updated 2.9 years ago by Ram 44k • written 10.2 years ago by Rayan Chikhi ★ 1.5k

0

Thanks Rayan!

The advice in that thread goes the other way around.

Thanks for all the links and suggestions.

Naresh

updated 2.9 years ago by Ram 44k • written 10.2 years ago by nbvasani ▴ 240

Ram · Answer 2 · 2014-09-10

1

10.2 years ago

Prakki Rama ★ 2.7k

Instead of collecting the unmapped reads and assembling them, another option is to assemble the new data with Trinity and then cluster both assemblies using cd-hit-est with a chosen similarity cutoff. That way, contigs that are the same in the two assemblies are clustered together and cd-hit-est keeps one representative per cluster.

You can also try merging the assemblies using the CAP3 assembler.
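Before clustering or merging, the two assemblies need to go into one FASTA file without ID clashes, since separate Trinity runs can produce identical transcript IDs. A minimal sketch of that step, assuming plain FASTA input; the function name, the labels and the file names are hypothetical:

```python
from typing import Dict


def merge_with_prefixes(assemblies: Dict[str, str], merged_path: str) -> int:
    """Concatenate FASTA files, prefixing every sequence ID with its assembly label.

    `assemblies` maps a label (e.g. "old") to a FASTA path.
    Returns the number of sequences written.
    """
    seen = set()
    written = 0
    with open(merged_path, "w") as out:
        for label, fasta_path in assemblies.items():
            with open(fasta_path) as fasta:
                for line in fasta:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        parts = line[1:].split(None, 1)
                        if not parts:
                            raise ValueError(f"empty header in {fasta_path}")
                        new_id = f"{label}_{parts[0]}"
                        if new_id in seen:
                            raise ValueError(f"duplicate ID after prefixing: {new_id}")
                        seen.add(new_id)
                        # Keep the rest of the header (description) unchanged.
                        rest = f" {parts[1]}" if len(parts) > 1 else ""
                        out.write(f">{new_id}{rest}\n")
                        written += 1
                    else:
                        out.write(line + "\n")
    return written


if __name__ == "__main__":
    # Hypothetical file names for the two Trinity outputs.
    n = merge_with_prefixes(
        {"old": "old_Trinity.fasta", "new": "new_Trinity.fasta"},
        "merged.fasta",
    )
    print(f"wrote {n} sequences")
```

The merged file could then be clustered with something like `cd-hit-est -i merged.fasta -o clustered.fasta -c 0.95 -n 10`, where the 0.95 identity cutoff is only an illustrative value and not a recommendation from this thread.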

10.2 years ago by Prakki Rama ★ 2.7k

0

Hi Prakki!

Thanks for the suggestion.

The main reasons why I didn't want to create a new assembly:

My previous assembly is 80% annotated.
My new RNA-seq data consist of 550 million reads, and if I use these reads to generate an assembly it will use up all of my PC's memory and freeze the PC.

Thanks,
Naresh

updated 2.9 years ago by Ram 44k • written 10.2 years ago by nbvasani ▴ 240