Impact of PacBio long reads on de novo transcriptome assembly

4.1 years ago

Manuel Mendoza ▴ 10

Hi!

We are very interested in assembling a good reference transcriptome for a non-model organism and building a local database. We have sequenced the same individual with Illumina (150 million paired-end reads) and PacBio IsoSeq v3 (2 SMRT cells: one for transcripts shorter than 5 kb and the other for longer transcripts).

To process the long reads, I followed the PacBio IsoSeq pipeline described in their GitHub repo (https://github.com/PacificBiosciences/IsoSeq). The pipeline ended up discarding 70% of the long reads. Is that normal?

Using this data, I assembled the transcriptome twice: once with short reads only, and once combining long and short reads. In the end, I found no real difference: approximately the same N50, the same number of assembled transcripts, and the same rate of misassemblies. Does anyone know whether PacBio data is simply not worth it for de novo transcriptome assembly?
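For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how transcript count and N50 can be computed from an assembly FASTA. It is not the tool used for the numbers above (the post does not name one); the function names and the toy FASTA records are hypothetical and only for illustration.

```python
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that records of length >= L cover at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Transcript count, total bases and N50 for one assembly."""
    lengths = fasta_lengths(lines)
    return {"transcripts": len(lengths), "bases": sum(lengths), "n50": n50(lengths)}


if __name__ == "__main__":
    # Toy records standing in for the two assemblies (hypothetical data).
    short_only = ">t1\nACGTACGT\n>t2\nACGT\n>t3\nAC\n".splitlines()
    hybrid = ">t1\nACGTACGTAC\n>t2\nACGT\n".splitlines()
    for name, records in (("short-only", short_only), ("hybrid", hybrid)):
        print(name, summarize(records))
```

With real assemblies, the same `summarize` call can be given an open file handle instead of the in-memory lines. Note that N50 and transcript count alone say nothing about isoform completeness, so two assemblies can look identical on these metrics and still differ in full-length transcript recovery.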

RNA-Seq Assembly Transcriptome de novo long-reads • 744 views
