Dear all,
I am currently writing a project proposal that would involve studying gene expression in several non-model animals. Until now I've been working with Illumina, generating RNA-Seq libraries and assembling them de novo with Trinity to obtain a reference transcriptome. This time I want to use the Oxford Nanopore Technologies (ONT) MinION sequencer, but I cannot find any assembler designed specifically for de novo transcriptome assembly from ONT long reads alone (I did find some genome assemblers).
- Is there one?
- I've been finding a lot of information about Hybrid-Seq, which combines Illumina and ONT reads. Is this currently the only option for generating a de novo transcriptome using ONT?
Thanks in advance!
As it is now, a Nanopore-only assembly would be a bad idea. The following post is about genes predicted from genome assemblies, but the problem would affect transcriptome assemblies similarly:
On stuck records and indel errors; or “stop publishing bad genomes”
I thought ONT reads would be, on average, long enough that you can sequence full-length transcripts without needing to assemble them. Is that not the case?
You still have sequencing errors (and a lot of them), and you still sequence different copies of the same transcript several times, so there is still a need to assemble in order to reduce the errors and collate the reads into one or a few transcripts.
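To make the error-reduction idea concrete, here is a toy sketch in Python. It assumes the reads all come from the same transcript and have already been aligned to each other (padded with "-" to equal length); the function name and the example reads are made up for illustration, and real long-read pipelines do considerably more (clustering, alignment, polishing). It only shows why several noisy copies of one transcript can give a cleaner sequence than any single read.

```python
from collections import Counter


def column_consensus(aligned_reads, gap="-"):
    """Majority-vote consensus over reads already aligned to equal length.

    Hypothetical helper for illustration only: each column of the
    alignment is replaced by its most common symbol, and columns where
    the gap symbol wins are dropped from the output.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != gap:
            consensus.append(base)
    return "".join(consensus)


# Five made-up noisy copies of the same transcript, already aligned.
reads = [
    "ATGGCC-TTAGC",
    "ATGGCCATTAGC",  # spurious insertion
    "ATGGCC-TTAGC",
    "ATGACC-TTAGC",  # substitution
    "ATGGCC-TT-GC",  # deletion
]
consensus = column_consensus(reads)
print(consensus)
```

Each individual error is outvoted by the other reads at that column. Note that this only works when the errors are random; a systematic error that hits most reads at the same position (for example in homopolymer runs) would survive the vote, and reads from different splice isoforms must be separated first or the consensus will blend them.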
So in that case, the "transcriptome assembly" is more like building a consensus for each transcript to get rid of errors introduced by the technology, is that it? What about alternative splicing? If you have any papers I could read that explain how it works, I would really appreciate it. I'm curious about the subject and don't really know where to start :)
I don't yet have first-hand experience with third-generation sequencing platforms, but this tutorial is really good:
https://github.com/sib-swiss/2017-10-longreads-training/wiki/Transcriptome-assembly:-introduction
From what I read, the ONT RNA-seq reads are indeed long enough but, as expected, too error-prone. There is not yet a clearly defined protocol for correcting the reads, since the error rate keeps decreasing as the technology advances... Thanks for the link :)