Question

Software for gene prediction on an assembly of nanopore sequencing reads


4.0 years ago

boymin2020 ▴ 80

Hi all,

I have just completed a genome assembly from nanopore long reads, and I also have transcriptomics data. As the next step, I want to do gene prediction (including CDS, exon, and intron information, etc.). Could you recommend some software developed for this purpose? Something like MARKER2 for next-generation sequencing data.

Many thanks,

nanopore gene prediction • 964 views

updated 4.0 years ago by lieven.sterck 15k • written 4.0 years ago by boymin2020 ▴ 80

score 2 · Accepted Answer · 2020-11-16

In essence, there is no difference in annotating a genome assembled from Illumina, PacBio, or nanopore reads.

The main issue here is the quality of the resulting assembly, and that can indeed differ between sequencing technologies: Illumina is the most accurate at the per-base level, and the other two are less accurate. It is thus important to bring whatever genome assembly you have up to a quality level that makes it worth annotating (polishing). In that respect, it is advisable to polish long-read assemblies with Illumina reads to improve their per-base accuracy. If you don't have Illumina data at hand, you will of course have to work with what you have.

The biggest issues you will encounter with this kind of assembly (long read only) are indels and wrong base calls. These form a real problem when doing annotation (especially the indels), but as far as I know no gene prediction software handles them better or worse than another.
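To illustrate why indels in particular hurt annotation, here is a minimal sketch in plain Python. The sequence and the helper names (`CODON_TABLE`, `translate_until_stop`) are made up for this example and do not come from any gene prediction tool; the code simply translates a toy ORF with the standard genetic code, before and after deleting a single base.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_until_stop(seq: str) -> str:
    """Translate seq in frame 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy ORF (not real data): ATG GCC AAG CTG ACC GAA TTC TAA
original = "ATGGCCAAGCTGACCGAATTCTAA"

# Simulate a single-base deletion, as an uncorrected long-read error might cause.
with_deletion = original[:3] + original[4:]

print(translate_until_stop(original))       # MAKLTEF
print(translate_until_stop(with_deletion))  # MPS
```

A single missing base shifts the reading frame, changes every downstream codon, and here introduces a premature stop. A wrong base call would at worst change one amino acid, which is why the indels are the bigger concern and why polishing before annotation pays off.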

Bottom line: there is no software specific to this, and any tool that does gene prediction is as suitable as the next one.

Ah, and I assume you mean "MAKER" instead of MARKER? (And yes, that one is a valid one to use.)

For a fairly comprehensive list of gene prediction tools, have a look at this post: List of genome annotation tools (>140)