Hi there,
I am looking for a tool that can perform a simple annotation process based on mapping of full/partial transcripts to a genome sequence. I've been using MAKER's est2genome option for a while, but it seems to have a few issues, so I'm curious to see if there is a similar tool out there. I would expect something that parses blast/blat output to produce gene models.
Is anyone here aware of such a tool?
Thanks!
PASA, GenomeThreader, or even minimap2 nowadays (but you then need to run agat_convert_minimap2_bam2gff.pl from AGAT to convert the alignments to GFF).
You can have a look at this list of annotation tools: https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/annotation_tools.md
Thank you! That's very helpful. I think I'll take a look at GenomeThreader first.
Just updating for future readers: I tried out minimap2 + AGAT, which works nicely and is very fast, but does not produce gene structures (UTRs, exons, CDS). The resulting GFF contains cDNA_match features, which only indicate gene boundaries, so this is not what I need. I've also tried GenomeThreader, which does produce detailed gene structures, but I found quite a few "chimeric" genes: cases in which two transcripts were merged into one gene with a predicted intron linking them. So far I haven't found a solution for that, so I'll probably give PASA a try as well.
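For anyone who wants to check quickly which feature types a GFF actually contains (for example, only cDNA_match versus exon/CDS features), here is a minimal Python sketch. It assumes a standard tab-separated GFF with nine columns; the function name and the example lines are made up for illustration and are not output from any of the tools mentioned.

```python
from collections import Counter

def count_feature_types(gff_lines):
    """Tally column 3 (feature type) of GFF lines, skipping comments and blanks."""
    counts = Counter()
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GFF feature line
        counts[fields[2]] += 1
    return counts

# Made-up example lines (hypothetical IDs and coordinates).
example = [
    "##gff-version 3",
    "chr1\tminimap2\tcDNA_match\t100\t200\t.\t+\t.\tID=t1",
    "chr1\tminimap2\tcDNA_match\t301\t400\t.\t+\t.\tID=t1",
    "chr1\tother\texon\t100\t200\t.\t+\t.\tID=e1;Parent=m1",
]

for feature_type, n in sorted(count_feature_types(example).items()):
    print(feature_type, n)
```

For a real file you could pass an open file handle instead of the list, since the function only iterates over lines.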
Thank you for the feedback. My bad: minimap2 can be used as a splice-aware transcript aligner, but you are right, it does not perform structural gene prediction.
I can also mention GMAP, which people often use to replace Exonerate (PASA uses GMAP or BLAT). You can try GAWN too.
Thanks, I'll take a look at GAWN (liked the name...). In any case, it seems like PASA is also not the right tool for the job. See this thread for details.
Your link does not work.
Works fine for me... can you try again?
It’s fine now. Thank you
Now that you have generated "hints" with exonerate, PASA and minimap2, you can provide all of them together to MAKER; the annotation might be better.
Based on these hints you can generate gene models with Ipred or EVidenceModeler (and MAKER).
GAWN seems to generate pretty good results and it's quite fast too. There is a fair amount of noise (gene models with very long, unlikely introns and other curiosities), but it can be cleaned using simple cutoffs on alignment stats. I should also note that it's very easy to run GAWN - indeed no nightmares... So I think this tool is the best choice available to me right now. Thanks for the help!
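In case it helps future readers, the kind of cutoff I mean can be sketched roughly like this in Python. This is only an illustration under assumptions: it expects GFF3 lines with "exon" features carrying a Parent attribute (GAWN's actual output may be laid out differently), the function names are hypothetical, and the 10,000 bp threshold is an arbitrary placeholder, not a recommended value.

```python
from collections import defaultdict

def parse_parents(attributes):
    """Return the list of Parent IDs from a GFF3 attribute column."""
    for item in attributes.split(";"):
        key, _, value = item.strip().partition("=")
        if key == "Parent":
            return value.split(",")
    return []

def max_intron_per_transcript(gff_lines):
    """Map each transcript ID to the length of its longest intron (0 if single-exon)."""
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        start, end = int(fields[3]), int(fields[4])
        for parent in parse_parents(fields[8]):
            exons[parent].append((start, end))
    longest = {}
    for transcript_id, coords in exons.items():
        coords.sort()
        # Intron length = gap between consecutive exons (1-based, inclusive coordinates).
        introns = [max(0, nxt[0] - prev[1] - 1) for prev, nxt in zip(coords, coords[1:])]
        longest[transcript_id] = max(introns, default=0)
    return longest

def transcripts_within_cutoff(gff_lines, max_intron=10000):
    """Return IDs of transcripts whose longest intron does not exceed max_intron."""
    lines = list(gff_lines)
    return {tid for tid, length in max_intron_per_transcript(lines).items()
            if length <= max_intron}

# Made-up example: t1 has a 100 bp intron, t2 has a very long one.
example = [
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=e1;Parent=t1",
    "chr1\tsrc\texon\t301\t400\t.\t+\t.\tID=e2;Parent=t1",
    "chr1\tsrc\texon\t1000\t1100\t.\t+\t.\tID=e3;Parent=t2",
    "chr1\tsrc\texon\t60000\t60100\t.\t+\t.\tID=e4;Parent=t2",
]
print(sorted(transcripts_within_cutoff(example)))
```

The same pattern would work for other alignment stats (identity, coverage) if they are present in the attribute column; a sensible intron threshold depends on the organism.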