Simple genome annotation using transcript data
4.5 years ago
liorglic ★ 1.5k

Hi there,
I am looking for a tool that can perform a simple annotation process based on mapping of full/partial transcripts to a genome sequence. I've been using MAKER's est2genome option for a while, but it seems to have a few issues, so I'm curious to see if there is a similar tool out there. I would expect something that parses blast/blat output to produce gene models.
Is anyone here aware of such a tool?
Thanks!

annotation fasta gff • 2.0k views

PASA or GenomeThreader; you can even use minimap2 nowadays (but you then need to run agat_convert_minimap2_bam2gff.pl from AGAT to convert the alignments to GFF).
You can have a look at this list of annotation tools: https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/annotation_tools.md


Thank you! That's very helpful. I think I'll take a look at GenomeThreader first.


Just updating for future readers: I tried out minimap2 + AGAT, which works nicely and is very fast, but does not produce gene structure (UTRs, exons, CDS). The resulting GFF contains only cDNA_match features, which indicate gene boundaries, so this is not what I need. I've also tried GenomeThreader, which does produce detailed gene structures, but I found quite a few "chimeric" genes: cases in which two transcripts were merged into one gene with a predicted intron linking them. So far I haven't found a solution for that, so I'll probably give PASA a try as well.
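For readers who only need those boundaries, here is a minimal Python sketch of collapsing cDNA_match blocks into one span per alignment. It assumes that all blocks of one alignment share the same ID attribute (GFF3 allows this for discontinuous features); whether the converter's output is laid out this way is an assumption, not something confirmed in this thread. The function names and the example records are made up for illustration.

```python
def parse_attributes(column9):
    """Parse a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def match_spans(gff_lines, feature_type="cDNA_match"):
    """Return {alignment ID: (seqid, start, end, strand)} covering all blocks.

    Assumption: every block of one alignment carries the same ID attribute.
    """
    spans = {}
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        align_id = parse_attributes(cols[8]).get("ID")
        if align_id is None:
            continue
        if align_id in spans:
            # Widen the span already recorded for this alignment.
            _, old_start, old_end, _ = spans[align_id]
            start, end = min(start, old_start), max(end, old_end)
        spans[align_id] = (seqid, start, end, strand)
    return spans


# Hypothetical records, not real tool output.
example = [
    "##gff-version 3",
    "chr1\tminimap2\tcDNA_match\t100\t200\t.\t+\t.\tID=tx1;Target=tx1 1 101",
    "chr1\tminimap2\tcDNA_match\t400\t500\t.\t+\t.\tID=tx1;Target=tx1 102 202",
    "chr1\tminimap2\tcDNA_match\t900\t950\t.\t-\t.\tID=tx2;Target=tx2 1 51",
]
spans = match_spans(example)
for align_id, span in spans.items():
    print(align_id, *span)
```

The spans give locus boundaries only; the gaps between blocks hint at introns, but UTRs and CDS still need a dedicated structural annotation step.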


Thank you for the feedback. My bad: minimap2 can be used as a splice-aware transcript aligner, but you are right, it does not perform structural gene prediction.
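As a rough illustration of the splice-aware mode, the sketch below builds a minimap2 command line in Python using the `-ax splice` preset. The file names, the helper function and the thread count are placeholders chosen for this example. minimap2 writes SAM to standard output, so the result would still need to be converted to BAM before running the AGAT script; that step is not shown.

```python
import shlex


def minimap2_splice_command(genome_fasta, transcripts_fasta, threads=4):
    """Build (but do not run) a minimap2 spliced-alignment command.

    '-ax splice' selects the spliced-alignment preset with SAM output;
    '-t' sets the number of threads. File names are placeholders.
    """
    return [
        "minimap2",
        "-ax", "splice",
        "-t", str(threads),
        genome_fasta,
        transcripts_fasta,
    ]


cmd = minimap2_splice_command("genome.fa", "transcripts.fa")
print(shlex.join(cmd))  # pass `cmd` to subprocess.run(...) to execute it
```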

I can also mention GMAP, which people often use in place of Exonerate (PASA uses GMAP or BLAT). You can try GAWN too.


Thanks, I'll take a look at GAWN (liked the name...). In any case, it seems like PASA is also not the right tool for the job. See this thread for details.


Your link does not work.


Works fine for me... can you try again?


It’s fine now. Thank you.


Now that you have generated "hints" with Exonerate, PASA and minimap2, you can provide all of them together to MAKER; the annotation might be better.

Based on these hints you can generate gene models with IPred, EvidenceModeler (and MAKER).


GAWN seems to generate pretty good results and it's quite fast too. There is a fair amount of noise (gene models with very long, unlikely introns and other curiosities), but it can be cleaned using simple cutoffs on alignment stats. I should also note that it's very easy to run GAWN - indeed no nightmares... So I think this tool is the best choice available to me right now. Thanks for the help!
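To illustrate the kind of cutoff meant here, below is a minimal Python sketch that flags transcripts whose longest intron exceeds a threshold. It assumes a GFF3 file in which exon features carry a Parent attribute naming their transcript; the 50 kb threshold, the function names and the example records are arbitrary choices for illustration, not values taken from GAWN or from this analysis.

```python
from collections import defaultdict


def longest_intron_per_transcript(gff_lines):
    """Return {transcript ID: longest intron length in bp} from exon features.

    Assumption: each exon line has a Parent=<transcript ID> attribute.
    Single-exon transcripts get a longest intron of 0.
    """
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        parent = None
        for field in cols[8].split(";"):
            if field.startswith("Parent="):
                parent = field[len("Parent="):]
        if parent is None:
            continue
        exons[parent].append((int(cols[3]), int(cols[4])))

    longest = {}
    for parent, blocks in exons.items():
        blocks.sort()
        # Intron length = bases strictly between consecutive exons.
        gaps = [nxt[0] - prev[1] - 1 for prev, nxt in zip(blocks, blocks[1:])]
        longest[parent] = max(gaps, default=0)
    return longest


def transcripts_to_drop(gff_lines, max_intron=50_000):
    """IDs of transcripts whose longest intron exceeds max_intron (hypothetical cutoff)."""
    introns = longest_intron_per_transcript(gff_lines)
    return {tx for tx, length in introns.items() if length > max_intron}


# Hypothetical records, not real GAWN output.
example = [
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=e1;Parent=tx1",
    "chr1\tsrc\texon\t301\t400\t.\t+\t.\tID=e2;Parent=tx1",
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=e3;Parent=tx2",
    "chr1\tsrc\texon\t100301\t100400\t.\t+\t.\tID=e4;Parent=tx2",
]
longest = longest_intron_per_transcript(example)
dropped = transcripts_to_drop(example)
print(longest, dropped)
```

A sensible threshold depends on the organism, so it is worth looking at the intron length distribution before fixing a value.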

3.5 years ago
sagnik ▴ 50

Hello,

We have developed a gene annotator called FINDER, which can annotate eukaryotic genomes using short-read RNA-Seq data and protein sequences. It is completely automated and requires no manual intervention. FINDER also runs BRAKER to incorporate predicted genes in the repertoire. You can access the FINDER paper, and the software is available on GitHub.

Thank you.
