Hi,
I am trying to annotate my assembly using MAKER. My plan was to run it first with est2genome and protein2genome, using FASTA files as evidence, then use the resulting gene models to train SNAP, then continue on to iterative SNAP runs, Augustus, etc.
My problem comes when I check the output of the first SNAP run, which looks like this:
##gff-version 3
19__unscaffolded . contig 1 5673700 . . . ID=19__unscaffolded;Name=19__unscaffolded
19__unscaffolded snap match 43 5639198 242017.148 + . ID=19__unscaffolded:hit:0:4.5.0.55;Name=snap-19__unscaffolded-abinit-gene-55.0-mRNA-1;target_length=5673700
19__unscaffolded snap match_part 43 89 16.744 + . ID=19__unscaffolded:hsp:0:4.5.0.55;Parent=19__unscaffolded:hit:0:4.5.0.55;Target=snap-19__unscaffolded-abinit-gene-55.0-mRNA-1 1 47 +;Gap=M47
19__unscaffolded snap match_part 172 221 20.284 + . ID=19__unscaffolded:hsp:1:4.5.0.55;Parent=19__unscaffolded:hit:0:4.5.0.55;Target=snap-19__unscaffolded-abinit-gene-55.0-mRNA-1 48 97 +;Gap=M50
19__unscaffolded snap match_part 293 361 21.730 + . ID=19__unscaffolded:hsp:2:4.5.0.55;Parent=19__unscaffolded:hit:0:4.5.0.55;Target=snap-19__unscaffolded-abinit-gene-55.0-mRNA-1 98 166 +;Gap=M69
19__unscaffolded snap match_part 405 530 36.538 + . ID=19__unscaffolded:hsp:3:4.5.0.55;Parent=19__unscaffolded:hit:0:4.5.0.55;Target=snap-19__unscaffolded-abinit-gene-55.0-mRNA-1 167 292 +;Gap=M126
... and on.
When I try to use maker2zff, it generates empty genome.ann and genome.dna files.
When I use the GAAS package and its script gaas_merge_outputs_from_datastore.pl, I get a GFF with no actual gene features (CDS, transcript, exon).
Grepping for these also returns zero hits.
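For anyone wanting to repeat that check without a series of greps, here is a minimal Python sketch that tallies every (source, type) pair in a GFF3 file. It assumes the file is tab-delimited with the usual nine columns and that any embedded sequence follows a ##FASTA line; the function name count_feature_types and the sample lines are made up for illustration.

```python
from collections import Counter


def count_feature_types(lines):
    """Tally (source, type) pairs from an iterable of GFF3 lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a well-formed nine-column feature line
        counts[(fields[1], fields[2])] += 1
    return counts


# Hypothetical sample lines, tab-delimited, modelled on the output above.
sample = [
    "##gff-version 3\n",
    "19__unscaffolded\t.\tcontig\t1\t5673700\t.\t.\t.\tID=19__unscaffolded\n",
    "19__unscaffolded\tsnap\tmatch\t43\t5639198\t242017.148\t+\t.\tID=hit0\n",
    "19__unscaffolded\tsnap\tmatch_part\t43\t89\t16.744\t+\t.\tID=hsp0;Parent=hit0\n",
]

for (source, ftype), n in sorted(count_feature_types(sample).items()):
    print(f"{source}\t{ftype}\t{n}")
```

To run it on a real file, pass an open file handle instead of the sample list, e.g. count_feature_types(open(path)). If no rows with type gene, mRNA, exon or CDS appear in the tally, that matches the zero grep hits described above.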
I thought this could be a problem with the SNAP installation or path in MAKER, so I installed SNAP separately through conda. It ran fine and generated exon predictions, but the "gff" output of SNAP is not actually GFF, and cannot be read into maker2zff to generate a .hmm for further annotation.
I have had this exact same issue on two different assemblies now from distinct phyla with distinct genomic architecture.
My question is: has anyone had issues like this, where SNAP in MAKER generates no gene models? Or has anyone successfully run SNAP separately and fed the output back into MAKER? How do I improve the output of a MAKER SNAP run?
Also, what are people's opinions on MAKER? I have been trying to get MAKER to run without near-constant troubleshooting for almost two years now, and a look at the forums shows me I'm far from the only one.
Cheers
Edit 1: Since posting this, I wondered whether my Singularity install of MAKER/SNAP could be causing the issue, so I ran SNAP again, this time from a MAKER conda install that has given me successful SNAP runs in the past. This also failed to predict any models other than snap match and snap match_part.
Maybe you could give MOSGA a chance for user-friendly genome annotation.
I have had issues running MOSGA in the past, and it is not as reproducible (being web-based), but I will give it a try to get past the SNAP stage.