Hi All,
I'm trying to train GeneMark-ES so that I can use it in a subsequent analysis with Maker2. I'm using a sample of sequences and a GFF of CDS features obtained by mapping some RNA-seq data onto these sequences with Trinity. I used Geneious to produce the GFF.
This is a portion of my gff:
##gff-version 3
##source-version geneious 10.2.3
FCD_0297 Geneious CDS 3874157 3874871 . - . Name=TRINITY_DN13090_c0_g1_i1
FCD_0297 Geneious CDS 3873567 3873636 . - . Name=TRINITY_DN13090_c0_g1_i1
FCD_0297 Geneious CDS 3873404 3873489 . - . Name=TRINITY_DN13090_c0_g1_i1
FCD_0297 Geneious CDS 440175 440212 . + . Name=TRINITY_DN16051_c0_g1_i1
FCD_0297 Geneious CDS 439015 439129 . + . Name=TRINITY_DN16051_c0_g1_i1
FCD_0297 Geneious CDS 438757 438864 . + . Name=TRINITY_DN16051_c0_g1_i1
FCD_0144 Geneious CDS 769114 769734 . - . Name=TRINITY_DN14911_c3_g2_i1
FCD_0144 Geneious CDS 768367 769024 . - . Name=TRINITY_DN14911_c3_g2_i1
FCD_0144 Geneious CDS 766199 766381 . - . Name=TRINITY_DN14911_c3_g2_i1
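As a sanity check on the file itself, here is a minimal Python sketch that tests each feature line against the basic GFF3 rules: nine tab-separated columns, start not greater than end, a valid strand, and a phase of 0, 1 or 2 for CDS features. The function name `check_gff_line` is hypothetical, and I don't know which of these rules GeneMark-ES actually enforces, so this only shows where the file departs from the GFF3 specification.

```python
VALID_STRANDS = {"+", "-", ".", "?"}
VALID_PHASES = {"0", "1", "2"}


def check_gff_line(line):
    """Return a list of problems found in one GFF3 feature line (empty if none).

    Hypothetical helper: the checks follow the GFF3 specification,
    not any documented GeneMark-ES requirement.
    """
    line = line.rstrip("\n")
    # Blank lines and directives/comments (starting with '#') are not features.
    if not line or line.startswith("#"):
        return []

    cols = line.split("\t")
    if len(cols) != 9:
        return [f"expected 9 tab-separated columns, found {len(cols)}"]

    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    problems = []

    if not (start.isdigit() and end.isdigit()):
        problems.append("start/end are not integers")
    elif int(start) > int(end):
        problems.append("start is greater than end")

    if strand not in VALID_STRANDS:
        problems.append(f"invalid strand '{strand}'")

    # GFF3 requires a phase (0, 1 or 2) on CDS features.
    if ftype == "CDS" and phase not in VALID_PHASES:
        problems.append(f"CDS feature has phase '{phase}', expected 0, 1 or 2")

    return problems


if __name__ == "__main__":
    # First CDS line of my excerpt, assuming tabs between columns in the real file.
    example = "\t".join([
        "FCD_0297", "Geneious", "CDS", "3874157", "3874871",
        ".", "-", ".", "Name=TRINITY_DN13090_c0_g1_i1",
    ])
    for problem in check_gff_line(example):
        print(problem)
```

For the first CDS line of my excerpt, the only thing this flags is the `.` in the phase column.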
This is the command line I used for GeneMark-ES:
$ perl gmes_petap.pl --ES --training --evidence evidence.gff --sequence sequences.fasta
The process gave me hundreds of errors like these:
Data format error: dna.fa_1 Geneious CDS 946644 946847 . + . Name=TRINITY_DN17542_c0_g1_i1
gmhmme3 : warning, file /data/evidence_training.gff line ignored : dna.fa_1 Geneious CDS 946644 946847 . + . Name=TRINITY_DN17542_c0_g1_i1
Data format error: dna.fa_1 Geneious CDS 3185913 3185948 . - . Name=TRINITY_DN3711_c1_g1_i1
gmhmme3 : warning, file data/evidence_training.gff line ignored : dna.fa_1 Geneious CDS 3185913 3185948 . - . Name=TRINITY_DN3711_c1_g1_i1
Data format error: dna.fa_1 Geneious CDS 3185633 3185827 . - . Name=TRINITY_DN3711_c1_g1_i1
gmhmme3 : warning, file /data/evidence_training.gff line ignored : dna.fa_1 Geneious CDS 3185633 3185827 . - . Name=TRINITY_DN3711_c1_g1_i1
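To get a sense of how much of the evidence was dropped, the warnings can be counted. Below is a minimal Python sketch, assuming the log lines look exactly like the ones pasted above. `count_ignored_lines` is a hypothetical helper name, and the regular expression is based only on these warnings, not on any documented GeneMark-ES log format.

```python
import re

# Pattern inferred from the warnings shown above (an assumption, not a documented format).
IGNORED = re.compile(r"^gmhmme3 : warning, file (\S+) line ignored : (.*)$")


def count_ignored_lines(log_lines):
    """Return the set of distinct evidence lines reported as ignored."""
    ignored = set()
    for line in log_lines:
        match = IGNORED.match(line.strip())
        if match:
            # group(2) is the evidence line echoed back in the warning.
            ignored.add(match.group(2).strip())
    return ignored


if __name__ == "__main__":
    log = [
        "Data format error: dna.fa_1 Geneious CDS 946644 946847 . + . Name=TRINITY_DN17542_c0_g1_i1",
        "gmhmme3 : warning, file /data/evidence_training.gff line ignored : dna.fa_1 Geneious CDS 946644 946847 . + . Name=TRINITY_DN17542_c0_g1_i1",
    ]
    print(len(count_ignored_lines(log)))
```

Comparing the size of the returned set with the number of feature lines in evidence_training.gff would show whether any evidence line was kept at all.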
Anyway, the process finished and produced some output.
"data" folder:
dna.fna, evidence_training.gff, training.fna
"training" folder:
dna.fa_1, dna.fa_2
"run" folder:
ini.mod, ES_A.mod, ES_B.mod, ES_C.mod
My questions are:
1) Is it correct to use an evidence file in GFF format, or does the program require another format? Is there something wrong in my GFF file that generates these errors?
2) Did the training finish without using my gff?
3) Which ".mod" file is supposed to be the input for Maker2?
Thank you in advance for any advice!