I am using GeneMark-ES Suite version 4.33 for eukaryotic gene prediction on a plant genome, and I am stuck.
command
perl gmes_petap/gmes_petap.pl --evidence protein.fa --cores 40 --sequence genome_assembly.fa --ET transcripts.gff
input files
protein.fa = a multi-FASTA file of amino acid sequences from a closely related plant
genome_assembly.fa = the genome assembly, a multi-FASTA file of the scaffold sequences in which I want to predict genes
transcripts.gff = GFF file for transcript sequences
error message
error, unexpected format found on line: >prot.1
error on call: /gmes_petap/reformat_gff.pl --out data/evidence.gff --trace info/dna.trace --in protein.fa --quiet
I think I am providing the wrong kind of file to the --evidence parameter, as shown below
--ET [filename]; to run training with introns coordinates from RNA-Seq read alignments (GFF format)
--evidence [filename]; to use in prediction external evidence (RNA or protein) mapped to genome
What could "(RNA or protein) mapped to genome" possibly mean? Any ideas?
An alignment file in GFF format sounds alien to me.
Well, I meant alignment as in 'HSP coordinates of proteins aligned to the genome' (obtained by using e.g. BLAST (not recommended), GenomeThreader, GeneWise, ...). Apologies for the brevity.
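To make the difference concrete, here is a rough Python sketch, not anything shipped with GeneMark. It shows why a FASTA file fails a GFF check and what a single HSP (high-scoring segment pair) could look like as a nine-column, tab-separated GFF line. The function names `looks_like_gff` and `hsp_to_gff`, the source `protein_aligner`, the feature type `match_part`, the `Target=` attribute, and the coordinates are all made up for illustration. Which feature types --evidence actually accepts should be checked in the GeneMark documentation.

```python
# Illustrative sketch only: names and example values below are assumptions,
# not part of GeneMark-ES or of any aligner's real output.

GFF_COLUMNS = 9  # seqid, source, type, start, end, score, strand, phase, attributes


def looks_like_gff(lines):
    """Return True if every non-blank, non-comment line has nine tab-separated fields."""
    saw_feature = False
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        if line.startswith(">") or len(line.split("\t")) != GFF_COLUMNS:
            return False  # FASTA header or wrong column count
        saw_feature = True
    return saw_feature


def hsp_to_gff(seqid, start, end, strand, protein_id,
               source="protein_aligner", feature="match_part"):
    """Format one HSP as a GFF line (source, feature and attribute are placeholders)."""
    if start > end:
        start, end = end, start  # GFF requires start <= end; strand carries direction
    fields = [seqid, source, feature, str(start), str(end),
              ".", strand, ".", f"Target={protein_id}"]
    return "\t".join(fields)


if __name__ == "__main__":
    fasta_lines = [">prot.1", "MSTNPKPQRK"]
    gff_lines = [hsp_to_gff("scaffold_1", 1200, 1450, "+", "prot.1")]
    print(looks_like_gff(fasta_lines))  # False
    print(looks_like_gff(gff_lines))    # True
    print(gff_lines[0])
```

So protein.fa would first have to be aligned to genome_assembly.fa with one of the tools above, and the coordinates of those alignments on the scaffolds are what go into the evidence file.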