Dear community,
I am currently attempting to train my Augustus model using WebAUGUSTUS.
I have the following files at my disposal:
- Genome of my species (GEN_ID.fasta)
- Annotation (GEN_ID.gb)
- A set of RNA sequences that I assembled using a protocol for genome-guided RNA assembly (RNA.fasta)

I have successfully trained a model using GEN_ID.fasta and GEN_ID.gb. However, I am encountering issues when I try to submit my assembled RNA.fasta set as well.
Upon reviewing the instructions, I noticed the following: 'We filter for the average length of cDNA fasta entries and may reject the entire training job in case the sequences are on average too short, i.e., shorter than 400 bp.' Indeed, by making sure my sequences are not shorter than 400 bp, I was able to get the job accepted.
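For anyone hitting the same length check, here is a minimal sketch of how the average entry length can be computed and short entries dropped before uploading. It assumes plain FASTA input; the function names (read_fasta, mean_length, drop_short) are hypothetical, and this is only an illustration of the quoted rule, not the actual WebAUGUSTUS filter.

```python
from statistics import mean


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def mean_length(records):
    """Average sequence length in bp (0.0 for an empty set)."""
    lengths = [len(seq) for _, seq in records]
    return mean(lengths) if lengths else 0.0


def drop_short(records, min_len=400):
    """Keep only entries of at least min_len bp (400 is taken from the quoted rule)."""
    return [(h, s) for h, s in records if len(s) >= min_len]


if __name__ == "__main__":
    # Tiny in-memory demo; for a real file use: list(read_fasta(open("RNA.fasta")))
    demo = [">t1", "ACGT" * 150, ">t2", "ACGT" * 50, ">t3", "ACG" * 200]
    records = list(read_fasta(demo))
    print("mean length before:", mean_length(records))
    kept = drop_short(records)
    print("kept:", [h for h, _ in kept], "mean length after:", mean_length(kept))
```

Dropping every entry below the threshold is stricter than the quoted rule requires, since that rule only refers to the average length, but it guarantees the average ends up at or above 400 bp.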
However, the training then ran for several hours and ended with the following error:
ERROR: Number of UTR training examples is smaller than 50. Abort UTR training. If this is the only error message, the AUGUSTUS parameters for your species were optimized ok, but you are lacking UTR parameters. Do not attempt to predict genes with UTRs for this species using the current parameter set!
failed to execute: perl /usr/share/augustus/scripts/autoAugTrain.pl --cpus=72 -g=/data/www/webaugustus/webdata/augtrain/trainVSM2RRKk/autoAug/seq/genome_clean.fa -s=trainVSM2RRKk --utr -e=/data/www/webaugustus/webdata/augtrain/trainVSM2RRKk/autoAug/cdna/cdna.f.psl --aug=/data/www/webaugustus/webdata/augtrain/trainVSM2RRKk/autoAug/autoAugPred_hints/predictions/augustus.gff -w=/data/www/webaugustus/webdata/augtrain/trainVSM2RRKk/autoAug -v -v -v --opt=1 --useexisting.
Do you have any suggestions on how to resolve this issue?
Thank you for your assistance!