7 months ago
Vijith
I am seeking help with Augustus gene prediction!
I am performing a whole genome assembly of a plant species. I have completed the gene prediction using the Augustus pipeline. The output file is in `.gff` format. Now I want to annotate the genes by running BLAST, for which I need the coding sequences in a `.fasta` file.
This is the approach I have thought of:
- Use the `perl` script `getAnnoFasta.pl` to get the amino acid sequences, and later the coding sequences in `fasta` format.
But I've also seen others mention various other tools at "gff3 to CDS fasta". Can someone provide some insight into which tool would be suitable?
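For context, here is a minimal Python sketch of the step described above: pulling amino acid sequences out of an Augustus output file. It assumes Augustus was run with protein output enabled, so that the file contains comment blocks of the form `# protein sequence = [...]` after each transcript. The function names `augustus_proteins` and `to_fasta` are hypothetical, and this is not the code of `getAnnoFasta.pl`.

```python
import re


def augustus_proteins(gff_lines):
    """Yield (transcript_id, protein) pairs from Augustus GFF output lines.

    Assumes each protein is written as a comment block that starts with
    '# protein sequence = [' and ends with ']', possibly over several lines.
    """
    transcript_id = None
    buffer = None  # None while we are outside a protein block
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) == 9 and cols[2] in ("transcript", "mRNA"):
                # GFF3 style 'ID=g1.t1;Parent=g1' or plain 'g1.t1'
                match = re.search(r"ID=([^;]+)", cols[8])
                transcript_id = match.group(1) if match else cols[8].strip()
            continue
        text = line.lstrip("#").strip()
        if text.startswith("protein sequence = ["):
            buffer = ""
            text = text[len("protein sequence = ["):]
        elif buffer is None:
            continue  # some other comment line
        if text.endswith("]"):
            yield transcript_id, buffer + text[:-1]
            buffer = None
        else:
            buffer += text


def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped lines."""
    out = []
    for name, seq in records:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

With a real file this would be called as `to_fasta(augustus_proteins(open("augustus.gff")))`, where `augustus.gff` is a placeholder file name.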
You can have a look here: Extracting genomic feature sequences from GTF/GFF files with AGAT
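To illustrate what a GFF-to-CDS extraction involves, here is a rough Python sketch. It is not AGAT's implementation, and the function names (`read_fasta`, `transcript_of`, `extract_cds`) are hypothetical. It assumes that all CDS segments of a transcript lie on the same sequence and strand, and it ignores the phase column, so treat it as a way to understand or sanity-check the output of a dedicated tool rather than a replacement for one.

```python
import re
from collections import defaultdict

# Translation table for reverse-complementing minus-strand features
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = []
        elif line and name is not None:
            seqs[name].append(line)
    return {key: "".join(chunks) for key, chunks in seqs.items()}


def transcript_of(attributes):
    """Get the transcript ID from GFF3 (Parent=) or GTF (transcript_id) attributes."""
    for pattern in (r"Parent=([^;]+)", r'transcript_id "([^"]+)"'):
        match = re.search(pattern, attributes)
        if match:
            return match.group(1)
    return attributes.strip()


def extract_cds(gff_lines, genome):
    """Return {transcript_id: coding sequence} built from the CDS features."""
    parts = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        segment = (cols[0], int(cols[3]), int(cols[4]), cols[6])
        parts[transcript_of(cols[8])].append(segment)
    cds = {}
    for tid, segments in parts.items():
        segments.sort(key=lambda seg: seg[1])  # order by genomic start
        # GFF coordinates are 1-based and inclusive
        seq = "".join(genome[sid][start - 1:end] for sid, start, end, _ in segments)
        if segments[0][3] == "-":
            seq = seq.translate(COMPLEMENT)[::-1]
        cds[tid] = seq
    return cds
```

Comparing a few transcripts from this sketch against the tool's output is a quick way to confirm that the coordinates and strand handling are as expected.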
I have tried installing AGAT, but it failed the tests, for example: ~
After some searching online, I tried installing one module with `cpan Bio::DB::Fasta`, and the installation seems to run forever. The `perl` version is `v5.38.2`, and I was installing the dependencies manually with:
apt install libbio-perl-perl libclone-perl libgraph-perl liblwp-useragent-determined-perl libstatistics-r-perl libcarp-clan-perl libsort-naturally-perl libfile-share-perl libfile-sharedir libfile-sharedir-install-perl libyaml-perl liblwp-protocol-https-perl libterm-progressbar-perl
Right, I have heard there is an issue since Perl v5.36. You may have better luck using conda. The best option is to use the container; alternatively, you can downgrade your Perl version.