Hi all,
I am new to BUSCO and have been struggling with this issue for 2-3 weeks. In short, the Augustus run invoked by BUSCO does not recognize any genes. Below is my shell command line.
${busco} --config /public/home/lvzhenming/soft/tools/busco-5.0.0/config/config.ini -f -c ${Ncores} --offline -i /home/lvzhenming/public/home/lvzhenming/project/sjl/reference/GWHACFF00000000.genome.fasta -o sjl_batchRef_busco_v --out_path /public/home/lvzhenming/project/sjl/reference/resultsFromBusco4Ref -m genome -l /public/home/lvzhenming/project/sjl/reference/busco_downloads/vertebrata_odb10
The versions of the main software used in my script are: BLAST v2.3.0+, BUSCO v4.1.4, AUGUSTUS v3.3.2. The input, GWHACFF00000000.genome.fasta, comes from a fish and was downloaded from a published paper. For the lineage dataset, besides vertebrata_odb10, I also tried eukaryota_odb10 and actinopterygii_odb10, but both failed with the same error.
Part of the busco.log file generated by BUSCO is shown below.
INFO:busco.run_BUSCO ***** Start a BUSCO v4.1.4 analysis, current time: 03/26/2021 17:13:12 *****
DEBUG:busco.ConfigManager Getting config file
INFO:busco.ConfigManager Configuring BUSCO with /public/home/lvzhenming/soft/tools/busco-5.0.0/config/config.ini
INFO:busco.BuscoConfig Mode is genome
INFO:busco.BuscoConfig Input file is /home/lvzhenming/public/home/lvzhenming/project/sjl/reference/GWHACFF00000000.genome.fasta
DEBUG:busco.BuscoConfig State of BUSCO config before run:
DEBUG:busco.BuscoConfig {'_allow_no_value': False,
...
INFO:busco.BuscoDownloadManager Using local lineages directory /public/home/lvzhenming/project/sjl/reference/busco_downloads/vertebrata_odb10
DEBUG:busco.BuscoAnalysis Check all required tools are accessible...
DEBUG:busco.BuscoAnalysis Checking dataset for HMM profiles
INFO:busco.BuscoAnalysis Running BUSCO using lineage dataset vertebrata_odb10 (eukaryota, 2021-02-19)
DEBUG:busco.BuscoTools Tool: makeblastdb
DEBUG:busco.BuscoTools Version: 2.3.0+
INFO:busco.Toolset Running 1 job(s) on makeblastdb, starting at 03/26/2021 17:13:15
INFO:busco.BuscoTools Creating BLAST database with input file
DEBUG:busco.Toolset cmd call: /public/home/lvzhenming/soft/tools/ncbi-blast-2.3.0+/bin/makeblastdb -in /home/lvzhenming/public/home/lvzhenming/project/sjl/reference/GWHACFF00000000.genome.fasta -dbtype nucl -out /public/home/lvzhenming/project/sjl/reference/resultsFromBusco4Ref/sjl_batchRef_busco_v/blast_db/GWHACFF00000000.genome.fasta
INFO:busco.Toolset [makeblastdb] 1 of 1 task(s) completed
INFO:busco.BuscoTools Running a BLAST search for BUSCOs against created database
DEBUG:busco.BuscoTools Tool: tblastn
DEBUG:busco.BuscoTools Version: 2.3.0+
INFO:busco.Toolset Running 1 job(s) on tblastn, starting at 03/26/2021 17:13:27
DEBUG:busco.Toolset cmd call: /public/home/lvzhenming/soft/tools/ncbi-blast-2.3.0+/bin/tblastn -evalue 0.001 -num_threads 12 -query /public/home/lvzhenming/project/sjl/reference/busco_downloads/vertebrata_odb10/ancestral -db /public/home/lvzhenming/project/sjl/reference/resultsFromBusco4Ref/sjl_batchRef_busco_v/blast_db/GWHACFF00000000.genome.fasta -out /public/home/lvzhenming/project/sjl/reference/resultsFromBusco4Ref/sjl_batchRef_busco_v/run_vertebrata_odb10/blast_output/tblastn.tsv -outfmt 7
INFO:busco.Toolset [tblastn] 1 of 1 task(s) completed
INFO:busco.GenomeAnalysis Running Augustus gene predictor on BLAST search results.
INFO:busco.BuscoTools Running Augustus prediction using human as species:
DEBUG:busco.BuscoTools Tool: augustus
DEBUG:busco.BuscoTools Version: 3.3.2
INFO:busco.Toolset Running 3971 job(s) on augustus, starting at 03/26/2021 19:43:15
INFO:busco.Toolset [augustus] 3971 of 3971 task(s) completed
INFO:busco.BuscoTools Extracting predicted proteins...
**ERROR:busco.run_BUSCO Augustus did not recognize any genes matching the dataset vertebrata_odb10 in the input file. If this is unexpected, check your input file and your installation of Augustus**
Thanks a lot for any advice!
There is something odd here: you seem to have a version conflict. The log reports BUSCO v4.1.4, but your config path contains "busco-5.0.0". Is there a reason you need to run the older version, e.g. to reproduce published scores? If not, try installing the latest version, 5.0.0, via conda and running it with MetaEUK as the gene predictor.
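To make the mismatch concrete, the version recorded in busco.log can be compared with the version implied by the config path. The following is only a minimal sketch: the helper names busco_log_version and busco_path_version are hypothetical, and the sketch assumes the log contains a "Start a BUSCO vX.Y.Z analysis" line and that the install directory is named busco-X.Y.Z, as in the post above.

```shell
# Hypothetical helpers (not part of BUSCO) for spotting a version mismatch.

# Print the BUSCO version recorded in a busco.log file.
busco_log_version() {
    sed -n 's/.*Start a BUSCO v\([0-9][0-9.]*\) analysis.*/\1/p' "$1" | head -n 1
}

# Print the version embedded in a path such as .../busco-5.0.0/config/config.ini.
busco_path_version() {
    printf '%s\n' "$1" | sed -n 's|.*busco-\([0-9][0-9.]*\)/.*|\1|p'
}

# Example with the config path from this thread; adjust both values to your run.
log="${BUSCO_LOG:-busco.log}"
config="/public/home/lvzhenming/soft/tools/busco-5.0.0/config/config.ini"
if [ -f "$log" ]; then
    echo "log reports:  $(busco_log_version "$log")"
    echo "path implies: $(busco_path_version "$config")"
fi
```

If the two values differ, running `command -v busco` and `busco --version` in the same shell should show which executable is actually being picked up.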
If you use Augustus, it does not seem to cope well with macOS/BSD or with a large number of parallel processes. Try running with -c 1 (10 at most).
Also, the lineage dataset could be broken. Try downloading it again and testing with different lineage datasets, such as eukaryota or metazoa. If those work, the original lineage file is probably the broken one.
See: https://gitlab.com/ezlab/busco/-/issues/222
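Since the error message also points at the Augustus installation, a quick sanity check of the Augustus configuration directory may help. This is a rough sketch, not a BUSCO feature: check_augustus_species is a hypothetical helper, and it assumes the usual Augustus layout in which AUGUSTUS_CONFIG_PATH contains a species/ subdirectory with one folder per species ("human" is the species named in the log). The writability test reflects my understanding that BUSCO needs write access there; treat it as an assumption.

```shell
# Hypothetical helper: check_augustus_species CONFIG_DIR SPECIES
# Prints "ok: <dir>" and returns 0 if the species directory looks usable,
# otherwise prints the first problem found and returns 1.
check_augustus_species() {
    config_dir=$1
    species=$2
    if [ -z "$config_dir" ]; then
        echo "AUGUSTUS_CONFIG_PATH is not set"
        return 1
    fi
    if [ ! -d "$config_dir/species/$species" ]; then
        echo "missing species directory: $config_dir/species/$species"
        return 1
    fi
    # Assumption: BUSCO writes into the species directory, so it must be writable.
    if [ ! -w "$config_dir/species" ]; then
        echo "species directory is not writable: $config_dir/species"
        return 1
    fi
    echo "ok: $config_dir/species/$species"
}

# Check the species BUSCO reported in the log ("human").
check_augustus_species "${AUGUSTUS_CONFIG_PATH:-}" human || true
```

If this reports a problem, fixing the Augustus setup is worth trying before switching gene predictors.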
Thank you very much for your advice.
Regarding the BUSCO version problem: no matter how I install it (via conda or from source), the log file from busco-5.0.0 always shows "Start a BUSCO v4.1.4 analysis". I tried both v4.1.4 and v5.0.0 and got the same error.
Regarding the OS: I ran the script on a Linux server with one core (-c 1) and got the same error.
Regarding the different lineage files: I already tried eukaryota, metazoa and vertebrata and got the same error.
Regarding MetaEUK: I see it as the last option for debugging. I am trying it now...