Hi everyone, I'm currently running the busco.py program to find, in my genomes, the orthologous genes present in all insects. To do so, I created a bash file to run it on the cluster. Here is the file:
#!/bin/bash
#SBATCH -t 24:00:00
#SBATCH -e path/busco_job.log/busco_job.error
#SBATCH -o path/busco_job.log/busco_job.out
date;hostname;pwd
ASSEMBLY=path/genome.fasta
LINEAGE=path/hymenoptera_odb9
SAMP=my_species
NAME=$SAMP'_BUSCO_v3'
#########################################
# define PATH to software used by BUSCO #
#########################################
#Augustus
export PATH=/bin:/usr/bin:/usr/remote/bin:path/Augustus3.3/bin:path/Augustus3.3/scripts
# hmmer
PATH=$PATH:/path/hmmer-3.2.1/bin
# blast and python
PATH=$PATH:/path/ncbi-blast-2.8.1+/bin
PATH=$PATH:/usr/bin
# augustus
export AUGUSTUS_CONFIG_PATH=/path/Augustus3.3/config
################
# Command line #
################
export PATH=/usr/remote/Python-3.6.5/bin:$PATH
PATH=$PATH:/usr/bin
out_path = path/run_busco
export PYTHONPATH=$PYTHONPATH:~/path/site-packages
python3 /path/busco-masterV3/scripts/run_BUSCO.py -i $ASSEMBLY -o $NAME -l $LINEAGE -m geno -f
The main issue is that busco.py by default writes its output files into the directory from which it is run, but I would like to change the directory where the output files are written. The documentation says that the out_path option can be set in two ways: either modify the path directly in the config.ini file, or provide input parameters through the command line, which override those defined in config.ini (this is the solution I want to use). But it does not work, even if I write out_path = my_desired_path in the run.sh file.
Here is the documentation concerning the path:
In this file (config.ini), you must declare the paths to all dependencies (see below) and you can optionally define the required input parameters (described later in this document). Note: providing input parameters through the command line will override those defined in config.ini. The config.ini.default file is extensively commented and self explanatory.
Here is the head of the content of config.ini:
# BUSCO specific configuration
# It overrides default values in code and dataset cfg, and is overridden by arguments in command line
# Uncomment lines when appropriate
[busco]
# Input file
;in = ./sample_data/target.fa
# Run name, used in output files and folder
;out = SAMPLE
# Where to store the output directory
;out_path = ./sample_data
# Path to the BUSCO dataset
;lineage_path = ./sample_data/example
# Which mode to run (genome / protein / transcriptome)
;mode = genome
# How many threads to use for multithreaded steps
;cpu = 1
# Domain for augustus retraining, eukaryota or prokaryota
# Do not change this unless you know exactly why !!!
;domain = eukaryota
# Force rewrite if files already exist (True/False)
;force = False
# Restart mode (True/False)
;restart = False
# Blast e-value
;evalue = 1e-3
So I was wondering why, even if I write out_path = /path/run_busco in my script, the output files are still written to ./sample_data?
Thank you for your help.
Hello,
I don't know the program, but I guess you have to remove the ; before the out_path parameter in the config file, so that whatever you declare there has an effect.
fin swimmer
Yes, I removed the ; but the issue is still the same.
It would be odd if the config file were using ; in some way. But in that case, can you specify the directory you want the output to go to in ;out_path = /path_to_dir_you_want
The PHP config file php.ini, for example, uses ; to comment out parameters.
Yep, it works if I modify it directly in the config.ini file, of course, but the output path will change depending on the script I use...
I have around 100 scripts to run, with a unique path for each job, which is why I want to set out_path directly in my script and not in the config.ini, which does not change.
Have the script generate/modify the config.ini.
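For context on why the line in the job script has no effect: bash reads out_path = path/run_busco (with spaces around the =) as a call to a command named out_path, not as an assignment, and nothing in the script passes that value to BUSCO anyway. A minimal sketch of the generate-a-config approach follows. The function name make_job_config and its arguments are hypothetical, the location of the template config.ini is an assumption, and the BUSCO_CONFIG_FILE environment variable is how I understand BUSCO v3 can be pointed at an alternative config file, so please check that against the user guide for your version.

```shell
#!/bin/bash
# Hypothetical helper: copy a template config.ini and set out_path for one job.
#   $1 = template config.ini, $2 = desired output directory, $3 = per-job config to write
make_job_config() {
    local template=$1 out_path=$2 job_config=$3
    # Rewrite the out_path line whether or not it is still commented out with ';'.
    # Note: this sed expression assumes the path contains no '|' or '&' characters.
    sed -E "s|^;?[[:space:]]*out_path[[:space:]]*=.*|out_path = ${out_path}|" \
        "$template" > "$job_config"
}

# Possible use inside the job script, before the python3 call
# (the template location below is an assumption, adapt it to your install):
#   make_job_config /path/busco-masterV3/config/config.ini /path/run_busco "config_${SAMP}.ini"
#   export BUSCO_CONFIG_FILE="$PWD/config_${SAMP}.ini"   # assumption: read by BUSCO v3
```

Each of the ~100 job scripts then gets its own config file, so the shared config.ini is never modified and concurrent jobs do not overwrite each other's settings.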