Hello, I'm very new to eukaryotic genome annotation. Searching online, I found that the Comparative Annotation Toolkit (CAT) is a good tool for this. I installed CAT with Anaconda in a separate python=2.7 environment using the following commands; BUSCO is installed in the base environment.
$ conda create -n catpy27 python=2.7 pip
$ conda activate catpy27
$ conda install -c bioconda comparative-annotation-toolkit
Now I don't understand what to do next. Could anyone familiar with this tool suggest a good tutorial or manual for eukaryotic genome annotation using CAT?
Hi, thank you very much for your prompt reply. I already have BUSCO output, which includes the AUGUSTUS results as well as the BUSCO ortholog genes. What I can't work out is how to relate my BUSCO output (ortholog genes, AUGUSTUS output, etc.) to the Comparative Annotation Toolkit (CAT).
How many genomes are you annotating, and are there high-quality reference genomes/annotations available for CAT?
One way to relate your BUSCO output to CAT is to reuse the AUGUSTUS parameters that BUSCO retrained. Add the --long option when you run BUSCO on an assembly so that it optimizes the AUGUSTUS retraining during the run. Then reuse the retrained parameters as a custom species in CAT by specifying --augustus-species CUSTOM_SPE. Note that you need to move the files from the retraining_parameters directory in the BUSCO results to the species directory in your AUGUSTUS config path and rename them, replacing the prefix with your species name; for example, augustus/config/species/CUSTOM_SPE/CUSTOM_SPE_exon_probs.pbl and augustus/config/species/CUSTOM_SPE/CUSTOM_SPE_intron_probs.pbl.
If you run CAT through Docker, you need to rebuild the image to include your custom parameter files. See the discussion: Custom Augustus training parameters for --augustus-species....
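The copy-and-rename step could be scripted roughly like this. It is only a sketch under some assumptions: the function name install_busco_params is made up for illustration, the retraining_parameters directory is assumed to contain exactly one *_parameters.cfg file whose name carries the BUSCO prefix, and that .cfg file is assumed to mention the other parameter files by their old names (so the prefix is replaced inside it as well). Check the result against your own BUSCO output before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: copy BUSCO retraining parameters into an AUGUSTUS species directory,
# replacing the BUSCO file prefix with a custom species name.
# install_busco_params is a hypothetical helper, not part of BUSCO, AUGUSTUS or CAT.

install_busco_params() {
    local src_dir config_dir species dest cfg old_prefix f base
    src_dir="$1"      # BUSCO retraining_parameters directory
    config_dir="$2"   # AUGUSTUS config directory (the one containing species/)
    species="$3"      # custom species name, e.g. CUSTOM_SPE
    dest="$config_dir/species/$species"

    # Assumption: a single <prefix>_parameters.cfg exists; derive the old prefix from it.
    cfg=$(ls "$src_dir"/*_parameters.cfg 2>/dev/null | head -n 1)
    [ -n "$cfg" ] || { echo "no *_parameters.cfg found in $src_dir" >&2; return 1; }
    old_prefix=$(basename "$cfg" _parameters.cfg)

    mkdir -p "$dest" || return 1

    # Copy every file that starts with the old prefix, swapping in the species name.
    for f in "$src_dir/$old_prefix"*; do
        [ -f "$f" ] || continue
        base=$(basename "$f")
        cp "$f" "$dest/$species${base#"$old_prefix"}" || return 1
    done

    # Assumption: the .cfg refers to the other files by name, so rewrite the prefix in it.
    sed "s/$old_prefix/$species/g" "$cfg" > "$dest/${species}_parameters.cfg"
}
```

You would then call it with your own paths (the ones here are placeholders), e.g. install_busco_params /path/to/busco_run/retraining_parameters "$AUGUSTUS_CONFIG_PATH" CUSTOM_SPE, and pass --augustus-species CUSTOM_SPE to CAT afterwards. The script copies rather than moves, so the original BUSCO output stays intact.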