8 months ago
sansan96
Hello,
I am using funannotate for the first time on a non-model plant with a genome of approximately 3 Gbp. I have an assembled transcriptome and RNA-seq data, and I am currently running the training step with the transcriptome and the libraries, but I have doubts about the prediction step. Is it necessary to use `--busco_db` and `--busco_seed_species` when I have RNA-seq data? I checked the manual, but it is not very clear on this point.
The training script I am using is the following:
#!/bin/bash
#PBS -N rnaseq_train
#PBS -l nodes=1:ppn=20,vmem=150gb,walltime=700:00:00
#PBS -o output.log
#PBS -e error.log
#PBS -q ensam
#PBS -V
#Module
module load funannotate/2023.1
#Directory
cd $PBS_O_WORKDIR
#Funannotate with RNA-seq
funannotate train -i alt.fasta -o alt_fun \
--left CP148H_R1_001_P.fastq.gz CP17D_R1_001_P.fastq.gz \
--right CP148H_R2_001_P.fastq.gz CP17D_R2_001_P.fastq.gz \
--trinity ../Transcriptomes/Transcriptome.fasta \
--stranded no --cpus 20 --memory 150G --no_trimmomatic --max_intronlen 100000
The script I want to use for the prediction is the following:
#!/bin/bash
#PBS -N predict_fun
#PBS -l nodes=1:ppn=20,vmem=150gb,walltime=700:00:00
#PBS -o predict_output.log
#PBS -e predict_error.log
#PBS -q ensam
#PBS -V
#Module
module load funannotate/2023.1
#Directory
cd $PBS_O_WORKDIR
#Fun predict
funannotate predict \
-i alt.fasta \
-o alt_fun \
--optimize_augustus \
--repeats2evm \
--busco_db embryophyta \
--cpus 20 \
--max_intronlen 100000 \
--organism other \
--busco_seed_species rice
I chose rice as the seed species because it is the most similar to my organism of the plant species available. I would greatly appreciate any comments and suggestions.
Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code, or, for multi-line code, use either the code option on the formatting bar or fenced code blocks. Fenced code blocks are also useful for syntax highlighting. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.
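As a minimal sketch of the line-breaking advice, the snippet below compares a one-line command with the same command split using trailing backslashes. It uses `echo` so that it runs without funannotate installed; the variable names `one_line` and `multi_line` are illustrative only, and the flags are a subset of those in the post.

```shell
# A long command written on a single line (harder to read).
one_line=$(echo funannotate predict -i alt.fasta -o alt_fun --cpus 20 --max_intronlen 100000)

# The same command split across lines. Each continued line ends with a
# backslash, and nothing (not even a space) may follow that backslash.
multi_line=$(echo funannotate predict \
    -i alt.fasta \
    -o alt_fun \
    --cpus 20 \
    --max_intronlen 100000)

# The shell joins the continued lines, so both forms yield the same arguments.
if [ "$one_line" = "$multi_line" ]; then
    echo "identical"
fi
```

A stray space after a backslash is the usual reason a copy-pasted multi-line command fails, because the shell then treats the next line as a separate command.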