Hi everybody, I am still a newbie in bioinformatics and I am stuck on MaSuRCA 3.3.0, so any help would be more than welcome. I am trying to assemble a bacterial genome from MiSeq paired-end reads with MaSuRCA 3.3.0, without the grid options. My configuration file looks like this:
PE= aa 519 844 /home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R1_001.fastq /home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R2_001.fastq
#Illumina mate pair reads supplied as <two-character prefix> <fragment mean> <fragment stdev> <forward_reads> <reverse_reads>
#JUMP= sh 3600 200
#pacbio OR nanopore reads must be in a single fasta or fastq file with absolute path, can be gzipped
#if you have both types of reads supply them both as NANOPORE type
#PACBIO=/FULL_PATH/pacbio.fa
#NANOPORE=/FULL_PATH/nanopore.fa
#Other reads (Sanger, 454, etc) one frg file, concatenate your frg files into one if you have many
#OTHER=/FULL_PATH/file.frg
END
PARAMETERS
#set this to 1 if your Illumina jumping library reads are shorter than 100bp
#EXTEND_JUMP_READS=0
#this is k-mer size for deBruijn graph values between 25 and 127 are supported, auto will compute the optimal size based on the read data and GC content
GRAPH_KMER_SIZE = auto
#set this to 1 for all Illumina-only assemblies
#set this to 0 if you have more than 15x coverage by long reads (Pacbio or Nanopore) or any other long reads/mate pairs (Illumina MP, Sanger, 454, etc)
USE_LINKING_MATES = 1
#specifies whether to run mega-reads correction on the grid
#USE_GRID=0
#specifies grid engine to use SGE or SLURM
#GRID_ENGINE=SLURM
#specifies queue (for SGE) or partition (for SLURM) to use when running on the grid MANDATORY
#GRID_QUEUE=all.q
#batch size in the amount of long read sequence for each batch on the grid
#GRID_BATCH_SIZE=300000000
#use at most this much coverage by the longest Pacbio or Nanopore reads, discard the rest of the reads
#LHE_COVERAGE=25
#set to 1 to only do one pass of mega-reads, for faster but worse quality assembly
MEGA_READS_ONE_PASS=0
#this parameter is useful if you have too many Illumina jumping library mates. Typically set it to 60 for bacteria and 300 for the other organisms
#LIMIT_JUMP_COVERAGE = 60
#these are the additional parameters to Celera Assembler. do not worry about performance, number or processors or batch sizes -- these are computed automatically.
#set cgwErrorRate=0.25 for bacteria and 0.1<=cgwErrorRate<=0.15 for other organisms.
CA_PARAMETERS = cgwErrorRate=0.25
#minimum count k-mers used in error correction 1 means all k-mers are used. one can increase to 2 if Illumina coverage >100
KMER_COUNT_THRESHOLD = 1
#whether to attempt to close gaps in scaffolds with Illumina data
CLOSE_GAPS=1
#auto-detected number of cpus to use
NUM_THREADS = 20
#this is mandatory jellyfish hash size -- a safe value is estimated_genome_size*estimated_coverage
JF_SIZE = 460000000
#set this to 1 to use SOAPdenovo contigging/scaffolding module. Assembly will be worse but will run faster. Useful for very large (>5Gbp) genomes from Illumina-only data
SOAP_ASSEMBLY=0
END
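As a rough sketch of the JF_SIZE rule of thumb quoted in the comment above (estimated_genome_size*estimated_coverage), the arithmetic can be done directly in the shell. The genome size and coverage below are assumed, illustrative values, not measurements from this dataset; they were picked only because their product matches the 460000000 used in the configuration.

```shell
#!/bin/sh
# Assumed, illustrative inputs (hypothetical, not taken from the data):
genome_size=4600000   # estimated genome size in bp (about 4.6 Mbp)
coverage=100          # estimated read coverage

# Rule of thumb from the config comment: JF_SIZE = genome size * coverage
jf_size=$((genome_size * coverage))
echo "JF_SIZE = $jf_size"
```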
I get this error:
[Mon Mar 4 12:18:14 EET 2019] Overlap/unitig failed, check output under CA/ and runCA1.out
Looking at runCA1.out with less, I get:
----------------------------------------
END Mon Mar 4 12:18:14 2019 (0 seconds)
Created 13 overlap jobs. Last batch '001', last job '000013'.
----------------------------------------
START Mon Mar 4 12:18:14 2019
sbatch -D `pwd` -J "ovl_genome[1-13]" -a 1-13 \
  -o /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/%A_%a.out \
  /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/overlap.sh
sh: 1: sbatch: not found
----------------------------------------
END Mon Mar 4 12:18:14 2019 (0 seconds)
ERROR: Failed with signal 127
================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 1613.
main::caFailure("Failed to submit batch jobs.") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 87
main::submitBatchJobs(" -D `pwd` -J \"ovl_genome[1-13]\" -a 1-13 \\\x{a} -o /home1/casca"..., "ovl_genome[1-13]") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 3809
main::createOverlapJobs("normal") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 6523
----------------------------------------
Failure message:
Failed to submit batch jobs.
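
The "sbatch: not found" line in the log above suggests that the shell could not find the SLURM submit command, and 127 is the exit status a POSIX shell returns when a command is not found. A minimal way to check this on the machine running the assembly might look like the sketch below; check_cmd is a hypothetical helper name used only for illustration, not part of MaSuRCA or SLURM.

```shell
#!/bin/sh
# check_cmd is a hypothetical helper: report whether a command is on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# Is the SLURM submit command visible to the shell?
check_cmd sbatch

# Show the exit status a shell returns for a missing command.
# no_such_command_xyz is a made-up name that should not exist anywhere.
sh -c 'no_such_command_xyz' 2>/dev/null
echo "exit status for a missing command: $?"
```

If sbatch is reported as not found, the job submission step shown in the log cannot succeed on that machine as it is currently set up.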