If you've got a related genome you might be able to scaffold those 4M contigs and bring the count down a bit. Many of them probably carry only exons rather than full genes, so their utility will be poor. This tool might be useful - https://github.com/malonge/RagTag
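A rough sketch of the command, assuming RagTag is installed and that reference.fa is the related genome (all filenames here are placeholders):

    # Scaffold the fragmented contigs against a related reference genome
    ragtag.py scaffold -t 8 -o ragtag_output reference.fa contigs.fa
    # The scaffolded assembly is written into the ragtag_output/ directory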
In general, though, you might be able to find long-read data for this genotype, which would give you an assembly at least 100x better.... if not, then maybe next time...
You need to use the -p option in your bowtie2 command line to specify a number of threads matching the cores you are requesting. You will also want to ask for more memory explicitly if the default allocation on your cluster is low (using the #SBATCH --mem=NNg option).
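A minimal sketch of the relevant script lines, assuming 8 cores and 32 GB suit your data (both values are placeholders):

    #SBATCH --cpus-per-task=8   # cores requested from SLURM
    #SBATCH --mem=32g           # explicit memory request; size it to your genome

    bowtie2 -p 8 ...            # -p must match --cpus-per-task above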
4 million contigs
Yikes! That is a fragmented assembly.
There is no need to make a SAM file unless you have a specific reason. Pipe the bowtie2 output directly into samtools to produce a sorted, indexed BAM file.
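Something along these lines, with the index and read filenames as placeholders:

    # Align and pipe straight to a coordinate-sorted BAM; no SAM hits the disk
    bowtie2 -p 8 -x genome_index -1 reads_1.fq.gz -2 reads_2.fq.gz \
        | samtools sort -@ 8 -o aligned.sorted.bam -
    samtools index aligned.sorted.bam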
You might want to run seff on your job ID after it has finished. Some SLURM configurations require you to prefix your command with some variant of srun; without srun the job won't use all the CPUs you've requested, leading to slow jobs. That might not be the case on your cluster, though. seff will tell you what percentage of your requested CPUs was actually used.
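For example:

    # After the job finishes (1234567 is a placeholder job ID):
    seff 1234567
    # Check the "CPU Efficiency" line; a value well below 100% suggests
    # the job did not actually use all the cores you requested.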