Hi,
I'm totally new to running scripts, and I'm struggling to get going with submitting a job to our university supercomputer.
Basically, I want to index a FASTA reference genome with BWA before I perform my mapping.
The script provided by the university, to submit jobs to the supercomputer, is not that clear to me, and I was hoping that someone might be able to help. The documentation gives the following, which I can follow:
#!/bin/sh
# Grid Engine options (lines prefixed with #$)
#$ -N hello
#$ -cwd
#$ -l h_rt=00:05:00
#$ -l h_vmem=1G
# These options are:
# job name: -N
# use the current working directory: -cwd
# runtime limit of 5 minutes: -l h_rt
# memory limit of 1 Gbyte: -l h_vmem
However, I have no idea where to insert the following command:
bwa index -a bwtsw Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
Also, this is probably really daft, but I also have no idea of how to reference the location of the file, so that the script knows where it is...
Sorry if this is all very basic stuff, and thanks for any help!
Matthew
You would add the bwa command line at the end of the script above (keeping in mind $PATH considerations). The initial part, the lines starting with #$, sets up parameters for your job scheduling system. You would need to provide the correct parameters; this is just a skeleton example. I recommend that you follow a basic UNIX tutorial. Investing some time in it will be forever useful.
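To make that concrete, here is a rough sketch of what the finished job script might look like. The job name, the 4 hour runtime and the 8G memory request are guesses for illustration, not values from your university's documentation, so check what your site recommends. The sketch also assumes the FASTA file sits in the directory you submit from (because of -cwd) and that bwa is already on your $PATH.

```shell
#!/bin/sh
# Grid Engine options (lines prefixed with #$)
# The job name, runtime and memory values below are illustrative guesses.
#$ -N bwa_index
#$ -cwd
#$ -l h_rt=04:00:00
#$ -l h_vmem=8G

# Reference genome to index. With -cwd, a bare file name is looked up in
# the directory the job was submitted from.
REF="Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz"

# Only run the indexing step if bwa can be found and the file exists;
# otherwise report the problem in the job's error log.
if ! command -v bwa >/dev/null 2>&1; then
    echo "bwa is not on PATH; load or install it before submitting" >&2
elif [ ! -f "$REF" ]; then
    echo "cannot find $REF in $(pwd)" >&2
else
    bwa index -a bwtsw "$REF"
fi
```

If you saved this as, say, index_job.sh (a made-up name), you would submit it with qsub index_job.sh from the directory that holds the FASTA file.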
Thanks for the quick reply! So when I add the bwa command, when I put in the file name, do I just have to ensure that I include the full path so that it knows where the file is?
You could include the full path for now, while you learn about the Unix file system and relative paths.
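As a small illustration of the difference, the sketch below builds a throwaway directory and refers to the same file both ways. The directory layout and the file name genome.fa.gz are hypothetical stand-ins for your own project folder and reference file.

```shell
# Hypothetical layout: a scratch directory standing in for your project folder.
WORKDIR=$(mktemp -d)
mkdir -p "$WORKDIR/reference"
: > "$WORKDIR/reference/genome.fa.gz"   # empty placeholder file

# Absolute (full) path: starts with / and works from any directory.
ABS_REF="$WORKDIR/reference/genome.fa.gz"

# Relative path: resolved against the current directory. In a job script,
# -cwd makes that the directory you ran qsub from.
cd "$WORKDIR"
REL_REF="reference/genome.fa.gz"

# Both names point at the same file as long as we stay in $WORKDIR.
[ -f "$ABS_REF" ] && echo "absolute path found"
[ -f "$REL_REF" ] && echo "relative path found"
```

The full path is the safer choice while you are getting started, because it does not depend on which directory the job happens to start in.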
Thanks so much. You've been really helpful!
A runtime limit of 5 minutes is very short for indexing a human genome. Raise h_rt, or the scheduler will kill the job before bwa finishes.