3.8 years ago
Amaranta Remedios
Hi, I am trying to run a sequence alignment with STAR. I have a total of 28 paired-end files, 14 R1 and 14 R2. My files are named like this:
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam_R1.fq
The code I have written to run STAR so far is this:
#!/bin/bash --login
#$ -cwd
#$ -l short
#$ -pe smp.pe 12
module load apps/intel-18.0/star/2.7.2b
#get only files names
for i in *R1.fq; do name=$(basename ${i} _R1.fq);
STAR --genomeDir /scratch/STAR_index \ #Path to the index generated previously
--runThreadN 12 \ #Number of cores
--readFilesIn ${name}_R1.fq ${name}_R2.fq \ #Path to the input files (forward and reverse)
--outFileNamePrefix ${name}_aligned_transcriptome \ #Prefix to the output files
--outSAMtype BAM SortedByCoordinate \
--limitBAMsortRAM 31000000000
done
Yet I keep getting this error:
/opt/site/sge/default/spool/node403/job_scripts/1635775: line 17: syntax error near unexpected token `('
/opt/site/sge/default/spool/node403/job_scripts/1635775: line 17: ` --readFilesIn ${name}_R1.fq ${name}_R2.fq \ #Path to the input files (forward and reverse)'
This is the output of the part of the code that should feed into line 17
for i in *R1.fq; do name=$(basename ${i} _R1.fq); echo $name
> done
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC2a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC2b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC3a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC3b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC4a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC4b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC5a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC5b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC6a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC6b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam
And this is line 17 itself, which looks alright, so I don't understand why this loop is not working:
for i in *R1.fq; do name=$(basename ${i} _R1.fq); echo ${name}_R1.fq; done
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC2a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC2b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC3a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC3b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC4a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC4b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC5a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC5b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC6a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC6b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam_R1.fq
Which is exactly how the files are called in my folder.
I hope someone can help me!
You seem to be pointing to a salmon transcriptome index instead of a STAR genome index.
I corrected that; I can see how it would be confusing. Thanks, it really is a STAR index.
Use a workflow manager like Nextflow or Snakemake.
I'm not sure bash likes the strings after a \
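To expand on that: a backslash only continues a line when it is the very last character on it. In the script above each backslash is followed by a space and a comment, so it escapes the space instead of the newline. The command ends at the end of that line, the text after # is passed along as ordinary words rather than treated as a comment, and the unquoted ( in "(forward and reverse)" is what triggers the syntax error. One possible way to keep a comment next to each option is to collect the arguments in a bash array, where trailing comments are allowed and no backslashes are needed. This is only a sketch: build_star_args and star_args are made-up names, the glob is written as *_R1.fq, and the scheduler header and module load lines from the original script would stay at the top unchanged.

```shell
#!/bin/bash
# Sketch: build the STAR arguments in a bash array so that each option can
# carry its own comment without any backslash line continuations.

shopt -s nullglob   # if no *_R1.fq files exist, the loop body never runs

# build_star_args is a hypothetical helper name, not part of STAR.
# It fills the global array star_args for one sample prefix.
build_star_args() {
    local name=$1
    star_args=(
        --genomeDir /scratch/STAR_index                      # path to the index generated previously
        --runThreadN 12                                      # number of cores
        --readFilesIn "${name}_R1.fq" "${name}_R2.fq"        # input files (forward and reverse)
        --outFileNamePrefix "${name}_aligned_transcriptome"  # prefix for the output files
        --outSAMtype BAM SortedByCoordinate
        --limitBAMsortRAM 31000000000
    )
}

for i in *_R1.fq; do
    name=$(basename "$i" _R1.fq)   # strip the _R1.fq suffix to get the sample prefix
    build_star_args "$name"
    STAR "${star_args[@]}"         # each array element is passed as one argument
done
```

If you would rather keep the backslash style, the other option is to move every comment onto its own line above the STAR command and make sure nothing, not even a space, follows each trailing backslash.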