I have four files from paired-end reads: SRX10603399_SRR14240730_1.fastq.gz, SRX10603399_SRR14240730_2.fastq.gz, SRX10603417_SRR14240748_1.fastq.gz, and SRX10603417_SRR14240748_2.fastq.gz. I want to use the STAR aligner to align the four files and get two bam files. The code I have is producing four bam files. The following is my code:
module load software/star-2.7.9a
# define variables
index=/scratch/oknjav001/sarsCovRNA/star_index
# get our data files
FILES=/scratch/oknjav001/sarsCovRNA/pbmcs_healthyvscovid/pbmcs/fastq/*.fastq.gz
for f in $FILES
do
echo $f
base=$(basename $f .fastq.gz)
echo $base
STAR --runThreadN 3 --genomeDir $index --readFilesIn $f --outSAMtype BAM SortedByCoordinate --outTmpDir /scratch/oknjav001/sarsCovRNA/tempalign --quantMode GeneCounts \
--readFilesCommand zcat --outFileNamePrefix $base"_"
done
echo "done!"
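What is the problem here?
I want to pass in _R1.fq.gz and _R2.fq.gz to get one combined bam file from the forward and reverse reads. I want to do this in a loop.

You are getting 4 bam files because you are running STAR 4 times, once per fastq file. This is because $FILES holds a glob pattern that expands to 4 different fastq filenames, and the loop passes only one of them to --readFilesIn on each iteration. For paired-end data, STAR expects both mates of a sample in a single call: --readFilesIn followed by the read 1 file and then the read 2 file, separated by a space.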
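
One way to get a single bam per sample is to loop over the read 1 files only and hand each one to STAR together with its read 2 mate. The following is a minimal sketch, not a tested pipeline. It assumes the mates are named *_1.fastq.gz and *_2.fastq.gz as in the file list in the question (not _R1.fq.gz and _R2.fq.gz). The helper mate2_of and the variable fastq_dir are hypothetical names introduced here for illustration; the paths and STAR options are copied from the question.

```shell
#!/usr/bin/env bash
# Sketch: one STAR run per sample, pairing *_1.fastq.gz with *_2.fastq.gz.
# Paths and STAR options are taken from the question.
index=/scratch/oknjav001/sarsCovRNA/star_index
fastq_dir=/scratch/oknjav001/sarsCovRNA/pbmcs_healthyvscovid/pbmcs/fastq

# Hypothetical helper: derive the read 2 path from a read 1 path
# by swapping the _1.fastq.gz suffix for _2.fastq.gz.
mate2_of() {
    printf '%s\n' "${1%_1.fastq.gz}_2.fastq.gz"
}

# Loop over read 1 files only, so each sample is visited once.
for r1 in "$fastq_dir"/*_1.fastq.gz
do
    # If the glob matched nothing, it stays a literal pattern; skip it.
    [ -e "$r1" ] || continue

    r2=$(mate2_of "$r1")
    if [ ! -e "$r2" ]; then
        echo "missing mate for $r1, skipping" >&2
        continue
    fi

    # e.g. SRX10603399_SRR14240730
    base=$(basename "$r1" _1.fastq.gz)
    echo "aligning $base"

    # Both mates go to a single --readFilesIn, read 1 first.
    STAR --runThreadN 3 --genomeDir "$index" \
        --readFilesIn "$r1" "$r2" \
        --readFilesCommand zcat \
        --outSAMtype BAM SortedByCoordinate \
        --quantMode GeneCounts \
        --outTmpDir /scratch/oknjav001/sarsCovRNA/tempalign \
        --outFileNamePrefix "${base}_"
done
echo "done!"
```

With the four files from the question, the loop body should run twice, once per SRX/SRR pair, so two sorted bam files are written. Quoting "$r1" and "$r2" keeps the command intact if a path ever contains spaces.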