2.6 years ago
biology_inform
Hello, I have a question about STAR alignment with multiple files. I am running it on a server, so my memory is limited (unfortunately I have some restrictions). I have 12 files. Can I align them in two groups of 6, with the first 6 in one run and the other 6 in a second run? Would that cause problems for downstream steps such as quantification and differential gene expression analysis?
#!/bin/bash
#SBATCH -p hamsi
#SBATCH -A proj2
#SBATCH -c 28
#SBATCH -N 1
#SBATCH -t 0-4:00
#SBATCH -J star_alignment
#SBATCH -o star_alignment_%j.out
#SBATCH -e star_alignment_%j.err
cd ~/GSE121634_HCC4006
for file in $(cat ./acc_number.txt);
do
~/STAR-2.7.10a/bin/Linux_x86_64_static/./STAR --runThreadN 28 \
--genomeDir ~/index/ \
--readFilesIn ./${file}_1.fastq.gz ./${file}_2.fastq.gz \
--readFilesCommand zcat \
--outFileNamePrefix ~/GSE121634_HCC4006/${file} \
--outSAMtype BAM SortedByCoordinate
done
This gives me a memory error, so I then tried it without the for loop:
#!/bin/bash
#SBATCH -p hamsi
#SBATCH -A proj2
#SBATCH -c 28
#SBATCH -N 1
#SBATCH -t 0-4:00
#SBATCH -J star_alignment
#SBATCH -o star_alignment_%j.out
#SBATCH -e star_alignment_%j.err
cd ~/GSE121634_HCC4006/Alignment
~/tools/STAR-2.7.10a/bin/Linux_x86_64_static/./STAR --runThreadN 28 \
--genomeDir ~/index/ \
--readFilesIn SRR8088215_1.fastq.gz,SRR8088216_1.fastq.gz,SRR8088217_1.fastq.gz,SRR8088218_1.fastq.gz,SRR8088219_1.fastq.gz,SRR8088220_1.fastq.gz,SRR8088221_1.fastq.gz,SRR8088222_1.fastq.gz,SRR8088223_1.fastq.gz,SRR8088224_1.fastq.gz,SRR8088225_1.fastq.gz,SRR8088226_1.fastq.gz,SRR8088227_1.fastq.gz,SRR8088228_1.fastq.gz,SRR8088229_1.fastq.gz,SRR8088230_1.fastq.gz,SRR8088231_1.fastq.gz,SRR8088232_1.fastq.gz SRR8088215_2.fastq.gz,SRR8088216_2.fastq.gz,SRR8088217_2.fastq.gz,SRR8088218_2.fastq.gz,SRR8088219_2.fastq.gz,SRR8088220_2.fastq.gz,SRR8088221_2.fastq.gz,SRR8088222_2.fastq.gz,SRR8088223_2.fastq.gz,SRR8088224_2.fastq.gz,SRR8088225_2.fastq.gz,SRR8088226_2.fastq.gz,SRR8088227_2.fastq.gz,SRR8088228_2.fastq.gz,SRR8088229_2.fastq.gz,SRR8088230_2.fastq.gz,SRR8088231_2.fastq.gz,SRR8088232_2.fastq.gz \
--readFilesCommand zcat \
--outSAMtype BAM SortedByCoordinate
So will running them separately create a problem? Thanks in advance.
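Why do you need to run all the samples together? Align the paired reads for each sample separately, one sample at a time. Each sample is aligned independently against the same index, so splitting the samples across runs should not affect quantification or differential expression analysis.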
If required, you can merge the BAM files after alignment.
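A per-sample run could be sketched roughly as below. This is only an illustration: `align_one`, `align_batch`, `STAR_BIN`, `GENOME_DIR`, `THREADS` and `DRY_RUN` are hypothetical names, and the STAR options are the ones from the question. By default the helper only prints each command, so you can check it before running anything.

```shell
#!/bin/sh
# Sketch only: STAR_BIN, GENOME_DIR, THREADS, DRY_RUN, align_one and
# align_batch are hypothetical names, not part of STAR itself.
STAR_BIN="${STAR_BIN:-STAR}"
GENOME_DIR="${GENOME_DIR:-$HOME/index}"
THREADS="${THREADS:-28}"

# Build the STAR command for one paired-end sample; print it when
# DRY_RUN=1 (the default here), run it otherwise.
align_one() {
    sample="$1"
    set -- "$STAR_BIN" --runThreadN "$THREADS" \
        --genomeDir "$GENOME_DIR" \
        --readFilesIn "./${sample}_1.fastq.gz" "./${sample}_2.fastq.gz" \
        --readFilesCommand zcat \
        --outFileNamePrefix "./${sample}_" \
        --outSAMtype BAM SortedByCoordinate
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Align every accession listed (one per line) in the given file,
# skipping blank lines.
align_batch() {
    while IFS= read -r acc || [ -n "$acc" ]; do
        if [ -n "$acc" ]; then
            align_one "$acc"
        fi
    done < "$1"
}
```

To run in two groups of six, something like `split -l 6 acc_number.txt batch_` would give `batch_aa` and `batch_ab`, and each SLURM job could then call `DRY_RUN=0 align_batch batch_aa` (or `batch_ab`). Because every sample still gets its own BAM file, the grouping itself should not matter downstream.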
You could also try writing unsorted BAM output and then sorting it with samtools.
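The unsorted-then-sort route might look like the sketch below. It assumes samtools is on the `PATH`; `align_unsorted_then_sort`, `run`, `SORT_THREADS`, `SORT_MEM`, `GENOME_DIR` and `DRY_RUN` are hypothetical names, and the output name `<sample>_sorted.bam` is just an example. With `--outSAMtype BAM Unsorted`, STAR writes `<prefix>Aligned.out.bam`.

```shell
#!/bin/sh
# Sketch only: all function and variable names here are hypothetical.
GENOME_DIR="${GENOME_DIR:-$HOME/index}"
SORT_THREADS="${SORT_THREADS:-4}"
SORT_MEM="${SORT_MEM:-1G}"   # memory per samtools sorting thread

# Print the command when DRY_RUN=1 (the default here), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

align_unsorted_then_sort() {
    sample="$1"
    # Step 1: align without sorting; STAR writes <prefix>Aligned.out.bam.
    run STAR --runThreadN 28 \
        --genomeDir "$GENOME_DIR" \
        --readFilesIn "./${sample}_1.fastq.gz" "./${sample}_2.fastq.gz" \
        --readFilesCommand zcat \
        --outFileNamePrefix "./${sample}_" \
        --outSAMtype BAM Unsorted &&
    # Step 2: coordinate-sort with an explicit per-thread memory cap.
    run samtools sort -@ "$SORT_THREADS" -m "$SORT_MEM" \
        -o "./${sample}_sorted.bam" "./${sample}_Aligned.out.bam" &&
    # Step 3: index the sorted BAM for downstream tools.
    run samtools index "./${sample}_sorted.bam"
}
```

Calling `DRY_RUN=0 align_unsorted_then_sort SRR8088215` would run the three steps for one sample. Sorting with samtools lets you set the memory per thread with `-m`, which may help on a node with a tight memory limit.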