Hello guys,
Recently I have been using STAR to map reads from multiple files. Here is the script:
for NAME in individual1 individual2 individual3
do
    STAR --runMode alignReads \
        --runThreadN 10 \
        --genomeDir "$REF" \
        --readFilesIn "${INPUT}/${NAME}_input/${NAME}_input_R1.fq.gz" \
        --readFilesCommand zcat \
        --outSAMstrandField intronMotif \
        --outFileNamePrefix "${OUT}/${NAME}_input_wasp" \
        --outSAMtype BAM Unsorted \
        --varVCFfile "${OUTPUT}/${NAME}_input.vcf" \
        --waspOutputMode SAMtag \
        --outSAMattributes vA vG
done
The paths are definitely correct. The key problem is that once one file is done, the script stops, with no warning at all. When I type "ps", it shows this:
PID TTY TIME CMD
19335 pts/0 00:00:00 bash
19384 pts/0 00:00:00 bash
19665 pts/0 00:36:35 STAR
19668 pts/0 00:00:00 sh <defunct>
19708 pts/0 00:00:00 ps
Only when I type "kill 19665" does the next file get processed.
I have no idea what causes this, and it confuses me a lot. Could anyone tell me how to fix it?
Thank you!
These two directories hold the results of two different steps; ${OUT} is where I store my STAR results.
By the way, I also tested STAR with a single file, and the "defunct" process still appears.
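The "<defunct>" entry in the ps output marks a zombie: a process that has already exited but whose parent has not yet collected its exit status. A minimal sketch for seeing which process is the parent, assuming a ps that supports the -e and -o options (procps on Linux does). The function name list_zombies is made up for illustration.

```shell
# Hypothetical helper: print the ps header plus every process whose
# state starts with "Z" (zombie / defunct), including its parent PID.
list_zombies() {
    ps -eo pid,ppid,stat,comm | awk 'NR == 1 || $3 ~ /^Z/'
}

list_zombies
```

If the PPID shown for the defunct sh equals the PID of STAR, the shell was started by STAR itself and is waiting for STAR to reap it.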
I also run STAR in a loop and use two different ways to get the file names. One way is to pass the file names (with their respective paths) to STAR from a text file that holds one file name per line:
# For every name in the file
while read SAMPLE; do
# Get single file name
FILEBASE=$(basename "${SAMPLE%.fq.rm_bl}")
# Make new directory for every sample
mkdir /path_to_later/gap_table/$FILEBASE.STAR
# Enter the new directory
cd /path_to_later/gap_table/$FILEBASE.STAR
# Align with STAR
/path_to_STAR/STAR --outFilterType BySJout --outFilterMismatchNmax 10 --outFilterMismatchNoverLmax 0.04 --alignEndsType EndToEnd --runThreadN 8 --outSAMtype BAM SortedByCoordinate --alignSJDBoverhangMin 4 --alignIntronMax 300000 --alignSJoverhangMin 8 --alignIntronMin 20 --genomeDir /path_to/star_index_hg38_hiv_r100/ --sjdbOverhang 100 --quantMode GeneCounts --sjdbGTFfile /path_to/hg38_pnL43_fusion_annotation.gtf --outFileNamePrefix /path_to/gap_table/$FILEBASE.STAR/ --readFilesIn $SAMPLE > STARaligning.log
done </path_to_filename_file/filename
Another way is to search a directory for matching file names and then use them as input for STAR.
Here the first line of the code above is replaced with these two lines (the input redirection after "done" is then no longer needed):
# For every file in the given directory (/path_to_file/), use the filenames showing a ".fq" at the end
find /path_to_files/ -name "*.fq" | while read SAMPLE; do
# Get single file name
FILEBASE=$(basename "${SAMPLE%.fq}")
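One caveat with both loops above: a plain "while read SAMPLE" trims leading and trailing whitespace and interprets backslashes, and an unquoted $SAMPLE is split on spaces. A minimal sketch of a glob-based variant that avoids this (it is an alternative to the find pipeline, not the code from this answer). READ_DIR, the demo file names, and the helper sample_base are hypothetical, and the echo stands in for the STAR call.

```shell
# Hypothetical demo directory; point READ_DIR at the real read directory instead.
READ_DIR=$(mktemp -d)
touch "$READ_DIR/sampleA.fq" "$READ_DIR/sample B.fq" "$READ_DIR/notes.txt"

# Hypothetical helper: print the file name without directory and ".fq" suffix.
sample_base() {
    basename "$1" .fq
}

# The glob expands to one word per file, so names with spaces stay intact.
for SAMPLE in "$READ_DIR"/*.fq; do
    # Skip the literal pattern if no .fq file exists.
    [ -e "$SAMPLE" ] || continue
    FILEBASE=$(sample_base "$SAMPLE")
    # Placeholder for the STAR call; pass "$SAMPLE" quoted to --readFilesIn.
    echo "would align: $FILEBASE"
done
```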
I suppose the extra space between individual2 and individual3 is not in the real code? Otherwise, I don't know the reason for the error in your particular kind of loop.
Hi caggtaagtat, I am happy to tell you that I have figured out what is going on. It is a RAM issue. The problem happened because I set 10 threads, which is probably more than my machine can handle. I hope this helps other people who run into the same problem.
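For anyone who wants to check memory before starting a run, here is a minimal sketch. It assumes Linux, where /proc/meminfo reports a MemAvailable value in kB. The function name mem_available_gib and the MIN_GIB threshold are made up for illustration, and the right threshold depends on the size of your genome index.

```shell
# Hypothetical helper: print available memory in whole GiB (0 if unknown).
mem_available_gib() {
    if [ ! -r /proc/meminfo ]; then
        echo 0
        return
    fi
    awk '/^MemAvailable:/ { kb = $2 } END { printf "%d\n", kb / 1024 / 1024 }' /proc/meminfo
}

# Hypothetical threshold; adjust it to your genome index and thread count.
MIN_GIB=32
AVAIL=$(mem_available_gib)
if [ "$AVAIL" -lt "$MIN_GIB" ]; then
    echo "Only ${AVAIL} GiB available; consider a lower --runThreadN" >&2
fi
```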
See my suggestion for a simple parallelization script (for bowtie2, but I think you'll get the idea): A: perl script for BWA-mem on multiple different files
Thanks! It seems useful; I will try it in my code.
Do ${OUT}/ and ${OUTPUT}/ exist before you run STAR?
How do you define $OUT and $OUTPUT?
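A quick way to answer the first question from inside the script is to test the paths before each STAR call. This is only a sketch: require_paths is a hypothetical helper, and DEMO_OUT is a temporary directory standing in for ${OUT}.

```shell
# Hypothetical helper: succeed only if every path given as an argument exists.
require_paths() {
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            return 1
        fi
    done
    return 0
}

# Demo with a temporary directory standing in for ${OUT}.
DEMO_OUT=$(mktemp -d)
require_paths "$DEMO_OUT" && echo "ok: $DEMO_OUT exists"

# Inside the mapping loop, one could skip samples with missing inputs:
#   require_paths "$OUT" "${OUTPUT}/${NAME}_input.vcf" || continue
```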
It is just as I described in my post.