I need help writing a for loop to run Trimmomatic for quality trimming of single-end fastq files, so that the same command runs on all of my files. I read the answers to a similar question about paired-end data, but they did not help me much.
Any help please! Thanks!
where $1 is your input file, and basename removes the .fastq.gz suffix so that it can be replaced with .trimmomatic_out.fastq.gz. Save as run_all_trim.sh.
List all of your single-end files in a single-column text file, SE_files.txt. If they are all in one directory: ls -1 *.fastq.gz > SE_files.txt. Then pass each of your single-end files to the trimmomatic command.
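One way to pass each listed file to the script is a while-read loop. This is only a minimal sketch, assuming the list is SE_files.txt with one path per line and the per-file script is run_all_trim.sh as named above. The helper name run_on_list is made up for illustration, and the command is passed as arguments so that it can be dry-run with echo first.

```shell
# Sketch: run a command once for every path listed in a text file.
# run_on_list is a hypothetical helper name, not part of Trimmomatic.
run_on_list() {
    list_file=$1
    shift
    # Read one path per line; IFS= and -r keep each line unmodified.
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    while IFS= read -r fq || [ -n "$fq" ]; do
        [ -n "$fq" ] || continue        # skip blank lines
        # Append the path as the last argument. Redirecting stdin from
        # /dev/null stops the command from swallowing the rest of the list.
        "$@" "$fq" < /dev/null
    done < "$list_file"
}

# Dry run, prints the commands instead of executing them:
#   run_on_list SE_files.txt echo bash run_all_trim.sh
# Real run:
#   run_on_list SE_files.txt bash run_all_trim.sh
```

Inside run_all_trim.sh, each path then arrives as $1.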
Hi, your comment is getting old but was very useful.
However, could you explain how it works? I don't get it.
I wrote this:
trimmomatic SE -threads 16 -phred33 $1 "/trimmomatic/`basename $1 .fastq.gz`.trimmomatic_out.fastq.gz" \
It uses as input my files that are in ./raw and sends them to ./trimmomatic. This was my intention, but how does it know to use ./raw as input and not just ./ ?
List your files in a text file. If they are all in a folder called raw, and you want to run from there, the filenames in the SE_files.txt would be raw/prefix.fastq.gz. The point is you're putting the command in a bash script, and then looping through each line (file) in the text one at a time.
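To see why the raw/ prefix is used for reading but does not end up in the output name, it can help to run basename by hand. A small sketch, where sample1 is a made-up file name: the full path in $1 tells Trimmomatic where to read, while basename keeps only the last path component and strips the given suffix.

```shell
# sample1 is a hypothetical file name used only for illustration.
in_path="raw/sample1.fastq.gz"

# basename drops the directory part (raw/) and the suffix (.fastq.gz).
prefix=$(basename "$in_path" .fastq.gz)

# Build the output path inside a separate trimmomatic folder.
out_path="trimmomatic/${prefix}.trimmomatic_out.fastq.gz"

echo "input:  $in_path"
echo "output: $out_path"
```

So the input directory comes from the path stored in the list, and the output directory comes from whatever you write in front of the basename call.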
Similarly you can write a bash script, as shenwei pointed out above, where you can do:
#!/usr/bin/bash
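# Loop over every gzipped fastq file in the raw folder
for file in raw/*.fastq.gz; do
    # Print the current file name so it shows up in the log
    echo "$file"
    java -jar trimmomatic-0.35.jar SE -phred33 "$file" "$(basename "$file" .fastq.gz).trimmomatic_out.fastq.gz" ILLUMINACLIP:TruSeq3-SE:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36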
done
You can save this as run_trim.sh, and run in the background with:
nohup bash run_trim.sh > log.txt &
Each line in the log file will have a filename to track the progress. As it's running, you can:
wc -l log.txt
to see where it's at compared to the total number of files (ls -1 raw/*.fastq.gz | wc -l )
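The two counts can also be combined into one line. A rough sketch, assuming log.txt contains only the file names printed by echo (one per file); the helper name progress is made up. Since each name is printed before trimming begins, the count reflects files started rather than files finished.

```shell
# progress is a hypothetical helper: progress LOG FILE...
# It compares the number of lines in LOG with the number of FILE arguments.
progress() {
    log_file=$1
    shift
    started=$(wc -l < "$log_file")
    total=$#
    # $((started)) strips the padding some wc implementations add.
    echo "$((started)) of $total files started"
}

# Usage while run_trim.sh is running:
#   progress log.txt raw/*.fastq.gz
```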