Hi all,
I am using an automated script to trim paired end fastq.gz files using Trimmomatic. I have this code:
#!/bin/bash
for f1 in *1.fastq.gz
do
    f2="${f1%%1.fastq.gz}2.fastq.gz"
    java -jar /Trimmomatic-0.36/trimmomatic-0.36.jar PE -trimlog TrimLog \
        "$f1" "$f2" -baseout filtered.fq.gz SLIDINGWINDOW:5:20
done
I would like to name my output files according to the input files. For example:
Say I have these 2 input files: 12345.1.fastq.gz and 12345.2.fastq.gz. I would like the output files (using the -baseout flag) to be named:
filtered.12345.1P.fastq.gz filtered.12345.1U.fastq.gz filtered.12345.2P.fastq.gz filtered.12345.2U.fastq.gz
(Trimmomatic spits out 4 output files, and automatically adds the 1P, 1U, 2P, 2U parts of the output file names.)
Does anyone know how to edit my script to do this? Thanks!
You can specify the outputs in the command:
So, for your code, I think it would be something like this, using basename to get each file prefix:
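A minimal sketch of that idea, assuming files named <sample>.1.fastq.gz and <sample>.2.fastq.gz and the output naming asked for in the question. It is a dry run that only prints each command. The helper name `build_trim_cmd` and the per-sample `TrimLog.<prefix>` log name are hypothetical, and the four output arguments are assumed to follow Trimmomatic's PE order (paired 1, unpaired 1, paired 2, unpaired 2):

```shell
#!/bin/bash
# Dry run: prints the Trimmomatic command for each pair instead of executing it.
# Jar path copied from the question; adjust to your install.
TRIMMOMATIC_JAR=/Trimmomatic-0.36/trimmomatic-0.36.jar

# build_trim_cmd is a hypothetical helper used only for this illustration.
build_trim_cmd() {
    local f1=$1
    local prefix
    # Strip the read number and extension: 12345.1.fastq.gz -> 12345
    prefix=$(basename "$f1" .1.fastq.gz)
    # Mate file: swap the trailing 1.fastq.gz for 2.fastq.gz
    local f2="${f1%1.fastq.gz}2.fastq.gz"
    # Assumption: one trim log per sample so logs are not overwritten.
    echo java -jar "$TRIMMOMATIC_JAR" PE -trimlog "TrimLog.$prefix" \
        "$f1" "$f2" \
        "filtered.$prefix.1P.fastq.gz" "filtered.$prefix.1U.fastq.gz" \
        "filtered.$prefix.2P.fastq.gz" "filtered.$prefix.2U.fastq.gz" \
        SLIDINGWINDOW:5:20
}

for f1 in *.1.fastq.gz; do
    [ -e "$f1" ] || continue   # skip when the glob matches nothing
    build_trim_cmd "$f1"
done
```

Once the printed commands look right, dropping the `echo` runs Trimmomatic for real.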
Alternatively, you can create a tab-delimited file (prefixes.txt), save a wrapper script as trimmomatic.sh, and run it over that file with xargs.
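A rough sketch of how the xargs route could look, again as a dry run that prints commands rather than running them. The column layout of prefixes.txt (forward reads, reverse reads, output prefix) and the body of trimmomatic.sh are assumptions made for illustration; both files are written to a temporary directory so the example is self-contained:

```shell
#!/bin/bash
# Dry-run sketch of the xargs approach; nothing is trimmed, commands are only printed.
workdir=$(mktemp -d)

# Assumed layout of prefixes.txt: read1 <TAB> read2 <TAB> output prefix
printf '12345.1.fastq.gz\t12345.2.fastq.gz\tfiltered.12345\n' >  "$workdir/prefixes.txt"
printf '67890.1.fastq.gz\t67890.2.fastq.gz\tfiltered.67890\n' >> "$workdir/prefixes.txt"

# Assumed wrapper script taking three positional arguments.
cat > "$workdir/trimmomatic.sh" <<'EOF'
#!/bin/bash
# $1 = forward reads, $2 = reverse reads, $3 = output prefix
echo java -jar /Trimmomatic-0.36/trimmomatic-0.36.jar PE \
    "$1" "$2" \
    "$3.1P.fastq.gz" "$3.1U.fastq.gz" "$3.2P.fastq.gz" "$3.2U.fastq.gz" \
    SLIDINGWINDOW:5:20
EOF

# xargs splits its input on whitespace (tabs and newlines),
# so -n 3 passes exactly one row of the file to each call of the script.
xargs -n 3 bash "$workdir/trimmomatic.sh" < "$workdir/prefixes.txt" > "$workdir/commands.txt"
cat "$workdir/commands.txt"
```

This relies on file names without spaces, since xargs would split them into separate arguments.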
Thanks for your reply! I do have another question regarding using the basename to get each file prefix. Your code outputs my files as:
filtered.12345.1.fastq.gz.1P.fastq.gz filtered.12345.1.fastq.gz.1U.fastq.gz filtered.12345.1.fastq.gz.2P.fastq.gz filtered.12345.1.fastq.gz.2U.fastq.gz
Is there a way to edit your script so that it is just filtered.12345.1P.fastq.gz, filtered.12345.1U.fastq.gz, etc.? So, just taking out the extra fastq.gz part?
I'm not sure why you're getting those outputs. Running this test, with 12345.1.fastq.gz in a dir:
Yields:
That is very weird. I ran that exact script and this is what I got. (I know the file names are different from 12345.1...; I just used that for simplicity. Maybe these different file names are the cause of the problem?)
I believe I got it to work by adding (basename -s .fastq.gz $f1), so it strips off the suffix of $f1. Thanks for all the help.
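For anyone landing here later, a small sketch of why the suffix matters, using the example name from the question. The `-s` option is available in GNU coreutils and BSD `basename`; the final `${stripped%.1}` step is an extra assumption for output templates that add the read number themselves:

```shell
#!/bin/bash
f1=12345.1.fastq.gz

# Plain basename only removes leading directories, so the extension stays.
with_suffix=$(basename "$f1")              # 12345.1.fastq.gz

# -s removes the given suffix as well.
stripped=$(basename -s .fastq.gz "$f1")    # 12345.1

# Optional: also drop the read number to get the bare sample name.
sample="${stripped%.1}"                    # 12345

echo "filtered.$with_suffix.1P.fastq.gz"   # the doubled-extension name from the follow-up
echo "filtered.$sample.1P.fastq.gz"        # the name requested in the question
```

The portable spelling `basename "$f1" .fastq.gz` gives the same result as the `-s` form for a single file.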