2.6 years ago
connorjfausto
Hello!
I wrote code to align my 40 fastq files to SAM files, but after running this script it ended up creating one big SAM file. I've used this same formatting before for other purposes, so I'm wondering why it merged everything?
for i in *_pass.trimmed.fastq; do
name=$(basename ${i}_pass.trimmed.fastq); bowtie2 \
-p 8 \
-x mm10 \
-U ${name}_pass.trimmed.fastq \
-S ${name}.sam
done
One example of a file name is SRR6165747_pass.trimmed.fastq, which goes from the ending numbers 47 to 90. I was expecting to have output files SRR6165747.sam up to 97, but in my output folder there was one file named *.sam
Helping me re-format this would be greatly appreciated, thank you!
I believe you did not declare $. Try this (not tested):
do name=$(basename ${i}_pass.trimmed.fastq); bowtie2 \
-p 8 \
-x mm10 \
-U ${name}_pass.trimmed.fastq \
-S ${name}.sam
This isn't necessary. Bash understands globbing (assuming you have your shell set up correctly), so
for i in /*.ext ; do ...
is valid.

Almost certainly the basename manipulation you're doing isn't resulting in the filenames you think it is, so bowtie ends up writing to the same .sam. I would have assumed it would overwrite, not append, but perhaps it has some particular behaviour.

I suggest you just debug the loop by replacing your bowtie command with some echo statements to make sure you are getting the right filename format. You should also quote your variables: "${var}".

I would start with something like:
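A minimal sketch of such a debugging loop, assuming file names like the SRR6165747_pass.trimmed.fastq example from the question. The scratch directory and the two empty placeholder files are made up here so the snippet runs on its own; in practice you would run only the loop, inside your data directory:

```shell
# Hypothetical setup: a scratch directory with two empty placeholder files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch SRR6165747_pass.trimmed.fastq SRR6165748_pass.trimmed.fastq

for i in *_pass.trimmed.fastq; do
    # basename takes the suffix to strip as a separate, second argument.
    name=$(basename "$i" _pass.trimmed.fastq)
    # Print what would be passed to bowtie2 instead of running it.
    echo "input: ${name}_pass.trimmed.fastq  output: ${name}.sam"
done
```

If the printed names look right, the echo line can be swapped back for the bowtie2 call.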
For me personally, I tend not to reach for basename out of habit and use bash operators instead:

Thank you for the reply! I changed my code to use bash operators and got it working. One follow-up question, though: I was not able to specify an output file path; I had to let the output files be made in the same directory. I was wondering if this was due to the way my code was written?
Code that worked:
Code that failed:
I know I can simply move everything afterwards, but was wondering if there's a way I can implement this into the loop?
The "${var%.*}" syntax only strips off everything from the last . onward, so the file extension, essentially (or from whatever you switch . to):

This means it leaves the path structure intact. If you want to obtain just the basename of the file, then this is where basename actually does become the better option, since basename can also take an argument to remove file extensions as well. This can be done with bash operators, but not in a single step (requires intermediate variables):

The equivalent operator to do 'left handed' stripping instead of 'right handed' is # instead of %.

Essentially the above code doesn't work because you're adding a path to a path, not to the filename. This is why I suggest you echo your variables first, so you understand what has happened to them.
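To make the difference concrete, here is a small sketch. The input path data/SRR6165747_pass.trimmed.fastq and the output directory sam_out are hypothetical, chosen only for illustration; nothing is read or written, the snippet just prints the strings each approach produces:

```shell
# Hypothetical input path and output directory, for illustration only.
i="data/SRR6165747_pass.trimmed.fastq"
outdir="sam_out"

# '%' strips a suffix from the right; the leading path is left intact.
stripped="${i%_pass.trimmed.fastq}"     # data/SRR6165747
# Prepending the output directory now adds a path to a path.
wrong="${outdir}/${stripped}.sam"       # sam_out/data/SRR6165747.sam

# '#' strips the shortest matching prefix from the left, '##' the longest;
# '##*/' therefore removes every leading directory, however deeply nested.
file="${i##*/}"                         # SRR6165747_pass.trimmed.fastq
name="${file%_pass.trimmed.fastq}"      # SRR6165747

# basename does both steps at once when given the suffix as a second argument.
name_bn=$(basename "$i" _pass.trimmed.fastq)

right="${outdir}/${name}.sam"           # sam_out/SRR6165747.sam

echo "wrong: $wrong"
echo "right: $right"
echo "basename agrees: $name_bn"
```

Inside the loop, the bowtie2 call would then take -U "$i" and -S "${outdir}/${name}.sam". The output directory presumably has to exist before bowtie2 runs, for example via mkdir -p.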