I am trying to perform the alignment using bwasw and then using samtools to report the aligned sequences. So, essentially I am piping samtools and bwa. Here is the code:
$ for file in *.amb;
do
.././bwa bwasw -z 5 ${file:0:13} ../QUERY/updated_635_500bp.fasta>../OUTPUT_635/"file".sam| ../../../samtools-0.1.18/samtools-0.1.18/samtools view -bS -F 4 ../OUTPUT_635/"$file".sam ../OUTPUT_635/"$file".bam; rm -f "$file"sam ;
done
It tells me that
`[main_samview] fail to open "../OUTPUT_635/AC_000091.fna.amb.sam" for reading.`
So it is performing the samtools operation before bwasw?
I'd appreciate help with this! I am using cygwin on Windows 7 btw.
Thank you!
I'm sorry, but there are actually quite a number of things completely wrong here.
First of all you have to decide if you want to redirect the sam output to a temporary file (>) or if you want to pipe (|) it directly into samtools without the need for a temporary file. That is probably what makes most sense here. But you actually try both at the same time, which cannot work.
Furthermore, the use of your file variable is wrong in three different ways. And the use of samtools is wrong as well: you want to use stdin (-) as input and direct (>) the output to a BAM file.
Anyway, the following should in theory work (didn't check). Note, that I've shortened paths to make it more readable.
for file in *.amb; do
bwa bwasw -z 5 $file ../QUERY/updated_635_500bp.fasta | \
samtools view -bS -F 4 - > ../OUTPUT_635/${file}.bam
done
All of this assumes of course, that your amb files are actually fasta files and ../QUERY/updated_635_500bp.fasta is a fastq file (and it looks like the opposite might be true).
I would recommend to read up on Unix basics. This would help you a lot.
Thank you for your reply. I'll try this and see if it does the trick. Also, amb index files were generated from .fna files and the query files(which contain about 70,000 sequences, 500 bp) are .fasta. I thought this should work? I changed the spacing in this query file to get rid of the extra blank spaces. Does this make sense? Thanks!
Indeed. Make sure one command works end to end before trying to wrap it up in anything more complex!
Thank you for your reply. I'll try this and see if it does the trick. Also,
amb
index files were generated from.fna
files and the query files(which contain about 70,000 sequences, 500 bp) are.fasta
. I thought this should work? I changed the spacing in this query file to get rid of the extra blank spaces. Does this make sense? Thanks!