I am trying to pipe the output from BWA to sambamba to sort and index the SAM files. I have 20 files of paired-end sequencing reads and want only the final BAM file (not the intermediate SAM or BAM files). This is the code I have at the minute:
for filename in ./seqtk_1/subsample_1/*_1.fq.gz;
do file=`echo $filename|sed 's/_1.fq.gz//'`;
filenopath=`basename $file`;
outputpath=./BWA/seqtk_1/subsample_1;
bwa mem -v 3 ./combine_reference.fa.gz ${file}_1.fq.gz ${file}_2.fq.gz > ${outputpath}/align_${filenopath}_BWA.sam |
sambamba view -S -f bam - > ${outputpath}/align_${filenopath}_BWA.bam |
sambamba sort -o - > ${outputpath}/sorted_${filenopath}_BWA.bam |
sambamba index - > {outputpath}/indexed_${filenopath}_BWA.bam;
done
This is the output:
-bash: {outputpath}/indexed_sub_NC_001539_BWA.bam: No such file or directory
sambamba-view: Unrecognized option -
sambamba-sort: Cannot open or create file '' : No such file or directory
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 100000 sequences (10000000 bp)...
[M::process] read 100000 sequences (10000000 bp)...
That continues through the rest of the files. I get a sam file and a sorted_${filenopath}_BWA.bam file but the bam file isn't populated.
My thinking is that the code isn't read/completed linearly and it is trying to create files that can't be created because BWA hasn't started running yet.
Is there a way to fix this? Or do I just need to run BWA and sambamba separately? I don't want to keep these sam files because the size is too large.
Thanks in advance
Thank you, so the pipe should look more like this:
I am completely new to bash
No, the pipe should look like this:
The concept of pipes is not generating intermediate files. So don't do that. Leave out the > redirection symbol, and stream with | directly to the next command. Using > and | on the same command is mutually exclusive: once stdout is redirected to a file, nothing is left to go through the pipe.
Piping into index won't work. You'll need to do that separately.
Note again that I don't know sambamba but I'm just applying Linux logic. Your mileage may vary. Somehow you'll have to tell sambamba that it should expect input on stdin. You'll have to dive into the documentation for that.
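To make the point about > and | concrete, here is a small sketch that uses only standard tools (printf, sort, wc) as stand-ins for the aligner and the sorter. The variable names and temporary file names are made up for illustration:

```shell
#!/usr/bin/env bash
# Toy stand-ins for "producer | consumer"; no bioinformatics tools needed.
tmpdir=$(mktemp -d)

# Wrong: stdout goes to the file, so the pipe carries nothing and wc counts 0 lines.
broken=$(printf 'b\na\nc\n' > "$tmpdir/intermediate.txt" | wc -l | tr -d ' ')

# Right: no redirection in the middle; only the last command writes a file.
printf 'b\na\nc\n' | sort > "$tmpdir/final.txt"
piped=$(wc -l < "$tmpdir/final.txt" | tr -d ' ')

echo "lines that reached the consumer with '>' in the middle: $broken"
echo "lines that reached the consumer through a clean pipe:   $piped"
```

The first form is the pattern from the question: the intermediate file gets all the data and the next command gets an empty stream.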
For samtools it would be like this:
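A minimal sketch of such a pipeline, assuming bwa and samtools 1.x are on the PATH. The function name align_sorted and all file names are hypothetical placeholders, not taken from the thread:

```shell
#!/usr/bin/env bash
# Hypothetical helper: align paired-end reads and write a sorted, indexed BAM
# without keeping any intermediate SAM file. Assumes bwa and samtools 1.x.
align_sorted() {
    local ref="$1" reads1="$2" reads2="$3" out_bam="$4"
    # bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-").
    # Indexing needs the finished file on disk, so it runs as a separate step.
    bwa mem "$ref" "$reads1" "$reads2" |
        samtools sort -o "$out_bam" - &&
        samtools index "$out_bam"
}

# Example call (placeholder paths):
# align_sorted ref.fa reads_1.fq.gz reads_2.fq.gz sorted.bam
```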
Note the final - which tells samtools that input is on stdin and not a file.
Thank you! That makes a lot of sense
That doesn't work
Comes up with this:
Apparently (I told you I didn't know about sambamba and you had to look into the documentation) you can use /dev/stdin and /dev/stdout to read from pipes and stream to pipes.
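Put together, the /dev/stdin approach might look roughly like the sketch below. It reuses the flags and file naming from the question (-S -f bam for view, -o for sort); the function name align_sorted_sambamba is hypothetical, and whether a given sambamba version accepts /dev/stdin for both view and sort is an assumption to check against its documentation:

```shell
#!/usr/bin/env bash
# Hypothetical helper for one sample; assumes bwa and sambamba are on the PATH.
align_sorted_sambamba() {
    local ref="$1" prefix="$2" outdir="$3"
    local sample="${prefix##*/}"    # strip the directory part of the prefix
    local out_bam="${outdir}/sorted_${sample}_BWA.bam"
    # No ">" between stages: each command streams straight into the next one.
    # Only sort writes a file; index then runs on that finished file.
    bwa mem -v 3 "$ref" "${prefix}_1.fq.gz" "${prefix}_2.fq.gz" |
        sambamba view -S -f bam /dev/stdin |
        sambamba sort -o "$out_bam" /dev/stdin &&
        sambamba index "$out_bam"
}

# Example loop over the paired files (paths as in the question):
# for r1 in ./seqtk_1/subsample_1/*_1.fq.gz; do
#     align_sorted_sambamba ./combine_reference.fa.gz "${r1%_1.fq.gz}" ./BWA/seqtk_1/subsample_1
# done
```

The ${r1%_1.fq.gz} expansion does the same job as the sed call in the question, without starting an extra process.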
Okay, thank you for the help. I really appreciate it.
Glad I could help. If my answer resolved your question you should mark it as accepted.
Here is how to send stdin to sambamba.
@wouter, Oops! same link :P