Using samtools version 0.1.18, the following bowtie alignment pipeline works well to create a sorted BAM file:
bowtie -v 0 -S genome.fasta test.fastq | samtools view -S -b -u - | samtools sort - test0.1.18
[samopen] SAM header is present: 2108 sequences.
# reads processed: 80094
# reads with at least one reported alignment: 70028 (87.43%)
# reads that failed to align: 10066 (12.57%)
Reported 70028 alignments to 1 output stream(s)
However, if I install and use samtools version 0.1.19, I get an error message claiming the input is truncated (2nd line below):
bowtie -v 0 -S genome.fasta test.fastq | samtools view -S -b -u - | samtools sort - test0.1.19
[bam_header_read] EOF marker is absent. The input is probably truncated.
[samopen] SAM header is present: 2108 sequences.
# reads processed: 80094
# reads with at least one reported alignment: 70028 (87.43%)
# reads that failed to align: 10066 (12.57%)
Reported 70028 alignments to 1 output stream(s)
I assume the message comes from samtools sort, but it is confusing because both versions produce identical results, including fully intact headers and properly formatted BAM alignments. Removing the -u switch from the samtools view call has no effect.
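One way to confirm that the sorted output really is complete is to look for the BGZF end-of-file block, the fixed 28-byte empty block that the SAM/BAM specification defines as the last bytes of a finished BAM file. The following is a minimal sketch, not part of samtools; the function name has_bgzf_eof is hypothetical, and the file name in the usage comment assumes that samtools sort 0.1.x appends .bam to the output prefix.

```shell
# Hex form of the 28-byte BGZF EOF block defined in the SAM/BAM specification,
# written in 4-byte groups for readability (the shell concatenates them).
BGZF_EOF_HEX="1f8b0804""00000000""00ff0600""42430200""1b000300""00000000""00000000"

# has_bgzf_eof FILE (hypothetical helper)
# Succeeds (exit 0) if the last 28 bytes of FILE equal the BGZF EOF block.
has_bgzf_eof() {
    # Dump the final 28 bytes as hex and strip od's whitespace.
    tail_hex=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \t\n')
    [ "$tail_hex" = "$BGZF_EOF_HEX" ]
}

# Usage (assumed output name):
#   has_bgzf_eof test0.1.19.bam && echo "EOF marker present"
```

This only inspects the file written to disk. It says nothing about the stream that samtools sort reads from the pipe, which is what the warning refers to.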
I'd like to fix the reporting of this error because a similar pipeline is used in some scripts of mine -- users who have samtools 0.1.19 are worried about the error message. I know I could just send the STDERR of samtools sort to /dev/null, but I'd rather try to understand what is happening first ...
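If the message does turn out to be harmless, a gentler workaround than discarding all of STDERR is to drop only that one line, so genuine errors from samtools sort still reach the user. The following is a rough sketch; the wrapper name run_without_eof_warning is hypothetical, and it assumes the warning text matches the line shown in the output above.

```shell
# run_without_eof_warning CMD [ARGS...] (hypothetical wrapper)
# Runs CMD with its stdout untouched and its stderr passed through grep,
# which removes only lines containing the EOF-marker warning.
# Caveat: the exit status is that of grep, not of CMD.
run_without_eof_warning() {
    # fd 3 keeps the original stdout; stderr is routed into the pipe for grep.
    { "$@" 2>&1 1>&3 | grep -v 'EOF marker is absent' >&2; } 3>&1
}

# Usage in the pipeline from the question:
#   bowtie -v 0 -S genome.fasta test.fastq \
#     | samtools view -S -b -u - \
#     | run_without_eof_warning samtools sort - test0.1.19
```

Because the wrapper does not preserve the exit status of the wrapped command, scripts that need it would have to capture it separately.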
Thanks for digging that out for me - much appreciated
No problem. Given how frequently I use samtools (and its C API), I should probably check whether the current devel branch still has this bug and submit a fix if it does.