Dear everyone,
I have received a couple of BAM files processed by the CASAVA pipeline (CASAVA-1.9.0) from a collaborator. I would like to create index files for them using samtools index. All but one behave as expected. However, one BAM file causes a segmentation fault. The error can be reproduced in samtools releases 0.1.18 and 0.1.19.
Please do not suggest running samtools sort first, as I have tried it and it results in the same segfault error.
I was wondering if anyone can help me out.
Thank you!
P.S. Following dpryan79's suggestion, here is the output of bt when I tried running index inside gdb:
#0 0x000000371a88ede3 in __memcpy_sse2 () from /lib64/libc.so.6
#1 0x0000000000426686 in bgzf_read (fp=fp@entry=0x6620a0, data=<optimized out>, length=1597059097) at bgzf.c:358
#2 0x000000000042d170 in bam_read1 (fp=fp@entry=0x6620a0, b=b@entry=0x682ba0) at bam.c:218
#3 0x000000000043192f in bam_index_core (fp=fp@entry=0x6620a0) at bam_index.c:182
#4 0x000000000043392e in bam_index_build2 (fn=0x7fffffffe3c1 "../../data/BAMS/SS6004353.bam", _fnidx=_fnidx@entry=0x0) at bam_index.c:484
#5 0x0000000000433a89 in bam_index_build (fn=<optimized out>) at bam_index.c:510
#6 bam_index (argc=<optimized out>, argv=<optimized out>) at bam_index.c:520
#7 0x000000371a821b75 in __libc_start_main () from /lib64/libc.so.6
#8 0x000000000040337d in _start ()
Just a question: can you do a
samtools flagstat in.bam
or a
samtools view in.bam | wc -l
on that file?
samtools flagstat --> no
samtools view --> yes, and it returns ~19k for a WGS BAM file of about 169G.
Does this mean there is a buggy read/line that is causing the problem?
Yeah, it sounds like the file is just corrupt.
Thanks for all the great tips and the walk-thru! Does the output of bt indicate the same cause?
Yeah, in this case it does. It looks like the information specifying one of the compressed blocks is damaged, which is causing attempted access out of range.
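If it helps to confirm the damage independently of samtools, here is a minimal sketch. It relies on the fact that a BAM file is BGZF-compressed, i.e. a series of concatenated gzip members, so plain gzip can test it. The helper name bam_is_intact is hypothetical, and in.bam is a placeholder path.

```shell
#!/bin/sh
# Hypothetical helper (not part of samtools): returns 0 if every
# compressed block in the file decompresses cleanly, non-zero otherwise.
bam_is_intact() {
    # gzip -t decompresses without writing output and verifies each
    # gzip member's CRC32 and length, so a damaged or truncated block
    # makes it exit with a non-zero status.
    gzip -t "$1" 2>/dev/null
}

# Usage (in.bam is a placeholder):
#   if bam_is_intact in.bam; then
#       echo "all blocks decompress"
#   else
#       echo "corrupt or truncated"
#   fi
```

Note that this reads the whole file, so on a BAM of this size it will take a while. A pass only means the compression layer is intact, not that every record inside is valid.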
Thanks a lot dpryan79!
Have you tried compiling samtools with debug symbols and then running it inside gdb? That would answer what's going wrong, since no one here will be able to give you more than a guess without a reproducible example.
Thank you for the prompt response. Can you please elaborate on how I can do that? A pointer would be much appreciated!
It turns out that samtools has debug symbols by default, so that makes life easier :)
The general process for using gdb would be to start samtools under gdb and then run the index command on the problem file from the gdb prompt.
You'll then get an error at some point and can use commands like bt (print a backtrace) to find out exactly where the problem is happening. You could just update your post with the output of bt, since that'll give us all enough to get started.
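As a rough sketch of that process, gdb can also be driven non-interactively so that the backtrace is printed in one go. The wrapper name gdb_backtrace is hypothetical, and the BAM path in the usage line is a placeholder. This assumes gdb is installed and the samtools binary was built with debug symbols.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under gdb in batch mode and print
# a backtrace once the program stops (for example on SIGSEGV).
gdb_backtrace() {
    # -batch      exit gdb after the given commands have been processed
    # -ex run     start the program
    # -ex bt      print the backtrace after the program stops
    # --args      treat everything that follows as the program and its arguments
    gdb -batch -ex run -ex bt --args "$@"
}

# Usage (in.bam is a placeholder):
#   gdb_backtrace samtools index in.bam
```

The interactive equivalent is to start gdb on samtools, type run index in.bam at the prompt, and type bt after the crash.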