I'm running SomaticSniper on whole-exome tumor/normal pairs. A typical run takes a couple of hours. One sample, however, is still running after 24 hours, with no error message. I can't figure out why.
It doesn't seem to be a hardware problem, because other BAMs from the same dataset process just fine. (All are from an Illumina HiSeq 2500, 100x coverage, whole exome using NimbleGen Exome v3.0.)
The size of the problematic BAM is 29GB. I've run Sniper on 35GB and 39GB BAMs without an issue. I've also tried reducing the size of the BAM by filtering out duplicate reads, zero-map-quality reads, unmapped reads, bad mates etc. It didn't resolve the issue.
Below is the top output. As you can see, each instance of bam-somaticsnip is running at 100% CPU.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25941 jack 20 0 78192 67m 916 R 100 0.1 1436:46 bam-somaticsnip
28418 jack 20 0 75480 64m 916 R 100 0.1 925:23.75 bam-somaticsnip
1379 root 20 0 132m 1288 696 S 0 0.0 76:55.48 Xorg
The hang seems to occur at or near the end of the BAM file, because the last lines of output show mutations from chrUn_gl000241, which is located near the end of the UCSC reference genome. (Unlocalized contigs run up to 249.)
chrUn_gl000230 19978 C Y C 52 46 46 41 78 0 43 307 261 34 49 222 33 20 86 34 49 207 0 0 0
chrUn_gl000234 3507 T Y T 64 59 59 42 85 0 42 97 109 32 51 70 35 21 25 32 50 84 0 0 0
chrUn_gl000237 2302 a W A 54 94 94 38 71 0 39 34 36 29 31 25 33 57 9 32 38 33 0 0 0
chrUn_gl000237 2406 a R A 41 34 34 38 99 0 43 65 90 29 38 53 31 38 12 29 45 79 0 0 0
chrUn_gl000241 4303 C M C 42 35 35 47 125 0 45 203 230 33 46 160 30 52 37 33 45 196 0 0 0
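One way to see how far a stalled run got is to read the contig and position from the last line it wrote. The sketch below assumes the classic SomaticSniper text output shown above, where the first two whitespace-separated columns are contig and position; the function name is made up for illustration.

```python
def last_reported_site(lines):
    """Return (contig, position) from the last well-formed output line, or None."""
    last = None
    for line in lines:
        fields = line.split()
        # Assumption: column 1 is the contig name, column 2 the 1-based position.
        if len(fields) >= 2 and fields[1].isdigit():
            last = (fields[0], int(fields[1]))
    return last


# Two of the lines posted above, used as sample input.
sample = [
    "chrUn_gl000237 2406 a R A 41 34 34 38 99 0 43 65 90 29 38 53 31 38 12 29 45 79 0 0 0",
    "chrUn_gl000241 4303 C M C 42 35 35 47 125 0 45 203 230 33 46 160 30 52 37 33 45 196 0 0 0",
]
print(last_reported_site(sample))
```

In practice the lines would come from the partially written output file. Note that the last reported site only bounds the stall from below: the program may be stuck anywhere after that position.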
Any insights on why this might happen? Any suggestions?
Is there an undocumented command-line parameter to run SomaticSniper in "verbose mode" so I can see where it is hanging up exactly?
*** UPDATE ***
I removed unlocalized contigs 241-249 from the bam file, and that resolved the issue. I visualized those contigs on IGV. Largest depth on any of them is about 1700. chrUn_241 seems to be pretty badly aligned, but removing only that contig didn't solve the problem. Check out the visualization below.
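For anyone wanting to try the same workaround, here is a minimal sketch of one way to drop those contigs. It is not necessarily how it was done in the update above. It builds a `samtools view` command that lists every contig to keep as a region, which assumes the BAM is coordinate-sorted and indexed. The helper names and file names are hypothetical, and the contig list would normally come from the BAM header.

```python
def contigs_to_keep(all_contigs, excluded):
    """Return the contigs from all_contigs that are not excluded, preserving order."""
    excluded = set(excluded)
    return [c for c in all_contigs if c not in excluded]


def samtools_view_command(in_bam, out_bam, contigs):
    """Build an argument list for: samtools view -b -o out.bam in.bam region..."""
    # Passing regions to samtools view requires an indexed BAM.
    return ["samtools", "view", "-b", "-o", out_bam, in_bam] + list(contigs)


# Unlocalized contigs 241-249, named as in the UCSC reference used above.
EXCLUDED = [f"chrUn_gl000{n}" for n in range(241, 250)]

# Hypothetical short contig list; a real one would be read from the BAM header.
example_contigs = ["chr1", "chrUn_gl000240", "chrUn_gl000241", "chrUn_gl000249"]
command = samtools_view_command(
    "in.bam", "out.bam", contigs_to_keep(example_contigs, EXCLUDED)
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run`. The header of the output BAM will still list the removed contigs; only the reads are dropped.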
You'll probably need to provide more information in order to get a useful response. There are any number of things that could be going on, ranging from hardware problems to an abnormally large bam.
Chris, I've updated the question with more details. Please let me know if you need any other information. Thanks for your help.
I have only observed behavior like this on BAMs with extremely high depth, but that doesn't appear to apply to your situation here. What version are you using?
I'm using 1.0.3. It doesn't seem like depth is the problem. Please see the update to my post.
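To check the depth question without a genome browser, the maximum depth per contig can be tabulated from `samtools depth` output. This is a rough sketch that assumes the single-BAM output format (tab-separated contig, position, depth); the function name is illustrative only.

```python
def max_depth_per_contig(depth_lines):
    """Map each contig to the highest depth seen in samtools-depth-style lines."""
    max_depth = {}
    for line in depth_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        contig, depth = fields[0], int(fields[2])
        max_depth[contig] = max(max_depth.get(contig, 0), depth)
    return max_depth


# Made-up sample lines in the assumed format, for illustration only.
sample_depth = [
    "chrUn_gl000241\t4301\t180\n",
    "chrUn_gl000241\t4302\t1700\n",
    "chrUn_gl000248\t10\t35\n",
]
print(max_depth_per_contig(sample_depth))
```

Feeding it the depth output for only the suspect contigs keeps the input small.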
I'm happy to share the data, if you'd like to figure out what's going on.
This sounds suspiciously like a bug. I'd want to run Sniper in the debugger and see where it is churning in order to understand this better. For that, a small example dataset would be best. Does the behavior still occur utilizing only unplaced contigs? How big is that BAM file?
I've not tested running SomaticSniper only on the unlocalized contigs. BAM files are ~40 GB each.
Let me know what you think.
I'm interested in learning more about the bug here, but I don't think transferring the whole BAM is feasible. 40 GB is quite sizable for troubleshooting. In addition, I'm concerned about transferring human sequencing data around with regard to data protection and access permissions.
In the past, we had a similar problem due to coverage issues which we resolved and haven't seen come up internally. Based on your screenshots, I wonder if gl000248 is your actual issue. What if you remove just that contig from your BAM?
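When testing one contig removal at a time, it helps to put a time limit on each run so a hang is detected automatically rather than by watching top. The sketch below is a generic wrapper built on Python's `subprocess`. The command itself is supplied by the caller, so nothing here assumes particular SomaticSniper options, and the function name is made up.

```python
import subprocess
import sys


def finishes_within(cmd, timeout_seconds):
    """Run cmd; return True if it exits cleanly in time, False if it times out.

    A non-zero exit status raises CalledProcessError, so a crash is not
    mistaken for a hang.
    """
    try:
        subprocess.run(
            cmd,
            check=True,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False  # subprocess.run kills the child when the timeout expires
    return True


if __name__ == "__main__":
    # Stand-in command; replace with the real caller command line for one test BAM.
    print(finishes_within([sys.executable, "-c", "pass"], 30))
```

Since a normal run here takes a couple of hours, the limit would need to be set comfortably above that.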
I've just run into the same problem with SomaticSniper 1.0.4.
The job got stuck somewhere around chrUn_gl000241. SomaticSniper is running at 100% but no progress is made.
I observe a similar problem... ran it in gdb after letting it run for an hour or two (on a targeted sequencing dataset ... high depth)
here are the last couple of output lines.
You're probably going to want to either post this as a new question or file an issue here