I'm using breakdancer-max 1.4.4-unstable-7-6213d5a on exome sequencing data from tumor-normal pairs (all from Illumina HiSeq 2500, 200x coverage, whole exome using NimbleGen Exome v3.0). The BAM files are ~40 GB in size, all quality-trimmed using fastqc, aligned using BWA sampe, and processed using the GATK pipeline (duplicate removal, indel realignment, base quality score recalibration).
I first generate the config file using
bam2cfg.pl -g -h tumor.bam normal.bam > t.cfg
The output is
readgroup:CGATGT.L002 platform:ILLUMINA map:t.bam readlen:101.00 lib:310.T1.CGATGT num:9984 lower:0.00 upper:760.11 mean:245.43 std:101.07 SWnormality:-59.31 flag:0(13.62%)18(85.59%)2(0.13%)32(0.43%)4(0.23%)10771 exe:samtools view
readgroup:CGATGT.L001 platform:ILLUMINA map:t.bam readlen:101.00 lib:310.T1.CGATGT num:9984 lower:0.00 upper:760.11 mean:245.43 std:101.07 SWnormality:-59.31 flag:0(11.15%)18(88.06%)2(0.12%)32(0.53%)4(0.13%)8(0.02%)12148 exe:samtools view
readgroup:ATCACG.L002 platform:ILLUMINA map:n.bam readlen:101.00 lib:310.N.ATCACG num:9994 lower:0.00 upper:737.03 mean:241.92 std:99.54 SWnormality:-57.95 flag:0(14.29%)18(84.85%)2(0.11%)32(0.47%)4(0.27%)9586 exe:samtools view
readgroup:ATCACG.L001 platform:ILLUMINA map:n.bam readlen:101.00 lib:310.N.ATCACG num:9994 lower:0.00 upper:737.03 mean:241.92 std:99.54 SWnormality:-57.95 flag:0(12.77%)18(86.36%)2(0.11%)32(0.55%)4(0.21%)10618 exe:samtools view
I run breakdancer as shown below
breakdancer-max -q 10 -d t t.cfg > t.ctx
The run takes about 90 min and terminates with the message "Max Kahan error:0." However, it also produces a 16 MB .ctx file. The file contains calls from all chromosomes, so it seems that breakdancer-max ran fine apart from that message. The first few lines of the output are below.
#Software: 1.4.4-unstable-7-6213d5a (commit 6213d5a)
#Command: breakdancer-max -q 10 -d t t.cfg
#Library Statistics:
#n.bam mean:241.92 std:99.54 uppercutoff:737.03 lowercutoff:0 readlen:101 library:310.N.ATCACG reflen:3052047611 seqcov:6.83207 phycov:8.18225 1:14714 2:393238 4:426044 8:16096 32:569558
#t.bam mean:245.43 std:101.07 uppercutoff:760.11 lowercutoff:0 readlen:101 library:310.T1.CGATGT reflen:3052047611 seqcov:6.77381 phycov:8.23018 1:15354 2:549576 4:367739 8:16740 32:635131
#Chr1 Pos1 Orientation1 Chr2 Pos2 Orientation2 Type Size Score num_Reads num_Reads_lib 310.bam 310t.bam
chrM 4 6+17- chrM 16564 17+3- ITX 16001 99 14 n.bam|8:t.bam|6 129.24 111.03
chrM 509 100+26- chrM 743 100+26- ITX -144 99 12 n.bam|8:t.bam|4 NA NA
chrM 839 100+26- chrM 1201 2+4- DEL 535 46 2 t.bam|2 124.40 100.40
chrM 828 3+11- chrM 3198 151+249- ITX -144 99 58 .bam|32:t.bam|26 6.05 5.05
chrM 3102 151+249- chrM 3160 1+11- DEL 584 98 8 .bam|4:t.bam|4 47.92 45.76
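One way to check the claim that the .ctx file contains calls from all chromosomes is to tally calls per chromosome and SV type. The following is only a minimal sketch: the function name count_calls is hypothetical, and the column positions (Chr1 first, Type seventh) are assumed from the header line shown above. The sample lines are copied from the output above.

```python
from collections import Counter

def count_calls(lines):
    """Tally calls per (Chr1, Type), skipping '#' header lines.

    Assumes whitespace-separated columns in the order given by the
    '#Chr1 Pos1 Orientation1 Chr2 Pos2 Orientation2 Type ...' header.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header/comment or blank line
        fields = line.split()
        chr1, sv_type = fields[0], fields[6]
        counts[(chr1, sv_type)] += 1
    return counts

# Sample lines taken from the output above.
sample = [
    "#Command: breakdancer-max -q 10 -d t t.cfg",
    "chrM 4 6+17- chrM 16564 17+3- ITX 16001 99 14 n.bam|8:t.bam|6 129.24 111.03",
    "chrM 509 100+26- chrM 743 100+26- ITX -144 99 12 n.bam|8:t.bam|4 NA NA",
    "chrM 839 100+26- chrM 1201 2+4- DEL 535 46 2 t.bam|2 124.40 100.40",
]
sample_counts = count_calls(sample)
for (chrom, sv_type), n in sorted(sample_counts.items()):
    print(chrom, sv_type, n)
```

For the real file, passing an open file handle for t.ctx instead of the sample list should work the same way, since the function only iterates over lines.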
I saw older unresolved posts on Biostar about the same issue (Error Running Breakdancer-Max: Max Kahan Error: 0). I was wondering whether anyone has come up with a solution since then.
Thanks.
...that's quite possibly the worst status message ever... Nice tool, though. :)
Agreed lol. I added that a long time ago when I was worried about numerical precision in the scoring code, and never got around to taking it out... until now. It's gone in master on github :)
I got intrigued: who is Max Kahan, whose Error of Zero should be taken as a good sign?
Then I realized! It must be the maximum Kahan summation error, which we're happy to see is zero.
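For anyone else puzzled by the message, here is a rough illustration of Kahan (compensated) summation in Python. This is not breakdancer's actual code; the function name kahan_sum and the idea of reporting the largest compensation term as the "max error" are my assumptions about what such a message could be tracking.

```python
def kahan_sum(values):
    """Compensated summation.

    Returns (total, max_comp), where max_comp is the largest absolute
    compensation term seen, i.e. the biggest rounding error that a
    single addition would otherwise have lost.
    """
    total = 0.0
    comp = 0.0       # running compensation for lost low-order bits
    max_comp = 0.0
    for v in values:
        y = v - comp             # apply the correction from the last step
        t = total + y            # low-order bits of y may be lost here
        comp = (t - total) - y   # recover what was lost
        total = t
        max_comp = max(max_comp, abs(comp))
    return total, max_comp

# Exactly representable inputs: no rounding, so the max error is zero.
exact_total, exact_err = kahan_sum([1.0, 2.0, 3.0])
print(exact_total, exact_err)

# 0.1 is not exactly representable, so some additions round.
tenth_total, tenth_err = kahan_sum([0.1] * 10)
print(tenth_total, tenth_err)
```

A maximum of zero would then just mean that no addition lost any precision, which is the good news the message was apparently trying to deliver.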
That's hilarious! I've been thinking it was an inside joke, a reference to a TV show I've never seen or something similar. Google gave me this, which kind of puzzled me... http://www.crainsnewyork.com/article/20131117/FINANCE/311179997/max-kahan-sets-the-gold-standard#