I'm attempting to run breakdancer-max, and am running into an error that I can not find documentation on. My script looks like this:
=======
#breakdancer directory
BD="/share/apps/ngs-ccts/breakdancer-1.4.2"
#File name
FILE="CNTL3487.sort.bam"
#Output directory
DIR="breakdancer_test"
#Create config file
perl $BD/perl/bam2cfg.pl $FILE > $DIR/CNTL3487.cfg
$BD/bin/breakdancer-max -d CNTL3487 $DIR/CNTL3487.cfg > $DIR/CNTL3487.ctx
======
And the output that I get looks like this:
======
[Wed Oct 30 11:25:46 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] Processing bam: CNTL3487.sort.bam
[Wed Oct 30 11:25:49 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] selected_libs is : 1
[Wed Oct 30 11:25:49 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] Closing BAM file
[Wed Oct 30 11:25:49 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] Send TERM signal for 10538
[Wed Oct 30 11:25:51 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] samtools pid process 10538 is still there...
[Wed Oct 30 11:25:51 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] invoking kill -9 on 10538 ...
[Wed Oct 30 11:25:51 2013 /share/apps/ngs-ccts/breakdancer-1.4.2/perl/bam2cfg.pl] Closing samtools process : 10538
Max Kahan error: 0
=====
All of the output except for the Max Kahan error : 0 line is generated by bam2cfg.pl, and the Kahan line is generated by breakdancer-max. This also generates two files in the home directory (same directory I'm running in, and the one containing CNTL3487.sort.bam) with a size of 0 called CNTL3487.NA.2.fastq and CNTL3487.NA.2.fastq. It also creates a CNTL3487.ctx file in the breakdancer_test/ folder that looks like this:
======
#Software: 1.4.3 (commit 1bd2363)
#Command: /share/apps/ngs-ccts/breakdancer-1.4.2/bin/breakdancer-max -d CNTL3487 breakdancer_test/CNTL3487.cfg
#Library Statistics:
#CNTL3487.sort.bam mean:605.16 std:159.56 uppercutoff:1222.92 lowercutoff:0 readlen:144 library:NA reflen:3005868224 seqcov:0.238852 phycov:0.501888 1:100 2:6504:636 8:50 32:2722
#Chr1 Pos1 Orientation1 Chr2 Pos2 Orientation2 Type Size Score num_Reads num_Reads_lib CNTL3487.sort.bam
======
Note that the .bam file used here contains sequence reads from genomic material that first went through a capture step to isolate a region from Chr1.
I was expecting there to be predictions, and the program finished running in only a few minutes (whereas I was expecting a run time of at least an hour or so based on what I've read). These facts lead me to believe that either breakdancer isn't working or I'm not implementing it correctly.
I just reran the script and echo "$?" returns 24085. The whole process finished in less than a minute.
Below is the configuration file.
So I see that you do have reads flagged as crossing chromosomes, but I'm also wondering if you have included all reads from this experiment. You restrict your reads to only those mapping to chromosome 1, for example?
Your insert size distribution also looks to be quite large. Do you have multiple libraries in this experiment? If so, you need to have them properly marked using read groups for BreakDancer to perform optimally.
The .BAM file has not been filtered and I am not using multiple libraries.
For this experiement, genomic DNA underwent a BAC-clone based capture method in which two BAC clones were used to amplify a ~180kb region of Chr1. The result of this capture method is that ~50-70% of the reads could be mapped to this region of Chr1.
The variability in insert size may be due to a property of this region: it has a very high degree of sequence homology within it. The first half of the region is ~98% identical at the nucleotide sequence level to the second half.
Based on what I've seen in IGV, the reads spanning chromosomes could be due to repetitive regions, within the region that map to multiple chromosomes.
The types of variations I expect to see would be intra-region duplications and/or deletions. Prior studies have shown that these variations occur within the region, although intra-chromosomal junctions can't be ruled out.
Based on this and your other Biostar question, we think there may be a bug. The released 1.4.2 version does seem to be functioning properly though. You can get that copy of the code by adding the following line after the initial git clone in the install instructions.
git checkout v1.4.2
Then proceed with creating the build directory and compiling as normal.
This was in fact a bug. It's been fixed on master and tagged v1.4.4. If you git pull to update your checkout and try again, you may get a better result.