I have some breakpoint data and would like to determine the type/class of each rearrangement. I believe BreakDancer can do this.
Could someone please show me how a BAM file is formatted, so I can tell whether my data has enough information to be reformatted for use with this tool?
EDIT:
Example of typical data -
Germline/Somatic | Evidence | #Solexa reads | Chr L | Pos L | Strand L | Chr H | Pos H | Strand H | Microhomology length (bp) | Microhomology seq | Non-templated sequence length (bp) | Non-templated sequence
Somatic | Seq | 1 | 18 | 19092052 | + | 18 | 30289323 | + | 0 | | 0 |
I already have a script which pulls back the sequence observed across a breakpoint if this is helpful.
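To make clear what information each row carries, here is a minimal Python sketch that loads the example row into a record and derives a few descriptive features from it. The names `Breakpoint` and `coarse_features` are hypothetical, made up for illustration. The sketch deliberately does not assign a rearrangement class, because how strand pairs map to classes depends on the convention of whatever pipeline produced the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Breakpoint:
    # Field names mirror the "L" (low) and "H" (high) columns of the table.
    chr_l: str
    pos_l: int
    strand_l: str
    chr_h: str
    pos_h: int
    strand_h: str


def coarse_features(bp: Breakpoint) -> dict:
    """Describe a breakpoint pair without assigning a rearrangement class."""
    same_chr = bp.chr_l == bp.chr_h
    span: Optional[int] = abs(bp.pos_h - bp.pos_l) if same_chr else None
    return {
        "scope": "intrachromosomal" if same_chr else "interchromosomal",
        "orientation": bp.strand_l + bp.strand_h,  # e.g. "++", "+-"
        "span_bp": span,  # only meaningful when both ends share a chromosome
    }


# Values taken from the example row above.
example = Breakpoint("18", 19092052, "+", "18", 30289323, "+")
print(coarse_features(example))
```

So the data holds chromosome, position and strand for both ends of each junction, but no individual reads.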
Many BAM files are available here: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/ . A BAM file is typically produced by an aligner after a next-generation sequencing experiment, and it holds the individual read alignments. I suspect that your "breakpoint data" did not come from such a pipeline and that BreakDancer is not the right tool. If you have more questions, you'll need to clarify what type of data you have.
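To give a feel for what a BAM file holds: BAM is the compressed binary form of SAM, and `samtools view` prints it as tab-separated text with one line per aligned read. Each line has 11 mandatory fields followed by optional tags. The rough Python sketch below splits one such line into named fields. The read shown is invented for illustration and is not real data, and `parse_sam_line` is a hypothetical helper, not part of any library.

```python
# The 11 mandatory SAM fields, in order, as defined by the SAM specification.
SAM_FIELDS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line: str) -> dict:
    """Split one SAM alignment line into its mandatory fields plus optional tags."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs at least 11 tab-separated fields")
    record = {
        name: int(value) if name in INT_FIELDS else value
        for name, value in zip(SAM_FIELDS, cols)
    }
    record["TAGS"] = cols[len(SAM_FIELDS):]  # optional TAG:TYPE:VALUE entries
    return record


# Invented example read (illustrative only, not taken from any real dataset).
line = "read1\t99\t18\t19092000\t60\t10M\t=\t19092200\t210\tACGTACGTAC\tIIIIIIIIII\tNM:i:0"
rec = parse_sam_line(line)
print(rec["RNAME"], rec["POS"], rec["CIGAR"], rec["SEQ"])
```

The point is that every record describes a single read (its name, position, CIGAR string, bases and qualities), which is the kind of input a read-pair-based caller works from. A table of already-called breakpoints does not contain that per-read information.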
What do you mean by "breakpoint data"?
Thank you for the information. I'm looking at the sort of data published here: http://genome.cshlp.org/content/early/2013/02/13/gr.143677.112/suppl/DC1 (e.g. Supplementary Table 3). In this case there is a "variantClass" column, but not all datasets have one.