I am trying to estimate the frequency of a structural breakpoint in a pooled NGS sample. The coordinates of the breakpoint have been confirmed by PCR. Are there any off-the-shelf tools that can do this?
Potential solution:
I could count reads that contain the breakpoint in the BAM file. I would need to use some sort of fuzzy pattern matching. However, I would rather not write new code.
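For reference, the counting idea could look roughly like the sketch below. It assumes the read sequences have already been pulled out of the BAM as plain strings (that step is not shown), that "fuzzy" means a fixed number of substitutions with no indels, and that the junction sequence is a short stretch spanning the breakpoint. The function names are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def contains_fuzzy(read, pattern, max_mismatches=1):
    """True if pattern occurs somewhere in read with at most max_mismatches substitutions."""
    read = read.upper()
    pattern = pattern.upper()
    k = len(pattern)
    for start in range(len(read) - k + 1):
        mismatches = 0
        for a, b in zip(read[start:start + k], pattern):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches at this offset, try the next one
        else:
            return True  # inner loop finished without exceeding the limit
    return False


def count_junction_reads(reads, junction, max_mismatches=1):
    """Count reads containing the junction sequence on either strand."""
    junction_rc = reverse_complement(junction)
    count = 0
    for read in reads:
        if (contains_fuzzy(read, junction, max_mismatches)
                or contains_fuzzy(read, junction_rc, max_mismatches)):
            count += 1
    return count


if __name__ == "__main__":
    example_reads = [
        "TTAAACCCGGGTTT",  # exact match
        "TTAAACCTGGGTTT",  # one substitution
        "GGACCCGGGTTTGG",  # reverse-complement match
        "TTTTTTTTTTTTTT",  # no match
    ]
    print(count_junction_reads(example_reads, "AAACCCGGGT", max_mismatches=1))
```

This is a brute-force scan, so it is only practical on a subset of reads (for example, reads near the breakpoint plus unmapped reads), and it will miss reads with indels near the junction.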
b) This is generally difficult, because breakpoint-spanning reads may map to one side or the other, but also may not map at all. To get a high-confidence call, you could create a short contig containing your breakpoint sequence +/- 200 bp, append it to the reference genome, then realign all of your reads against this. Compare the depth of breakpoint-spanning reads on your contig to the depth at the breaks on the original reference sequence, and that ratio will give you a pretty good idea of the frequency.
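A rough sketch of the two bookkeeping steps in that approach, building the contig and turning depths into a frequency. The assumptions here are mine: the flanking sequences on each side of the junction are already available as strings, the spanning-read depths were counted separately after realignment, and the frequency is taken as junction depth divided by junction depth plus the mean reference depth at the two original break positions. The function names are hypothetical, and the realignment itself is not shown.

```python
def build_breakpoint_contig(left_flank, right_flank, flank=200, name="breakpoint_contig"):
    """Join the last `flank` bases left of the junction to the first `flank` bases right of it.

    Returns (fasta_text, junction_offset), where junction_offset is the 0-based
    position in the contig of the first base after the junction.
    """
    if flank <= 0:
        raise ValueError("flank must be positive")
    left = left_flank[-flank:]
    right = right_flank[:flank]
    seq = left + right
    # Wrap the sequence at 60 characters per line, a common FASTA convention.
    lines = [seq[i:i + 60] for i in range(0, len(seq), 60)]
    fasta_text = ">" + name + "\n" + "\n".join(lines) + "\n"
    return fasta_text, len(left)


def breakpoint_frequency(junction_depth, ref_depth_left, ref_depth_right):
    """Estimate the fraction of molecules carrying the breakpoint.

    junction_depth: reads spanning the junction on the added contig.
    ref_depth_left / ref_depth_right: reads spanning each original break
    position on the unmodified reference (assumed to be the intact allele).
    """
    ref_depth = (ref_depth_left + ref_depth_right) / 2
    total = junction_depth + ref_depth
    if total == 0:
        raise ValueError("no spanning reads at the junction or the reference breaks")
    return junction_depth / total


if __name__ == "__main__":
    fasta, offset = build_breakpoint_contig("A" * 300, "C" * 300, flank=200)
    print(fasta.splitlines()[0], offset)
    print(breakpoint_frequency(10, 30, 30))
```

The FASTA text can be appended to the reference before rebuilding the aligner index. Requiring spanning reads to cover a few bases on both sides of the junction offset helps avoid counting reads that only touch one flank.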
You're right, I suspect my reads with the breakpoint will be tossed: 1) they will have poor alignment scores and 2) end-to-end mode in bowtie2 shouldn't allow them. For some stupid reason I thought this would be a trivial task.
No worries, man. If I had a dollar for every time something turned out to be way harder than I expected...