I would like to phase pairs of closely situated somatic mutations. Basically, I want to grab the sequencing read IDs supporting the variant allele for each mutation, and compare to see if the same reads report nearby mutations in the pair.
I know that I can view short segments of bam files centered at each pair in IGViewer or other similar tools, and visually verify this. But I am looking for a more systematic way.
Please excuse me if this is too obvious of a question. Any pointers will be appreciated.
Thank you. I took a quick look and seems relevant. But I am not really interested in doing so on a large scale; and I do not see an option specifying the regions of interest from the bam file. So, I assume I will have to make small bam files around my regions of interest. Is this right?
You might be able to make a BED file with the regions of interest in it and samtools view -bL regions.bed something.bam | samtools phase ... -. Hopefully that works since it'd be simpler.
Glad it's doing what you need. I made that an answer for the sake of posterity (and so others don't feel compelled to see if this still needs answering).
The simplest solution would be to use samtools phase. If you only have a few regions of interest (ROIs), then it'll be more efficient to pipe a subset of your original alignment into samtools. This can be done with:
Have you tried
samtools phase
?Thank you. I took a quick look and seems relevant. But I am not really interested in doing so on a large scale; and I do not see an option specifying the regions of interest from the bam file. So, I assume I will have to make small bam files around my regions of interest. Is this right?
You might be able to make a BED file with the regions of interest in it and
samtools view -bL regions.bed something.bam | samtools phase ... -
. Hopefully that works since it'd be simpler.Thanks for the great suggestion. I am going to give it a try. If It achieves what I am looking for, you should move your comment to an answer :)
It indeed achieves what I need! You have a correct answer here.
Glad it's doing what you need. I made that an answer for the sake of posterity (and so others don't feel compelled to see if this still needs answering).
@Noushin N: Do you have an example script? I tried samtools phase with a bed file of about 3 SNPs, and it resulted in a Segmentation fault :(