How Does The Samtools Phase Operation Work?
10.8 years ago
Adrian Pelin ★ 2.6k

Hello,

I have a highly polymorphic genome (about a 10% polymorphism rate), and I expect this variation to be due to heterozygosity. What is the best way to phase my haplotypes? My data are Illumina whole-genome sequencing reads in FASTQ format, which I aligned to a reference using bwa.

I found that samtools has a module for this called phase, and I ran it on my .bam file. However, I have no idea how to analyze the output. I would like to extract the phased haplotypes it builds and measure their frequencies.

Thank you.

vcf ngs haplotype samtools • 8.4k views
10.3 years ago
Adrian Pelin ★ 2.6k

Unfortunately, I was never able to use this, but I did get some information from the author:

The algorithm is very simple, but it does not work well (i.e. it produces switching errors) when there are long gaps between markers. It is based on a score-based HMM. The hidden states are all possible 15-marker haplotypes. The best phase, in terms of minimal error correction (the so-called MEC problem), is found by straightforward dynamic programming. The method is described in a paper published in Bioinformatics in 2010 (Optimal algorithms for haplotype assembly from whole-genome sequence data), though I found the algorithm independently; it is very simple. The paper also describes an algorithm to eliminate the long-gap limitation, but it is not implemented in samtools.

The method is robust to sequencing errors and, to some extent, to mapping errors. As it was primarily designed for fosmid pool sequencing (Kitzman et al), it is also implemented to correct switching errors caused by wrong fosmid identification.

Heng

Hope this helps.
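
To make the MEC objective mentioned in the quoted reply more concrete, here is a minimal Python sketch. It is not the samtools implementation: it uses exhaustive search over a tiny block of sites rather than the HMM and dynamic programming described above, and the read encoding (a dict mapping a heterozygous site index to allele 0 or 1) and the function names are assumptions made up for illustration.

```python
from itertools import product


def mismatches(read, hap):
    """Count the sites where a read's allele differs from the haplotype's."""
    return sum(1 for site, allele in read.items() if hap[site] != allele)


def mec_score(reads, hap):
    """Minimum error correction score for one candidate phasing.

    Each read is assigned to whichever of the two complementary
    haplotypes it fits better; the score is the total number of
    allele calls that would have to be corrected.
    """
    comp = tuple(1 - a for a in hap)
    return sum(min(mismatches(r, hap), mismatches(r, comp)) for r in reads)


def best_phase(reads, n_sites):
    """Exhaustively find the haplotype with the lowest MEC score.

    The first site is fixed to allele 0 because a haplotype and its
    complement describe the same phasing. This is exponential in
    n_sites, so it is only usable for very small blocks.
    """
    best = None
    for rest in product((0, 1), repeat=n_sites - 1):
        hap = (0,) + rest
        score = mec_score(reads, hap)
        if best is None or score < best[1]:
            best = (hap, score)
    return best


# Toy data: five reads over three heterozygous sites (hypothetical).
reads = [
    {0: 0, 1: 0},
    {0: 1, 1: 1},
    {1: 0, 2: 1},
    {1: 1, 2: 0},
    {0: 0, 1: 0, 2: 0},  # disagrees with the others at site 2
]
hap, score = best_phase(reads, 3)
print(hap, score)  # (0, 0, 1) 1
```

In this toy example the best phasing is 0-0-1 on one haplotype (and 1-1-0 on the other), with a single allele call that has to be corrected. With long gaps between markers, few reads span adjacent sites, so several phasings can score equally well, which is consistent with the switching errors mentioned in the reply.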

10.3 years ago
moha • 0

Hello,

I have the same problem. I have used linkSNPs before, and now I want to use the samtools phase module, but I couldn't find any manual explaining how to use it. Did you find an answer to your question? How did you analyze the output? I would be grateful if you could share your experience with me.

Thank you
