Tool: Read-based phasing with WhatsHap
7.9 years ago
Marcel M ▴ 100

We are happy to announce WhatsHap, a tool that phases variants with the help of sequencing reads. It was designed to fully exploit PacBio and Oxford Nanopore reads, which are well suited for phasing because they span many variants. WhatsHap also works well on Illumina data and gives highly accurate results according to our comparison.

WhatsHap expects a VCF and a BAM file as input, and it outputs a standards-compliant VCF file with added phasing information.
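As a rough illustration of that input/output contract, here is a minimal Python sketch that assembles a basic `whatshap phase` call. The file names are placeholders, and the helper `build_phase_command` is hypothetical (it is not part of WhatsHap). Newer releases may also expect a reference FASTA via `--reference`, so check `whatshap phase --help` for your installed version.

```python
import shlex


def build_phase_command(vcf, bam, output, reference=None):
    """Return the argument list for a basic `whatshap phase` run.

    Hypothetical helper for illustration only; all paths are placeholders.
    """
    cmd = ["whatshap", "phase", "-o", output]
    if reference is not None:
        # Optional reference FASTA (assumed to be needed only by some versions).
        cmd += ["--reference", reference]
    # Positional inputs: the unphased VCF, then the aligned reads.
    cmd += [vcf, bam]
    return cmd


cmd = build_phase_command("input.vcf", "input.bam", "phased.vcf")
print(shlex.join(cmd))
# To actually run it (requires WhatsHap to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.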

WhatsHap can even make use of related samples such as trios by combining read-based phasing with genetic phasing, boosting the accuracy even further.
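For a trio, the relationships are passed to `whatshap phase` as a PED file via `--ped`. The sketch below shows what such a call might look like. The sample names, file names, and both helper functions are assumptions made for illustration, and the sample IDs in the PED line are assumed to match those in a multi-sample VCF.

```python
def trio_ped_line(family, child, father, mother):
    """Format one PED record describing a child and its two parents.

    PED columns: family ID, individual ID, father ID, mother ID, sex, phenotype.
    Sex and phenotype are written as 0 ("unknown") in this sketch.
    """
    return "\t".join([family, child, father, mother, "0", "0"])


def build_trio_command(ped_path, vcf, bams, output):
    """Return the argument list for a pedigree-aware `whatshap phase` run.

    Hypothetical helper; all paths are placeholders.
    """
    return ["whatshap", "phase", "--ped", ped_path, "-o", output, vcf, *bams]


# Placeholder sample names; they must match the sample columns of the VCF.
print(trio_ped_line("fam1", "child", "father", "mother"))
print(" ".join(build_trio_command("trio.ped", "trio.vcf", ["trio.bam"], "phased.vcf")))
```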

Additional features:

  • Open Source (MIT license)
  • Phases insertions and deletions
  • Installable from PyPI or bioconda
  • Can use reads from multiple technologies (such as PacBio and Illumina) simultaneously
  • Optionally outputs ReadBackedPhasing-compatible VCFs
  • Accepts already phased VCFs as input, letting you combine 10X Genomics output with PacBio, for example
  • Comes with extra subcommands for working with phased VCFs
  • Helps you in visualizing phasing results

Please visit http://whatshap.readthedocs.io/ or read the pre-print to learn more.

We also have a mailing list.

In default mode, the genotypes provided as input are fully trusted, which can indeed lead to additional switch errors at false-positive heterozygous sites. If your variant calls/genotypes are not rock solid, you should use --distrust-genotypes. WhatsHap will then change genotypes that are incompatible with the phasing, based on the provided GLs: less confident genotypes are overturned more easily than more confident ones. Especially in pedigree mode, I'd strongly recommend using --distrust-genotypes, since wrong genotypes can have a big impact on phasing results.
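To make the suggestion concrete, here is a small Python sketch that adds `--distrust-genotypes` to an existing `whatshap phase` argument list. The helper `with_distrust_genotypes` and the file names are hypothetical, and the input VCF is assumed to carry the genotype likelihoods mentioned above.

```python
def with_distrust_genotypes(cmd):
    """Return a copy of a `whatshap phase` argument list with
    --distrust-genotypes inserted right after the subcommand.

    Hypothetical helper for illustration; the input list is not modified.
    """
    if "--distrust-genotypes" in cmd:
        return list(cmd)
    return cmd[:2] + ["--distrust-genotypes"] + cmd[2:]


# Placeholder pedigree-mode command; file names are assumptions.
base = ["whatshap", "phase", "--ped", "trio.ped", "-o", "phased.vcf",
        "trio.vcf", "trio.bam"]
print(" ".join(with_distrust_genotypes(base)))
```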

Great work! How robust is the phasing with regard to false-positive variant calls?

PS: Saw on your profile page that you are working with Nanopore data. You might be interested in Michael Simpson's talk about using ONT to sequence a human genome. They've used WhatsHap for phasing: https://nanoporetech.com/human-genetics/results

Shoot, pasted my reply in the wrong box (see my answer below).

I moved it, but the result is not optimal, as you can see. Feel free to delete it and post again.

That's a cute name :)

This has been an excellent tool! Really great work from the authors. I've also been trying it on samples with more than two alleles, with mixed results. Was it designed to deal only with diploid genomes, or will there be future enhancements to accommodate multiple alleles?

Thanks! There’s recently been some work on polyploid phasing in a separate branch. As I understand it, this is mostly done, with some details left to work out, so I would expect it to be part of one of the next WhatsHap releases.
