Practical Haplotype Graph v2 not finding correct paths
6 months ago

Hello, I am using Practical Haplotype Graph (PHG) v2.2.85.134 to build a pangenome graph from six diploid plant species (12 haplotypes). I was able to get through the Build and Load module. I then simulated 10 million paired-end Illumina NovaSeq WGS reads from two of my haplotypes (australasica_primary and fallglo_primary) and mapped them to the pangenome using the Imputation module. I expected most of the haplotype paths in the imputed VCF file to originate from those two haplotypes, but that is not the case. Below is the number of haplotype paths assigned to each of my haplotypes.

  • 5,232 australasica_alternate
  • 325 australasica_primary
  • 11,390 australis_alternate
  • 14,105 australis_primary
  • 1,439 fallglo_alternate
  • 10,086 fallglo_primary
  • 12,617 fortune_alternate
  • 5 inodora_alternate
  • 17 inodora_primary
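
To put a number on the mismatch, here is a minimal sketch in plain Python (not part of PHG) that computes the share of paths assigned to the two simulated haplotypes. The dictionary simply transcribes the counts listed above; the function and variable names are hypothetical and chosen only for illustration.

```python
# Haplotype path counts transcribed from the list above.
path_counts = {
    "australasica_alternate": 5232,
    "australasica_primary": 325,
    "australis_alternate": 11390,
    "australis_primary": 14105,
    "fallglo_alternate": 1439,
    "fallglo_primary": 10086,
    "fortune_alternate": 12617,
    "inodora_alternate": 5,
    "inodora_primary": 17,
}

# The two haplotypes the reads were simulated from.
expected = ("australasica_primary", "fallglo_primary")


def expected_share(counts, expected_names):
    """Return the fraction of all paths assigned to the expected haplotypes."""
    total = sum(counts.values())
    hits = sum(counts.get(name, 0) for name in expected_names)
    return hits / total


share = expected_share(path_counts, expected)
print(f"{share:.1%} of {sum(path_counts.values()):,} paths")
# prints: 18.9% of 55,216 paths
```

So fewer than one in five paths trace back to the haplotypes the reads were simulated from.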

I understand there are a lot of moving parts to this pipeline and I would be happy to provide more details if requested. Any suggestion is highly appreciated. Thank you.

Pangenome PHG graph • 888 views
6 months ago
pjb39 ▴ 220

There have been a couple of bugs found that affect imputation accuracy, one in build-kmer-index and one in find-paths. The find-paths bug only affects haploid path finding. Those should be fixed in the next few days. I will post here when the fix is available.

Hello, I just wanted to check whether the fix has been released yet. Thank you.

The timing for your question is excellent. The fix was released yesterday.

That's great. I will test it and let you know how it goes. Thank you.

6 months ago

I just tested the fixed version and it performs much better. Most of the haplotype paths now come from the expected haplotypes (australasica_primary and fallglo_primary).

  • 302 australasica_alternate
  • 32,242 australasica_primary
  • 60 australis_alternate
  • 19 australis_primary
  • 1,616 fallglo_alternate
  • 11,797 fallglo_primary
  • 7,662 fortune_alternate
  • 32 inodora_alternate
  • 33 inodora_primary
  • 6,805 wilking_alternate
  • 6,972 wilking_primary
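
As a rough measure of how far this is from ideal, the sketch below (plain Python, not part of PHG) ranks the haplotypes by their share of paths and reports the combined share of the two simulated haplotypes. The counts are transcribed from the list above; all names in the code are hypothetical.

```python
# Haplotype path counts after the fix, transcribed from the list above.
counts_after_fix = {
    "australasica_alternate": 302,
    "australasica_primary": 32242,
    "australis_alternate": 60,
    "australis_primary": 19,
    "fallglo_alternate": 1616,
    "fallglo_primary": 11797,
    "fortune_alternate": 7662,
    "inodora_alternate": 32,
    "inodora_primary": 33,
    "wilking_alternate": 6805,
    "wilking_primary": 6972,
}

simulated = {"australasica_primary", "fallglo_primary"}
total_paths = sum(counts_after_fix.values())

# Rank haplotypes from most to fewest assigned paths.
ranked = sorted(counts_after_fix.items(), key=lambda item: item[1], reverse=True)
for name, n in ranked:
    marker = "*" if name in simulated else " "  # '*' marks a simulated haplotype
    print(f"{marker} {name:<24} {n:>7,} {n / total_paths:6.1%}")

# Combined share of the two haplotypes the reads were simulated from.
combined = sum(counts_after_fix[name] for name in simulated) / total_paths
print(f"simulated haplotypes: {combined:.1%} of {total_paths:,} paths")
# last line prints: simulated haplotypes: 65.2% of 67,540 paths
```

About a third of the paths are still assigned to other haplotypes, mostly fortune_alternate and the two wilking haplotypes.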

My follow-up question: how can we further improve imputation accuracy with our PHG database? Are there any parameters in the AnchorWave alignment step that we should try changing? Any suggestions are greatly appreciated. Thank you!
