Question

Practical Haplotype Graph -- Pathing Error

21 months ago

wcs98 • 0

I am getting an error in the pathing step of the PHG (version 1.3) pipeline. I have 19 taxa, each with ~71,000 reference ranges, AnchorWave haplotypes from assemblies, and I have been able to map short-read samples to the indexed pangenome. However, when I run the -imputePipeline plugin to "path", I encounter an error during the BestHaplotypePathPlugin. Do I need to use a smaller number of reference ranges/haplotypes to avoid this error?

Line that causes an error (it prints out all ~1.3 million hap ids):

[pool-1-thread-1] INFO net.maizegenetics.pangenome.api.CreateGraphUtils - CreateGraphUtils:addNodes - query=SELECT haplotypes_id, gamete_grp_id, haplotypes.ref_range_id, asm_contig, asm_start_coordinate, asm_end_coordinate, asm_strand, genome_file_id, seq_hash, seq_len FROM haplotypes WHERE haplotypes_id in (71255, 71256, 71257,...

Error:

[pool-1-thread-1] DEBUG net.maizegenetics.pangenome.api.CreateGraphUtils - [SQLITE_TOOBIG] String or BLOB exceeds size limit (statement too long) 
org.sqlite.SQLiteException: [SQLITE_TOOBIG] String or BLOB exceeds size limit (statement too long)

Full log file

PHG • 1.1k views

ADD COMMENT • link updated 20 months ago by lcj34 ▴ 420 • written 21 months ago by wcs98 • 0

Will you post the commands you have used for running these steps? When creating the graph that is used for the imputation steps, you have included a list of hapids. This is a very long list, and it makes the query sent to the db too large to process.

When you give the method names for creating the graph, this should pull the hapids needed, and they should not need to be explicitly defined unless you want/need only a specific subset of haplotypes. If you will post your commands and indicate specific ways you need the graph filtered, we can help get you the correct commands.

ADD REPLY • link 21 months ago by lcj34 ▴ 420

Thanks for posting. We've been able to reproduce this issue (from another instance where it was reported in-house) and have identified a bug in the code that affects SQLite db queries when imputing with long hapid lists. We hope to have this fixed in the next PHG release. I'll let you know when it is available.

ADD REPLY • link 21 months ago by lcj34 ▴ 420
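
For anyone who hits SQLITE_TOOBIG in their own scripts against a SQLite db, here is a minimal sketch of the general workaround: split a long id list into batches and bind the ids as parameters instead of writing them into one huge statement. This is not the PHG fix (the thread does not say how PHG 1.4 changed the query). The toy table only borrows three column names from the logged query, and `batched`, `fetch_haplotypes`, and the batch size of 500 are assumptions made up for the illustration.

```python
import sqlite3
from itertools import islice


def batched(ids, size):
    """Yield successive lists of at most `size` ids (hypothetical helper)."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def fetch_haplotypes(conn, hap_ids, batch_size=500):
    """Fetch rows for many ids with several short queries instead of one long one.

    batch_size=500 is an assumed value, chosen to stay well under SQLite's
    per-statement limit on bound parameters.
    """
    rows = []
    for chunk in batched(hap_ids, batch_size):
        # One "?" placeholder per id; the ids themselves are bound, not inlined.
        placeholders = ",".join("?" * len(chunk))
        query = (
            "SELECT haplotypes_id, ref_range_id, seq_len FROM haplotypes "
            f"WHERE haplotypes_id IN ({placeholders})"
        )
        rows.extend(conn.execute(query, chunk).fetchall())
    return rows


# Toy in-memory table standing in for a real db (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE haplotypes ("
    "haplotypes_id INTEGER PRIMARY KEY, ref_range_id INTEGER, seq_len INTEGER)"
)
conn.executemany(
    "INSERT INTO haplotypes VALUES (?, ?, ?)",
    [(i, i % 100, 1000 + i) for i in range(1, 5001)],
)

wanted = list(range(1, 5001, 2))  # every other id
rows = fetch_haplotypes(conn, wanted)
print(len(rows))
```

Each statement stays short no matter how many ids are requested, so the statement-length limit is never approached. The cost is more round trips to the db.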

score 0 · Answer 1 · 2023-03-20

20 months ago

lcj34 ▴ 420

FYI - The latest PHG docker version (tag 1.4) has a fix for this issue.

ADD COMMENT • link 20 months ago by lcj34 ▴ 420