vg autoindex seems stuck after a week of running
1
1
Entering edit mode
6 months ago
StevieP ▴ 10

HI,

We have a small local server which I started running vg autoindex on a week ago and it's still running. To me it's likely to just keep going ad infinitum as it doesn't seem to be pruning very efficiently. It's running inside the provided docker container on OSX with the genome directory mounted as /genome. I've created a folder within that directory (on the server) called tmp for all temp output to be written to. The command given was:

vg autoindex -t 24 -w mpmap -w rpvg \
  -M 240G \
  -T /genome/tmp \
  -p /genome/full/hprc-v1.1-mc-grch38-gencode45 \
  -x /genome/gencode.v45.primary_assembly.annotation.gtf \
  -r /genome/GRCh38.primary_assembly.genome.fa \
  -v /genome/hprc-v1.1-mc-grch38.vcfbub.a100k.wave.vcf.gz \
  -V 2 &> /genome/full/complete_build.log

We've just hit seven days which really seems excessive to me, and all files have been created except the gcsa files

$ ll full 
total 32457336
-rw-r--r--  1 stevieped  wgs   1.2M 22 May 10:24 complete_build.log
-rw-r--r--  1 stevieped  wgs   701M 15 May 16:39 hprc-v1.1-mc-grch38-gencode45.haplotx.gbwt
-rw-r--r--  1 stevieped  wgs   5.3G 15 May 17:42 hprc-v1.1-mc-grch38-gencode45.spliced.dist
-rw-r--r--  1 stevieped  wgs     0B 20 May 17:28 hprc-v1.1-mc-grch38-gencode45.spliced.gcsa
-rw-r--r--  1 stevieped  wgs     0B 20 May 17:28 hprc-v1.1-mc-grch38-gencode45.spliced.gcsa.lcp
-rw-r--r--  1 stevieped  wgs   9.0G 15 May 17:07 hprc-v1.1-mc-grch38-gencode45.spliced.xg
-rw-r--r--  1 stevieped  wgs   442M 15 May 16:39 hprc-v1.1-mc-grch38-gencode45.txorigin.tsv

The section that seems stuck on an infinite loop in what appears in the log file as

[IndexRegistry]: Constructing GCSA/LCP indexes.
[IndexRegistry]: Forked into child process with PID 9007.
[IndexRegistry]: Forked into child process with PID 9008.
InputGraph::InputGraph(): 8676064820 kmers in 48 file(s)
InputGraph::readKeys(): 1721650899 unique keys
InputGraph::readFrom(): 7331206567 unique start nodes
GCSA::GCSA(): Prefix-doubling from path length 16
GCSA::GCSA(): Step 1 (path length 16 -> 32)
GCSA::GCSA(): Step 2 (path length 32 -> 64)
GCSA::GCSA(): Step 3 (path length 64 -> 128)
GCSA::GCSA(): Step 4 (path length 128 -> 256)

This has appeared previously twice in the log file as

[IndexRegistry]: Constructing GCSA/LCP indexes.
[IndexRegistry]: Forked into child process with PID 48452.
[IndexRegistry]: Forked into child process with PID 48453.
InputGraph::InputGraph(): 8640165890 kmers in 48 file(s)
InputGraph::readKeys(): 1722608560 unique keys
InputGraph::readFrom(): 7331277927 unique start nodes
GCSA::GCSA(): Prefix-doubling from path length 16
GCSA::GCSA(): Step 1 (path length 16 -> 32)
GCSA::GCSA(): Step 2 (path length 32 -> 64)
GCSA::GCSA(): Step 3 (path length 64 -> 128)
GCSA::GCSA(): Step 4 (path length 128 -> 256)
GCSA::GCSA(): Merging the paths
MergedGraph::MergedGraph(): Size limit exceeded, construction aborted
[IndexRegistry]: Exceeded disk or memory use limit while performing k-mer doubling steps. Rewinding to pruning step with more aggressive pruning to simplify the graph.
[IndexRegistry]: Pruning complex regions of spliced VG to prepare for GCSA indexing with GBWT unfolding.

and above that was the first time it appeared

[IndexRegistry]: Constructing GCSA/LCP indexes.
[IndexRegistry]: Forked into child process with PID 1264.
[IndexRegistry]: Forked into child process with PID 1265.
InputGraph::InputGraph(): 8582023968 kmers in 48 file(s)
InputGraph::readKeys(): 1723923355 unique keys
InputGraph::readFrom(): 7331378415 unique start nodes
GCSA::GCSA(): Prefix-doubling from path length 16
GCSA::GCSA(): Step 1 (path length 16 -> 32)
GCSA::GCSA(): Step 2 (path length 32 -> 64)
GCSA::GCSA(): Step 3 (path length 64 -> 128)
GCSA::GCSA(): Step 4 (path length 128 -> 256)
PathGraphBuilder::write(): Memory use of file 0 of kmer paths (240.0 GB) exceeds memory limit (240.0 GB)
PathGraphBuilder::write(): Memory use of file 0 of kmer paths (240.1 GB) exceeds memory limit (240.0 GB)
[IndexRegistry]: Exceeded disk or memory use limit while performing k-mer doubling steps. Rewinding to pruning step with more aggressive pruning to simplify the graph.
[IndexRegistry]: Pruning complex regions of spliced VG to prepare for GCSA indexing with GBWT unfolding.

This suggests to me there may be problems with resource allocation and the internal estimation vg is performing. There are no limits on the HDD space (until 3.8TB) and the OSX docker interface has been set carefully to allow the correct resources for the process. Swap seems to be being set dynamically by the process and is oscillating between using ~13Gb of 15Gb and using 30Gb of 32Gb, and everywhere in between. None of us really understand what's happening there.

We need the server for other work but don't want to shut the process down at this point. This >7 day duration feels concerning and are anxious that this step is just going to keep repeating over & over. Testing on individual chromosomes ran to completion within hours so were unprepared for losing our server for days or even weeks.

vg pangenome • 511 views
ADD COMMENT
2
Entering edit mode
6 months ago
Jouni Sirén ▴ 470

GCSA construction is fragile. It can use effectively unlimited amounts of time, memory, and disk space if the graph is too complex. That's why we try to detect it early, abort the construction, and try again with more aggressive pruning.

Because you got the "Size limit exceeded" message in the merging phase, disk usage must have been almost low enough for the construction to finish. By default, temporary disk space usage is limited to 2 TB. You can increase the limit to N bytes with the hidden vg autoindex option --gcsa-size-limit N. And it has to be in bytes; you can't use the k/M/G suffixes available in -M.

It's possible that the construction will eventually find pruning parameters that work. However, since the 2 TB limit is not that much above the expected space usage with a well-behaving human graph, it's also possible that the graph you have will need above 2 TB disk space for GCSA construction.

ADD COMMENT
0
Entering edit mode

Thanks for this. As a follow-up, the process did indeed complete after 8.5 days so I guess my concerns were unwarranted in the end.

ADD REPLY

Login before adding your answer.

Traffic: 1830 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6