Hello everyone,
I'm currently working on constructing my first pangenome using VG and could use some help. My approach involves using VCF files from individuals in HGDP and following the steps outlined in their GitHub tutorial: GitHub Tutorial Link. This tutorial fits my needs but hasn't been updated since 2020, so I'm not sure whether it is still up to date.
Here's an overview of the commands I've employed:
To begin, I generated .vg files for each chromosome and then pruned them:
vg construct -r ref.fa -v sub-chrXX.vcf.gz > pXX.vg
vg prune -t 8 -k 45 -r p${SLURM_ARRAY_TASK_ID}.vg > pruned${SLURM_ARRAY_TASK_ID}.vg
Then, I created various indexes:
vg ids -j $(for i in $(seq 1 22); do echo pruned${i}.vg; done)
vg index -x all.xg $(for i in $(seq 1 22); do echo pruned${i}.vg; done)
vg index -t 8 --temp-dir temp -g wg.gcsa pruned{1..22}.vg -p -Z 32768 2>&1
Everything was going well until I attempted to align some fastq reads using the following command:
vg map -x all.xg -g wg.gcsa -f ERR1423010_1.fastq > test.gam
And I got the following error:
terminate called after throwing an instance of 'std::runtime_error'
what(): Attempted to get handle for node 5777068 not present in graph
I've verified the validity of all the .vg files using vg validate, and they were all reported as valid.
I've been troubleshooting for a couple of days but have yet to identify the source of this issue. I'm using VG v1.48. Since the tutorial dates from 2020, I'm uncertain whether the workflow it describes still behaves the same way. I would greatly appreciate your help, as it took me several months just to generate that GCSA file :')
Thank you so much.
The XG should be produced from the full, unpruned graphs. Pruning is only a preprocessing step for making the GCSA index.
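To make the ordering concrete, here is a minimal sketch of the index-building steps with the XG taken from the full graphs and the GCSA from the pruned ones. It reuses the file names and flags from the question (p1.vg to p22.vg as the unpruned graphs) and assumes, following the usual vg index-construction order, that vg ids -j is run on the full graphs before pruning. The run and graph_list helpers and the DRY_RUN switch are hypothetical conveniences, not part of vg; by default the script only prints the commands so they can be checked before starting a long job.

```shell
#!/usr/bin/env bash
# Sketch only: XG from the full graphs, GCSA from the pruned graphs.
# Assumes p1.vg .. p22.vg are the unpruned per-chromosome graphs.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands (default); 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: print <prefix>1.vg .. <prefix>22.vg separated by spaces.
graph_list() {
    local prefix="$1" i
    for i in $(seq 1 22); do
        printf '%s%d.vg ' "$prefix" "$i"
    done
}

# 1. Join the node ID space across chromosomes on the FULL graphs (assumed order).
run vg ids -j $(graph_list p)

# 2. Build the XG from the full, unpruned graphs.
run vg index -x all.xg $(graph_list p)

# 3. Prune each full graph; the pruned copies are used only for the GCSA.
for i in $(seq 1 22); do
    if [ "$DRY_RUN" = "1" ]; then
        echo "vg prune -t 8 -k 45 -r p${i}.vg > pruned${i}.vg"
    else
        vg prune -t 8 -k 45 -r "p${i}.vg" > "pruned${i}.vg"
    fi
done

# 4. Build the GCSA from the pruned graphs.
run vg index -t 8 --temp-dir temp -g wg.gcsa -p -Z 32768 $(graph_list pruned)
```

Running the script as-is prints the four steps; setting DRY_RUN=0 executes them, after which the vg map command from the question can be run unchanged against all.xg and wg.gcsa.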
Thanks a lot for your answer! Apologies for the late reply; it took me a full day to re-create the .xg and confirm that the alignment works.
It is now working thanks to you, but I ran into another error that occurs only when I map my paired-end reads. I reported the issue here, in case you have any idea why: VG mapping paired-end reads: error [xg]: multiple hits for XXX
Thank you very much!