Dear developers,
I am trying to construct a reference pangenome of a fungal species. After successfully constructing my pangenome using minigraph-cactus, I am struggling to add my isolates’ annotations.
For some background: we have de novo assembled and annotated 11 isolates and used the current reference (which is resolved at chromosome level) for the pangenome construction. Now, I am trying to add the annotations of both my isolates and the reference to it with vg autoindex.
Here is the command I used:
singularity run docker://quay.io/vgteam/vg:v1.51.0 vg autoindex -w mpmap -w rpvg -p Ref1_pg_index -g Ref1_pg.gfa -H gff_files/isolate1_prefix.gff3 -H .gff_files/isolate2_prefix.gff3 -H gff_files/isolate3_prefix.gff3 -H gff_files/isolate4_prefix.gff3 -H gff_files/isolate5_prefix.gff3 -H gff_files/isolate5_prefix.gff3 -H gff_files/isolate6_prefix.gff3 -H gff_files/isolate7_prefix.gff3 -H gff_files/isolate8_prefix.gff3 -H gff_files/isolate9_prefix.gff3 -H gff_files/isolate10_prefix.gff3 -x gff_files/Ref1_prefix.gff3 --gff-tx-tag Parent -t 8
Here is the error I get:
INFO: Using cached SIF image
[vg autoindex] Executing command: /vg/bin/vg autoindex -w mpmap -w rpvg -p Ref1_pg_index -g Ref1_pg.gfa -H gff_files/isolate1_prefix.gff3 -H .gff_files/isolate2_prefix.gff3 -H gff_files/isolate3_prefix.gff3 -H gff_files/isolate4_prefix.gff3 -H gff_files/isolate5_prefix.gff3 -H gff_files/isolate5_prefix.gff3 -H gff_files/isolate6_prefix.gff3 -H gff_files/isolate7_prefix.gff3 -H gff_files/isolate8_prefix.gff3 -H gff_files/isolate9_prefix.gff3 -H gff_files/isolate10_prefix.gff3 -x gff_files/Ref1_prefix.gff3 --gff-tx-tag Parent -t 8
[IndexRegistry]: Checking for haplotype lines in GFA.
[IndexRegistry]: Constructing a GBZ from GFA input.
[IndexRegistry]: Constructing haplotype-transcript GBWT and spliced graph from GBZ-format graph.
ERROR: Chromosome path "isolate1#0#scaffold_12" not found in graph or haplotypes index (line 74946).
After some investigation, I understand that the problem comes from the filter_paf_deletions function of cactus_graphmap.py, which removes some contigs. However, some of these removed contigs had annotations linked to them (such as scaffold_12 of my isolate1). From what I can understand, vg autoindex cannot ignore these exons if their contig is not part of the .gfa file. Is that correct?
Is there a way to use vg autoindex other than dropping these annotations from the .gff3 files?
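For reference, the workaround I would like to avoid could be sketched roughly as below. This is only an illustration under assumptions, not something taken from vg or cactus: the function names graph_sequence_names and filter_gff3 are hypothetical, and I am assuming that the GFF3 seqid column uses the same sample#haplotype#contig naming as in the error message, that P-line names can be compared as they are, and that W-lines can be turned into that form by joining their sample, haplotype and sequence fields. Contigs stored as sub-ranges in the graph would not be matched by this exact-name comparison.

```python
def graph_sequence_names(gfa_lines):
    """Collect path names from GFA P-lines and W-lines.

    Assumption: W-line names are rebuilt as sample#haplotype#contig so
    they can be compared with the seqid column of the GFF3 files.
    """
    names = set()
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "P" and len(fields) > 1:
            # P-line: the path name is the second field.
            names.add(fields[1])
        elif fields[0] == "W" and len(fields) > 3:
            # W-line: sample, haplotype index and sequence id are fields 2-4.
            names.add("#".join(fields[1:4]))
    return names


def filter_gff3(gff_lines, names):
    """Keep GFF3 records whose seqid is in `names`; count the dropped ones."""
    kept = []
    dropped = 0
    for line in gff_lines:
        # Directives, comments and blank lines are passed through unchanged.
        if line.startswith("#") or not line.strip():
            kept.append(line)
            continue
        seqid = line.split("\t", 1)[0]
        if seqid in names:
            kept.append(line)
        else:
            dropped += 1
    return kept, dropped
```

Both functions take any iterable of lines, so they can be fed open file handles, for example graph_sequence_names(open("Ref1_pg.gfa")) followed by filter_gff3(open("gff_files/isolate1_prefix.gff3"), names), with the kept lines written to a new file that is then passed to vg autoindex. The dropped count at least shows how many annotation records are lost per isolate.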
Thank you,
Regards,
Marion