19 months ago
Hendricks27
Hi,
I am experimenting with a genome graph that I built from the whole-genome sequencing data of 2 individuals. I also generated 1 million simulated reads. However, when I map the simulated reads back to the genome graph, the node_id in the output (in either GAF or GAM format) does not match the original segment_id in my GFA file. I checked quite a few reads and all of them behave this way.
My question is, is this intended? Any suggestions would be greatly appreciated. Thanks!
The node IDs in VG don't necessarily correspond to the node IDs in an input GFA, because VG will chop long nodes into shorter ones for computational reasons. If you want the alignments from vg giraffe to have the original segment IDs, you can use vg autoindex on the GFA to get a segment file (along with the other indexes), and then supply it to vg giraffe with --named-coordinates.

Thank you so much. Your solution works!
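
For later readers, the workflow suggested in the answer might look roughly like the sketch below. The file names (graph.gfa, reads.fq) and the idx prefix are hypothetical, and the index file names that vg autoindex writes vary between vg versions, so adjust them to what actually appears on disk. How the segment translation is supplied (inside the GBZ or as a separate file) also depends on the version, so check vg giraffe --help. The script defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: graph.gfa, reads.fq and the "idx" prefix are hypothetical names.
set -euo pipefail

# Print the command instead of running it when DRY_RUN=1 (the default here),
# so the sketch can be inspected without vg installed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build the Giraffe indexes directly from the GFA so that the mapping from
# chopped node IDs back to the original segment names is retained.
build_indexes() {
    local gfa="$1" prefix="$2"
    run vg autoindex --workflow giraffe --gfa "$gfa" --prefix "$prefix"
}

# Map reads and report alignments in GFA segment space as GAF.
# Assumption: index names follow the <prefix>.giraffe.gbz / .min / .dist pattern.
map_reads() {
    local prefix="$1" reads="$2"
    run vg giraffe -Z "$prefix.giraffe.gbz" -m "$prefix.min" -d "$prefix.dist" \
        -f "$reads" --named-coordinates -o gaf
}

build_indexes graph.gfa idx
map_reads idx reads.fq
```

To actually run the commands, set DRY_RUN=0 and redirect the output of the mapping step to a file, for example mapped.gaf.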