Hello,
I am trying to understand why I have to use files like .vg.gbwt and .vg.xg to feed sequenceTubeMap. Wouldn't it be more convenient to use human-readable file types like JSON or TXT?
Also, I would like to know how I can produce these files from data I have in a human-readable format.
Thank you for your response!
Let's say I need to visualize some kind of genomic data, which I currently have in a human-readable format. How can I convert my files to these formats?
You would use the vg toolkit. The vg convert subcommand can convert a GFA to an XG index, and the vg gbwt subcommand can make GBWT indexes.
I was able to read the XG format, but I am still having trouble opening my GBWT file as .gfa or any other readable format. I suspect I need to edit this file to change the number of tracks in a plot. The help menu that appears when I type vg gbwt isn't very informative about how to do so. Could you shed some light on this?
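For reference, the two steps suggested in the reply above (GFA to XG, then XG to GBWT) might look roughly like the sketch below. The file names are hypothetical, and it assumes a vg build in which vg convert accepts -g and -x and vg gbwt accepts -x, -E, and -o, and that the haplotypes of interest are stored as paths in the GFA. Check vg convert --help and vg gbwt --help for your version before relying on it.

```shell
# Hypothetical input file name; replace with your own GFA.
GFA=graph.gfa

if command -v vg >/dev/null 2>&1 && [ -f "$GFA" ]; then
    # GFA -> XG: -g reads GFA input, -x writes an XG index to standard output.
    vg convert -g "$GFA" -x > graph.xg
    # XG -> GBWT: -E indexes the paths embedded in the graph (assumption:
    # the sequences to display are stored as paths in the GFA).
    vg gbwt -x graph.xg -E -o graph.gbwt
else
    echo "vg or $GFA not found; nothing was built" >&2
fi
```

If your haplotypes come from a VCF rather than from paths in the GFA, vg gbwt has other construction modes; the wiki page for the subcommand lists them.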
A GBWT isn't convertible into a GFA on its own. The GBWT index contains only the haplotype paths, so it lacks the node sequences that are required to fully specify the graph. If you want to make a full graph with the GBWT, you should augment it into a GBZ, which is essentially a GBWT with node sequences added. You can do that in vg gbwt as well.
Noted, but how can I change the number of tracks while keeping the .xg and .gbwt files separate?
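As a rough illustration of the GBZ route described in the reply above, the sketch below combines an XG and a GBWT into a GBZ and then writes it out as GFA. The file names are hypothetical. It assumes vg gbwt supports -g together with --gbz-format, and that vg convert in your build can read a GBZ as input; the second point in particular is an assumption worth verifying against vg convert --help.

```shell
# Hypothetical file names; both inputs must describe the same graph.
XG=graph.xg
GBWT=graph.gbwt

if command -v vg >/dev/null 2>&1 && [ -f "$XG" ] && [ -f "$GBWT" ]; then
    # Add node sequences from the XG to the GBWT and store both as one GBZ.
    vg gbwt -x "$XG" -g graph.gbz --gbz-format "$GBWT"
    # Assumption: vg convert accepts GBZ input; -f writes GFA to standard output.
    vg convert -f graph.gbz > graph_with_haplotypes.gfa
else
    echo "vg, $XG, or $GBWT not found; nothing was built" >&2
fi
```

The separate .xg and .gbwt files are left untouched by this, so they can still be given to sequenceTubeMap as before.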
Could you clarify what you mean by the "number of tracks"?
By tracks I mean the thick lines, one per sequence, that pass through the various nodes. By changing the 'number of tracks' I meant adding or removing these lines while making the necessary corresponding changes to the .xg file.
In short: I want to add another sequence (something I think can be achieved by editing the GBWT file).
GBWTs are not particularly easy to edit, but if you have two GBWT files over a graph with the same node IDs, you can combine them with vg gbwt --merge. You can also remove haplotypes with vg gbwt --remove-sample.
Thank you very much for your answers, I really appreciate it. Could you please point me to some material about GBWTs, ideally covering their properties and how they are generated?
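For anyone following along, the merge and removal mentioned in the reply above might be invoked as in the sketch below. The file names and the sample name are hypothetical. It assumes both GBWTs were built over a graph with the same node IDs, and that your vg build accepts -S and -L for listing sample names.

```shell
# Hypothetical file names and sample name; replace with your own.
FIRST=first.gbwt
SECOND=second.gbwt
SAMPLE=sample_to_drop

if command -v vg >/dev/null 2>&1 && [ -f "$FIRST" ] && [ -f "$SECOND" ]; then
    # List the sample names stored in the first index (assumption: -S -L).
    vg gbwt -S -L "$FIRST"
    # Combine the haplotypes from both indexes into one GBWT.
    vg gbwt --merge -o merged.gbwt "$FIRST" "$SECOND"
    # Drop every haplotype belonging to one sample.
    vg gbwt --remove-sample "$SAMPLE" -o trimmed.gbwt merged.gbwt
else
    echo "vg, $FIRST, or $SECOND not found; nothing was changed" >&2
fi
```

Each command writes a new file, so the original indexes stay as they were.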
This wiki article is probably the best resource. The academic papers on the GBWT are more focused on the underlying algorithmics. https://github.com/vgteam/vg/wiki/VG-GBWT-Subcommand