vg path associated with sample name
4.1 years ago
xwwang ▴ 20

Is there any way to associate the path names in a vg graph with sample IDs? Details: the vg graph is generated from the variants in a VCF file containing multiple samples. How can I tell which variant source, e.g. which sample ID, each path in the graph is associated or tagged with? Thanks.

vg • 1.8k views
4.1 years ago
Jouni Sirén ▴ 470

VG assumes that path names are opaque strings. While some path names starting with _ (e.g. _alt_* and _thread_*) are used for technical purposes, VG generally does not understand the information encoded in path names.

In VG terminology, there is a conceptual difference between paths and threads:

  • Paths are defined simultaneously as node sequences and nucleotide sequences. They are stored in the graph itself, and most graph implementations support random access within the paths. Storing many paths generally requires a large amount of space.
  • Threads are lightweight paths that are only defined as node sequences. They are stored in a GBWT index, which only supports sequential access to the threads. If the threads are similar enough, they can be stored very space-efficiently.

Unlike path names, thread names are structured. They consist of four fields: sample name, contig name, phase identifier, and running count / fragment identifier. If multiple contig names are used, many VG algorithms assume that contig names match the names of the paths embedded in the graph.
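To illustrate why structured thread names answer the original question, here is a minimal Python sketch. It is not the GBWT's actual metadata format or API; the ThreadName type, the threads_for_sample helper, and the sample names are all hypothetical. It only models the four fields listed above and shows that a lookup by sample becomes a simple filter.

```python
from typing import List, NamedTuple


class ThreadName(NamedTuple):
    """Hypothetical model of the four thread name fields described above."""
    sample: str   # sample name, e.g. taken from the VCF header
    contig: str   # contig name, ideally matching a path name in the graph
    phase: int    # phase identifier (haplotype number)
    count: int    # running count / fragment identifier


def threads_for_sample(names: List[ThreadName], sample: str) -> List[ThreadName]:
    """Return every thread name that belongs to the given sample."""
    return [name for name in names if name.sample == sample]


# Made-up example data: two haplotypes for one sample, one for another.
names = [
    ThreadName("sampleA", "chr1", 0, 0),
    ThreadName("sampleA", "chr1", 1, 0),
    ThreadName("sampleB", "chr1", 0, 0),
]

for name in threads_for_sample(names, "sampleA"):
    print(name.sample, name.contig, name.phase, name.count)
```

With opaque path names, the same lookup would depend on whatever naming convention was used when the paths were created.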

Graphs built with vg construct ignore sample information. However, if option -a is used during construction, variants will be stored as paths in the graph (using _alt_ prefix for the names). With these alt paths and the VCF files, you can then generate threads for the samples and store them in a GBWT index with the vg index subcommand. There is some documentation in the vg wiki.
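The two steps above can be sketched as command lines. The Python helper below only assembles and prints them; it does not run vg. The file names and the build_commands helper are hypothetical. Apart from -a, which is mentioned above, the flags (-r and -v for vg construct, -G and -v for vg index) are written from memory of vg releases from around the time of this thread, so treat them as assumptions and check vg construct --help and vg index --help for your version. Newer releases may have moved GBWT construction to a separate subcommand.

```python
import shlex
from typing import List, Tuple


def build_commands(ref_fasta: str, vcf: str, prefix: str) -> Tuple[List[str], str, List[str]]:
    """Assemble (but do not run) the two vg commands for the workflow above.

    Flags other than -a are assumptions; verify them against --help.
    """
    graph = f"{prefix}.vg"
    # Step 1: build the graph and keep the variants as _alt_ paths (-a).
    construct = ["vg", "construct", "-r", ref_fasta, "-v", vcf, "-a"]
    # Step 2: generate threads for the samples from the alt paths and the
    # phased VCF, and store them in a GBWT index.
    index = ["vg", "index", "-G", f"{prefix}.gbwt", "-v", vcf, graph]
    return construct, graph, index


# Hypothetical input files.
construct, graph, index = build_commands("ref.fa", "variants.vcf.gz", "graph")

# vg construct writes the graph to standard output, hence the redirect.
print(shlex.join(construct) + " > " + shlex.quote(graph))
print(shlex.join(index))
```

The VCF needs to be phased for the resulting threads to represent real haplotypes of the samples.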

4.0 years ago
xwwang ▴ 20

Thanks a lot. It is much clearer now.
