Hi everyone,
I'm wondering what the best strategy would be to obtain the list of all k-mers in the graph, with each k-mer labeled as reference (i.e., coming from a specific path) or non-reference. A very nice extra would be to have, for each k-mer, the index(es) of its occurrence(s) in the reference (linear) space.
I see two possible routes:
vg kmers ...
vg find -k ...
Both strategies have pros and cons. The former produces more succinct output, but I haven't found a way to tell whether a k-mer belongs to a path or not (nor have I found an easy way to obtain the index). The latter reports more detail, but its output is also much larger.
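To make the desired output concrete, here is a minimal sketch of the labeling I have in mind, done outside of vg. It assumes the graph k-mers and the reference path sequence have already been extracted as plain strings; the function names (`index_reference_kmers`, `label_kmers`) and the toy data are hypothetical, and it only considers the forward strand with 0-based offsets.

```python
from collections import defaultdict


def index_reference_kmers(reference, k):
    """Map each k-mer of the linear reference to its 0-based start offsets."""
    positions = defaultdict(list)
    for i in range(len(reference) - k + 1):
        positions[reference[i:i + k]].append(i)
    return positions


def label_kmers(graph_kmers, reference, k):
    """Label each graph k-mer as reference or not, with its reference offsets.

    Assumption: graph_kmers is an iterable of k-mer strings already extracted
    from the graph by some other means; only the forward strand is checked.
    """
    ref_index = index_reference_kmers(reference, k)
    labeled = {}
    for kmer in graph_kmers:
        offsets = ref_index.get(kmer, [])
        labeled[kmer] = (bool(offsets), list(offsets))
    return labeled


# Toy data, purely illustrative.
reference = "ACGTACGA"
graph_kmers = ["ACG", "CGT", "GTT", "CGA"]
labels = label_kmers(graph_kmers, reference, 3)
for kmer, (is_ref, offsets) in labels.items():
    print(kmer, "reference" if is_ref else "non-reference", offsets)
```

A string match like this would also flag a k-mer as reference if it merely shares its sequence with the reference while lying on a different walk through the graph, which is part of why I would prefer something path-aware from the toolkit itself.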
Thanks for any suggestion, Michele S.
Thank you very much for the prompt and detailed reply. I was considering using external tools, but I wanted to be sure that there was no better way built into the toolkit.