Application contexts: reduced-representation or other sequencing libraries in which nearly all reads start at restriction enzyme cleavage sites, and RFLP analysis.
The old approach: with a linear reference genome, it is a simple matter to perform an in silico digestion that identifies all reference locations matching a specific motif/k-mer/site.
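For concreteness, the linear-reference version amounts to something like the sketch below. The function name `find_sites` and the toy sequence are made up for illustration, and it only scans the forward strand, which is enough for a palindromic site such as EcoRI's GAATTC.

```python
import re

def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of `site`.

    A zero-width lookahead is used so that overlapping occurrences
    are reported as well.
    """
    pattern = re.compile("(?=" + re.escape(site.upper()) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Toy reference (hypothetical sequence, not real data).
reference = {"chr1": "AAGAATTCGGGAATTCTT"}

for name, seq in reference.items():
    for pos in find_sites(seq, "GAATTC"):
        print(name, pos)
```

For a non-palindromic site, the reverse complement of the site would need to be searched too.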
What is needed: an analogous tool for a pangenome graph, i.e., to find all occurrences of an individual, fixed short sequence in all haplotypes. The output would be a list of those positions projected/surjected to a specific reference (in my perfect world, sequences adjacent to those positions could also be extracted).
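To make the request concrete, here is a toy sketch of the behavior I have in mind. It is not an existing tool: all function names are hypothetical, and it assumes a small GFA v1 graph with `S` and `P` lines only (no `W` lines, no overlaps) that fits in memory. Each haplotype path is spelled out, searched for the site, and the first base of each hit is projected to the reference path if that base lies on a segment the reference also traverses.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]

def parse_gfa(text):
    """Parse GFA v1 S (segment) and P (path) lines from a string."""
    segments, paths = {}, {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2].upper()
        elif fields[0] == "P":
            # Each step looks like "12+" or "12-": segment id plus orientation.
            paths[fields[1]] = [(s[:-1], s[-1]) for s in fields[2].split(",")]
    return segments, paths

def spell_path(segments, steps):
    """Return the path sequence and, per base, (segment id, forward-strand offset)."""
    seq, coords = [], []
    for seg_id, orient in steps:
        s = segments[seg_id]
        n = len(s)
        if orient == "+":
            seq.append(s)
            coords.extend((seg_id, i) for i in range(n))
        else:
            seq.append(revcomp(s))
            coords.extend((seg_id, n - 1 - i) for i in range(n))
    return "".join(seq), coords

def find_sites_in_graph(gfa_text, site, ref_name, flank=0):
    """List (path, path_pos, ref_pos or None, site plus flanks) for each hit."""
    segments, paths = parse_gfa(gfa_text)
    # Map every base on the reference path to its 0-based reference position.
    _, ref_coords = spell_path(segments, paths[ref_name])
    ref_pos = {}
    for pos, key in enumerate(ref_coords):
        ref_pos.setdefault(key, pos)
    pattern = re.compile("(?=" + re.escape(site.upper()) + ")")
    hits = []
    for name, steps in paths.items():
        seq, coords = spell_path(segments, steps)
        for m in pattern.finditer(seq):
            start = m.start()
            context = seq[max(0, start - flank): start + len(site) + flank]
            # None means the hit starts on a segment absent from the reference.
            hits.append((name, start, ref_pos.get(coords[start]), context))
    return hits

# Toy graph (hypothetical): hap1 carries a 3 bp allele in place of one reference base.
TOY_GFA = "\n".join([
    "S\t1\tCCG", "S\t2\tT", "S\t3\tTGA", "S\t4\tATTCAAGAATTC",
    "P\tref\t1+,2+,4+\t*", "P\thap1\t1+,3+,4+\t*",
])

for hit in find_sites_in_graph(TOY_GFA, "GAATTC", "ref", flank=2):
    print(hit)
```

In this toy case the hap1-only site starts inside the alternate allele, so it gets no reference position; a real solution would presumably report the nearest reference-anchored coordinate instead. Because each path is spelled out in full, sites that span segment boundaries are still found, but that approach will not scale to a large pangenome.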
I've found a few things on this, but most are too generalized (to degenerate motifs, all k-mers, etc.), don't project to a reference genome, and so on.
Advice on approaches appreciated.