As far as I understand, after building the de Bruijn graph (DBG) from the short reads (SR), they store it with the GATB library, which allows traversing any path in the graph and retrieving the sequence of any node. The PacBio long reads (LR) are also split into k-mers, which are then compared against the k-mers in the SR DBG by traversing the graph.
They use solid k-mers as anchors to correct the PacBio reads:
We found that if we additionally require that for a k-mer to be
considered solid, it must also have at least one incoming and at least
one outgoing arc
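
To make the notion of a solid k-mer concrete, here is a minimal sketch in plain Python. It is not the GATB API: a Python set stands in for the graph, and the names `kmers`, `solid_kmers`, `successors`, `predecessors`, `has_both_arcs` and the `min_count` threshold are my own assumptions for illustration.

```python
from collections import Counter

def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def solid_kmers(short_reads, k, min_count):
    """k-mers seen at least min_count times across the short reads.

    The returned set plays the role of the DBG node set (assumption:
    a toy stand-in for the structure GATB actually stores).
    """
    counts = Counter(km for read in short_reads for km in kmers(read, k))
    return {km for km, c in counts.items() if c >= min_count}

def successors(kmer, graph):
    """Outgoing arcs: graph k-mers whose prefix is the (k-1)-suffix of kmer."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in graph]

def predecessors(kmer, graph):
    """Incoming arcs: graph k-mers whose suffix is the (k-1)-prefix of kmer."""
    return [b + kmer[:-1] for b in "ACGT" if b + kmer[:-1] in graph]

def has_both_arcs(kmer, graph):
    """Stricter test from the quote: at least one incoming and one outgoing arc."""
    return bool(predecessors(kmer, graph)) and bool(successors(kmer, graph))

# Toy data: the third read carries an error, so GTT and TTC occur only once.
short_reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
graph = solid_kmers(short_reads, k=3, min_count=2)
```

With `min_count=2`, the k-mers that come only from the erroneous read are filtered out, which is the intuition behind treating solid k-mers as trustworthy.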
How do they correct the reads?
Consider the k-mers of a long read starting at position 1,2,3, … :
some k-mers belong to the graph and are solid, while others do not and
are weak. Basically, solid k-mers are expected to be correct, while
weak ones suspectedly include sequencing errors and require a
correction. Solid k-mers are entry points in the DBG, and LoRDEC
corrects a region made of weak k-mers by finding the best path in the
DBG between the solid k-mers bordering this region. Sometimes, an LR
has no solid k-mer, in which case, LoRDEC marks it as such in the
output and skips it.
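
The passage above can be sketched roughly as follows. This is a simplified toy, not LoRDEC's actual implementation: the graph is a plain Python set of k-mers, the path search is an exhaustive bounded depth-first search, "best" is taken to mean lowest edit distance to the weak region, and regions before the first or after the last solid k-mer are left untouched. The names `paths_between`, `correct_read` and the `slack` parameter are hypothetical.

```python
def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def paths_between(src, dst, graph, max_steps):
    """Yield the sequence spelled by each path src -> dst of at most max_steps arcs."""
    stack = [(src, src)]
    while stack:
        node, spelled = stack.pop()
        if node == dst and len(spelled) > len(src):
            yield spelled
            continue
        if len(spelled) - len(src) >= max_steps:
            continue  # bound the search so cycles cannot loop forever
        for b in "ACGT":
            nxt = node[1:] + b
            if nxt in graph:
                stack.append((nxt, spelled + b))

def correct_read(read, graph, k, slack=2):
    """Replace each weak region between two solid k-mers by the closest graph path.

    Returns None when the read has no solid k-mer (no anchor to start from).
    """
    kms = kmers(read, k)
    solid = [i for i, km in enumerate(kms) if km in graph]
    if not solid:
        return None
    out = read[:solid[0] + k]  # prefix up to and including the first solid k-mer
    for i, j in zip(solid, solid[1:]):
        region = read[i:j + k]  # source k-mer, weak stretch, target k-mer
        if j == i + 1:
            out += region[k:]  # adjacent solid k-mers: nothing to correct
            continue
        candidates = list(paths_between(kms[i], kms[j], graph, (j - i) + slack))
        if candidates:
            best = min(candidates, key=lambda p: edit_distance(p, region))
            out += best[k:]
        else:
            out += region[k:]  # no path found: keep the original bases
    return out + read[solid[-1] + k:]

# Toy graph built from the "true" sequence; the long read has one substitution.
graph = set(kmers("ACGTTGCA", 3))
print(correct_read("ACGTAGCA", graph, 3))  # prints ACGTTGCA
```

Here the k-mers `GTA`, `TAG` and `AGC` are weak, so the region between the solid k-mers `CGT` and `GCA` is replaced by the only path the graph offers between them.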
The idea here is traversal, not mapping (as I understand it). For more about traversal, see:
graph traverse
and here is an explanation of how it works:
http://ivory.idyll.org/blog/2015-wok-error-correction.html
and here is a paper about mapping sequences to a DBG:
Read mapping on de Bruijn graphs
It is divided into two phases:
Do you have more details on these phases? They are not clear in the paper. Thanks
Unfortunately, I do not have enough information.