List all solid k-mers with DSK
8.6 years ago

Hi,

I am using DSK to count the k-mers that are present in multiple fasta files. I would like to get the list of all solid k-mers that were found in all files.

The command line interface has a "-solid-kmers-out" argument which, I assume, should output this list in a file. However, even when I set a value, it does not seem to output a file. Am I using it correctly?

Also, is there a programmatic way to go from the integer representation of k-mers to their string representation?

Thanks for your help, Alex

8.6 years ago
Rayan Chikhi ★ 1.5k

Bonjour Alexandre,

To output the solid k-mers in string representation, you can either use the dsk2ascii tool provided with DSK or parse the HDF5 output file with h5dump. See https://github.com/GATB/dsk#results-visualization
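
If you go the dsk2ascii route, loading the result back into a program is straightforward. The following is only a minimal sketch, assuming the text output has one k-mer per line followed by whitespace and its abundance (check this against your own output file); `read_kmer_counts` is a hypothetical helper name, not part of DSK or GATB-core.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of DSK): read "KMER COUNT" lines into a map.
// ASSUMPTION: each line holds a k-mer string, whitespace, then an integer
// abundance. Lines that do not match this shape are skipped.
std::unordered_map<std::string, std::uint64_t> read_kmer_counts(std::istream& in) {
    std::unordered_map<std::string, std::uint64_t> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string kmer;
        std::uint64_t count = 0;
        if (fields >> kmer >> count) {
            counts[kmer] = count;  // last occurrence wins if a k-mer repeats
        }
    }
    return counts;
}
```

Pass it a `std::ifstream` opened on the text file written by dsk2ascii, or any other `std::istream`.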

To integrate DSK results in a C++ program, it is currently recommended to use the GATB-core library. Have a look at the Doxygen page and also at the example code in the GATB-core repository, which demonstrates some custom processing:

  • kmer10.cpp is the most up-to-date version of that snippet.
  • kmer16.cpp shows how to get separate kmer counts (one abundance per file) over multiple files (multi-bank kmer counting)
  • kmer11-15 are a bit more advanced
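
On the integer-to-string question specifically, the conversion itself is a small bit-unpacking loop. This is a sketch under stated assumptions, not GATB-core's own code: it assumes 2 bits per nucleotide with the codes A=0, C=1, T=2, G=3 and the first nucleotide stored in the most significant bits, and it only handles k up to 32 (one 64-bit integer). `kmer_to_string` is a hypothetical name; please verify the encoding against the GATB-core documentation before relying on it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not the GATB-core API): decode a 2-bit packed k-mer.
// ASSUMPTION: codes are A=0, C=1, T=2, G=3, and the first nucleotide of the
// k-mer occupies the most significant of the 2*k used bits.
std::string kmer_to_string(std::uint64_t value, unsigned k) {
    assert(k >= 1 && k <= 32);  // a 64-bit integer holds at most 32 nucleotides
    static const char alphabet[4] = {'A', 'C', 'T', 'G'};
    std::string out(k, 'A');
    // Peel nucleotides off the low end and fill the string right to left.
    for (unsigned i = 0; i < k; ++i) {
        out[k - 1 - i] = alphabet[value & 3u];
        value >>= 2;
    }
    return out;
}
```

Under these assumptions, `kmer_to_string(27, 3)` unpacks the bits `01 10 11` into C, T, G. For larger k the library's own k-mer types are the safer choice.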

By the way, thanks for pointing this out: the "-solid-kmers-out" argument was actually made for GATB-core graph construction (to output the graph and the solid k-mers in different files) and is not used in DSK. I've added a note for future versions of DSK.

8.6 years ago

Merci Rayan! I will have a look at the snippets.

Alex
