k-mer tools - probability based models
10.4 years ago
sam ▴ 130

I have recently been looking at different k-mer tools (e.g., Jellyfish). They all perform well, with different computational complexities. However, most of them are counting tools. I'm interested in a tool that finds k-mers that occur more often than expected (more of a probability-based approach). Has anyone worked with or seen a tool that generates k-mer counts along with a background distribution?

RNA-Seq k-mers • 3.2k views
10.4 years ago
edrezen ▴ 730

You can use DSK from the GATB project, a k-mer counter that also provides a histogram of k-mer abundance (see the README file for more information). For instance:

dsk -file myreads.fa -kmer-size 31

It will produce an HDF5 file from which you can extract the k-mer histogram with the following command (the h5dump tool is provided with DSK):

h5dump -y -d dsk/histogram myreads.h5 | grep '[0-9]' | grep -v '[A-Z]' | paste - -

You can plot it directly with gnuplot:

h5dump -y -d dsk/histogram myreads.h5 | grep '[0-9]' | grep -v '[A-Z]' | paste - - | gnuplot -p -e 'plot [][0:100] "-" with lines'
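The same histogram can also serve as a rough empirical background distribution. The Python sketch below assumes the two-column text has been saved to a file, with each line holding an abundance value followed by the number of distinct k-mers at that abundance. That layout is an assumption about the output, and the function names are illustrative, not part of DSK. It reports the fraction of distinct k-mers whose abundance is at or above a chosen threshold.

```python
def parse_histogram(lines):
    """Parse 'ABUNDANCE N_KMERS' lines (assumed layout) into a dict."""
    hist = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        abundance, n_kmers = int(fields[0]), int(fields[1])
        hist[abundance] = hist.get(abundance, 0) + n_kmers
    return hist


def empirical_tail(hist, threshold):
    """Fraction of distinct k-mers with abundance >= threshold."""
    total = sum(hist.values())
    if total == 0:
        raise ValueError("empty histogram")
    at_or_above = sum(n for a, n in hist.items() if a >= threshold)
    return at_or_above / total


if __name__ == "__main__":
    # Toy histogram, not real data.
    toy = ["1 50", "2 30", "3 15", "10 5"]
    hist = parse_histogram(toy)
    print(empirical_tail(hist, 3))
```

A small tail fraction only says that a k-mer is unusually abundant relative to the other k-mers in the same data set. It is not a p-value under any explicit model.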

There is also a tool, 'dsk2ascii', that outputs the list of (k-mer, count) pairs in a human-readable format, so you can do further processing on it.
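As one possible form of that processing, here is a minimal Python sketch that compares each observed count with the count expected under a simple background model in which bases are independent and drawn with the frequencies seen in the data. It assumes the text output has one k-mer and its count per line, separated by whitespace. That format, the function names, and the choice of background model are all assumptions for illustration, not something DSK provides.

```python
from collections import Counter
from math import prod


def parse_kmer_counts(lines):
    """Parse 'KMER COUNT' lines (assumed format) into a dict."""
    counts = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        kmer = fields[0].upper()
        counts[kmer] = counts.get(kmer, 0) + int(fields[1])
    return counts


def base_frequencies(counts):
    """Base frequencies estimated from the k-mers, weighted by count."""
    base_totals = Counter()
    for kmer, n in counts.items():
        for base in kmer:
            base_totals[base] += n
    total = sum(base_totals.values())
    return {base: c / total for base, c in base_totals.items()}


def enrichment(counts):
    """Map each k-mer to (observed, expected, observed / expected)."""
    freqs = base_frequencies(counts)
    total = sum(counts.values())
    result = {}
    for kmer, observed in counts.items():
        # Expected count if every base were drawn independently.
        expected = total * prod(freqs[base] for base in kmer)
        result[kmer] = (observed, expected, observed / expected)
    return result


if __name__ == "__main__":
    # Toy input, not real data.
    toy = ["AA 6", "AC 1", "CA 1", "CC 2"]
    for kmer, (obs, exp, ratio) in enrichment(parse_kmer_counts(toy)).items():
        print(kmer, obs, round(exp, 2), round(ratio, 2))
```

Ratios well above 1 flag k-mers that are over-represented relative to this background. If the counter merges a k-mer with its reverse complement, the expected counts would need to account for that. A higher-order Markov background would be a natural refinement for real sequence data.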
