Hi all, I don't get the concept of kmers. What are they really doing in the assembly?
These are 2 separate questions to a degree.
Kmers are very simple. You might be familiar with the terms polymer or oligomer. The suffix -mer comes from the Greek for 'part', so a polymer is 'many parts' and an oligomer is 'few' or 'some' parts. The prefix tells you about the length or size of the molecule you're talking about.
A kmer is simply a molecule of length k; in sequence analysis that usually means a substring of length k taken from a longer sequence. Generally this is talked about in the context of DNA, but there isn't strictly any requirement for it to be.
Therefore, a kmer of length k = 5 would be a 'five-mer': a molecule of length 5 (whatever the individual components are; usually bases, but they could be amino acids etc.).
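To make that concrete, here is a minimal sketch of pulling every kmer out of a sequence. The function name `kmers` and the example sequence are made up for illustration; nothing here comes from a particular assembler.

```python
def kmers(sequence, k):
    """Return every overlapping substring of length k, left to right."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n contains n - k + 1 kmers (none if n < k).
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Toy sequence, invented for this example.
print(kmers("ATGGCGT", 5))  # ['ATGGC', 'TGGCG', 'GGCGT']
```

Note that neighbouring kmers overlap by k - 1 characters, which is the property graph-based assemblers rely on.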
In genome assembly, kmers are used in de Bruijn graph assemblers. A de Bruijn graph is a type of network made up of nodes and edges, which come from the overlaps between the kmers. This is quite a complex topic, so it would be worth reading up on it yourself and working through some small practical examples. Given a set of sequences and a value of k, it's not too difficult to see how the graph is assembled in a toy example.
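As a rough sketch of how such a graph can be built, the snippet below uses one common convention (an assumption here, not something specific to any one assembler): each kmer becomes an edge from its first k - 1 characters to its last k - 1 characters. The function name `de_bruijn_edges` and the three reads are invented for illustration.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # One edge per kmer: prefix node -> suffix node.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Made-up toy reads that overlap each other.
reads = ["ATGGC", "GGCGT", "CGTGA"]
graph = de_bruijn_edges(reads, 4)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

With k = 4 every node here has exactly one outgoing edge, so the reads chain together cleanly. Dropping to k = 3 gives the node `TG` two successors (`GG` and `GA`), which is a small taste of why the choice of k matters.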
Very briefly, de Bruijn graphs built from kmers are an efficient data structure for describing how the 'puzzle pieces' of the genome are connected (i.e. "I have this short read, but I don't know where in the genome it belongs. If I figure out what it overlaps with, I can fit it into the puzzle in the right place").
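The 'fit the pieces together by their overlaps' idea can be sketched as below. This is a deliberately simplified toy, assuming the graph is a single unbranched path; real assemblers also have to deal with branches, repeats and sequencing errors, which this does not attempt. The function name `assemble_simple_path` and the reads are hypothetical.

```python
def assemble_simple_path(reads, k):
    """Rebuild a sequence from reads, assuming the graph is one unbranched path."""
    # Each kmer contributes one edge: its (k-1)-mer prefix -> its (k-1)-mer suffix.
    successor = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            prefix, suffix = kmer[:-1], kmer[1:]
            if successor.setdefault(prefix, suffix) != suffix:
                raise ValueError(f"branch at {prefix}: this sketch cannot resolve it")

    # The start node is the only prefix that never appears as a suffix.
    starts = set(successor) - set(successor.values())
    if len(starts) != 1:
        raise ValueError("expected exactly one start node")

    node = starts.pop()
    contig = node
    seen = {node}
    while node in successor:
        node = successor[node]
        if node in seen:
            raise ValueError("cycle detected: this sketch cannot resolve it")
        seen.add(node)
        contig += node[-1]  # each step along the path adds one new character
    return contig


# Made-up toy reads; the order they are given in does not matter.
print(assemble_simple_path(["CGTGA", "ATGGC", "GGCGT"], 4))  # ATGGCGTGA
```

Nothing in the code ever asks where a read 'belongs'; the overlaps alone are enough to put the pieces in order, at least in a toy case like this one.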