velvet user manual - node lengths?
9.5 years ago
scchess ▴ 640

In the user manual of velvet, there is a section:

4.2.2 The stats.txt file

This file is a simple tabbed-delimited description of the nodes. The column names are pretty much self-explanatory. Note however that node lengths are given in k-mers. To obtain the length in nucleotides of each node you simply need to add k - 1, where k is the word-length used in velveth.

I don't understand the paragraph. Let's say k==10, each node in the graph would have a length of 10. Why do we have to add k-1 to convert it? The length in nucleotides should also be k! In my example, each node would have a length in nucleotides of 10 (==k).

9.5 years ago
lexnederbragt ★ 1.3k

The nodes in the stats.txt file refer to the final graph, after unambiguous paths (parts of the de Bruijn graph that do not split or merge) have been merged into single nodes. This is no longer a de Bruijn graph. The length of each new node is now the number of nodes from the de Bruijn graph that went into it. But each node in the de Bruijn graph only contributes one new base (the part that does not overlap with the previous k-mer), so to derive the sequence length you have to add k-1, which accounts for the remaining bases of the first k-mer. There should really be a drawing to explain this; maybe one day I'll make one (as I need to explain this during my course on assembly).
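The arithmetic can be checked with a small sketch. This is only an illustration of the counting argument, not Velvet's code; the helper names `kmers` and `node_length_in_nucleotides` and the example sequence are made up for the purpose.

```python
def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def node_length_in_nucleotides(length_in_kmers, k):
    """Convert a node length given in k-mers to a length in nucleotides."""
    return length_in_kmers + k - 1


# Hypothetical 12-nt sequence standing in for one merged, unambiguous node.
seq = "GAATTCAGGCTA"
k = 5

path = kmers(seq, k)         # the k-mers that were merged into the node
length_in_kmers = len(path)  # the kind of value the manual says stats.txt reports

# The first k-mer contributes k bases, every later k-mer contributes 1.
print(length_in_kmers, node_length_in_nucleotides(length_in_kmers, k), len(seq))
```

A node made of a single k-mer has length 1 in k-mers and 1 + k - 1 = k in nucleotides, which matches the intuition in the question.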

Brilliant. That explains everything.

9.5 years ago

When using de Bruijn graphs in Velvet, the k-mer length k must always be an odd value. That way no k-mer can be its own reverse complement, so the program never has to deal with palindromic k-mers (think about creating the reverse complement of GAATTC, which has even length, or GAATTCA, which has odd length).
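A quick way to see the difference between the two examples is to compute the reverse complements. The sketch below is a plain illustration with made-up helper names (`reverse_complement`, `is_palindromic`); it assumes sequences contain only A, C, G and T.

```python
from itertools import product

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(kmer):
    """True if the k-mer equals its own reverse complement."""
    return kmer == reverse_complement(kmer)


print(reverse_complement("GAATTC"), is_palindromic("GAATTC"))
print(reverse_complement("GAATTCA"), is_palindromic("GAATTCA"))

# For odd k the middle base would have to be its own complement,
# which no base is, so no odd-length k-mer can be palindromic.
any_odd_palindrome = any(
    is_palindromic("".join(p)) for p in product("ACGT", repeat=5)
)
print(any_odd_palindrome)
```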

The de Bruijn graph then drops the first base of a node (leaving k-1 bases) and looks for the next node(s) that start with those k-1 bases. If the node is GAATTCA, it looks for all nodes of the form AATTCAX and then adds X to the assembled sequence.
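The extension step described above can be sketched as follows. This is a deliberately simplified toy: it ignores reverse complements and incoming branches, and the functions `successors` and `extend` are hypothetical names, not anything taken from Velvet.

```python
def successors(node, kmer_set):
    """Return the k-mers in kmer_set that start with the last k-1 bases of node."""
    suffix = node[1:]
    return [suffix + base for base in "ACGT" if suffix + base in kmer_set]


def extend(start, kmer_set):
    """Follow the path from start while exactly one successor exists,
    adding one base to the sequence per node visited."""
    sequence = start
    node = start
    seen = {start}
    while True:
        nxt = successors(node, kmer_set)
        if len(nxt) != 1 or nxt[0] in seen:  # dead end, branch, or cycle
            break
        node = nxt[0]
        seen.add(node)
        sequence += node[-1]
    return sequence


# Toy k-mer set (k = 7) built from a made-up 12-nt sequence.
k = 7
source = "GAATTCAGGCTA"
kmer_set = {source[i:i + k] for i in range(len(source) - k + 1)}

contig = extend("GAATTCA", kmer_set)
print(contig, len(kmer_set), len(contig))
```

Here six 7-mers are walked and the resulting sequence has 6 + 7 - 1 = 12 bases, the same k-1 correction the manual describes.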

I am pretty sure you will understand how de Bruijn graphs work if you read a little more about them online.

Well, according to the Wikipedia article, the length of a node is actually k.
