Question

What Kinds Of Efficient Algorithms Can Be Used To Correct Read Errors

1

Entering edit mode

12.4 years ago

Snowball ▴ 10

This error correction problem caught my eye. Is there an O(n) algorithm for this, where n is the number of reads? There's the relatively straightforward O(n²) method of checking all the pairs, but I'm wondering if there's something better out there.

sequencing error read • 2.1k views

ADD COMMENT • link updated 12.4 years ago by jts ▴ 240 • written 12.4 years ago by Snowball ▴ 10

0

Entering edit mode

Forgive my ignorance, but I am not aware of the "O" term in the algorithm you mention, can you give us some more information on this method?

ADD REPLY • link 12.4 years ago by Josh Herr 5.8k

5

Entering edit mode

big O notation is just a way to describe how an algorithm performs given an input. You can read more about it here: http://en.wikipedia.org/wiki/Big_O_notation

For example, looking for an element in a dictionary/hash structure would be O(1) because you can directly access it in one operation. Whereas a primitive search algorithm where you iterate through an entire list of elements to find a specific element would be O(n) where n is the number of members in the list. A binary search tree where you effectively reduce your search space by 1/2 every node would give you O(logN) processes.

It's basically a way to describe the performance of an algorithm

ADD REPLY • link 12.4 years ago by Damian Kao 16k

1

Entering edit mode

+1 and thanks Damian! I come from a more strictly biological background so I need a little help with things sometimes.

ADD REPLY • link 12.4 years ago by Josh Herr 5.8k

score 5 · Answer 1 · 2012-12-14

5

Entering edit mode

12.4 years ago

jts ▴ 240

k-mer based error correction is much faster than overlap based methods. The complexity is effectively O(N) where N is the total number of bases. See the quake paper for a representative approach. Many other programs use a similar approach.

ADD COMMENT • link 12.4 years ago by jts ▴ 240