Question

KMP algorithm's time complexity

0

Entering edit mode

23 months ago

jessewhite8077 ▴ 20

I'm attempting to implement strstr using the KMP method. This is the algorithm as described in Wikipedia. The temporal complexity of the KMP algorithm is O(n), where n is the length of the bigger string.

vector<int> KMP(string S, string K)
{
    vector<int> T(K.size() + 1, -1);
    vector<int> matches;

    if(K.size() == 0)
    {
        matches.push_back(0);
        return matches;
    }
    for(int i = 1; i <= K.size(); i++)
    {
        int pos = T[i - 1];
        while(pos != -1 && K[pos] != K[i - 1]) pos = T[pos];
        T[i] = pos + 1;
    }

    int sp = 0;
    int kp = 0;
    while(sp < S.size())
    {
        while(kp != -1 && (kp == K.size() || K[kp] != S[sp])) kp = T[kp];
        kp++;
        sp++;
        if(kp == K.size()) matches.push_back(sp - K.size());
    }

    return matches;
}

I'm not sure how the complexity of this algorithm is O. (n). Can somebody explain how this code's complexity is O(n)?

algorithm • 1.7k views

ADD COMMENT • link updated 23 months ago by umang • 0 • written 23 months ago by jessewhite8077 ▴ 20

0

Entering edit mode

How is this related to bioinformatics?

ADD REPLY • link 23 months ago by Ram 44k

0

Entering edit mode

Follow up question: Why is this post almost a word-for-word copy of https://forum.wearedevs.net/t/29958?

ADD REPLY • link 23 months ago by Ram 44k

score 1 · Answer 1 · 2023-01-10

1

Entering edit mode

23 months ago

lennykovac ▴ 110

The KMP algorithm searches a pattern P of length m in a query Q of length n, then the algorithm compares worst-case m * (n - m +1) elements.

Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. The letter O is used because the growth rate of a function is also referred to as the order of the function. WIKI

EDIT: As Rob pointed out this would result in a complexity of O(n*m)! And this is indeed not linear. But there are implementations that scale it down to O(n+m) which is in O(n).

Here you can find the most common time-complexities

ADD COMMENT • link 23 months ago by lennykovac ▴ 110

4

Entering edit mode

This isn't the case. That is, m (n - m + 1) is not in O(n). In fact, KMP is not actually O(n), but O(m + n). It takes time linear in the size of the pattern (O(m)) to build the partial match table, and then O(n) time given the query and partial match table to enumerate the occurrences. Performing m (n-m+1) comparisons would make the algorithm O(m*n), which is, in fact, the asymptotic complexity of "naive" substring search.

ADD REPLY • link 23 months ago by Rob 6.9k

score 0 · Answer 2 · 2023-01-31

0

Entering edit mode

23 months ago

umang • 0

Not sure if this is the right forum to ask about KMP. Anyways you can read https://www.w3spot.com/2020/07/kmp-algorithm-explained-in-plain-english.html article. It explains time complexity of KMP algorithm in detail.

ADD COMMENT • link 23 months ago by umang • 0