Fundamentals 10 min read

KMP String Matching Algorithm: Theory, Implementation, and Comparison with Other Search Methods

This article explains the KMP string‑matching algorithm, how it builds the prefix (next) table to avoid the O(M·N) worst‑case of naive search, provides JavaScript implementations, compares it with Boyer‑Moore and V8's indexOf strategy, and includes detailed code examples.

Beike Product & Technology

Aug 23, 2019

KMP String Matching Algorithm: Theory, Implementation, and Comparison with Other Search Methods

When discussing common string‑matching algorithms, the naive (brute‑force) method and the more sophisticated KMP algorithm are usually mentioned.

For example, to find the pattern ABCDABE in the text ABCDABCDEF, a brute‑force approach compares characters sequentially, moving both i and j pointers forward when they match.

The worst‑case time complexity of this naive method is O(M·N), which becomes unacceptable because binary data often causes mismatches only near the end of the pattern.

KMP solves this by keeping the i pointer fixed on the master string M and moving the pattern pointer j to a position k derived from previously matched prefixes, eliminating unnecessary re‑comparisons.

The key property is that when a mismatch occurs, the next position k satisfies P[0..k‑1] == P[j‑k..j‑1]. This leads to the construction of a next (prefix) array that tells where to jump in the pattern.

Using the prefix table, KMP can achieve linear time O(M+N). The article shows how to compute the next array:

function compute_prefix(p, next) {
  next[0] = -1; // initialization
  let j = 0;
  let k = -1;
  while (j < p.length - 1) {
    if (k == -1 || p[j] == p[k]) {
      next[++j] = ++k; // update next
    } else {
      k = next[k];
    }
  }
}

Comments explain why next[0] = -1 and how the algorithm handles the case when p[j] == p[k] by moving k forward.

An improved version avoids the “next‑value” pitfall when P[j] == P[next[j]]:

function compute_prefix(p, next) {
  next[0] = -1; let j = 0; let k = -1;
  while (j < p.length - 1) {
    if (k == -1 || p[j] == p[k]) {
      if (p[++j] == p[++k]) {
        next[j] = next[k];
      } else {
        next[j] = k;
      }
    } else {
      k = next[k];
    }
  }
}

The complete KMP search uses this table:

function KMP(M, P) {
  let i = 0, j = 0;
  let next = [];
  compute_prefix(P, next);
  while (i < M.length && j < P.length) {
    if (j == -1 || M[i] == P[j]) {
      i++; j++;
    } else {
      j = next[j];
    }
  }
  if (j == P.length) return i - j; else return -1;
}

Although KMP is efficient, the Boyer‑Moore (BM) algorithm often outperforms it, especially for longer patterns, achieving 3‑5× speedups in practice (e.g., GNU grep, text editors).

The article also examines how V8 implements String.prototype.indexOf. In src/string-search.h, five search strategies are defined: SingleCharSearch, LinearSearch, InitialSearch, BoyerMooreHorspoolSearch, and BoyerMooreSearch. The engine selects the most suitable algorithm based on the pattern length.

StringSearch(Isolate* isolate, Vector<const PatternChar> pattern)
    : isolate_(isolate), pattern_(pattern), start_(Max(0, pattern.length() - kBMMaxShift)) {
  if (sizeof(PatternChar) > sizeof(SubjectChar)) {
    if (!IsOneByteString(pattern_)) { strategy_ = &FailSearch; return; }
  }
  int pattern_length = pattern_.length();
  if (pattern_length < kBMMinPatternLength) {
    if (pattern_length == 1) { strategy_ = &SingleCharSearch; return; }
    strategy_ = &LinearSearch; return;
  }
  strategy_ = &InitialSearch;
}

Thus, indexOf dynamically chooses the fastest matching algorithm, combining the strengths of several classic methods.

Overall, the article provides a thorough walkthrough of KMP, its prefix table construction, JavaScript implementation, and its place among other string‑search techniques.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

algorithm KMP pattern search Prefix Table string-matching

Written by

Beike Product & Technology

As Beike's official product and technology account, we are committed to building a platform for sharing Beike's product and technology insights, targeting internet/O2O developers and product professionals. We share high-quality original articles, tech salon events, and recruitment information weekly. Welcome to follow us.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.