Fundamentals 9 min read

Understanding the KMP String Matching Algorithm with Implementation and Optimizations

This article explains the Knuth-Morris-Pratt (KMP) string‑matching algorithm, describes how the next array is built to capture prefix‑suffix information, shows Java implementations of the search and next‑array construction, and discusses common optimizations to reduce redundant comparisons.

IT Services Circle

Dec 24, 2024

Understanding the KMP String Matching Algorithm with Implementation and Optimizations

The article introduces the KMP (Knuth-Morris-Pratt) algorithm, a linear‑time string‑matching technique whose time complexity is O(m + n), where *m* is the pattern length and *n* is the text length.

It explains that ordinary brute‑force matching restarts the pattern from the first character after a mismatch, while KMP exploits the information that the prefix of the pattern that matched before the failure can be reused. This reuse is achieved through the next array, which records for each position the length of the longest proper prefix that is also a suffix.

The article provides a concrete example with the pattern abcabea, showing how the next values are computed (‑1, 0, 0, 0, 1, 2, 0) and why they allow the algorithm to skip already‑matched characters.

Two Java code snippets are presented. The first implements the KMP search function strStr and a basic getNext method that fills the next array:

public int strStr(String s, String p) {
    int i = 0; // index in text
    int j = 0; // index in pattern
    int[] next = new int[p.length()];
    getNext(p, next);
    while (i < s.length() && j < p.length()) {
        if (j == -1 || s.charAt(i) == p.charAt(j)) {
            i++; j++;
            if (j == p.length()) return i - j; // match found
        } else {
            j = next[j]; // jump using next array
        }
    }
    return -1;
}

private void getNext(String p, int[] next) {
    int length = p.length();
    int i = 0, j = -1;
    next[0] = -1;
    while (i < length - 1) {
        if (j == -1 || p.charAt(i) == p.charAt(j)) {
            i++; j++; next[i] = j;
        } else {
            j = next[j];
        }
    }
}

A second version shows an optimized getNext that avoids unnecessary comparisons when the characters at the fallback position are identical:

private void getNext(String p, int[] next) {
    int length = p.length();
    int i = 0, j = -1;
    next[0] = -1;
    while (i < length - 1) {
        if (j == -1 || p.charAt(i) == p.charAt(j)) {
            i++; j++;
            if (p.charAt(i) == p.charAt(j))
                next[i] = next[j];
            else
                next[i] = j;
        } else {
            j = next[j];
        }
    }
}

The article also discusses how, when a mismatch occurs, the algorithm can jump directly to the next viable position without re‑examining characters that are known to match, and it illustrates this behavior with several diagrams.

Finally, the author provides a brief biography, noting extensive experience in data structures and algorithms, and offers a downloadable PDF of over 1,000 pages of algorithm notes.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Java Optimization algorithm KMP string-matching Next array

Written by

IT Services Circle

Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.