The Problem
Given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
This is the classic str.find() or string.indexOf() problem.
Approach 1: Brute Force Sliding Window — O(N * M)
We align needle at each possible starting index of haystack and check character by character.
- Time Complexity: $O((N - M + 1) \times M)$ which simplifies to $O(N \times M)$ in the worst case (e.g.
haystack = "aaaaaaaaab",needle = "aaab"). - Space Complexity: $O(1)$ auxiliary space.
Approach 2: Knuth-Morris-Pratt (KMP) Algorithm — O(N + M)
KMP avoids resetting the search pointer all the way back when a mismatch occurs. It uses a LPS (Longest Proper Prefix which is also a Suffix) table to know how many characters we can safely skip.
- Time Complexity: $O(N + M)$ because the pointers never move backward.
- Space Complexity: $O(M)$ to store the LPS array.
KMP Substring Search in Action
Observe how KMP skips redundant comparisons using prefix matches. This prefix logic helps search strings in linear time:
Key Insight
The brute-force approach does redundant checks, while KMP remembers previous character comparisons to skip ahead. While KMP is harder to implement from memory, its $O(N + M)$ execution profile is a crucial optimization for heavy text-processing pipelines.