## Posts Tagged ‘**matching**’

## Knuth–Morris–Pratt Algorithm (KMP)

Knuth–Morris–Pratt algorithm is the most popular linear time algorithm for string matching. It is little difficult to understand and debug in real time contests. So most programmer’s have a precoded KMP in their kitty.

To understand the algorithm, you can either read it from Introduction to Algorithms (CLRS) or from the wikipedia page. Here’s a sample C++ code.

void preKMP(string pattern, int f[]) { int m = pattern.length(),k; f[0] = -1; for (int i = 1; i<m; i++) { k = f[i-1]; while (k>=0) { if (pattern[k]==pattern[i-1]) break; else k = f[k]; } f[i] = k + 1; } } bool KMP(string pattern, string target) { int m = pattern.length(); int n = target.length(); int f[m]; preKMP(pattern, f); int i = 0; int k = 0; while (i<n) { if (k==-1) { i++; k = 0; } else if (target[i]==pattern[k]) { i++; k++; if (k==m) return 1; } else k=f[k]; } return 0; }

NJOY!

-fR0DDY

## The Z Algorithm

In this post we will discuss an algorithm for linear time string matching. It is easy to understand and code and is usefull in contests where you cannot copy paste code.

Let our string be denoted by S.

* z[i]* denotes the length of the longest substring of S that starts at i and is a prefix of S.

*denotes the substring.*

**α***denotes the index of the last character of α and*

**r***denotes the left end of α.*

**l**To find whether a pattern(P) of length n is present in a target string(T) of length m, we will create a new string S = P$T where $ is a character present neither in P nor in T. The space taken is n+m+1 or O(m). We will compute z[i] for all i such that 0 < i < n+m+1. If z[i] is equal to n then we have found a occurrence of P at position i – n – 1. So we can all the occurrence of P in T in O(m) time. To calculate z[i] we will use the z algorithm.

The Z Algorithm can be read from the section 1.3-1.5 of book Algorithms on strings, trees, and sequences by Gusfield. Here is a sample C++ code.

bool zAlgorithm(string pattern, string target) { string s = pattern + '$' + target ; int n = s.length(); vector<int> z(n,0); int goal = pattern.length(); int r = 0, l = 0, i; for (int k = 1; k<n; k++) { if (k>r) { for (i = k; i<n && s[i]==s[i-k]; i++); if (i>k) { z[k] = i - k; l = k; r = i - 1; } } else { int kt = k - l, b = r - k + 1; if (z[kt]>b) { for (i = r + 1; i<n && s[i]==s[i-k]; i++); z[k] = i - k; l = k; r = i - 1; } } if (z[k]==goal) return true; } return false; }

NJOY!

-fR0D