COME ON CODE ON

A blog about programming and more programming.

The Z Algorithm

with one comment

In this post we will discuss an algorithm for linear time string matching. It is easy to understand and code and is usefull in contests where you cannot copy paste code.

Let our string be denoted by S.
z[i] denotes the length of the longest substring of S that starts at i and is a prefix of S.
α denotes the substring.
r denotes the index of the last character of α and l denotes the left end of α.

To find whether a pattern(P) of length n is present in a target string(T) of length m, we will create a new string S = P$T where $ is a character present neither in P nor in T. The space taken is n+m+1 or O(m). We will compute z[i] for all i such that 0 < i < n+m+1. If z[i] is equal to n then we have found a occurrence of P at position i – n – 1. So we can all the occurrence of P in T in O(m) time. To calculate z[i] we will use the z algorithm.

The Z Algorithm can be read from the section 1.3-1.5 of book Algorithms on strings, trees, and sequences by Gusfield. Here is a sample C++ code.

bool zAlgorithm(string pattern, string target)
{
	string s = pattern + '$' + target ;
	int n = s.length();
	vector<int> z(n,0);
	
	int goal = pattern.length();
	int r = 0, l = 0, i;
	for (int k = 1; k<n; k++)
	{
		if (k>r)
		{
			for (i = k; i<n && s[i]==s[i-k]; i++);
			if (i>k)
			{
				z[k] = i - k;
				l = k;
				r = i - 1;
			}
		}
		else
		{
			int kt = k - l, b = r - k + 1;
			if (z[kt]>b)
			{
				for (i = r + 1; i<n && s[i]==s[i-k]; i++);
				z[k] = i - k;
				l = k;
				r = i - 1;
			}
		}
		if (z[k]==goal)
			return true;
	}
	return false;
}

NJOY!
-fR0D

Written by fR0DDY

August 29, 2010 at 12:03 AM

One Response

Subscribe to comments with RSS.

  1. buggy
    try: aabaa

    nima

    December 5, 2011 at 5:22 PM


Leave a comment