Prefix function, Z function


To solve this problem, the usual enumeration will not work, because its asymptotics will be O(t*s). Therefore, for substring search tasks, the KMP (Knuth-Morris-Pratt) algorithm is used.
To implement this algorithm, the string prefix function is used, this is an array of numbers of length (strong>s length), in which each element is the  maximum length of the largest own suffix of the substring s, matching its prefix. For example:
 
Line Prefix function Note
abababcab 0 0 1 2 3 4 0 1 2  
abcabcd 0 0 0 1 2 3 0  
aabaaab 0 1 0 1 2 2 3  

Trivial prefix function algorithm with O(n^3) asymptotics:

vector<int> prefix_function (string s) {
int n = (int ) s.length();
vector<int>pi(n);
for (int i=0; i<n; ++i)
for (int k=0; k<=i; ++k)
if (s.substr(0,k) == s .substr(i-k+1,k))
pi[i] = k;
returnpi;
}

And now we need to make a prefix function for a string of the form: & nbsp; t + # + s, where # is a delimiter character that is clearly not in the text.  Analyzing the values ​​of the prefix function after the corresponding separator character, it should be noted that if the resulting value is equal to the length of the desired substring, then its occurrence is found. For example, for the string "abababcab" and the desired substring "abab", the prefix function will be:
0 0 1 2 0 1 2 3 4 3 4 0 1 2 we need to parse the elements of the corresponding string s:
1 2 3 4 3 4 0 1 2 - there are two cases where the value is 4 ( 4th and 6th ), i.e. length t,  as a result, the answer must be written: 4 - 4 (length t) & nbsp; = 0 and 6 - 4 = 2. It can be seen that these are the correct answers and the string "abab" is actually a substring in the string "abababcab" in positions 0 and 2.

 

After optimizing the prefix function (for details here ), we get the final algorithm with O(n) asymptotics:

vector<int> prefix_function (string s) {
int n = (int ) s.length();
vector<int>pi(n);
for (int i=1; i<n; ++i) {
int j = pi[i-1< /span>];
while (j > 0 && s[i] != s[j])
j = pi[j-1];
if (s[i] == s[j]) ++j;
pi[i] = j;
}
returnpi;
}

Z-function
Z-function from string S - array Z, each element of which is Z[i ] is equal to the longest prefix of the substring starting at position i in the string S, which is also the prefix of the entire string Z. The value of the Z-function at position zero is usually either zero or the length of the entire string.
Difficulty
O(|S| ^ 2) or O(|S|).
 
Prefix function from the string S - array P, each element of which P[i] is equal to the longest suffix of the substring starting from position < code>i in the string S, which is also the suffix of the entire string S. The value of the P-function at position zero is usually either zero or the length of the entire string. 
Difficulty
O(|S| ^ 2) or O(|S|).
 
Try to implement Z function and prefix function for O(|S| ^ 2).

 
Both Z and the function prefix can be used to implement the KMP(Knuth-Morris-Pratt) algorithm for finding a substring in a string in O(|S|). The essence of this algorithm is as follows: we attribute to the string we want to find the string in which we are searching. It is highly desirable to put a separator character between these lines, i.e. a character that does not occur in any line (usually #).