Suffix arrays – a contest approach



Yüklə 247,7 Kb.
Pdf görüntüsü
səhifə12/13
tarix20.06.2022
ölçüsü247,7 Kb.
#89807
1   ...   5   6   7   8   9   10   11   12   13
suffix-array

Problem 10:
(ACM SEER 2004) 
Given a string S, determine for each of its prefixes if it’s a periodic string. Hence, for every i (2 <= 
i <= N) we are interested in the greatest K > 1 (if there exists such one) satisfying the property that S’s 
prefix of length ican be also written as A
k
, or A concatenated with itself k times, for some string A. We 
are also interested which is that k. (0 <= N <= 1000000) 
Example: aabaabaabaab
Result: 
2 2 
6 2 
9 3 
12 4 
Explanation: prefix aa has the period a; prefix aabaab has the period aab; prefix aabaabaab has the period 
aab; prefix aabaabaabaab has the period aab. 
Solution:
See what happens when trying to match a string with one of its suffixes. Take a string and break it 
in two parts, getting a prefix and a suffix 
S = aab aabaabaaaab 
15


suf = aab aabaaaab 
pre = aab 
If the suffix matches some number of characters of the initial string which is >= |pre|, it means that 
pre is also a prefix of the suffix and we can also break the suffix in prefix and suffix1, and the string can 
be broken in prefix, prefix and suffix1. If the string matches some arbitrary number of characters from the 
string >= 2|pre| then the suffix matches suffix1 on a number of characters >= |pre| thus suffix1 can be 
written as prefix and suffix2, then suffix can be written as prefix prefix suffix2, so S can be written as 
prefix prefix prefix sufix2. 
S = aab aab aab aaaab 
suf = aab aab aaaab 
suf1 = aab aaaab 
pre = aab 
Observe that if S matches at least k * |prefix| characters of its suffix, then S has a prefix of length (k + 1) * 
|prefix|, which is periodic. 
Using the suffix array, we can find out for each suffix its maximum matching with the initial string. If the 
i
th
suffix matches the first k * (i – 1) positions then we can update the information that says that the 
prefixes of length j * (i – 1) (where 2 <= j <= k) are periodic. For every suffix Si the update has a 
maximum complexity of O(n/(i – 1)). Thus the algorithm gets an overall complexity of O(n log n). 
There is a similar solution using the KMP algorithm, that can be done in O(n), but it doesn’t match this 
paper’s purpose.

Yüklə 247,7 Kb.

Dostları ilə paylaş:
1   ...   5   6   7   8   9   10   11   12   13




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©genderi.org 2024
rəhbərliyinə müraciət

    Ana səhifə