Suffix arrays are useful data structure that you already used

in the previous modules but now you will learn how to build it really fast and

first we'll recall what is a suffix array.

So the problem of construction of a suffix array is very simple you're given a string

and you need to sort all of it's suffixes in lexicographic order.

However as we will soon see you won't need to actually compute all the suffixes and

then solve them and

output all of them because that will use too much both time and memory.

You will just need to know in which order are those suffixes.

And the suffixes themselves sorted in lexicographical order

are only in our head.

They're not stored anywhere in the problem.

So we assume that the alphabet from which our strings are built are ordered, so

that any two characters we can say which one of them is smaller.

For example, in English we can order all the characters from a to z in a binary

alphabet we just have zero and one and zero is less than one.

By definition a string S is smaller than a different string T if either

S is a prefix of T or S and T coincide from beginning up to some character and

then the next character in S is smaller than the corresponding character in T.

For example, if s is ab, and t is bc.

Then they don't coincide.

But the first character is already different.

And the character in s is a.

And the character in t is b.

A is less than b, so s is less than t.

And in the second example, s and t coincide for the first two characters.

And then the third character c is less than character d.

So s is smaller than t.

And in the third case, s is a prefix of t, but

it is different from t, so s is smaller than t.

And here is an example of suffix array.

We have a string, s, and

all suffixes ordered in lexicographic order are a, aa, and so on.

So, here are exactly six suffixes because the length of string S is six,

and so we have six different suffixes.

We want to avoid this case when S is a prefix of T and that is why S is less than

T because this case is different from all others and usually you just compare S and

T from the first character and go the right until they differ.

And then see which character is smaller.

And this is a corner case when you go up to the end of the S, and

then you see that there is nothing there and so that is why S is smaller.

So to avoid using that rule at all, we will append a special character

called dollar to the end of the string for which we'll build suffix array.

So all the suffixes will have this dollar on the end.

And now if initially some suffix was a prefix of another suffix.

Now it is just smaller by the usual rule, because as soon as it ends,

and it is still coinciding with the prefix of the bigger suffix, the next character

in the smaller one is taller, which is smaller than all other characters.

And so we can determine by the usual rule

that the smaller suffix is actually smaller.