0:00

In this video we are going to formally prove

an upper bound on the running time of the randomized quick sort algorithm.

Namely, we are going to prove that the average running time of

the randomized quick sort algorithm is big O of n log n,

and that, in the worst case, the running time is big O of n squared.

Â 0:24

Well, before going into the details of the proof,

let's again gain some intuition.

First of all, let's note that what we need to estimate is the number of comparisons.

Why is that?

Well, because the quick sort algorithm consists of

first a call to the partition procedure and then two recursive calls.

Each of these two recursive calls is in turn unwound into

Â 0:50

a call to the partition procedure and another two smaller recursive calls.

So what is going on inside the quick sort algorithm is essentially many calls

to the partition procedure.

And inside the partition procedure, what we actually do is compare

all the elements of the current subarray with the pivot element, right?

So what we estimate is the total number of comparisons,

which is not surprising, because the quick sort

algorithm is a comparison based algorithm.
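To make this concrete, here is a minimal sketch (my own illustration, not code from the lecture) of randomized quick sort instrumented to count comparisons; note that every comparison happens inside the partition procedure:

```python
import random

def randomized_quick_sort(a):
    """Sorts the list a in place and returns the number of
    element-to-element comparisons made.  All comparisons happen
    inside partition, where each element of the current subarray
    is compared with the pivot exactly once."""
    comparisons = 0

    def partition(lo, hi):
        nonlocal comparisons
        p = random.randint(lo, hi)        # pick a random pivot
        a[p], a[hi] = a[hi], a[p]         # move it to the end
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            comparisons += 1              # the only comparison in the algorithm
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]         # pivot lands in its final position
        return i

    def sort(lo, hi):
        if lo < hi:
            m = partition(lo, hi)
            sort(lo, m - 1)
            sort(m + 1, hi)

    sort(0, len(a) - 1)
    return comparisons
```

For example, `randomized_quick_sort([3, 1, 2])` sorts the list and reports how many comparisons were spent doing so.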

Â 1:21

Well, now let me also explain why balanced partitions are better for us.

I will explain this intuitively.

So consider this toy example:

an array of size seven.

Assume that we selected 1 as the pivot element,

so we partitioned the array as shown here on the left.

Now let's see what we now know.

We compared all the elements in this array with the pivot element,

which is 1 in this case.

So now we know that the final position of

the pivot element is just the first position in the array.

That is, we know that 1 is the minimum value.

However, we know nothing about pairs of other elements, right?

So we only learned that 1 is the minimum value.

Now consider another possibility:

the balanced partition shown on the right.

So assume that we selected 4 as the pivot element.

Â 2:20

I claim that in this case the partition is much better for

us, because it saves many subsequent comparisons.

So look: in this case, in the subsequent calls of the partition procedure,

we are not going to compare the elements 3, 1, 2 with the elements 6, 5, 7,

because we already know that all the elements to the left are smaller than 4,

and all the elements to the right are larger than 4.

The left part will be handled in a separate recursive call, and

the right part will be handled in another separate recursive call.

So once again, balanced partitions save us a lot of comparisons

that we do not need to make in the subsequent calls to the partition procedure.
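As a hedged illustration of how costly consistently unbalanced partitions are (this variant is mine, not from the lecture): a quick sort that always picks the first element as pivot makes every partition maximally unbalanced on an already sorted array, so it performs all n(n-1)/2 possible comparisons:

```python
def quick_sort_first_pivot(a):
    """Quick sort that always uses the FIRST element as the pivot.
    Returns the number of comparisons made.  On an already sorted
    input every partition puts nothing on the left, so each call
    only peels off one element and the total count is n*(n-1)/2."""
    a = list(a)
    comparisons = 0

    def sort(lo, hi):
        nonlocal comparisons
        if lo >= hi:
            return
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return comparisons

n = 100
print(quick_sort_first_pivot(range(n)))  # 4950 == n*(n-1)/2
```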

Another thing I would like to discuss with you before going into

the details of the proof is the following.

Our algorithm is randomized, so its running time and

its number of comparisons depend on the random bits used inside the algorithm.

In particular, for

any two elements there is some probability that they are going to be compared.

Â 3:30

Using the toy example shown on the slide,

I would like to build an intuition on how to estimate

this probability and on which factors this probability depends.

So consider this small example:

an array of size nine containing all the digits from 1 to 9.

I would like to estimate the probability that the elements 1 and

9 are going to be compared if we call randomized quick sort on this array.

So, let's see what happens.

Assume that in the very first call of the partition procedure

we select, for example, the element 3 as the pivot element. So what happens?

In this case, 1 will go to the left of 3, and 9 will go

to the right of 3.

So in this case 1 and 9 will be in different parts, and

they will never be compared in

Â 4:24

any subsequent call of the partition procedure, just because they are already in different parts.

In other words, we already know that 1 is smaller than 9,

because 1 is smaller than 3, and 3 is smaller than 9, right?

So we do not need to compare them.

Â 4:43

Well, this happens if we select as our pivot element any of the elements

2, 3, 4, 5, 6, 7 or 8.

If, on the other hand, we select 1 or 9 as our first pivot element,

then 1 and 9 will be compared.

Just because, well, if we select, for example, 9 as the pivot element,

we compare it with all the elements of our array, in particular with 1.

So there are two cases when 1 and 9 are compared,

and these are exactly the cases when either 1 or

9 is selected as the first pivot.

In all other seven cases they are not compared.

This means that the probability that they are compared is 2 over 9.
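We can sanity-check this 2/9 figure empirically (a sketch of mine, assuming a Lomuto-style partition with a uniformly random pivot): run randomized quick sort on the digits 1 through 9 many times and record how often 1 and 9 are compared with each other:

```python
import random

def were_compared(a, x, y):
    """Run randomized quick sort on a copy of a and report whether the
    values x and y were ever compared with each other.  Two elements
    are compared exactly when one of them is the pivot of a subarray
    containing the other."""
    a = list(a)
    hit = False

    def sort(lo, hi):
        nonlocal hit
        if lo >= hi:
            return
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if {a[j], pivot} == {x, y}:
                hit = True                # x and y compared here
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return hit

trials = 20000
digits = [1, 2, 3, 4, 5, 6, 7, 8, 9]
freq = sum(were_compared(digits, 1, 9) for _ in range(trials)) / trials
print(freq)  # close to 2/9 ≈ 0.222
```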

Â 5:42

Now consider the elements 3 and 4: the probability that they are compared is 1,

and the explanation is the following.

There is no element inside our array that can help the randomized

quick sort algorithm to understand that 3 is smaller than 4 without comparing them.

For 1 and 9, there are seven such elements,

namely all the remaining elements.

I mean, if we partition with respect to any of these elements,

we already know that 1 is smaller than 9, because they go to

different parts with respect to this pivot.

For 3 and 4 there is no such element,

so the algorithm just must compare these two elements

to be sure that 3 is smaller than 4.

So in this case the probability is 1.

Well, this shows that the probability of comparing two elements

depends on how close they are in the sorted array.

In particular, if they are very far apart from each other then the probability

is small, and if they are close to each other then the probability is high.

We will use this observation in the formal proof of our statement.

We now start to formally prove an upper bound

on the running time of the randomized quick sort algorithm.

For this, we introduce the following random variable.

Let i and j be different indices from 1 to n.

We define χ_ij to be equal to 1 if the two elements A'[i] and

A'[j] are compared in the randomized quick sort algorithm, and

to be equal to 0 otherwise.

Â 7:13

Once again, to estimate the running time of the quick sort algorithm,

we would like to estimate the total number of comparisons made, so we would like to

estimate, for any pair of elements, the probability that they are compared.

As we discussed on the previous slide, the probability that two elements

are compared depends on how close they are in the sorted version of our array.

For this reason, we define χ_ij with respect to the sorted array A'.

We do not have this sorted array, right?

We are only constructing it in the quick sort algorithm, but

we use it just for the analysis, okay?

Â 7:55

The next thing to note is the following.

Any two elements of our array (initial or sorted, it doesn't matter)

are either compared just once or not compared at all.

So, why is that?

Why just once?

Well, if two elements are compared at some point,

this means that at this point one of these elements is the pivot, because in the partition

procedure we compare the pivot with all the elements of the current subarray.

So, if two elements are compared, then one of them is a pivot.

This also means that right after this call to the partition

procedure, we are not going to use this pivot element again.

We put the pivot element into its final place, and

we are not going to touch it in any of the subsequent calls.

Since each of the at most n choose 2 pairs is compared at most once, this immediately implies the quadratic upper bound

on the worst-case running time of our algorithm.

Â 9:08

Now comes the most important observation of this proof.

I claim that the elements A'[i] and

A'[j] are compared if and only if

the first pivot selected in the subrange of the sorted array A'

from index i to index j is either A'[i] or A'[j].

Â 9:39

First of all, while we keep selecting random pivots which are not in

this subrange, all the elements from this subrange go

together, either to the left of such a pivot or to the right of it.

So they all stay together in the same branch of the recursion tree, okay?

So before we select a pivot which lies inside this subrange,

all these elements stay together in the same subarray.

Now, assume that we selected a pivot from this subrange, and

assume that it is neither A'[i] nor A'[j].

In this case, A'[i] and A'[j] will be split apart:

they will go into different parts with respect to this pivot, right?

At this point I must assume that all the elements in our array are different,

that is, pairwise distinct, okay?

So once again, if the first selected element

from this subrange is neither A'[i] nor

A'[j], then these two elements are not going to be compared,

because right after the partition procedure uses this pivot

from this subrange, A'[i] and A'[j] will go to different parts, right?

If, on the other hand, the first selected

pivot from this subrange is either A'[i] or A'[j],

then these two elements are going to be compared, right?

So this is the most important observation in this proof.

Everything else is just calculations.

So if this is clear, let's then estimate

the probability that the corresponding two elements are compared.

So we know that they are compared if and

only if the first selected pivot in this subrange is one of these two elements.

This helps us to estimate the probability of

Â 11:47

the event that χ_ij is equal to 1.

Well, this is equal to 2,

because we have only two favorable choices,

namely either A'[i] or A'[j], divided by the total number of choices,

that is, the total number of elements in this subrange,

and this is j minus i plus 1.

So the probability that χ_ij is

equal to 1 equals 2 divided by (j minus i plus 1).
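The computation just described can be written compactly as follows (since the first pivot chosen from the subrange A'[i..j] is uniformly distributed over its j − i + 1 elements, and exactly two of them lead to a comparison):

```latex
\Pr\bigl[\chi_{ij} = 1\bigr]
  = \frac{\#\{A'[i],\, A'[j]\}}{\#\{A'[i], A'[i+1], \ldots, A'[j]\}}
  = \frac{2}{j - i + 1}
```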

For example, suppose j and i differ by 1,

so j is equal to i plus 1;

these are neighboring elements in the sorted array.

Then this probability is equal to 2 divided by (1 plus 1).

This is 2 over 2, which is 1.

This just reflects the fact that if two elements are neighbors in

the sorted array, then the algorithm just must compare them

to understand that one of them is smaller.

There is no other element that can help our algorithm to realize

that one of these elements is smaller than the other one, okay?

Â 12:49

This in turn helps us to estimate the expected value of this random variable.

So recall that if we have a random variable which takes only the values 0 and

1, then its expected value is 1 multiplied by

the probability that it takes

value 1, plus 0 multiplied by the probability that it takes the value 0.

Well, 0 multiplied by something is 0,

so what is left is just 1 multiplied by the probability

that it takes value 1.

So the expected value of χ_ij is equal to 2 divided by (j minus i plus 1).
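In symbols, the expectation computation above is:

```latex
\mathbb{E}\bigl[\chi_{ij}\bigr]
  = 1 \cdot \Pr\bigl[\chi_{ij} = 1\bigr] + 0 \cdot \Pr\bigl[\chi_{ij} = 0\bigr]
  = \Pr\bigl[\chi_{ij} = 1\bigr]
  = \frac{2}{j - i + 1}
```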

Â 13:36

What remains is to sum up these

random variables χ_ij over all possible i and j.

So, once again, the expected value, or average value,

of the number of comparisons made is the expected value of the sum

of χ_ij over all possible pairs i and j,

and the expected value of a sum is the sum of the expected values.

So we can write the following.

The average running time is equal to the sum, over all possible

pairs of different indices, of the expected values of χ_ij, and

we know this expected value already.

So this is a sum, over all possible pairs of different indices i and j

where j is greater than i, of 2 divided by (j minus i plus 1).

Well, we can take the constant 2 out

and consider a fixed i.

For this i, the index j ranges from i plus 1 to n.

So what we have for this fixed i in this sum is a subset

of the following sum: 1/2 plus 1/3 plus 1/4 and so on, up to 1/n.

And this is actually a known sum.

It is called the harmonic series, and it is known that it grows logarithmically.

Once again, 1 over 2 plus 1 over 3 plus, and so on, up to

1 over n, is theta of logarithm of n.
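A quick numerical check of this claim (my own sketch): the harmonic sum H_n = 1 + 1/2 + ... + 1/n stays within a bounded distance of ln n, so it is indeed theta of log n; in fact, H_n minus ln n converges to the Euler–Mascheroni constant γ ≈ 0.5772.

```python
import math

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 1_000, 100_000):
    # The difference H_n - ln(n) approaches gamma ≈ 0.5772 as n grows.
    print(n, harmonic(n), math.log(n), harmonic(n) - math.log(n))
```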

Well, this means that, for each i, the corresponding sum, which

ranges over all j from i plus 1 to n, grows at most logarithmically.

Since we have n choices for

i, from 1 to n, the total sum grows at most as n log n.

Okay, and this concludes our proof.
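Putting it all together (a sketch under the same assumptions as before, not the lecture's code): the theoretical expectation, the double sum of 2/(j − i + 1) over all pairs i < j, is bounded by 2·n·ln n, and the observed average number of comparisons of randomized quick sort tracks it closely:

```python
import math
import random

def count_comparisons(a):
    """Randomized quick sort on a copy of a; returns the comparison count."""
    a = list(a)
    cnt = 0

    def sort(lo, hi):
        nonlocal cnt
        if lo >= hi:
            return
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            cnt += 1
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return cnt

def expected_comparisons(n):
    """Sum of E[chi_ij] = 2/(j - i + 1) over all pairs 1 <= i < j <= n."""
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))

n, trials = 50, 2000
avg = sum(count_comparisons(range(n)) for _ in range(trials)) / trials
# avg should be close to expected_comparisons(n),
# and both stay below the 2*n*ln(n) bound from the proof.
print(avg, expected_comparisons(n), 2 * n * math.log(n))
```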
