0:39

doing a process called partitioning, which is the third line from the bottom. Partitioning takes an array to be sorted and divides it into two pieces around a position j, with the property that every element with an index less than j is less than a[j], and every element with an index bigger than j is bigger than a[j], or maybe equal. So it partitions the array into those two parts, and if an array has that property, then you can get it sorted simply by making the two recursive calls that do this sort: sort the left part, sort the right part.

The partitioning process is handled by running two indices, one i and one j, with i going from low to high and j going from high to low. We simply scan from the left, incrementing i as long as we have elements that are less than the partitioning element, which we take to be a[lo] to start it off. Then we scan from the right, decrementing j as long as we have bigger ones. If we've found a small one on the right and a big one on the left, we exchange them. And we keep going until those pointers cross, at which point we put the partitioning element into its place. If you're not familiar with that, read about the details in Algorithms, 4th edition, or on the book's site; this is just a quick review.
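To make the scan-exchange-cross description concrete, here is a sketch of that partitioning scheme with the recursive sort around it. The course's code is in Java; this is an illustrative Python translation, which takes a[lo] as the partitioning element and assumes the caller has shuffled the input.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around v = a[lo]; return v's final position j."""
    v = a[lo]
    i, j = lo, hi + 1
    while True:
        i += 1                      # scan from the left past elements < v
        while i < hi and a[i] < v:
            i += 1
        j -= 1                      # scan from the right past elements > v
        while j > lo and v < a[j]:
            j -= 1
        if i >= j:                  # pointers crossed: partitioning is done
            break
        a[i], a[j] = a[j], a[i]     # big one on the left, small one on the right
    a[lo], a[j] = a[j], a[lo]       # put the partitioning element into place
    return j

def quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] by partitioning, then sorting the two parts recursively."""
    if hi is None:
        hi = len(a) - 1
    if hi <= lo:                    # one element (or none): nothing to do
        return
    j = partition(a, lo, hi)
    quicksort(a, lo, j - 1)         # sort the left part
    quicksort(a, j + 1, hi)         # sort the right part
```

After the call to partition, everything with an index less than j is less than a[j] and everything with an index greater than j is greater, which is exactly the property the two recursive calls rely on.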

So we have to make some preliminary decisions when we're going to study that kind of code. The first one is the cost model, and the cost we're going to talk about is the running time.

2:26

But really, it's better to come up with an abstraction. We're not really talking about microseconds or seconds; what we want to do is talk about something abstract. And for a sorting algorithm, what the algorithm is doing most often is comparing two items. So let's just count compares. Counting compares takes us from time to a discrete counting process, and that means our analysis will be valid for any kind of machine at all. We get the frequency of execution of the compares, and then the time is just how quickly you can do a compare; that all comes rolled up into a constant. So our hypothesis is going to be: if the number of compares is C, then the running time is ~aC on any computer. Now, a is going to be different for different computers, different systems, or different languages, but it's a reasonable hypothesis, and it bears out in many, many practical situations. So that's a big decision at the beginning: to get away from time and into something abstract associated with the algorithm, without losing the ability to do scientific studies later on.

So, the input model: we're going to assume the input to be randomly ordered, and as I mentioned, in the case of sorting algorithms, that's pretty easy to arrange; we just randomly order it. Also, just to make the math simpler in this first cut, we're going to assume the keys to be all different. That's not always so easy to arrange, and I'll have more to say about that a little later. So it's a key question whether these assumptions are realistic, and I think the answer is: we'll get into that. We made these assumptions and these models with the knowledge that we're going to be able to test them scientifically, and we can go ahead and do that; that's part of the process that I'll describe. So the bottom line from this slide is that we're going to count compares, and we're going to assume that the keys are randomly ordered and distinct. In that context, now, there are some relevant questions about this quicksort code. We're going to assume an array of size N, with entries distinct and randomly ordered.

So the first question is, let's break out the partitioning process: how many compares are involved in partitioning? Well, that doesn't take much analysis. What happens is that every item in the array is touched in order to get the partitioning done. Actually, there's one extra compare because the partitioning pointers cross, so it's N + 1.

6:38

So it's a recursive program, and we have a random input model, and that leads to a formula called a recurrence relation. We'll look at recurrence relations in detail later on. We've got N entries, distinct and randomly ordered. If we let C_N be the expected number of compares used by quicksort, then we can write down an equation for C_N, a mathematical model that reflects the recursive structure of the program itself.

7:30

Partitioning costs N + 1 compares, and the partitioning element is equally likely to end up in each position k. The subarrays are randomly ordered, so the cost of doing the left subarray, which is of size k - 1, is just C_{k-1}, and the cost of doing the right subarray is C_{N-k}. So the expected cost satisfies

C_N = N + 1 + (1/N) sum for 1 <= k <= N of (C_{k-1} + C_{N-k})

That formula describes the number of compares used by quicksort. There's one more detail: if there's only one element, we don't do anything, so we have to make sure that we define the small values of N appropriately.
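To make the model concrete, the recurrence can be evaluated just as it is written. Here is a Python sketch; note that taking the recurrence to hold for every N >= 1 with C_0 = 0 is a convention chosen here as an assumption, since the lecture only says that the small values must be defined appropriately.

```python
def c_direct(n_max):
    """Evaluate C_N = N + 1 + (1/N) * sum_{k=1..N} (C_{k-1} + C_{N-k})
    for N = 0..n_max, assuming C_0 = 0 and the recurrence for all N >= 1.
    The double loop implied by the formula makes this quadratic time."""
    c = [0.0]
    for n in range(1, n_max + 1):
        s = sum(c[k - 1] + c[n - k] for k in range(1, n + 1))
        c.append(n + 1 + s / n)
    return c
```

Under this convention the first values come out as C_1 = 2, C_2 = 5, C_3 = 26/3.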

8:21

So I'm going to do it for this specific recurrence, and then in a couple of lectures we'll talk more generally about how to solve problems like this. It's a mathematical problem:

C_N = N + 1 + (1/N) sum for 1 <= k <= N of (C_{k-1} + C_{N-k})

and how are we going to possibly approach something like this? Now, for this one, there are a couple of tricks that get the job done, and I'm not going to try to explain the origin of those tricks; what I want to show is how simple the calculation is. We'll get this solved in just a couple of slides.

So the first thing to notice is that the two subfile terms are really the same. If you write them out in a different notation, both sums are C_0 + C_1 + ... + C_{N-1}; that is, each subfile could be any one of those sizes with equal probability, so the two sums are the same. You can reduce it to just one sum, with a factor of 2/N that stays outside the sum:

C_N = N + 1 + (2/N) (C_0 + C_1 + ... + C_{N-1})

That's a simpler form right there, whether you get it totally mechanically, from the observation that the two sums are the same, or argue more broadly that there's no difference between the two subfiles; you could start with either one. They are symmetric.

9:53

Now, the next trick is to get rid of the sum, and the way we're going to do that is to write down the same formula for N - 1, and then subtract. First multiply both sides by N to clear the 1/N:

N C_N = N(N + 1) + 2 (C_0 + C_1 + ... + C_{N-1})

The same formula for N - 1 is

(N - 1) C_{N-1} = (N - 1) N + 2 (C_0 + C_1 + ... + C_{N-2})

Subtracting, on the left we get N C_N - (N - 1) C_{N-1}. On the right, the term N(N + 1) - (N - 1) N just leaves 2N; that's a little algebra. And the reason we did it is that when we take the formula for N and subtract the same formula for N - 1, all that's left of the sums is just one term, 2 C_{N-1}. So now we have a recurrence relation with no sum in it, and that's clearly going to be much simpler to deal with. Then, just collecting terms, we have this very simple recurrence relation describing the number of compares taken by quicksort:

N C_N = (N + 1) C_{N-1} + 2N

It's not a solution yet, but it's way simpler than what we started with, and certainly way simpler than scratching our heads, looking at the program, trying to understand how many compares it takes.
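As a sanity check on that algebra, we can verify that the sum-free recurrence N C_N = (N + 1) C_{N-1} + 2N produces exactly the same values as the original sum form. A sketch using exact rational arithmetic, under the assumed convention that the recurrence holds for all N >= 1 with C_0 = 0:

```python
from fractions import Fraction

def c_from_sum(n_max):
    """Original recurrence: C_N = N + 1 + (2/N) * (C_0 + ... + C_{N-1})."""
    c, total = [Fraction(0)], Fraction(0)
    for n in range(1, n_max + 1):
        c.append(n + 1 + 2 * total / n)
        total += c[n]            # running sum of C_0..C_N
    return c

def c_telescoped(n_max):
    """Sum-free recurrence: N * C_N = (N + 1) * C_{N-1} + 2N."""
    c = [Fraction(0)]
    for n in range(1, n_max + 1):
        c.append(((n + 1) * c[n - 1] + 2 * n) / n)
    return c
```

Exact rational arithmetic with `fractions.Fraction` sidesteps floating-point round-off, so the agreement is exact rather than approximate.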

11:08

And actually, this is a pretty good milestone in the analysis of this algorithm, because we've now got a pretty efficient algorithm for computing the result. If we tried to compute the result from the original recurrence, it would take quadratic time: to get up to N, you have to do a double for loop, for every N and, within that, for every k. That's the code, and I'll talk more about using code to learn about recurrences.

So suppose we want to use that to estimate the value of C_N for, say, N equal to a million. We don't have a million squared cycles, even on fast computers today. Or for a billion, or a trillion; people are trying to sort files that big, so we'd want to be able to compute it. The original recurrence is not good enough to do it, but with the manipulations that we just made, we have a linear-time algorithm for computing it: just start at zero, and then use this formula to compute each later value from the previous one. So we might not know too much yet about what the quantity is, but we know enough to calculate the number of compares for any N just by running this simple program.
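That linear-time computation is only a few lines. A Python sketch, under the same assumed convention (C_0 = 0, recurrence for all N >= 1):

```python
def quicksort_compares(n_target):
    """Compute C_N for N = n_target in linear time, using
    C_N = (N + 1) / N * C_{N-1} + 2, starting from C_0 = 0."""
    c = 0.0
    for n in range(1, n_target + 1):
        c = (n + 1) * c / n + 2.0
    return c
```

One pass suffices even for N in the millions, where evaluating the original sum form naively would need on the order of N squared operations.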

And it's interesting to think about: I talked about suppressing detail, and really, what we want is a fast way to compute the running time of a program. This is a good one, and there are a lot of algorithms we'd like to be able to get this far on. Actually, we're going to want a much, much faster way: we want to compute it by just evaluating a few elementary functions. So let's look at how to do that next.

12:53

Now we're going to actually solve it: we're going to express this quantity in terms of familiar elementary functions. How are we going to do that? The first step is a magic step. It's kind of tricky, but there's actually solid motivation for it that we'll talk about later on. We're going to take both sides of the equation and divide by N(N + 1). On the left, N C_N divided by N(N + 1) is just C_N over N + 1. On the right, the N + 1's cancel and we get C_{N-1} over N, and 2N over N(N + 1) is just 2 over N + 1:

C_N / (N + 1) = C_{N-1} / N + 2 / (N + 1)

That looks like a slightly more complicated form, but it's actually a much simpler one, because the first term on the right is the same as the term on the left, just with N reduced by one. And what that means is that we can apply this same formula to that term. If we plug in C_{N-1} / N = C_{N-2} / (N - 1) + 2 / N, then we have

C_N / (N + 1) = C_{N-2} / (N - 1) + 2 / N + 2 / (N + 1)

and we can continue doing that until we get down to our starting point, C_1.

14:18

And then there are no unknowns on the right-hand side, and that's a solution. Here it is with a small lower-order term left out; carrying the telescoping all the way down, multiplying back by N + 1, and collecting the sum gives

C_N ~ 2N (sum for 1 <= k <= N of 1/k) - 2N

Now, that looks more complicated than what we started with, because it's got a sum back in there. So what do we do about that? Well, simple sums we can often approximate with integrals. This may be familiar to many of you, but we'll certainly go into the math behind approximating sums with integrals later. The harmonic sum is close to the integral of 1/x dx from 1 to N, plus gamma, Euler's constant, which is about 0.57721:

sum for 1 <= k <= N of 1/k is approximately ln N + gamma

And that gives us a formula for C_N in terms of familiar mathematical functions. It's 2N natural log N minus a constant times N:

C_N ~ 2N ln N - 2(1 - gamma) N
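The telescoped sum can also be checked exactly, before any approximation: under the assumed convention (C_0 = 0, recurrence for all N >= 1), multiplying back by N + 1 gives the exact value C_N = 2(N + 1) H_N - 2N, where H_N is the harmonic sum; the simpler tilde form just drops the lower-order 2 H_N part. A quick exact check:

```python
from fractions import Fraction

def c_exact(n):
    """C_N from the recurrence N*C_N = (N+1)*C_{N-1} + 2N, with C_0 = 0."""
    c = Fraction(0)
    for k in range(1, n + 1):
        c = ((k + 1) * c + 2 * k) / k
    return c

def closed_form(n):
    """Claimed exact solution: C_N = 2(N+1) H_N - 2N, H_N = sum of 1/k."""
    h = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n + 1) * h - 2 * n
```

Because both sides are computed as exact rationals, any disagreement at all would flag an algebra mistake.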

15:44

Now, these are approximations, and we will be talking about how to be precise when making them. But other than that, this is fairly straightforward mathematical manipulation that gets us to a solution for the number of compares used by quicksort. So now, even when we're just doing math for the analysis of algorithms, we can always check our math with a computer, and that's always worthwhile. I'm claiming that this formula is pretty close to C_N.

Well, I can just write a program to compute it. That's 2*N*Math.log(N) in Java, the natural log of N, minus the constant times N, looking up the value of Euler's constant. And I can also, as I mentioned, compute the actual values of C_N, and then I can just print out a table. And sure enough, the check shows that we're within about 30 for a million. That's pretty close.
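The lecture's check is a Java one-liner against a table of exact values; the same experiment in Python might look like this. (The convention C_0 = 0 with the recurrence holding for all N >= 1 is an assumption here, under which the elementary-function approximation works out to 2N ln N - 2(1 - gamma)N.)

```python
import math

GAMMA = 0.57721566490153286   # Euler's constant

def c_exact(n_target):
    """Exact C_N via the linear-time recurrence C_N = (N+1)/N * C_{N-1} + 2."""
    c = 0.0
    for n in range(1, n_target + 1):
        c = (n + 1) * c / n + 2.0
    return c

def c_approx(n):
    """Elementary-function approximation: 2N ln N - 2(1 - gamma) N."""
    return 2.0 * n * math.log(n) - 2.0 * (1.0 - GAMMA) * n

for n in (1000, 10**6):
    exact = c_exact(n)
    print(n, round(exact), round(c_approx(n)), round(exact - c_approx(n), 1))
```

The gap between the exact and approximate values grows only logarithmically (roughly 2 ln N plus a constant), which is why the table comes out within about 30 at a million.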

And we could actually get closer if we wanted. That's the first thing that's wonderful about the analysis of algorithms, even for beginners: you can always check your work. It's very specific what this quantity is, and if you think you have something that's close to it, you can check, and then you can know, and then you can have confidence to move on. So that's the first thing. With that, we've checked the math. But there's one other thing that we have to check, and that's the model.

17:21

We reasoned that if the input is randomly ordered distinct keys, that should be the running time. So what we'll do is run the code. This is 1,000 trials for each value of N, with N going from 1 up to about 1,000, and there's a gray dot for each trial. You can hardly see the gray dots, because there's a red dot right in the middle, which is the average for each N. So the experiment gives us actual results: the numbers of compares. And then what we want to compare that against is the function we derived. We can just plot that function, and boom, it's right on. We run the algorithm, here maybe a million times in all, and that experiment shows us that the formula that we got for the number of compares is right on.
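An experiment along these lines is easy to reproduce. Here is a sketch: an instrumented quicksort that counts key compares, averaged over random trials. The trial counts and sizes below are arbitrary choices, and the compare count of this particular implementation differs from the lecture's model by lower-order terms, so only rough agreement with 2N ln N should be expected.

```python
import math
import random

count = 0   # number of key compares in the current experiment

def less(x, y):
    """Compare two keys, counting the compare."""
    global count
    count += 1
    return x < y

def partition(a, lo, hi):
    v = a[lo]
    i, j = lo, hi + 1
    while True:
        i += 1
        while i < hi and less(a[i], v):
            i += 1
        j -= 1
        while j > lo and less(v, a[j]):
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]
    return j

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if hi <= lo:
        return
    j = partition(a, lo, hi)
    quicksort(a, lo, j - 1)
    quicksort(a, j + 1, hi)

def average_compares(n, trials=200):
    """Average compares to sort n distinct keys in random order."""
    global count
    total = 0
    for _ in range(trials):
        a = list(range(n))
        random.shuffle(a)        # randomly ordered distinct keys
        count = 0
        quicksort(a)
        total += count
    return total / trials
```

Plotting average_compares(n) against 2n ln n for a range of n reproduces the shape of the experiment described above.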

18:14

So that's checking the model, and it's very successful in the case of quicksort. Just as an observation: maybe we're not interested only in the average cost; maybe we're interested in the distribution of costs. How close is the actual running time of my one sorting problem going to be to this average? You can see from the narrowness of the band of gray dots that it comes pretty close to the average. That's very typical in the analysis of algorithms: the standard deviation is a slower-growing function than the average, so the costs concentrate around the average as N increases. But there are plenty of interesting and challenging mathematical questions when we start looking at problems like this. What is the distribution of costs? Most people would say, well, maybe it's a normal distribution, maybe it's a bell curve. But it's not normal. After quicksort was discovered in 1960, people looked for a long time to try to understand whether it's normal, and it was settled only about ten years ago: it's actually not normal.

19:31

So here's the exact distribution, computed from the recurrence. Not just the average: we can compute, for every k, the probability that quicksort takes exactly k compares. What this plot shows is those curves, centered on the mean, to get an idea of what kind of limiting curve we come close to. And it's not a bell curve; it's got a tail off to the right.
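For very small N, that exact-distribution idea can be mimicked by brute force: run an instrumented quicksort on all N! permutations and tally how many take exactly k compares. A sketch; the counts here are those of this particular implementation rather than the lecture's exact model, but the skewed shape of the tally is the point.

```python
from collections import Counter
from itertools import permutations

def compares_to_sort(perm):
    """Count the key compares one quicksort run makes on this input."""
    a = list(perm)
    n_compares = 0

    def less(x, y):
        nonlocal n_compares
        n_compares += 1
        return x < y

    def sort(lo, hi):
        if hi <= lo:
            return
        v, i, j = a[lo], lo, hi + 1
        while True:
            i += 1
            while i < hi and less(a[i], v):
                i += 1
            j -= 1
            while j > lo and less(v, a[j]):
                j -= 1
            if i >= j:
                break
            a[i], a[j] = a[j], a[i]
        a[lo], a[j] = a[j], a[lo]
        sort(lo, j - 1)
        sort(j + 1, hi)

    sort(0, len(a) - 1)
    return n_compares

def compare_distribution(n):
    """Tally: how many of the n! permutations take exactly k compares."""
    return Counter(compares_to_sort(p) for p in permutations(range(n)))
```

Dividing each tally by n! gives the probability of exactly k compares; even at n = 5 or 6, the histogram is visibly asymmetric, with a longer tail on the right.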

20:00

And this is just an empirical validation: running, again, a million sorts, just to show that the actual behavior looks like that. And that's interesting mathematics. What is this curve? It's a new function. It's not a normal distribution, and it's definitely worthy of mathematical study: after all, this is probably one of the world's most heavily used algorithms, so we might want to understand this property of it, and maybe that will help us understand something else in the future. So the bottom line from all of this work, and this is just a summary of the 50 years of research on quicksort, is that really a lot is known about the performance of quicksort. And there are still plenty of interesting research problems in the literally thousands of papers that have been written about quicksort in the 50 years since it was discovered.

And that's not the end of the story; I have two more comments. The first one is about prediction, and this is something that we teach at the very beginning of the algorithms book: just using a hypothesis like this makes it simple to predict performance, even of extremely complicated algorithms in extremely complicated scenarios. So if T(N) ~ aN ln N is our hypothesis, what we're going to do is run the experiment for some big input size N, and maybe run it a bunch of times to get a good average of the running time. If we wanted to, that's enough information to solve for a: whatever our time is for whatever N is, a is the only unknown, and we can go ahead and compute it. But actually, we usually don't even need to do that: we can predict the time for some constant times N just by doing the math. Say we want to predict the time for 10N, knowing the running time for N. If we just do the math, we take a(10N) ln(10N) over aN ln N; the a's cancel, so we don't even have to know the constant, and the ratio works out to

10 (1 + 1 / log10 N)

The second term gets smaller as N gets bigger, so if you increase the problem size by a factor of about 10, you're going to increase the running time by a factor of about 10. And that's a very accurate prediction, and it's useful.

And we now teach this to first-year students; in fact, we teach this simple idea to nearly 70% of all Princeton students: if you run a program for a big input, and you have a decent model for its performance, you can predict the running time. So, for example, run it for 100,000, 100 times (and again, we can run as many experiments as we want nowadays), and it takes 4 seconds. So that means I'm going to predict that it's going to take just a little more than 40 seconds for 1 million.
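The arithmetic of that prediction is a one-liner. A sketch under the hypothesis T(N) ~ aN ln N; the constant a cancels in the ratio, so it never needs to be measured:

```python
import math

def predict(t_measured, n_measured, n_target):
    """Predict the running time at n_target from one timing at n_measured,
    assuming T(N) ~ a * N * ln(N); the unknown constant a cancels."""
    ratio = (n_target * math.log(n_target)) / (n_measured * math.log(n_measured))
    return t_measured * ratio
```

Going from N to 10N multiplies the predicted time by 10(1 + 1/log10 N); starting from N = 10^5, that factor is exactly 12.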

22:59

So that's the experiment: I run it for 1 million, and it takes a little bit longer. I observe a running time of 41 seconds. That gives me a lot of confidence to predict how long it would take for 1 billion, which is maybe the problem that I really need to solve: it's going to take about 11 hours. And really, any programmer can have this kind of understanding of the performance of programs. It's all about getting that first model. If all you're interested in is prediction of running times, it's very powerful.

23:33

Now, I'm not saying that all we need is the prediction; it's also important to have an accurate mathematical model. You might want to think about the reason for that; I'll show an example later on, and you'll have an exercise. The reason is that there might be parameters involved in the running time of the program, and we might have the freedom to set the values of those parameters. Even with just one parameter (you'll see an example of that in the exercise), we can't afford to run experiments for every possible value of the parameter; but if we have a mathematical model, then we can use it to find the optimum value of the parameter. And that's a very strong reason to keep working toward good mathematical models. Now, as I mentioned, the other thing that we really should do is validate our model in applications.

25:59

Back then it was very important to be able to shave every microsecond off that sorting running time. And on the machines of those days, and particularly the Cray-1, this wouldn't even work unless every instruction took precisely the amount of time that the manual said it did. We could work with machine-language versions of quicksort and predict the running time to within a microsecond, which is predicting the running time of a million runs to within a second, or a billion runs to within a few hours.

26:57

What happened was that a user came along with an application that violated one of our assumptions: he had files with only a couple of different distinct values, and his application performed poorly. So people went in and looked more carefully, and over a few years' time, Jon Bentley, who's one of the researchers at Bell Laboratories, and I, and some others came up with ideas for addressing this situation. Those things are described in the fourth edition of Algorithms, where we modify the algorithm, with a better understanding of it, to handle this situation. Now, to go back and do the analysis, refine the model, and check it scientifically required some serious research, and it was only recently that we came to a resolution of how those algorithms perform in practice. But that work gave us library sorts that are much more robust, whose efficiency does not depend on the idea that the keys must be distinct. And so much broader use of the libraries was enabled just through trying to get the analysis of algorithms right: trying to get a model that reflects what's seen in applications.

And that's ongoing. Just recently I was contacted by somebody working at a networking start-up. What they need to do is sort a billion records that are about a thousand characters each, and they have to do it fast, or their competition will beat them and they'll go out of business. And he found a refinement to our three-way partitioning methods for strings that was able to improve performance even further.

28:57

So again, over 40 or 50 years, the analysis of algorithms has really been helping us understand the performance of this important algorithm. Now, I'm not going to be talking that much about the systems side of things in this course, but I wanted to go through this example in detail because the math is very relevant to what we do in this course, and I want to make sure that people are motivated to look into the kind of math that we're going to be talking about. So that's a full example of the analysis of an algorithm: quicksort. And Knuth succinctly expressed the idea that I want to leave you with for this example.

29:45

People who analyze algorithms have double happiness. First, they experience the sheer beauty of elegant mathematical patterns that surround elegant computational procedures. Just in terms of the pure math, what programs bring to the table is another level of interesting mathematics. But then, if you're in the analysis of algorithms, you get a practical payoff, because your theories make it possible to get other jobs done more quickly and more economically. So it's not just the math; it's that the math is helping us press forward in all kinds of important and identifiable ways. So that's a full introduction to the analysis of algorithms through the study of quicksort.
