0:18

In the theory of algorithms, when they're looking at upper bounds on worst-case performance, they use these notations big O, big omega, and big theta to try to capture the order of growth, and that's what they use for classifying algorithms. If g(N) = O(f(N)), it means that the ratio of g(N) to f(N) is bounded from above as N goes to infinity. If it's omega, it means it's bounded from below. And if it's theta, it means it's bounded from both above and below. So big O says there's a constant such that g(N) is less than that constant times f(N), and big theta says there are two constants that it's in between. So that allows classification according to functional growth; as I mentioned, merge sort is N log N, and quick sort is N squared in the worst case.
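As an illustrative sketch (not from the lecture), the bounded-ratio definition can be checked numerically for a hypothetical pair of functions: g(N) = 3 N log N + 5 N is O(N log N) because g(N)/f(N) with f(N) = N log N stays bounded.

```python
import math

# Hypothetical example: g(N) = 3*N*log2(N) + 5*N is O(N log N),
# because the ratio g(N)/f(N), with f(N) = N*log2(N), stays bounded
# (it decreases toward the constant 3 as N grows).
def g(N):
    return 3 * N * math.log2(N) + 5 * N

def f(N):
    return N * math.log2(N)

ratios = [g(10 ** k) / f(10 ** k) for k in range(1, 7)]
# The ratio never exceeds 5 for N >= 10, so c = 5 works as the
# upper-bound constant in g(N) <= c * f(N).
print(ratios)
```

The same ratios also show the theta relationship here: the ratio is trapped between two constants (3 and 5) for all N in range.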

1:14

So, that's the notation that you often see throughout the literature describing the performance of algorithms. But as I mentioned, big O notation is dangerous, and I'll have more to say about that in just a minute. It's not scientific to use big O notation to compare algorithms. If you say the running time is O(N to the c), that's not of any use for predicting performance. It's an upper bound on the worst case. The actual performance may be much better than the worst case, and it could even be that the actual bound is less than what's given by the big O notation. It's fine for a first cut at classifying algorithms, but not useful for comparing them. What we use instead is what's called the tilde notation, and so what we'll typically say is that the running time of an algorithm is ~ a constant times, say, some function of N, where N is the input size. That does provide an effective path for predicting performance, and I'll show some examples of that later on.

So we don't use the common big O, big theta, big omega notation very much, except in a specific technical sense that I'll talk about later on, when we talk about asymptotic approximations. So [COUGH] big O notation is useful for a lot of reasons, it dates back more than a century, and we do use it in math. But it's a common error to think that big O notation is useful for predicting performance, and I just want to make sure to nip that problem in the bud right away.

So this is what often happens to me when I give talks around the world on this topic. A typical exchange in, say, the Q&A for my talk, depending on how formal it is: I'll say, okay, big O notation is dangerous, you can't use it to predict performance. And somebody will ask or shout out, but an algorithm that is O(N log N) surely beats one that is O(N squared). And then I trot out the quick sort versus merge sort example and say, well, not by the definition; big O is just an upper bound on the worst case. And then they'll say, well, so use the theta notation, which says that it's in between. And I say, well, that maybe gets rid of the upper-bound issue, but you're still typically bounding the worst case. Is your input a worst case? And even with that logic, and with a compelling example like quick sort versus merge sort, usually what happens is the questioner whispers to one of his colleagues, well, I'd still use the N log N algorithm, wouldn't you? [LAUGH] And actually, such people usually don't program much and shouldn't be recommending what practitioners do. But surely, we can do better than this. That's part of what analytic combinatorics is all about.

There's another idea that's out there as well, and that's the concept of a galactic algorithm. I found this on a friend's blog, Dick Lipton, who said let's define an algorithm that will never be used as being galactic. And why will it never be used? Because no one would ever notice any effect of this algorithm in this galaxy.

4:46

Because any savings is going to happen for an input size so large that it couldn't happen in this galaxy. So, an example of a galactic algorithm, and this one is actually maybe only planetary or something, it's actually close to the real world, is Chazelle's linear-time triangulation algorithm. The problem is to find a way to triangulate a polygon in linear time. It was a surprising result and a theoretical tour de force to prove that it was possible to solve this problem in linear time. But the method is much too complicated for anyone to implement. And if anyone did implement it, the cost of the implementation would definitely exceed any savings, until N is so large that it would take another galaxy to deal with it.

5:43

And [COUGH] this is an interesting situation. I think one of the problems is that so many of the algorithms being published in the literature are in this category. After Lipton introduced the concept, one of the contributors to his blog estimated that something like 75 to 95% of the papers in the theoretical computer science conferences are galactic. And I think the problem is that practitioners aren't necessarily aware that they're galactic, and they may try to use them, when a relatively simple analysis would say there's no point in a practitioner taking a look at this; the papers should have asterisks on them or something. So I think it's okay for basic research to drive the agenda, and there's nothing wrong with trying to find the algorithm with the best upper bound on worst-case performance. But we have to do something about the common error where people think that a galactic algorithm is actually useful in practice; there's a lot of denial out there.

Here's another thing that often happens to me; this was an actual exchange with a very prominent theoretical computer scientist a few years ago. He said in a talk, well, algorithm A, which actually is a pretty straightforward algorithm that's in widespread use, is a bad algorithm; Google and other Internet providers should be interested in my new algorithm, algorithm B. [COUGH] And so in the question and answer, I said, well, what's the matter with algorithm A?

Â 7:41

By the way, we all know that log log N is less than 6 in this universe: the largest conceivable N is about 2 to the 64th, and if N is 2 to the 64th, then log N is 64 and log log N is 6.
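That arithmetic is easy to check (a quick sketch, not part of the lecture):

```python
import math

N = 2 ** 64                   # about the largest input size conceivable "in this universe"
log_N = math.log2(N)          # log2(2^64) = 64
log_log_N = math.log2(log_N)  # log2(64) = 6
print(log_N, log_log_N)
```

So any factor of log log N is at most 6 for any input that could ever arise, which is why it is effectively a constant.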

And not only that, it's just an upper bound. Not only that, your algorithm is so complicated that it's certain to run 10 to 100 times slower in any conceivable real-world situation. Why should Google care about algorithm B, as you said? And then the response was, well, I like algorithm B, I don't care about Google. And again, that's fine, doing research for the intellectual challenge; just don't say that I, as some practitioner, should be interested in it. Surely, we can do better than that.

8:25

So what I want to talk about specifically is the scientific approach, say, the modern rendition of what Knuth taught us, that is used for many, many algorithms in the fourth edition of my Algorithms book, which is co-authored with Kevin Wayne. This is described in a lot of detail, with examples, in Section 1.4 of the book. Again, as Knuth said, we start with a complete implementation that we can test, and therefore we'll be able to make hypotheses about the performance of that implementation and test them. And then what we're going to do is maybe try to avoid some of the detail in Knuth: we're going to analyze the algorithm by trying to find an abstract operation that is in the inner loop, that is, one that gets executed more times than any other operation.
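As a sketch of what finding the abstract inner-loop operation means in practice (my own illustration, not code from the lecture), here is an insertion sort instrumented to count compares, the operation executed most often:

```python
def insertion_sort(a):
    """Sort a in place; return the number of compares (the inner-loop operation)."""
    compares = 0
    for i in range(1, len(a)):
        v, j = a[i], i
        while j > 0:
            compares += 1          # one compare per inner-loop iteration
            if a[j - 1] <= v:
                break              # v is in position; stop scanning left
            a[j] = a[j - 1]        # shift larger item right
            j -= 1
        a[j] = v
    return compares

a = [5, 2, 4, 6, 1, 3]
print(insertion_sort(a), a)
```

Counting this one operation stands in for the total running time, since everything else in the loop happens at most as often.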

10:07

So, what we'll do then is, once we have the hypothesis, we're going to validate it. First of all, we want to generate some large inputs according to the model. For example, we'll look at sorting algorithms where the model is that the items are randomly ordered and distinct, and it's easy to write a program to generate large, randomly ordered, distinct files. Then we can just run the program for a large input to calculate a. (Actually, we can even skip that step; I'll show you that in a minute.) But we definitely can get an estimate of a that way. And then that gives us a model that we can use to predict performance for even larger inputs and check our hypothesis.
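Here is a sketch of that validation loop (my own illustration, not the lecture's code): generate a large random distinct input, count the inner-loop operation, estimate the constant a under the hypothesis that the count is ~ a N lg N, and then use a to predict the count for an input twice as large.

```python
import math
import random

def merge_sort_compares(a):
    """Return the number of compares used by top-down mergesort on a copy of a."""
    count = 0
    def sort(xs):
        nonlocal count
        if len(xs) <= 1:
            return xs
        mid = len(xs) // 2
        left, right = sort(xs[:mid]), sort(xs[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            count += 1                       # the inner-loop operation
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged
    sort(list(a))
    return count

random.seed(42)
N = 2 ** 13
# Randomly ordered, distinct input, per the model; estimate a in ~ a N lg N.
a_est = merge_sort_compares(random.sample(range(10 * N), N)) / (N * math.log2(N))

# Predict the compare count for an input twice as large, then check.
N2 = 2 * N
predicted = a_est * N2 * math.log2(N2)
actual = merge_sort_compares(random.sample(range(10 * N2), N2))
print(predicted, actual)
```

If the hypothesis is right, the prediction lands within a few percent of the measured count, and the same a extrapolates to much larger inputs.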

11:12

That's actually the hardest part, and it's often overlooked nowadays. And then, as in any application of the scientific method, we'll refine, repeat, and revise the model and the algorithm as necessary, as we learn more about the problem. As I mentioned earlier, one of the great things that happens with this process is that often we learn things about the algorithm that enable us to develop improved versions. We have many, many examples of this for sorting and searching algorithms, graph algorithms, string processing, data compression, and many, many other applications, where we successfully apply this approach to develop good implementations and understand their performance. Now, what this course is going to be all about is the analysis of the frequency of execution; that's where the math is. But I want to give this full context, so people can understand why we're going through this trouble.

12:44

So that's the notation: we're just using tilde, and that's good. That simplifies things a bit, and then we're just left with these basic components. One of the basic components of algorithm analysis involves programming: people who do analysis of algorithms need to be comfortable with implementing algorithms, running them, and being able to at least count operations. So you need a good implementation. Now, maybe somebody else did the implementation, but still, you want to have experience with it on your own computer to test various hypotheses that you might have. Then we've got the mathematical challenge, and that's, again, going to be most of our emphasis in this course, where we develop a model and analyze the algorithm in the model. And what we're going to talk about in this course is that, really, we need to do the math; that's the kind of math that I'm going to be talking about very soon, in just a minute. And then there's the scientific process of running the algorithm to solve a real problem and checking for agreement with the model. And so you can't really compare two algorithms until you've done all the rest of this.

14:11

Now there's definitely potential drawbacks still

Â in [COUGH] the theory of algorithms,

Â it still goes places where we can't go.

Â Number one is that model might not have a realistic model, and

Â that's actually a challenge in every scientific discipline.

Â Now we do have a big advantage in computer science because we

Â can randomize to make the model actually totally valid, and

Â I'll show an example in just a minute.

Â So if we don't have a realistic model,

Â maybe we can get a realistic model by randomizing.

Â And that's a very powerful concept that if you don't get if you're doing science that

Â involves going to the moon or killing monkeys or something.
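For instance (a sketch of the idea, not the lecture's code): quicksort's analysis assumes a randomly ordered input, and shuffling the input first makes that assumption valid no matter what the input looks like.

```python
import random

def quicksort(a):
    """Sort a in place; the initial shuffle makes the random-input model valid."""
    random.shuffle(a)   # randomize, so the randomly-ordered model holds for ANY input
    _sort(a, 0, len(a) - 1)

def _sort(a, lo, hi):
    if lo >= hi:
        return
    pivot, i = a[lo], lo
    for j in range(lo + 1, hi + 1):   # partition around the pivot
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]         # put the pivot in its final position
    _sort(a, lo, i - 1)
    _sort(a, i + 1, hi)

a = [3, 1, 4, 1, 5, 9, 2, 6]
quicksort(a)
print(a)
```

After the shuffle, the random model is valid by construction, so the ~ 2 N ln N expected-compare analysis applies to every input, not just "typical" ones.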

15:01

The second drawback is that the math might be too difficult, and that's also a big challenge in any scientific discipline; in statistical physics and many other areas, new developments in mathematics were needed in order to do the science. And that's certainly going to be the case here; really, that's what analytic combinatorics is: a calculus for the analysis of algorithms, which is the true motivation for this course. Thirdly, it might be that the experiments are too difficult to really validate the model.

15:31

And again, that's not much of a problem compared to other scientific disciplines. It's no problem for us to run an algorithm a million times, and we do it a lot; whereas if you had to feed a million mice, or go to a million planets, or whatever else, you might have some restrictions. But it is a roadblock for a lot of people trying to study algorithms, because actually, they're trying to study galactic algorithms that might be way too difficult to implement. And if you can't implement it, why try to analyze it? It's fine for the intellectual challenge, but again, there are people out there thinking that the algorithms you're developing are useful in practice. Really, they should be validated scientifically before the poor working programmer is faced with them. So that's an outline of our approach to the analysis of algorithms.