0:01

In fact, the order of growth classifications are so important that they've led to an enormous amount of research in recent years, and I'll talk briefly about that now. Life is a little bit more complicated than the last example suggested, and one problem is that the input can cause the performance of an algorithm to vary widely. So often we have to think about different ways of analyzing an algorithm, depending on the input. The running time is going to be somewhere between the best case and the worst case. The best case is a lower bound on the cost: it guarantees that the running time is never going to be less than that. Then there's the worst case, the most difficult input: if we analyze that, we can guarantee that the running time of the algorithm is not going to be bigger than that. And in a lot of situations, we might consider our input to be random. We need some way to model what we mean by random for the problem we're solving, but there are a lot of situations where we can do that, and then we have a way to predict performance even when the input varies widely. For example, the cost of 3-sum is essentially always the same.

With the tilde notation, the only variability in that algorithm is the number of times the counter is incremented, and that's in the low-order terms, so it doesn't show up in our analysis. For binary search, you might find the item right away, in which case it takes constant time, but we can show that the average case and the worst case are both about lg N (log base 2 of N). In other examples there can be even more variability.
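A minimal sketch of binary search (my own Python illustration, not code from the lecture): the best case is one compare, when the key happens to sit at the middle, and the worst case halves the interval each time, giving about lg N compares.

```python
def binary_search(a, key):
    """Return the index of key in sorted list a, or -1 if absent.

    Best case: key is at the middle, found with one compare (constant time).
    Worst case: the interval [lo, hi] halves on every iteration, so there
    are about lg N compares.
    """
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if key < a[mid]:
            hi = mid - 1
        elif key > a[mid]:
            lo = mid + 1
        else:
            return mid
    return -1
```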

So we have these different types of analysis depending on the input. But the question is: what about the actual problem that the client is trying to solve? We have to understand that too, in order to understand the performance of the algorithm. And there are two approaches that are

successful in this setting. One is to design for the worst case: make sure your algorithm always runs quickly. That's definitely the ideal. The other, if you can't manage that, is to randomize, and then depend on some kind of probabilistic guarantee. We'll see examples of both of these as we go through the course.
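As an illustration of the randomize-and-rely-on-a-probabilistic-guarantee idea (my own sketch, not an example from the lecture), here is selection with a random pivot: no fixed input is bad for it, only unlucky random choices, and those are unlikely.

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element of a (k is 0-based).

    Choosing the pivot at random gives expected linear time on ANY input:
    the guarantee is probabilistic, over the coin flips, rather than a
    worst-case guarantee over the inputs.
    """
    a = list(a)
    lo, hi = 0, len(a) - 1
    while True:
        pivot = a[random.randint(lo, hi)]
        # three-way partition of a[lo..hi]: < pivot, == pivot, > pivot
        lt, gt, i = lo, hi, lo
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1        # k-th smallest is in the < pivot part
        elif k > gt:
            lo = gt + 1        # k-th smallest is in the > pivot part
        else:
            return pivot       # k falls in the == pivot band
```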

Now, those kinds of considerations and the idea of order of growth lead to a discussion of what I call the theory of algorithms. Here, our goals are: we have a problem to solve, like the 3-sum problem, we want to know how difficult it is, and we want to find the best algorithm for solving it. The approach that computer scientists use for this is to suppress as many details as possible in the analysis: analyze the running time only to within a constant factor, which is what order of growth is getting at, and don't worry about the input model at all, focusing instead on worst-case design. Then we can talk about the performance of algorithms just in terms of the order of growth, and it's actually possible to do that in a very rigorous way that has taught us a lot about the difficulty of solving problems. Our goal is to find an optimal algorithm: one where we can guarantee, to within a constant factor, a certain performance for any input, because we've accounted for the worst case, and where we also have a proof that no algorithm could provide a better performance guarantee. I'll give a couple

of easy examples of this. Now, in order to do this, there are commonly used notations, called the big theta, big O, and big omega notations, whose definitions are given here. Big theta notation is just the way we describe the order of growth: Theta(N^2) is shorthand for anything that grows like N^2, bounded both above and below by a constant times N^2, and that's what we really use to classify algorithms. Then there's big O notation, which gives an upper bound on performance: when we say O(N^2), we mean less than some constant times N^2 as N grows. And big omega is used for lower bounds: Omega(N^2) means greater than some constant times N^2 as N grows. With those three notations, we're able to classify algorithms, as I'll show in the following examples.
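The definitions on the slide can be paraphrased roughly as follows (a standard formulation, not verbatim from the lecture):

```latex
f(N) = O(g(N))      \iff \exists\, c, N_0 > 0 \text{ such that } f(N) \le c\,g(N) \text{ for all } N \ge N_0
f(N) = \Omega(g(N)) \iff \exists\, c, N_0 > 0 \text{ such that } f(N) \ge c\,g(N) \text{ for all } N \ge N_0
f(N) = \Theta(g(N)) \iff f(N) = O(g(N)) \text{ and } f(N) = \Omega(g(N))
```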

The examples from our 1-sum, 2-sum, and 3-sum discussion are easy to articulate. Our goals are to establish the difficulty of the problem and to develop an optimal algorithm. The 1-sum problem is: is there a 0 in the array? An upper bound on the difficulty of the problem is given by some specific algorithm. For example, the brute-force algorithm that looks at every array entry takes O(N) time: it runs in less than a constant times N steps, for some constant. So the running time of the optimal algorithm has to be O(N); that is, the specific algorithm provides an upper bound on the running time of the optimal algorithm. But in this case it's also easy to develop a lower bound: a proof that no algorithm can do better. For 1-sum, you have to examine all entries in the array, because if you miss one, that one might be the 0. That means the optimal algorithm has to have a running time of at least some constant times N, and we say its running time is Omega(N). Now, in this case the upper bound and the lower bound match, to within a constant factor, so that's a proof that the brute-force algorithm for 1-sum is optimal: its running time is Theta(N), both Omega(N) and O(N).
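The brute-force 1-sum algorithm can be sketched like this (Python; the function name is my own):

```python
def count_zeros(a):
    """Brute-force 1-sum: count the entries equal to 0.

    Looks at every entry exactly once, so the running time is O(N).
    Any correct algorithm must also be Omega(N): an unexamined entry
    might be the 0 we are looking for. So this algorithm is Theta(N),
    hence optimal to within a constant factor.
    """
    count = 0
    for x in a:
        if x == 0:
            count += 1
    return count
```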

For that simple problem, it was easy to get an optimal algorithm. For more complicated problems, it's going to be more difficult to get upper bounds and lower bounds, and particularly upper bounds and lower bounds that match. For example, let's look at 3-sum. For an upper bound on 3-sum, our first brute-force algorithm was a proof that the running time of the optimal algorithm is O(N^3), but then we found an improved algorithm whose running time is O(N^2 lg N), so that's a better upper bound. For a lower bound, we have to examine all entries, because again we might miss one that makes a 3-sum equal to 0, and that's a proof that the running time of the optimal algorithm is Omega(N). But nobody knows a higher lower bound for 3-sum. So there's a gap between the upper bound and the lower bound, and an open problem: is there an optimal algorithm for 3-sum? We don't know what it is. We don't even know whether there's an algorithm whose running time is less than N^2, or a lower bound higher than linear. So that's an example of an open problem in the theory of algorithms: we don't know how difficult it is to solve the 3-sum problem. Now, this point of view has been extremely successful in recent decades.
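The improved O(N^2 lg N) upper bound for 3-sum mentioned above comes from sorting and then replacing the innermost loop with binary search. A sketch, assuming the array entries are distinct:

```python
from bisect import bisect_left

def three_sum_count(a):
    """Count triples i < j < k with a[i] + a[j] + a[k] == 0.

    Assumes distinct entries. Sorting costs O(N lg N); each of the
    ~N^2 pairs does one O(lg N) binary search, so the total running
    time is O(N^2 lg N) -- better than the O(N^3) brute force.
    """
    a = sorted(a)
    n = len(a)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            target = -(a[i] + a[j])
            k = bisect_left(a, target, j + 1)  # search only past j
            if k < n and a[k] == target:
                count += 1
    return count
```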

When we have a new problem, we develop some algorithm and prove some lower bound. If there's a gap, we look for a new algorithm that will lower the upper bound, or we try to find a way to raise the lower bound. It's usually very difficult to prove non-trivial lower bounds: a trivial lower bound, like "you have to look at every input item," is not so hard, but non-trivial lower bounds, like the one we talked about for the union-find problem, are much more difficult. In the last several decades, people have learned a lot about the computational difficulty of problems by examining steadily decreasing upper bounds, so we have algorithms with better worst-case running times for lots and lots of important problems. We have plenty of optimal algorithms, but plenty of gaps still remain. It's a fascinating field of research that many people are engaged in. Now, there are a couple of caveats about this in the context of this course. The first one is that maybe it's overly pessimistic

to be focusing on the worst case. We've got data out there, and we've got problems to solve, and maybe it's not worst-case data. In lots of fields of engineering and science we don't focus on the worst case: the worst case for this course would be for lightning to strike and the course to be over, and we don't plan for that. Something similar is true for algorithms: maybe we should be focusing on understanding properties of the input and finding algorithms that are efficient for that input. The other caveat is that in order to really predict performance and compare algorithms, we need a closer analysis than "to within a constant factor." We've talked about the tilde notation and the big theta, big O, and big omega notations that are used in the theory of algorithms. And there's so much published research in the theory of algorithms that a lot of people make the mistake of interpreting big-O results, which are supposed to give improved upper bounds on the difficulty of a problem, as approximate models for the running time, and that's really a mistake.

So in this course, we're going to focus on approximate models, making sure that we use the tilde notation, and we'll try to give specific results for certain quantities of interest. Any unspecified constant in the running time will have to do with properties of the machine and the system, so we'll be able to use these results to predict performance and to compare algorithms.
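For instance, one way a tilde approximation predicts performance (my own sketch, with a hypothetical cubic exponent): calibrate the machine-dependent constant from a single timing measurement, then extrapolate to other input sizes.

```python
def predict_time(t_measured, n_measured, n_target, exponent=3):
    """Given a running time ~ a * N^exponent, where the constant a depends
    on the machine and system, estimate a from one measurement and predict
    the running time for another input size."""
    a = t_measured / n_measured ** exponent
    return a * n_target ** exponent
```

With a cubic algorithm, this predicts that doubling the input size multiplies the running time by about 8.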
