The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).


From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming


From the lesson

Week 1

Two motivating applications; selected review; introduction to greedy algorithms; a scheduling application; Prim's MST algorithm.

- Tim Roughgarden, Professor

Computer Science

So the plan for this video is to develop a greedy algorithm that always minimizes the weighted sum of completion times of a given set of n jobs. But more than the specific problem, more than the specific algorithm, I want you to focus on the process by which we arrive at this greedy algorithm.

Because I think this process is really something you can use yourself, in your own applications. The process we're going to use is this: we're going to first look at special cases of the problem where it's reasonably intuitive what the optimal thing to do should be. Looking at these special cases will then motivate a couple of natural greedy algorithms. Then we'll figure out how to narrow those candidate greedy algorithms down to a single one which, in fact, as we will prove in the following lectures, is always correct.

So let's just briefly recall what it is we're trying to do. An instance of the computational problem is specified by n jobs, each of which comes with a weight and a length, and among all n factorial ways that we can sequence the jobs, we want to somehow home in on the one that minimizes the sum of weighted completion times. Recall from the previous video that the completion time of a job is just the amount of time that elapses before it's done. So that's going to be the length of all the previous jobs plus the length of job j itself.
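To make the objective concrete, here is a short sketch (in Python, not part of the lecture; jobs are hypothetical (weight, length) pairs) that computes the sum of weighted completion times for a given ordering:

```python
def weighted_completion_time(jobs):
    """Sum of weighted completion times for jobs scheduled in list order.

    Each job is a (weight, length) pair; the completion time of the
    j-th job is the sum of the lengths of jobs 1 through j.
    """
    total, elapsed = 0, 0
    for weight, length in jobs:
        elapsed += length          # completion time of this job
        total += weight * elapsed  # its weighted contribution
    return total

# Two unit-weight jobs of lengths 1 and 2:
print(weighted_completion_time([(1, 1), (1, 2)]))  # small job first: 1 + 3 = 4
print(weighted_completion_time([(1, 2), (1, 1)]))  # big job first:   2 + 3 = 5
```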

What we're hoping will work out is that we can devise a greedy algorithm that always solves this problem. So maybe I should take a step back and ask: why do greedy algorithms seem like a sensible way to approach this scheduling problem? Well, in general, greedy algorithms are not guaranteed to work; you may have to do something more complicated. But scheduling still seems like a good place to try them out. Remember what a greedy algorithm does: it iteratively makes myopic decisions, and then you hope you have a reasonably good result at the end.

Now, what are we doing? We're studying a sequencing problem. A solution to the problem schedules a job, then another job, then another job, all the way up to the last job, and this iterative nature of the solution suggests that, at least if you're lucky and the problem is simple enough, maybe there's a greedy algorithm that simply schedules the jobs in the correct order, one at a time. So we're going to see if that works for minimizing the sum of weighted completion times.

So let's start by thinking positive, being optimistic. Let's posit that a greedy algorithm does exist for this problem. Given that we're in the greedy algorithm section of the course, you're probably not going to find this hard to believe. But supposing one exists, how would we discover what it is?

Well, a useful technique, not just for this problem but more generally in real life, is to first focus on special cases of the problem where it's relatively clear how you should proceed. The two special cases I want you to think about for this problem are these. First, suppose I told you all of the jobs had exactly the same length but different weights; then what order do you think it makes sense to schedule the jobs in? Second, suppose I told you that all of the jobs had exactly the same weight but different lengths; then what order do you think you should schedule the jobs in?

So first: if all the jobs have the same length, you should prefer jobs with larger weights. Certainly this jibes with our semantics of weights, where higher weight means more importance, suggesting that higher-weight jobs should go first. And if you look at the actual formula, the sum of weighted completion times, you'll see that if the jobs all have the same length, then the set of completion times is the same in every schedule. If all the jobs have length one, the completion times of the jobs are going to be 1, 2, 3, and so on, all the way up to n, no matter what sequence you use.
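As a quick sanity check (a sketch, not from the lecture): with unit-length jobs, every ordering yields the same completion times 1 through n, so only the assignment of weights to positions can matter.

```python
from itertools import permutations

def completion_times(lengths):
    """Completion times of jobs scheduled in the given order of lengths."""
    elapsed, times = 0, []
    for length in lengths:
        elapsed += length
        times.append(elapsed)
    return times

# With unit lengths, every ordering yields completion times 1, 2, ..., n.
n = 4
for order in permutations([1] * n):
    assert completion_times(order) == list(range(1, n + 1))
```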

So, to make this sum as small as possible, you want the highest weights to be associated with the smallest completion times; that is, you want the high-weight jobs up front, as early as possible.

The second special case, where jobs have equal weights but varying lengths, I think is a little more subtle. Here what you want to do is always favor the smaller jobs, the jobs with the smallest lengths, everything else being equal. The reason is that scheduling a job at a given position forces all the other jobs to wait for that job to complete. So whatever job you schedule first has a negative impact on all of the remaining n - 1 jobs. All else being equal, you want the smallest job there; that minimizes the consequences for the jobs that follow. If you find this a little unintuitive, I suggest looking at a very simple example: two jobs, both with weight one, one with length one and one with length two. If you schedule the small job first, you get completion times of one and three, for a total of four; but if you schedule the bigger job first, you get completion times of two and three, for the bigger sum of five.

So the next step is to move beyond the special cases, which we understand well, to the general case, which perhaps we don't. So suppose all of the weights are different, and all of the lengths are different.

Well, if we have two jobs and both of these rules of thumb give us the same advice, we're good: if one job has both higher weight and smaller length than another job, then clearly that job should go first. But what if our two rules of thumb, prefer high-weight jobs and prefer small jobs, give us conflicting advice? What if we have a pair of jobs where one of them is, on the one hand, higher weight, higher priority, but on the other hand, bigger than the other one? Which one should go first?

Well, let's again stay positive and try to think about the simplest kind of algorithm that could conceivably work. There's no guarantee it works, but it might. We have these two different parameters, lengths and weights. Maybe we can aggregate these two parameters into a single one, a single score for each job, so that if we schedule the jobs from high score to low score, we'll always be optimal. That would be great: if we could compile these two numbers into one for each job, we could just sort and be done. There is, of course, the question of exactly how to choose this aggregation function, how to compile length and weight into a single number.

Well, as guidelines, we should recall our special cases and make sure we respect our two rules of thumb. All else being equal, we should prefer jobs with higher weight, so a higher weight should mean a higher score if we're going to schedule the jobs from high score to low score. And a bigger length should decrease the score, since we should prefer jobs with smaller length. This idea leaves open the question of exactly how to aggregate the length and the weight of a job into a single number. So what I want you to do now is think for a minute about the simplest possible functions you could use. Again, these are mathematical functions: they take as input two numbers, the length and the weight of a job, and they output a single number, a score. The function should have the property that it's increasing in the job's weight and decreasing in the job's length. There's more than one answer to this question, but just dream up some ideas of what this function might look like.

All right, there are certainly any number of functions with these properties, but for concreteness I'm just going to write down two of what I think are the simplest functions that have them.

One is going to be based on taking the difference of the two numbers, and one on taking their ratio. If you're going to use a function based on the difference, and you want it to be increasing in the weight and decreasing in the length, then the obvious difference to use is weight minus length. This can sometimes be negative, but that doesn't bother us; the algorithm is still well defined. And if you're going to use a ratio, increasing in weight and decreasing in length, then the sensible ratio to use is the weight of a job divided by its length. It is of course possible to have ties under either one of these scoring functions, so let's just allow ties to be broken arbitrarily.
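The two candidate rules can be written down directly; here is a minimal sketch (Python, with hypothetical helper names not from the lecture):

```python
def score_by_difference(weight, length):
    # Candidate 1: increasing in weight, decreasing in length.
    # May be negative; the algorithm is still well defined.
    return weight - length

def score_by_ratio(weight, length):
    # Candidate 2: also increasing in weight, decreasing in length.
    return weight / length

def greedy_schedule(jobs, score):
    """Schedule (weight, length) jobs from highest score to lowest;
    ties are broken arbitrarily (here, by Python's stable sort)."""
    return sorted(jobs, key=lambda job: score(*job), reverse=True)
```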

Now, what we're seeing here is a concrete instantiation of something I promised you in our high-level discussion of greedy algorithms: namely, it's both a strength and a weakness that they're really easy to come up with and propose. Here we have just one simple problem, and we already have two different, competing greedy algorithms for it. Because these two algorithms don't always do the same thing, at most one of them can be always correct; at least one of them has to be wrong sometimes.

So, as the algorithm designer, the next step in the process is to see if we can rule out at least one of these two proposed greedy algorithms by exhibiting an example where it doesn't do the right thing. I want to emphasize that this is the type of scenario that's very likely to arise in your own algorithm design adventures. You have some problem, you're not sure how to solve it yet, and you've brainstormed up a couple of proposed algorithms. A good thing to do, a good time saver, is to quickly rule out some of those algorithms as poor approaches to the problem. So, in this context, we have these two greedy algorithms; let's quickly break one of them, that is, show that it's not always correct.

How do we do that? Well, a smart way to go is to come up with an input where the two algorithms do different things. If they do different things, at most one of them is going to be correct; at least one of them is going to be incorrect. So that's the plan. Now, to execute this goal, as usual we want to keep things as simple as possible, but no simpler. So, what's the simplest possible instance that could lead to different behavior by the two algorithms? Obviously one job is not enough, because there's only one feasible solution. But already with two jobs, one algorithm might order them one way and the other algorithm schedule them in the opposite order. In fact, it's not difficult to come up with a two-job instance where they do different things; let me just go ahead and write such an instance down for you now.

So suppose I give you two jobs, where the first one is both longer and more important than the other: specifically, its length is five and its weight is three. The second job's length is merely two, and its weight is merely one. What I want you to do is take our two proposed greedy algorithms, the first of which orders by difference and the second by ratio, execute them on this two-job input, and compute the sums of weighted completion times of the corresponding two schedules.

All right, so the correct answer is answer B. Let's just briefly go through why. First, let's make sure we understand which algorithm produces which schedule. The first job has the better ratio: its ratio is 3/5, whereas the ratio of the second job is 1/2, which is smaller. The second job, on the other hand, has the larger difference: its difference is -1, whereas the first job has the more negative difference of -2. So the first algorithm, which orders by difference, will schedule the second job first and then the first job; the second algorithm will schedule the first job first and then the second one. It just remains to compute the objective function values of those two schedules. In the first schedule, with the second job first, the second job, which has weight one, has a completion time of two, and the first job, with weight three, has a completion time of seven. That gives us a total of 23. In the schedule produced by the second algorithm, the weight-three job goes first; its completion time, now that it's first, is only five, and the second job, with weight one, gets a completion time of seven, for a total of 22.
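This arithmetic is easy to check with a short script (a sketch, not from the lecture; jobs are (weight, length) pairs):

```python
def total_weighted_completion(jobs):
    """Sum of weighted completion times for jobs in the given order."""
    total, elapsed = 0, 0
    for weight, length in jobs:
        elapsed += length
        total += weight * elapsed
    return total

job1 = (3, 5)  # weight 3, length 5
job2 = (1, 2)  # weight 1, length 2

# Ordering by difference (w - l): job2's -1 beats job1's -2, so job2 goes first.
print(total_weighted_completion([job2, job1]))  # 1*2 + 3*7 = 23

# Ordering by ratio (w / l): job1's 3/5 beats job2's 1/2, so job1 goes first.
print(total_weighted_completion([job1, job2]))  # 3*5 + 1*7 = 22
```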

So ordering by difference gives us a value of 23, while ordering by ratio gives a value of 22. In this case the ratio does better than the difference, so the difference rule is certainly not optimal on this specific example. So what have we accomplished?

Well, we've very quickly ruled out one of our natural proposed greedy algorithms: we now know that ordering by difference is not always correct. Again, it's going to be correct in special cases, like when all the lengths are equal or all the weights are equal, but it is not correct in general. That said, please remember the warning I gave you in the high-level discussion of greedy algorithms, which is that greedy algorithms are very often wrong. Just because we know algorithm number one is sometimes incorrect does not at all imply that algorithm number two is guaranteed to be correct. It's really easy to come up with multiple incorrect greedy algorithms for the same problem.

It does, however, turn out that in this case, algorithm number two, driven by the ratio, is happily always correct. But you certainly shouldn't believe this claim until I provide you with a proof, a rigorous argument establishing its correctness. Always maintain healthy skepticism about the performance of a greedy algorithm until you learn otherwise. So this fulfills another promise I gave you in the high-level discussion of greedy algorithms: namely, that when they're correct, it's often quite difficult to prove it. That will be the topic of the next couple of videos, the correctness proof for this greedy heuristic.

The third and final thing we typically discussed about greedy algorithms is that their running time is not difficult to analyze. That's a break we catch relative to divide-and-conquer algorithms, and again, that's certainly true here. So, what does this algorithm do? All it does is compute these ratios and then sort the jobs by ratio, so essentially the algorithm reduces to a single sorting computation, and of course, from Part 1, we know very well how to sort in O(n log n) time.
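Putting it all together, a minimal sketch of the full algorithm (not the lecture's own code): sort the jobs by decreasing weight/length ratio, which is O(n log n) overall, dominated by the sort.

```python
def schedule_by_ratio(jobs):
    """Greedy schedule: order (weight, length) jobs by decreasing
    weight/length ratio, breaking ties arbitrarily. O(n log n) time."""
    return sorted(jobs, key=lambda wl: wl[0] / wl[1], reverse=True)

def objective(jobs):
    """Sum of weighted completion times of the given schedule."""
    total, elapsed = 0, 0
    for weight, length in jobs:
        elapsed += length
        total += weight * elapsed
    return total

jobs = [(3, 5), (1, 2)]                    # the two-job instance from the lecture
print(objective(schedule_by_ratio(jobs)))  # 22
```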

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.