So the plan for this video is to develop a greedy algorithm that always minimizes
the waited sum of completion times of a given set of in jobs.
But more than the specific problem, more than the specific algorithm I want you to
focus on the process by which we arrive. At this greedy algorithm.
because I think this process is really something which you can use yourself, in
your own applications. The process we're going to use is we're
going to first look at just a special case in the problem, where it's
reasonably intuitive what should be the optimal thing to do.
Looking at these special cases will then motivate a couple of natural greedy
algorithms. Then, we'll figure out how to narrow a
couple of greedy algorithms down to just a single candidate, and that in fact, as
we will prove in the following lectures, will always be correct.
So let's just briefly recall what is it we're trying to do.
The computational problem and instant to specify by end jobs which come along with
waits and links, and among all end factorial ways that we can sequence the
jobs, we want to somehow home in on the one that minimizes the sum of waited
completion times. Recall from the previous video that the
completion time of a job is just the amount of time that elapses before it's
done. So that's going to be the length of all
the previous jobs plus the length of job J itself.
What we're hoping is going to work out is that we can devise a greedy algorithm
that always solves this problem. So maybe I should take a step back and
ask, you know why do greedy algorithms seem like a sensible way to approach this
scheduling problem? Well, you know.
In general, greedy algorithms are not guaranteed to work.
You may have to do something more complicated.
But scheduling still seems like a good place to try them out.
Remember what a greedy algorithm does, it iteratively makes myopic decisions.
An then you hope you have a, a reasonably good result at the end.
Now, what are we doing? We're studying a sequencing problem.
The definition of the problem is to schedule a job then another job then
another job all the way up to the last job and so this iterative nature of the
solution suggests that at least if you're lucky if the problem is simple enough
maybe there's a greedy algorithm which simply schedules the jobs in the correct
order one at a time. So, we're going to see if that works for
minimizing the sum of waited completion times.
So let's start by thinking positive, being optimistic.
So let's pause it that the greedy algorithm does exist for this problem.
Given that we're in the greedy algortihm section of the course you're, probably
you'll going to find this hard to believe.
But suppose one existed, how would we discover what it is?
Well a useful technique not just for this problem but, you know more generally in
real life, first focus on some special cases of the problem, where it's
relatively clear how you should proceed. And the two special cases I want you to
think about for this problem are first of all, suppose I told you all of the jobs
had exactly the same length but they had different weights, then, what order do
you think it makes sense to schedule the jobs in?
Secondly, suppose that I told you, that all of the jobs had exactly the same
weight, but they had different lengths. Then, what order do you think you should
schedule the jobs in? So first if all the jobs have the same
length, you should prefer jobs with larger weights.
Certainely, eh, this intuitively jives with our semantics of weights, that says
more importance, which suggest that higher weight jobs should go first, if
you look at the actual formula of minimizing the sum of weight and
completion times, if the jobs all have the same length.
Then the completion times are going to be the same your going to see the same set
of them. If all the jobs have length one, then the
completion times of the jobs are going to be one, two, three, four all the way up
to N. No matter what sequence you use.
So to make this as small as possible, you want the highest waits to be associated
with the smallest completion times that is, you want them upfront as early as
possible. The second special case where jobs have
equal waits but varying lengths, I think is a little more subtle.
Here what you want to do is you always want to favor small jobs, jobs with the
smallest lengths, everything else being equal.
The reason for that is that scheduling a job at a given position forces all the
other jobs to wait for that job to complete.
So all the, so whatever job you schedule first has a negative impact on all of the
rest of the N minus one jobs. So I'll.
Things being equal, you want the smallest job there that minimizes the consequences
for the jobs that are to follow. If you find this a little unintuitive, I
suggest just looking at a very simple example.
Two jobs. Both have weight one, one has length one,
one has length two. If you schedule the small job first
you'll have completion times of one and three for a total of four but if you
schedule the bigger job first you get completion times of two and three for the
bigger sum of completion times of five. So the next step is to move beyond
special cases, which we understand well. To the general case, which perhaps we
don't understand. So suppose all of the weights are
different, and all of the lengths are different.
Well, if we have two jobs, and both of these rules of thumb give us the same
advice, we're good. If there's one job which is both higher
weight, and smaller than another job, then clearly that job should go first.
But what if our two rules of thumb to prefer high weight jobs and to prefer
small jobs, give us conflicting advice. What if we have a pair of jobs, where one
of them is on the one hand higher weight, higher priority but on the other hand,
bigger than the other one. Which one should go first?
Well let's again stay positive, and let's try to think about the simplest kind of
algorithm that could conceivably work. It won't be a guarantee that it works,
but it might work. So we have these two different
parameters, length and weights. Maybe we can aggregate these two
parameters into a single one, into a single sort of score for each of the jobs
so that if we schedule the jobs from high score to low score, we'll always be
optimal. That would be great.
If we could compile these two numbers into one for each job, and then just sort
and be done. There is of course the question of
exactly how do we choose this aggregation function.
How do we compile length and weight into a single number.
Well as guidelines we should recall our special case and make sure we respect our
two rules of thumb. So all else being equal we should prefer
jobs with higher weight. So that says higher weight should meet
the higher scores if we're going to schedule the job from high score to low
score. And then also if a length is bigger that
should decrease the score. We should prefer jobs that have a small
length. So this idea leaves open the question of
exactly how do we aggregate the length and the weight of a job into a single
number. So what I want you to do now is I want
you to think for a minute about what kind of simplest possible functions you could
use. So again, these are mathematical
functions. They take as input two numbers, a length
and a weight of a job, and they output a single number, a score.
And the function should have the properties that it's increasing in the
job's weight, and it's decreasing in the job's length.
So there's more than one answer to this question but just sort of dream some sort
of ideas of what this function might look like.