The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

707 ratings

From the lesson

Week 1

Two motivating applications; selected review; introduction to greedy algorithms; a scheduling application; Prim's MST algorithm.

- Tim RoughgardenProfessor

Computer Science

Okay, so it's time to discuss our first minimum spanning tree algorithm namely

Prim's algorithm. Definitely a candidate for the greatest

hits compilation. And again remember even though it's

called Prim's algorithm, it was actually discovered earlier by Jarnik.

So how's it work? Well before showing you any pseudo code,

let's first illustrate it on an example. As we go through the example, I hope that

the similarities to Dijkstra's shortest path algorithm will be evident.

I'm going to work with the same example graph from the previous video with four

vertices and five edges. The plan is to grow a tree one edge at a

time. And we're going to keep growing this tree

like a mold. We're going to start from just a seed

vertex. And then we're going to suck up one new

vertex with each iteration of the algorithm.

So, this is similar to Dijkstra's Algorithm.

In Dijkstra's Algorithm, it was clear where we should grow the initial mold

from, because we were given a source vertex, that they're trying to compute

the shortest paths out of. We have no source vertex in the minimum

spanning tree problem, but it turns out that we can just pick an arbitrary vertex

to start. Doesn't matter which one, which is cool.

So the plan is in E generation we're going to add one edge and span one new

vertex adjacent to the ones we're already spanning.

Now as a greedy algorithm Prim is simply going to select the cheapest edge that

allows it to span one additional new vertex.

Now the start of the algorithm here we're not really spamming anything.

We are sort of thinking of ourselves as growing from and currently spanning the

vertex in the upper right. So what are the edges in which we can

span an adjacent vertex? Well, there is two inches.

There is the top inch that costs one then we'll span an addition in the upper left

vertex or the is the edge with cost two on the right.

If we include that, we'll be able to span the vertex in the bottom right.

So we're not going to be greedy, we're just going to choose the cheaper edge,

the edge of cost one. Now, the vertices that our tree thus far

spans are the top two vertices. So, in the next iteration, we want to add

one more edge [COUGH] to span one additional new vertex.

And now we see that there are three edges sticking out of what we're spanning thus

far that will allow us to span a new edge.

There's the edges that have cost two, three, and four.

The two and the three will allow us to span the vertex in the bottom right.

If we pick the four, that will allow us to span the vertex in the bottom left.

Yeah, and we're going to be greedy, so of these three candid edges, we're going to

pick the cheapest one which is the edge of cost two.

So now the mold that we've been growing is in effect, covers all of the verticies

except for the one in the bottom left. So now in the final iteration we want to

include one more edge so that we span that final remaining vertex.

The one in the bottom left. Note that there's there was this edge of

cause three that we never added. But it got sucked up into the tree that

we grew anyways. So we're going to go ahead and ignore

that. Adding the three wouldn't allow us to

span any more vertices. In fact, it would create a loop which we

don't want. So we're going to say, okay.

We'll have the two edges that would allow us to span an extra vertex.

There's the four and there's the five. We're going to be greedy,

we're going to pick the four. And once we have the edges of the cost

one and two and three and four we have a spanning tree there's no loops there's a

path from any vertex to any other vertex along the pink edges, the cost is seven

you might recall from the previous video this is indeed the minimum cost spanning

tree of this graph. Of course, the fact that we have this

simple procedure that works correctly in this toy example, which is four vertices

and five edges, really means nothing. I mean you shouldn't draw any immediate

conclusions that this is a good algorithm in general even though that is going to

be the case. So let's next go and actually define the

algorithm generally. So if you have a general graph, what does

it mean to start somewhere and grow a mold, span one more vertex each

iteration, always proceeding greedily until you are done.

So lets spell out the pseudo code on the next slide.

So here is Prim's minimum spanning tree algorithm.

We're going to start with just two lines of initialization.

We're going to maintain a set of vertices, capital X.

This is meant to the be the vertices that we span so far.

Again, we need some seed vertex from which to start the process.

It doesn't matter where, which one we pick.

We're going to get the same tree no matter what, so just call it little s.

That's an arbitrary vertex from which we start growth.

The other thing we're maintaining is, of course, the tree.

So that's initially going to be empty. We're going to add one edge to it in each

iteration. An invarient that we are going to

maintain throughout the algorithm is that the edges that currently reside in the

set capital T span the verticies that currently reside in the set capital X.

Then we're going to have our main while loop.

this is the workhorse of the algorithm. And it's very similar to the one in

Dijkstra's algorithm. Namely, each iteration is responsible for

picking one edge crossing the current frontier.

advancing to include one new vertex. And again, it's going to be greed.

The criterion's going to be different, in fact, simpler, than with Dijkstra's

Algorithm instead of looking at links. We're just going to say, what's the

cheapest edge that allows us to span a new vertex?

So the loop's going to keep going, as long as there are vertices that we don't

yet span. And then what we do is we search to the

edges that allow us to span a new vertex. So which edges are those?

Well we want there to be one endpoint in the set X of vertices we already have our

tree spanning and we want the other end point to be non-redundant, so we want it

to be outside of X. So if we have an edge that crosses the

frontier in this sense, one endpoint in X, in endpoint outside that's how we

increase the number of spanned vertices by one in an iteration.

So if E is the cheapest edge amongst all of those that cross the front here with

one end point on either side, that's the one we're going to add to our tree so far

capital T in this iteration, it's end point that's not already in capital X,

that's going to be the very text that we add to X in this iteration.

And again the semantics of an iteration is that we're trying to increase the

number of spanned vertices while paying as little as possible, that's the sense

in which a prim's algorithm is a greedy algorithm.

So as usual with a greedy algorithm, this seems natural enough, but it's not at all

clear that it's correct, that it always computes in minimum spanning tree.

In fact, if you think about it's not even obvious, it actually computes a spannin

tree at all, minimum or otherwise, but it is correct.

Let's make that statement precise on the next slide.

So the key claim is that Prim's Algorithm is correct.

Given any connected input graph, it is guaranteed to output a spanning tree with

minimum possible cost. So before we delve into any details, let

me just finish this video by telling you about the proof plan.

We're going to prove this theorem in two parts.

First, we're going to establish that it outputs some spanning tree.

Maybe, maybe not minimum. Even that's non trivial.

Then we'll worry about arguing that the spanning tree output actually is one of

minimum cost. Both parts of the proof are interesting.

For part one to argue that we output some spanning tree, we're going to review some

preliminaries about graphs and about cuts and about spanning trees and graphs.

For part two to argue optimality, we're going to rely on a very neat property of

spanning trees, minimum spanning trees called the cut property.

I'm happy to report so that the work that we do here and in both parts will bear

further fruit later we're going to reuse these ingredients when we prove the

correctness of another MST algorithm named McCrustgrals algorithm.

For those of you who would much rather talk about running time than correctness

don't worry your time will come after we wrap up this correctness proof I'll

address how do you implement prim's algorithm quickly in particular using

heaps we'll get the running time down to the near linear bound of O of M log n.

Coursera provides universal access to the world’s best education,
partnering with top universities and organizations to offer courses online.