0:01

Hello everybody, welcome back to the Graph Algorithms course.

Â Today we're going to talk about algorithms for exploring graphs, in particular,

Â ways to tell us which vertices can be reached from which others.

Â So for an example of the type of problem we're trying to solve here,

Â suppose that you're playing a video game.

Â And you found the exit to some level, but you don't want to go there just yet.

Â You want to first sort of explore this whole level and

Â make sure you found all the secrets inside it, make sure you get all the loot and

Â XP that there is to be found.

Â And you can certainly wander around between the various rooms and

Â find a bunch passageways and so on.

Â And you've been wandering around for a while and

Â maybe haven't discover anything new yet.

Â But you'd like to make sure that you've really found everything before you leave.

Â How do you accomplish this?

Â How do you make sure that found everything?

Â And this sort of is a notion of exploring a graph.

Â You've got theses rooms, they're connected by passageways.

Â And you want to make sure that everything, at least everything that's reachable from

Â where you started, you can actually get to it and find, explore the entire region.

Â 1:06

And this actually sort of,

Â this of related questions are actually very useful in a number of applications.

Â For example, if you have a map and

Â you want to find a route between some location on the map and

Â some other, it often depends on this sort of exploration of the graph, making sure

Â that you can find some sequence of roads to follow to get you where you want to go.

Â This could also be used if you're sort of building a road system and

Â want to ensure that the entire thing is connected.

Â You need to have some sort of algorithm to tell what can be reached from what.

Â And finally, it has more recreational things, the video game example, but

Â also if you want to solve the maze.

Â This is very much, well,

Â can I get from the start to the finish in this sort graph that connects things?

Â But also various puzzles,

Â if you think of sort of vertices as sort of possible configurations.

Â And then edges that describe moves that you can make and

Â want to say, can I rearrange the puzzle to end up in this location?

Â Again, this sort of exploration algorithm becomes very important.

Â 2:17

And basically, the idea is you start at the vertex and

Â you're allowed to follow edges of the graph.

Â And you want to able to see what you can end up at.

Â And so to formalize this,

Â we'd find a path in a graph G to be a sequence of vertices v0, v1, v2,

Â 2:41

So the formal description of the problem that we would like to solve is

Â given a graph G, and a vertex s in G, we'd like to find the collection of

Â all vertices v in the graph such that there is a path from s to v,

Â everything that we can reach from S.

Â So, just to get a little bit of practice, if we have the following graph,

Â which vertices here are reachable from A?

Â Well think about it a little bit and you find out that A, C, D, F, H,

Â and I are all reachable.

Â And it's easy to see that these vertices sort of do all connect up from edges,

Â but you then can't get to B, E, G or J.

Â And this is sort of because there are no edges that connect

Â any of these vertices we can reach to any of these other vertices.

Â And so there's no way to escape, except for these six.

Â 3:33

And this is sort of the actual idea behind the algorithm.

Â What you want to do is you want to make sure that you can actually find

Â everything that you can reach.

Â And so what you do is you sort of expand, you find a bunch of vertices.

Â These are a bunch that I can reach.

Â But then if there are any edges that connect ones that you can reach to ones

Â that you can't reach so far, well, you have to sort of explore those edges and

Â find the new vertices on the other end, and

Â sort of add them to the list that you know about.

Â And you sort of keep expanding this list of vertices that you know you can get to

Â until you can't connect to anything new and then you're done.

Â 4:08

So to formalize this algorithm a little bit,

Â we are going to keep a list of DiscoveredNodes.

Â And this starts out just with the vertex axis you're supposed to start at.

Â But then what you do is while there is an edge e that leaves this set of

Â DiscoveredNodes that connects to something you have discovered to something you

Â haven't discovered, then what you do is you take the vertex at the other end of

Â that edge and add it to your list of DiscoveredNodes.

Â And you just keep doing this until there's nothing new to be found.

Â And then you return this list of DiscoveredNodes.

Â 4:41

Okay, that's a reasonable algorithm and it does work.

Â But in order to really code this up, you need to do

Â some work to handle the bookkeeping that's required for this algorithm.

Â You need to do things like you need to keep track of which

Â vertices you've discovered and which edges you've dealt with,

Â which edges you've actually checked and which ones you haven't.

Â You also need to know sort of which order to explore new edges in.

Â If there are several possible edges to follow, which one do you follow next?

Â 5:18

The first thing that we need to do is we need to keep track of which

Â vertices we've already found.

Â And for this, we're going to associate a boolean variable to each

Â vertex visited(v) which basically tells us have we visited it yet.

Â 5:31

The next thing that we're going to need to do is we need to,

Â most of the vertices that we visited will actually will have already sort of checked

Â all of the edges relating to them.

Â But some we haven't and we somewhere need to keep track of the list of vertices that

Â still have edges hanging off of them that might connect this to something new.

Â Now this list isn't going to appear explicitly in our program.

Â It'll actually sort of to be hidden in the program stock so

Â this is that points a little bit sudden.

Â We'll sort of see it once we introduce the algorithm.

Â 6:02

The final thing is we need to discover which order to discover,

Â to follow new edges in.

Â And for this we are going to use what is known as the Depth First order.

Â What this means is we're just going to start our initial vertex and

Â we're just going to start following a chain of edges.

Â We're just going to follow some really long path forward

Â until one of two things happens.

Â One thing is we could stumble across a vertex that we have already seen before.

Â In which case there's no reason to have followed that edge and

Â we'll just back up to where we were before.

Â 6:32

The second thing that could happen though is that we hit a dead end.

Â And we actually hit a dead end and

Â can't go any further forward, then we actually back up.

Â And then once we back up though, we don't just back all the way to the beginning.

Â We just back up once step and

Â then try going forwards again from that new vertex that we found.

Â 6:50

Okay, so that's the basic idea.

Â How do we implement this?

Â Well part of the beauty about this is that we have a very simple recursive algorithm.

Â So Explore(v),

Â the first thing you do is you set the visited marker of v to be true.

Â We say we have visited it.

Â Next, for each neighbor w of v, for each w that's adjacent to v,

Â if w has not already been visited, we recursively explore w.

Â 7:34

That's because we have this for

Â loop we want to iterate over all of the neighbors of v in our graph.

Â And for that, if you have an adjacency list, which

Â gives you a list of all the neighbors of v, that's incredibly easy to do.

Â If you don't have an adjacency list on the other hand,

Â this algorithm really isn't that efficient.

Â 8:01

So we mark it as visited.

Â We then check for unvisited neighbors.

Â And hey, look there is one.

Â So we recursively explore that other vertex.

Â We mark it as visited.

Â We search for unvisited neighbors.

Â And we have this one.

Â So remember now we're sort of three layers into the program stack here.

Â This is sort of a sub routine of a sub routine.

Â But now when we're exploring this vertex it has no unvisited neighbors.

Â So after we've done a little bit of checking,

Â we decide that we're done exploring this guy and we pop the stack.

Â This other guy still we visited both of his neighbors,

Â we pock the stock back to the original explorer call.

Â Now this vortex does have some unvisited neighbors left,

Â so let's visit one of them and explore that.

Â Mark it as visited, find an unexplored neighbor, explore that.

Â Mark as visited, unexplored neighbor, mark it as visited, unexplored neighbor.

Â Now when we explore this vertex though, once again we're stuck.

Â So we wrap up exploring that guy, pop a level up the stack, go back to exploring

Â this other vertex, who now actually does have another unvisited neighbor.

Â So we're going to go visit that one.

Â Now we've actually visited everything in the graph.

Â So all we're going to do is at each vertex,

Â we're going to note that all of their neighbors have been visited.

Â We're going to pop up the stack and get back to where we started and conclude.

Â So here we actually have found all these vertices.

Â And in fact if you look at it,

Â we've actually figured out how to reach them all.

Â If you look at sort of these darker edges which are the ones our algorithm sort

Â of actually followed when you ran it,

Â 9:48

Okay, so that's our algorithm.

Â Let's talk about correctness.

Â And the theorem is that if all the vertices in our graph start as unvisited,

Â when we run Explore(v) it marks as visited exactly

Â the vertices that are reachable from v.

Â And the proof here isn't so bad.

Â The first thing to note is that we only ever explore things that

Â are reachable from v.

Â And that's because, well, the way our recursive calls work are we either

Â start at v, or we explore a neighbor of a vertex that we've already explored.

Â So any vertex that we end up exploring has to be a neighbor of

Â a neighbor of a neighbor of a neighbor of a neighbor of a neighbor of the original

Â vertex or something.

Â But that does basically give us a path and say that wherever we got to was reachable.

Â 10:35

The next thing to note is that a w, vertex w,

Â is not marked as visited unless it has already been explored,

Â which is just the only way we mark things as visited is when we explore them.

Â 10:48

But finally, we should note that if w gets explored,

Â well, we then look at all the neighbors of w.

Â And either those neighbors have already been visited, in which case

Â it means they've been explored at some point, or we end up exploring them.

Â 11:10

So to finish things, suppose that we have some vertex u that is reachable from v.

Â That means that we've got a path from v going up to u.

Â And if we actually explored everything along this path we'd be done.

Â We would have explored u at some point.

Â 11:33

However, by what we had on the previous slide,

Â if you explore a vertex you also explore all of its neighbors.

Â So the next vertex z along this path must also be explored.

Â And so this is a contradiction.

Â This says the only way this can work is if we actually explored every

Â vertex along the path.

Â But that means we've explored u, which is good.

Â 12:27

So to look at an example on this graph, we find an unvisited vertex,

Â say that one, and we explore it.

Â So we find a neighbor and another neighbor and then we pop back up the stack.

Â And then we find something adjacent to our original vertex and come back.

Â And now we're done exploring that first vertex.

Â 12:45

So now we look for a new vertex we've never visited before, like say that one.

Â We explore its neighbor, come back, we're done exploring that guy.

Â We find a new vertex that we haven't visited, that one.

Â We explore its neighbor and his neighbor and then come back.

Â And now that we've actually visited all the vertices,

Â only now do we sort of wrap up and conclude our algorithm.

Â 13:21

The next thing to note is that no vertex

Â ever gets explored if it's already been visited.

Â And in fact if you look at every time we make an explore as even a recursive call,

Â then we always first check if it has not been visited, then we explore it.

Â And this means that no vertex gets explored more than once.

Â In fact, this means that each vertex gets explored exactly once because the outer

Â loop in the DFS explores every vertex if it hasn't already been visited.

Â 14:08

And so we have to be more proportional to the total number of neighbors

Â over all vertices.

Â And that's proportional to the number of edges because each edge connecting A and

Â B says that A is a neighbor of B and that B is a neighbor of A.

Â So it contributes to two neighbors.

Â But the total amount of work is still O of the number of edges.

Â