0:00

Hi, in this video you will learn how to construct a good node ordering for

Â the contraction hierarchies algorithm.

Â So far if you think about it, our algorithm works correctly for

Â any node ordering.

Â If you just choose any order and pass it to the processing procedure, it will

Â contract the nodes in this particular order and return an augmented graph.

Â If you then run queries on this augmented graph using the bidirectional Dijkstra

Â approach from the previous videos, everything will work correctly because

Â the correctness proof doesn't depend on any particular order.

Â However, this order will influence the order of operations in the bidirectional

Â Dijkstra algorithm for the queries.

Â And actually, it will heavily influence the time required for

Â queries and for preprocessing.

Â So we want to choose a good node ordering, so that our preprocessing and

Â our queries are really fast.

Â First thing we want from the order is we want to minimize the number of

Â added shortcuts.

Â Because each time you add a shortcut,

Â you add an edge to the augmented graph in the end.

Â And the more edges, the slower your Dijkstra algorithm will work.

Â So we want less shortcuts.

Â Second, you want to spread the important nodes across the map.

Â because from the common sense, it follows that there are important big

Â cities in every sufficiently large region of the country, of the map.

Â And then in each state, if you divide it into sub-regions,

Â there are still relatively big cities which are important nodes and so on.

Â And this should work well in practice.

Â And third, you also want to minimize somehow the number of edges in

Â the shortest paths in general in the augmented graph.

Â So basically, you want to make your graph look less like a road network graph and

Â more like a social network graph, where bidirectional Dijkstra algorithm is so

Â much faster than the regular Dijkstra's algorithm.

Â Because any two nodes in the social network are at most six edges away.

Â And then bidirectional Dijkstra finds, with forward surge and

Â backward surge, in just three steps they meet each other.

Â You want the same in the augmented graph and

Â you can somehow become close to it by adding shortcuts in the right places.

Â So in general, you want these three things.

Â Now we'll discuss how to do that.

Â So we want to order nodes by some importance, so

Â we need to introduce a measure of importance for that.

Â And then, we will at each step contract the least important node.

Â However, when we contract a node,

Â the properties of the nodes which are neighbors of this node can change.

Â And then importance can also change for those nodes after that.

Â So in general, the algorithm will work as following.

Â We'll keep all the nodes in a priority queue ordered by importance, so

Â that we can extract the least important nodes very fast.

Â But of course, priority will be something like minus importance so

Â that the hat of the priority queue contains the least important node,

Â not the most important one.

Â Then, on each iteration we will extract the least important node.

Â And we want to contract this node, but first we need to recompute its importance.

Â Because it could have changed because of the contractions of the other nodes.

Â Then, we need to compare its new updated importance with the importances

Â of all the other nodes.

Â And to do that, we will take the hat of the priority queue.

Â Because we've already extracted these nodes from the priority queue,

Â it's no longer there.

Â So we will compare the importance of this node

Â with the smallest importance of all the nodes which are still in the queue.

Â If the importance of the extracted node is still minimal,

Â then we can safely contract it, it's the least important node.

Â Otherwise, we will need to put it back into the priority queue so

Â that we can extract a node with even smaller priority from it.

Â Of course this is only approximate,

Â because at some point we could extract the node's recomputed importance and

Â then decide that it has really the least importance of all the nodes in the queue.

Â However, there is some other node which is still in the queue which has a smaller

Â importance, but we haven't yet recomputed it and so we don't know about that.

Â And that's okay,

Â we will try just to order the nodes approximately by this importance.

Â Also, it can seem that this process can go on and on indefinitely.

Â But we'll show that it will actually stop after a finite number of

Â iterations, always.

Â 4:38

Why is that?

Â Well, if we dont contract the node, we update its importance and

Â it actually changes.

Â And after that, its importance is corresponding to the current set of

Â nodes which are still not contracted.

Â And after at most number of nodes attempts,

Â all the nodes will have updated importance.

Â And then, the node with the minimum updated importance will be

Â contracted after that, because it will be extracted and

Â it will have the smallest importance at that point.

Â And then when we recompute its importance,

Â it will turn out to be the same because it already has updated importance.

Â And then we will just contract this node.

Â So eventually this kind of iteration always contracts a node.

Â And in practice, it usually takes one or two iterations usually,

Â to actually contract something.

Â So this works typically fast.

Â 5:34

Now, we'll talk about the importance criteria.

Â So first is edge difference.

Â It is related to the number of shortcuts added and

Â minimizing the number of edges in the augmented graph.

Â Second one down is number of contracted neighbors.

Â And this is related to spreading the nodes across the map.

Â So if there are already many neighbors of this node which already contracted,

Â then we will probably contract some node in some other place.

Â Because we want to spread important nodes and also spread unimportant nodes.

Â We don't want everything to be clustered in one region.

Â Also, shortcut cover is connected with how

Â important that node is, how unavoidable it is.

Â Do you remember the picture with an island between San Francisco and Oakland?

Â The treasure island which was unavoidable if you go from San Francisco to Oakland,

Â if you want to go with the shortest path?

Â So shortcut cover will somehow approximate how unavoidable a node is.

Â And the last one, the node level, has something to do with the number

Â of edges in the shortest paths in the augmented graph.

Â So now we'll go through all of these in detail.

Â So Edge Difference.

Â You basically want to minimize the number of edges in the augmented graph.

Â So when we try to contract the node, v, we can actually compute

Â how many shortcuts would be added if we actually contract this node.

Â We can also always compute the number of incoming edges, and

Â number of outgoing edges of any node.

Â And then we can compute the edge difference.

Â The number of shortcuts added minus number of edges deleted,

Â which is the degree of this node.

Â And this is called edge difference.

Â And the number of edges increases by edge difference after contracting v.

Â So this influences both the number of edges in the final graph and

Â also the number of edges in the graph being preprocessed.

Â Which is important because it influences the preprocessing time.

Â 7:37

So what we want is to contract nodes with small edge differences first.

Â Contracted Neighbors, I already talked about this.

Â We want to contract a node with small number of already contracted neighbors, so

Â that the nodes which we contract don't cluster in one place, but

Â they're actually spread out in all the regions of the graph.

Â Shortcut Covers, very important.

Â Basically it says how many nodes are there,

Â such that we cannot avoid visiting our node if we want to go to those nodes.

Â More formally, if we have some node, v, we can do the number of neighbors,

Â w of v, such that we have to create a shortcut, either to w or

Â from w in the case we need to contract v.

Â And this basically means that when we contract v,

Â we'll lose some shortest path to w.

Â And that's why we need to create an additional shortcut.

Â And so v is important for w, and if v is important for many, its neighbors,

Â then it's important.

Â And so, we need to first contract nodes with small importance,

Â with small shortcut cover.

Â Node level is an upper bound on the number of edges in the shortest

Â path from any node s to node v in the augmented graph.

Â It can be shown formally that this is an upper bound,

Â but I'll tell you the intuition.

Â So we initialize the level with 0.

Â And then when we contract some node v, we update the level of its neighbors.

Â So for a neighbor u of v, either we could get to u somehow without going through v,

Â and then contracting v doesn't influence its level.

Â Or we could go to u through node v.

Â But then to go to node v, we needed at most,

Â level of v edges, and then one more to get from v to u.

Â So that's why L(u) needs to be maximized by L(v) + 1.

Â So in the end, L(v) is an upper bound on the number of

Â edges on the shortest path to v.

Â And so we want to contract nodes with small level,

Â because first, we need to contract non-important nodes.

Â In the end we proposed the following importance measure.

Â We just sum up all those four values.

Â Note that this is a heuristic value, and you can actually play with weights of

Â those four quantities in the importance and see how preprocessing time and

Â query time change.

Â But we think that just summing them up will already work really well.

Â However, each of the four quantities is necessary for fast preprocessing and

Â queries.

Â If you avoid implementing at least one of them,

Â probably preprocessing or query time will be too long.

Â But then you can further optimize with the coefficients.

Â For example,

Â two times added distance plus three times contracted neighbors, and so on.

Â So it also means that you need to find a way to compute all of these

Â quantities efficiently at any stage of the preprocessing.

Â Both in the beginning, and also when you need to

Â contract a node but you've already contracted some other nodes.

Â This is not very hard and we'll leave this as an exercise to you.

Â So, we did a lot of work.

Â We've come up with a preprocessing stage.

Â We need a particular node ordering with that and

Â we've come up with a complex importance measure for nodes for that.

Â And also we need to do bidirectional Dijkstra, but

Â not classical bidirectional Dijkstra.

Â We need to do some tweaks in the query stage.

Â So, is it all worth it?

Â It turns out that yes, it is well worth it.

Â If we take for example, a graph of Europe with 18 million nodes, and

Â launch the classical Dijkstra's algorithm on it.

Â And we generate random pairs of vertices in different places of Europe and

Â then compute the shorter distances.

Â Then on average, it will work for 4.3 seconds.

Â However, if we launch Contraction Hierarchies on

Â the same graph with the same pairs of vertices and

Â we tweak the heuristics inside the contraction hierarchy as well.

Â It will work on average just for 18 milliseconds,

Â which is almost 25000 times faster.

Â So it is very well worth it.

Â For example, for a service like Google Maps or Yandex Maps,

Â working 25000 times faster means that you actually can provide

Â your service to millions of users and do it blazingly fast.

Â And the speed-up will only increase if you increase the size of a graph.

Â For example, if you go from the graph of Europe to the graph of the whole world,

Â the speed-up will only increase.

Â 12:55

And this algorithm works 1000s of times faster than Dijkstra.

Â If you tweak it really well, it can work even tens of thousands times faster.

Â But at least 1000s of times faster, this is what is guaranteed for

Â your own implementations.

Â You will try to do that in the programming assignments of this project for

Â this module.

Â And you will see that your solutions are actually much,

Â much faster than the classical algorithm.

Â And you will need them to work 1000s of times faster to actually

Â pass the assignment.

Â And also, we actually encourage you to compete on the forums on whose solutions

Â is the fastest.

Â So you can take the result by the grader and then post it on the forums, and

Â someone can organize the result table with the top results.

Â And it's very interesting how you can come up with your own heuristics for

Â improving this algorithm.

Â Because you see there are a lot of heuristics in the last part of this

Â lecture, and you can come up with your own ideas and maybe they will be so

Â good that you will speed up everyone else's solution 10 times or 100 times.

Â So really try to do that.

Â We will also show you some pointers to where you can get the data for

Â real graphs of the road networks of some parts of US and the whole US.

Â And in the end, this could become your own Open Source project which you could show,

Â for example, to potential employers.

Â So we wish you good luck with advanced shortest paths.

Â