0:00

So now, let's talk about the final and most serious problem, which is that stuff in the Internet (nodes, links, and so on) is failing all the time.

You know, it's true that we just argued that the asynchronous Bellman-Ford shortest-path routing protocol is guaranteed to converge to correct shortest paths. But that was assuming that the network was static, that no links were going up or down.

So one particularly simple problem in the presence of failures is what's known as the counting-to-infinity problem. I'm going to show you a particularly contrived version of this problem, just to illustrate it in the simplest way, but it's quite easy to come up with more realistic examples.

And this problem is not just some kind of theoretical flaw in the distributed Bellman-Ford algorithm. This problem really was observed in practice with routing protocols from the 1970s.

So imagine we just have a super simple network with vertices s, v, and t, with a directed arc from s to v and an arc from v to t. Everything has unit cost, and we're doing routing to the destination t. If you want a more realistic example, you might think of the arcs between s and v as standing in for longer paths that have multiple hops.

In any case, let's imagine that we successfully computed shortest paths on this basic network, so that the distance from t to itself is zero, from v to t is one, and from s to t is two.

Now, again, the issue is that links are failing all the time on the Internet. So, at some point, this link from v to t might go down. V is going to notice that its link to t has failed, and it's going to have to reset its shortest-path estimate to t to be plus infinity.

And in an effort to recover connectivity to t, it makes sense for v to ask its neighbors: do they have paths to t, and if so, how long are they? So in particular, v will poll s, and s will say, oh yeah, I have a path to t of distance only two. And v says, oh, well, that's great. Currently my estimate to t is plus infinity. I can get to s with length one, and s says it can get back to t with length two. So that gives me a path of length three to t.

Now, of course, the flaw in this reasoning is that s was depending on v to get to t, and now, all of a sudden, v is circularly going to use s, in its mind, to get all the way back to t. So this already illustrates the dangers of naively implementing the asynchronous Bellman-Ford algorithm in the presence of link failures.

Where does the name counting to infinity come from? Well, you can imagine that for some implementations of this protocol, s would then say, oh boy, okay, v just revised its shortest-path estimate to t to be three. My next hop was to v, so if v takes two longer to get to t, then I'm going to take two longer to get to t as well. So at that point, s updates its shortest-path distance to be four.

And now this keeps going on. At this point, v says, oh well, if s is going to take two longer to get to t, and I was depending on s, then I have to go up by two. And then s goes up by two, and v goes up by two, and so on, forevermore. So failures cause problems for the convergence of these distributed shortest-path routing protocols. Not just the counting-to-infinity problem; there are others as well.
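The update dynamics on this three-node example can be simulated in a few lines. This is a minimal sketch, not a real protocol implementation: the network, the link costs, and the round structure (v and s alternately recomputing via their one remaining neighbor after the v-t link fails) are hard-coded for this contrived example.

```python
# Distances to destination t after initial convergence on
# s -> v -> t with unit costs: dist(s) = 2, dist(v) = 1.
est = {'s': 2, 'v': 1}

# The v-t link fails. v's only remaining candidate route is via
# its neighbor s (link cost 1), and s's only route is via v
# (link cost 1), so each naive distance-vector update just adds
# one to the other node's stale estimate.
history = []
for _ in range(4):
    est['v'] = 1 + est['s']   # v adopts s's advertised distance
    est['s'] = 1 + est['v']   # s then adopts v's revised distance
    history.append((est['v'], est['s']))

print(history)  # [(3, 4), (5, 6), (7, 8), (9, 10)] -- counting to infinity
```

The estimates grow by two per round, forevermore, exactly as in the walkthrough above.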

And this is a tough issue. Much blood and ink has been spilled over the relative merits of different possible solutions for dealing with failures. Back in the 1980s, people were largely proposing hacks to the basic protocol. I will give them credit: they had some pretty ridiculous and fun names for their hacks, like "split horizon with poison reverse." But what people eventually converged on is moving from these so-called distance-vector protocols to what are called path-vector protocols.

Â 3:40

And what happens in a path-vector protocol is, each vertex v maintains not just the next hop to every possible destination; it actually keeps locally the entire path that is going to get used from v to each destination t.

So there's definitely some irony here, because this is essentially the solution to reconstructing optimal solutions in dynamic programming that we've been studiously avoiding this whole time. You might recall, when we first started discussing reconstruction algorithms (this was back with independent sets of path graphs), we first said, well, you know, you could, in the forward pass through the array, store not just the optimal solution value but the optimal solution itself. But we argued that that's not really the best idea. It wastes time. It wastes space. Much smarter is to just reconstruct an optimal solution via a backward pass through the filled-in table.

But what's happening here is that this optimized version of the reconstruction algorithm, the one that doesn't use extra time or space, is just not robust enough to withstand the vagaries of Internet routing. So we are going to the crudest solution, where we literally just store optimal solutions, not just the solution value itself.

So that explains why this is called a path-vector protocol. In a distance-vector protocol, you store a distance for each possible destination. Here, we're storing a path for each possible destination.

So the drawback is exactly what we've been saying all along: if you store the optimal solutions, and not just the optimal solution value, it's going to take quite a bit more space. And indeed, doing this blows up the routing tables in routers; that's just a fact.

But there are two quite important benefits. The first we've already discussed: path-vector protocols are more robust to failures. I'm not going to discuss the details of how you use this extra path information to handle failures better. But certainly, in our contrived motivating example, you can see how, if s and v actually knew the entire path they were storing to t, they could make sure that they didn't get into this counting-to-infinity problem.

But moving to a path-vector protocol doesn't just add robustness; it also enables new functionality. In particular, passing around entire paths allows vertices to express more complicated preferences over routes than merely what their lengths are.

Imagine, for example, that you're a small Internet provider, and you have a much more favorable contract with AT&T than you do with Sprint, so it costs you much less money to send traffic through AT&T as an intermediary than it does through Sprint. Well, now imagine you're running this distributed shortest-path protocol, and two of your neighbors offer you two paths to get to some destination t: the first path goes through AT&T, the second goes through Sprint. And, of course, the only reason you know this information is because it's a path-vector protocol; you're passing around the actual paths, and you can see what the intermediate stops are.

Well, then you're free to say: as my path to t, and as what I'm going to advertise and store for other people, I'm going to use the path that goes through AT&T. I'm going to ignore the one that goes through Sprint.

So that is more or less how the Border Gateway Protocol, a.k.a. BGP, works, and that is the protocol which governs how traffic gets routed through the core of the Internet. So that wraps up our discussion of how shortest-path algorithms dating back to the 1950s have played a fundamental role in how shortest-path routing has evolved in the Internet over the past half century.