I'm going to take you through a threshold theorem, for those people who are interested in seeing in a little more detail exactly how these thresholds work and how some of these things are proven, just to give you a feel for the type of proof that works in this setting. The one we're looking at is one of the more interesting theorems coming out of Erdős and Rényi.

What it says is that a threshold function for the connectedness of one of these Poisson, or G(n,p), random networks is t(n) = log(n)/n: the point at which the probability of forming a link is proportional to log(n)/n. This tells us that if p(n) is much larger than log(n)/n, then we're going to have a connected network; and if it's much smaller than this, then the network is not going to be connected, and we're going to have different components in it.
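Just to make the claim concrete before we prove anything, here's a minimal simulation sketch (my own illustration; the constants c and the sample sizes are arbitrary choices, not part of the theorem). It draws G(n, p) graphs at p = c·log(n)/n on either side of c = 1 and checks connectivity with a breadth-first search:

```python
import math
import random

def gnp_connected(n, p, rng):
    """Draw one G(n, p) graph and check connectivity by breadth-first search."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:           # each link forms independently
                adj[i].append(j)
                adj[j].append(i)
    seen, stack = {0}, [0]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n                  # connected iff the search reaches every node

rng = random.Random(0)
n, trials = 200, 200
results = {}
for c in (0.5, 2.0):                       # p = c * log(n) / n, straddling c = 1
    p = c * math.log(n) / n
    results[c] = sum(gnp_connected(n, p, rng) for _ in range(trials)) / trials
    print(f"c = {c}: fraction connected = {results[c]:.2f}")
```

Even at this modest n, the fraction of connected draws jumps from near zero to near one as c crosses the threshold.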

Okay. So that's the theorem, and what we're going to do is work through the part of the proof that carries its main intuitions. There are still some details to work out, and I'm not going to go through all of them, but this will give the essential bits of the proof.

One thing we can do is show that if p(n) divided by this threshold goes to zero, so if p is much smaller than the threshold, then there are going to be isolated nodes with probability one. And if there are isolated nodes, then certainly the network can't be connected. Whereas if p(n)/t(n) goes to infinity, then there can't be any components with fewer than half the nodes. So every component has to have more than half the nodes, and that means there can only be one component: a single giant, connected component.

So the idea here is to show that if you're much smaller than this threshold you're going to get some isolated nodes, and if you're much bigger than it you're not going to get any small, isolated components. What I'll do is concentrate on showing that the appearance or disappearance of isolated nodes happens at this threshold; showing that a small component can or cannot exist is more or less a small variation on the same proof.

First of all, just to remind you of some useful approximations of the exponential function, in case your calculus or pre-calculus is a little bit rusty. There are two forms of the exponential function that will be very useful in this course in terms of understanding properties. One is that we can think of e^x as the limit of a process where we take (1 + x/n) raised to the n-th power; think of continuous compounding of interest, which looks like e^x. So the limit, as n becomes quite large, of (1 + x/n)^n equals e^x.

Another is the Taylor series for the exponential function: we can write e^x as the sum over k of x^k/k!. So that's another way of getting it, and you can go back to your Taylor series if you like.
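Both forms are easy to check numerically; a quick sketch (the choice x = 2 and the truncation at twenty terms are arbitrary illustrations):

```python
import math

x = 2.0

# Compounding form: (1 + x/n)^n approaches e^x as n grows.
for n in (10, 100, 10000):
    print(n, (1 + x / n) ** n)

# Taylor form: e^x = sum over k of x^k / k!; twenty terms already suffice here.
taylor = sum(x ** k / math.factorial(k) for k in range(20))
print("Taylor sum:", taylor, " exp(x):", math.exp(x))
```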

These will be useful approximations: what we're going to do is approximate the probabilities of these events happening, and using these approximations we'll be able to get things back in terms of exponentials and logarithms.

What we're going to show now is what happens right at the point where your expected number of connections looks like log(n). This is the point where p is proportional to log(n)/n, so the expected degree, multiplying p by n - 1, which is roughly n, is proportional to log(n). This is the threshold above which we expect each node to have some links, and below which we expect some nodes to be isolated. In fact, this is the threshold above which we expect nodes to have many links, and once each node has many links the chance of having disconnected components vanishes. So the logic is that we're going to show this part of it, and the rest is a fairly simple variation. Okay.

So the first thing we're going to do is write the expected degree as some function r + log(n), where r can depend on n. What we're then going to do is work through the calculation of the probability that some node is isolated. Okay. So how is that going to depend on this expected degree?

Look at a given node and ask: what's the probability that it's isolated? Well, it's the probability that it has no links. What's the probability that a particular one of its possible links is not present? That's 1 - p. What's the probability that it has none of its links? That's the probability that none of them formed: (1 - p) times (1 - p) times (1 - p), and so on, raised to the (n - 1) power. So the probability that a given node is isolated is (1 - p)^(n-1). What we want to show is that if p is small enough, then there's a high probability of some nodes being isolated; and if p is large enough, then there's essentially no probability of this happening.

Okay. So the probability that some node is isolated is (1 - p)^(n-1). Now, from the way we wrote things, remember the expected degree was r + log(n), so p is just that divided by n - 1: p = (r + log(n))/(n - 1). So (1 - p)^(n-1) is just (1 - (r + log(n))/(n - 1))^(n-1). Okay.

So now we've got an expression for this, and we're going to use the approximation for the exponential function I just told you about. Recall that, with n running off to infinity, (1 - x/n)^n approximates e^(-x). That's true when x/n is small, so vanishing. In particular, what we're going to need is for r + log(n) to be small relative to n - 1, and for now I'm just going to assume that's true.

For people who are really interested in the completion of the proof: if r + log(n) is instead large relative to n, then, since connectedness is a monotone property, whatever holds when r + log(n) is small relative to n holds even more easily when it's large; we're just adding more links. So that part of the proof is quite easy. The case that's really tough is when it's small relative to n, and that's what we're going to look at now.

So now we apply the approximation: (1 - (r + log(n))/(n - 1))^(n-1) is approximately e^(-(r + log(n))). So the probability that some node is isolated begins to look like e^(-r - log(n)); bring the log(n) out as e^(-log(n)) = 1/n, and we end up with the probability that a given node is isolated looking like e^(-r)/n. And remember, r was how much the expected degree differed from log(n). Okay.
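You can check how good this approximation is; a small sketch comparing the exact isolation probability (1 - p)^(n-1), with p = (r + log n)/(n - 1), against e^(-r)/n (the value r = 1 is an arbitrary illustration):

```python
import math

def p_isolated_exact(n, r):
    """Exact isolation probability when the expected degree is r + log(n)."""
    p = (r + math.log(n)) / (n - 1)
    return (1 - p) ** (n - 1)

def p_isolated_approx(n, r):
    """The approximation e^(-(r + log n)) = e^(-r) / n."""
    return math.exp(-r) / n

for n in (100, 10_000, 1_000_000):
    ratio = p_isolated_exact(n, 1.0) / p_isolated_approx(n, 1.0)
    print(f"n = {n}: exact/approx = {ratio:.4f}")   # ratio tends to 1
```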

So now we've got a simple formula for the probability that some node is isolated, for large n. If each node is isolated with probability e^(-r)/n, then the expected number of isolated nodes is just n times this, since nodes are identical in this world. The n's cancel out, and we get that the expected number of isolated nodes is e^(-r).
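As a sanity check, one can simulate: draw G(n, p) with expected degree r + log(n), count isolated nodes, and compare the average count with e^(-r). The parameters n = 300, r = 1, and the trial count here are illustrative, and at finite n the match is only approximate:

```python
import math
import random

def count_isolated(n, p, rng):
    """Count degree-zero nodes in one draw of G(n, p)."""
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return sum(d == 0 for d in degree)

rng = random.Random(1)
n, r, trials = 300, 1.0, 200
p = (r + math.log(n)) / (n - 1)            # so the expected degree is r + log(n)
avg = sum(count_isolated(n, p, rng) for _ in range(trials)) / trials
print(f"simulated mean isolated nodes: {avg:.3f}, e^(-r): {math.exp(-r):.3f}")
```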

Now we're pretty much all the way there. If r is going to infinity, if r is getting quite large, then the expected number of isolated nodes goes to zero. If r goes to minus infinity, then the expected number of isolated nodes becomes infinite. So basically, as a function of r, if it's either getting very large and positive, meaning that the threshold is exceeded by p, or getting very negative, so that p is much smaller than the threshold, then the expected number of isolated nodes is going either to zero or to infinity. Okay. So the expected number of isolated nodes goes to zero if r(n) tends to infinity, and to infinity if r(n) tends to minus infinity.

If the expected number of isolated nodes tends to zero, then the probability of having one has to tend to zero: you can't have the expected number go to zero and have the probability of having one stay positive, significantly higher than zero. So the probability of having one is going to have to tend to zero.

And so this tells us that if we're sitting in a situation where r(n) goes to plus infinity, then the expected number tends to zero, and the probability of having one tends to zero. So if p is much bigger than t(n) = log(n)/n, then the expected number of isolated nodes goes to zero and the probability of having an isolated node goes to zero.

Vice versa, if the expected number is going to infinity, then, with an extra step where you play with the variance, you can show that the probability of having at least one isolated node goes to one: once you control the variance, it's impossible to have the expected number go to infinity and not have the probability of actually having one go to one. That's an extra step I'm not going to go through, but it's a fairly easy one to do in terms of Chebyshev's inequality.
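For completeness, here is a sketch of that skipped variance step, as I read it (my own gloss, not worked out in the lecture): by Chebyshev's inequality, P(X = 0) ≤ Var(X)/E[X]², where X counts isolated nodes, and both moments can be computed exactly from the single-node and pairwise isolation probabilities. Taking, say, r(n) = -0.5·log(n), so that r(n) → -∞, the bound goes to zero:

```python
import math

def chebyshev_bound_no_isolated(n, r):
    """Chebyshev bound Var(X)/E[X]^2 on P(no isolated node), where X counts
    isolated nodes in G(n, p) with p = (r + log n)/(n - 1)."""
    p = (r + math.log(n)) / (n - 1)
    q = (1 - p) ** (n - 1)                 # P(a given node is isolated)
    mean = n * q
    # Two given nodes are both isolated iff all 2(n - 2) + 1 links touching them fail.
    q_pair = (1 - p) ** (2 * n - 3)
    var = mean + n * (n - 1) * q_pair - mean ** 2
    return var / mean ** 2

for n in (100, 1_000, 10_000):
    print(n, chebyshev_bound_no_isolated(n, -0.5 * math.log(n)))
```

So the probability that no isolated node exists is squeezed to zero, which is the other half of the argument.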

So this was just one example; I don't expect you to digest all of it on the first pass. You can go through Barabási's book in much more detail if you want to see one of these proofs worked out in full. But what I wanted to communicate through this derivation is just the type of reasoning it goes through.

What you do is you want a certain probability. You get at that probability through approximations based on exponential functions, approximations which we know become fairly accurate for large n, and that gives us bounds on probabilities. So we can say that the probability of something either has to tend to one or has to tend to zero, based on what these functions converge to as n becomes very large. And the threshold functions come from identifying where it is that these expressions go either to one or to zero.

Whether we're talking about isolated nodes or about the first link appearing, et cetera, it's going to be a different threshold. The proofs tend to have the same kind of structure; they just differ in the particular expressions, depending on whether we're talking about one property and its threshold or a different property and a different threshold. But this type of analysis is what underlies a lot of random graph theory and a lot of the proofs behind the scenes, in terms of establishing properties of random graphs in large settings.

Okay. So that's just one example. Next we're going to take a look at some other types of random network models, and at other kinds of things we might be able to establish using them.