0:00

Okay, hi folks, we're back again, and we'll be talking a little bit more about

Â Random Networks. And in particular, we're going to start

Â looking at Growing Random Networks. So, situations where there are new nodes

Â entering over time. And, so this fits into our study of

Â network formation. So in terms of the course, we are now in

Â the second part of the course. And we're looking at Network formation

Â models. And again, we've looked at Static Random

Â Network models. And now we're going to be looking at

Â Dynamic, Random Network models. Where there's growing numbers of nodes

Â over time. And there's lots of examples where this

Â happens in the world. So, the prime example is citation

Â networks. New articles are born over time.

Â New articles can form links, by citing old articles.

Â Old articles can't cite new articles. so articles are going to accumulate links

Â over time. Newer articles are going to have fewer

Â links then older ones, so there's a natural form of heteragenania that forms

Â this way. Same thing with the web, in terms of web

Â pages. Collaboration networks and

Â co-authorships, you have older researchers, younger researchers older

Â researchers will have had more collaborations than younger ones.

Â Societies you known generally human situations you're going to have some

Â older. And some younger and they're going to

Â have different characteristics based, just simply, solely on age.

Â So there's a natural form of heterogeneity.

Â That comes in in form of age. So let's think a little bit about what

Â they add. why do we care about having growing

Â random networks? Well you know, one answer people say okay

Â what's more realistic. And one thing I want to emphasize just

Â from the start is when we build models, we know that we're not going to try and

Â capture everything in the world. And so, realism is not usually a good

Â reason for building a model to, to try and match reality.

Â the, the reason that we build models, is to try and use as few moving parts as

Â possible in order to pro, reproduce the world.

Â So the only reason we want to add this feature of growing random networks is not

Â because that's the way the world is. But because this richer model might

Â capture something in the real world that was not captured in the static models.

Â So again, you, you shouldn't judge models based on realism.

Â They're always wrong. the question is are they capturing

Â reasonable elements and so here. The natural form of heterogeneity versus

Â age is going to be useful in getting out more connected nodes and less connected

Â nodes. And, and starting to see, power laws and,

Â and fat tails. So we're going to get a natural form of

Â dynamics. And in particular, we're going to end up

Â with a natural way of getting different degree distribution without just building

Â them into the statistical distribution. So, we could just assume that there's a

Â distribution that has the, the, the features of reality.

Â But instead we want to do is see if build a model that will end up generating.

Â Features that look like the real world, and this sort of dynamics is going to

Â help quite a bit. Okay,[COUGH] , so in, in order to start

Â this, what we're going to do is start by taking Erdos-Renyi random network

Â situation. but instead, just enriching it to have

Â nodes being born over time. And so that'll be a simple benchmark,

Â that'll give us an idea of some of the techniques.

Â And then we can enrich it to have different kinds of formation processes

Â besides the uniform at random. Okay, so, this idea of growing and

Â uniformly at random, what we're going to have is each day is going to be a new

Â node's birthday. So nodes come in at time one, two, three,

Â four, etcetera. And when they are born, they will form a

Â number of links to existing nodes. And, to start, in terms of this

Â Erdos-Renyi kind of setting. What we're doing, we're going to assume

Â that each mode is going to form it's links, uniformly at random to the ones

Â that are already existing out there in society.

Â Okay so you're born, you decide who to link to, you form some numbers of links

Â and you form them to over the existing node.

Â To keep things simple, we're going to have the, the link formation process only

Â occur when the node is born. So you don't keep forming them over your

Â lifetime. You'll get new ones as new nodes are born

Â and they form their links. But a, a given note only forms it'[s

Â links it, it, it's outward links in some sense when it's born.

Â But it can accumulate more links later on as new nodes are, are forming their

Â links. Okay.

Â So in order to have some nodes that you can form links to when you're born what

Â we're going to do is each one is going to form m links at random.

Â So, we'll have m nodes that are already existing which are fully connected so

Â that when the first node is born it has somebody to connect to.

Â So the, the process is going to start out, we'll start it with a very simple

Â seed of already having some m nodes fully connected.

Â You could start with a whole series of different situations.

Â That's not going to effect the asymptotic limit of it, it might effect what it

Â looks like for the short periods of, of time, until you get to a very large time.

Â Okay, new node is born, it forms m links to existing nodes.

Â Say two, three, four, how, what, whatever, how, however many we want.

Â And what that means, is if I'm already born and, there's some time t then I'm

Â going to have a probability That's, that's roughly m of the, the

Â number of links being formed, compared to t of the existing number of, of nodes

Â that are, are already there, of, of getting one of the new links, right?

Â So, m over t will give, give us the probability.

Â So, the more nodes that are out there over time the less chance that any

Â particular node is going to get one the new links, right?

Â Okay, so very simple process, but, it, it's going to have some dynamics to it.

Â Okay. So, now let's think about sum node i,

Â that was born after the original m nodes that we started with that we fully

Â connect and, and before sum time t. Okay.

Â And now let's ask how many links has it, does it expect to have collected by time

Â t? Well, the first thing is, that it forms

Â some m links when it was born. So it's going to have m links for sure.

Â Then in terms of expectations, the next day after it was born, if it was born at

Â time i, then the date now is, i plus 1. And there's some new node which is born

Â and it's forming it's m links. There's already existing i plus 1 links

Â nodes out there. And so it has a chance m over i plus 1 of

Â getting these new links, okay? And then at time, the next state there's

Â i plus 2. It's going to, a number of new links are

Â formed, and so forth. Right?

Â So, so this is the overall sum of what it's going to get over time.

Â Right? So, we had the first m that it got when,

Â at birth, you expect it from the next node born, and so on.

Â So it has this sum.

Â [BLANK_AUDIO].

Â Now, if you want to look at a sum like this.

Â So, you look at this kind of sum, what is an approximation for this sum, well these

Â are harmonic numbers. So if you can if you remember your, your

Â number theory, these are what are known as harmonic numbers.

Â So, they're growing proportionately to i plus some number so it's growing

Â proportionately to the inverse of t. And if you sum a series like this an

Â approximation for this is going to be m times 1 plus log of t over i.

Â So, depending on, on how far you go out and when you started this series.

Â You're going to end up with sum that looks like this.

Â So, we have an approximation for what the expected degree is of any node, after

Â some time period t. Okay?

Â So, very simple calculation. And now what can do is we can ask what

Â does this generate in terms of the distribution of degrees in society.

Â Or the distribution of expected degrees in society at any point in time?

Â Okay, so let's do a simple calculation. How many nodes have an expected degree of

Â less than d at sometime t? Well, it's those for whom their expected

Â degrees at time t are less than some d. Right?

Â So if you say okay how many are going to have degree less than 100?

Â Then we can ask that. How many are going to have degrees less

Â than 35? We'll get some numbers.

Â So, at time t, it's going to be the i's for which this inequality holds.

Â Okay, so let's have a look at that inequality in more detail.

Â So, let's suppose that we did this calculation at time 100.

Â So, we've got here our t is 100. And let's suppose that each node at each

Â time is, is forming 20 new links. So, what does the degrees of different

Â nodes look like at time100? So here we have the birth date of the

Â node. So, here is birth date of the node and

Â this is then the equation that we had. So, so, so this is our equation that we

Â had in terms of m times 1 plus log t over i, right.

Â So, this equation right here is m times 1 plus log t over i.

Â 10:12

So, that's plotted out here. And what that means is, as you can see

Â the older nodes, have gotten more links than the younger nodes.

Â So, the youngest node is only formed at 20 and not gotten many more.

Â Some of the slightly older nodes have gotten a little bit more than 20 because

Â they've happened to get some of the new ones.

Â These ones have been around for a longer time, they have gained more links.

Â And the, the, the reason you have curvature here, is always easier to get

Â links early on. And as more and more nodes are born, it's

Â harder and harder to get the new links. So, ones that are born later and later,

Â are going to have a harder time getting new links than the ones that we're born

Â earlier and could get one. So, you get a natural curvature here,

Â just due to that fact. Okay.

Â So, what do we get? well we can do the same thing for degrees

Â at time 200. And at degree time 200 we'll have a

Â slightly different curve. And you know, the ones that were born at

Â 100, time 100, now I've had a chance to gain more links.

Â Everybody's gained more links. how much they've gained more, depends on

Â what their birth date was. And now we've got the, the 200th, node

Â born has just formed it's initial 20 links, and so forth.

Â So, we're going to get different degree distributions at different times.

Â We'll have a different, a, this is expected degrees over time.

Â we'll have a different distribution as time evolves.

Â Okay, so what does this distribution look like?

Â Well, we can ask then what are the nodes for instance, that have degree less than

Â 35? What's the fraction of nodes?

Â So, let's go back to our time 100, and figure out what is a distribution

Â function. So, what is, you know, how many nodes?

Â What does F of d look like. So, how many nodes have degree less than

Â some d. So, what does F of 35 look like, alright?

Â We can do that kind of calculation. Okay, these are the nodes with degree

Â less than 35, are the ones are expected degree less than 35.

Â there's going to be some node, which has expected degree right at 35.

Â And basically the nodes with degree, expected degree less than 35, are

Â going to be the ones that were born afterwards, right?

Â So, these ones. So, if we want to figure out the fraction

Â of nodes that were born after, that have degree less than 35.

Â It's going to be the ones that were born after this, compared to the overall 100

Â that exist in the society so far. So, let's go through that calculation.

Â So, what we want is the actual amount that they have, the expected degree they

Â have. Remember that this is m1 plus log t over

Â i. So, here we have t is 100, m is 20.

Â So, these are the nodes for which this equation is less than 35.

Â Okay we've got that, and so if we solve that equation.

Â Then what do we end up with? We end up with, these are the i's for

Â which they're greater than, you know, you just take exponentials of both sides of

Â this, solve out for i. And what you'll get is these are the i's

Â for which, they were born after time 47.2.

Â So, basically the ones that were born 48 or later are going to have expected

Â degrees less than 35. The ones that were born 47 and before are

Â going to have expected degrees, bigger than 35.

Â So, this is 47.2, this point right here, and now if you want to ask, what's the

Â fraction of nodes that have degree less than 35?

Â [SOUND] Well, that's going to be the 47.2.

Â So, we're going to have 100 minus 47.2 over 100.

Â Right? So, this is going to work out to be,

Â 0.628. Right.

Â So, the, the fraction of nodes that are you know, rough, roughly 63%, are

Â going to be the ones that have degree less than 35.

Â So, some where between 62 and 63% in this case have expected degrees less than 25.

Â Okay? So, given this process we can figure out

Â how degrees grow. We can figure out a degree distribution.

Â And, more generally for any degree, the ones that have expected degree less than

Â d at some time. Are going to be the ones for which i is

Â bigger than same solution as we just saw for the 35 t times e at the exponent of

Â minus d minus m over m. Okay.

Â So, if you go ahead and solve that out, what we'll end up with is a degree

Â distribution at time t. That says the fraction of nodes that have

Â an expected degree of less than d, is given by this formula 1 minus e minus to

Â the minus d minus m over m. Okay, so very simple set of calculations,

Â little bit of algebra invovled, some exponentiation, some summation of series.

Â But basically what we're able to do is now calculate what the degree

Â distribution looks like for this growing for this growing random network.

Â this is a exponential, negative exponential, distribution.

Â So what we have is the distribution of expected degrees is such that d minus m

Â is exponentially distributed with a mean of m.

Â what about the actual degrees? Well, a good approximation for large t,

Â we're going to have to you know, so, so here we did the calculations of expected

Â degrees. Right?

Â So we, we, we can say how many each node expected, and we got a nice curve.

Â Well some, some nodes are going to happen to get more, some are going to happen to

Â get few. So, the actual degrees of the true nodes

Â are not going to follow that really nice, smooth curve.

Â They're going to bounce around a bit. And what that means is that the

Â distribution can be slightly different than this distribution of expected

Â degrees. it turns out that this distribution of

Â expected degrees, is a good approximation of what the actual distribution of

Â degrees will look like. Some will end up pushed above, some will

Â end up pushed below. But on average, you'll still expect

Â around 60%, less than 35 and time 100 and so forth.

Â So, if you go through and actually calculate the true distribution function.

Â It's going to look as t becomes large, it's going to be well approximated by

Â this negative exponential function. Now actually doing that, you need some

Â careful law of large numbers argument I'm not going to do that here.

Â I, in the book there's some discussion of how you go about doing that.

Â It's fairly straightforward. Again, it's just making sure that the

Â variation in the actual distribution isn't too large.

Â As you get to a large t, the noise in the system is going to be well approximated

Â by the mean. So, so here we've got a distribution of a

Â growing random network things have worked out well.

Â Okay, so in terms of where we're going next we'll look at slightly richer

Â growing models. We'll also talk about mean field

Â approximations, other kinds of ways of calculating these things.

Â So, we'll at at another set of growing random network models that will allow us

Â to get richer and richer, the degree distributions out.

Â