0:00

Hi. In this lecture we're going to talk about replicator dynamics, and what we

want to do is we want to talk about it in the context of learning. And the idea here

again is pretty straightforward. What we imagine is that there's some set of types

out there and these types are actions or they're strategies. And each type has a

payoff associated with it. That's how well that type is doing. When we look at them,

we think boy, these type 1's are doing really well. These type 7's are not doing

so well. And then there's also a proportion of each type. So maybe ten

percent of people are type one and 30 percent of people are type two, six

percent of people are type three and so on. So when you think about how people

learn, how do we do it? Well, we've talked about a couple things in this class. One is we've talked about how people just copy other people, and the

reason you might do that is because you might think that they're doing something

worthwhile. So in the standing ovation model, we just had people copying what other people do, and in our conformity models, people just copy what other people do. Well, if you copy, what's going to happen is you're gonna copy in proportion to what other people are doing. Now another thing you might do is you might hill

climb, that was one of our characteristics. If you hill climb, what

you're going to do is look and see which actions are paying off well. So when we

think about this sort of an environment, there's a whole population of people, existing in different proportions and getting different payoffs, we want some

way of capturing the dynamics of that process. And so what we're going to do is

we're going to introduce this model called replicator dynamics that's one way of

thinking about how that dynamic unfolds. Now let's again suppose you were rational. If you're rational, you look out there and you see a bunch of types. You see that

there's different strategies people are playing: strategy one, strategy two and

strategy three. This one has a payoff of five, this one has a payoff of four, this

one has a payoff of three. You're just going to say, I'm going to choose strategy

one. I look out there and the strategy one people are doing the best. That's the rational model. You could also have a more sociological model; this is a rule-based

model. You say, 'I'm just going to copy the next person I meet figuring that if

they're doing this they must have chosen it for some good reason and I'm going to

pick in proportion to other people.' So now if I look at strategy one, strategy

two, strategy three, I can say twenty percent of people are using strategy one,

70 percent are using strategy two, ten percent using strategy three. I'm more

likely to choose strategy two because I'm more likely to bump into someone using

strategy two. So there's two ways you could think about how people might choose

what to do. One would be to really do a detailed analysis of which actions seem to

be paying off the best, be rational and pick that action. Or another thing you could do is you could just sort of copy other people. Question is, how do we do it? The idea is we're gonna wanna put a weight on each possible action, and

we want that weight to include both the payoff, which is this thing pi(i), and the proportion, which becomes the probability of i, P(i). One thing we could do is

we could add those things up. We could say the weight is just the payoff plus the

proportion. Another thing we could do is make the weight the product: the payoff times the proportion. We could do either one of these. What we're gonna do is we're gonna use this second one. The weight is gonna be the probability that the strategy's being used times its payoff. Why? Well, here's why. Suppose that you

had something that had a probability equal to zero. So nobody's using this strategy.
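To make this concrete, here's a quick sketch (hypothetical numbers of my own, not from the lecture) comparing the two candidate weighting schemes for a strategy that nobody currently plays:

```python
# Compare the two candidate weighting schemes when a strategy's
# current share of the population is zero.

payoff = 5.0       # assume a high payoff for this strategy
proportion = 0.0   # nobody is using it

additive_weight = payoff + proportion   # scheme 1: payoff plus proportion
product_weight = payoff * proportion    # scheme 2: payoff times proportion

print(additive_weight)  # 5.0 -- the unused strategy would still attract players
print(product_weight)   # 0.0 -- a strategy nobody plays can never be copied
```

The product scheme is the one the lecture keeps, precisely because it sends the weight of an unseen strategy to zero.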

Well then there would be no way of seeing it, and so it would be very unlikely that you'd adopt it. But if you think about this model where the weight on an action is equal to the payoff plus the probability, then if the payoff is really high, even if the probability is zero, people would use it. Well, again, we're thinking of a population of people copying from other people. If no one is using a strategy, you couldn't possibly see it, and so you couldn't possibly use it. We're not going to have any room for doing anything new. So if the probability is zero, we're just going to assume that no one will ever think of it. So that wipes out the additive model. We're going to assume the weight is the product of the payoff of the strategy and its proportion. So what that

means is we get a somewhat complicated procedure for figuring out how many people

use the strategy in the next period, and this is going to be the replicator equation. So here's the idea. The probability that you play strategy i in period t+1 is just the ratio of its weight to the total weight of all the strategies. Cuz remember

the weight is just the probability that somebody plays the strategy times its

payoff. And on the bottom, we're just summing over the weights of all the different actions or strategies. So the probability you play something in the next period, the probability someone's of that particular type, is just gonna be its

relative weight. Okay, so let's do an example. We've got three strategies with payoffs two, four, and five, and they exist in proportions a third, a

sixth and a half. So what we want to do now is figure out the weight of each of

these strategies. So the weight on strategy one is its proportion, which is

one-third times its payoff, so that's two-thirds. The weight for strategy two is

its proportion, which is 1/6th times its payoff, which is four, which is also

two-thirds. And the weight on strategy three is its proportion, which is

one-half, times its payoff, which is five, which is gonna be five halves. We can put everything over six if we want. So this is four over six. This is

four over six. And this is going to be fifteen over six. So if we add up the

total weights, what we're going to get is 23 over six. So now we want to figure out what's the proportion using strategy one in the next period, period t+1. That's just going to be 4/6 over 23/6, which is four over 23. The probability of playing strategy two is also four over 23, because it has the same weight. And the probability of playing strategy three is fifteen over six divided by 23 over six, which is fifteen over 23. You notice if we add that up, four plus four plus fifteen gives us 23

over 23. So what we do is we start out with a population, a third strategy one, a sixth strategy two, a half strategy three, and we end up with four twenty-thirds strategy one, four twenty-thirds strategy two, and fifteen twenty-thirds strategy three. That's how replicator dynamics works: it tells us how this population

moves over time as a function of the payoffs and the proportions. So here's

what we want to do. We want to apply this to games. So here's a very simple game. We

can think of this as the shake-bow game where shaking has a higher payoff and we

can ask, 'How do the dynamics change?' So what we're going to do is we're going to

assume that there's some population of people, some are shakers, some are bowers,

and we'll see how that population learns. Alright let's get started. So let's

suppose we start out with one-half shakers and one-half bowers; that's our original population. Now we want to know what's the payoff. So these are our proportions. Well, the payoff: if you're a shaker, half of the time you're going to meet a shaker and half of the time you're going to meet a bower. If you meet a shaker, you're going to get a payoff of two; if you meet a bower, you're going to get a payoff of zero. So your expected payoff is one. If you're a bower, half the time you're

gonna meet a shaker, you're gonna get a payoff of zero. Half the time you're gonna meet a bower and get a payoff of one. So your expected payoff is gonna be a half. So now we

just have to figure out the weight for each of these strategies. So the weight on

shaking is just gonna be the proportion, which is one half, times the payoff, which is one, so that weight is gonna be one half. The weight on bowing is the proportion of bowers, which is one half, times the payoff, which is one half, which is gonna be one fourth. So now, if we want to figure out how many shakers and bowers there are in the next period, what's gonna happen is the probability of shaking is gonna equal one half, the weight of shakers, over the weight of shakers plus the weight of bowers, one half plus one fourth. That's just two thirds. And the probability that someone's a bower is gonna be one-fourth over one-half plus one-fourth, which is one-third. So what we see is we started out with equal numbers of shakers

and bowers, and now we're moving towards more shakers. But that makes sense, because

shakers get a higher payoff. Now if we ran this a whole bunch of times and used

replicator dynamics, and we started out with equal numbers of shakers and bowers, eventually we'd end up with all shakers. So here's an interesting thing. We thought about: how do we model people? We said, well, we should model people as

rational. If we thought of rational people, we'd say, well then rational

people in this model would choose 2-2. So what replicator dynamics does, it gives us

another way to think about what you're gonna get in the game. It says, let's

assume a big population of people, and let's assume initially that there's equal

numbers of each action. So there's equal numbers of shakers, equal numbers of

bowers. And in this case, let the population learn according to replicator dynamics and see what happens.
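As a sketch of that process, here's a small simulation (my own code; the payoffs are the ones assumed in the lecture: shake-shake pays two, bow-bow pays one, and a mismatch pays zero):

```python
def replicator_step(props, payoff_matrix):
    """One replicator update: each strategy's weight is its current
    proportion times its expected payoff; normalize weights to sum to one."""
    n = len(props)
    expected = [sum(payoff_matrix[i][j] * props[j] for j in range(n))
                for i in range(n)]
    weights = [props[i] * expected[i] for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

# Shake-bow payoffs: rows are my strategy, columns are who I meet.
payoffs = [[2, 0],   # shake: 2 against a shaker, 0 against a bower
           [0, 1]]   # bow:   0 against a shaker, 1 against a bower

props = [0.5, 0.5]                 # half shakers, half bowers
props = replicator_step(props, payoffs)
print(props)                       # about [0.667, 0.333] -- the lecture's 2/3, 1/3

for _ in range(50):                # keep learning
    props = replicator_step(props, payoffs)
print(round(props[0], 3))          # 1.0 -- essentially all shakers
```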

Replicator dynamics would lead us to 2-2. So now if you ask, what's our model of people, you say, 'We have two different models. One model's rational actors; rational actors are going to choose 2-2. Another model is people use

this simple rule, this learning rule called replicator dynamics. And if we use this learning rule called replicator dynamics, then unless we start out with a whole bunch of bowers, we're probably also going to end up with everybody shaking.' So

that's great, it gives us another motivation for sort of figuring out why

we're going to get the outcomes we're going to get. Well now we can ask the

question though: does replicator dynamics always give us the same thing that we get

if we thought about sort of, super smart people playing the game. Well, let's see.

Here's another game and this is called the SUV/Compact game. So here's [laugh] how it

works. You can either drive an SUV or drive a compact car. If you drive an SUV,

your payoff is just gonna be two. Because you just drive your SUV, listen to the

radio, it doesn't matter. If you drive a compact car, and you run into someone

who's driving an SUV, your payoff is gonna be zero. And I don't mean physically run

into them, but I mean if you're just driving along and you see one, somebody else

has an SUV, there's two things going on. One is, you probably can't see around the

SUV, so that's bad. And also, you're gonna feel a little bit unsafe, so that's also

bad. But, if you're driving a compact car, and the other person is driving a compact

car, your payoff is gonna be three, 'cause you're both getting better gas mileage,

you can see around the other car, you feel safe, everybody wins. So if you're

thinking about rational people playing this game, you think, okay, what would you

want to do? You might think, well look, 3-3's got the higher payoff, and

it's an equilibrium because if we're both driving compacts, then we have no reason

to switch. You'd think, that's what you get. Well let's hear what we get from

replicator dynamics. So let's start out with again, half the people driving SUVs

and half the people driving compacts. Let's figure out the weight on SUVs. Well, half the people are driving SUVs. And your payoff if

you're driving an SUV regardless of who you meet is two. So, the weight on SUVs is

just gonna be one. What about the weight on compacts? Half the people drive

compacts. Now, what's their payoff? If you're driving a compact, half the time you meet someone with an SUV, and that gives you a payoff of zero. And half the time you meet someone driving a compact, and that gives you a payoff of three. So your expected payoff is gonna be three halves, and the weight on compacts is one half times three halves, which is three fourths. So the weight on SUVs is one, and the weight on compacts is three fourths.
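Here's a quick sketch checking those weights (my own code, using the payoffs assumed above: an SUV driver always gets two; a compact driver gets zero against an SUV and three against another compact):

```python
p_suv, p_compact = 0.5, 0.5                 # starting proportions

payoff_suv = 2.0                            # SUVs pay two regardless of who you meet
payoff_compact = p_suv * 0 + p_compact * 3  # expected payoff = 3/2

w_suv = p_suv * payoff_suv                  # 1/2 * 2   = 1
w_compact = p_compact * payoff_compact      # 1/2 * 3/2 = 3/4

total = w_suv + w_compact
print(w_suv / total)      # about 0.571 -- the 4/7 next-period share of SUVs
print(w_compact / total)  # about 0.429 -- the 3/7 share of compacts
```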

So now I wanna ask what's the probability that someone drives an SUV in the next

period. That's gonna be one over one plus three fourths, which is going to be four sevenths. And the probability that somebody drives a compact, that's

gonna be three fourths over one plus three fourths, which is three sevenths. So, what

we see is we're gonna see a drift toward SUV's. And so, what we're gonna get in

this game is that people drive SUVs. When we put replicator dynamics on this game, we don't get 3-3 as an outcome; we're more likely to get 2-2 as

an outcome. And, what's gonna happen is evolution leads us to something that's

sub-optimal. So now we've learned something interesting, that these

replicator dynamics, this evolution of strategies leads us not to the optimal

thing which is 3-3 but leads us to 2-2. There's actually a book written on this

that's called High and Mighty by Keith Bradsher, where he talks about, why is it that people drive these big SUVs if it doesn't make sense? Bradsher makes an argument that it's really the evolution of choices that has caused us

to all be driving SUVs when we'd collectively be better off if we were

driving the compacts. And it's evidenced by this picture here, you can see that in

the little car, you're sort of frightened by the big car and you can't see around

it, and so as a result, the dynamics have led us towards big cars, even though we collectively would be better off if we were all driving smaller cars. Alright,

what have we learned in this lecture? We've learned that one way to think about

what people decide to do is to construct a model based on replicator dynamics. And replicator dynamics captures two fundamental social processes. One is the fact that people are fairly rational: we take actions with high payoffs. Another thing that people tend to do is we tend to copy other people. If we write down a model where people do both, where we sort of try to do the thing that's best but also copy other people, combining those, we copy people who are doing well. That's the way we might think about how a population of people might go. Now we saw in some cases, like the shake-bow game, that's going to lead

us to make the optimum choice. But then we saw other games, like the SUV compact car

game, where in fact it led us to choose the SUV, which is not the thing that we'd choose if we were rational and sat back and said, what's the best choice here? So replicator dynamics is a really interesting way to model learning, and it gives us sort of surprising insights into what is likely to happen in some games, different insights than we get if we assume people were quote-unquote rational. Alright, thanks.
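As a closing recap, here's a minimal sketch (my own code, not from the lecture) of one replicator step that reproduces the three-strategy example from earlier, where the payoffs were two, four, and five and the starting proportions were a third, a sixth, and a half:

```python
from fractions import Fraction

def replicator_step(proportions, payoffs):
    """Next-period proportions: each strategy's weight is its current
    proportion times its payoff, divided by the total weight."""
    weights = [p * pay for p, pay in zip(proportions, payoffs)]
    total = sum(weights)
    return [w / total for w in weights]

props = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 2)]
payoffs = [2, 4, 5]

print(replicator_step(props, payoffs))
# [Fraction(4, 23), Fraction(4, 23), Fraction(15, 23)] -- the lecture's 4/23, 4/23, 15/23
```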