0:00

So, let's speak a little bit about how hard it is to compute a Nash equilibrium

in a normal form game. So, let's, let's start with a little

history. John von Neumann, one of the founders of modern game theory, when he

investigated zero-sum games, proved the existence of an equilibrium there. And he used what's known as the Brouwer fixed-point theorem for that. And that led directly to algorithms for computing equilibria in such games via linear programs. First was Dantzig's simplex algorithm, closely related to what in modern-day terms is called LP duality. It's an exponential procedure in the worst case, although in practice it's used widely. Also of note is Khachiyan's polynomial-time approach to solving linear programs, although in truth, in practice it's not used as widely; it's not as practical a procedure. And when you go beyond the zero-sum games,

so when John Nash proved the existence of equilibria for general-sum games, he

used the same fixed point theorem of Brouwer.

And that also informed a series of algorithms and they're noted there on the

slide. And we will be looking at two of them,

the Lemke-Howson algorithm and a much more recent algorithm due to Ryan Porter

and others. I will note that all of these are

exponential in the worst case, and I'll get back to that later.

So, let's start with the Lemke-Howson algorithm.

And let's start with a formulation of the Nash equilibrium for 2-player games.

It looks as follows. And this is a busy slide, but I'll walk you through it, and all will become clear.

At heart are two sets of variables, the s's and the r's.

The s's will capture the mixed strategies used by the two players, player 1 and player 2. s2k, for example, would be the probability with which player 2 plays action k in its mixed strategy. So, the s1's and the s2's capture the mixed strategies of the two players, player 1 and player 2.

The r's are what are called the slack variables.

And to understand their role, let's look at, for example, this equation right there. So, this applies to any action of player 1. For any action of player 1, we look at the value that it gives with respect to the strategy of the other player. And so, we look at all the actions available to player 2. We look at the pay-off to player 1, given that he is playing a particular action, j, against each action of player 2.

5:27

So, we're going to add this slack variable, as it's called, that will say,

this is how much player 1 is missing relative to their best response when

they're playing strategy j. Those are the slack variables.

And, so now, that will also give us the sense for this condition here.

So, the slack variables are always non-negative.

And in a Nash equilibrium, they will be exactly zero, except when you speak about

strategies that are actually played with zero probability by the player.

So, now we talk about the s's. The s's, as we said, capture the weight that each player gives to each of their actions in the mixed strategy they play. And so, when player 1 plays strategy j with non-zero probability, it better be the case that they are best responding, namely that the slack variable is zero. And when they're playing it with zero probability, you don't care what the slack variable is, because they're not playing that strategy at all. And you capture that by requiring that the product be zero. This is exactly the condition, and this is what makes it a linear complementarity problem.
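In symbols, the conditions just described for player 1 look roughly as follows. This is a reconstruction in generic notation, since the slide itself isn't reproduced here: u1 is player 1's pay-off function, A1 and A2 are the two action sets, and U1* stands for player 1's best-response value.

```latex
\sum_{k \in A_2} u_1(a_1^j, a_2^k)\, s_2^k + r_1^j = U_1^*
  \quad \text{for all } j \in A_1, \qquad
s_1^j \ge 0, \quad r_1^j \ge 0, \quad s_1^j \cdot r_1^j = 0 .
```

The symmetric conditions, with the roles of the two players exchanged, hold for player 2.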

So, I hope that's clear and you can see now similarly for player 2.

For player 2, we take each of their actions and we say: if they're going to play it with some probability, then we look at how well it does against whatever mixed strategy player 1 is going to play.

And we're going to look at the slack variable here, and again, we're going to

require that the product be zero. In other words, the probability that they

play j is nonzero just in case the slack variable is zero, okay?

So this is the nature of this mathematical optimization program.
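As a concrete sanity check, the complementarity conditions can be verified numerically for a candidate strategy profile. This is only an illustrative sketch; the game used below (matching pennies) and all variable names are my own choices, not from the slide.

```python
import numpy as np

# Illustrative game: matching pennies. U1[j, k] is player 1's pay-off
# when player 1 plays action j and player 2 plays action k.
U1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
U2 = -U1  # zero-sum here, but the check works for general-sum games too

s1 = np.array([0.5, 0.5])  # candidate mixed strategy for player 1
s2 = np.array([0.5, 0.5])  # candidate mixed strategy for player 2

# Expected pay-off of each pure action against the opponent's mix.
vals1 = U1 @ s2    # player 1's pay-off for each of its actions
vals2 = U2.T @ s1  # player 2's pay-off for each of its actions

# Slack variables: how much each action falls short of the best response.
r1 = vals1.max() - vals1
r2 = vals2.max() - vals2

# Complementarity: an action carries positive probability
# only if its slack is zero.
is_nash = np.allclose(s1 * r1, 0.0) and np.allclose(s2 * r2, 0.0)
print(is_nash)  # prints: True
```

For the uniform profile in matching pennies every action is a best response, so all slacks are zero and the complementarity products vanish.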

And, of course, I forgot to mention, but we want the s's to be a probability distribution. That is, they sum to one and they are all nonnegative. Alright. So, this is our linear

complementarity program. And now come Lemke and Howson, who suggest finding a solution in a particular way. And we won't go over it in full, but the flavor of it is to initialize the s's and the r's in a particular way. In fact, to artificially initialize them all to zero, and then, one by one, in what's called a pivoting procedure, take an s and an r in turn, alternating between the two, taking a variable out of the set that holds the current values and replacing it with its complementary variable. If it's an r, replacing it with an s, and if it's an s, replacing it with an r, until you reach an equilibrium. That's the general flavor of it, and in this lecture we won't go into more detail on the Lemke-Howson procedure.

But it is a procedure that looks very deeply at what a Nash equilibrium is, sets it up as a mathematical program, and then searches the space of variables in

an informed way. Let's now look at a very different

procedure, one that doesn't look in as much detail at the structure of equilibria, but compensates by performing heuristic search in a certain way. So let's look at it, and we'll look at it in two stages. The first step is to note that when you

fix the supports of a strategy profile, finding out whether there is a Nash

equilibrium with that support is an easy problem.

Remember that the support of a strategy consists of all the actions to which the player gives nonzero probability in that mixed strategy.

So, let's look at this formulation, and this will be limited to two players. Let's look at each player in turn, for example, player 1, and let's look at every action of that player, for example, a sub i.

We'll be looking for some mixed strategy profile, p, one mixed strategy for each of the players, that will give us a Nash equilibrium, namely where each player is best responding. And so, for all actions in the support that we're considering, we'd want the agent to be best responding. So, let's call the best-response value v sub i, just call it that number. Then, we want ai to in fact be a best response, and we want all other actions, the actions not in the support, to give a value that's no greater than the best-response value.

And we want that for each of the two players and each of their actions in the support, so that makes sense. And we want these to be probabilities, so we want the probabilities in the support to be nonzero.

We want the probabilities outside the support to be zero,

and we want it indeed to be a probability distribution.

This all makes sense. So, this is a linear program.

It can be solved in polynomial time; that is, theoretically there is a polynomial-time procedure. In fact, the procedures used in practice are not polynomial in the worst case, but are nonetheless effective.
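As a sketch of what this fixed-support feasibility program might look like in code, here is one way to set it up with SciPy's `linprog`. The function name, variable layout, and tolerance are illustrative assumptions of mine, not anything from the lecture.

```python
import numpy as np
from scipy.optimize import linprog

def support_feasible(U1, U2, supp1, supp2, eps=1e-9):
    """Test whether some Nash equilibrium has the given (nonempty) supports.

    U1, U2: pay-off matrices, U1[j, k] = player 1's pay-off at (j, k).
    supp1, supp2: lists of action indices each player may randomize over.
    Returns the mixed strategies (p1, p2) if one is found, else None.
    """
    m, n = U1.shape
    nv = m + n + 2                        # variables: p1, p2, v1, v2
    A_eq, b_eq, A_ub, b_ub = [], [], [], []

    # Each mixed strategy sums to one.
    row = np.zeros(nv); row[:m] = 1.0; A_eq.append(row); b_eq.append(1.0)
    row = np.zeros(nv); row[m:m + n] = 1.0; A_eq.append(row); b_eq.append(1.0)

    for j in range(m):                    # player 1's actions
        row = np.zeros(nv)
        row[m:m + n] = U1[j]; row[m + n] = -1.0   # U1[j] @ p2 - v1
        if j in supp1:
            A_eq.append(row); b_eq.append(0.0)    # exactly the best value
        else:
            A_ub.append(row); b_ub.append(0.0)    # no better than v1
            z = np.zeros(nv); z[j] = 1.0          # off-support prob. is zero
            A_eq.append(z); b_eq.append(0.0)

    for k in range(n):                    # player 2's actions
        row = np.zeros(nv)
        row[:m] = U2[:, k]; row[m + n + 1] = -1.0  # p1 @ U2[:, k] - v2
        if k in supp2:
            A_eq.append(row); b_eq.append(0.0)
        else:
            A_ub.append(row); b_ub.append(0.0)
            z = np.zeros(nv); z[m + k] = 1.0
            A_eq.append(z); b_eq.append(0.0)

    bounds = [(0, None)] * (m + n) + [(None, None)] * 2
    res = linprog(c=np.zeros(nv),
                  A_ub=np.array(A_ub) if A_ub else None,
                  b_ub=np.array(b_ub) if b_ub else None,
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=bounds)
    if not res.success:
        return None
    p1, p2 = res.x[:m], res.x[m:m + n]
    # Strict positivity on the support is required but not enforced by the
    # LP itself, so verify it after the fact.
    if min(p1[list(supp1)].min(), p2[list(supp2)].min()) < eps:
        return None
    return p1, p2
```

For matching pennies, the full supports yield the uniform strategies, while any pure-support pair is infeasible, exactly as the discussion above suggests.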

By the way, notice that we actually did use the assumption that we're fixing the

support here. Superficially, you might look at it and say, oh, I could do the same thing but simply ignore the support part. Where are we really using it? Well, we're using it in the assumption that when we're best responding inside the support and don't have a profitable deviation, we're actually playing these pi's with positive probability. Because if we're playing the remaining strategies with zero probability, it in fact doesn't matter whether we're best responding with them or not. And so, this is where this assumption is

hidden. So, now we know that when we fix the

support we can solve the question efficiently.

The catch is the fact that there's an exponential number of supports to explore.

And this is the second part: we now need to start exploring the sets of supports. And I won't go into detail about how we

do it. But the basic idea is the following. We

will bias the search toward supports that are close in size to one another; that is, we will not start by considering one agent randomizing among only two strategies while the other agent is looking at 17 strategies. We'll look at strategy profiles whose supports are similar in size. We'll also start with small supports and gradually move to larger supports.
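The ordering just described, balanced support sizes first, and smaller totals before larger ones, can be sketched as a simple generator. This is my own illustrative rendering of the idea, not code from the lecture.

```python
from itertools import combinations, product

def support_profiles(n1, n2):
    """Yield candidate support pairs for a 2-player game with n1 and n2
    actions, trying balanced size pairs first and, among equally balanced
    pairs, smaller total sizes before larger ones.
    """
    sizes = sorted(product(range(1, n1 + 1), range(1, n2 + 1)),
                   key=lambda s: (abs(s[0] - s[1]), s[0] + s[1]))
    for k1, k2 in sizes:
        for supp1 in combinations(range(n1), k1):
            for supp2 in combinations(range(n2), k2):
                yield supp1, supp2

profiles = list(support_profiles(2, 2))
print(profiles[0])  # prints: ((0,), (0,)) -- smallest, balanced pair first
```

Each candidate pair would then be fed to the fixed-support feasibility check described earlier; the first feasible pair found yields an equilibrium.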

If we do that, and we use another trick called conditional domination, that is, we look at certain things we can ignore along the way, then it turns out that although the procedure is exponential in the worst case, as is Lemke-Howson, in fact it performs quite well in practice. And in fact better than essentially all other procedures tried, including Lemke-Howson. These procedures do have exponential worst

case and so the question is, can we do better?

Are there procedures that are less than exponential in the worst case?

And that takes us from the realm of algorithms to the realm of complexity analysis. So, let's first remind ourselves a little

bit about what complexity analysis looks like.

We're looking at whole classes of problems, such as the class of all games and the problem of determining a sample Nash equilibrium of those games. And we're looking at how hard that class is as a whole. And

so, here is a small part of the complexity hierarchy. The class P, as it's known, is the class of problems for which a polynomial-time solution is known. The class NP is the class of problems for which a solution can be verified in polynomial time, but not necessarily found in polynomial time. The NP-complete problems are the hardest among all problems in NP, that is, the problems to which all NP problems can be reduced. And perhaps the biggest

open problem in theoretical computer science is the question of whether NP is different from P, widely believed to be so, but it has not been proved.

So, this is the general background we need to keep in mind.

And now, we can ask, well, where does the problem of finding a Nash

equilibrium reside in P, in NP? What can we say?

Well, first of all, strictly speaking, we can't quite speak about it being in P or

NP because we know from Nash's theorem that a Nash equilibrium always exists.

So, the question of whether there exists a Nash

yes. So, we need to look at it a little

differently. One way to look at it differently is to ask for a Nash equilibrium with specific kinds of properties. So, for example, we can ask: does the game have a unique Nash equilibrium? Or does there exist an equilibrium that's strictly Pareto efficient? Or is there a Nash equilibrium that guarantees a given player some minimum pay-off? Or can we guarantee some minimum social welfare in a Nash equilibrium? Or does there exist a Nash equilibrium that includes some particular action of a player in its support? Or, conversely, one that excludes it? All of these and others are examples of

questions that are provably NP-hard. Okay. So, that tells us something about

the hardness. But we can still ask about just finding a sample Nash

equilibrium. How hard is that? So, we've seen the

algorithms. People have tried very hard to find

algorithms computing a sample in Nash equilibrium.

and it does seem hard. The question is, can we somehow capture

that formally within the complexity hierarchy? And for that, we need

to introduce some new concepts.

The essential concept is that of a new class of problems called PPAD, for Polynomial Parity Arguments on Directed graphs, introduced by Christos Papadimitriou in 1994. We won't go into detail, but just so you know the chronology: PPAD is a specialization of a class called TFNP, which is in turn a specialization of a class called FNP.

Going into detail here is beyond the scope of what we want to speak

about. But it does help us now position the

complexity of finding a sample Nash equilibrium in the complexity hierarchy.

And again, we have the class of polynomial-time problems, and the class of problems whose solutions can be verified in polynomial time, with the NP-complete problems being the

hardest among them. And given that, PPAD turns out to reside

somewhere within this class. Now again, we don't know whether this entire hierarchy collapses and all the classes become one and the same. It is widely believed that it does not, but a proof doesn't exist. However, we do know that PPAD lies someplace in between P and NP.

Now, what does that have to do with the problem of computing a Nash equilibrium?

Well, that's where the following theorems come in.

Originally, it was shown that the problem of computing a Nash equilibrium is complete for this class PPAD; that is, it's the hardest among all problems in that class. This was initially proved for games with four players, then for games with three or more players, and then, finally, in '06, for all classes of games. And so, we widely believe that the problem is not solvable in polynomial time. We cannot prove it, but we do know where it resides within the complexity hierarchy that we are familiar with.