
>> In this module, we will be walking through a review of linear optimization using a hedging example.

We are going to see how there are two kinds of optimization problems, something called a primal optimization problem and something called a dual optimization problem, and how they are related. We will also introduce the idea of Lagrangian relaxation.

Here's the hedging problem that we are interested in. I've got d assets.

So take d, just to fix ideas, take d to be 3 or 4.

But some number of assets that are there in the market.

The price of these assets at time 0 is a price vector in R^d: it has d components, where every component tells me the price of that particular asset.

At time t equal to 1, the market outcomes are uncertain. I don't know which state the market is going to be in; it's going to be in one of m possible states.

So here's the situation: at time t equal to zero I'm in state zero, and I could go to one of m different possible states, numbered from 1 all the way down to m.

What do I mean by a state?

For my purposes, a state is simply telling me what the different prices can be. So I'm going to characterize every state by the prices of the assets in that particular state, and I can do it in two different ways. I could either tell you the price of all the assets in any given state, or I can tell you the price of a given asset in all possible states, and we'll flip between these two ideas as we go through.

We'll use the power of matrices to see what this means. Sometimes it's easier to represent it one way, sometimes the other way, and understanding it both ways gives us insight into which one is more useful for the particular application that I'm looking at.

So, to motivate that, I'm going to define something called Sj.

S sub j will be a column vector, and it will tell me the price of asset j in all possible states. So it's s1j, s2j, all the way down to smj. j is fixed, it's asset j, and the state index runs from 1 to m.

Now how do I represent all the assets in the market? I'm just going to take these columns and stack them side by side.

S1 would refer to the column corresponding to asset one, S2 to asset two, and so on up to Sd. If I write them out in gory detail, you end up getting this matrix. The matrix has m rows, because there are m states, and it's got d columns, because there are d assets.

And what's going on here? Every row tells you the price of all the assets in a particular state. So what I've circled here are the prices in state 1: s11, s12, all the way up to s1d. Generically, row m is going to be sm1, sm2, all the way to smd.

So every row tells me what happens to all the assets in a given state, and every column tells me what happens to a particular asset in all the different states.
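To fix ideas, here is what this price matrix might look like in code. This is a made-up toy market; the numbers, and the choice of m = 3 states and d = 2 assets, are purely illustrative.

```python
import numpy as np

# Hypothetical toy market: m = 3 states, d = 2 assets.
# Entry S[i, j] is the price of asset j+1 in state i+1.
S = np.array([
    [1.0, 2.0],   # state 1: prices of assets 1 and 2
    [1.0, 1.5],   # state 2
    [1.0, 0.5],   # state 3
])

print(S[0, :])   # a row: prices of all the assets in state 1
print(S[:, 1])   # a column: the vector S2, asset 2 in every state
```

Slicing a row gives the state-by-state view, and slicing a column gives the asset-by-asset view, which is exactly the flip between the two representations described above.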

Alright, so that's how I'm going to describe my market.

At time 0 I know the prices. At time 1, which is when the market opens again, I don't know the prices; it's uncertain, it's going to be one of m possible states. But I know that if the state is given to me, I know what the prices are.

Now you might think that this is a very simple representation of the market, and

indeed it is. But it turns out that even in the most modern methods of risk management, like value at risk and conditional value at risk, people represent what happens to the market by a model which looks very much like this.

Except instead of talking about states,

we'll talk about simulations. I'm going to simulate returns and I'm

going to say, depending on all of these different scenarios, what happens?

But essentially the main ideas are going to be captured by this simple toy model.

Alright, I know what happens to the assets.

Now, what I want to do with these assets is to hedge an obligation.

I'm going to walk through what does hedging mean.

So, I have an obligation. What is an obligation?

It's a vector x in R^m. Why m?

Because the obligation depends on the state.

In a good state I might have to pay more; in a bad state I might have to pay less.

So depending upon which state occurs, I'm going to pay an obligation xi, if this

state i occurs. And what I want to do, is I want to buy a

portfolio now, I want to buy a certain number of shares of the assets right now,

in order to have enough money to pay my obligation.

So, I'm going to choose a portfolio theta. Theta-1 through theta-d are going to be

the number of shares that I'm going to purchase of each of these assets.

I'm going to allow for the possibility of short selling, so thetas could be negative, or I can go long, which is when thetas are positive.

And what I want to do is choose this portfolio so I can hedge the obligation that I'm interested in hedging. So I'll step you through what hedging means, and then we'll get to the linear optimization problem that we end up with.

Okay! So at time 0, I'm going to buy a position theta in R^d. Why d?

Because there are d different assets. So theta j is the number of shares of asset j that I purchase, where j goes from 1 through d.

What's the cost of this purchase? The cost of the position theta is simply the price of every asset times the number of shares of that asset that I purchase. So it's pj times theta j, summed from j equal to 1 through d. It's the inner product of the vector p,

with the vector theta. And we know that inner products are

nothing but p transpose times theta. What happens at time t equal to 1?

So, if a state i occurs, then I'm going to liquidate my position.

And when I liquidate positions, I'm going to sell the assets.

And what's going to happen? In state i, the price of asset j is sij, and I hold theta j shares of this asset, so by selling asset j I get sij times theta j. But I'm going to liquidate the entire portfolio, so I'm going to sum over j from 1 to d, and that's the amount of money, the pay-off, that I'm going to get in that state.

It's just sij times theta j, summed from j equal to 1 through d. That gives me yi.

And if you just think about it for a moment, stack up all the pay-offs as a vector. If I call the vector y, with components y1, y2, all the way up to ym, the pay-offs in the m states, then this is nothing but the matrix S times the vector theta. Just to remind you, this is my matrix S,

it has prices for every state as rows. So the payoff in the first state would be

this row times the vector. And the second state would be this row

times the vector. And that's exactly what is being done

here. Sij, as j goes from 1 through D, is a row.

You take the i'th row, multiply it by the vector, you get the payoff in the i'th

state. And therefore, the vector y is simply the

matrix times theta.

Now I want to look at this in a different way. I want to think of this not as multiplying row by row, but column by column. So, instead of looking at this matrix S row by row as we have done so far, I'm going to view it as columns. I've got the column for the first asset, the column for the second asset, up to the column for the d-th asset, and that's going to multiply theta-1, theta-2, up to theta-d. And you can see that this multiplication

is nothing but this row times this column, so you'll get theta j times Sj, summed over j. And on the slide, that should not be an n but a d: the sum goes from j equals 1 through d.

Interpreting it this way, you end up

getting a different interpretation of what is going on.

Now, the pay-offs ys are nothing but linear combinations of the columns.
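These two readings of the same product y = S theta can be checked numerically. The matrix and portfolio below are made-up illustrative numbers.

```python
import numpy as np

# Toy market: m = 3 states, d = 2 assets (hypothetical numbers).
S = np.array([[1.0, 2.0],
              [1.0, 1.5],
              [1.0, 0.5]])
theta = np.array([3.0, -1.0])   # long 3 shares of asset 1, short 1 of asset 2

# Row reading: y_i is the i-th row of S times theta.
y_rows = np.array([S[i, :] @ theta for i in range(S.shape[0])])
# Column reading: y is a linear combination of the columns S1, S2.
y_cols = theta[0] * S[:, 0] + theta[1] * S[:, 1]

print(y_rows)                         # payoff in each state
print(np.allclose(y_rows, y_cols))    # same vector, two readings
print(np.allclose(y_rows, S @ theta))
```

Both loops compute the same payoff vector; which reading is more convenient depends on whether you are reasoning state by state or asset by asset.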

So I can generate a particular pay-off only if it's in the range of the matrix S. Remember, we had introduced this concept in the module on matrices: a vector y belongs to the range of S if it is one of the vectors S theta, where theta is in R^d.

And therefore, y belongs to the range of S.

And remember, we had also introduced the notion of rank.

We had said, that rank tells me how rich that space is.

How many different vectors can I generate in the range?

So if the rank of the matrix S is equal to m, meaning all possible pay-offs

can be generated, then I can hedge everything.

If on the other hand, the rank of the matrix s, is less than m, that means there

are certain pay-offs. There are certain pay-off vectors that can

not be generated, because they cannot be produced by taking the matrix S and multiplying it by a theta. So that's the concept that's going to play

a role in the course. We're going to talk about complete markets and incomplete

markets, and that has to do with the rank of this matrix.
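In this toy setting, a quick way to probe that idea is to compute the rank of S numerically. The matrices below are hypothetical examples, not data from the lecture.

```python
import numpy as np

# m = 3 states. With 3 suitably independent assets, rank(S) = m: every
# payoff vector in R^3 equals S @ theta for some theta (complete market).
S_complete = np.array([[1.0, 2.0, 0.0],
                       [1.0, 1.5, 1.0],
                       [1.0, 0.5, 2.0]])
# With only 2 assets, rank(S) = 2 < m: some payoffs cannot be generated.
S_incomplete = S_complete[:, :2]

print(np.linalg.matrix_rank(S_complete))     # rank m: complete
print(np.linalg.matrix_rank(S_incomplete))   # rank < m: incomplete
```

When the rank is less than m, the range of S is a proper subspace of R^m, so some obligations cannot be replicated exactly.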

Before we get there, here's a simpler notion.

We'll say that a pay-off y hedges x if y is greater than or equal to x component by component: the pay-off that I've generated using my portfolio is at least the pay-off that I need to give at time 1.

Alright, now comes our first optimization

problem. What's the optimization problem?

I want to minimize the price of the portfolio such that it hedges my

obligation. I want to minimize p transpose theta, such

that S theta is greater than or equal to x. And, what are some features of this

optimization problem? The objective!

What I'm trying to maximize or minimize is a linear function of theta. The constraints! The constraints on which thetas I can choose are again linear: S theta must be greater than or equal to x, a linear inequality.

So any optimization problem, which has a linear objective function, either

minimization or maximization it doesn't matter, and all the constraints are either

linear inequalities, as is the case here, or linear equalities.

We will call that problem, a linear optimization problem, or a linear program.
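For concreteness, here is how this hedging linear program could be solved numerically. All the prices, states, and the obligation are made-up numbers, and scipy's general-purpose linprog routine is used as the solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hedging LP (toy data): minimize p^T theta subject to S theta >= x,
# with theta free in sign (short selling allowed).
p = np.array([1.0, 1.2])                  # time-0 asset prices, d = 2
S = np.array([[1.0, 2.0],
              [1.0, 1.5],
              [1.0, 0.5]])                # payoff matrix, m = 3 states
x = np.array([2.0, 1.5, 1.0])             # obligation in each state

# linprog expects A_ub @ theta <= b_ub, so rewrite S theta >= x
# as -S theta <= -x; bounds=(None, None) makes theta free.
res = linprog(c=p, A_ub=-S, b_ub=-x, bounds=[(None, None)] * 2)

print(res.x)                           # cheapest hedging portfolio
print(res.fun)                         # its cost p^T theta
print(np.all(S @ res.x >= x - 1e-9))   # payoff dominates the obligation
```

The sign flip when passing the constraints is purely a solver convention; the problem being solved is exactly the minimize-cost-subject-to-hedging program above.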

It turns out that linear programs are very rich, there's a rich theory about them,

and you can do a lot of interesting things with them.

You can model lots of problems, you can solve them very efficiently, you can get a

lot of interpretation out of them. So the one thing that we're going to focus

on in linear optimization, and the interpretation of linear optimization is

the notion of duality. And what do I mean by the notion of

duality? For every linear program, I can write

another linear program which is intimately connected to it, and this connection is

called duality. So I've got a linear program here: minimize over x, c transpose x, subject to Ax greater than or equal to b. Here's another linear program: maximize over u, b transpose u, subject to A transpose u equal to c and u greater than or equal to 0.

Min goes to max. Now, for the purposes of this course, you

will not be responsible for how I generated the dual linear program from the

primal linear program, or the second linear program from the first linear

program, we will give you that in the course.

The only thing that you're going to be responsible for is, to understand the

relationship, and we'll emphasize this again during the course.

Here are some of the interesting results that come from this duality concept. The first thing is something called weak

duality. What it says is something about this minimization problem. The picture that I want you to keep in mind is this: here's my value P. For all feasible xs, all xs such that Ax is greater than or equal to b, I get values on this side, at or above P. Why at or above? Because I'm trying to minimize, so the lowest possible value is P.

I've got another number D, which is the value of this second linear program down here. For all us that are feasible, meaning they satisfy A transpose u equal to c and u greater than or equal to 0, I'll get values that are less than this D. Why?

Because this is a maximization problem. The largest possible thing that I can get

is B transpose U, is going to be equal to D.

The first theorem says, that in fact, this picture that I'm drawing is correct.

That P is going to be greater than D. There's no reason for it to be that way,

they could have crossed. But the nice thing is, if you construct

the dual linear program correctly, and in the next slide I'm going to show you a

simple example. Again let me emphasize you're not

responsible for knowing it. Just walking through the exercise, and

understanding how I'm going to use the duality.

So the primal and the dual are intimately connected; the optimum value of the primal

P is greater than equal to the optimum value of the dual D.

And because this is true, you end up getting a very useful chain of inequalities. We know that c transpose x is greater than or equal to P. Why? Because I'm trying to minimize c transpose x, so any feasible x, anything that satisfies the inequality Ax greater than or equal to b, gives a value greater than or equal to P. The second piece is also true: D is greater than or equal to b transpose u, because I'm trying to maximize b transpose u. And the middle part comes from weak duality.

So now this gives you a very interesting tool: if I can find an x and a u such that c transpose x is close to b transpose u, then I know that x must be very close to optimal for the primal, and u must be very close to optimal for the dual. Why? Because P is greater than or equal to D, so if these two values are very close, it must be that P is also very close to c transpose x, which means x is near-optimal. Similarly, b transpose u must be close to D, which means u is near-optimal. You can go one step further,

and say that when either P, or D, is finite, either the primal value is finite,

or the dual value is finite, then in fact, they must be equal.

And finally, the reason why we call them dual, is because if you go from the primal

to the dual, and from, you take the dual of the dual, you get back the primal, and

that's why they're called dual linear programs.

Going a little bit further, here's another pair of primal-dual linear programs that we'll be using in the course. Minimize over x, c transpose x, subject to Ax equal to b; it's equal to: maximize over u, b transpose u, subject to A transpose u equal to c. And this equality, I'm putting it there to

emphasize the fact that there is strong duality between them.

But keep in mind that this equality holds only if you can show that either this one, P, or this one, D, is finite. So if P or D is not equal to plus infinity or minus infinity, then these two values are the same, and in that sense they are equal.
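To make this concrete, one can solve a small primal-dual pair numerically and compare the two optimal values. The data below are made up (using the inequality-form pair, Ax greater than or equal to b, from earlier), and scipy's linprog is used for both problems.

```python
import numpy as np
from scipy.optimize import linprog

# Primal: minimize c^T x subject to Ax >= b, x free.
# Dual:   maximize b^T u subject to A^T u = c, u >= 0.
c = np.array([1.0, 1.2])
A = np.array([[1.0, 2.0],
              [1.0, 1.5],
              [1.0, 0.5]])
b = np.array([2.0, 1.5, 1.0])

primal = linprog(c=c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2)
# linprog minimizes, so maximize b^T u by minimizing -b^T u.
dual = linprog(c=-b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 3)

P = primal.fun
D = -dual.fun
print(P, D)   # both finite, and equal up to solver tolerance
```

Here both values are finite, so strong duality applies and the two numbers agree to within the solver's tolerance.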

So, in the last piece of this module, we're going to walk through how to construct a dual; it's a very general concept called Lagrangian relaxation. And we'll use that in the next module, on nonlinear programming, as well.

So here's the problem. The primal problem is: minimize c transpose x, subject to Ax greater than or equal to b.

Now I'm going to take a vector u, which is greater than or equal to zero.

And so u is component-wise greater than or equal to 0, and remember Ax is going to be greater than or equal to b, so Ax minus b is component-wise greater than or equal to 0. If you take the inner product of one vector that is component-wise greater than or equal to 0 with another vector that is component-wise greater than or equal to 0, you end up getting a number which is greater than or equal to 0. So when I subtract u transpose times Ax minus b, I'm getting a number which is less than or equal to c transpose x.

So this problem, which has a changed objective function involving this vector u, is going to have a value which is less than or equal to P.

The term b transpose u does not involve the decision variable x; the minimization is going on over x, so I pull that term out. I end up getting a new problem, which is: minimize, over x, c minus A transpose u, transposed, times x. And what happened to this constraint?

I threw it away! Why did I throw it away?

It's complicated. I don't know how to deal with constraints,

so I threw it away. But I'm guaranteed that if I throw away

these constraints my set over which I can optimize, the xs over that I can chose

become larger, and therefore this minimum only becomes smaller.

So I end up getting that this quantity, is going to be smaller than the previous

line. Now, because I don't have any constraints,

I have a very simple problem. I've got some vector, let's call it d, and I want to minimize d transpose x. So here's my d vector; I want to minimize d transpose x, and I can choose my x to be anything.

So what am I going to do? I'm going to choose my x in the direction of minus d, and go off to infinity.

Here's my d, here's my x! If I multiply d and x together I get a

very large negative number. So if I have vector d which is not equal

to 0, I can make this optimization problem equal to minus infinity and that's what

this says. If d is not equal to 0 and, just to

emphasize d's equal to c minus A transpose u, if this is not equal to 0 you get to

minus infinity. If in fact, d is equal to 0, which means

that c is equal to A transpose u, then I can't do anything over here.

This vector is equal to 0; whatever I multiply it by, I'll get 0.

So you end up getting that this minimization problem has a value equal to

0. And P is greater than or equal to b transpose u. Now u is arbitrary; the only thing that I needed, which is missing over here, is that u must be greater than or equal to zero. So now, you have P must be greater than

or equal to the maximum of b transpose u, which is the value here, provided A transpose u is equal to c and u is greater than or equal to 0.

That immediately gives you weak duality; a little bit more work gives you strong duality.

So here's the connection.

Minimize c transpose x, subject to Ax greater than or equal to b, is equal to: maximize b transpose u, subject to A transpose u equal to c and u greater than or equal to 0.

That's what we derived over here. What we did was dualize the constraints, and this word will show up sometimes during the course.

Dualize means I take the constraints and multiply them by a variable which has a particular sign: we had the constraint Ax minus b greater than or equal to 0, and I multiplied it by a vector u which is greater than or equal to 0. That's dualization. And then I relaxed the constraint: I had this constraint Ax greater than or equal to b, I didn't like it, it was too complicated, so I threw it away. I relaxed it!

And by doing dualization and relaxation, you end up getting something called a Lagrangian relaxation, which gives you duals, and gives you some very nice properties that we'll explore more in the course.