0:00

[SOUND] Let's talk about multivariate covariances.

So, these are often less described than multivariate variances, but I find them very useful, so I'm going to just go over their properties really quickly.

If I have a vector x and another vector y, then the covariance between x and y is going to be defined as the expected value of x minus its mean, let's call it mu of x, outer product with y minus its mean, let's say mu of y, transposed.

So notice first off that the multivariate covariance is not symmetric, so covariance x, y is not necessarily equal to covariance y, x.

We also note that if we plug in y equal to x, we get the variance, so covariance x, x is going to be equal to the variance of x.
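These properties can be sketched numerically with NumPy. This example is not from the lecture: the sample matrices, the dimensions, and the empirical `cov` helper standing in for the expectation are all illustrative assumptions.

```python
import numpy as np

def cov(X, Y):
    """Empirical stand-in for Cov(x, y) = E[(x - mu_x)(y - mu_y)^T],
    where the rows of X and Y are paired observations (1/n scaling)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / len(X)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                                # draws of a 3-vector x
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(1000, 2))  # a correlated 2-vector y

print(cov(X, Y).shape)   # (3, 2): not even square, let alone symmetric
print(cov(Y, X).shape)   # (2, 3): in fact Cov(y, x) is the transpose of Cov(x, y)
assert np.allclose(cov(Y, X), cov(X, Y).T)
assert np.allclose(cov(X, X), np.cov(X.T, bias=True))  # Cov(x, x) = Var(x)
```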

It also has a shortcut formula, just like univariate covariance calculations have a shortcut formula.

And that's going to be the expected value of the outer product of x and y, minus the outer product of the expected value of x and the expected value of y.

And I think, given the way in which we derived the shortcut formula for the variance, you should be able at this point to derive the shortcut formula for the covariance.
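The shortcut formula can also be verified numerically. As before, this is just an illustrative sketch with made-up data, using empirical averages in place of expectations.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
Y = rng.normal(size=(500, 3)) + 0.5 * X   # paired samples of x and y

n = len(X)
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)

# Definition: E[(x - mu_x)(y - mu_y)^T]
by_definition = Xc.T @ Yc / n

# Shortcut: E[x y^T] - E[x] E[y]^T
shortcut = (X.T @ Y) / n - np.outer(X.mean(axis=0), Y.mean(axis=0))

assert np.allclose(by_definition, shortcut)
```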

Â 1:32

And perhaps the most useful one is that if we take the covariance of (Ax, By), where A and B are constant matrices, then that's going to be A times the covariance of x and y, times B transpose.

So the left-hand argument gets pulled out to the left side, and the right-hand one gets pulled out to the right side, but then gets transposed.

The second thing is the covariance of x + y and z. Let's say, suppose we have three random vectors: the covariance of x + y and z = the covariance of x and z, + the covariance of y and z.

And similarly, the covariance of x and y + z is going to be the covariance of x and y, + the covariance of x and z.
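Both rules, Cov(Ax, By) = A Cov(x, y) B transpose and the additivity in each argument, can be checked numerically. The matrices, dimensions, and sample sizes below are arbitrary illustrative choices, not from the lecture.

```python
import numpy as np

def cov(X, Y):
    # Empirical cross-covariance; rows are paired observations.
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    return Xc.T @ Yc / len(X)

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 3))
Y = rng.normal(size=(400, 3)) + X
Z = rng.normal(size=(400, 3)) - X
A = rng.normal(size=(2, 3))   # constant matrices
B = rng.normal(size=(4, 3))

# Cov(Ax, By) = A Cov(x, y) B^T (each row of X @ A.T is A applied to one draw of x)
assert np.allclose(cov(X @ A.T, Y @ B.T), A @ cov(X, Y) @ B.T)

# Additivity in either argument:
assert np.allclose(cov(X + Y, Z), cov(X, Z) + cov(Y, Z))
assert np.allclose(cov(X, Y + Z), cov(X, Y) + cov(X, Z))
```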

Â 2:44

So we can also look at formulas that are useful, such as the variance of a sum.

Now we can write it out, if we want to know what the variance of x + y is, and I suggest that you prove this yourself just using the collection of rules that we've given you so far.

The variance of x + y = the variance of x + the variance of y + the covariance of x, y + the covariance of y, x.

And notice in this case the dimensions work out, because we are assuming that x and y are both n by 1 so that addition is meaningful, and so covariance x, y and covariance y, x have the same dimensions. That's not guaranteed: in general, it will often be the case that the covariance is not a square matrix, if y, for example, has a different dimension than x.

But in this case we are assuming the dimensions match, because we are assuming that x + y is meaningful.

So anyway, you get this nice covariance formula.

So, if x and y are uncorrelated, or more precisely mutually uncorrelated, meaning every element of the vector x is uncorrelated with every element of the vector y, then those two covariance terms would be 0.

And then the variance will distribute across the sum, just like in the univariate case, where if we have uncorrelated random variables, the variance distributes across sums, okay.
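The variance-of-a-sum formula can also be checked on simulated vectors. Again this is just an illustrative sketch, with x and y drawn with the same dimension so that x + y is meaningful.

```python
import numpy as np

def cov(X, Y):
    # Empirical cross-covariance; rows are paired observations.
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    return Xc.T @ Yc / len(X)

rng = np.random.default_rng(3)
X = rng.normal(size=(600, 3))
Y = rng.normal(size=(600, 3)) + 0.3 * X   # same dimension as x, so x + y makes sense

# Var(x + y) = Var(x) + Var(y) + Cov(x, y) + Cov(y, x)
lhs = cov(X + Y, X + Y)
rhs = cov(X, X) + cov(Y, Y) + cov(X, Y) + cov(Y, X)
assert np.allclose(lhs, rhs)
```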

So I think that's pretty much all you need to know.

One more extremely useful fact.

Â 4:23

And this may seem strange, but it's often the case that A times y and B times y, two random vectors that are functions of the same originating vector y, have a covariance of zero, right?

So, the covariance of Ay and By is A times the covariance of y and itself, times B transpose, and let's just say that the covariance of y and itself is the multivariate covariance matrix sigma.

So what we can see is that Ay is going to be uncorrelated with By if and only if the quantity A sigma B transpose is exactly zero, okay?

And that's a tremendously useful fact that we will use quite often.
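That zero condition can be illustrated concretely: pick any A, then build B so that A sigma B transpose is zero, and the resulting Ay and By come out uncorrelated. The construction below, taking B's rows from the orthogonal complement of A sigma via an SVD, is my own illustrative choice, not from the lecture.

```python
import numpy as np

def cov(X, Y):
    # Empirical cross-covariance; rows are paired observations.
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    return Xc.T @ Yc / len(X)

rng = np.random.default_rng(4)
Ysamp = rng.normal(size=(500, 3))   # draws of the originating vector y
Sigma = cov(Ysamp, Ysamp)           # Cov(y, y), here a 3 x 3 covariance matrix

A = np.array([[1.0, 0.0, 0.0]])     # any constant matrix
# Choose B with rows orthogonal to the row vector A @ Sigma, so A @ Sigma @ B.T = 0.
B = np.linalg.svd(A @ Sigma)[2][1:]

assert np.allclose(A @ Sigma @ B.T, 0)                 # the zero condition holds...
assert np.allclose(cov(Ysamp @ A.T, Ysamp @ B.T), 0)   # ...so Ay and By are uncorrelated
```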

Â 5:17

So in the next couple of parts of this section, we're going to talk about quadratic forms and how we can calculate moments in those cases.

And we'll talk about a way in which we can prove an optimality result, now that we have multivariate expected values.

If we make some assumptions about our response and our predictors, we can start to add statistical properties to our least squares estimate, for which up to this point we've only discussed mathematical properties.
