0:00

So if we break down the components of the singular value decomposition, we can look at the u and the v matrices, which hold the left singular vectors and the right singular vectors. If you look at the original data matrix here on the left, that's the image plot that we saw before.

Â 0:19

And then in the middle plot here, I plot the first left singular vector. It roughly shows the mean of that data set across the rows. If you plot it across the rows, you can see that the first left singular vector has a negative value for rows 40 down through about 18 or 17, and then it has a positive value for the remaining rows. So that shows the clear separation in the means of the two sets of rows, right?

And if you look at the first right singular vector, you can see that it also shows the change in the mean between the first five columns and the second five columns. And so the nice thing about the singular value decomposition here is that it immediately picked up on the shift in the means, in both the row dimension and the column dimension.

Â 1:27

And it did that without you having to tell it anything. It was totally unsupervised, and it just picked it up automatically. So remember, we made a plot that was very similar to this before, but that was when we already knew that there was this kind of pattern in the data set. Here the singular value decomposition picked up the pattern right away, in the first left and right singular vectors.
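
To make this concrete, here is a minimal sketch in Python with NumPy. It is not the code behind the plots in the lecture. The seed, the noise, the shift of 3, and the choice of which rows get the shift are all assumptions for illustration. It builds a 40 by 10 matrix with a block pattern, centers the columns, and checks that the first left and right singular vectors separate the two groups of rows and the two groups of columns.

```python
import numpy as np

# Assumed toy data: 40 rows x 10 columns of standard normal noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))

# Assumed pattern: the first 20 rows get a shift of 3 in the last five columns.
X[:20, 5:] += 3.0

# Center each column so the SVD describes variation around the column means.
Xc = X - X.mean(axis=0)

# Thin SVD: Xc = U @ diag(d) @ Vt
U, d, Vt = np.linalg.svd(Xc, full_matrices=False)

u1 = U[:, 0]   # first left singular vector, one entry per row
v1 = Vt[0, :]  # first right singular vector, one entry per column

# The overall sign of a singular vector is arbitrary, so compare groups.
print("mean of u1, shifted rows:      ", u1[:20].mean())
print("mean of u1, unshifted rows:    ", u1[20:].mean())
print("mean |v1|, first five columns: ", np.abs(v1[:5]).mean())
print("mean |v1|, second five columns:", np.abs(v1[5:]).mean())
```

The two row groups should land on opposite sides of zero in `u1`, and the second five columns should carry most of the weight in `v1`. Nothing in the call to `np.linalg.svd` says where the groups are.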

Â 1:51

Another component of the singular value decomposition is known as the variance explained. This comes from the singular values that are in the d matrix. Remember, the d matrix is a diagonal matrix: it only has elements on the diagonal. You can think of each singular value as representing the percent of the total variation in your data set that's explained by that particular component. The components are typically ordered so that the first one explains as much variation as possible, the second one explains the second most, et cetera. And so you can plot the proportion of variance explained, as we have in this plot below.

Here on the left-hand side I've just plotted the raw singular values, and you can see that they decrease in value as you go across the columns. But of course the raw singular value doesn't really have much meaning, because it's not on an interpretable scale. So if I square the singular values and divide by the total sum of all the squared singular values, then on the right-hand side I can interpret it as the proportion of the variance explained. You can see it's the same kind of plot, but the y-axis has changed.

And the first singular value, which if you recall captures the shift in the mean between the rows and the columns, captures about 40% of the variation in your data. So almost half of the variation in the data is explained by a single singular value, or you can think of it as a single dimension. And then the remaining variation in the data is explained by the other components.
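
As a rough sketch of that calculation in Python with NumPy: the proportion of variance explained by each component is its squared singular value divided by the sum of all the squared singular values. The function name `variance_explained`, the toy data, and the column centering are assumptions for illustration, not something taken from the lecture.

```python
import numpy as np

def variance_explained(X):
    """Proportion of variance explained by each SVD component.

    Assumes X is a 2-D array with observations in rows; columns are
    centered before the SVD is taken.
    """
    Xc = X - X.mean(axis=0)
    # Singular values only, returned largest first.
    d = np.linalg.svd(Xc, compute_uv=False)
    return d**2 / np.sum(d**2)

# Assumed toy data: noise plus a shift in the last five columns of 20 rows.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 10))
X[:20, 5:] += 3.0

prop = variance_explained(X)
print(np.round(prop, 3))
```

The proportions add up to 1 and come out in decreasing order, which is why a plot of them trails off from left to right.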

Â 3:25

So, just to show that the relationship between the singular value decomposition and principal components analysis is close, here I've calculated the SVD and the PCA of the same data matrix. I've plotted the first principal component on the x-axis and the first right singular vector on the y-axis, and you can see that they fall exactly on a line: they're exactly the same thing. So the SVD and principal components analysis are essentially the same thing. And so if you're at a cocktail party and you hear two people talking about the SVD and PCA, you can rest assured that they're basically talking about the same thing.
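
One way to check this numerically, sketched in Python with NumPy. The assumptions are labeled: PCA is done here as an eigendecomposition of the sample covariance matrix, the data are toy data, and both routes use the same column-centered matrix. The match depends on that shared centering.

```python
import numpy as np

# Assumed toy data with a block pattern.
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 10))
X[:20, 5:] += 3.0
Xc = X - X.mean(axis=0)

# SVD route: first right singular vector of the centered data.
_, d, Vt = np.linalg.svd(Xc, full_matrices=False)
v1 = Vt[0]

# PCA route: eigenvectors of the sample covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                    # direction with the largest variance

# Both vectors are only defined up to sign, so align them before comparing.
if np.dot(pc1, v1) < 0:
    pc1 = -pc1

print("same direction:", np.allclose(pc1, v1))
print("largest eigenvalue:", eigvals[-1])
print("d[0]^2 / (n - 1):  ", d[0] ** 2 / (X.shape[0] - 1))
```

The eigenvalues of the covariance matrix are the squared singular values divided by n - 1, so the variance explained comes out the same either way.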

Â 4:18

... a matrix that has only zeros and ones. And so the idea is that there's only one pattern in this matrix, right? The first five columns are zeros and the second five columns are ones. That's it, nothing special. And so if I plot the data, you can see on the left-hand side that the first five columns are red and the second five are yellowish white.

Â 4:44

So now if I take the SVD of this relatively boring matrix, I can plot the singular values, and you can see that there's one singular value that's very high and the rest are basically zero. And what you'll see is that the first singular value explains 100% of the variation in the data set. So how can that be?

Well, if you think about this data set, even though there are 40 rows and ten columns, there's really only one dimension to it: if you're in the first five columns, you're equal to 0, and if you're in the second five columns, you're equal to 1. So even though you have lots of so-called observations, there's really only a small amount of information that's actually useful in this matrix. And that's captured by the SVD, by the fact that you can explain 100% of the variation with a single component.
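
Here is a small sketch of that example in Python with NumPy. The matrix follows the description above: 40 rows, ten columns, zeros in the first five columns and ones in the second five. One assumption is that the SVD is taken on the raw matrix without centering. Centering the columns of this particular matrix would leave nothing but zeros.

```python
import numpy as np

# 40 identical rows: five zeros followed by five ones.
X = np.tile(np.repeat([0.0, 1.0], 5), (40, 1))

# Singular values of the raw (uncentered) matrix, largest first.
d = np.linalg.svd(X, compute_uv=False)
prop = d**2 / np.sum(d**2)

print("shape:", X.shape)
print("rank:", np.linalg.matrix_rank(X))
print("singular values:", np.round(d, 6))
print("proportion explained:", np.round(prop, 6))
```

Every row is the same, so the matrix has rank one and only one singular value is meaningfully different from zero.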

Â 5:34

So let's add a second pattern to the data set. We can add a pattern that goes across the rows and also goes across the columns. The first pattern is going to be a block pattern like the one we saw before, so the first half of the columns is going to have one mean and the second half is going to have another mean. And the second pattern is going to alternate between the columns. So you can see what that looks like over here.

On the left-hand side is the data, and you can see that there's one pattern that's a block pattern: the left five columns are low and the right five columns are higher. And then nested within that is a pattern in every other column, where one column is higher than the next. And if you plot what the truth is, we can see in the middle plot here that the first five columns have a lower mean and the second five columns have a higher mean.

Â 6:31

And that's the first pattern. Then the second pattern shows that the first column has a mean of 0, the second column has a mean of 1, the third column has a mean of 0, the fourth column has a mean of 1, et cetera. So there are two patterns here: one is a shift pattern and the second is this alternating pattern. So that's the truth. But of course, we rarely know the truth, so we need to learn the truth from the data. So the idea is to see what you could learn if you were presented only with the data set that we've shown here on the left.
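
The two true patterns can be written down as column-mean vectors. This is a tiny sketch in Python with NumPy. The levels 0 and 1 for the block pattern are placeholders chosen for illustration, since the lecture only says "lower" and "higher" for that one.

```python
import numpy as np

# Shift (block) pattern: first five columns low, second five columns high.
block = np.repeat([0, 1], 5)

# Alternating pattern: column means of 0, 1, 0, 1, and so on.
alternating = np.tile([0, 1], 5)

print("block:      ", block)
print("alternating:", alternating)
```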

Â 7:11

So if we run the SVD on this new matrix with the two patterns, I've got the data here on the left, and in the middle panel I've got the first right singular vector. And you can see it roughly picks up the block pattern, right? The first five are somewhat lower, and the second five are somewhat higher. So it's not as pretty as the truth was, but it's somewhat discernible that there are two different means here.

The second right singular vector is in the right-hand panel here. It's a little bit harder to see, but it does try to pick up the alternating mean pattern, right? It's not as obvious as it was when we were plotting the truth, of course, but you can see that every other point is either higher or lower. Now, of course, since we know what the truth is, it's a little bit easier to talk about this.
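
Here is a hedged sketch of this experiment in Python with NumPy. The way the patterns are mixed into the rows is an assumption: each row gets each pattern independently with probability 0.5, with an effect size of 5 on top of standard normal noise. The sketch measures how much of each true pattern lies in the span of the first two right singular vectors, rather than expecting either vector to match one pattern cleanly.

```python
import numpy as np

rng = np.random.default_rng(3)
n_rows, n_cols = 40, 10

# True column patterns (see the description above).
block = np.repeat([0.0, 1.0], 5)
alternating = np.tile([0.0, 1.0], 5)

# Assumed data-generating process: noise plus randomly assigned patterns.
X = rng.normal(size=(n_rows, n_cols))
has_block = (rng.random(n_rows) < 0.5).astype(float)
has_alt = (rng.random(n_rows) < 0.5).astype(float)
X += 5.0 * np.outer(has_block, block)
X += 5.0 * np.outer(has_alt, alternating)

# SVD of the column-centered data.
Xc = X - X.mean(axis=0)
U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
V2 = Vt[:2].T  # columns are the first two right singular vectors

def captured(pattern, basis):
    """Length of the unit-normalized pattern after projection onto the basis."""
    unit = pattern / np.linalg.norm(pattern)
    return np.linalg.norm(basis.T @ unit)

captured_block = captured(block, V2)
captured_alt = captured(alternating, V2)

print("first right singular vector: ", np.round(Vt[0], 2))
print("second right singular vector:", np.round(Vt[1], 2))
print("block pattern captured by first two:      ", captured_block)
print("alternating pattern captured by first two:", captured_alt)
```

A value close to 1 means the pattern lives almost entirely in the space spanned by the first two right singular vectors, even when each individual vector is a blend of the two patterns.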

Â 8:05

But in general you can see that the two patterns are roughly confounded with each other, because it's a little bit hard to tell the two patterns apart. So even though we know that the truth is two separate patterns, the first and the second right singular vectors, which are also known as the principal components,

8:23

kind of mix the two patterns together, so each of them has a block pattern and each of them has an alternating pattern. And so unfortunately, just like with most real data, the truth is a little bit harder to discern than if you'd known it in advance.

Â 8:40

Now if you look at the variance explained in this problem, you can see in the right-hand panel, which is the percent of the variance explained, that the first component explains over 50% of the total variation in the data set. And that's basically because the shift pattern between the first five columns and the second five columns is so strong: it represents a large amount of variation in the data set. So if you capture that shift pattern, you capture a lot of that variation.

You see the second component only captures about 18 or so percent of the variation, and it trails off from there. And so that tells you roughly how many components are in this data set. But that alternating pattern, which is every other column, is a little bit harder to pick up, as you can see.
