0:08

And the idea of this contest was actually to estimate the weight of an ox.

So this ox was placed up on display, and a bunch of villagers had to guess the weight of the ox that was put on display.

So the way this worked is that the villagers would come up, get a good look at the ox, make a guess, write it down on a sheet of paper, and submit it into some bin.

And it was also very important that they did not look at what each other had guessed, so that nobody could copy anyone else. So you looked at the ox, you formed an opinion of what its weight was, and you submitted your guess.

And there were 787 participants. So, 787 villagers participated in this experiment.

So Sir Francis Galton analyzed these results, and he

wanted to see what he could come up with.

Not one person guessed the true weight of the ox, which was 1,198 pounds. So no one guessed the true weight exactly.

But the average of the guesses was 1,197 pounds, which is an error of less than 0.1%. So, a very, very small error in this estimate.
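Just to make that error concrete, here is a quick check of the arithmetic with the numbers quoted above:

```python
# Numbers from Galton's ox-weighing contest, as quoted in the lecture.
true_weight = 1198    # pounds
crowd_average = 1197  # pounds, the average of the 787 guesses

relative_error = abs(crowd_average - true_weight) / true_weight
print(relative_error)  # about 0.00083, i.e. less than 0.1%
```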

So you're probably wondering, well, how could that be, right? Because nobody was anywhere near the true value.

They were all over the place.

And some people guessed way too high.

Some people guessed way too low.

But when we average them, we get something really close to what the true value was.

1:42

So as it turns out, there were several factors at work here that made this whole guessing game, and the averaging of the guesses, come out to be something very meaningful and very accurate.

The first one is that the task is relatively easy, right? You just look at the ox, you guess the weight, and you put it in. It's a very objective task.

The second is that the estimates were independent and unbiased.

So independent basically means that nobody was able to look at what anybody else submitted. Everybody does their task entirely independently, so what you guess is not going to be dependent upon what anybody else guesses.

Then unbiased means that there was no systematic tendency for everyone to guess higher or lower, right? And the way to create a bias in this situation would be something like this: if somebody wanted the people to guess too high, they could add some decoration on top of the ox to make it look heavier than it was, and then everybody might guess too high.

2:43

So that's the idea.

So, if we draw this out here, let's just put everything on a number line.

And we say that the true value, whatever it is in this guessing game, is right here at the center. And say this is something like 100.

So the idea is that maybe I submit a guess and say 120, for whatever it is. Then maybe you submit a guess and say 90; maybe you're smarter than I am. So I guessed 120, which is 20 off, and you guessed 90, which is only ten off.

Someone else comes in and makes a guess here. Someone else makes a guess here. And so on, more and more guesses come in.

The idea is that even though individual guesses are far off, they're going to concentrate around this center point. So the concentration is going to be around here, and when we average them, we're going to get something close to that center.

So if they were biased, just so you understand what bias is: this is the true mean, so there would have to be some systematic tendency to guess higher than this, so that everybody would guess higher. And if the guesses were biased, that would mean that this distribution was actually centered up around here instead. So this would be a bias.

3:51

So instead of me guessing 120, I would have guessed 180 or something, and instead of guessing 90, you would have guessed something above the true value, say 140. So there aren't values both below and above; there are only values above. And similarly, if there were a bias below, there would only be values below, concentrating around a center that sits below the true value.
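To see how bias shifts this picture, here is a small simulation with made-up numbers: unbiased guesses scattered around a true value of 100, versus guesses with a systematic tendency to be 40 too high, so that the whole distribution is centered around 140:

```python
import random

random.seed(0)
true_value = 100

# Unbiased guesses: scattered around the true value, so errors cancel on average.
unbiased = [true_value + random.gauss(0, 20) for _ in range(787)]

# Biased guesses: a systematic tendency to guess 40 too high,
# so the distribution is centered around 140 instead of 100.
biased = [true_value + 40 + random.gauss(0, 20) for _ in range(787)]

avg = lambda xs: sum(xs) / len(xs)
print(avg(unbiased))  # close to 100
print(avg(biased))    # close to 140 -- averaging cannot remove the bias
```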

4:14

The third is that there were enough people participating here.

So 787 participants is enough to get a good average in this case, for this experiment.

And as we'll see, the number of people that need to participate is really going to depend upon how difficult the task is, and also how independent and how unbiased the estimates are.

4:35

So returning now to Amazon, our hope is that when we average customer ratings for a product, the result is going to be close to the right rating. But can we even say that such a true rating exists? And doesn't this depend at least somewhat on the specific customer?

So, let's look at the factors that we just identified as

being important in Galton's experiment in the context of Amazon reviews.

First is the definition of the task. Guessing a number like the weight of an ox is a pretty easy, straightforward task. Reviewing a product on Amazon and settling on a rating is a little more difficult.

The second is the independence of the reviews.

Success in opinion aggregation stems not from having many smart

individuals in a crowd who are likely to guess correctly.

Rather, it's from the independence of each individual's view from the rest, as we talked about before.

My review isn't dependent upon yours.

I form my own opinion.

So the question is, are Amazon reviews independent of each other?

And they kind of are, though not entirely.

Because you could go on and look at the previous reviews

of a product and read through them before you submit your own.

But given that you've tried out the product yourself, you're probably going to form at least somewhat of your own opinion and submit a somewhat independent review.

The third one is the review population. So Galton's experiment would not have worked as well as we said if there had not been 787 people, nearly 800, participating. And in general, the harder the task is, and the less independent and less unbiased the reviews are, the more people we're going to need in order to get a good estimate from the average, from the collective guess of the population.

And so, those factors are all going to come into play, here.

So, wisdom of crowds: we can actually summarize what we've been saying mathematically.

6:37

So, what it says is that if we take a number of guesses, for instance from these five people here, and we take the average of those guesses, then the error that that average is going to have, roughly speaking, is the error that each guess has, assuming that they each have, not the same error, but the same distribution in their error, divided by the number of people. So the error is going to be lowered by the number of people that are in that estimate. So if there are five people, it's going to be five times lower than the error that each guess has on average. So when we take the average, we're going to start to see less and less error.

7:18

And the reason that we use an approximation symbol here is

because naturally not every guess is going to have the same error.

And in fact, if they all had the exact same error, this wouldn't help at all, because then they would all be biased in the same direction; that would actually be a biased situation, and averaging wouldn't help. What we need is for all these errors to be able to cancel each other out. But if we take the expected value of the error, statistically speaking, and divide it by the number of people, we get the expected error in the average.
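As a rough illustration of this equation, here is a small simulation with invented numbers: each guess has an independent, unbiased error, and we measure the average absolute error of the crowd's mean as the number of guesses grows. The exact rate of decrease depends on the assumptions about the errors; the point is simply that independent errors cancel more and more as people are added.

```python
import random

random.seed(1)
true_value = 100

def average_error(n_guesses, trials=2000):
    """Average absolute error of the mean of n_guesses independent, unbiased guesses."""
    total = 0.0
    for _ in range(trials):
        guesses = [true_value + random.gauss(0, 10) for _ in range(n_guesses)]
        total += abs(sum(guesses) / n_guesses - true_value)
    return total / trials

for n in (1, 5, 25, 100):
    print(n, round(average_error(n), 2))
# The error in the average keeps shrinking as the number of guesses grows.
```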

7:47

So, very important thing though, is that this is

only going to hold if the guesses are independent and unbiased.

And the reason that needs to hold is because these errors intuitively need to cancel each other out. Again, if the true value is over here, we can't all be guessing up here, right? Or else, when we average, we're going to be nowhere close to the true value. We have to have some guesses higher and some lower than the true value, and so forth.

And we have to have them not being dependent upon one another either.

So, independence and unbiasedness are absolutely essential for the wisdom of crowds to hold up.
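To illustrate the independence half of that requirement, here is a made-up example in which everyone simply copies the first person's guess, so the errors cannot cancel and averaging gains nothing:

```python
import random

random.seed(2)
true_value = 100

# Independent guesses: each person forms their own noisy estimate.
independent = [true_value + random.gauss(0, 20) for _ in range(500)]

# Dependent guesses: everyone just copies the first person's (noisy) guess.
first = true_value + random.gauss(0, 20)
copied = [first for _ in range(500)]

avg = lambda xs: sum(xs) / len(xs)
print(avg(independent))  # close to the true value
print(avg(copied))       # just the first guess back -- averaging gained nothing
```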

8:26

So, just as an example application of this equation, suppose we have one person, right, and his guess is four. So the average here is going to be four divided by one, which is equal to four, because there's only one person.

8:43

And suppose that the error is one. So we expect him to be too high by one, meaning the true value is three.

Now, suppose we instead have five people, and they

submit guesses of three, two, three, four and four.

So there's five people here, down the line.

And they submit their guesses.

So if we take the average of these, we're going to have three plus two plus three plus four plus four, divided by five, which is 16 over 5, or 3.2.
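Carrying out that arithmetic:

```python
guesses = [3, 2, 3, 4, 4]
average = sum(guesses) / len(guesses)
print(average)  # 3.2
```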

9:37

So now, suppose we have a large number of reviewers for a product on Amazon.

Does this discussion imply that the average rating is going to be close to the true rating that we want?

Well, not necessarily, because as we pointed out, there are many complications, right? The task is not exactly easy, the independence assumption doesn't hold exactly, and the review population may not be large enough.

And so let's explore a few of these challenges next.