Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.


From the course by University of Houston System

Math behind Moneyball

36 ratings

From the lesson

Module 1

You will learn how to predict a team’s won loss record from the number of runs, points, or goals scored by a team and its opponents. Then we will introduce you to multiple regression and show how multiple regression is used to evaluate baseball hitters. Excel data tables, VLOOKUP, MATCH, and INDEX functions will be discussed.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay, let's try and show how you can use linear weights to predict how a team that

is made up entirely of a single player, using that player's statistics, would perform.

So let's take Barry Bonds' 2004 season.

Now, he got a lot of intentional walks, and if the team was all Barry Bonds, you would

never walk him intentionally because all you would do is get to the next Barry Bonds.

Okay, so I took intentional walks out.

And so here is what we got for Barry Bonds' statistics.

Okay, so Barry Bonds had these stats.

He got walked intentionally 120 times that year by the way.

So he had hardly any at bats, but then this is amazing.

He had 45 home runs in 373 at bats, 27 doubles, got walked 122 times.

Now with regards to outs, okay remember how you get outs?

You take 0.982 times at bats minus hits.

The 0.982 is because 1.8% of at bats on average result in errors.

And then he had 9 extra outs: caught stealing, sacrifice flies, etc.

So Barry Bonds used 240 outs.
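The out calculation just described can be sketched in Python. This is only an illustration: the 373 at bats, the 9 extra outs, and the 0.982 factor come from the lecture, but the hit total of 135 is not quoted in the lecture and is an assumption taken from Bonds' public 2004 record. The function name `outs_used` is made up for this sketch.

```python
def outs_used(at_bats, hits, extra_outs, error_rate=0.018):
    """Estimate outs: at bats that did not end in an error, minus hits, plus extra outs."""
    return (1 - error_rate) * at_bats - hits + extra_outs

# 373 at bats and 9 extra outs are from the lecture; 135 hits is an assumed input.
bonds_outs = outs_used(at_bats=373, hits=135, extra_outs=9)
print(round(bonds_outs))  # about 240, matching the lecture's figure
```

Under that assumption the unrounded total is a little above 240, which is why the scale factor later comes out as 18.02 rather than exactly 4329 / 240.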

Now, a major league team has roughly 4,329 outs in a season.

And he created 240 outs.

So, what would you do to the Barry Bonds individual

stats to make them whole season stats?

Well, you multiply the individual stats by a scale factor.

He made 240 outs, okay, so you divide that

into the total outs for the season, and that's about 18.

So basically a team of all Barry Bonds should generate about 18 times the stats

of the individual Barry Bonds, because in 240 outs he generated this many stats.

And the team gets 4,329 outs, so you just multiply by the ratio there.

So that's what I call a scale factor, which for Barry Bonds was 18.02.

So I ramped up all of Barry Bonds' stats in row 10 by 18.02 to get them in row 17.
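The scaling step can be sketched as follows. The 4,329 season outs and the counts for home runs, doubles, and walks are from the lecture; the out total of 240.286 is the unrounded value implied by the out formula when Bonds is assumed to have had 135 hits (a figure not quoted in the lecture). The helper names are hypothetical.

```python
SEASON_OUTS = 4329  # rough number of outs a major league team makes in a season

def scale_factor(player_outs, season_outs=SEASON_OUTS):
    """Ratio of a team's season outs to the outs one player used."""
    return season_outs / player_outs

def scale_to_team(player_stats, factor):
    """Multiply every individual stat by the scale factor."""
    return {name: value * factor for name, value in player_stats.items()}

# 240.286 outs assumes 135 hits; the lecture rounds this to 240.
factor = scale_factor(240.286)
team = scale_to_team({"home_runs": 45, "doubles": 27, "walks": 122}, factor)
print(round(factor, 2))
```

Using the rounded 240 outs instead would give a factor nearer 18.04, so the small difference from 18.02 is only rounding.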

And then we've got our friend, the linear weights, here.

Okay.

And so, like singles here is 18.02 times the singles for the season.

And you get 1081.

You get 810 home runs.

That's a lot.

And walks gets ramped up by 18.02, so

you'd have these stats for an entire season if you had all Barry Bonds.

So, how would you predict how many runs you'd score?

Well, you'd use linear weights.

You would start with minus 556, then

multiply the stats by the linear weights that we got a couple of videos ago: 0.6328 times

singles plus 0.70 times doubles plus 1.27 times triples, etcetera.

You would predict they would score 2,510 runs, and to get runs per game you divide by 162.

You get 15.5 runs.
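As a rough sketch of the arithmetic, a linear weights prediction is a constant plus a weighted sum of event counts, and the per-game figure is the season total divided by 162. The lecture quotes only the constant and the weights for singles, doubles, and triples, so the example below deliberately shows just the singles term rather than guessing the remaining weights. The function names are illustrative.

```python
GAMES_PER_SEASON = 162

def predict_runs(constant, weights, team_stats):
    """Linear weights: constant plus the sum of weight * count for each event."""
    return constant + sum(weights[event] * team_stats[event] for event in weights)

def runs_per_game(season_runs, games=GAMES_PER_SEASON):
    """Convert a season run total to runs per game."""
    return season_runs / games

# Partial illustration only: the constant plus the singles term (1,081 team singles).
# The full prediction needs every weight, which this passage does not list.
partial = predict_runs(-556, {"singles": 0.6328}, {"singles": 1081})

# The lecture's full prediction of 2,510 runs, converted to a per-game figure.
print(round(runs_per_game(2510), 1))  # 15.5
```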

So, we would predict nine Barry Bonds would score 15.5 runs per game.

Now, one point here: it's risky to extrapolate. Your linear weights, or any

regression, should not be used to make predictions for

data that's not within the range that was used to fit the regression.

And there is no team in our data set that looked anything like Barry Bonds'

2004 season.

So really, I mean, the extrapolation here using these linear weights

probably isn't that valid.

But we actually get a decent answer.

I'll show you in a minute.

We got 15.5 runs.

Well, how can we know what a team of nine Barry Bonds would score?

Well, my best friend and colleague came up with using absorbing

Markov chains, which is a topic beyond the scope of our course, I believe,

to work out exactly what nine Barry Bonds would score in a season.

Now here's the URL for that, if I can copy that.

This goes to his, basically, his general site.

And basically, if you pick a year, like if I pick 2014 here.

Okay.

You can see a team of nine Nelson Cruzes would have scored 5.8 runs per game.

This is ranked by team here.

If I go to the Tigers, a team of nine Victor Martinezes would have scored 7.69 runs, etc.

So you can basically see really how good a hitter was per his plate appearances.

Now, if you want to flip this to a different year you gotta go in and

change the year I think, because they don't have them posted.

I think that'll get me Barry Bonds 2004.

I hope.

Yeah.

Okay. So now,

if I go to National League I should be able to find Barry under the Giants,

I guess.

SF Giants.

Okay, so for Barry Bonds, Jeff got 15.97 runs.

Okay.

So we come pretty close here.
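To put a number on "pretty close", here is a minimal sketch comparing the two per-game figures quoted in the lecture: 15.5 from linear weights and 15.97 from the Markov chain calculation. The helper name `relative_gap` is made up for illustration.

```python
def relative_gap(estimate, reference):
    """Absolute difference between two values as a fraction of the reference."""
    return abs(estimate - reference) / reference

# 15.5 is the linear weights estimate; 15.97 is the Markov chain figure.
gap = relative_gap(15.5, 15.97)
print(f"{gap:.1%}")  # 2.9%
```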

And I put the URLs for those earlier seasons there.

See, all you've got to do is change the year.

And if you want to go to the correct league, you just put in NL hyphen batter or

AL hyphen batter. It also has stuff for pitchers.

Basically, our simple method using linear weights comes pretty close to predicting

how many runs our nine Barry Bonds would score.

That shows a nice application of linear weights when you want to know, if you had

a team which was nine of a single player, how many runs they would score in a game.

That's a pretty good measure, if everybody played the whole season,

of how good a hitter is.

Now, what you might really care about more, and

we'll see this later, and it ties into wins above replacement, which is very important,

is this:

if I added Barry Bonds to an average team, how many more runs would they score?

We'll do that with Mike Trout.

Okay. We'll see that in a later video.
