Welcome back. In the previous lecture, we talked about how we could use models to
become clearer thinkers. In this lecture, what we're gonna do is talk about how we
can use models with data. This is an important reason why people use models. In
fact, when you talk to scientists about why they use models, whether they're social
scientists or natural scientists, what they'll typically say is, well, we use
models to take them to data, to use and understand data in better ways.
What I'm going to do is unpack that in several directions. I wanna give some
specific reasons, or ways, in which people use models with data.

Alright, so the first real reason is just to understand some basic patterns in the
data. So what do I mean? Well, you could look at data and it could just be a flat
straight line, where nothing changes. So if you look at a system where no energy
is added to the system, we know that energy is neither lost nor gained, so
energy is a constant. And we have a model that explains why we see energy being a
constant. Alternatively, we could see something that's a straight line, an
increasing line. We'd want a model that explains that. And then we also
talked about how we can see patterns in data. So we could see things that go up
and down slowly like this, like business cycles, and we have models that tell us why
we see these kinds of cyclic curves. We could have something that's much more
spiky. We could have a model that explains that too. So again, we talked about how
there's this sort of hairball of data, this firehose of data. There's tons of data
out there. That data's gonna have patterns to it. And what we can do is use models
to understand why we see those particular patterns.

Okay. In addition to the
patterns, there's also the use of models to predict specific points. So suppose you
are looking for a house and you see this house that's for sale and you're
wondering, how much is that house gonna cost? Well, you can have a model
that says, okay, the price of the house depends on its size. So here's the size of
the house in square feet. And here's the price. We'll just put a dollar sign there
for price. And maybe you get a linear model. And your linear model says basically
that for every additional square foot, the price of the house goes up $100 or $200
or something like that. Well, then if this is your model, and you've got
a house that's got this many square feet, say it's 2,000 square feet, and you go
up here and find the point at $100 per square foot, then your model would
predict that the house costs $200,000. So we can use just a simple model to make
some sort of ballpark prediction about how much a particular house
would cost. So this is, again, a common use of models: you construct a model,
and from that model, you predict a point value.

Okay, third reason why we use
models. It's not so much to predict the points, but to produce bounds. So suppose
you're the economic advisor to the president, not a job you'd necessarily
want, [laugh], but suppose you are. And the president comes to you and says,
what's inflation gonna be next year or next month? Well, you know, inflation
doesn't move that quickly. You might be able to say to the president, well, you
know, I think it's gonna be 1.2%. And you might be pretty confident that it's 1.2%.
But suppose the president says, you know what? I'm just doing some long-range
forecasts, so what's inflation gonna be ten years from now? Well, who
knows what inflation is going to be ten years from now? You may have some fairly
sophisticated models, but they're not going to give you a point estimate.
Instead, what they might say is, I can tell you with pretty high probability that
it's going to be between zero and three percent. So the model gives you a range. Right?
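
A minimal sketch of the difference between a point prediction and a range, in
Python. The $100 per square foot and the 2,000 square foot house come from the
lecture; the function names, the 1.5% central guess, and the list of past forecast
errors are made up purely for illustration, so treat this as a sketch under those
assumptions rather than a real forecasting method.

```python
def predict_price(square_feet, dollars_per_square_foot=100.0):
    """Point prediction from a linear model with no intercept (a simplifying assumption)."""
    return dollars_per_square_foot * square_feet


def prediction_range(point_estimate, past_errors):
    """Turn a point estimate into a crude range using the spread of past errors.

    past_errors holds (actual - predicted) values from earlier forecasts.
    This is a rough empirical band, not a formal confidence interval.
    """
    return point_estimate + min(past_errors), point_estimate + max(past_errors)


# Point value: 2,000 square feet at $100 per square foot.
price = predict_price(2000)
print(f"Point prediction: ${price:,.0f}")

# Hypothetical long-range inflation forecast: a central guess of 1.5%
# plus invented past forecast errors, in percentage points.
low, high = prediction_range(1.5, [-1.5, -0.5, 0.25, 1.5])
print(f"Inflation range: {low:.1f}% to {high:.1f}%")
```

The first call gives a single number; the second gives only a low end and a high
end, which is often all a long-range model can honestly offer.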
So your model won't tell you exactly what's going to happen, 'cause there are too
many contingencies out there, there's too much complexity, too much uncertainty. You
can't say for sure, but your model might give you some bounds on what's going to
happen, and that can be really useful for making policy decisions.

Okay. Reason
Four. Retrodiction. What do I mean by that? Well, you can use models with data
to predict the past. Now, there's a couple reasons you might do this. One reason is
you might not have data from the past, so you might want to use models to
figure out, what do we think the past was like? And this is something, you know,
geologists do. Biologists do this, anthropologists do this,
archaeologists do this. They use models and data to try and figure out what
temperatures were like, how many animals there were, what
these civilizations were like, those sorts of things. If you do have the data, then
you can use it to see how good your models are: you can actually retrodict the data
to see if in fact your model would've worked. Let me explain what that means.
Suppose we're looking at some data stream. Let's say it's
unemployment. Suppose the unemployment data looks like this for some period of time.
Right. And now what you're doing is you're saying, okay, we've got a model.
We're gonna ask how well that model would have done. So what you do is you fit the
model: you give it data up to here, and it's fitting pretty well. And then
at this point, right here, you say, hey, let's see how our model would predict from
here on out. If you run your model and it goes like this, away from what actually
happened, you can say, you know, if we'd been using this same model in the
past, it wouldn't have worked. And so that makes you fairly dubious about whether the
model's gonna work now. So retrodiction, going back and testing against past data,
is a good way to test how well your model really works.

Fifth reason: predicting
other stuff. So you might construct a model for one reason. Let's suppose you're
really interested in the unemployment rate. You construct a model to
predict the unemployment rate. But out of that pops the inflation rate, so you
get something else. This is a good way to tell, you know, how strong your model
is, 'cause typically you construct a model for one reason and it gives you other stuff.
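
Looking back at retrodiction for a moment: the test described above (give the
model data up to some point, then see how it would have predicted from there on)
can be sketched in a few lines of Python. The straight-line trend model, the cutoff,
and the unemployment numbers below are all invented for illustration; they are not
real data, and a real model would be more elaborate.

```python
def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    return slope, y_mean - slope * x_mean


def retrodict(series, cutoff):
    """Fit on series[:cutoff], predict the rest, and return the mean absolute error."""
    slope, intercept = fit_line(series[:cutoff])
    held_out = series[cutoff:]
    predictions = [slope * x + intercept for x in range(cutoff, len(series))]
    errors = [abs(p - actual) for p, actual in zip(predictions, held_out)]
    return sum(errors) / len(errors)


# Invented unemployment rates (percent): a steady rise, then a turn downward.
unemployment = [5.0, 5.5, 6.0, 6.5, 7.0, 6.5, 6.0, 5.5]
error = retrodict(unemployment, cutoff=5)
print(f"Mean absolute error after the cutoff: {error:.1f} points")
```

A large error on the held-out stretch is the warning sign from the lecture: had we
used this same model in the past, it wouldn't have worked.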
There's another type of predicting other stuff that's way cool about models.
So think about models of the solar system, right? The heliocentric
model, the sun in the center. So you've got the sun sitting here in the
center, and the planets orbiting. The math didn't quite work out right. And they
figured out there must be a big planet out here that's causing the orbit of
another planet, Uranus, to be skewed a little bit. And the big planet was Neptune.
They couldn't see it. But their model predicted it. So the model predicted something
else, something whose effect was evident in the data. So models can
predict stuff other than what you expect them to predict. Which is really
cool.

Alright, reason six: to inform data collection. So let's suppose that you're
interested in educational reform, which is something I'm interested in. You want to
think, okay, how do we make better schools? Well, remember in our last
lecture, about being a clearer thinker, one thing models force us to do
is name the parts. So I want to think, how do we make better
schools? Well, there's a lot of data out there on school performance. So what I
want is some sort of model that explains why some students do poorly and why
others do well. So you think, well, what are the parts of that model? Well, it might
be things like teacher quality, we call that TQ, right? There might be parental status,
we call that PS, whether your parents went to college, whether they got high school
degrees, whether they're doctors, lawyers, that sort of thing. There might be total
spending in the school district, that might matter, right? Things like class
size, just put CS for class size. Class size probably matters a lot. Right? You
might argue that, you know, technology matters: is there technology in the
classroom? You might even argue there's general health: is health a big
consideration? And you can even argue about, you know, what are the other
students like in the school? What are the peer effects, the effect of
what other students do? So if you don't have a model, you don't even know what
data to go get. Models help you figure out, okay, what data should we get, what
data should be included, what data should we go out there and
find. So that use of models can be very valuable, since it tells you what data to go
out there and get.

Our last two reasons for why you model are a little bit different, but
they're, they're similar to one another. And that is that we can use data, right?
To sort of tell us more about the model, and then we can use the model to tell us
more about the world. So let me, let me explain what I mean a little bit.
[inaudible] confused. So, one thing is that these models let us estimate hidden
parameters in the model. So here's sort of a classic model from
epidemiology, the study of disease. It's called the SIR model. So there's three
types of people: there's susceptible people, there's infected people, and
there's recovered people. So there's a disease, and you could be susceptible to it,
you could be infected, or you could be recovered, and when you're recovered then
you're immune. You're not gonna get it again. So let's suppose that, you know, you
work for the Centers for Disease Control, and suddenly you see, oh my gosh, people
are getting sick. There's some sort of flu going on, but
you're not quite sure how it's spreading. Is it
airborne, right? Is this virus spreading, you know, through mucus or something?
You're not sure. And you're also not sure how virulent it is, so you're not sure how
many people are gonna get the disease. So let's draw a little graph
where you've got time on this axis, and you've got the number of people who have
the disease. And what you can do is see, over time, how
many people are getting the disease. Well, if you can see over time how many are
getting it, then from that data you can estimate how virulent the disease is, like,
how likely it is to pass from one person to another. And that's gonna allow you to
figure out, is the disease gonna go like this, or is it gonna go like that? And so,
from that data, you can estimate hidden parameters, right? Namely, how virulent
the disease is. Like, you can't tell just by looking at the world how likely one
person is to get it from another. But by looking at how many people get it, you can
go back and estimate that parameter. You can figure it out. That's what's really
cool. Alright?

Last reason,
calibration, so calibration refers to sort of constructing a model and then
calibrating it as close as possible to the real world. Let me give an example here.
So suppose I want to write a model of forest fires. So I'm going to draw some
really bad trees here. Here's a tree. Here's another tree, right. And I want to
know what's the probability, these are horrible trees, what's the probability
that the fire moves, right, from this tree to this tree. How fast does it move and
all that sort of stuff. Well, what I can do is gather, if the data
exists, tons of data about past forest fires, and with that past data I can
calibrate a really accurate model of forest fires. How likely are they to
spread? How does their speed depend on how dry the trees are, how much
precipitation there's been, what the wind speed is, all that sort of stuff? Once
I've got all that data, that would allow me then to figure out, you know, how
dangerous are particular forests? Right? I could say, oh my gosh, northern New Mexico
hasn't had rain in over two years. Here's how dry the soil is. Here's how dry the
trees are. Here's, you know, how many acres of forest we have. Here's what the
wind speed is. And you can estimate how dangerous a particular forest happens
to be at that particular moment in time. So you use all sorts of past data to
calibrate a particular model, you know, your big model, and then you can use that
model to construct policy. And that's what we're going to talk about in the next
lecture, right, how do we use models to make decisions, to strategize, right, and
to design things. Thank you.
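
For readers who want to see the hidden-parameter idea from the SIR discussion in
code, here is a minimal Python sketch. It assumes a simple discrete-time SIR model,
a known recovery rate, and a grid search over candidate transmission rates. Those
choices, the function names, and all the numbers are assumptions made for
illustration, not the lecture's own method, and the "observed" counts are generated
by the model itself rather than taken from real data.

```python
def simulate_sir(beta, gamma=0.1, population=1000, initial_infected=1, days=60):
    """Discrete-time SIR model; returns the number infected on each day.

    beta is the transmission rate (the hidden parameter), gamma the recovery rate.
    """
    s, i, r = population - initial_infected, float(initial_infected), 0.0
    infected = []
    for _ in range(days):
        infected.append(i)
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries  # recovered people are immune and stay recovered
    return infected


def estimate_beta(observed, candidates):
    """Pick the transmission rate whose simulated curve best matches the data."""

    def squared_error(beta):
        simulated = simulate_sir(beta, days=len(observed))
        return sum((a - b) ** 2 for a, b in zip(simulated, observed))

    return min(candidates, key=squared_error)


# Stand-in for real case counts: a curve generated with a "hidden" rate of 0.3.
observed = simulate_sir(beta=0.3)
candidates = [round(0.05 * k, 2) for k in range(1, 21)]  # 0.05, 0.10, ..., 1.00
print(estimate_beta(observed, candidates))
```

Nobody tells the search what the transmission rate is; it recovers the rate only
from how many people are infected over time, which is the point of the example.
With real, noisy counts the match would be approximate rather than exact.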