0:24

a product where we have sold the product at various prices. That's on the horizontal axis. And on the vertical axis, we've got the sales, the volume, the quantity sold of that product. And what we can see here is that as price goes up, quantity sold goes down, and vice versa. If the price is cheap, we sell a lot.

But when you look at those data points and ask whether a line fits them very well, there's a sense that there's some curvature. We'd describe that as curvature in the data. I have fit a linear model through the data, and that's what you can see in the bottom right-hand picture. I can fit a line through the data. But just because I can fit a line through the data doesn't mean that it makes a lot of sense.

One of the most important activities that you can do when you're running a regression is to look at the data, and I say always, always plot the data.

1:30

And by plotting the data here, I can see that a regression line, a linear model here, isn't a particularly good description of the underlying relationship. It misses a lot of the points; in particular, it provides a very lousy forecast for the times when this product was selling at a low price. Notice how those points at the beginning of the graph are way above the line. And if we go all the way to the other end of the plot, when the price was high, all the points are above the line there as well. So there's a systematic lack of fit of the points to the line. And that's telling me that this straight-line model doesn't look appropriate in this situation.
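The pattern just described, points above the line at both ends of the price range, can be checked numerically as well as by eye. The sketch below is illustrative only: the data are simulated from a made-up power-law demand curve (the lecture's pet food data are not reproduced here), and names such as `price` and `quantity` are assumptions.

```python
import numpy as np

# Simulated data, invented for illustration: demand that curves like a power law.
rng = np.random.default_rng(0)
price = np.linspace(0.6, 1.4, 40)
quantity = 60000 * price ** -2.4 * np.exp(rng.normal(0, 0.03, price.size))

# Fit a straight line on the original scale by least squares.
slope, intercept = np.polyfit(price, quantity, 1)
residuals = quantity - (intercept + slope * price)

# Average residual at the lowest prices, in the middle, and at the highest prices.
low_end = residuals[:5].mean()
middle = residuals[15:25].mean()
high_end = residuals[-5:].mean()

print(f"lowest prices:  {low_end:,.0f}")
print(f"middle prices:  {middle:,.0f}")
print(f"highest prices: {high_end:,.0f}")
```

If the straight line were adequate, all three averages would hover around zero. With curved data like this, the ends sit above the line and the middle sits below it, which is the systematic lack of fit the plot reveals.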

So this is going to happen a lot in practice. When you look at data, it's not necessarily going to be linear. That raises the question: what do we do in this situation, when we observe curvature in the data?

2:19

Well, the good news is that not everything is lost; there's something that we can do. And the thing that we do is to consider transformations of the data. Now when I say transformation, I mean a mathematical function being applied to the data. It could be applied to x, it could be applied to y, it could be applied to both.

2:39

Now, there are an infinite number of mathematical transformations out there. Which one should I use? That's where a basic knowledge of the key math functions that we discussed in another module really comes into play. Those basic functions are the linear, the power, the exponential, and the log function. Those are the ones that we most frequently use when we're thinking about transforming data. And given that the relationship isn't a straight line, of those functions the one that you will find used most in practice is the log function. That doesn't mean it's always going to work for you, but it certainly can provide some flexible models.

What I've done is take the data in this particular example. By the way, the product is a pet food, and what we're looking at is the price that the pet food is sold for and the quantity sold, in cases. We've taken the price and the quantity sold and applied the log transform to both. In this case, I'm using the natural logarithm. When we look at the data on the log scale, you can see that the relationship appears much more linear than it did on the original scale of the data.

So one of our basic approaches to seeing curvature in data is to consider transforming the data. And if you ask me, well, what transformation should I use, my answer generically is going to be: use a transformation that achieves linearity on the transformed scale. How do I know I've achieved linearity? Well, the answer is going to be always, always plot your data. Have a look at it on the transformed scale. And when I look at this data on the transformed scale, and that's the plot at the bottom left-hand side here, you see it's approximately linear. And putting a line through the data on the transformed scale seems to make much more sense.
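A minimal sketch of the transform-then-fit step, again on simulated data rather than the lecture's data set: take natural logs of both variables, fit a least-squares line on the log scale, and look at the residuals. The generating values (60000 and -2.4) are arbitrary assumptions chosen so the example has a known answer. A plot remains the primary check; the residual summary here is only a numerical stand-in for one.

```python
import numpy as np

# Simulated data, invented for illustration: a power-law demand curve with noise.
rng = np.random.default_rng(1)
price = np.linspace(0.6, 1.4, 40)
quantity = 60000 * price ** -2.4 * np.exp(rng.normal(0, 0.03, price.size))

# Apply the natural log to both x and y.
log_price = np.log(price)
log_quantity = np.log(quantity)

# Fit a straight line on the log-log scale by least squares.
b1, b0 = np.polyfit(log_price, log_quantity, 1)
residuals = log_quantity - (b0 + b1 * log_price)

print(f"intercept b0 = {b0:.3f}, slope b1 = {b1:.3f}")
print(f"mean residual, lowest prices:  {residuals[:5].mean():.4f}")
print(f"mean residual, middle prices:  {residuals[15:25].mean():.4f}")
print(f"mean residual, highest prices: {residuals[-5:].mean():.4f}")
```

On the log scale the fitted slope should land close to the -2.4 used to generate the data, and the residuals should no longer show the ends-high, middle-low pattern that a straight line produces on the original scale.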

4:41

So that's how we will typically proceed. Now, the only downside of this so far is that if I go in and give a presentation to people, and I show them a plot of the data on the log-log scale with my line on the log-log scale, a lot of people don't like that. They don't understand logarithms. And so you often do everyone a favor by taking that model on the transformed scale and back-transforming it to the original scale of the data.

5:07

When you back-transform this log-log model from the transformed scale to the original scale of the data, you get the graph that you see on the right-hand side. So the two graphs are presenting the same data. They are presenting the same model. But they're doing it on different scales. We fit the straight-line model on the log-log scale, and for presentation purposes we would typically back-transform to the original scale of the data, and then my best-fitting line becomes the best-fitting curve.
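The back-transformation can be written out in one step. Writing the fitted line on the log scale with a generic intercept b0 and slope b1 (symbols assumed here for illustration), exponentiating both sides turns the straight line into a power curve on the original scale:

```latex
\[
\log(y) = b_0 + b_1 \log(x)
\quad\Longrightarrow\quad
y = e^{\,b_0 + b_1 \log(x)} = e^{b_0}\, x^{b_1}
\]
```

So a straight line on the log-log scale corresponds to a curve of the form "constant times x raised to a power" on the original scale.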

So what that says to you is that, so long as you're willing to transform your data, you are, with a regression methodology, going to be able to capture all sorts of interesting relationships between variables with your quantitative model. So the log-log model is a pretty good fit to this demand data, the demand data being how the quantity sold depends on the price of the product.

Now, I want to show you the formula for the model that we've just fit. It's a regression model, so it's a model for the mean. I'm looking at the first equation now, so we write that as an expected value. But now we're not working with the sales, we're working with the log of sales. So this regression model is the expected value of the log of sales, given the price, where we are doing a log transform on the price. And so we have: the expected value of the log of sales is equal to some constant b0 + b1 times the log of price. I would term this a log-log model: we've got the log of y and the log of x. In this particular instance, when we use the method of least squares to fit this, we have an intercept of 11.015 and a slope of -2.442.
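To make the fitted equation concrete, here is a small sketch that plugs the reported coefficients (intercept 11.015, slope -2.442) into the model and back-transforms the result to the original scale. The function names and the example prices are hypothetical, and the price units are whatever the original data used, which the lecture does not state.

```python
import math

B0 = 11.015   # intercept reported in the lecture (log scale)
B1 = -2.442   # slope reported in the lecture (log scale)

def predicted_log_sales(price):
    """Fitted value of log(sales) for a given price: b0 + b1 * log(price)."""
    return B0 + B1 * math.log(price)

def predicted_sales(price):
    """Back-transform of the fitted line to the original scale of the data."""
    return math.exp(predicted_log_sales(price))

# Hypothetical prices, chosen only to show the shape of the curve.
for price in (0.70, 0.85, 1.00, 1.15):
    print(f"price {price:.2f}: log(sales) {predicted_log_sales(price):.3f}, "
          f"sales {predicted_sales(price):,.0f}")
```

At a price of 1 the log of price is zero, so the fitted log of sales is just the intercept; as price rises, predicted sales fall along a curve rather than a straight line.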

And because it's a log-log model, that slope has an interpretation, and the interpretation is as what we call an elasticity.

7:10

We talked about that in one of the other modules. An elasticity tells you how a percent change in x is associated with a percent change in y. And the -2.442 is telling me that, based on this analysis, as price goes up by 1%, I anticipate sales to fall by 2.442%. So: percent change in x, percent change in y. That's the interpretation of the slope in a log-log model. And one of the caveats of that interpretation is that you should only use it for small percent changes.
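The "small percent changes" caveat can be seen by comparing the elasticity rule of thumb (slope times percent change in price) with the change the log-log model itself implies. This is a sketch using the reported slope; the function name and the list of price changes are assumptions made for illustration.

```python
B1 = -2.442  # slope (elasticity) reported in the lecture

def model_pct_change_in_sales(pct_price_change, elasticity=B1):
    """Percent change in predicted sales implied by sales = constant * price**elasticity."""
    ratio = (1 + pct_price_change / 100) ** elasticity
    return (ratio - 1) * 100

for pct in (1, 5, 10, 25):
    rule_of_thumb = B1 * pct
    implied = model_pct_change_in_sales(pct)
    print(f"price +{pct}%: elasticity rule {rule_of_thumb:.2f}%, model {implied:.2f}%")
```

For a 1% price increase the two numbers agree closely; as the price change grows, the rule of thumb increasingly overstates the drop in sales compared with what the model implies.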

7:46

The equation that I've presented now shows you how you might go about creating a model for a subsequent optimization. One of the things I've been talking about is how models are frequently inputs to an optimization process. Here the optimization would be: I wonder what the best price would be for this product in order to maximize my profit. You can approach such a question through calculus, but we need some inputs. We need a function to optimize, and the regression model is giving us such a function. It's reverse engineering the underlying process, or at least an approximation of that process, one that we think adequately captures the association between the outcome, quantity sold, and the input, the price of the product. So you can see how we can create the setting for a subsequent optimization by fitting one of these regression models.
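As a sketch of the kind of optimization described here, suppose each case costs a fixed amount to supply. That unit cost is not given in the lecture; the value below is a pure assumption, as are the function names and the price grid. Profit is then (price - cost) times predicted quantity, and for a log-log demand curve with slope b1 below -1, setting the derivative of profit to zero gives a best price of cost * b1 / (1 + b1). The code checks that calculus result against a brute-force search.

```python
import numpy as np

B0, B1 = 11.015, -2.442   # coefficients reported in the lecture
UNIT_COST = 0.50          # hypothetical cost per case; not from the lecture

def predicted_sales(price):
    """Back-transformed log-log model: exp(b0) * price**b1."""
    return np.exp(B0) * price ** B1

def profit(price, unit_cost=UNIT_COST):
    """Profit = margin per case times predicted cases sold."""
    return (price - unit_cost) * predicted_sales(price)

# Calculus answer: d(profit)/d(price) = 0 at cost * b1 / (1 + b1), valid when b1 < -1.
best_price_calculus = UNIT_COST * B1 / (1 + B1)

# Brute-force check on a fine grid of candidate prices (the grid range is arbitrary).
grid = np.linspace(UNIT_COST, 3.0, 250_001)
best_price_grid = grid[np.argmax(profit(grid))]

print(f"calculus: {best_price_calculus:.4f}, grid search: {best_price_grid:.4f}")
```

With these coefficients the ratio b1 / (1 + b1) is about 1.69, so under this sketch the profit-maximizing price is roughly 69% above unit cost, whatever that cost turns out to be.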
