In order to find the best line, all we have to find is the slope.

Well, here's how we could potentially do that.

We would want to find the slope, beta, that minimizes

the sum of the squared distances between the observed data points, the Yi, and

the fitted points on the line, the Xi beta.

We'll square those distances and add them up, and this is directly analogous to

finding the least squares mean that we did just a couple of slides ago.

So this is exactly like using the origin as a pivot point and

picking the line that minimizes the sum of the squared vertical

distances between the points and the line.
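As a minimal sketch (in Python rather than the course's R, with made-up data), here is that minimization in closed form: setting the derivative of the sum of squares to zero gives beta = sum(xi yi) / sum(xi squared).

```python
# Least squares through the origin: minimize sum_i (y_i - x_i * beta)^2.
# Setting the derivative with respect to beta to zero gives
# beta = sum(x_i * y_i) / sum(x_i^2).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]  # made-up data, roughly y = 2x

beta = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
print(beta)  # close to 2, as the data suggest
```

The same value is what you would converge to by experimenting with candidate slopes and keeping the one with the smallest sum of squared vertical distances.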

So we're going to use RStudio's function manipulate to

experiment with this and see if we can find that line.

Now, there is a point to regression through the origin: it's useful for

explaining things, because we only have one parameter, the slope,

rather than two parameters, the slope and the intercept.

But it's generally bad practice to force regression lines through the point

zero, zero.

So, an easy way around this is to subtract the mean from the parents' heights and

the mean from the children's heights, so that the zero, zero point is right in

the middle of the data, and that will make this solution a little more palatable.
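A quick sketch of that centering step (Python with made-up heights, standing in for the course's parent-child data): subtracting each variable's mean moves the zero, zero point to the center of the cloud.

```python
# Centering: subtract each variable's mean so that the point (0, 0)
# sits in the middle of the data. Heights here are made up.
parent = [68.0, 70.0, 66.0, 72.0]
child = [69.0, 71.0, 67.0, 70.0]

parent_mean = sum(parent) / len(parent)
child_mean = sum(child) / len(child)

xc = [p - parent_mean for p in parent]
yc = [c - child_mean for c in child]

# Each centered variable now averages to zero, so forcing the
# fitted line through the origin is no longer a strange constraint.
print(sum(xc) / len(xc), sum(yc) / len(yc))
```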

And we'll discuss later on how this relates to real regression,

where you fit both the slope and an intercept.

Let me just show a picture to illustrate some of these concepts.

So here, I have a scatter plot where I have some data on the y-axis and

some data on the x-axis, and I want to use my x variable to predict my y variable.

So this is my x-axis and this is my y-axis.