0:53

It's week one, session two of operations analytics.

We are focusing on descriptive analytics,

and we're gonna look at forecasting.

With past historical data, moving averages, and

exponential smoothing in the advance slide.

And that will be our session two.

3:50

Now, we are ready to look at descriptive statistics for our data.

Let's look at the entire data set, D1, D2 through D100.

We calculated the sample average and we get 52.81.

We calculated the sample standard deviation and we get 13.73.

In the excel file, DemandData.xlsx,

I show you how to calculate these two from our data set on an excel sheet.

5:32

Once you close the bracket and press enter,

you have the standard deviation which is 13.73023.

If you want to use the number of decimals we want to use,

we just go click here and we close and we get 13.73.

So the descriptor statistics, we have for our data is

the sample average of the sample mean, which is 52.81,

and the standard deviation is 13.73.

Mean and standard deviation are two descriptor statistics for our data.

6:08

If our data will normally distribute it, these two service checks would be

sufficient to describe the demand, predictive statistics.

However, if we need to use our data to forecast,

we need to adjust our sample standard deviation for forecasting.

We will talk more about this later.

6:28

Let's look at the notation for our forecast.

Remember that we denoted past demands as D1, D2, D3 up to Dt,

which are the past values of the demand.

We are interested in making a forecast in period t for a future period.

We have data up to the period t.

7:04

The second subscript tells you which period you're making the forecast for.

For example, Ft,t + tau, means it's a forecast that's being made in period T,

for a future period t + tau, where tau can be one, or two, or three.

Tau denotes how many periods ahead you are looking, so Ft,t+3 as an example,

is the forecast made in period t for three periods ahead.

7:35

Let's put some numbers in there and look at an example.

F100,100+3 is the forecast made in period 100,

for demand in period 103.

This is called and three step forecast, why?

Because we are looking to forecast three periods ahead.

8:07

This is because typically, we are interested in the next outcome, of simply,

one-step ahead forecast.

F t, t + 1 is the forecast made in period t for period t + 1.

Just looking at another example, F t- 1,

t is the forecast made in period t- 1 for the next period t.

8:36

F t + 1 is simply going to say,

what is the one-step forecast made for period t + 1?

So if I want a one-step forecast for

period t + 1, I simply call it F t + 1.

In other words, the F t + 1 stands for

forecast made in period t for the next period, t + 1.

9:20

So a stationary time series has the following form.

Demand in any time period t, demand D t, can be written as sum of two things.

Mu, which is a constant, sometimes the sample average,

plus epsilon t, which is the random variable.

10:44

We'll denote the moving averages method that uses n data points simply as MA (n).

To come up with one-step-ahead forecast for period t,

using the moving averages method, all we have to do is the following.

To come up with a forecast for period t, F t,

you take the end data points from the past periods.

Going from t -1, t- 2, up to t- n, and then / n.

The average of the past data points gives you the forecast for the next median.

11:48

You might wonder why it is called the moving averages method.

It's called the moving averages method because the chosen data points move and

are always the most recent n data points.

For example, if you wanted to forecast for period D, you took

n data points that went from D- 1, D- 2, up to D- n.

14:07

And our data, if it was normally distributed, let's say like a bell curve,

the above statistics that we calculated, namely, the mean and the standard

deviation, would be sufficient to describe the demand distribution.

In this Excel video, I'm gonna use week one MA template,

or moving averages template file, to show you moving averages calculations.

15:34

Again, the average, or the mean, is given by the average function.

Take the average function.

Start selecting the data.

And you can see, as you go up, you can see the count happening.

And that's your ten data points.

And typically, on your bottom of your Excel sheet,

there is also a count expression that you can check.

So, once you have this.

The average answer will be 49.60, which is what I showed you on the Excel sheet.

Again, these ten data points here

give you the average of 49.6.

And we're gonna calculate the standard deviation now.

The standard deviation, again, is given by STDEV function, and select the ten points.

16:59

And the average of 20 past demands is going to be

these 20 numbers that I'm picking up, and you get 51.95.

And the standard deviation is going to be,

for those 20 points, 9.61,

or 9.616, which is 9.62.

Here we have the average of the sample mean, and

the standard deviation for ten data points.

And for 20 data points.

Similarly, you can do it for all data points.

For all your data points, we should get the numbers we got before.

That is periods 1 through 100,

the average is 52.81 and the standard

deviation for all this data is 13.73.

And as you can see, the more data you have,

the more ability you have to calculate these numbers, and

more confidence you have in the forecasting process.

19:09

The standard deviation for prediction is going to be the standard deviation

we calculated plus a corrective factor.

And the corrective factor is s, standard deviation calculated, divided

by the square root of n, where n is the total number of data points you used.

19:51

And the adjustment gets you a sigma.

So here we calculate s is equal to 10.28.

The correction factor is s over square root of n.

So 10.28 + 10.28 / square root of 10.

Because we use ten data points, that gives us 13.53.

As you can see, that's a significant correction.

20:39

The sample standard deviation comes out to be 9.62.

Let's do an example for predictive statistics using 20 data points.

Again, when the data is normally distributed, the mean for

prediction that we got before, the descriptive sample mean, is sufficient.

21:30

Plus the correction factor of s over square root of n.

In this case we get 9.62 + 9.62 / square root of 20,

because we used 20 data points, and we get sigma as 11.77.

For forecasting purposes, we can assume that the actual demand distribution

will deviate from the average.

With the above standard deviation, which is 11.77.

Now let's think about using more data.

As we have more data for forecasting.

What happens is descriptive statistics approach predictive statistics.

As we have more data, we gain more confidence for prediction.

22:15

Also note as the number of data points increase, the descriptive standard

deviation approaches the standard deviation for prediction.

That is, as n becomes larger and larger.

The second term, the correction term, disappears and

sigma becomes almost equal to s.

Suppose we use our entire data set for calculating our statistics.

That is, we're using all the data available of n equal to 100.

The descriptive statistics, as calculated from Excel, are the following.

Mu is equal to 52.81 and standard deviation s is 13.73.

And if our data was normally distributed, for normal distribution,

the predictive statistics are as follows.

The mean, mu, is still going to be the same, it's 52.81.

However, the standard deviation for prediction, as we saw before,

needs to be corrected and adjusted.

And that is gonna be 13.73 plus a correction factor of 13.73 divided by

square root of 100, because we used all 100 data points.

And that comes to 15.10.

So what does this really mean?

That means for our demand, for prediction purposes,

we can use a normal distribution with mean 52.81 and

standard deviation 15.10 in the coming weeks.

I show you how to calculate all this numbers,

in an Excel file, and the solutions are all documented

in the Excel file, Week1 MASolution.xlsx.

Based on our predictive analysis, we can now generate a normal distribution

graph to visualize how our demand is distributed.

We use a mean of 52.81, a standard deviation of 15.10, and

the figure shows you how your demand distribution looks like.

You can see there are a lot of numbers close to the average,

52.1, so demand is likely to be closer to the average, and

there are small probabilities where demand is likely to be very low,

or demand is likely to be very high, but it's possible.

24:54

It's straightforward to compute and it does provide stable forecasts.

There are some disadvantages, though.

First, if there is trend in your data set, the moving averages method will lag

behind the trend, as we will see in section three.

25:14

Finally, it's not a causal model.

That is, the moving averages method won't explain why

your demand realizations in the future behave in a certain way.

It might be very good for prediction, but

it doesn't explain why your demand behaves in a certain way.

26:04

Moving Averages, what data should we use for moving averages?

If you choose to use moving averages method of last ten data points,

remember two things.

One, all older data is ignored.

That is, data from 12 periods back is not used in your calculations for

moving averages.

27:28

For one step ahead forecast, your error in the time period

T is the difference between forecast made in that period,

FD, minus the demand that occurred in that period, DT.

The difference tells you what your error in the forecasting was.

29:54

Once we have a forecast for period 82, the demand for period 82 occurs, and so on.

So, we are able to calculate the errors.

And using MA(10), and

then we can calculate the errors by comparing the demands and

forecasts over periods 81 through 100.

In Week1ErrorsTemplate.xls file, I'll walk you through an example exercise.

31:21

So the forecast for

moving averages of ten periods is just the average of the last ten observations.

So I just do that here.

I start in period 81, and

I'm going to take the average of all demands from period 71 to 80.

And that gives me 55.8.

The forecast is for 55.8.

The actual demand turned out to be 52, so there is an error.

And the error is measured typically by forecast,

minus the demand, and you get Positive 3.8.

That's your error.

In column G I'm gonna calculate the absolute error.

32:24

In column H, I'm going to calculate the squared error,

which is 3.8 times 3.8.

And then, in the last column, I,

I'm gonna calculate the absolute value of the first integer.

Sometimes you're interested in the first integer.

You add the error of 3.8 but how does the error compared to the actual demand?

33:35

And as you can see the errors can be distributed quite significantly.

You have an error as high as 21.8 and

as low as -2.7.

We can see errors can be negative, -10.4, in period 90.

Within a couple of periods they have as a positive and it's positive 14.1.

So the absolute deviation takes care of both negative and

positive errors as errors, and calculates the average absolute deviation.

34:14

We can do that, and that's our MAD.

MAD is mean absolute deviation, of the average of the absolute errors,

and in this case we get the average of absolute errors is 8.9.

Similarly, we calculate the mean square error, or the average square error.

And that's going to be the average of all these square errors.

And that's 113.15.

Finally, we calculate MAP, or mean absolute percentage error.

Which tells us roughly what your percentage deviation of

your forecast is from the true demand.

So let's take the average of all these numbers, and we get 0.197.

To express this as a percentage, we have 20%, or actually, 19.72%.

So on average,

your forecast deviates from the demand by 19.7%.

The square error is about 113%.

The absolute value of the error you have is about 8.9 units of demand.

37:11

And that's 0.067.

So we take these five numbers, again we drag them over the last 20 periods,

and we have forecasts, we have errors, we have absolute errors,

we have square errors, and we have percentage errors.

Again let's calculate the average, absolute error.

And that's 7.655.

That's come down, compared to what it was before, 8.9.

The average square error, or the mean square error, MSC.

37:51

That's come down.

And that's 92.611.

And the percentage error, let's calculate the percentage error in this case.

And that's come down too, and

in fact it's 17.29, 17.29%.

So MAD has come down absolute deviation.

Square error MSC has come down, absolute percentage error, MAP has also come down.

39:01

At the end of the analysis, I'm going to show you that if

you use a moving averages method of MA(10) we will get,

MAD is 8.9, MSE is 113.15, and MAPE is 19.72%.

Using MA of 20 points, moving average of 20 data points.

39:25

The MAD, or mean absolute deviation, is 7.66.

The mean squared error is 92, and

the absolute percentage error is 17.29%.

So this at least answers that MA(20) seems to be performing better than MA(10).

The solution for the Excel analysis is available in Week1ErrorsSolution.xls.

It may be preferable to use MA(20) over MA(10) in this case,

by comparing the corresponding errors.

Thus, measuring errors allows us to understand better

the choice of which method to use in our forecasting.