Let's do an exercise.
In this video, you will practice testing whether
a categorical X variable has a significant effect on a numerical Y variable.
The Y variable is your Throughput Time of a client request and it's measured in days.
This is the CTQ in your Lean Six Sigma Project.
The work is divided amongst three departments.
Department A, B, and C. You want to know if all departments are equally fast.
That is, if the throughput time is equal across the departments,
or if one department has a longer throughput time than the other.
Now pause the video,
load your data into Minitab,
and answer this question before you continue. Good luck.
Are you ready? Have you answered the question?
Okay, let's discuss the solution to this exercise.
First you check: what kinds of variables are you dealing with?
We have here a numerical Y variable,
Throughput Time and the categorical X variable, Department.
The tree diagram points us towards ANOVA with the possibility of a Kruskal-Wallis test.
These are the three steps of ANOVA,
and we have to start by organizing our data.
For the first step,
we check whether the data is in the correct format.
We have our department in one column,
and the throughput time in the other column.
Hence, our Y variable throughput time is in one column,
and our X variable department in the other,
which is the format as it should be.
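For readers following along outside Minitab, the same layout can be sketched in Python. The rows below are made up for illustration and are not the course data, and the helper name group_by_department is hypothetical. Each row holds one request, with the X variable first and the Y variable second.

```python
from collections import defaultdict

# Made-up rows for illustration only; these are not the course data.
# Stacked format: one row per request, X in one column, Y in the other.
rows = [
    ("A", 17.5), ("B", 9.0), ("C", 21.0),
    ("A", 14.0), ("B", 12.5), ("C", 18.5),
    ("A", 19.0), ("B", 11.0), ("C", 22.0),
]


def group_by_department(rows):
    """Collect the throughput times (days) of each department."""
    groups = defaultdict(list)
    for department, days in rows:
        groups[department].append(days)
    return dict(groups)


groups = group_by_department(rows)
for department in sorted(groups):
    print(department, groups[department])
```

Keeping the data stacked like this means the analysis only needs to know which column is the response and which is the factor.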
Now, we can go to our second step of ANOVA,
and that is to perform the analysis itself.
We can find the ANOVA analysis under the menu Stat, ANOVA, One-Way.
Next, you have to fill in the Response, your CTQ, which is the throughput time.
And for Factor, the influence factor,
you have to fill in the department.
For ANOVA under options,
we uncheck the Assume equal variances option.
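What unchecking this option changes can be sketched by hand. As far as I know, Minitab then runs Welch's version of one-way ANOVA, which weights each department by its own variance. That mapping is an assumption and is not stated in the video. The function name welch_anova and the numbers are made up for illustration.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    groups: list of samples, one list of numbers per department.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so noisy groups count for less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    between = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Made-up throughput times in days, not the course data.
a = [17.5, 14.0, 19.0, 16.0]
b = [9.0, 12.5, 11.0, 10.5]
c = [21.0, 18.5, 22.0, 26.0]
f_stat, df1, df2 = welch_anova([a, b, c])
print(f"F = {f_stat:.2f} on ({df1}, {df2:.1f}) degrees of freedom")
```

The P-value then comes from an F distribution with those two degrees of freedom, which this sketch leaves to the statistics package.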
Okay. And we go to Graphs,
and we can ask for an Individual value plot,
and you can uncheck the Interval plot.
To be a bit quicker in the third step,
we already ask for the Four in one plot.
Okay. This is the output we get.
We get our four in one plot,
we get the Individual value plots for the throughput times versus the departments,
and in your session window you'll get a lot of output related to the ANOVA analysis.
Let's take a look at the output.
The lines in the individual value plot are not horizontal,
which means that we have found differences in the averages for each of the departments.
The averages of the three departments are 16.9, 10.93, and 20.36.
The P-value is reported as 0.0,
which is lower than our five percent threshold.
This means that we have found statistical evidence
that there is a difference in throughput times across the departments.
However, we still need to check
the residuals before we know if this conclusion is valid.
This is the third step of ANOVA.
We already asked Minitab for the four in one plot.
So, we can immediately take a look at the output.
In the normal probability plot,
we clearly see that the residuals are not normally distributed.
The time graph also shows some outliers.
Hence, we have to perform a Kruskal-Wallis analysis.
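The residual check itself can be sketched as follows. A residual is an observation minus the average of its own department, and the normal probability plot pairs the sorted residuals with the scores expected under a normal distribution. The data are made up, with one very slow request in department C. The plotting position used is one common choice and not necessarily the one Minitab uses.

```python
from statistics import NormalDist, mean


def residuals(groups):
    """Residual = observation minus the mean of its own department."""
    result = []
    for values in groups.values():
        center = mean(values)
        result.extend(v - center for v in values)
    return result


def normal_plot_points(res):
    """Pair each sorted residual with its expected standard normal score."""
    n = len(res)
    # (i - 0.375) / (n + 0.25) is one common choice of plotting position.
    scores = [
        NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    return list(zip(scores, sorted(res)))


def correlation(pairs):
    """Pearson correlation of the (score, residual) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up data (days); department C contains one very slow request.
groups = {
    "A": [17.5, 14.0, 19.0, 16.0],
    "B": [9.0, 12.5, 11.0, 10.5],
    "C": [21.0, 18.5, 22.0, 45.0],
}
points = normal_plot_points(residuals(groups))
print(f"straightness of the normal plot: r = {correlation(points):.3f}")
```

A correlation close to 1 corresponds to points on a straight line. How far below 1 counts as "not normal" depends on the sample size, so in the video the plot is judged by eye.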
Let's take a look at how to do this in Minitab.
We can find the Kruskal-Wallis analysis under the Stat menu,
and then under Nonparametrics.
And here is the Kruskal-Wallis test.
Now, what is your Response?
Well, that's of course the throughput time, and your Factor is the Department.
Okay. You get your output in your session window. Let's study it.
The Kruskal-Wallis analysis confirms the conclusion of the ANOVA.
The P-value is still zero,
which is below five percent, and therefore,
we find statistical evidence for
a difference in the throughput times across the departments.
Department B still has
the smallest throughput time, as you can see by looking at the medians.
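What the Kruskal-Wallis test computes can be sketched by hand. It replaces the throughput times by their ranks and compares the rank sums of the departments. The data and function names are made up for illustration. The P-value line uses the large-sample chi-square approximation, and the shortcut exp(-H/2) holds only for three groups (two degrees of freedom).

```python
from collections import Counter
from math import exp
from statistics import median


def average_ranks(values):
    """Rank from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, p) with tie correction; this sketch needs three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups")
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, offset = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    # Tie correction: each set of t tied values contributes t^3 - t.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n_total ** 3 - n_total)
    return h, exp(-h / 2)  # chi-square tail probability for 2 df


# Made-up throughput times in days, not the course data.
a = [17.5, 14.0, 19.0, 16.0]
b = [9.0, 12.5, 11.0, 10.5]
c = [21.0, 18.5, 22.0, 45.0]
h, p = kruskal_wallis([a, b, c])
print("medians:", median(a), median(b), median(c))
print(f"H = {h:.2f}, approximate P-value = {p:.4f}")
```

Because only the ranks are used, the one very slow request in department C has far less influence here than it has on the averages in ANOVA.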
A practitioner can now investigate what Department B does differently,
and try to learn from its best practices.
But remember, the variation within each department
is very large, especially compared to the difference in the medians or means.
Therefore, Department is only a small influence factor on throughput time.
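One common way to put a number on "small influence" is eta squared, the share of the total variation in throughput time that is explained by the department. The video does not compute this; the sketch below is an addition, with a hypothetical function name and made-up data in which the spread within each department is large.

```python
from statistics import mean


def eta_squared(groups):
    """Share of total variation in Y explained by group membership."""
    pooled = [v for g in groups for v in g]
    grand_mean = mean(pooled)
    ss_total = sum((v - grand_mean) ** 2 for v in pooled)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Made-up throughput times in days with a large spread inside each department.
a = [17.5, 3.0, 30.0, 16.0]
b = [9.0, 25.0, 2.0, 10.5]
c = [21.0, 5.0, 35.0, 22.0]
share = eta_squared([a, b, c])
print(f"department explains {share:.0%} of the variation")
```

A value near 0 means that knowing the department tells you little about the throughput time of a single request, even when the test finds a significant difference.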
In summary, we performed an ANOVA,
and during the residual check,
we found that the residuals were not normally distributed.
This is shown in the four in one plot.
Hence, we performed a Kruskal-Wallis analysis to verify our ANOVA results.
The Kruskal-Wallis test showed a small P-value, which means there is
statistical evidence of a difference
in throughput times across the departments.
Department B scored lowest on
the throughput time and was therefore the fastest department.
This knowledge can now be used to search for best practices at B and then
implement these for departments A and C.