0:05

Our focus in this module is to understand how to do less work and still get mostly the same amount of information, as if we had done all the work.

A bit of educated guessing is required, and some assumptions are made along the way.

Now, do you remember the rule that when we are dealing with a system with "k" factors, each at two levels, we will have 2 to the power of "k" experiments?

That's a lot of experiments in many cases.
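(If you want to see just how quickly that rule gets out of hand, here is a couple of lines of Python; this is my own illustration of the arithmetic, not part of the course software.)

```python
# Number of runs in a full two-level factorial grows as 2**k
for k in range(2, 8):
    print(f"{k} factors -> {2**k} experiments")
# By 7 factors you already need 128 runs.
```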

We saw in the prior module that when we used the software, we could estimate all those coefficients.

The key insight that you will take away from these videos is that we don't have to run all those experiments.

We can do fewer, but there's a price to pay; and we're going to figure out what that price is in this video.

Here's an experiment with two factors at two levels, and there are four parameters that we can estimate: the intercept, the main effect of the first factor, the main effect of the second factor, and the two-factor interaction between them.

Here is a system with three factors, and as we can see, we can estimate eight parameters after we have completed the eight experiments.

A system with four factors will have a total of 16 experiments in a full factorial. Such a system will have 16 parameters that we can estimate using computer software.
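(You can generate these standard-order tables yourself; here's a small Python sketch, where the helper name is my own and not from the course software.)

```python
from itertools import product

def standard_order(k):
    """Standard-order table for a two-level full factorial in k factors.

    The first factor alternates fastest (-, +, -, +, ...), which we get
    by reversing the tuples produced by itertools.product.
    """
    return [tuple(reversed(levels)) for levels in product([-1, 1], repeat=k)]

table = standard_order(4)
print(len(table))   # 16 runs for 4 factors
print(table[0])     # (-1, -1, -1, -1): every factor at its low level
```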

You can probably appreciate that this procedure quickly becomes prohibitive for most practical systems.

There are many systems with 6, 7, or more factors.

We do not want to perform the many experiments required by the full factorial; it would be both time prohibitive and cost prohibitive.

This is even true for systems that can be highly automated, e.g. systems with DNA sequencing, or systems that are run using computer software and simulation.

There is also very little use in estimating all 2 to the power of "k" coefficients; that's many, many coefficients in some experiments.

Many of these higher-order interactions are non-existent, and many of those coefficients will be so small that they're practically zero.

You'll seldom see a three-factor interaction that is actually present in a real system; and fourth-order and higher interactions almost certainly don't exist in practice.

By using some educated guessing, and making reasonable assumptions about our system, we are going to figure out a way to do fewer experiments and still retain the essential information about the important effects in our system.

At the core of this approach is an implicit assumption: that we can ignore these higher-order coefficients in the model.

There are occasions when it is appropriate to do that, and there will be times when our assumptions are faulty.

It is critical to understand that there are practical situations where it's quite okay to lose some of the prediction accuracy from the higher-order terms.

Those higher-order terms definitely help you fine-tune the predictions, but the cost of obtaining them can be prohibitive.

You'll need to decide whether or not it is worth doing that work; and that's the subject of today's video.

Perhaps let me ask you to consider the question this way: if we only had the time and budget to do 4 experiments, which 4 of the original 8 would you do?

You might start by considering only running the 4 experiments here at the front, but that won't work so well, because you will only have factor C at its low level.

There will be no experiments at the high level for factor C, and so you won't really know what factor C does in the system.

So then you might say: "what if I select these two at the front and those two at the back?"

Those represent the middle four rows of the standard-order table.

That's not a bad choice, but it's not the best.

Let me show you a better choice, then I will explain it afterwards.

Here is the set of 4 experiments that you should do: either select the 4 with open circles, or the 4 with closed circles.

Notice the interesting pattern in the cube. It is intentionally selected that way, and let me explain why.

We'll work backwards here.

Assume we have completed these 4 experiments: the 4 with open circles.

Now when we analyze the data, we discover from the Pareto plot that factor A is not significant.

If A is not significant, it essentially implies that we could have ignored factor A, and never really needed to include it in our experiments.

Another way of saying that is that factor A could have been at the minus level or at the plus level, and it really wouldn't have affected our outcome variable much.

If A can exist at two levels and not really affect our outcome, that means we can collapse the minus and plus layers together.

And notice then what happens: as we do that, we recover 4 experiments in factors B and C. Four experiments in two factors; that's a full factorial!

We don't have to do any more work here; these four experiments that we've already run now complete a full factorial in factors B and C.

In fact, you can prove this to yourself for the case when factor B is not significant: the design then collapses to a full factorial in factors A and C.

If factor C is not significant, then it collapses to a full factorial in factors A and B.

So from that perspective, these are really a good set of 4 experiments to use.
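(You can verify this collapsing argument numerically. Here's a short Python sketch of my own, using -1/+1 coding for the four runs described above: drop any one factor's column and the remaining two columns always contain all four +/- combinations.)

```python
# The four half-fraction runs from the cube, coded as (A, B, C) tuples
runs = [(-1, -1,  1),
        ( 1, -1, -1),
        (-1,  1, -1),
        ( 1,  1,  1)]

# Drop any one factor's column and look at the settings that remain
for drop in range(3):
    pairs = sorted(tuple(v for i, v in enumerate(run) if i != drop)
                   for run in runs)
    # All four +/- combinations survive: a full factorial in the
    # two remaining factors
    print(pairs)  # [(-1, -1), (-1, 1), (1, -1), (1, 1)] each time
```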

So now let's imagine that we've run only these 4 experiments.

I'd like to show you how we could analyze the data, and I'm going to use the water treatment example again.

I hope you don't mind if I rename the factors to A, B, and C. I'm doing this because I want to use the water treatment example that you're comfortable with, but at the end I want to extend what we learn here today to any system, and A, B, and C are the most generic way to do that.

Now assume that each of these experiments is very expensive; maybe they cost around $10,000 each.

So instead of doing 8, let's assume we've done only these 4: half the work.

Our boss is going to be pretty impressed that we've saved $40,000.

Open the software and let's see what happens.

Using the best-choice design I talked about earlier, where you've only done experiments 2, 3, 5 and 8 from the original set, I'm going to ask the software to create new variables for A, B and C which only include those 4 experiments.

And here are the 4 outcomes at those conditions.

Now if you just go ahead and type in the code from the previous class, you can see that the software will create a model from A, B and C, and it includes 2- and 3-factor interactions.

But what you will notice that's different from last time is all these NA terms.

NA indicates that those terms cannot be estimated.

But we got estimates of 4 coefficients; we ran 4 experiments, so we expected that.

The full prediction model has 8 parameters, and would have required 8 experiments to calculate all 8 of them.
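(Here's why only 4 of the 8 coefficients can be estimated. In the half fraction, several columns of the model matrix are identical to each other, so the software has no way to tell those terms apart. This is a small Python check of my own, again using -1/+1 coding.)

```python
# The four half-fraction runs, coded as (A, B, C) with C = A*B
runs = [(-1, -1,  1),
        ( 1, -1, -1),
        (-1,  1, -1),
        ( 1,  1,  1)]

A   = [r[0] for r in runs]
B   = [r[1] for r in runs]
C   = [r[2] for r in runs]
AB  = [a * b for a, b in zip(A, B)]
AC  = [a * c for a, c in zip(A, C)]
BC  = [b * c for b, c in zip(B, C)]
ABC = [a * b * c for a, b, c in runs]

print(C == AB)   # True: the C column is identical to the AB column
print(A == BC)   # True: A is indistinguishable from BC
print(B == AC)   # True: B is indistinguishable from AC
print(ABC)       # [1, 1, 1, 1]: same as the intercept column
```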

Now let me assume we've done all 8 experiments, and let me compare that to the case where we've done only 4 of the experiments.

We're going to write out the two prediction models side by side so that you can see the differences between them.

In this particular example, you can see that three of the terms are numerically similar; that's not going to lead to serious misinterpretation.

However, there is one term that is very different. What has happened there?

I'm going to show you now how that reduced design was found. How did we come to that best choice?

We call this a half fraction.

The full set of experiments for 3 factors would've required 2 to the power of 3 experiments.

If we want to do half the work, then we can divide that by 2, which equals 4.

Or, for those of you who remember your exponent rules, we could write this as 2 to the power of (3 minus 1); that equals 2 to the power of 2, which equals 4.

There is a systematic way to select those four runs.

Since we know that we will have 4 experiments, we can quite happily go ahead and write out our standard-order table for the first two factors, A and B. We do this because we know two factors require 4 experiments.

Okay, but what about that third factor, factor C? At what settings should we write out that factor?

We write it out as C equals A times B. In fact, we say "generate factor C as A times B".

So there we have that factor C is equal to +, -, -, + for the 4 experiments: the product of the values in column A and column B.
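(That generating step is easy to script. Here's a minimal Python sketch of my own: write out the standard-order table for A and B, then generate C as their product.)

```python
from itertools import product

# Standard-order table for the first two factors, A and B
# (first factor alternates fastest: -, +, -, +)
ab = [tuple(reversed(levels)) for levels in product([-1, 1], repeat=2)]

# Generate the third factor as C = A * B
half_fraction = [(a, b, a * b) for a, b in ab]
for run in half_fraction:
    print(run)
# (-1, -1, 1)
# (1, -1, -1)
# (-1, 1, -1)
# (1, 1, 1)
```

Notice the C column comes out as +, -, -, +, exactly as written above.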

Let's visualize where those 4 points are on the original cube.

The first row is at low A, low B, and high C, so it appears here.

The next point is at high A, low B, and low C, so that's over here.

The third experiment is there, and the last experiment is at high A, high B, and high C.

Notice how that corresponds to the ideal selection of four experiments we made at the start of this video.

In the next video I'm going to show you where I got that rule that C should equal A times B.

So let's understand the trade-off here: if we do half the number of experiments, we have to accept that we get less information from the system.

I guess you can say there's no such thing as a free lunch; you can't get something for nothing.

The question is: "what is the penalty for doing fewer experiments?" "What is this free lunch costing me?"

I mean, if we had paid an extra $40,000 and done the extra four experiments, we'd have that extra information.

You can already see that over here: we had good estimates of three parameters, the intercept, the A main effect, and the C main effect.

But the B main effect was actually quite wrong.

Also, notice that we didn't get any estimates of the two-factor interactions.

Let me drop in two words that we will come back to in later classes: "screening" and "optimization".

When we are screening, we don't mind having reduced knowledge of the system. For example, we don't mind if the two-factor interactions are not all known, or if the estimates of the factors are not quite correct.

Later on, when optimizing, though, we want more specific information about the system: a better level of prediction accuracy.

That is when we will require better resolution of the main effects and interactions.

So this is what saving the $40,000 is costing us: a reduction in the model's prediction quality.

You could ask whether that's worth the money saved. Well, you'll never really know the correct answer unless you do the full set of experiments, but I'm going to show you how we can make some educated guesses later in this module.

What we've done here, by not running those extra experiments, is rather cleverly select a subset of them to save $40,000.

We can use this money later on, when we require a more detailed model to find the optimum in the system.

George Box, the famous statistician from whose textbook we're using this example, said, as a rough rule, that only a portion, about 25%, of the experimental effort and budget should be invested in the first experimental designs.

I've paraphrased that slightly, but basically he is saying that you should leave some money, and time, for later on, to figure out the details.

In the beginning you don't even know yet if A, B or C are actually significant. First figure that out before you go build a detailed model with two-factor and three-factor interactions.

That's where we're going to leave the class today.

We've shown you the end point: when you do half the work, you lose a bit of accuracy in your model, but there's a great built-in backup strategy in the clever selection of which half of the work to do.

I guess you could say: at least be smart about which half of the work you do.

In the next class we're going to learn the technical terms and the mechanics of creating these half-fractions.
