I. From "Few Factors" to "Many Factors": Core Questions about DOE and Limitations of Full Factorial Design
When first encountering DOE, many readers fall into a misunderstanding: "DOE can only handle problems with fewer than 3 factors". The root of this misunderstanding lies in over-reliance on "Full Factorial Design": a full factorial design requires testing all combinations of factor levels (for example, 4 tests for 2 factors and 8 tests for 3 factors, with two levels per factor). The number of tests increases exponentially with the number of factors (for two-level factors the formula is \(2^k\), where \(k\) is the number of factors).
Let's do some calculations:
- 6 factors require \(2^6 = 64\) trials;
- 10 factors require \(2^{10} = 1024\) trials.
What does this mean for enterprises? For example, in a production process optimization project running 2 tests per week, 64 tests would take 32 weeks (about 7 months), while 1024 tests would take 512 weeks (nearly 10 years). The costs of time, manpower, and materials will skyrocket, ultimately turning DOE into a castle in the air that is "theoretically feasible but actually infeasible."
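The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the function names are illustrative, and the rate of 2 runs per week is the one assumed in the example above.

```python
import math


def full_factorial_runs(k: int, levels: int = 2) -> int:
    """Number of runs in a full factorial design: levels ** k."""
    return levels ** k


def weeks_needed(runs: int, runs_per_week: int) -> int:
    """Whole weeks needed to finish all runs at a fixed weekly rate."""
    return math.ceil(runs / runs_per_week)


# Run counts for the factor counts mentioned in the text.
for k in (2, 3, 6, 10):
    runs = full_factorial_runs(k)
    weeks = weeks_needed(runs, runs_per_week=2)  # assumed rate
    print(f"{k} factors: {runs} runs, {weeks} weeks at 2 runs/week")
```

Changing `runs_per_week` shows how quickly the schedule grows even for a well-resourced team.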
This is the fatal limitation of the full factorial design: once the number of factors exceeds 5, the scale of the experiment exceeds what the enterprise can bear. But does this mean that DOE cannot handle multiple factors? The answer is exactly the opposite: the core advantage of DOE is that it uses the "Fractional Factorial Design" to solve the multi-factor problem.
II. Fractional factorial design: The key logic from full to fractional
The essence of fractional factorial design is to select "partial" experimental combinations from the full factorial design while maintaining the "orthogonality" among factors (i.e., the effects of each factor do not interfere with each other). It is not a "random reduction of experiments" but a precise selection of the "most valuable part" using mathematical rules - sacrificing the "discrimination of secondary interaction effects" and retaining the "ability to analyze main factors and key interaction effects".
Let's use a classic example to clarify this logic:
1. Basis of the full factorial design with 3 factors: Structure of 8 trials
A full factorial design for 3 main factors (A, B, C) requires testing all \(2^3 = 8\) combinations of levels (+1/-1 represent the two levels). At this time, the experimental table (Table II) contains 3 columns for main factors, 3 columns for two-factor interactions (AB, AC, BC), and 1 column for the three-factor interaction (ABC), a total of 7 columns. Any two columns are orthogonal (i.e., the correlation between columns is 0, and the effects do not interfere with each other).
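The 7-column structure can be checked directly. The following is a minimal sketch in plain Python (the helper names are illustrative, not from any DOE package): it builds the 8 runs, derives each interaction column as the product of its parent columns, and confirms that every pair of columns has a zero dot product.

```python
from itertools import combinations, product


def full_factorial(names):
    """All +1/-1 level combinations for the given factor names."""
    return [dict(zip(names, levels))
            for levels in product((-1, 1), repeat=len(names))]


def with_interactions(runs, names):
    """Add one product column for every interaction of 2 or more factors."""
    table = []
    for run in runs:
        row = dict(run)
        for size in range(2, len(names) + 1):
            for combo in combinations(names, size):
                value = 1
                for name in combo:
                    value *= run[name]
                row["".join(combo)] = value
        table.append(row)
    return table


def dot(table, a, b):
    """Dot product of two columns; 0 means the columns are orthogonal."""
    return sum(row[a] * row[b] for row in table)


table = with_interactions(full_factorial("ABC"), "ABC")
columns = list(table[0])  # A, B, C, AB, AC, BC, ABC
print(columns)
print(all(dot(table, a, b) == 0 for a, b in combinations(columns, 2)))
```

Each column is also balanced (four +1 and four -1 values), which is why a zero dot product is equivalent to zero correlation here.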
2. Add the fourth factor D: It must be mixed, but choose the right target
If we want to add the 4th main factor D to the 8 trials, we need to add a new column for D. However, an 8-row table has room for only 7 mutually orthogonal balanced columns, and all 7 are already occupied (3 main factors and 4 interactions), so it is mathematically impossible to add a new column that is orthogonal to every existing column.
What to do? Make column D exactly the same as one of the interaction columns. This is called confounding (also known as aliasing): the main effect of D is mixed with the effect of that interaction, and the two influences cannot be separated during calculation.
But confounding is not "random mixing". We should select the interaction with the least impact. Usually, the effect of a third-order interaction (such as ABC) is much smaller than that of the main factors, because it is rare in practice for three factors to have a significant joint effect. Therefore, the optimal choice is to let \(D = ABC\): directly copy the values in column D from column ABC, which keeps D orthogonal to A, B, and C (Table III).
At this time, the number of tests for the four main factors (A, B, C, D) is still eight, at the price that the main effect of D is confounded with the three-way interaction ABC. For the enterprise, this is usually acceptable: our goal is to "find the influence of the main factors", not to "calculate the magnitude of the three-way interaction".
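To make the \(D = ABC\) construction concrete, here is a small sketch (variable names are illustrative). It copies the ABC product into column D, checks that the four main-factor columns stay mutually orthogonal, and also shows a side effect worth knowing: with this generator, the two-factor interactions become confounded with each other in pairs, for example AB with CD.

```python
from itertools import combinations, product

# Eight runs of the 2^3 full factorial, with D generated as D = ABC.
runs = [{"A": a, "B": b, "C": c, "D": a * b * c}
        for a, b, c in product((-1, 1), repeat=3)]


def dot(x, y):
    """Dot product of two main-factor columns."""
    return sum(run[x] * run[y] for run in runs)


# The four main-factor columns remain pairwise orthogonal.
print(all(dot(x, y) == 0 for x, y in combinations("ABCD", 2)))

# Two-factor interactions now share columns in pairs: AB is identical to CD.
ab = [run["A"] * run["B"] for run in runs]
cd = [run["C"] * run["D"] for run in runs]
print(ab == cd)
```

The pairing follows from the generator: since D = ABC, the product CD equals C times ABC, which reduces to AB because any column multiplied by itself is all +1.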
3. From 4 to 7: How many factors can be accommodated in 8 trials?
Following this logic, for the 8 trials, the 5th, 6th, and 7th factors can still be added. For example:
- The 5th factor E = AB (confounded with the two-factor interaction AB);
- The 6th factor F = AC (confounded with AC);
- The 7th factor G = BC (confounded with BC).
For each additional factor, there will be one more layer of confounding. However, as long as our goal is to "quickly screen the main factors", these confoundings are all "acceptable sacrifices" - because what enterprises need most are the "vital few" factors, rather than "all the details of all factors".
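The "layers of confounding" can be listed explicitly. The sketch below assumes the generators named above (D = ABC, E = AB, F = AC, G = BC) and uses illustrative helper names. For each main factor it finds the two-factor interactions whose product column coincides with that factor's column (up to sign).

```python
from itertools import combinations, product

# Generators taken from the text: each new factor reuses an interaction column.
GENERATORS = {"D": "ABC", "E": "AB", "F": "AC", "G": "BC"}


def build_design():
    """Eight runs holding seven factors (a saturated two-level design)."""
    design = []
    for a, b, c in product((-1, 1), repeat=3):
        run = {"A": a, "B": b, "C": c}
        for new, parents in GENERATORS.items():
            value = 1
            for parent in parents:
                value *= run[parent]
            run[new] = value
        design.append(run)
    return design


def column(design, name):
    return [run[name] for run in design]


def main_effect_aliases(design):
    """Map each factor to the two-factor interactions sharing its column."""
    factors = sorted(design[0])
    aliases = {f: [] for f in factors}
    for x, y in combinations(factors, 2):
        prod = [run[x] * run[y] for run in design]
        for f in factors:
            col = column(design, f)
            if f not in (x, y) and (prod == col or prod == [-v for v in col]):
                aliases[f].append(x + y)
    return aliases


design = build_design()
for factor, partners in main_effect_aliases(design).items():
    print(factor, "is confounded with", partners)
```

In this saturated design every main effect shares its column with three two-factor interactions, which is why it suits screening only when interactions are believed to be small.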
III. The core trade-off in fractional factorial design: Balancing cost and precision
The essence of fractional factorial design is to "focus on the major factors and overlook the minor ones".
- When the number of factors is large and the test cost is high, we prioritize ensuring that "the effects of the main factors can be accurately identified" and sacrifice the discrimination of "secondary interactions" for "a smaller number of tests".
- Confounding is not a defect but a strategy — as long as we clarify that the goal of the experiment is screening, not precise modeling, confounding will not affect the core conclusion.
For example:
- Eight trials can handle up to seven factors, but then all seven columns are occupied and every main effect is confounded with two-factor interactions.
- 16 trials can handle up to 15 factors, and for a given number of factors the confounding is lighter than with 8 trials.
- 32 trials can handle up to 31 factors.
For a fixed number of tests, the more factors are packed in, the heavier the confounding. However, as long as the goal is to "screen out the key factors", this trade-off is worthwhile, because the core need of the enterprise is to "quickly find the root cause of the problem" rather than "publish statistical papers".
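The pattern in the list above (8 trials for up to 7 factors, 16 for up to 15, 32 for up to 31) is "number of trials minus one". A minimal sketch, with a hypothetical function name, that returns the smallest power-of-two trial count able to give each of \(k\) two-level factors its own column:

```python
def min_screening_runs(k: int) -> int:
    """Smallest power-of-two run count N with N - 1 >= k columns available.

    Assumes two-level factors and that only main effects need their own
    column; interactions are allowed to be confounded with them.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = 2
    while n - 1 < k:
        n *= 2
    return n


for k in (3, 6, 7, 8, 15, 31):
    print(f"{k} factors -> at least {min_screening_runs(k)} runs")
```

This gives only the lower bound for screening; choosing a larger design than the minimum buys lighter confounding.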
IV. Practical Case: Solve the Website Click-Through Rate Problem of 6 Factors with 8 Tests
Let's take a real-life case of a foreign enterprise to see how fractional factorial design can be implemented:
1. Problem background: The crisis of Company ACB
ACB is an internet company that serves individual users. Recently, it has encountered a serious problem: its weekly website traffic has been declining continuously, and its industry ranking has dropped from the top 10 to the top 30. Senior management has demanded that "the key factors be quickly identified and the click-through rate be increased within three months."
2. Factor identification: 6 variables with high possibility
Through user research and data analysis, the project team screened out the 6 factors that "are most likely to affect the click-through rate":
- Number of keywords (5 vs 10);
- Keyword types (old keywords vs new keywords);
- URL title length (short title vs. long title);
- Weekly update frequency (once vs four times);
- The position of the keyword in the title (the 40th character vs the 70th character);
- Free gifts (Provided vs Not provided).
3. Dead end of the full design: 64 trials = about 15 months
If a full factorial design is used, \(2^6 = 64\) combinations need to be tested. With one trial per week, it would take 64 weeks (approximately 15 months). By the time the results are out, the market trends would have long changed, and the optimization plan would be worthless.
4. Selection of partial designs: 8 trials = 2 months
The goal of the project team is very clear: to quickly screen out the "key factors". Therefore, the \(2^{6-3}\) design is selected, that is, \(2^{6 - 3}=8\) trials ("6" is the total number of factors, and "3" is the number of factors whose columns are generated from interactions of the other three, so only \(1/2^3 = 1/8\) of the 64 full-factorial combinations are run).
The core advantages of this design:
- It only takes 8 weeks (one trial per week), which just meets the requirement of "getting results within 3 months".
- The main-factor columns remain orthogonal to one another, so the effects of the main factors can be estimated independently of each other.
- What is confounded with the main effects is interactions (two-factor and higher order); for screening purposes these are assumed to be small compared with the main effects.
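A run sheet for such an 8-trial plan might be generated as follows. This is a sketch under stated assumptions: the source does not say which generators the team used, so D = AB, E = AC, F = BC is one common choice, and both the assignment of letters to factors and the choice of which level is coded -1 or +1 are hypothetical.

```python
from itertools import product

# Hypothetical mapping: letter -> (factor name, level at -1, level at +1).
FACTORS = {
    "A": ("Number of keywords", "5", "10"),
    "B": ("Keyword type", "old", "new"),
    "C": ("URL title length", "short", "long"),
    "D": ("Weekly update frequency", "once", "four times"),
    "E": ("Keyword position in title", "40th character", "70th character"),
    "F": ("Free gifts", "not provided", "provided"),
}

# Assumed generators (not stated in the case description).
GENERATORS = {"D": "AB", "E": "AC", "F": "BC"}


def build_design():
    """Eight coded runs of a 2^(6-3) fractional factorial design."""
    design = []
    for a, b, c in product((-1, 1), repeat=3):
        run = {"A": a, "B": b, "C": c}
        for new, parents in GENERATORS.items():
            run[new] = run[parents[0]] * run[parents[1]]
        design.append(run)
    return design


def describe(run):
    """Translate a coded run into real factor settings."""
    return {FACTORS[k][0]: FACTORS[k][1 if v == -1 else 2]
            for k, v in run.items()}


design = build_design()
for week, run in enumerate(design, start=1):
    print(f"Week {week}: {describe(run)}")
```

In practice the run order would normally be randomized before execution; the sketch keeps the standard order so the structure is easy to read.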
5. Test results and conclusions: Two key factors solve the problem
Eight weeks later, the team tabulated the click-through rates of each combination (Table IV) and analyzed them using JMP software:
The Pareto chart (sorted by effect size) shows that the effects of "weekly update frequency" and "keyword type" are much greater than those of the other factors.
A normal probability plot of the effects (used to judge whether the effects are significant) confirms this: the effects of these two factors "significantly deviate from random fluctuations".
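The calculation behind a Pareto chart of effects is simple: for each factor, take the mean response at its +1 level minus the mean response at its -1 level, then sort by absolute size. The sketch below shows that calculation only; it does not reproduce JMP's output. The response values are a synthetic, noise-free placeholder generated from a made-up formula, NOT the data in Table IV, and the generators D = AB, E = AC, F = BC are an assumption.

```python
from itertools import product


def build_design():
    """2^(6-3) design with assumed generators D=AB, E=AC, F=BC."""
    return [{"A": a, "B": b, "C": c, "D": a * b, "E": a * c, "F": b * c}
            for a, b, c in product((-1, 1), repeat=3)]


def main_effects(design, response):
    """Mean response at +1 minus mean response at -1, per factor."""
    effects = {}
    for factor in design[0]:
        high = [y for run, y in zip(design, response) if run[factor] == 1]
        low = [y for run, y in zip(design, response) if run[factor] == -1]
        effects[factor] = sum(high) / len(high) - sum(low) / len(low)
    return effects


design = build_design()

# Synthetic placeholder response (hypothetical): only B and D matter.
response = [10 + 3 * run["D"] + 2 * run["B"] for run in design]

effects = main_effects(design, response)
ranked = sorted(effects, key=lambda f: abs(effects[f]), reverse=True)
for factor in ranked:
    print(factor, effects[factor])
```

Because the columns are orthogonal and balanced, the two factors built into the placeholder formula come out on top and the others show zero effect; with real, noisy data the small effects would scatter around zero instead, which is what the normal probability plot is used to judge.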
The conclusion is very clear:
- The click-through rate of content updated four times a week is approximately 30% higher than that of content updated once a week.
- The click-through rate of the "new keywords" is approximately 25% higher than that of the "old keywords".
6. Result implementation: Rapidly increase the click-through rate
ACB immediately adjusted its strategy:
- Increase the weekly update frequency from once to four times.
- Replace all old keywords with new keywords.
Three months later, the weekly traffic of the website increased by 42%, and its ranking returned to the Top 15. The problem of six factors was solved with eight trials. The cost was only 1/8 of that of a full design, yet the core goal was achieved.
V. Where lies the charm of fractional factorial design?
Fractional factorial design is not a "compromised version of DOE", but rather the "ultimate weapon" of DOE to deal with the real world:
- It resolves the contradiction of "multiple factors = high cost" and transforms DOE from a "laboratory tool" into an "enterprise-level tool".
- It captures the essence of the "key minority" - enterprises don't need to "analyze all factors"; they only need to "find the key factors".
- It achieves "principled corner-cutting" through mathematical laws: sacrificing secondary and unimportant information while retaining core and valuable conclusions.
For enterprises, the value of DOE has never been "calculating complex statistics" but "solving the most critical problems with the least cost" - fractional factorial design is the best manifestation of this value.