# Module 12

# Nested, Partially Nested, Hierarchical Design

Partially nested design is the basis of all advanced designs of experiments, including repeated measures, split-plot, subsampling, and other complex mixed models.

We explained in the earlier modules that the factor types (defined by their levels) are the key determinant in distinguishing or defining an experiment. For example, an experiment consisting of only fixed factors is called a fixed(-effect model) design of experiments. An experiment consisting of only random factors is called a random(-effect model) design of experiments. Similarly, an experiment containing a nested factor (or factors) is called a *nested design of experiment* (also known as a *hierarchical design*).

# Nested Factor

Factor *B* is said to be nested in factor *A* if each level of factor *B* occurs with only one level of factor *A*. The expression *B(A)* is used when factor *B* is nested in *A*. For example, leaves are nested in trees (*leaves(trees)*). Therefore, if a study uses tree species as factor *A* and leaves as another factor *B*, factor *B* (leaves) is nested in factor *A*, and the expression *leaves(trees)* is used.

**Figure 1**

*Tree Pictures are Licensed Under Creative Commons*

A specific leaf can only come from a particular tree, not from another tree (Figure 1). A few more examples: instructors are nested in schools; fish species are nested in ponds, lakes, or rivers; children are nested in parents; and countries are nested in continents. For the same reason, experimental units are nested in treatment combinations. The *nested design* is also known as a *hierarchical design* because, as the examples show, the nested factors maintain a hierarchy (e.g., children are nested in parents).

Table 1 distinguishes a factor that is not nested (known as *crossed*) from a factor that is nested in another factor. Any treatment combination of the two levels of the temperature and humidity factors can be studied to understand human comfort. All levels of the temperature factor can appear with all levels of the humidity factor; therefore, the *cross* or *interaction* between the temperature and humidity factors is possible. However, leaves #1, #2, and #3 came from tree species 1, not from species 2. Therefore, the cross (interaction) between factor *A* (tree species) and factor *B* (leaves) is NOT possible, and the treatment combinations marked with an “X” cannot be studied. Situations like this should be treated as a *nested-factor* experiment. In contrast, for the temperature and humidity experiment, all possible treatment combinations ((1,1), (1,2), (2,1), and (2,2)) can be studied, and situations like this should be treated as a *crossed-factor* experiment.
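The crossed-vs-nested distinction summarized in Table 1 can also be checked directly from data. The sketch below (plain Python, with made-up tree/leaf and temperature/humidity labels, not data from this module) tests whether factor *B* is nested in factor *A* by verifying that each level of *B* occurs with exactly one level of *A*:

```python
def is_nested(pairs):
    """Return True if, in the (a, b) observations, each level of
    factor B occurs with exactly one level of factor A."""
    parents = {}  # level of B -> the single level of A it appears with
    for a, b in pairs:
        if b in parents and parents[b] != a:
            return False  # this level of B appeared under two levels of A
        parents[b] = a
    return True

# Hypothetical observations (tree, leaf): each leaf belongs to exactly
# one tree, so leaves are nested in trees.
leaves = [("tree1", "leaf1"), ("tree1", "leaf2"), ("tree2", "leaf3")]

# Hypothetical observations (temperature, humidity): every level of
# humidity occurs with every level of temperature, so the factors are crossed.
comfort = [(t, h) for t in (1, 2) for h in (1, 2)]

print(is_nested(leaves))   # → True  (leaves(trees) is nested)
print(is_nested(comfort))  # → False (temperature x humidity is crossed)
```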

**Table 1**

*Nested vs Not Nested (Crossed) Factors*

# Fully Nested Design

As interactions do not make sense for nested factors, the model for a fully nested design includes only the main effects (Equation 1).

$$y_{ijkl} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \varepsilon_{l(ijk)}$$

*Equation 1*

where *μ* = overall mean, *α* = effect of factor *A*, *β* = effect of factor *B* nested in *A*, *γ* = effect of factor *C* nested in factor *B* (and therefore also nested in factor *A*), *y* = response, and *ε* = experimental error.
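A fully nested model like Equation 1 can be fit with ordinary ANOVA software by writing each nested factor as an interaction with the factor it is nested in, with no separate main effect. The sketch below uses simulated data and the `statsmodels` formula interface (both my own assumptions; the module does not prescribe software) for *B(A)* with two levels of *A* and three levels of *B* inside each level of *A*:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)

# Simulated fully nested data: 2 levels of A, 3 levels of B within each A,
# 4 observations per (A, B) cell.  Labels for B restart within each A.
rows = [(a, b, rng.normal()) for a in ("a1", "a2")
        for b in ("b1", "b2", "b3") for _ in range(4)]
df = pd.DataFrame(rows, columns=["A", "B", "y"])

# B nested in A is written as the interaction C(A):C(B) with no C(B)
# main effect; the formula machinery assigns the remaining
# a*(b-1) = 2*(3-1) = 4 degrees of freedom to that term.
model = smf.ols("y ~ C(A) + C(A):C(B)", data=df).fit()
table = anova_lm(model)
print(table)
```

The degrees-of-freedom column is the quick sanity check: 1 for *A*, 4 for *B(A)*, and 18 for the experimental error.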

# Partially Nested Design

A fully nested design is rarely used in practice; a partially nested design is far more common. *Repeated measures*, *split-plot*, *subsampling*, and many mixed models are in fact *partially nested designs*. Typically, one or two factors are nested alongside a couple of crossed factors. A typical model is provided in Equation 2, in which factors *A* and *B* are crossed, while factor *C* is nested in *B*.

$$y_{ijkl} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_{k(j)} + (\alpha\gamma)_{ik(j)} + \varepsilon_{l(ijk)}$$

*Equation 2*

where *μ* = overall mean, *α* = effect of factor *A*, *β* = effect of factor *B*, *γ* = effect of factor *C* nested in factor *B*, *αβ* = interaction between factors *A* and *B*, *αγ* = interaction between factor *A* and *C* nested in *B* (an interaction between *B* and *C* is not possible because *C* is nested in *B*), *y* = response, and *ε* = experimental error.
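The partially nested model of Equation 2 follows the same formula pattern: crossed factors get the usual product term, and the nested factor is prefixed by its parent. The sketch below again uses simulated data and `statsmodels` (my assumptions, not the module's software), treating all factors as fixed:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(2)

# Simulated partially nested data: A (2 levels) crossed with B (2 levels),
# C nested in B (3 levels within each B), 2 observations per cell.
rows = [(a, b, c, rng.normal()) for a in ("a1", "a2") for b in ("b1", "b2")
        for c in ("c1", "c2", "c3") for _ in range(2)]
df = pd.DataFrame(rows, columns=["A", "B", "Cf", "y"])

# C(A)*C(B) gives the crossed A, B, and A-by-B terms; C nested in B enters
# as C(B):C(Cf), and its interaction with A as C(A):C(B):C(Cf).
model = smf.ols("y ~ C(A)*C(B) + C(B):C(Cf) + C(A):C(B):C(Cf)",
                data=df).fit()
table = anova_lm(model)
print(table)
```

The nested term *C(B)* gets b(c-1) = 4 degrees of freedom and its interaction with *A* gets (a-1)b(c-1) = 4; a plain *B* × *C* interaction never appears, matching Equation 2.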

# Nested Design Example

A study was conducted to test the effectiveness of the method of instruction (online vs. face-to-face) and the course (DOE vs. Quality) with respect to performance on a standardized test. Currently, only two instructors offer the DOE and Quality courses in each school. Both instructors are equally qualified to teach both DOE and Quality in both the online and the face-to-face method of instruction. Therefore, the cross between instructors and courses is possible, and for the same reason the interaction between instructors and methods of instruction is also possible. For each possible treatment combination, three students (*E*) were randomly selected from those who took the standardized certification exam. Factor definitions and data are provided in Table 2 and Table 3, respectively.

**Table 2**

*Factor Definitions*

**Table 3**

*Nested Design Data*

# Analysis Model

The model can be written as provided in Equation 3. The full model contains every term except the interaction terms that contain both factors *A* and *B*, because *B* (instructor) is nested in *A* (institution).

$$y_{ijklm} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + \delta_l + (\gamma\delta)_{kl} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il} + (\alpha\gamma\delta)_{ikl} + (\beta\gamma)_{jk(i)} + (\beta\delta)_{jl(i)} + (\beta\gamma\delta)_{jkl(i)} + \varepsilon_{m(ijkl)}$$

*Equation 3*

where the symbols convey the usual meanings, as in the earlier equations, with *γ* and *δ* denoting the course and method-of-instruction effects, respectively.

# Analysis

Assuming all factors are fixed, the analysis is very straightforward because there are no random terms in the model except for the experimental error, which comes from the randomly selected experimental units (in this case, the students who took the standardized test). Therefore, the divisors for all *F*-statistics use the experimental error term. However, because there is a nested factor, some interaction terms do not exist in the full model, as seen in Equation 3. The analysis of variance (ANOVA) results are provided in Figure 2. For the significant variables, Figure 3 shows the post-hoc analysis using Tukey pairwise comparisons (95% confidence level).
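The structure of this all-fixed analysis can be sketched in code. Since Table 3's data are not reproduced here, the block below uses simulated placeholder scores with the same layout (2 institutions, 2 instructors nested in each, 2 courses, 2 methods, 3 students per cell) and the `statsmodels` formula interface; the point is the formula, in which every instructor term is prefixed by `C(A):` and no pure *A* × *B* term appears:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)

# Simulated layout mirroring the example: institutions (A), instructors
# nested in institutions (B), courses (Cc), methods (D), 3 students per
# cell.  The scores are random placeholders, NOT the module's Table 3 data.
rows = [(a, b, c, d, rng.normal(75, 5))
        for a in ("inst1", "inst2") for b in ("t1", "t2")
        for c in ("DOE", "Quality") for d in ("online", "f2f")
        for _ in range(3)]
df = pd.DataFrame(rows, columns=["A", "B", "Cc", "D", "score"])

# Full model without any A-by-B cross terms: B is nested in A, so every
# term involving B is written with C(A):C(B) in front of it.
model = smf.ols(
    "score ~ C(A)*C(Cc)*C(D) + C(A):C(B) + C(A):C(B):C(Cc)"
    " + C(A):C(B):C(D) + C(A):C(B):C(Cc):C(D)",
    data=df).fit()
table = anova_lm(model)
print(table)
```

With everything fixed, `anova_lm` tests every term against the single experimental-error mean square, exactly as described above; *B(A)* gets a(b-1) = 2 degrees of freedom and the error gets abcd(n-1) = 32.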

**Figure 2**

*Analysis of Variance*

**Figure 3**

*Tukey Pairwise Comparisons. Grouping Information Using the 95% Confidence*

# Results Explained/Contextual Conclusions

According to the post-hoc analysis in Figure 3:

- Main effect of the course factor: students performed significantly better in Quality (76.38%) than in DOE (71.67%) on the standardized test.
- Main effect of the instruction-method factor: students performed significantly better with face-to-face (76.59%) than with online (71.44%) instruction.
- Interaction effect of course and method of instruction: students who took the courses at institution 1 performed significantly better in Quality (78.67%) than in DOE (69.71%); no statistically significant difference was observed for those who took them at institution 2.
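Tukey groupings like those in Figure 3 can be produced with standard tools. The sketch below uses `statsmodels`' `pairwise_tukeyhsd` on hypothetical course-by-method cell scores (the means and spreads are invented for illustration, not the module's data):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(7)

# Hypothetical scores for four course x method cells (6 students each);
# the cell means are assumptions for illustration only.
scores = np.concatenate([rng.normal(m, 3, size=6) for m in (72, 76, 70, 78)])
groups = np.repeat(["DOE-online", "DOE-f2f", "Quality-online", "Quality-f2f"], 6)

# All pairwise comparisons at the 95% confidence level (alpha = 0.05)
result = pairwise_tukeyhsd(scores, groups, alpha=0.05)
print(result)
```

With four groups there are 4-choose-2 = 6 pairwise comparisons; the printed table flags which differences are significant after the Tukey family-wise adjustment.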

Many authors believe that main effects are meaningless when their interaction effect is significant. Nevertheless, a main effect does provide some information that may not be easily obtained from the interaction effect; for example, the first two conclusions above cannot be easily drawn from the interaction effects.

In some situations (a musical band, for instance), the main effects could indeed be meaningless when the interaction effect is significant. Therefore, learners must use their discretion before ignoring a significant interaction when the main effects are significant too.

# Another Design Scenario for the Same Example

Often, in a partially nested design, the nested factor is a random factor. Assume that many instructors are equally qualified to teach both Quality and DOE in both the face-to-face and the online method of instruction. Two instructors can therefore be randomly selected, which makes the instructor a random, rather than a fixed, factor. The analysis output is provided in Figure 4.

Effects (*C*, *D*, and *AC*) that were significant in the fixed-effect model are no longer significant when the nested instructor factor is treated as a random factor. Generally, adding random factors adds more random error, which makes the statistical tests more conservative than in a fixed-effect model.

**Figure 4**

*Analysis of Variance*

**Figure 5**

*Expected Mean Squares, using Adjusted SS*

# Another Design Scenario for the Same Example

Another design scenario could treat the institution as a random factor. Many institutions offer both the Quality and DOE courses in both the online and the face-to-face method of instruction, while typically only a few instructors specialize in a particular area. For this example, making the institution a random factor makes more sense for understanding students’ performance across a wide range of institutions, so in this scenario the instructor factor is a fixed factor. However, all factors nested under a random factor are treated as random by default, even if they are defined as fixed. The analysis output results are provided in Figure 2, Figure 4, and Figure 6.

The *p*-values are observed to be even higher when more terms become random. The model now contains more random terms, including factor *A* and any term that contains *A*. The statistical tests become even more conservative as more random terms (random errors) enter the model. Comparing the results of the three design scenarios in Figure 2, Figure 4, and Figure 6, an effect that was easily detected as significant becomes almost a no-evidence case (a *p*-value over 0.5 is considered no evidence) as more random terms/errors are added to the model.

**Figure 6**

*Analysis of Variance*

**Figure 7**

*Expected Mean Squares, using Adjusted SS*

# Are Mixed Models More Conservative?

The explanation is simple. If only two institutions are studied and conclusions are drawn only for those two institutions, there is less error. However, if conclusions are drawn for all 100 schools by studying only two randomly selected schools, the conclusions must be conservative. Statistical tests should not easily detect a difference when generalizing to 100 schools from a study of only two.

A similar explanation applies to the analysis that treats the instructor as a random factor. If a generalized conclusion about 500 instructors is drawn from studying only two of them, it must be very conservative. Statistical tests should not detect a difference easily when generalizations are made beyond the levels that were actually studied.

Nevertheless, a fixed-effect model with only fixed factors draws conclusions only for the levels of the factors actually studied, rather than for a wider range of levels. Therefore, a fixed-effect model will detect statistical differences more readily than a mixed-effect model.

# Approximate/Pseudo F-test

An approximate/pseudo test is performed when automated test statistics (e.g., an *F*-test followed by a *p*-value) are not possible because of additional random terms/errors (beyond the experimental error) in the model. For example, the *B(A)* effect in the output in Figure 4, and the *B(A)*, *AC*, and *AD* effects in the output in Figure 6, do not have any direct and exact error term associated with them, so exact *F*-tests are not possible in these situations. However, an approximate *F*-test is possible by combining the multiple error terms associated with an effect that lacks an exact error term. For example, the *B(A)* effect does not have a direct and exact error term associated with it, but the combination ((12) + 3.0000 (11) + 6.0000 (9) + 6.0000 (8)) provides the error associated with the *B(A)* term (Figure 7).

The test statistic is therefore given in Equation 4, and the denominator (divisor) degrees of freedom are calculated using Equation 5. The *p*-value for the test is calculated as 1.0 from *F*(0, 2, 1). Note that the *p*-value is now 1.0, double the value of 0.5 observed in the fixed-effect model output in Figure 2. Adding more random error terms to the model makes it more difficult to detect a statistical difference; in other words, the tests become more conservative as the random error increases with the addition of more error terms.
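The synthesized-error idea behind Equations 4 and 5 can be sketched numerically. The function below implements a generic Satterthwaite-style approximation: the missing error mean square is a linear combination of other mean squares, and its degrees of freedom come from the squared combination divided by the weighted sum of squared components. The coefficients and mean squares in the usage example are made-up placeholders, not the values from Figure 7:

```python
def pseudo_f(ms_effect, df_effect, combo):
    """Approximate (pseudo) F-test for an effect with no exact error term.

    combo: list of (coefficient, mean_square, df) triples whose linear
    combination plays the role of the missing error mean square.
    Returns (F, numerator df, Satterthwaite denominator df).
    """
    ms_error = sum(c * ms for c, ms, _ in combo)
    # Satterthwaite approximation for the denominator degrees of freedom:
    # (sum c_i * MS_i)^2 / sum_i (c_i * MS_i)^2 / df_i
    df_error = ms_error ** 2 / sum((c * ms) ** 2 / df for c, ms, df in combo)
    return ms_effect / ms_error, df_effect, df_error

# Hypothetical (coefficient, mean square, df) triples -- illustrative only.
f, df1, df2 = pseudo_f(ms_effect=40.0, df_effect=2,
                       combo=[(1.0, 10.0, 4), (1.0, 10.0, 4)])
print(round(f, 3), df1, round(df2, 2))  # → 2.0 2 8.0
```

The resulting triple (F, numerator df, approximate denominator df) is then referred to the F distribution to obtain the *p*-value, as statistical software does automatically for such terms.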

For more information on the approximate/pseudo *f*-test, see the earlier Expected Mean Squares module.