Step 2 of DOE


One of the most common mistakes in the design and analysis of experiments is choosing an inappropriate statistical method, because this step involves many fine details. Often, experts find a better method only after the experiment has been designed and the data collected. There is almost always a better way of doing things. Therefore, developing an appropriate statistical method requires brainstorming with statisticians and subject matter experts. The step-by-step process discussed here will help minimize mistakes in finding an appropriate statistical method. Finding an appropriate method requires understanding the three basic principles of design of experiments. Because the design of experiments uses its own names for variables and other common terms, this terminology is discussed first, before the basic theory of the design of experiments.

Variables/Terminologies in Design of Experiments

Generally, variables can be categorized into independent and dependent variables. However, the design of experiments has adopted different names or terminologies for the independent and dependent variables.

Factor/Independent/Explanatory Variable

The independent variable in the design of experiments is known as the explanatory variable, predictor variable, or factor. There are many other names for the independent variable, which will be discussed in the respective sections. To start with the design of experiments, we first introduce the term factor, which originated in the field of agriculture. For example, to determine the effect of sunlight, irrigation, and fertilizer on crop production, we typically say that sunlight is a factor, irrigation is a factor, and fertilizer is a factor in crop production. This is how the name “factor” was brought into the statistical design of experiments. The term factor is also preferred for another reason: the word independent is used more frequently to describe the independent distribution of the experimental error (discussed later). Therefore, factor has become the more popular name for the independent variable in the design of experiments.

Levels of a Factor

To study the effect of a factor, two or more conditions or settings of the factor are applied to the experimental units and the output is observed. Each of these conditions or settings is known as a level. For example, to determine the effect of a fertilizer, two or more levels of fertilizer are applied and the output is observed. The levels of fertilizer could be set by varying the amount and strength of the fertilizer. To determine the effect of the sunlight factor on crop production, two different categories of sunlight (low and high) may be used. These low and high sunlight conditions are called the low level and the high level of the sunlight factor. In other words, the sunlight factor has two levels.

Choosing the Levels of a Factor

Choosing the appropriate factor levels and the number of levels of a factor requires subject matter expertise. For example, to study the effect of temperature on human comfort, most of us have some idea about the comfortable temperature range. We know that temperature levels below 50 degrees Fahrenheit (10 degrees Celsius) or above 100 degrees Fahrenheit (37.8 degrees Celsius) will not produce any results that we do not already know. Therefore, running the experiment at such temperature levels would be wasteful. However, in most experimental situations, only subject matter experts know the ranges of the factor that could potentially provide valid and useful responses. Moreover, the current literature often provides information on the factor levels. Often, a pilot study is suggested before running the entire experiment so that any necessary adjustments can be made.

To determine the effect of a factor, two levels are generally enough. However, the use of more than two levels is also common, depending on the experimental situation. For example, at least three levels are needed to determine whether any curvature effect exists.
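The need for a third level can be illustrated with a small sketch. The response values and the function name `curvature_estimate` below are hypothetical and introduced only for illustration. With two levels, only a straight line can be drawn between the responses; a middle level lets us compare the observed center response with what that straight line would predict.

```python
def curvature_estimate(y_low, y_mid, y_high):
    """Observed middle response minus the value predicted by a straight
    line through the low and high responses.
    Assumes the middle level lies halfway between the low and high levels."""
    linear_prediction = (y_low + y_high) / 2
    return y_mid - linear_prediction


# Hypothetical mean comfort scores at three equally spaced temperature levels
y_low, y_mid, y_high = 4.0, 9.0, 5.0

# A value far from zero suggests curvature; zero means the three points are collinear
print(curvature_estimate(y_low, y_mid, y_high))  # 9.0 - 4.5 = 4.5
```

Whether a nonzero value reflects real curvature or only experimental error still has to be judged with an appropriate statistical test.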

Treatment/Treatment Combinations

The word treatment in the design of experiments can simply be thought of as a medical treatment. For example, if a patient is given a medicine, he or she is on a particular treatment. For a single factor, assigning a level of the factor is the same as assigning a treatment, such as providing a particular medication. When a patient is given multiple medications, we say that a treatment combination, or a combination of factor levels, is applied. Treatment combination is the preferred term over combination of the levels of the factors. Assume that a human comfort study uses two levels of the temperature factor and two levels of the humidity factor, which results in four treatment combinations (low-low, low-high, high-low, and high-high levels of temperature and humidity). The terms treatment and treatment combination are also used interchangeably; both refer to applying a combination of the levels of the factors to the experimental units.
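The four treatment combinations in the human comfort example can be enumerated with a short sketch. The dictionary layout and variable names are assumptions made for illustration; any number of factors and levels could be listed the same way.

```python
from itertools import product

# Factors and their levels from the human comfort example
factors = {
    "temperature": ["low", "high"],
    "humidity": ["low", "high"],
}

# Every combination of one level per factor is a treatment combination
treatment_combinations = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

for combination in treatment_combinations:
    print(combination)
```

The number of treatment combinations is the product of the number of levels of each factor, here 2 x 2 = 4.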

Response/Dependent Variable

The dependent variable is known as the response variable in the design of experiments. The word response also originated in the field of agriculture: is the plant responding to a fertilizer? Therefore, the word response is used for the dependent variable. Moreover, the word dependent is frequently used for other statistical terms such as dependent observations, dependent errors, residuals, etc. (discussed later). The term response also makes more sense because treatments (or combinations of the levels of the factors) are applied to experimental units, and the response is then observed and measured.

Measuring the Response

Measuring the response also requires subject matter expertise. For example, only subject matter experts know how to measure human comfort. Is it as simple as asking the study subjects (experimental units) about their comfort level using a Likert-type scale of 0 to 10 (0 = absence of comfort, …, 10 = most comfortable), a scale of 0 to 100, or simply a binary choice of “yes” or “no”? Rather than this subjective scale of the perceived feeling of comfort, is there an objective measure, such as the response from a comfort hormone (if one exists)? Nevertheless, an experiment does not start in a complete vacuum. The existing literature can help determine the appropriate response variables and their measures. Once the appropriate measures for the responses are suggested by the subject matter experts, statisticians can provide guidance towards the appropriate method. This book will help determine the most appropriate statistical method for the suggested measures of the response variable(s).

Fixed vs Random Factor

The levels of a factor determine whether the factor is fixed or random. If a few specific levels of a factor are deliberately chosen for the study, the factor is called a fixed factor. For example, assume that we are studying the effect of education level on salary. If four levels of education (high school, BS, MS, PhD) are selected, education is considered a fixed factor. When a few levels are chosen at random from a long list of possible levels, the factor is called a random factor. For example, assume that we are studying the school effect on learning a subject. If a few schools are randomly chosen from a long list of schools, school is considered a random factor. Randomization, one of the basic principles of design of experiments, is an entirely different concept that is frequently confused with a random factor. A random factor is defined by how its levels are chosen, while randomization of the experimental units is ideally expected in any designed experiment.
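The difference in how levels are chosen can be sketched as follows. The list of 200 schools, the sample size of 5, and the seed are hypothetical values used only to make the example reproducible.

```python
import random

# Fixed factor: the levels are exactly the ones of interest in the study
education_levels = ["high-school", "BS", "MS", "PhD"]

# Random factor: the levels are a random sample from a much longer list
all_schools = [f"School {i}" for i in range(1, 201)]  # hypothetical population
rng = random.Random(42)  # fixed seed only so the selection can be reproduced
sampled_schools = rng.sample(all_schools, k=5)

print("Fixed factor levels:", education_levels)
print("Random factor levels:", sampled_schools)
```

Note that drawing the levels at random, as done for the schools, is separate from randomizing the assignment of treatments to experimental units.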

Experimental and Observational Units

Imagine that a herd of cattle is fed a diet and, at the end of their maturation period, the weight of each cow is measured. Each cow is called an observational unit, while the entire herd, to which the diet is applied, is called the experimental unit. Most often, the experimental units are the same as the observational units. In this diet experiment, if a random sample of cows is taken from the herd and weighed, the observational units differ from the experimental unit. Therefore, there will be observational (sampling) error due to the random sampling of observational units from the experimental unit. If every cow in the herd is weighed, the response is taken from the entire experimental unit and there will be only experimental error. There may also be variation in how the treatment is applied: for example, the cows may not be fed exactly 25 pounds per day, but rather 24 or 26 pounds per day. In that case, there could also be a treatment error associated with the experiment.
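The observational (sampling) error in the herd example can be sketched with simulated data. The herd size, the weight distribution, the sample size, and the seed are all hypothetical assumptions, not values from any real experiment.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed only for reproducibility

# Hypothetical herd (one experimental unit) of 200 cows (observational units)
herd_weights = [rng.gauss(1200, 80) for _ in range(200)]

# Weighing every cow: the response describes the entire experimental unit
herd_mean = statistics.mean(herd_weights)

# Weighing only a random sample of cows introduces sampling error
sample = rng.sample(herd_weights, k=20)
sample_mean = statistics.mean(sample)
sampling_error = sample_mean - herd_mean

print(f"Herd mean:      {herd_mean:.1f}")
print(f"Sample mean:    {sample_mean:.1f}")
print(f"Sampling error: {sampling_error:.1f}")
```

Repeating the sampling step with different seeds gives a different sampling error each time, which is the variation that the observational error term accounts for.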

Outline of a Method

An experimental unit is assumed to respond to the treatment as well as to the experimental design. Therefore, an observation from an experiment can generally be modeled as in Equation 1 (Hinkelmann & Kempthorne, 2008).

Equation 1

While Equation 1 provides an overview of how the observations can be affected, finding an appropriate statistical method requires more information, which will be discussed in the following modules of this book.

Three Principles of Design of Experiments

The three principles of experimental design, (1) replication, (2) randomization, and (3) blocking, determine the appropriate statistical method (Hinkelmann & Kempthorne, 2008; Kempthorne, 1952). In the absence of these basic principles, the validity, precision, accuracy, and sensitivity of the experiment will be questionable.

Replication Principle

The replication principle simply means that several experimental units receive exactly the same treatment. Replication allows the experimental random error to be estimated, provided there is no systematic variation between the experimental units treated alike. In some designs, especially fractional factorial and Taguchi designs, unreplicated experiments are used, primarily for screening variables/factors.

Replication is often confused with repeated measurement on the same experimental unit. Replication is the application of the same treatment combination to more than one experimental unit, while repeated measurements are taken on the same experimental unit. If the same diet is fed to 25 cows (25 experimental units), the experiment is replicated. However, if a cow is measured every month for the next 25 months, resulting in 25 measurements on a single cow (a single experimental unit), the experiment is NOT replicated; rather, it is called a repeated measure design, which will be discussed later. Replication provides precision to any experiment, including the repeated measure design.
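How replication allows the experimental error to be estimated can be sketched with a few made-up numbers. The five weights below are hypothetical, and the sketch assumes there is no systematic variation between the cows treated alike.

```python
import statistics

# Hypothetical final weights (pounds) of 5 cows, each a separate
# experimental unit, all fed the same diet (5 replicates)
replicate_weights = [1180.0, 1210.0, 1195.0, 1225.0, 1190.0]

# The spread among units treated alike estimates the experimental error
treatment_mean = statistics.mean(replicate_weights)
error_variance = statistics.variance(replicate_weights)  # sample variance, n - 1

print(treatment_mean)   # 1200.0
print(error_variance)   # 312.5
```

The same calculation applied to 25 monthly measurements on a single cow would describe variation within that cow over time, not the variation between experimental units, which is why repeated measures are not a substitute for replication.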

Randomization Principle

The randomization principle helps ensure the validity of the experiment. The validity of the experiment will be questionable if randomization is not performed appropriately. Not all the variation between experimental units can be controlled. Randomization guards against these uncontrolled extraneous variations between experimental units. A random order of experimental runs can easily be generated using MS Excel or any statistical software. The following video shows how to randomize experimental runs using MS Excel.
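Besides a spreadsheet, a random run order can also be sketched in a few lines of Python. The treatment labels, the number of replicates, and the seed are hypothetical choices made for illustration.

```python
import random

# Hypothetical experiment: 4 treatment combinations, each replicated 3 times
treatments = ["low-low", "low-high", "high-low", "high-high"]
replicates = 3
runs = [t for t in treatments for _ in range(replicates)]

# Shuffle a copy so that the planned list of runs stays intact
rng = random.Random(2024)  # fixed seed only so the order can be reproduced
run_order = runs[:]
rng.shuffle(run_order)

for run_number, treatment in enumerate(run_order, start=1):
    print(run_number, treatment)
```

In practice the seed would be recorded with the experiment so that the run order can be reproduced and audited later.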

However, there are natural restrictions on randomization: a gender cannot be randomly assigned to an individual, a location cannot be randomly chosen and then called the United States of America, time cannot be randomized, and so on. Restrictions on randomization make it challenging for experimenters to find an appropriate method. The methods appropriate for experimental units with restrictions on randomization will be discussed in the advanced design of experiments section (e.g., repeated measure, split plot, and nested designs).

Blocking Principle

Simply put, a block is a set of homogeneous experimental conditions. Blocking, or local control, is used to reduce experimental error, resulting in better precision. Assume that we are testing a certain diet for cattle growth. The same diet is fed in the southern and northern parts of a country. Experimental units (cattle/cows) from these two locations are exposed to two different climates (e.g., hot and humid vs. cold and dry). Therefore, there is systematic variation, and the experimental units from the two locations are not homogeneous. To control this, the large set of experimental units (the entire country) is divided into two portions with homogeneous conditions, the southern and northern parts in this case. The location factor is then said to be blocked. In this diet experiment, location is not a factor of interest; rather, it disturbs the experiment. This type of factor is known as a nuisance factor. Generally, nuisance factors should be blocked whenever possible to improve the precision of the experiment, that is, to reduce the experimental error. More discussion on error-control or error-reducing designs can be found in the discussion of the Randomized Complete Block Design (RCBD).
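The effect of blocking on the experimental error can be sketched with made-up weights for the two locations. All numbers are hypothetical, and the helper name `sum_of_squares` is introduced only for this illustration.

```python
import statistics

# Hypothetical final weights (pounds) of cows on the same diet in two locations
south = [1100.0, 1120.0, 1110.0, 1090.0]  # hot and humid climate
north = [1300.0, 1310.0, 1290.0, 1320.0]  # cold and dry climate


def sum_of_squares(values):
    """Sum of squared deviations from the group mean."""
    group_mean = statistics.mean(values)
    return sum((v - group_mean) ** 2 for v in values)


# Ignoring location: the climate difference inflates the error variance
unblocked_variance = statistics.variance(south + north)

# Blocking on location: pool the variation within each block only
within_ss = sum_of_squares(south) + sum_of_squares(north)
within_df = (len(south) - 1) + (len(north) - 1)
blocked_variance = within_ss / within_df

print(f"Error variance without blocking: {unblocked_variance:.1f}")
print(f"Error variance with blocking:    {blocked_variance:.1f}")
```

With these numbers the systematic difference between the two locations dominates the unblocked variance, while the blocked estimate reflects only the variation among cows raised under similar conditions.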

A smaller, more homogeneous set of experimental units (50 cows) will generally produce a smaller experimental error than a larger, more varied set (5000 cows from the entire country), improving the sensitivity and precision of the experiment. Blocking is also performed whenever there is a restriction on randomization, such as in the Latin Square, Graeco-Latin Square, Split Plot, and Repeated Measure designs, which will be discussed in the appropriate sections. Another common use of blocking arises when not enough experimental units are available from one homogeneous source (e.g., samples from different batches of raw materials). If so, the batches should be blocked. A larger and broader application of blocking is in the fractional factorial design of experiments, which will be discussed in the respective section.


We have provided some basic information that we think will facilitate understanding of the statistical methods discussed throughout this book. As there is almost always a better way of doing something, finding an appropriate method is very challenging and, therefore, as rewarding as any other quality work!