Step 3 of DOE
Once the experiment has been performed and the data collected using the method developed in Step 2 of DOE (Method), the next step is to analyze the data and produce results. Statistical software such as SAS, SPSS, Minitab, R, Design Expert, JMP, and Microsoft Excel, to mention a few, has made these analyses easier. Generally, two types of statistical results are produced to draw conclusions from the sample data: (1) descriptive statistics and (2) inferential statistics. Both are explained in Video 1.
Video 1. Descriptive Statistics and Inferential Statistics Explained with Example using MS Excel
Descriptive statistics are used to describe the data. Data are generally described along two dimensions, (1) niceness and (2) craziness, which are measured by the central tendency and the variability, respectively (Figure 1). Most software, including MS Excel, provides descriptive statistics within a few mouse clicks.
Figure 1. Descriptive Statistics
Niceness: The Central Tendency
The niceness of the data is measured by its central tendency: the mean, median, and mode. The mean is the arithmetic average. The median is the midpoint of the sorted data. The mode is the most frequently occurring value. Any basic statistics book provides further details if necessary.
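As a minimal sketch, the three measures of central tendency can be computed with Python's standard `statistics` module. The `yields` values are hypothetical and serve only as an illustration.

```python
import statistics

# Hypothetical sample: yields (arbitrary units) from seven experimental runs.
yields = [12, 15, 15, 18, 20, 22, 31]

mean_value = statistics.mean(yields)      # arithmetic average
median_value = statistics.median(yields)  # middle value of the sorted data
mode_value = statistics.mode(yields)      # most frequently occurring value

print(f"mean   = {mean_value}")
print(f"median = {median_value}")
print(f"mode   = {mode_value}")
```

In this sample the single large value (31) pulls the mean above the median, which is one reason the three measures are usually reported together.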
Craziness: Measures for Variability
Craziness is measured by the variability statistics, including the minimum, maximum, range, variance, standard deviation, standard error, skewness, and kurtosis. Any basic statistics book provides details on these statistics if necessary.
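The variability statistics listed above can be sketched in a few lines of Python. The data are hypothetical, and the skewness and kurtosis below use the simple moment-based formulas, which is an assumption for illustration. Packages such as MS Excel apply small-sample adjustments, so their values may differ slightly.

```python
import math
import statistics

# Hypothetical sample, used only for illustration.
data = [12, 15, 15, 18, 20, 22, 31]
n = len(data)

minimum = min(data)
maximum = max(data)
data_range = maximum - minimum

sample_variance = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)           # square root of the sample variance
standard_error = sample_sd / math.sqrt(n)    # standard error of the mean

# Central moments (dividing by n) for the moment-based shape statistics.
mean = statistics.fmean(data)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n

skewness = m3 / m2 ** 1.5          # > 0 means a longer right tail
excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution

print(f"min={minimum}, max={maximum}, range={data_range}")
print(f"variance={sample_variance:.3f}, sd={sample_sd:.3f}, se={standard_error:.3f}")
print(f"skewness={skewness:.3f}, excess kurtosis={excess_kurtosis:.3f}")
```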
Whereas descriptive statistics describe the sample, inferential statistics draw inferences about the population by analyzing the sample with an appropriate statistical method (Figure 2). Because the sample is only a fraction of the entire population, probabilistic statistical methods are applied, and the resulting inference (conclusion) is itself probabilistic. This book is written primarily to help find the best statistical methods and analyses for producing correct inferential results.
Figure 2. Inferential Statistics
In inferential statistics, an Analysis of Variance (ANOVA) is generally performed on the data using the method developed in Step 2 of DOE (Method). Many statistical software packages are available to run most statistical methods and produce the inferential statistics.
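To show what the software does behind the scenes, the following sketch computes the F statistic of a one-way ANOVA by hand. The three treatment groups and their values are hypothetical. The example assumes the simplest single-factor design and stops at the F statistic, because the Python standard library has no F distribution from which to obtain the p-value.

```python
import statistics

# Hypothetical data: one response measured under three treatment levels.
groups = {
    "A": [20, 21, 23, 24],
    "B": [25, 27, 28, 28],
    "C": [30, 31, 33, 34],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = statistics.fmean(all_values)

# Between-group sum of squares: variation of group means around the grand mean.
ss_between = sum(
    len(values) * (statistics.fmean(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-group sum of squares: variation of observations around their group mean.
ss_within = sum(
    (x - statistics.fmean(values)) ** 2
    for values in groups.values()
    for x in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_statistic = ms_between / ms_within

print(f"SS between = {ss_between:.2f} (df = {df_between})")
print(f"SS within  = {ss_within:.2f} (df = {df_within})")
print(f"F = {f_statistic:.3f}")
```

A statistical package would compare this F value with the F distribution on (2, 9) degrees of freedom and report the corresponding p-value.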
The P-Value and the Level of Significance
Video 2 explains the p-value and the level of significance in statistical tests.
Video 2. What is p-Value and Level of Significance Alpha
Although the details of the analysis are now hidden inside the software package, the first goal of the analysis is to find the probability value for the null hypothesis so that we can judge whether the null hypothesis is plausible. In practice, researchers usually want to reject the null hypothesis in favor of the alternative hypothesis, so they hope for a low observed probability. Once the appropriate method is selected in the statistical software (video demonstrations are provided for each method in the respective section), the p-value is calculated from the sample data. The p-value can therefore be defined as the observed probability calculated from the sample data; more precisely, it is the probability of obtaining results at least as extreme as those observed if the null hypothesis were true.
The weather data in Figure 3 illustrates the idea by treating the forecast as the calculated probability (observed probability or p-value) obtained after running an appropriate method. At 8 o’clock in the morning, it shows a 30% chance of rain. The decision is always between “rain” and “no rain,” so that we can decide whether to carry an umbrella. We therefore need a preset criterion for rejecting the null hypothesis of “rain at 8 am.” Generally, a probability of 5% (0.05) is considered too low for an event to happen. This preset criterion of 5% (0.05) is called the level of significance. Compared with the 5% level of significance, the 30% probability for the null hypothesis (rain at 8 am) is higher, so the null hypothesis is not rejected: we conclude that it will rain at 8 am and carry an umbrella.
Figure 3. p-value and level of significance
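As a hedged illustration of how a p-value is obtained from sample data, the sketch below computes an exact two-sided binomial p-value. The scenario is hypothetical and not from this book: a coin lands heads 9 times in 10 flips, and the null hypothesis is that the coin is fair. The function name `binomial_two_sided_p` is likewise an assumption made for this example.

```python
from math import comb

def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided p-value: total probability, under the null hypothesis,
    of every outcome that is no more likely than the observed one."""
    pmf = [
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))

alpha = 0.05                            # level of significance
p_value = binomial_two_sided_p(9, 10)   # 9 heads in 10 flips, fair-coin null

print(f"p-value = {p_value:.4f}")
print("below the level of significance" if p_value < alpha
      else "not below the level of significance")
```

Here the outcomes at least as extreme as 9 heads are 0, 1, 9, and 10 heads, giving a p-value of 22/1024, or about 0.021, which is below the 5% level of significance.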
Statistical Decision Rules
If the p-value (observed probability for the null hypothesis) is larger than the level of significance, the null hypothesis is accepted.
Generally, the term “do not reject” is recommended instead of the word “accept.” However, for simplicity of understanding, we will use the word “accept” throughout this book. Any basic statistics book provides the details.
If the p-value is smaller than the level of significance, the null hypothesis is rejected, and the alternative hypothesis is accepted.
Reject the Null Hypothesis = Accept the Alternative = Significant Result
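The decision rules can be written as a small function. This is a minimal sketch: the function name `decide` is hypothetical, and treating a p-value exactly equal to the level of significance as "do not reject" is an assumption, since the rules above cover only the larger and smaller cases.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject the null hypothesis only when the
    p-value is smaller than the level of significance (alpha)."""
    if p_value < alpha:
        return "reject the null hypothesis (significant result)"
    return "do not reject the null hypothesis"

# The two probabilities used in the rain example of this section.
print(decide(0.30))  # 30% chance of rain
print(decide(0.04))  # 4% chance of rain
```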
Errors in Hypothesis Testing
The errors in statistical tests are explained with examples in Video 3.
Video 3. What is Type 1, Type 2, Type 3, Type 4 Error in Statistical Tests
In inferential statistics, the final decision is a binary choice: we either “reject” or “do not reject” the null hypothesis. The decision is based on the observed probability (p-value) for the null hypothesis, which is calculated from the sample data (a tiny fraction of the population). However, the conclusion is drawn for the population. Therefore, there will always be some chance of a mistake because the entire population was not studied. Generally, we reject the null hypothesis if the p-value is less than the 0.05 (5%) level of significance. Therefore, when the null hypothesis is rejected, there is up to a 5% chance that the decision is wrong. For example, assume that the observed probability for the null hypothesis of rain was calculated as 4%. As this is lower than our set criterion, we reject the null hypothesis in favor of the alternative and decide that there will be no rain. However, there is still a 4% probability of rain, so the decision carries some chance of error (Table 3). The error made in rejecting the null hypothesis when it is true is known as “Type I Error.” The error made in not rejecting the null hypothesis when it is false is known as “Type II Error.” Type III Error is made when the research question (the hypothesis) is wrong! Type IV Error is made when the statistical method is wrong.
Table 3. Errors in Statistical Tests
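The link between the level of significance and the Type I Error can be checked with a small simulation. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and counts how often it is wrongly rejected. The two-sided z-test with a known standard deviation is an assumption chosen only because it needs nothing beyond the Python standard library. The function names, sample size, and seed are likewise hypothetical.

```python
import random
from statistics import NormalDist, fmean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided z-test p-value for the null hypothesis 'population mean = mu0',
    assuming the population standard deviation sigma is known."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(trials=4000, n=20, alpha=0.05, seed=1):
    """Fraction of experiments that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1  # a Type I Error
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I Error rate: {rate:.3f}")
```

With a 5% level of significance, the observed rate should land close to 0.05, which matches the idea that rejecting at this level accepts roughly a 5% risk of a Type I Error.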