Practical Significance Test
Once statistical significance has been established in step #1, practical significance is checked using the third output table in Figure 4, the Model Summary table.
R-square / Coefficient of Determination
R-square, or the Coefficient of Determination, is defined as the percentage of the variation in the dependent variable that is explained by the independent variable(s) (Equation 5). In this example, 96.36 percent of the variation in the fuel cost can be explained by the distance traveled.
In Equation 5, SSR = Regression Sum of Squares = the variation explained by all model terms, and SSTO = Total Sum of Squares = the total variation, including the experimental error (the residuals in a regression analysis).
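Equation 5 itself is not reproduced in this text. Based on the definitions above, it presumably takes the standard form shown below. The symbol SSE (the error, or residual, sum of squares) does not appear in the original and is introduced here only to show the equivalent "one minus error" form.

```latex
R^2 \;=\; \frac{SSR}{SSTO} \;=\; 1 - \frac{SSE}{SSTO},
\qquad \text{where } SSTO = SSR + SSE .
```

Read this way, the fuel cost example corresponds to SSR/SSTO = 0.9636, leaving about 3.64 percent of the total variation in the residuals.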
Figure 5. Understanding the R-Square
Figure 5 shows a visual representation of the r-square values for data sets with different amounts of error (residuals). In the top-left graph, most observed data points lie close to the predicted regression line, while in the bottom-right graph most data points lie far from it. Larger errors (residuals) mean a lower r-square value (Equation 5). Although the relationship between the dependent and the independent variables is statistically significant for all four data sets in Figure 5, the relationship is weaker in the bottom-right graph than in the top-left graph. The r-square is therefore a measure of the strength of the relationship: a higher r-square value indicates a stronger relationship between the dependent and the independent variables. Even when the functional relationship between the dependent and the independent variables is statistically significant, a very weak relationship could be practically meaningless, depending on the type of study.
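The effect described for Figure 5 can be reproduced with a small simulation. The following is a minimal sketch, not the data behind Figure 5: the slope of 2, the noise levels, and the sample size of 30 are all hypothetical values chosen for illustration, and the helper name r_squared is an assumption.

```python
import random


def r_squared(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and return 1 - SSE/SSTO."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = y_mean - b1 * x_mean           # least-squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # residual variation
    ssto = sum((yi - y_mean) ** 2 for yi in y)                     # total variation
    return 1 - sse / ssto


rng = random.Random(0)                  # fixed seed so the run is repeatable
x = list(range(1, 31))                  # hypothetical predictor values
results = {}
for noise_sd in (1, 5, 10, 20):         # hypothetical residual spreads
    y = [2.0 * xi + rng.gauss(0, noise_sd) for xi in x]
    results[noise_sd] = r_squared(x, y)
    print(f"noise sd = {noise_sd:>2}: r-square = {results[noise_sd]:.3f}")
```

The underlying slope is the same in every data set; only the scatter around the line changes, and the r-square typically falls as the scatter grows.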
Because the r-square is proportional to the variation explained by the regression model terms (SSR) (Equation 5), and SSR does not decrease when a term is added, adding more model terms can inflate the r-square value even when the added terms contribute little. The r-square alone can therefore be misleading as a measure of the strength of the relationship between the dependent and independent variables. To account for this unwanted inflation, the r-square formula is adjusted as in Equation 6. The relatively unbiased adjusted r-square can then be used to assess the strength of the relationship.
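Equation 6 is not reproduced in this text. A commonly used form of the adjustment is sketched below; the symbols n (number of observations), p (number of independent variables, excluding the intercept), and SSE (residual sum of squares) are assumptions, and the source's own notation may differ.

```latex
R^2_{\mathrm{adj}}
  \;=\; 1 - \frac{SSE/(n-p-1)}{SSTO/(n-1)}
  \;=\; 1 - \left(1 - R^2\right)\frac{n-1}{n-p-1}
```

Each added term increases p, so the adjusted value rises only if the term reduces SSE by more than the lost degree of freedom costs.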
Nevertheless, an appropriately built regression model will produce very close values for the r-square and the adjusted r-square. If there is a large difference between them, insignificant terms have probably been included in the final regression model, and the model should be investigated further to find the issue. If the model is well built, either the r-square or the adjusted r-square can be used to describe the model strength, or practical significance.
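As a rough illustration of how the gap between the two values grows with the number of terms, the sketch below applies the commonly used adjustment 1 - (1 - R²)(n - 1)/(n - p - 1). Only the r-square of 0.9636 comes from the fuel cost example; the sample size n = 10 and the term counts are hypothetical, since the text does not state them, and the function name is an assumption.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted r-square for n observations and p independent variables
    (intercept not counted): 1 - (1 - r2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than model terms")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


r2 = 0.9636   # r-square reported for the fuel cost example
n = 10        # hypothetical sample size (not given in the text)

# Hold r-square fixed and vary the number of terms to show the penalty alone.
for p in (1, 3, 5):
    adj = adjusted_r_squared(r2, n, p)
    print(f"p = {p}: adjusted r-square = {adj:.4f}, gap = {r2 - adj:.4f}")
```

With a single independent variable the two values stay close, which is consistent with the guidance above; the gap widens as terms are added without any improvement in r-square.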
What is a satisfactory r-square value?
What counts as a satisfactory r-square value depends on the field of study. In the fuel cost versus distance example, the r-square is 96.36%, which is considered excellent, so the functional relationship between fuel cost and distance is considered very strong. While a much lower r-square value would not be acceptable for this fuel cost study, a study of human behavior might be satisfied with an r-square of 50%. In marketing, for example in advertising studies, an r-square of 10 to 20% may be acceptable.
Once the regression model is found to be both statistically and practically significant, the third step is to explain the functional relationship between the dependent and the independent variables using the first and second tables in Figure 4.
Figure 4. Simple Linear Regression Analysis Output for the Fuel Cost vs Distance Data