Evaluating regression models involves different metrics than classification models. These metrics describe how well a model predicts continuous values. Here are some common regression evaluation metrics, each with a simple explanation and a worked example.

Mean Absolute Error (MAE)
MAE measures the average absolute difference between the predicted and actual values. It is calculated as:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

where $y_i$ are the actual values, $\hat{y}_i$ the predicted values, and $n$ the number of instances.
Example:
- Actual Values: [3, -0.5, 2, 7]
- Predicted Values: [2.5, 0.0, 2, 8]
- Number of Instances (n): 4
Worked calculation:
- Absolute Errors: 0.5, 0.5, 0, 1
- MAE = (0.5 + 0.5 + 0 + 1) / 4 = 0.5
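As a quick sanity check, here is a minimal sketch of the same calculation using scikit-learn (assuming it is installed):

```python
from sklearn.metrics import mean_absolute_error

y_true = [3, -0.5, 2, 7]   # actual values
y_pred = [2.5, 0.0, 2, 8]  # predicted values

# Average of the absolute errors |y_i - y_hat_i|
print(mean_absolute_error(y_true, y_pred))  # 0.5
```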
Mean Squared Error (MSE)
MSE measures the average squared difference between the predicted and actual values. It is calculated as:

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Example:
- Actual Values: [3, -0.5, 2, 7]
- Predicted Values: [2.5, 0.0, 2, 8]
- Number of Instances (n): 4
Worked calculation:
- Squared Errors: 0.25, 0.25, 0, 1
- MSE = (0.25 + 0.25 + 0 + 1) / 4 = 0.375
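The same result with scikit-learn, as a sketch:

```python
from sklearn.metrics import mean_squared_error

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Average of the squared errors (y_i - y_hat_i)^2
print(mean_squared_error(y_true, y_pred))  # 0.375
```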
Root Mean Squared Error (RMSE)
RMSE is the square root of the average squared difference between the predicted and actual values. It is calculated as:

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
Example:
- Actual Values: [3, -0.5, 2, 7]
- Predicted Values: [2.5, 0.0, 2, 8]
- Number of Instances (n): 4
Worked calculation:
- Squared Errors: 0.25, 0.25, 0, 1
- MSE = 1.5 / 4 = 0.375
- RMSE = √0.375 ≈ 0.612
Interpreting RMSE:
- It measures the average magnitude of the errors between predicted and observed values, in the same units as the target variable.
- Lower values indicate better model performance, as the errors are smaller.
- There is no definitive range for "good" or "bad" RMSE values; it depends on the scale and units of the target variable, so RMSE is best compared across models predicting the same quantity.
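A sketch of the RMSE calculation in scikit-learn (recent releases also ship a dedicated root_mean_squared_error function, but taking the square root of the MSE works across versions):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# RMSE is the square root of MSE
print(np.sqrt(mean_squared_error(y_true, y_pred)))  # ~0.612
```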
R-squared (R²)
R-squared measures the proportion of the variance in the dependent variable that is predictable from the independent variables. It is calculated as:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $\bar{y}$ is the mean of the actual values.
Example:
- Actual Values: [3, -0.5, 2, 7]
- Predicted Values: [2.5, 0.0, 2, 8]
- Mean of Actual Values ($\bar{y}$): 2.875
Worked calculation:
- Sum of Squared Errors (numerator): 1.5
- Total Sum of Squares (denominator): 29.1875
- R² = 1 - 1.5 / 29.1875 ≈ 0.949
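A sketch of the same calculation with scikit-learn:

```python
from sklearn.metrics import r2_score

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# 1 - (sum of squared errors) / (total sum of squares)
print(r2_score(y_true, y_pred))  # ~0.949
```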
Here's a markdown table with a range of values for R-squared and Root Mean Squared Error (RMSE) and how they are typically interpreted for machine learning models:
| R-squared | Interpretation | RMSE | Interpretation |
| --- | --- | --- | --- |
| 0.0 - 0.3 | Very poor fit | High values | Poor model performance |
| 0.3 - 0.5 | Poor fit | Moderate values | Moderate model performance |
| 0.5 - 0.7 | Moderate fit | Low values | Good model performance |
| 0.7 - 0.9 | Good fit | Very low values | Excellent model performance |
| 0.9 - 1.0 | Excellent fit | ~ 0 | Perfect model performance (rare) |
Adjusted R-squared
Adjusted R-squared is a statistical metric that adjusts the R-squared value to account for the number of predictors in a regression model. It is particularly useful when comparing models with different numbers of predictors or when evaluating the impact of adding or removing predictors from a model.
In simple terms, Adjusted R-squared helps in understanding whether adding more predictors to the model improves its predictive power significantly or not, considering the risk of overfitting.
Let's revisit the previous example and explain why Adjusted R-squared is required:
Suppose we have a regression model with the following data:
- Actual Values: [3, -0.5, 2, 7]
- Predicted Values: [2.5, 0.0, 2, 8]
- Mean of Actual Values ($\bar{y}$): 2.875
- Number of Observations ($n$): 4
- Number of Predictors ($p$): 2 (assuming a linear regression model with two independent variables)
The standard R-squared ($R^2$) measures the proportion of the variance in the dependent variable (actual values) that is explained by the model's predictions. In our example, we calculated $R^2 \approx 0.949$, indicating that around 94.9% of the variance in the actual values is explained by the model.
However, R-squared has a limitation. It tends to increase with the addition of more predictors, regardless of whether those predictors actually improve the model's predictive power. This can lead to overestimation of the model's performance, especially when including irrelevant variables.
Adjusted R-squared addresses this limitation by penalizing the addition of unnecessary predictors. It takes into account both the goodness of fit (captured by $R^2$) and the number of predictors ($p$) in the model. The formula for Adjusted R-squared is:

$$R^2_{\text{adj}} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$
Here's why Adjusted R-squared is required:
- Controls for Model Complexity: Adjusted R-squared penalizes the inclusion of more predictors, ensuring that the model's improvement in predictive power justifies the added complexity.
- Prevents Overfitting: By penalizing unnecessary predictors, Adjusted R-squared helps prevent overfitting, where the model performs well on training data but poorly on new, unseen data.
- Facilitates Model Comparison: When comparing models with different numbers of predictors, Adjusted R-squared provides a fair comparison by considering both goodness of fit and model simplicity.
In our example, plugging $R^2 \approx 0.949$, $n = 4$, and $p = 2$ into the formula gives Adjusted $R^2 \approx 0.846$. This value is noticeably lower than the standard R-squared, indicating that although the model explains a significant portion of the variance, the added predictors might not have provided a substantial improvement in predictive power once the model's complexity is taken into account.
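scikit-learn does not provide Adjusted R-squared directly, so here is a minimal helper (the adjusted_r2 function below is defined here for illustration) that applies the formula above:

```python
from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1), with p predictors."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# n = 4 observations, p = 2 predictors
print(adjusted_r2([3, -0.5, 2, 7], [2.5, 0.0, 2, 8], p=2))  # ~0.846
```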
Mean Absolute Percentage Error (MAPE)
MAPE measures the average absolute percentage error between the predicted and actual values. It is calculated as:

$$\text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

Note that MAPE is undefined when any actual value is zero, since each term divides by $y_i$.
Example:
- Actual Values: [3, -0.5, 2, 7]
- Predicted Values: [2.5, 0.0, 2, 8]
- Number of Instances (n): 4
Worked calculation:
- Absolute Percentage Errors: 0.167, 1, 0, 0.143
- Sum of Percentage Errors: 1.31
- MAPE = (1.31 / 4) × 100% ≈ 32.75%
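A sketch using scikit-learn (mean_absolute_percentage_error is available from version 0.24 and returns a fraction rather than a percentage; the exact result is ≈ 32.74%, with the 32.75% above coming from rounding the intermediate terms):

```python
from sklearn.metrics import mean_absolute_percentage_error

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Returned as a fraction; multiply by 100 for a percentage
print(mean_absolute_percentage_error(y_true, y_pred) * 100)  # ~32.74
```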
When to Use Each Metric
- MAE: Use when you want to measure the average magnitude of errors in predictions without considering their direction. It is easy to interpret because it is in the same units as the predicted values.
- MSE: Use when you want to penalize larger errors more severely, as squaring the errors gives more weight to larger ones.
- RMSE: Use when you want an error measure in the original units of the target that still penalizes large errors; it reflects the typical size of the residuals.
- R-squared: Use when you want to understand the proportion of variance in the dependent variable explained by the independent variables.
- MAPE: Use when you want to understand the error in percentage terms, which is useful for comparing model performance across different scales.
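Pulling it together, a sketch that computes every metric above for the running example (assuming scikit-learn 0.24+ for MAPE):

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error, r2_score)

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

mse = mean_squared_error(y_true, y_pred)
print("MAE: ", mean_absolute_error(y_true, y_pred))             # 0.5
print("MSE: ", mse)                                             # 0.375
print("RMSE:", np.sqrt(mse))                                    # ~0.612
print("R2:  ", r2_score(y_true, y_pred))                        # ~0.949
print("MAPE:", mean_absolute_percentage_error(y_true, y_pred))  # ~0.327
```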
These metrics help evaluate the performance of regression models from different perspectives, ensuring a comprehensive understanding of their strengths and weaknesses.