Bias vs Variance
Bias
Bias refers to error introduced by overly simplistic assumptions in the model — by assuming the problem is simpler than it really is. Think of it as an archer who consistently aims at a target but always misses in the same direction because they are not accounting for the wind.
- High bias: The archer (or model) aims consistently but misses the target in a specific direction every time. This means the model is too simple and cannot capture the complexity of the data (underfitting).
- Low bias: The archer adjusts their aim better and can hit closer to the target. This means the model captures the complexity of the data more accurately.
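A minimal sketch of high bias in action: fitting a straight line to data generated from a quadratic function. No matter how much data the line sees, it systematically misses the curvature, so its error stays high. (The quadratic ground truth and noise level here are illustrative choices, not from any particular dataset.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth is quadratic, with a little observation noise.
x = np.linspace(-3, 3, 60)
y = x**2 + rng.normal(0, 0.3, size=x.size)

# A degree-1 polynomial (a straight line) is too simple for a parabola:
# this is high bias, i.e. underfitting.
coeffs = np.polyfit(x, y, deg=1)
pred = np.polyval(coeffs, x)
mse = np.mean((y - pred) ** 2)
print(f"Linear fit MSE on quadratic data: {mse:.2f}")
```

Even on its own training data the line's error is far above the noise floor — the signature of underfitting.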
Variance
Variance refers to error introduced because the model is too sensitive to small fluctuations in the training data. Imagine the same archer hitting different spots all over the target on every shot because they overreact to each gust of wind.
- High variance: The archer's shots are spread out all over the target. This means the model is too complex and is fitting the noise in the training data (overfitting).
- Low variance: The archer's shots are closer together, indicating that the model is not overly sensitive to small changes in the training data.
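A minimal sketch of high variance: fitting a very flexible model (a degree-12 polynomial) to only 15 noisy points. The curve threads through the training data almost perfectly, but on fresh data drawn from the same process its error blows up. (The sine-plus-noise data here is an illustrative choice, not from any particular dataset.)

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    """Draw n noisy observations of sin(3x) on [-1, 1]."""
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + rng.normal(0, 0.2, n)

x_train, y_train = sample(15)
x_test, y_test = sample(200)

# A degree-12 polynomial on 15 points has enough freedom to chase the noise:
# this is high variance, i.e. overfitting.
coeffs = np.polyfit(x_train, y_train, deg=12)
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
print(f"train MSE {train_mse:.4f}  vs  test MSE {test_mse:.4f}")
```

The large gap between training and test error is the telltale sign of overfitting: the model memorized noise rather than learning the underlying pattern.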
Balance Between Bias and Variance
The goal in machine learning is to find the right balance between bias and variance:
- High Bias + Low Variance: The model is simple and consistent but misses capturing the underlying patterns (underfitting).
- Low Bias + High Variance: The model is complex and captures too much detail, including noise (overfitting).
- Just Right: The model captures the underlying patterns without being overly sensitive to noise, achieving good generalization on new data.
Here's a visual analogy:
- High Bias (low variance): All the arrows are tightly clustered together but far from the bullseye.
- High Variance (low bias): The arrows are scattered all over the target, some near the bullseye and some far.
- Low Bias + Low Variance: The arrows are clustered together and close to the bullseye, indicating a good model.
Balancing bias and variance is key to building models that perform well on unseen data.
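One way to find that balance in practice is to sweep model complexity and pick the setting that minimizes error on held-out data. A minimal sketch, again using polynomial degree as the complexity knob (the data and degree choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Training set and a held-out validation set from the same noisy process.
x = rng.uniform(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.2, 40)
x_val = rng.uniform(-1, 1, 200)
y_val = np.sin(3 * x_val) + rng.normal(0, 0.2, 200)

# Low degrees underfit (high bias); very high degrees overfit (high variance).
val_errors = {}
for deg in [1, 3, 5, 9, 15]:
    coeffs = np.polyfit(x, y, deg)
    val_errors[deg] = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)

best = min(val_errors, key=val_errors.get)
print("validation MSE by degree:", {d: round(e, 3) for d, e in val_errors.items()})
print("best degree:", best)
```

Validation error typically traces a U-shape as complexity grows: it falls while added flexibility reduces bias, then rises again once variance dominates. The bottom of the U is the "just right" model.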
```mermaid
graph TD;
    A[Model Performance];
    A --> B(High Bias, Low Variance);
    A --> C(Low Bias, High Variance);
    A --> D(Low Bias, Low Variance);
    A --> E(High Bias, High Variance);
    B -->|Underfitting| F{{Example: Simple Model}};
    C -->|Overfitting| G{{Example: Complex Model}};
    D -->|Good Generalization| H{{Example: Balanced Model}};
    E -->|Poor Model| I{{Example: Poor Model}};
    style B fill:#f96,stroke:#333,stroke-width:2px;
    style C fill:#f66,stroke:#333,stroke-width:2px;
    style D fill:#6f6,stroke:#333,stroke-width:2px;
    style E fill:#f66,stroke:#333,stroke-width:2px;
    style F fill:#f96,stroke:#333,stroke-width:1px;
    style G fill:#f66,stroke:#333,stroke-width:1px;
    style H fill:#6f6,stroke:#333,stroke-width:1px;
    style I fill:#f66,stroke:#333,stroke-width:1px;
```