Good background reading on the statistics behind this:
https://ds100.org/course-notes-su23/probability_2/probability_2.html


<aside> <img src="/icons/bookmark_lightgray.svg" alt="/icons/bookmark_lightgray.svg" width="40px" />
Observation Variance ($\epsilon$)
Observation variance refers to the inherent variability in the data itself. Even if you had a perfect model, the observations (data points) you collect may vary due to noise, measurement errors, or randomness in the process you are modeling. High observation variance means that even repeated measurements under the same conditions can yield different results.
Observation variance is the variance $\sigma^2$ in the data itself. It is present in the observed data distribution:
$$ \text{Data} = g(x) + \epsilon $$
where:
- $g(x)$ is the true underlying function,
- $\epsilon$ is zero-mean random noise with variance $\sigma^2$.

The variance $\sigma^2$ is the observation variance: the inherent noise in the data that no model can eliminate ("irreducible error").
</aside>
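A quick simulation makes this concrete. Below is a minimal sketch (the true function `g`, the noise level `sigma`, and the query point are all illustrative assumptions): repeated measurements at the *same* input still scatter, and their variance matches $\sigma^2$, not anything a model could fix.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # hypothetical "true" function, chosen only for illustration
    return np.sin(x)

sigma = 0.5   # assumed standard deviation of the observation noise
x = 1.0       # one fixed input, measured many times

# Data = g(x) + eps: same x, fresh noise each measurement
observations = g(x) + rng.normal(0, sigma, size=10_000)

# Sample variance approaches sigma**2 = 0.25 -- the irreducible variance
print(np.var(observations))
```

Even a perfect model that outputs exactly $g(x)$ would still incur this $\sigma^2$ of squared error on average.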
<aside> <img src="/icons/bookmark_lightgray.svg" alt="/icons/bookmark_lightgray.svg" width="40px" />
Model Variance $(\text{Var}(\hat{Y}(x)))$
Variance measures how much the model’s predictions change if we use different training datasets. A model with high variance tends to overfit the data, meaning it is too sensitive to fluctuations in the training data and performs poorly on unseen data. High variance models are too complex and capture noise along with the signal.
Variance is the variability of the model’s prediction due to different training sets:
$$ \text{Variance} = \text{Var}(\hat{Y}(x))=\mathbb{E}\left[ \left( \hat{Y}(x) - \mathbb{E}[\hat{Y}(x)] \right)^2 \right] $$
where:
- $\hat{Y}(x)$ is the prediction at $x$ of a model fit on a randomly drawn training set,
- the expectation $\mathbb{E}[\cdot]$ is taken over those training sets.

High variance means the model's predictions fluctuate significantly from one training set to another, a hallmark of overfitting.
</aside>
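Model variance can be estimated directly by repeating the whole pipeline: draw many training sets from the same process, refit the model on each, and look at the spread of $\hat{Y}(x)$ at one fixed query point. The sketch below assumes a toy setup (sine truth, polynomial fits, specific sizes and degrees) and shows that a more flexible model yields a larger $\text{Var}(\hat{Y}(x))$.

```python
import numpy as np

rng = np.random.default_rng(0)
g = np.sin                      # hypothetical "true" function
sigma, n_train, n_sets = 0.5, 30, 2000
x0 = 1.0                        # fixed query point

def predictions_at_x0(degree):
    """Fit one polynomial per freshly drawn training set; collect Y_hat(x0)."""
    preds = np.empty(n_sets)
    for i in range(n_sets):
        x = rng.uniform(0, 3, n_train)
        y = g(x) + rng.normal(0, sigma, n_train)   # Data = g(x) + eps
        preds[i] = np.polyval(np.polyfit(x, y, degree), x0)
    return preds

var_simple = predictions_at_x0(1).var()    # rigid model: predictions barely move
var_complex = predictions_at_x0(5).var()   # flexible model: predictions swing more

print(var_simple, var_complex)
```

The empirical variance of `preds` is exactly the Monte Carlo estimate of $\mathbb{E}[(\hat{Y}(x_0) - \mathbb{E}[\hat{Y}(x_0)])^2]$, and raising the polynomial degree inflates it: the complex model chases the noise in each particular training set.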