MAP and Bayes Estimates
The posterior distribution’s mean and variance were calculated in the previous lecture. Because the posterior is a Gaussian, the MAP estimate is the $w$ at which the posterior attains its global maximum (the mode), while the Bayes estimate is the expected value of $w$ under the posterior.
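Since a Gaussian’s mode coincides with its mean, the two estimates agree here. Writing the posterior with placeholder symbols $\mu_N$ and $\Sigma_N$ for the mean and covariance derived last lecture (notation assumed, not the lecture’s exact symbols):

$$
w \mid \mathcal{D} \sim \mathcal{N}(\mu_N, \Sigma_N)
\quad\Longrightarrow\quad
\hat{w}_{\text{MAP}} = \arg\max_w\, p(w \mid \mathcal{D}) = \mu_N = \mathbb{E}[w \mid \mathcal{D}] = \hat{w}_{\text{Bayes}}.
$$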
Pure Bayesian
Didn’t understand anything, kek
Regularized Ridge Regression
The Bayes and MAP estimates obtained for linear regression also coincide with the solution of regularized (ridge) regression: the MAP estimate under a Gaussian prior on $w$ minimizes the least-squares loss plus a $\lambda\,\|w\|_2^2$ penalty.
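As a reminder, the ridge objective and its closed-form solution (standard form, with $X$ the design matrix and $y$ the targets; symbols may differ slightly from the lecture’s):

$$
\hat{w}_{\text{ridge}} = \arg\min_w\; \|Xw - y\|_2^2 + \lambda\,\|w\|_2^2 = (X^\top X + \lambda I)^{-1} X^\top y.
$$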
In the case of polynomial regression, increasing $\lambda$ tends to decrease the curvature of the fitted curve, since the 2-norm of the coefficient vector $w$ is being penalized and large higher-order coefficients become expensive. A small sketch of this effect follows below.
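A minimal sketch of the smoothing effect, assuming a 1-D polynomial feature map and the closed-form ridge solution above (the toy data, degree, and $\lambda$ values are made up for illustration):

```python
import numpy as np

def poly_features(x, degree):
    # Map scalar inputs to [1, x, x^2, ..., x^degree]
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(x.shape)  # noisy toy data

X = poly_features(x, degree=9)
for lam in [0.0, 0.1, 10.0]:
    w = ridge_fit(X, y, lam)
    # Larger lambda shrinks the coefficients, giving a flatter / smoother fit
    print(f"lambda={lam:5.1f}  ||w||_2 = {np.linalg.norm(w):.3f}")
```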
Replacing the 2-norm with the 1-norm gives Lasso regression, and using the 0-norm (the number of non-zero entries of $w$) gives what is called a Support Based Penalty.
The additional term is usually denoted $\Omega(w)$, and this formulation is known as the penalized formulation. The original problem, of minimizing the squared error while requiring the 2-norm of $w$ to be at most $\theta$, is called the constrained formulation.
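Side by side, the two formulations look like this (with $\Omega(w) = \|w\|_2^2$ in the ridge case; $\lambda$ and $\theta$ play corresponding roles):

$$
\text{penalized:}\;\; \min_w \|Xw - y\|_2^2 + \lambda\,\Omega(w)
\qquad\qquad
\text{constrained:}\;\; \min_w \|Xw - y\|_2^2 \;\;\text{s.t.}\;\; \|w\|_2^2 \le \theta.
$$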
Lasso regression tends to yield sparser solutions; that is, elements of $w$ are more likely to be exactly 0 with this penalty. Geometrically, the 1-norm ball has many corners (points where some coordinates are zero), and the optimum tends to land on one of them.
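A small sketch of the sparsity effect, assuming scikit-learn is available (`Ridge` and `Lasso` here are the library’s implementations, not the course’s algorithm, and the data is synthetic):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(1)
n, d = 100, 20
X = rng.standard_normal((n, d))
true_w = np.zeros(d)
true_w[:3] = [2.0, -1.5, 1.0]          # only 3 informative features
y = X @ true_w + 0.1 * rng.standard_normal(n)

ridge_w = Ridge(alpha=1.0).fit(X, y).coef_
lasso_w = Lasso(alpha=0.1).fit(X, y).coef_

# Ridge shrinks coefficients but rarely makes them exactly zero;
# Lasso drives most of the uninformative coefficients to exactly 0.
print("zeros in ridge w:", np.sum(ridge_w == 0.0))
print("zeros in lasso w:", np.sum(lasso_w == 0.0))
```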
The 1-norm penalty corresponds to a Laplacian prior on $w$, and the resulting problem has no closed-form solution, so Lasso regression must be solved iteratively (for example with gradient-based methods). The algorithm for Lasso regression will be explained in the next lecture.