Supervised and Unsupervised Learning
In Supervised Learning, the desired output is observed in the training data: we learn from examples that carry true labels. Examples include linear regression and classification.
In Unsupervised Learning, the desired output is unobserved in the training data: structure must be discovered from the inputs alone, for example by grouping similar objects together. Examples include clustering and dimensionality reduction.
There are three canonical learning settings:
- Regression - Supervised
  Estimate a continuous output, e.g. via a least-squares fit
- Classification - Supervised
  Given the attributes of an object, assign a label to it
- Unsupervised Learning
  Clustering and dimensionality reduction are prominent examples (see the sketch after this list)
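As a quick illustration of the unsupervised setting, here is a minimal sketch (the data and the choice of scikit-learn's KMeans are illustrative, not prescribed by these notes) that groups unlabeled points into clusters:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled points forming two loose groups in the plane;
# no y labels are provided anywhere.
points = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
                   [5.0, 5.1], [5.2, 4.9], [4.9, 5.2]])

# KMeans discovers the grouping from the inputs alone.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print(labels)  # e.g. [0 0 0 1 1 1]; cluster indices are arbitrary
```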
Supervised Learning
Formally, let $\mathcal{X}$ be the input space and $\mathcal{Y}$ be the output space. Given training pairs $(x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$, we would like to obtain a function $f$ belonging to the function family $\mathcal{F}$ such that $y_i \approx f(x_i)$.
In linear regression, $\mathcal{F}$ is the space of linear (affine) functions, e.g. $f(x) = ax + b$ in one dimension.
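As a minimal sketch of this setup (the data here is made up for illustration), choosing $f \in \mathcal{F}$ for one-dimensional linear regression amounts to choosing a slope and an intercept:

```python
import numpy as np

# Illustrative training pairs (x_i, y_i) from X x Y.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# The linear function family F: each (a, b) picks one member f(x) = a*x + b.
def f(x, a, b):
    return a * x + b

# One candidate member of F; we want f(x_i) to be close to y_i.
print(f(x, a=2.0, b=1.0))  # [1. 3. 5. 7. 9.] -- close to y
```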
It is not guaranteed that the training data is error-free. We would like the final estimator to be robust to errors, and one way to help with this is data cleansing (pre-processing).
Error Function
The error function $\mathcal{E}$ takes a candidate function (the curve) and the data as input and yields a real number as output. It is used to quantitatively judge whether a function is a “good fit” for the given data.
Some examples of $\mathcal{E}$ are $\sum_i \vert f(x_i)-y_i\vert$ and $\sum_i (f(x_i)-y_i)^2$. We ideally want each term to be non-negative, so that positive and negative errors don’t cancel out.
Using the error function $\sum_i (f(x_i)-y_i)^2$ is known as the Method of Least Squares, or Ordinary Least Squares (OLS).
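Here is a minimal sketch of OLS on the same kind of one-dimensional data (the values are illustrative; `np.linalg.lstsq` is one standard way to minimize the squared error):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix [x, 1], so that f(x) = a*x + b is X @ [a, b].
X = np.column_stack([x, np.ones_like(x)])

# lstsq finds the (a, b) minimizing the squared error sum((X @ [a, b] - y)**2).
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

squared_error = np.sum((a * x + b - y) ** 2)
print(a, b, squared_error)  # fitted slope, intercept, and E(f)
```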