Supervised and Unsupervised Learning

Supervised Learning is learning to predict outputs from training data that contains the true labels (the desired outputs). Examples include linear regression and classification.

Unsupervised Learning is finding structure in data, such as grouping similar objects together; the desired output is unobserved in the training data. Examples include clustering and dimensionality reduction.

There are three canonical learning settings (a short code sketch follows the list):

  1. Regression - Supervised

Estimate the parameters of a function, such as a least-squares fit

  2. Classification - Supervised

Given the features of an object, assign a label to it

  3. Unsupervised Learning

Clustering and dimensionality reduction are prominent examples
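
To make the three settings concrete, here is a minimal sketch using scikit-learn and toy data (both the library choice and the data are assumptions, not from these notes):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans

X = np.array([[0.0], [1.0], [2.0], [3.0]])

# 1. Regression (supervised): real-valued labels are observed.
reg = LinearRegression().fit(X, [0.1, 1.9, 4.2, 5.8])

# 2. Classification (supervised): discrete labels are observed.
clf = LogisticRegression().fit(X, [0, 0, 1, 1])

# 3. Unsupervised: no labels; group similar inputs together.
groups = KMeans(n_clusters=2, n_init=10).fit_predict(X)
```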


Supervised Learning

Formally, let $\mathcal{X}$ be the input space and $\mathcal{Y}$ be the output space. We would like to obtain a function $f$ belonging to the function family $\mathcal{F}$ such that $y_i \approx f(x_i)$, where $(x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$.

In linear regression, $\mathcal{F}$ is the Linear Function Space.
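
Concretely, for $\mathcal{X} = \mathbb{R}^d$ this is the family of functions $f(x) = w^\top x + b$, with weight vector $w \in \mathbb{R}^d$ and intercept $b \in \mathbb{R}$ (this notation is introduced here for concreteness).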

It is not guaranteed that the training data is free of errors. We would like the final estimator to be robust to such errors, and one way to achieve this is Data Cleansing (pre-processing).
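
As a small illustration, one common cleansing step is dropping rows with missing values (the use of pandas and the toy table are assumptions, not from these notes):

```python
import pandas as pd

# A toy table with missing entries.
df = pd.DataFrame({"x": [0.0, 1.0, None, 3.0],
                   "y": [1.1, 2.9, 5.2, None]})

# Drop incomplete rows before fitting, so obvious errors
# (here, missing values) do not corrupt the estimator.
clean = df.dropna()
```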

Error Function

The error function $\mathcal{E}$ takes the candidate function and the data as input and yields a real number as output. This is used to quantitatively judge whether a function is a “good fit” for the given data.

Some examples of $\mathcal{E}$ are $\sum \vert f(x_i)-y_i\vert$ and $\sum (f(x_i)-y_i)^2$. We would ideally want each term of the error to be nonnegative (so that positive and negative residuals don’t cancel out).
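
As a sketch, both error functions can be written directly in Python (the candidate function and data below are made up for illustration):

```python
def abs_error(f, data):
    """Sum of absolute residuals: sum |f(x_i) - y_i|."""
    return sum(abs(f(x) - y) for x, y in data)

def squared_error(f, data):
    """Sum of squared residuals: sum (f(x_i) - y_i)^2."""
    return sum((f(x) - y) ** 2 for x, y in data)

data = [(0, 1.1), (1, 2.9), (2, 5.2)]
f = lambda x: 2 * x + 1           # a candidate linear fit
print(abs_error(f, data))         # ~0.4
print(squared_error(f, data))     # ~0.06
```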

Using the error function $\sum (f(x_i)-y_i)^2$ is known as the Method of Least Squares, or Ordinary Least Squares (OLS).
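
A minimal OLS sketch, assuming a one-dimensional input and toy data: `np.linalg.lstsq` minimizes exactly the squared-error criterion above.

```python
import numpy as np

# Toy data: y is roughly 2x + 1 plus a little noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix with a column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# lstsq finds (a, b) minimizing sum((a*x_i + b - y_i)**2), i.e. the OLS error.
(a, b), residuals, *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"f(x) = {a:.3f}x + {b:.3f}, squared error = {residuals[0]:.4f}")
```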