# Relational Algebra

There are six basic operators:

- Select
- Project
- Union
- Set Difference
- Cartesian Product

**Select Operation**
\(\sigma_\rho (r) \qquad \rho\text{ - Selection Predicate}\)
Only the rows which satisfy the selection predicate are present in the returned relation.

**Project Operation**
\(\Pi_{A_1, \ldots, A_k}(r) \qquad A_i\text{ - Attribute}\)
The attributes that are not listed are removed from the relation $r$.

These operations can be composed as shown in the below expression. \(\Pi_{name}(\sigma_{dept\_name = Physics}(instructor))\)

**Cartesian Product**
\(r_1 \times r_2\)
Every row of $r_1$ is paired with every row of $r_2$. Cartesian product is usually used along with the select operation to compare rows between two different relations. This composition can be written in short as shown below:
\(\sigma_\theta(r\times s) \equiv r \infty_\theta s\)
Note that the predicate $\theta$ is over the union of the attributes of $r,s$.

**Union Operation**
\(r\cup s\)
The union operation is valid if:

- $r,s$ have the same
*arity*(same number of attributes) - The attribute domains must be comparable

Two relations with the above properties are said to be **compatible**.

==doubtful==

**Set Intersection Operation**
\(r\cap s\)
==dammit man==

**Set Difference Operation**
\(r - s\)
==eugh==

**Assignment Operation**
\(tmp \leftarrow r\)
This operator acts similar to assignment in programming languages. This is used to break complex queries into smaller, more manageable queries.

**Rename Operation**
\(\rho_x(E)\qquad \rho_{x(A_1, \ldots A_n)}(E)\)
This solves the issue of multiple attributes with the same name. That is, $r\times r$ would have every attribute repeated twice. The expression $E$ is renamed to $x$ in the first representation, and the attributes of the expression are renamed to $A_1, \ldots A_n$ in the second expression.

## Aggregate Functions

Operation | Function |
---|---|

avg | Average value |

min | Minimum value |

max | Maximum value |

sum | Sum of values |

count | Number of values |

$Y_{aggregate function} (r)$ returns a single value. For example, $Y_{avg(salary)}(student)$ returns the average salary of all students.

However, ${}*{dept_name} Y*{avg(salary)}(student)$ groups the average salary of students in each department.