Relational Algebra
There are six basic operators:
- Select
- Project
- Union
- Set Difference
- Cartesian Product
Select Operation \(\sigma_\rho (r) \qquad \rho\text{ - Selection Predicate}\) Only the rows which satisfy the selection predicate are present in the returned relation.
Project Operation \(\Pi_{A_1, \ldots, A_k}(r) \qquad A_i\text{ - Attribute}\) The attributes that are not listed are removed from the relation $r$.
These operations can be composed as shown in the below expression. \(\Pi_{name}(\sigma_{dept\_name = Physics}(instructor))\)
Cartesian Product \(r_1 \times r_2\) Every row of $r_1$ is paired with every row of $r_2$. Cartesian product is usually used along with the select operation to compare rows between two different relations. This composition can be written in short as shown below: \(\sigma_\theta(r\times s) \equiv r \infty_\theta s\) Note that the predicate $\theta$ is over the union of the attributes of $r,s$.
Union Operation \(r\cup s\) The union operation is valid if:
- $r,s$ have the same arity (same number of attributes)
- The attribute domains must be comparable
Two relations with the above properties are said to be compatible.
==doubtful==
Set Intersection Operation \(r\cap s\) ==dammit man==
Set Difference Operation \(r - s\) ==eugh==
Assignment Operation \(tmp \leftarrow r\) This operator acts similar to assignment in programming languages. This is used to break complex queries into smaller, more manageable queries.
Rename Operation \(\rho_x(E)\qquad \rho_{x(A_1, \ldots A_n)}(E)\) This solves the issue of multiple attributes with the same name. That is, $r\times r$ would have every attribute repeated twice. The expression $E$ is renamed to $x$ in the first representation, and the attributes of the expression are renamed to $A_1, \ldots A_n$ in the second expression.
Aggregate Functions
Operation | Function |
---|---|
avg | Average value |
min | Minimum value |
max | Maximum value |
sum | Sum of values |
count | Number of values |
$Y_{aggregate function} (r)$ returns a single value. For example, $Y_{avg(salary)}(student)$ returns the average salary of all students.
However, ${}{dept_name} Y{avg(salary)}(student)$ groups the average salary of students in each department.