22 minute read

Overview of SQL

The SQL language has several parts:

  • Data-Definition Language (DDL)
    Provides commands for defining relation schemas, deleting relations, and modifying relation schemas.
  • Data-Manipulation Language (DML)
    Provides the ability to query information from the database and to insert tuples into, delete tuples from, and modify tuples in the database.
  • Integrity
    Provides commands for specifying integrity constraints that the data stored in the database must satisfy.
  • View Definition
    Provides commands for defining views.
  • Transaction Control
    Provides commands for specifying the beginning and ending of transactions.
  • Embedding SQL & Dynamic SQL
    Define how SQL statements can be embedded within general-purpose programming languages.
  • Authorization
    Provides commands for specifying access rights to relations and views.


SQL Data Definition

Basic Types

The SQL standard supports a variety of built-in types, including:

  • CHAR(n)
    Fixed-length character string, with user-specified length $n$
  • VARCHAR(n)
    Variable-length character string, with user-specified maximum length $n$
  • INT
    Integer (machine-dependent)
  • SMALLINT
    Small integer (machine-dependent)
  • NUMERIC(p, d)
    Fixed-point number, with user-specified precision of $p$ digits (plus a sign), with $d$ digits to the right of the decimal point NUMERIC(3, 1) allows 44.5 to be stored exactly, but neither 444.5 or 0.32 can be stored exactly.
  • REAL, DOUBLE PRECISION
    Floating-point and double precision floating-point numbers (machine-dependent precision)
  • FLOAT(n)
    Floating-point number, with user-specified precision of at least $n$ digits

Each type may include a special value called the NULL value. This indicates an absent value that may exist but be unknown or that may not exist at all.


Basic schema definition

An SQL relation is defined using the CREATE TABLE command with the following general form:

1
2
3
4
5
6
7
8
9
CREATE TABLE r
(
	A_1 D_1,
	... ,
	A_n D_n,
	[Integrity Constraint 1],
	... ,
	[Integrity Constraint k]
)
  • r: Name of the relation
  • A_i: Attribute name in the schema of the relation r
  • D_i: Domain of attribute A_i; Type of attribute A_i

SQL prevents any update to the database that violates an integrity constraints. SQL supports a number of different integrity constraints:

  • PRIMARY KEY(A_1, ... , A_n)
    • Attributes A_1, ... , A_n form the primary key for the relation.
    • Required to be non-null and unique.
  • FOREIGN KEY(A_1, ... , A_n) REFERENCES s
    Values of attributes A_1, ... , A_n for any tuple in the relation must correspond to values of the primary key attributes (or any attributes with UNIQUE constraint specified) of some tuple in relation s

    Some database systems, including MySQL, require an alternative syntax, FOREIGN KEY(A_1, ... , A_n) REFERENCES s(B_1, ..., B_n), where the referenced attributes in the referenced table are listed explicitly.
  • NOT NULL
    NULL value is not allowed for that attribute

Here are some examples.

1
2
3
4
5
6
7
8
9
CREATE TABLE instructor
(
	ID           CHAR(5),
	name         VARCHAR(20) NOT NULL,
	dept_name    VARCHAR(20),
	salary       NUMERIC(8, 2),
	PRIMARY KEY(ID),
	FOREIGN KEY(dept_name) REFERENCES department
);
1
2
3
4
5
6
7
8
9
10
11
12
CREATE TABLE takes
(
	ID           VARCHAR(5),
	course_id    VARCHAR(8),
	sec_id       VARCHAR(8),
	semester     VARCHAR(6),
	year         NUMERIC(4,0),
	grade        VARCHAR(2), 
	PRIMARY KEY(ID, course_id, sec_id, semester, year),
	FOREIGN KEY(ID) REFERENCES student,
    FOREIGN KEY(course_id, sec_id, semester, year) REFERENCES section
);


A newly created relation is empty initially. Inserting tuples into a relation, updating them, and deleting them are possible by following commands:

  • INSERT INTO s t
    • Inserts tuple t into the relation s
    • e.g., INSERT INTO instructor VALUES('10211', 'Smith', 'Biology', 66000)
  • DELETE FROM s t
    • Deletes tuple t from the relation r
    • DELETE FROM r deletes all tuples from the relation r
  • DROP TABLE r
    • Deletes all information of r, including the table itself, from the database
    • After r is dropped, no tuples can be inserted into it unless it is re-created with the CREATE TABLE command
    • e.g., DROP TABLE student
  • ALTER TABLE
    • ALTER TABLE r ADD A D
      • A is the name of the attribute to be added to relation r, and D is the domain of A
      • All existing tuples in the relation are assigned NULL as the value for the new attribute
    • ALTER TABLE r DROP A
      • A is the name of an attribute of relation r
      • Drop attributes from a relation


Basic Structure of SQL Queries

A typical SQL query consists of three clauses: SELECT, FROM, and WHERE

1
2
3
SELECT  A_1, ... , A_n
FROM    r_1, ... , r_m
WHERE   p
  • A_i: Attributes
  • r_i: Relations
  • p: Predicate

Note that the result of an SQL query is a relation.

SELECT Clause

  • SELECT clause lists the attributes desired in the result of a query.
    • Corresponding to the projection operation $\Pi$ of the relational algebra
    • e.g.,

      1
      2
      
        SELECT name 
        FROM instructor
      
  • DISTINCT keyword inserted after SELECT eliminates all duplicates.
    • In practice, duplicate elimination is time-consuming. Therefore, SQL allows duplicates in database relations as well as in the results of SQL expressions. Thus, the preceding SQL query lists each department name once for every tuple in which it appears in the instructor relation.
    • e.g.,

      1
      2
      
        SELECT DISTINCT dept_name
        FROM instructor
      
  • ALL keyword, in contrast, specifies explicitly that duplicates should not be removed, although it is the default of SELECT clause.
    • In practice, duplicate elimination is time-consuming. Therefore, SQL allows duplicates in database relations as well as in the results of SQL expressions.Thus, the preceding SQL query lists each depart- ment name once for every tuple in which it appears in the instructor relation.
    • e.g.,

      1
      2
      
        SELECT ALL dept_name
        FROM instructor
      

Few remarks:

  • $*$ in the SELECT clause denotes all attributes;
    • e.g.,

      1
      2
      
        SELECT * 
        FROM instructor
      
    • e.g.,

      1
      2
      3
      
        SELECT instructor.* 
        FROM instructor, teaches
        WHERE instructor.ID = teaches.ID;
      
  • An attribute can be a literal without FROM clause;
    • e.g.,

      1
      
        SELECT '437'
      
    • The result is a table with one column and a single row with value “437”
    • A column name can be given using:

      1
      
        SELECT '437' AS foo
      
  • An attribute can be a literal with FROM clause;
    • e.g.,

      1
      
        SELECT 'A' FROM instructor
      
    • The result is a table with one column and $N$ rows (the number of tuples in the instructor table), each row with value “A”

  • SELECT clause can contain arithmetic expressions $+$, $−$, $*$, and $/$ operating on constants or attributes of tuples
    • e.g., SELECT ID, name, dept name, salary * 1.1 FROM instructor;
    • Returning a relation that is the same as the instructor relation, except that the value of the salary attribute is multiplied by 1.1
    • Being possible to rename “salary/12” using the as clause

      1
      2
      
        SELECT ID, name, salary/12 AS monthly_salary
        FROM instructor
      

FROM Clause

  • FROM clause lists the relations involved in the query.
    • Corresponding to the Cartesian product $\times$ of the relational algebra
    • e.g., find the Cartesian product $\textrm{instructor} \times \textrm{teaches}$

      1
      2
      
        SELECT *
        FROM instructor, teaches
      
    • the Cartesian product is useful combined with WHERE clause condition (selection operation $\sigma$ in the relational algebra)

WHERE Clause

  • WHERE clause specifies conditions that the resulting relation must satisfy.
    • Corresponding to the selection operation $\sigma$ of the relational algebra
    • e.g., find all instructors in Comp. Sci. dept

      1
      2
      3
      
        SELECT name
        FROM instructor
        WHERE dept_name = 'Comp. Sci.'
      
    • the Cartesian product is useful combined with WHERE clause condition (selection operation $\sigma$ in the relational algebra)
  • SQL allows the use of the logical connectives AND, OR, and NOT.
    • The operands can be expressions involving the comparison operators $<$, $\leq$, $>$, $\geq$, $=$, and $<>$ ($\neq$)
  • Comparisons can be applied to results of arithmetic expressions.



Additional Basic Operations

A number of additional basic operations are supported in SQL.

Rename operation

  • SQL allows renaming relations and attributes using the AS clause, taking the form:

    1
    
      old_name AS new_name
    
    • e.g., find the names of all instructors who earn a higher salary than some instructor in ‘Comp. Sci’
    1
    2
    3
    
      SELECT T.name
      FROM instructor AS T, instructor AS S
      WHERE T.salary > S.salary AND S.dept_name = 'Comp. Sci.'
    
    • The keyword AS is optional and may be omitted
    1
    
      instructor AS T  instructor T
    
    • Identifier, such as T and S, that is used to rename a relation is referred to as a correlation name in the SQL standard, or table alias, correlation variable, or tuple variable


String operation

  • SQL includes a string-matching operator for comparisons on character strings, using the operator Like:

    1
    
      LIKE some_string
    
  • It uses patterns; they can be described using two special characters: % and _.
    • %: Matches any string.
    • _: Matches any character.
    • e.g.,
      • 'Intro%' matches any string beginning with ‘Intro’;
      • '%Comp%' matches any string containing ‘Comp’ as a substring;
      • '_ _ _' matches any string of exactly three characters;
      • '_ _ _ %' matches any string of at least three characters
    • e.g., find the names of all instructors whose name includes the substring “dar”

      1
      2
      3
      
        SELECT name
        FROM instructor
        WHERE name LIKE '%dar%' 
      
    • Patterns are case-sensitive; uppercase characters do not match lowercase characters, or vice versa.
  • SQL supports a variety of string operations such as:
    • str1 || str2: Concatenate str1 and str2.
    • UPPER(str), LOWER(str): Converting from lower to upper / upper to lower cases.
    • LENGTH(str): Finding the length of str.
    • SUBSTR(str, start, length): Extracting substring.

Escape Character

  • For patterns to include the special pattern characters (% and _), SQL allows the specification of an escape character defined using the ESCAPE keyword.
  • Used immediately before a special pattern character to indicate that it is to be treated like a normal character.

    1
    
      LIKE some_string ESCAPE escape_character
    
    • For example,
      • LIKE 'ab\%cd%' ESCAPE '\' matches all strings beginning with ‘ab%cd’.
      • LIKE 'ab\\cd%' ESCAPE '\' matches all strings beginning with ‘ab\cd’.

Ordering the Display of Tuples

  • The ORDER BY clause causes the tuples in the results of a query to appear in sorted (ascending) order.

    1
    
      ORDER BY some_attribute
    
    • e.g., list the names of all instructors in alphabetical order
    1
    2
    3
    
      SELECT DISTINCT name
      FROM   instructor
      ORDER BY name
    
  • We may specify DESC for descending order or ASC for ascending order; ascending order is the default setting.
    • e.g.,

      1
      
        ORDER BY name DESC
      
  • ORDER BY can sort on multiple attributes.
    • e.g.,

      1
      2
      3
      
        SELECT * 
        FROM instructor 
        ORDER BY salary DESC, name ASC;
      

WHERE Clause Predicates

  • SQL includes a BETWEEN comparison operator to simplify WHERE clauses.

    1
    
      WHERE some_attribute BETWEEN start_val AND end_val
    
  • SQL allows the use of the notation (v_1, ... , v_n) to denote a tuple of arity $n$ containing values v_1, … , v_n.
  • The comparison operator can be used on tuples, and the ordering is defined lexicographically.
    • e.g., (a_1, a_2) <= (b_1, b_2) is true if both a_1 <= b_1 and a_2 <= b_2.



Set Operations in SQL

The SQL operations union, intersect, and except operate on relations and correspond to the mathematical set operations $\cup$, $\cap$, and $−$.

UNION operation

  • The SQL operation UNION corresponds to the mathematical operation $\cup$.

    1
    
      R1 UNION R2
    
    • e.g., Find the courses that ran in Fall 2017 or in Spring 2018

      1
      2
      3
      
        SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2017
        UNION
        SELECT course_id FROM section WHERE semester = 'Spring' AND year = 2018
      
  • The UNION operation automatically eliminates duplicates.

    • To retain duplicates, one must use UNION ALL operation instead.

INTERSECT operation

  • The SQL operation INTERSECT corresponds to the mathematical operation $\cap$.

    1
    
      R1 INTERSECT R2
    
    • e.g., Find the courses that ran in Fall 2017 and in Spring 2018

      1
      2
      3
      
        SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2017
        INTERSECT
        SELECT course_id FROM section WHERE semester = 'Spring' AND year = 2018
      
  • The INTERSECT operation automatically eliminates duplicates.

    • To retain duplicates, one must use INTERSECT ALL operation instead.

EXCEPT operation

  • The SQL operation EXCEPT corresponds to the mathematical operation $-$ (set-difference).
    • It outputs all tuples from R1 that do not occur in R2.

      1
      
        R1 EXCEPT R2
      
    • e.g., Find the courses that ran in Fall 2017 but not in Spring 2018

      1
      2
      3
      
        SELECT course_id FROM section WHERE semester = 'Fall' AND year = 2017
        EXCEPT
        SELECT course_id FROM section WHERE semester = 'Spring' AND year = 2018
      
  • The EXCEPT operation automatically eliminates duplicates.
    • To retain duplicates, one must use EXCEPT ALL operation instead.



NULL Values in SQL

  • NULL values signifies an unknown value or that a value does not exist.
    • Arithmetic operations
      • The result of any arithmetic expression involving NULL is NULL.
    • Comparison operations
      • SQL treats the result of any comparison involving a NULL value (other than IS NULL and IS NOT NULL) as UNKNOWN
      • e.g., 5 < NULL, NULL = NULL
      • The predicate in a WHERE clause can involve Boolean operations such as AND, OR, and NOT.

        1
        2
        3
        4
        5
        6
        7
        8
        9
        10
        
        --- AND
            TRUE AND UNKNOWN = UNKNOWN
            FALSE AND UNKNOWN = FALSE
            UNKNOWN AND UNKNOWN = UNKNOWN
        --- OR
            UNKNOWN OR TRUE = TRUE
            UNKNOWN OR FALSE = UNKNOWN
            UNKNOWN OR UNKNOWN = UNKNOWN
        --- NOT
            NOT UNKNOWN = UNKNOWN
        

        The result of WHERE clause predicate is treated as FALSE if the value evaluates to UNKNOWN.

    • IS NULL and IS NOT NULL
      • The predicate IS NULL can be used to check for NULL values.
      • The predicate IS NOT NULL succeeds if the value on which it is applied is not NULL.



Aggregate Functions

  • Aggregate functions are functions that take a collection of values as input, and return a single value.
  • SQL offers five aggregate functions:
    • Average: AVG
    • Minimum: MIN
    • Maximum: MAX
    • Total: SUM
    • Count: COUNT
  • The input to AVG and SUM must be a collection of numbers.
  • Examples:
    • e.g., Find the average salary of instructors in the Computer Science department

      1
      2
      3
      
        SELECT AVG (salary) AS avg_salary
        from instructor
        WHERE dept_name = 'Comp. Sci.'
      
    • e.g., Find the total number of instructors who teach a course in the Spring 2018 semester

      1
      2
      3
      
        SELECT AVG (salary)
        from instructor
        WHERE dept_name = 'Comp. Sci.'
      
    • e.g., Find the total number of instructors (need duplicates elimination) who teach a course in the Spring 2018 semester

      1
      2
      3
      
        SELECT COUNT (DISTINCT ID)
        FROM teaches
        WHERE semester = 'Spring' AND year = 2018
      
    • e.g., Find the number of tuples in the course relation

      1
      2
      
        SELECT COUNT (*)
        FROM course
      
      • SQL does not allow the use of DISTINCT with COUNT (*).

GROUP BY clause

  • GROUP BY clause uses the given attributes to form groups.

    1
    
      GROUP BY some_attributes
    
    • e.g., Find the average salary of instructors in each department.

      1
      2
      3
      
        SELECT   dept_name, AVG(salary) AS avg_salary
        FROM     instructor
        GROUP BY dept_name
      
  • Attributes in SELECT statement that is outside of aggregate functions must appear in the GROUP BY clause; otherwise the query is treated as erroneous.

    • e.g., the following is an erroneous query

      1
      2
      3
      
        SELECT   dept_name, ID, AVG(salary) AS avg_salary
        FROM     instructor
        GROUP BY dept_name
      
    • It’s worth noting that all DBMS do not require this condition.

HAVING clause

  • SQL applies predicated in the HAVING clause after the formation of groups, whereas predicates in the WHERE clause are applied before the group formation.

    1
    2
    
      GROUP BY some_attributes
      HAVING   predicate
    
  • A typical query containing aggregation, GROUP BY, and/or HAVING clauses is defined by the following sequence of operations:

  1. The FROM clause is first evaluated to get a relation.
  2. If a WHERE clause is present, the predicate in the WHERE clause is applied on the result relation of the FROM clause.
  3. Tuples satisfying the WHERE predicate are then placed into groups by the GROUP BY clause (if present). Otherwise the entire set of tuples satisfying WHERE clause’s predicate is treated as one single group.
  4. The HAVING clause (if present) is applied to each group, the groups that do not satisfying the HAVING predicate are removed.
  5. The SELECT clause uses the remaining groups to generate tuples of the result of the query, applying the aggregate functions to get a single result tuple for each group.
  • e.g., For each course section offered in 2017, find the average total credits (tot cred) of all students enrolled in the section, if the section has at least 2 students.

    1
    2
    3
    4
    5
    
      SELECT course_id, semester, year, sec_id, AVG (tot_cred)
      FROM student, takes
      WHERE student.ID = takes.ID AND year = 2017
      GROUP BY course_id, semester, year, sec_id
      HAVING COUNT (ID) >= 2;
    

Aggregation with NULL

  • All aggregate functions except COUNT ignore NULL values in their input collection.
  • The COUNT of an empty collection is defined to be $0$.
  • All other aggregate operations return a value NULL when applied to an empty collection.



Nested Subqueries

SQL provides a mechanism for the nesting of subqueries.

  • A subquery is a SELECT-FROM-WHEREexpression that is nested within another query.
  • The nesting can be done in the following SQL query

    1
    2
    3
    
      SELECT A_1, A_2, ..., A_n
      FROM r_1, r_2, ..., r_m
      WHERE P
    

    as follows:

    • FROM clause: r_i can be replaced by any valid subquery
    • WHEREclause: P can be replaced with an expression of the form: B OPERATION (subquery)
      • B: attribute
      • OPERATION: e.g., IN, NOT IN, etc.
    • SELECT clause: A_i can be replaced be a subquery that generates a single value

Set Membership

SQL allows testing tuples for membership in a relation, using IN or NOT IN clause.

  • Examples
    • e.g., Find the courses offered in Fall 2017 and in Spring 2018

      1
      2
      3
      4
      5
      6
      
        SELECT DISTINCT course_id
        FROM section
        WHERE semester = 'Fall' AND year = 2017 AND 
                course_id IN (SELECT course_id
                              FROM section
                              WHERE semester = 'Spring' AND year = 2018)
      
    • e.g., Name all instructors whose name is neither “Mozart” nor Einstein”

      1
      2
      3
      
        SELECT DISTINCT name
        FROM instructor
        WHERE name NOT IN ('Mozart', 'Einstein') 
      
    • e.g., Find the total number of distinct students who have taken course sections taught by the instructor with ID 10101

      • As seen in the example query below, it is possible to test for membership in an arbitrary relation.
      1
      2
      3
      4
      5
      6
      7
      8
      
        SELECT  COUNT(DISTINCT ID)
        FROM    takes
        WHERE   (course_id, sec_id, semester, year) IN
        (
            SELECT  course_id, sec_id, semester, year
            FROM    teaches
            WHERE   teaches.ID = 10101
        ); 
      

Set Comparison

SOME clause

SOME clause can check if the predicate is satisfied by at least one tuple in the relation.

  • F comp SOME r $\Leftrightarrow$ $\exists t \in r$ such that $(F \; \textrm{comp} \; t)$, where $\textrm{comp} \in {<,\leq,>,\geq,=,\neq }$.
  • = SOME $\Leftrightarrow$ IN, however <> SOME $\nLeftrightarrow$ NOT IN

  • e.g., Find the names of all instructors whose salary is greater than that of some (at least one) instructor in the Biology department

    1
    2
    3
    4
    
      SELECT  DISTINCT T.name
      FROM    instructor AS T,
                      instructor AS S
      WHERE   T.salary > S.salary AND S.dept_name = 'Biology';
    
    1
    2
    3
    4
    5
    6
    
      -- Using SOME clause
      SELECT  name
      FROM    instructor
      WHERE   salary > SOME (SELECT  salary
                             FROM    instructor
                             WHERE   dept_name = 'Biology');
    

ALL clause

ALL clause can check if the predicate is satisfied by all of the tuples in the relation.

  • F comp ALL r $\Leftrightarrow$ $\forall t \in r$ such that $(F \; \textrm{comp} \; t)$, where $\textrm{comp} \in {<,\leq,>,\geq,=,\neq }$.
  • <> ALL $\Leftrightarrow$ NOT IN, however = ALL $\nLeftrightarrow$ IN

  • e.g., Find the names of all instructors whose salary is greater the salary of all instructors in the Biology department

    1
    2
    3
    4
    5
    
      SELECT  name
      FROM    instructor
      WHERE   salary > ALL (SELECT  salary
                            FROM    instructor
                            WHERE   dept_name = 'Biology');
    

Test for Empty Relations

The EXISTS construct returns the value TRUE if the argument subquery is nonempty.

  • EXISTS r $\Leftrightarrow$ $r \neq \emptyset$
  • NOT EXISTS r $\Leftrightarrow$ $r = \emptyset$

  • e.g., Find all students who have taken all courses offered in the Biology department
    • “relation A contains relation B” as NOT EXISTS (B EXCEPT A)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    
      SELECT  DISTINCT S.ID, S.name
      FROM    student AS S
      WHERE   NOT EXISTS (SELECT  course_id
                          FROM    course
                          WHERE   dept_name = 'Biology'
                          EXCEPT
                          SELECT  T.course_id
                          FROM    takes AS T
                          WHERE   S.ID = T.ID);
    
    • Correlation name: variable S in the outer query
    • Correlated subquery: the inner query
    • Note that $X−Y = \emptyset$ $\Leftrightarrow$ $X \subseteq Y$
    • This query cannot be written using = ALL and its variants

Test for the Absence of Duplicate Tuples

The UNIQUE construct tests whether a subquery has any duplicate tuples in its result. It evaluates to TRUE if a given subquery contains no duplicates, or an empty set.

  • e.g., Find all courses that were offered at most once in 2017

    1
    2
    3
    4
    5
    6
    
      SELECT  T.course_id
      FROM    course AS T
      WHERE   UNIQUE (SELECT  R.course_id
                      FROM    course AS R
                      WHERE   T.course_id = R.course_id AND
                              R.year = 2017);
    

Since the test $t_1 = t_2$ fails if any of the fields of $t_1$ or $t_2$ are null, it is possible for UNIQUE to be true even if there are multiple copies of a tuple, as long as at least one of the attributes of the tuple is null.

Subqueries in the FROM clause

Since any SELECT - FROM - WHERE clause returns a relation as a result, it can be inserted into another SELECT - FROM - WHERE anywhere that a relation can appear.

  • e.g., Find the average instructors’ salaries of those departments where the average salary is greater than 42000

    1
    2
    3
    4
    5
    
      SELECT dept_name, avg_salary
      FROM  (SELECT  dept_name, AVG(salary) AS avg_salary
             FROM    instructor
             GROUP BY dept_name)
      WHERE  avg_salary > 42000;
    
    • Since the subquery in the FROM clause computes the average salary, HAVING query is not required; the predicate is rather inside the WHERE clause of the outer query.
    • The above query is equivalent to

      1
      2
      3
      4
      
        SELECT  dept_name, AVG(salary) AS avg_salary
        FROM    instructor
        GROUP BY dept_name
        HAVING  avg_salary > 42000
      

WITH clause

The WITH clause provides a way of defining a temporary relation whose definition is available only to the query in which the WITH clause occurs.

  • e.g., Find all departments with the maximum budget

    1
    2
    3
    4
    5
    
      WITH    max_budget(value) AS ( SELECT  MAX(budget)
                                     FROM    department)
      SELECT  department.dept_name
      FROM    department, max_budget
      WHERE   department.budget = max_budget.value
    
  • e.g., Find all departments where the total salary is greater than the average of the total salary at all departments

    • You don’t need to repeat the WITH keyword (it might raise a syntax error):
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    
      WITH    dept_total(dept_name, value) AS
      (
          SELECT  dept_name,
                  SUM(salary)
          FROM    departments
          GROUP BY dept_name
      ),
              dept_total_avg(value) AS
      (
          SELECT AVG(value)
          FROM   dept_total
      )
      SELECT  dept_name
      FROM    dept_total,
              dept_total_avg
      WHERE   dept_total.value > dept_total_avg.value;
    

Scalar subquery

Scalar subqeury is used where a single value is expected. It incurs a runtime error if the subquery returns more than one result tuple.

  • e.g., Find all departments, along with the number of instructors, in each department

    1
    2
    3
    4
    5
    6
    7
    
      SELECT  dept_name,
              (SELECT  COUNT(*)
               FROM    instructor
               WHERE   department.dept_name = instructor.dept_name
               GROUP BY dept_name
              ) AS num_instructors
      FROM    department;
    



Modification of the database

Deletion

A delete request can only delete whole tuples; deleting values on only particular attributes is not allowed.

1
2
DELETE FROM r
WHERE       p
  • r: Relation
  • p: Predicate

The DELETE statement first finds all tuples t in r for which P(t) is true, and then deletes them from r. Note that DELETE command operates on only one relation.

  • Examples
    • e.g., Delete all tuples from instructor table

      1
      
        DELETE FROM instructor;
      
    • e.g., Delete all tuples in the instructor relation for those instructors associated with a department located in the Watson building

      1
      2
      3
      4
      
        DELETE FROM instructor
        WHERE       dept_name IN (SELECT  dept_name
                                  FROM    department
                                  WHERE   building = 'Watson');
      
    • e.g., Delete all instructors whose salary is less than the average salary of all instructors

      1
      2
      3
      4
      5
      6
      
        DELETE FROM instructor
        WHERE       salary <
        (
            SELECT  AVG(salary)
            FROM    instructor
        );
      

    SQL first computes the average salary and find all tuples to delete, then delete all corresponding tuples from the relation (without recomputing or retesting the tuples).

Insertion

To insert data into a relation, either a tuple to be inserted, or the query whose result is a set of tuples to be inserted, must be specified. The SELECT - FROM - WHERE statement is evaluated fully before any of its results are inserted into the relation.

1
INSERT INTO r
  • r: Relation

  • Examples

    • e.g., Insert a new tuple to course

      1
      2
      
        INSERT INTO course
            VALUES('CS360', 'Introduction to Database', 'SoC', 3)
      
    • e.g., Make each student in the Music department who has earned more than 144 credit as an instructor in the Music department with a salary of 18000

      1
      2
      3
      4
      5
      6
      7
      8
      
        INSERT INTO instructor
            SELECT  ID,
                    name,
                    dept_name,
                    18000
            FROM    student
            WHERE   dept_name = 'MUSIC'
                    AND tot_credit > 144;
      

Update

A value in a tuple can be changed without changing all values in the tuple with the UPDATE statement.

1
2
3
UPDATE  r
SET     A = (Some Value)
WHERE   p
  • r: Relation
  • A: Attribute
  • p: Predicate

Here are some examples;

  • Examples
    • e.g., Give a 5% salary raise to those instructors who earn less than 70000

      1
      2
      3
      
        UPDATE  instructor
        SET     salary = salary * 1.05
        WHERE   salary < 70000;
      
    • e.g. Give a 5% salary raise to instructors whose salary is less than the average of all instructors

      1
      2
      3
      4
      5
      6
      7
      
        UPDATE  instructor
        SET     salary = salary * 1.05
        WHERE   salary <
        (
            SELECT  AVG(salary)
            FROM    instructor
        );
      

Note that the order of UPDATE statements is very important. Consider the following query.

1
2
3
4
5
6
7
8
-- Update 1
UPDATE  instructor
SET     salary = salary * 1.03
WHERE   salary > 100000
-- Update 2
UPDATE  instructor
SET     salary = salary * 1.05
WHERE   salary <= 100000

If the order of the two updates are changed, the results whould not be as desired. To prevent order related problems, CASE construct is provided by SQL.

CASE construct

CASE construct can be used in any place where a value is expected.

1
2
3
4
5
6
CASE
    WHEN  P_1 THEN  R_1
    ...
    WHEN  P_n THEN  R_n
    ELSE            R_0
END
  • P_i: Predicates
  • R_i: Resulting value

The error-prone query above can be re-written using CASE construct.

1
2
3
4
5
6
7
8
UPDATE  instructor
SET     salary =
(
    CASE
        WHEN salary <= 100000 THEN salary * 1.05
        ELSE salary * 1.03
    END
);

Updates with scalar subqueries

Scalar subqueries are also useful in SQL update statements, where they can be used in SET clause.

  • Example
    • e.g., Recompute and update tot_credit for all students to the credits of courses successfully completed by the student (successfully completed means grade is not F nor NULL)

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      
        UPDATE  student S
        SET     tot_credit = (
            SELECT  SUM(credits)
            FROM    takes JOIN course USING(course_id)
            WHERE   S.ID = takes.ID AND
                    (
                        takes.grade <> 'F' AND
                        takes.grade IS NOT NULL
                    )
        );
      




Reference

[1] Silberschatz, Abraham, Henry F. Korth, and Shashank Sudarshan. “Database system concepts.” (2011).

Leave a comment