Explore the power of SQL set operators UNION, INTERSECT, and EXCEPT to efficiently manage and manipulate data sets. Learn how to combine, intersect, and differentiate data with practical examples and best practices.
In the realm of SQL, set operators are powerful tools that allow us to perform operations on multiple result sets, much like set operations in mathematics. They enable us to combine, intersect, and differentiate data sets, providing a versatile approach to data manipulation. In this section, we will delve into the intricacies of three primary set operators: UNION, INTERSECT, and EXCEPT. We will explore their syntax, use cases, and best practices, along with practical examples to solidify your understanding.
Set operators in SQL are used to combine the results of two or more queries into a single result set. The key operators include:
Before diving into each operator, it’s important to understand the compatibility requirements for using set operators:
VARCHAR with a column of type INTEGER.The UNION operator is used to combine the results of two or more queries into a single result set. By default, it removes duplicate rows from the result set.
1SELECT column1, column2, ...
2FROM table1
3UNION [ALL | DISTINCT]
4SELECT column1, column2, ...
5FROM table2;
Let’s consider two tables, employees_us and employees_uk, which store employee data for US and UK offices, respectively.
1-- Combine employee data from both US and UK offices, including duplicates
2SELECT employee_id, first_name, last_name
3FROM employees_us
4UNION ALL
5SELECT employee_id, first_name, last_name
6FROM employees_uk;
In this example, UNION ALL combines the data from both tables, including any duplicate rows.
1-- Combine employee data from both US and UK offices, removing duplicates
2SELECT employee_id, first_name, last_name
3FROM employees_us
4UNION
5SELECT employee_id, first_name, last_name
6FROM employees_uk;
Here, UNION (equivalent to UNION DISTINCT) removes duplicate rows from the combined result set.
The INTERSECT operator returns only the rows that are common to both queries. It is useful for finding overlapping data between two result sets.
1SELECT column1, column2, ...
2FROM table1
3INTERSECT
4SELECT column1, column2, ...
5FROM table2;
Suppose we have two tables, project_a and project_b, each containing a list of employee IDs working on different projects.
1-- Find employees working on both Project A and Project B
2SELECT employee_id
3FROM project_a
4INTERSECT
5SELECT employee_id
6FROM project_b;
This query returns the employee IDs that are present in both project_a and project_b.
The EXCEPT operator returns rows from the first query that are not present in the second query. It is useful for identifying unique data in one result set compared to another.
1SELECT column1, column2, ...
2FROM table1
3EXCEPT
4SELECT column1, column2, ...
5FROM table2;
Consider two tables, all_employees and terminated_employees, where all_employees contains all current employees and terminated_employees contains employees who have left the company.
1-- Find employees who are currently active (not terminated)
2SELECT employee_id, first_name, last_name
3FROM all_employees
4EXCEPT
5SELECT employee_id, first_name, last_name
6FROM terminated_employees;
This query returns the employees who are still active and not present in the terminated_employees table.
To better understand how these set operators work, let’s visualize the operations using a Venn diagram.
graph TD;
A["Query 1 Result Set"] -->|UNION| C["Combined Result Set"];
B["Query 2 Result Set"] -->|UNION| C;
A -->|INTERSECT| D["Common Rows"];
B -->|INTERSECT| D;
A -->|EXCEPT| E["Unique to Query 1"];
B -->|EXCEPT| F["Unique to Query 2"];
UNION ALL over UNION to avoid the overhead of duplicate removal.EXCEPT operator to validate data by identifying discrepancies between expected and actual data sets.Experiment with the following modifications to the code examples:
UNION example to include additional columns and observe how it affects the result set.INTERSECT example to use different tables and see how the common rows change.EXCEPT operator with different conditions to explore its behavior with various data sets.UNION and UNION ALL?INTERSECT operator differ from the EXCEPT operator?Remember, mastering set operators is just one step in your SQL journey. As you continue to explore and experiment, you’ll uncover even more powerful ways to manipulate and analyze data. Keep pushing the boundaries, stay curious, and enjoy the process of learning and growing as a SQL expert!