Mastering SQL Set Operators: UNION, INTERSECT, EXCEPT

Explore the power of SQL set operators UNION, INTERSECT, and EXCEPT to efficiently manage and manipulate data sets. Learn how to combine, intersect, and differentiate data with practical examples and best practices.

3.10 Set Operators: UNION, INTERSECT, EXCEPT

In the realm of SQL, set operators are powerful tools that allow us to perform operations on multiple result sets, much like set operations in mathematics. They enable us to combine, intersect, and differentiate data sets, providing a versatile approach to data manipulation. In this section, we will delve into the intricacies of three primary set operators: UNION, INTERSECT, and EXCEPT. We will explore their syntax, use cases, and best practices, along with practical examples to solidify your understanding.

Understanding Set Operators

Set operators in SQL are used to combine the results of two or more queries into a single result set. The key operators include:

  • UNION: Combines the results of two queries, removing duplicates by default.
  • INTERSECT: Returns only the rows that are common to both queries.
  • EXCEPT: Returns rows from the first query that are not present in the second query.

Compatibility Requirements

Before diving into each operator, it’s important to understand the compatibility requirements for using set operators:

  1. Matching Column Count: The queries combined by set operators must have the same number of columns in their result sets.
  2. Compatible Data Types: The corresponding columns in the queries must have compatible data types. For example, you cannot combine a column of type VARCHAR with a column of type INTEGER.

UNION Operator

The UNION operator is used to combine the results of two or more queries into a single result set. By default, it removes duplicate rows from the result set.

UNION ALL vs. UNION DISTINCT

  • UNION ALL: Combines the results of two queries and includes all duplicates.
  • UNION DISTINCT: Removes duplicates from the combined result set. This is the default behavior of the UNION operator.

Syntax

1SELECT column1, column2, ...
2FROM table1
3UNION [ALL | DISTINCT]
4SELECT column1, column2, ...
5FROM table2;

Example: UNION ALL

Let’s consider two tables, employees_us and employees_uk, which store employee data for US and UK offices, respectively.

1-- Combine employee data from both US and UK offices, including duplicates
2SELECT employee_id, first_name, last_name
3FROM employees_us
4UNION ALL
5SELECT employee_id, first_name, last_name
6FROM employees_uk;

In this example, UNION ALL combines the data from both tables, including any duplicate rows.

Example: UNION DISTINCT

1-- Combine employee data from both US and UK offices, removing duplicates
2SELECT employee_id, first_name, last_name
3FROM employees_us
4UNION
5SELECT employee_id, first_name, last_name
6FROM employees_uk;

Here, UNION (equivalent to UNION DISTINCT) removes duplicate rows from the combined result set.

INTERSECT Operator

The INTERSECT operator returns only the rows that are common to both queries. It is useful for finding overlapping data between two result sets.

Syntax

1SELECT column1, column2, ...
2FROM table1
3INTERSECT
4SELECT column1, column2, ...
5FROM table2;

Example: INTERSECT

Suppose we have two tables, project_a and project_b, each containing a list of employee IDs working on different projects.

1-- Find employees working on both Project A and Project B
2SELECT employee_id
3FROM project_a
4INTERSECT
5SELECT employee_id
6FROM project_b;

This query returns the employee IDs that are present in both project_a and project_b.

EXCEPT Operator

The EXCEPT operator returns rows from the first query that are not present in the second query. It is useful for identifying unique data in one result set compared to another.

Syntax

1SELECT column1, column2, ...
2FROM table1
3EXCEPT
4SELECT column1, column2, ...
5FROM table2;

Example: EXCEPT

Consider two tables, all_employees and terminated_employees, where all_employees contains all current employees and terminated_employees contains employees who have left the company.

1-- Find employees who are currently active (not terminated)
2SELECT employee_id, first_name, last_name
3FROM all_employees
4EXCEPT
5SELECT employee_id, first_name, last_name
6FROM terminated_employees;

This query returns the employees who are still active and not present in the terminated_employees table.

Visualizing Set Operations

To better understand how these set operators work, let’s visualize the operations using a Venn diagram.

    graph TD;
	    A["Query 1 Result Set"] -->|UNION| C["Combined Result Set"];
	    B["Query 2 Result Set"] -->|UNION| C;
	    A -->|INTERSECT| D["Common Rows"];
	    B -->|INTERSECT| D;
	    A -->|EXCEPT| E["Unique to Query 1"];
	    B -->|EXCEPT| F["Unique to Query 2"];
  • UNION: Combines both result sets into one, removing duplicates.
  • INTERSECT: Returns only the overlapping area (common rows).
  • EXCEPT: Returns the part of the first set that does not overlap with the second.

Best Practices for Using Set Operators

  1. Ensure Column Compatibility: Always verify that the columns in the queries have compatible data types and the same number of columns.
  2. Use UNION ALL When Duplicates Are Acceptable: If duplicates are acceptable and performance is a concern, prefer UNION ALL over UNION to avoid the overhead of duplicate removal.
  3. Optimize Queries for Performance: Set operations can be resource-intensive. Ensure that the underlying queries are optimized for performance.
  4. Consider Indexing: Proper indexing can significantly improve the performance of set operations, especially when dealing with large data sets.
  5. Use EXCEPT for Data Validation: Use the EXCEPT operator to validate data by identifying discrepancies between expected and actual data sets.

Try It Yourself

Experiment with the following modifications to the code examples:

  • Modify the UNION example to include additional columns and observe how it affects the result set.
  • Change the INTERSECT example to use different tables and see how the common rows change.
  • Use the EXCEPT operator with different conditions to explore its behavior with various data sets.

References and Further Reading

Knowledge Check

  • What are the primary differences between UNION and UNION ALL?
  • How does the INTERSECT operator differ from the EXCEPT operator?
  • What are the compatibility requirements for using set operators in SQL?

Embrace the Journey

Remember, mastering set operators is just one step in your SQL journey. As you continue to explore and experiment, you’ll uncover even more powerful ways to manipulate and analyze data. Keep pushing the boundaries, stay curious, and enjoy the process of learning and growing as a SQL expert!

Quiz Time!

Loading quiz…
Revised on Thursday, April 23, 2026