Chapter 7 subqueries

Yüklə 156 Kb.

səhifə	4/5
tarix	16.08.2018
ölçüsü	156 Kb.
	#63145

1 2 3 4 5

Subqueries and the EXISTS operator
Subqueries and the ORDER BY Clause

CORRELATED SUBQUERIES

As opposed to a regular subquery, where the outer query depends on values provided by the inner query, a correlated subquery is one where the inner query depends on values provided by the outer query. This means that in a correlated subquery, the inner query is executed repeatedly, once for each row that might be selected by the outer query.

Correlated subqueries can produce result tables that answer complex management questions. Suppose our payroll manager needs a listing of the most highly paid salaried employee from each department—there are a total of nine departments so the result table should list nine employees, barring any ties in salary. Clearly the MAX aggregate function should be used some place in the SELECT statement, and sure enough, it is used in the inner query of the SELECT statement shown in the correlated subquery of SQL Example 7.26.

/* SQL Example 7.26 */

SELECT LastName "Last Name", FirstName "First Name",

DepartmentNumber "Dept", Salary "Salary"

FROM Employee e1

WHERE Salary =

(SELECT MAX(Salary)

FROM Employee e2

WHERE e2.DepartmentNumber = e1.DepartmentNumber);

Last Name First Name Dept Salary

--------------- --------------- ----- --------

Simmons Lester 8 $22,000

Bock Douglas 1 $16,250

Bordoloi Bijoy 2 $17,850

Smith Susan 3 $32,500

Klepper Robert 4 $15,055

Quattromani Toni 5 $22,325

Becker Roberta 6 $23,000

Boudreaux Betty 7 $4,895

Schultheis Robert 9 $17,525

9 rows selected.
Unlike the subqueries previously considered, the subquery in this SELECT statement cannot be resolved independently of the main query. Notice that the outer query specifies that rows are selected from the employee table with an alias name of e1. The inner query compares the employee department number column (DepartmentNumber) of the employee table with alias e2 to the same column for the alias table name e1. The value of e1.DepartmentNumber is treated like a variable—it changes as the Oracle server examines each row of the employee table. The subquery's results are correlated with each individual row of the main query—thus, the term correlated subquery.

In this query, the Oracle server considers each row of the employee table for inclusion in the result table by substituting the value of the employee's department number for each row into the inner query. Unlike the previous subqueries, the inner query generates a new set of values for each row in the employee table.

This may be easier to understand by actually working through the data stored in the employee table. The first row stores data for employee Lester Simmons of department 8. The Oracle server will retrieve Simmons' department number and insert it as the value for e1.DepartmentNumber in the inner query. The inner query will use the MAX function to compute the maximum salary for all of the employees in department 8. Since this value is $22,000, and since Simmons is paid $22,000, his name is included in the final result table.

Let's skip to the sixth row. The second row stores data for employee Maxwell Eakin with a salary of $15,000 in department 1. Again, the Oracle server will retrieve Eakin's department number and insert it as the new value for e1.DepartmentNumber in the inner query. The maximum salary for department 3 is $16,250. Since Eakin is paid $15,000, his name is not included in the final result table. Oracle will continue to work through the employee table to produce the resulting report of the top salaried employee from each of the nine departments. You should be able to work through the remaining rows and determine whether or not a row will be included in the final result table.

Subqueries and the EXISTS operator

When a subquery uses the EXISTS operator, the subquery functions as an existence test. In other words, the WHERE clause of the outer query tests for the existence of rows returned by the inner query. The subquery does not actually produce any data; rather, it returns a value of TRUE or FALSE.

The general format of a subquery WHERE clause with an EXISTS operator is shown next. Note that the NOT operator can also be used to negate the result of the EXISTS operator.

WHERE [NOT] EXISTS (subquery)
Again, we return to the problem of listing all employees that have dependents. This query can be written using the EXISTS operator shown in SQL Example 7.27.
/* SQL Example 7.27 */

SELECT LastName "Last Name", FirstName "First Name"

FROM Employee e

WHERE EXISTS

(SELECT *

FROM Dependent d

WHERE d.EmployeeID = e.EmployeeID);
Last Name First Name

--------------- ---------------

Bock Douglas

Bordoloi Bijoy

Boudreaux Beverly

Simmons Lester

Do you notice that this SELECT statement is a correlated subquery? The inner query depends on values (the EmployeeID column) provided by a table (employee) specified in the FROM clause of the outer query. Therefore, the subquery will execute for each row contained in the employee table. When the subquery executes, it searches the dependent table for row(s) that meets the criteria stated in the subquery's WHERE clause. If at least one row is found, the subquery returns a TRUE value to the outer query. If the subquery cannot find a row that meets the criteria, it returns a FALSE value. When the outer query receives a TRUE value, the employee row under evaluation is included in the result table.

Notice that subqueries using the EXISTS operator are a bit different from other subqueries in the following ways:

The keyword EXISTS is not preceded by a column name, constant, or other expression.
The parameter in the SELECT clause of a subquery that uses an EXISTS operator almost always consists of an asterisk (*). This is because there is no real point in listing column names since you are simply testing for the existence of rows that meet the conditions specified in the subquery.
The subquery evaluates to TRUE or FALSE rather than returning any data.
A subquery that uses an EXISTS operator always is a correlated subquery.

The EXISTS operator is very important, because there is often no alternative to its use. All queries that use the IN operator or a modified comparison operator (=, <, >, etc. modified by ANY or ALL) can be expressed with the EXISTS operator. However, some queries formulated with EXISTS cannot be expressed in any other way! Why then shouldn't we simply write all of the earlier queries by using the EXISTS operator? The answer concerns query processing efficiency. Consider the two queries shown in SQL Examples 7.28 and 7.29. Each query produces identical result tables of employees that have dependents.

/* SQL Example 7.28 */

SELECT LastName "Last Name"

FROM Employee

WHERE EmployeeID = ANY

(SELECT EmployeeID

FROM Dependent);

Last Name

---------------

Bock

Bordoloi

Boudreaux

Simmons

/* SQL Example 7.29 */

SELECT LastName "Last Name"

FROM Employee e

WHERE EXISTS

(SELECT *

FROM Dependent d

WHERE d.EmployeeID = e.EmployeeID);
Last Name

---------------

Bock

Bordoloi

Boudreaux

Simmons

Suppose the employee table has 5 million rows, as might be the case for a very large government agency. In the first query, the subquery executes only once. The outer query then processes the employee table against values returned by the subquery. The employee table will have 5 million rows processed. The second query contains a correlated subquery; therefore, the subquery processes once for every row processed by the outer query. If the outer query processes 5 million rows, then the subquery will also process 5 million times. Obviously, the first query is more efficient.

The NOT EXISTS operator is the mirror image of the EXISTS operator. A query that uses NOT EXISTS in the WHERE clause is satisfied if the subquery returns no rows.

Subqueries and the ORDER BY Clause

Let's return to SQL Example 7.26. This query listed the employee with the top salary for each department; however, the data is not ordered by department. If the number of departments is large, management may request sorting the result table by department number or by salary or in some other meaningful fashion. The SELECT statement shown in SQL Example 7.30 adds the ORDER BY clause to specify sorting by the DepartmentNumber column. Note that the ORDER BY clause is placed after the WHERE clause. The syntax used here clarifies that the WHERE clause includes the subquery whereas the ORDER BY clause is part of the outer query.
/* SQL Example 7.30 */

SELECT LastName "Last Name", FirstName "First Name",

DepartmentNumber "Dept", Salary "Salary"

FROM Employee e1

WHERE Salary =

(SELECT MAX(Salary)

FROM Employee e2

WHERE e2.DepartmentNumber = e1.DepartmentNumber)

ORDER BY DepartmentNumber;
Last Name First Name Dept Salary

--------------- --------------- ----- --------

Bock Douglas 1 $16,250

Bordoloi Bijoy 2 $17,850

Smith Susan 3 $32,500

Klepper Robert 4 $15,055

Quattromani Toni 5 $22,325

Becker Roberta 6 $23,000

Boudreaux Betty 7 $4,895

Simmons Lester 8 $22,000

Schultheis Robert 9 $17,525

9 rows selected.

Yüklə 156 Kb.

Dostları ilə paylaş:

1 2 3 4 5