Oracle9i Database Performance Tuning Guide and Reference Release 2 (9.2) Part Number A96533-02 |
|
|
View PDF |
This chapter expands on the ideas introduced in Chapter 1, "Introduction to the Optimizer" and explains optimizer actions in greater detail for specific cases. This chapter describes how the cost-based optimizer evaluates expressions and performs specific operations. It also explains how the CBO transforms some SQL statements into others to achieve the same goal more efficiently.
The chapter contains the following sections:
The optimizer fully evaluates expressions whenever possible and translates certain syntactic constructs into equivalent constructs. Constructs are translated in the following cases:
ANY
(subquery) and IN
(subquery) to a single construct.This section contains the following discussions:
The IN
-list iterator is used when a query contains an IN
clause with values. The execution plan is identical to what would result for a statement with an equality clause instead of IN
except for one additional step. That extra step occurs when the IN
-list iterator feeds the equality clause with unique values from the IN
-list.
Both of the statements in Example 2-1 and Example 2-1 are equivalent and produce the same plan.
SELECT header_id, line_id, revenue_amount FROM so_lines_all WHERE header_id IN (1011,1012,1013); SELECT header_id, line_id, revenue_amount FROM so_lines_all WHERE header_id = 1011 OR header_id = 1012 OR header_id = 1013; Plan ------------------------------------------------- SELECT STATEMENT INLIST ITERATOR TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1
An equivalent statement in Example 2-1 binds :b1
to the different unique values by the IN
-list iterator.
SELECT header_id, line_id, revenue_amount FROM so_lines_all l WHERE header_id = :b1; Plan ------------------------------------------------- SELECT STATEMENT TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1
Example 2-3 uses a unique index. Because there is a sort involved on the IN
-list, even with complete keys of unique indexes, there is still a range scan.
SELECT header_id, line_id, revenue_amount FROM so_lines_all WHERE line_id IN (1011,1012,1013); Plan ------------------------------------------------- SELECT STATEMENT INLIST ITERATOR TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_U1
In Example 2-4, the IN-list operator can be used when driving into a table with a nested loop operation.
SELECT h.header_id, l.line_id, l.revenue_amount FROM so_headers_all h, so_lines_all l WHERE l.inventory_item_id = :b1 AND h.order_number = l.header_id AND h.order_type_id IN (1,2,3); Plan ------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N5 INLIST ITERATOR TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_U2
The IN
-list operator is especially useful if there is an expensive first step that you do not want to repeat for every IN
-list element. In Example 2-5, even though there are three IN
-list elements, the full scan on so_lines_all
happens only once.
SELECT h.header_id, l.line_id, l.revenue_amount FROM so_headers_all h, so_lines_all l WHERE l.s7 = :b1 AND h.order_number = l.header_id AND h.order_type_id IN (1,2,3); Plan ------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL INLIST ITERATOR TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_U2
The optimizer uses an IN
-list iterator when an IN
clause is specified with values, and the optimizer finds a selective index for that column. If there are multiple OR
clauses using the same index, then the optimizer chooses this operation rather than CONCATENATION
or UNION
ALL
, because it is more efficient.
There are no hints for this operation. You can provide a hint to use the relevant index, which can cause this operation. Example 2-6 shows a query without an INDEX
hint, and the resulting execution plan.
SELECT h.customer_id, l.line_id, l.revenue_amount FROM so_lines_all l, so_headers_all h WHERE l.s7 = 20 AND h.original_system_reference = l.attribute5 AND h.original_system_source_code IN (1013,1014); Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N5
Example 2-7 shows a query with an INDEX
hint, and the resulting execution plan.
SELECT /*+INDEX(h so_headers_n9 */ h.customer_id, l.line_id, l.revenue_amount FROM so_lines_all l, so_headers_all h WHERE l.s7 = 20 AND h.original_system_reference = l.attribute5 AND h.original_system_source_code IN (1013,1014); Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL INLIST ITERATOR TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N9
Concatenation is useful for statements with different conditions combined with an OR
clause. With concatenation, you get a good execution plan with appropriate indexes. Examples 2-8 and 2-9 show two plans, each accessing the table through the appropriate index and combined using concatenation.
The plan in Example 2-8 does not return duplicate rows, so for each component it appends a negation of the previous components.
SELECT l.header_id, l.line_id, l.revenue_amount FROM so_lines_all l WHERE l.parent_line_id = :b1 OR l.service_parent_line_id = :b1; Plan ------------------------------------------------- SELECT STATEMENT CONCATENATION TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N20 TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N17
In Example 2-8, the components are the following:
Example 2-9 shows how the optimizer uses concatenation to optimize a nested OR
statement.
SELECT p.header_id, l.line_id, l.revenue_amount FROM so_lines_all p , so_lines_all l WHERE p.header_id = :b1 AND (l.parent_line_id = p.line_id OR l.service_parent_line_id = p.line_id); Plan ------------------------------------------------- SELECT STATEMENT CONCATENATION NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1 TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N20 NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1 TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N17
Example 2-10 shows the resulting execution plan for the same query shown in Example 2-9, but with concatenation specifically disabled by the NO_EXPAND
hint.
SELECT /*+NO_EXPAND */ p.header_id, l.line_id, l.revenue_amount FROM so_lines_all p, so_lines_all l WHERE p.header_id = :b1 AND (l.parent_line_id = p.line_id OR l.service_parent_line_id = p.line_id); Plan ------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1 TABLE ACCESS FULL SO_LINES_ALL
Trying to execute the statement in a single query produces a poor execution plan. Because the optimizer has two paths to follow and has been instructed not to decompose the query, it needs to access all the rows in the second table to see if any rows match one of the conditions.
Use the hint USE_CONCAT
for this operation.
Concatenation is expensive in the following cases and should not be used:
OR
conditions are on same column and can use the IN
-list operator, which is more efficient than concatenationExample 2-11 illustrates this point.
Consider the following statement:
SELECT h.customer_id, l.line_id, l.revenue_amount FROM so_lines_all l, so_headers_all h WHERE l.s7 = 20 AND h.original_system_reference = l.attribute5 AND h.original_system_source_code IN (1013,1014); Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N5
The initial statement calls for a full scan on so_lines_all
as the first stage. The optimizer chooses to use a single column index for the second table, but we want it to use a two-column index.
You could use a hint to force concatenation as in shown in Example 2-12, but the initial full scan is still repeated which is not desirable.
SELECT /*+USE_CONCAT*/ h.customer_id, l.line_id, l.revenue_amount FROM so_lines_all l, so_headers_all h WHERE l.s7 = 20 AND h.original_system_reference = l.attribute5 AND h.original_system_source_code IN (1013,1014); Plan -------------------------------------------------- SELECT STATEMENT CONCATENATION NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N9 NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N9
If, instead, you provide a hint to use the two-column index, then the optimizer switches to that with an IN
-list iterator. The initial scan is not repeated, and a better execution plan results, as follows:
SELECT /*+INDEX(h so_headers_n9 */ h.customer_id, l.line_id, l.revenue_amount FROM so_lines_all l, so_headers_all h WHERE l.s7 = 20 AND h.original_system_reference = l.attribute5 AND h.original_system_source_code IN (1013,1014); Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS FULL SO_LINES_ALL INLIST ITERATOR TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N9
The remote operation indicates that there is a table from another database being accessed through a database link. Example 2-14 has a remote driving table:
SELECT c.customer_name, count(*) FROM ra_customers c, so_headers_all@oe h WHERE c.customer_id = h.customer_id AND h.order_number = :b1 GROUP BY c.customer_name; Plan -------------------------------------------------- SELECT STATEMENT SORT GROUP BY NESTED LOOPS REMOTE TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX UNIQUE SCAN RA_CUSTOMERS_U1
SELECT "ORDER_NUMBER","CUSTOMER_ID" FROM "SO_HEADERS_ALL" "H" WHERE "ORDER_NUMBER"=:"SYS_B_0";
Example 2-15 has a local driving table.
SELECT c.customer_name, h.order_number FROM ra_customers c, so_headers_all@oe h WHERE c.customer_id = h.customer_id AND c.customer_name LIKE :b1; Plan -------------------------------------------------- SELECT STATEMENT MERGE JOIN REMOTE SORT JOIN TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX RANGE SCAN RA_CUSTOMERS_N1
SELECT "ORDER_NUMBER","CUSTOMER_ID" FROM "SO_HEADERS_ALL" "H" WHERE "CUSTOMER_ID" IS NOT NULL ORDER BY "CUSTOMER_ID";
A couple of factors influence the execution plan:
In general, the optimizer chooses to access the remote tables first, before accessing the local tables. This works well for cases like Example 2-14, where the driving table is the remote table. However, if the driving table is the local table, then there might not be any selective way of accessing the remote table without first accessing the local tables. In such cases, you might need to provide appropriate hints to avoid performance problems.
SELECT /*+USE_NL(c h) */ c.customer_name, h.order_number FROM ra_customers c, so_headers_all@oe h WHERE c.customer_id = h.customer_id AND c.customer_name LIKE :b1; Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX RANGE SCAN RA_CUSTOMERS_N1 REMOTE
SELECT /*+ USE_NL("H") */ "ORDER_NUMBER","CUSTOMER_ID" FROM "SO_HEADERS_ALL" "H" WHERE :1="CUSTOMER_ID" FILTER;
The construct used by the optimizer is applying a filter condition to filter out rows; this filter could not be applied when the table was accessed. Example 2-17 uses no filter.
SELECT h.order_number FROM so_headers_all h WHERE h.open_flag = 'Y' AND attribute1 IS NOT NULL; Plan -------------------------------------------------- SELECT STATEMENT TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N2
Besides the conditions used in the access path, a table might have additional conditions to filter rows when the table is visited. Conditions that get applied when the table is accessed, like attribute1
IS
NOT
NULL
, do not show up as FILTER
. Example 2-18 and Example 2-19 illustrate such conditions.
Example 2-18 shows a query with a GROUP
BY
condition that creates a filter.
SELECT h.order_number, count(*) FROM so_headers_all h WHERE h.open_flag = 'Y' AND attribute1 IS NOT NULL GROUP BY h.order_number HAVING COUNT(*) = 1 ß Filter condition; Plan -------------------------------------------------- SELECT STATEMENT FILTER SORT GROUP BY TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N2
Example 2-19 shows a query with a subquery that creates a filter.
SELECT h.order_number FROM so_headers_all h WHERE h.open_flag = 'Y' AND EXISTS (SELECT null FROM so_lines_all l WHERE l.header_id = h.header_id AND l.revenue_amount > 10000); Plan -------------------------------------------------- SELECT STATEMENT FILTER TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N2 TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1
In this example, for every row meeting the condition of the outer query, the correlated EXISTS
subquery is executed. If a row meeting the condition is found in the so_lines_all
table, then the row from so_headers_all
is returned.
The optimizer chooses execution plans for SQL statements that access data on remote databases in much the same way that it chooses execution plans for statements that access only local data:
When choosing a cost-based execution plan for a distributed statement, the optimizer considers the available indexes on remote databases, just as it considers indexes on the local database. The optimizer also considers statistics on remote databases for the CBO. Furthermore, the optimizer considers the location of data when estimating the cost of accessing it. For example, a full scan of a remote table has a greater estimated cost than a full scan of an identical local table.
For a rule-based execution plan, the optimizer does not consider indexes on remote tables.
See Also:
Chapter 6, "Optimizing SQL Statements" for more information on tuning distributed queries |
Sort operations result when users specify some operation that requires a sort. Commonly encountered operations include the following:
SORT
UNIQUE
occurs if a user specifies a DISTINCT
clause (Example 2-20) or if an operation requires unique values for the next step (Example 2-21).
SELECT DISTINCT last_name, first_name FROM per_all_people_f WHERE full_name LIKE :b1; Plan -------------------------------------------------- SELECT STATEMENT SORT UNIQUE TABLE ACCESS BY INDEX ROWID PER_ALL_PEOPLE_F INDEX RANGE SCAN PER_PEOPLE_F_N54
SORT UNIQUE provides the outer query with a unique list of header_id
s. The plan shows that the IN
subquery has been un-nested and transformed into VW_NSO_1.
SELECT c.customer_name, h.order_number FROM ra_customers c, so_headers_all h WHERE c.customer_id = h.customer_id AND h.header_id in (SELECT l.header_id FROM so_lines_all l WHERE l.inventory_item_id = :b1 AND ordered_quantity > 10); Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS NESTED LOOPS VIEW VW_NSO_1 SORT UNIQUE TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N5 TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX UNIQUE SCAN SO_HEADERS_U1 TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX UNIQUE SCAN RA_CUSTOMERS_U1
If the optimizer can guarantee (with unique keys) that duplicate values will not be passed, then a sort can be avoided, as in Example 2-22.
UPDATE so_lines_all l SET line_status = 'HOLD' WHERE l.header_id IN ( SELECT h.header_id FROM so_headers_all h WHERE h.customer_id = :b1); Plan -------------------------------------------------- UPDATE STATEMENT UPDATE SO_LINES_ALL NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N1 INDEX RANGE SCAN SO_LINES_N1
SORT
AGGREGATE
does not actually involve a sort. It is used when aggregates are being computed across the whole set of rows, as shown in Example 2-23.
SELECT SUM(l.revenue_amount) FROM so_lines_all l WHERE l.header_id = :b1; Plan -------------------------------------------------- SELECT STATEMENT SORT AGGREGATE TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1
SORT
GROUP
BY
is used when aggregates are being computed for different groups in the data. The sort is required to separate the rows into different groups, as shown in Example 2-24.
SELECT created_by, SUM(l.revenue_amount) FROM so_lines_all l WHERE header_id > :b1 GROUP BY created_by; Plan -------------------------------------------------- SELECT STATEMENT SORT GROUP BY TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1
SORT
JOIN
happens during a SORT
MERGE
JOIN
, if the rows need to be sorted by the join key, as shown in Example 2-25.
SELECT SUM(l.revenue_amount), l2.creation_date FROM so_lines_all l, so_lines_All l2 WHERE l.creation_date < l2.creation_date AND l.header_id <> l2.header_id GROUP BY l2.creation_date, l2.line_id; Plan -------------------------------------------------- SELECT STATEMENT SORT GROUP BY MERGE JOIN SORT JOIN TABLE ACCESS FULL SO_LINES_ALL FILTER SORT JOIN TABLE ACCESS FULL SO_LINES_ALL
SORT
ORDER
BY
is required when the statement specifies an ORDER
BY
that cannot be satisfied by one of the indexes, as shown in Example 2-26.
SELECT h.order_number FROM so_headers_all h WHERE h.customer_id = :b1 ORDER BY h.creation_date DESC; Plan -------------------------------------------------- SELECT STATEMENT SORT ORDER BY TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N1
Either of the following can result in creation of a view by the CBO:
SELECT order_id FROM orders WHERE customer_id = :b1 AND revenue > :b2; Plan -------------------------------------------------- SELECT STATEMENT VIEW ORDERS FILTER SORT GROUP BY NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N1 TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N1
In Example 2-28, a view is created because the IN
subquery requires a SORT
UNIQUE
on the values being selected. This view would be unnecessary if the columns being selected were unique, not requiring a sort.
SELECT c.customer_name, h.order_number FROM ra_customers c, so_headers_all h WHERE c.customer_id = h.customer_id AND h.header_id IN (SELECT l.header_id FROM so_lines_all l WHERE l.inventory_item_id = :b1 AND ordered_quantity > 10); Plan -------------------------------------------------- SELECT STATEMENT NESTED LOOPS NESTED LOOPS VIEW VW_NSO_1 SORT UNIQUE TABLE ACCESS BY INDEX ROWID SO_LINES_ALL INDEX RANGE SCAN SO_LINES_N5 TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX UNIQUE SCAN SO_HEADERS_U1 TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX UNIQUE SCAN RA_CUSTOMERS_U1
The query in Example 2-29 examines the distribution of orders and revenue by the number of lines in each order. The CBO uses temporary inline views to do the double grouping.
SELECT COUNT(*) "Orders", cnt "Lines", sum(rev) "Revenue" FROM (SELECT header_id, COUNT(*) cnt, SUM(revenue_amount) rev FROM so_lines_all GROUP BY header_id) GROUP BY cnt; Plan -------------------------------------------------- SELECT STATEMENT SORT GROUP BY VIEW SORT GROUP BY TABLE ACCESS FULL SO_LINES_ALL
Computation of constants is performed only once, when the statement is optimized, rather than each time the statement is executed.
For example, the following conditions all test for monthly salaries greater than 2000:
salary > 24000/12 salary > 2000 salary*12 > 24000
If a SQL statement contains the first condition, then the optimizer simplifies it into the second condition.
This operator is useful for combining OR
clauses into one compound statement or for breaking up a complex statement into a compound statement containing simpler select statements that are easier to optimize and understand.
As with concatenation, you do not want to duplicate expensive operations by using UNION
ALL
.
The optimizer uses UNION
or UNION
ALL
when the SQL statement contains UNION
or UNION
ALL
clauses.
Example 2-30 shows a query without a UNION
clause. The query finds customers who are new or have open orders.
SELECT c.customer_name, c.creation_date FROM ra_customers c WHERE c.creation_date > SYSDATE - 30 OR customer_id IN (SELECT customer_id FROM so_headers_all h WHERE h.open_flag = 'Y'); Plan -------------------------------------------------- SELECT STATEMENT FILTER TABLE ACCESS FULL RA_CUSTOMERS TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N1
Because the driving conditions come from different tables, you cannot execute the query effectively in a single statement.
With a UNION
clause, you can break the query into two statements:
These two statements can be optimized easily. Because you do not want duplicates (some customers meet both criteria), use UNION
, which eliminates duplicates by using a sort. If you use two disjoint sets, then you can use UNION
ALL
, eliminating the sort. The query from Example 2-30 is shown using UNION
in Example 2-31.
SELECT c.customer_name, c.creation_date FROM ra_customers c WHERE c.creation_date > SYSDATE - 30 UNION ALL SELECT c.customer_name, c.creation_date FROM ra_customers c WHERE customer_id IN (SELECT customer_id FROM so_headers_all h WHERE h.open_flag = 'Y'); Plan -------------------------------------------------- SELECT STATEMENT SORT UNIQUE UNION-ALL TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX RANGE SCAN RA_CUSTOMERS_N2 NESTED LOOPS TABLE ACCESS BY INDEX ROWID SO_HEADERS_ALL INDEX RANGE SCAN SO_HEADERS_N2 TABLE ACCESS BY INDEX ROWID RA_CUSTOMERS INDEX UNIQUE SCAN RA_CUSTOMERS_U1
There are no hints for this operation.
The optimizer simplifies conditions that use the LIKE
comparison operator to compare an expression with no wildcard characters into an equivalent condition that uses an equality operator instead.
In the following example, the optimizer simplifies the first condition into the second:
last_name LIKE 'SMITH'
is transformed into
last_name = 'SMITH'
The optimizer can simplify these expressions only when the comparison involves variable-length datatypes. For example, if last_name
was of type CHAR
(10), then the optimizer cannot transform the LIKE
operation into an equality operation due to the equality operator following blank-padded semantics and LIKE
not following blank-padded semantics.
The optimizer expands a condition that uses the IN
comparison operator to an equivalent condition that uses equality comparison operators and OR
logical operators.
In the following example, the optimizer expands the first condition into the second:
last_name IN ('SMITH', 'KING', 'JONES')
is transformed into
last_name = 'SMITH' OR last_name = 'KING' OR last_name = 'JONES'
The optimizer expands a condition that uses the ANY
or SOME
comparison operator followed by a parenthesized list of values into an equivalent condition that uses equality comparison operators and OR
logical operators.
In the following example, the optimizer expands the first condition into the second:
salary > ANY (:first_sal, :second_sal)
is transformed into
salary > :first_sal OR salary > :second_sal
The optimizer transforms a condition that uses the ANY
or SOME
operator followed by a subquery into a condition containing the EXISTS
operator and a correlated subquery.
In the following example, the optimizer transforms the first condition into the second:
x > ANY (SELECT salary FROM employees WHERE job_id = 'IT_PROG')
is transformed into
EXISTS (SELECT salary FROM employees WHERE job_id = 'IT_PROG' AND x > salary)
The optimizer expands a condition that uses the ALL
comparison operator followed by a parenthesized list of values into an equivalent condition that uses equality comparison operators and AND
logical operators.
In the following example, the optimizer expands the first condition into the second:
salary > ALL (:first_sal, :second_sal)
is transformed into
salary > :first_sal AND salary > :second_sal
The optimizer transforms a condition that uses the ALL
comparison operator followed by a subquery into an equivalent condition that uses the ANY
comparison operator and a complementary comparison operator. In the following example, the optimizer transforms the first condition into the second:
x > ALL (SELECT salary FROM employees WHERE department_id = 50)
is transformed into
NOT (x <= ANY (SELECT salary FROM employees WHERE department_id = 50) )
The optimizer then further transforms the second query into the following query using the rule for transforming conditions with the ANY
comparison operator, followed by a correlated subquery:
NOT EXISTS (SELECT salary FROM employees WHERE department_id = 50 AND x <= salary)
The optimizer always replaces a condition that uses the BETWEEN
comparison operator with an equivalent condition that uses the >= and <= comparison operators. In the following example, the optimizer replaces the first condition with the second:
salary BETWEEN 2000 AND 3000
is transformed into
salary >= 2000 AND salary <= 3000
The optimizer simplifies a condition to eliminate the NOT
logical operator. The simplification involves removing the NOT
logical operator and replacing a comparison operator with its opposite comparison operator. In Example 2-32, the optimizer simplifies the first condition into the second one:
NOT department_id = (SELECT department_id FROM employees WHERE last_name = 'Taylor')
is transformed into
department_id <> (SELECT department_id FROM employees WHERE last_name = 'Taylor')
Often, a condition containing the NOT
logical operator can be written in many different ways. The optimizer attempts to transform such a condition so that the subconditions negated by NOT
s are as simple as possible, even if the resulting condition contains more NOT
s. In Example 2-33, the optimizer simplifies the first condition into the second, and then into the third.
NOT (salary < 1000 OR commission_pct IS NULL)
is transformed into
NOT salary < 1000 AND commission_pct IS NOT NULL
which is further transformed into
salary >= 1000 AND commission_pct IS NOT NULL
If two conditions in the WHERE
clause involve a common column, then the optimizer sometimes can infer a third condition, using the transitivity principle. The optimizer can then use the inferred condition to optimize the statement. The inferred condition can make available an index access path that was not made available by the original conditions.
Consider a WHERE
clause containing two conditions of these forms:
WHERE column1 comp_oper constant AND column1 = column2
In this case, the optimizer infers the condition:
column2 comp_oper constant
where:
comp_oper
is any of the comparison operators =
, !=
, ^=
, <
, <>
, >
, <=
, or >=
constant
is any constant expression involving operators, SQL functions, literals, bind variables, and correlation variablesIn Example 2-34, the WHERE
clause contains two conditions, each of which uses the employees
.department_id
column.
SELECT * FROM employees, departments WHERE employees.department_id = 20 AND employees.department_id = departments.department_id;
Using transitivity, the optimizer infers the following condition:
departments.department_id = 20
If an index exists on the departments
.department_id
column, then this condition makes available access paths using that index.
The optimizer only infers conditions that relate columns to constant expressions, rather than columns to other columns. Consider a WHERE
clause containing two conditions of these forms:
WHERE column1 comp_oper column3 AND column1 = column2
In this case, the optimizer does not infer the condition, column2
comp_oper
column3
.
Common subexpression optimization is an optimization heuristic that identifies, removes, and collects common subexpressions from disjunctive (OR
) branches of a query. In most cases, it reduces the number of joins to be performed. This optimization is enabled with the OPTIMIZER_FEATURES_ENABLE
initialization parameter.
A query is considered valid for common subexpression optimization if its WHERE
clause is in the following form:
OR
ed logs).AND
ed logs).AND
or OR
.)The query in Example 2-35 finds names of employees who work in a department located in London and who make more than 60K or who are accountants. The query contains common subexpressions in its two disjunctive branches.
SELECT employees.last_name FROM employees E, departments D WHERE (D.department_id = E.department_id AND E.job_id = 'AC_ACCOUNT' AND D.location = 2400) OR E.department_id = D.department_id AND E.salary > 60000 AND D.location = 2400);
The elimination of the common subexpressions transforms this query into the following query, reducing the number of joins from two to one.
SELECT employees.last_name FROM employees E, departments D WHERE (D.department_id = E.department_id AND D.location = 2400) AND (E.job_id = 'AC_ACCOUNT' OR E.salary > 60000);
The query in Example 2-36 contains common subexpression in three disjunctive branches.
SELECT SUM (l_extendedprice* (1 - l_discount)) FROM PARTS, LINEITEM WHERE (p_partkey = l_partkey AND p_brand = 'Brand#12' AND p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') AND l_quantity >= 1 AND l_quantity <= 1 + 10 AND p_size >= 1 AND p_size <= 5 AND l_shipmode IN ('AIR', 'REG AIR') AND l_shipinstruct = 'DELIVER IN PERSON') OR (l_partkey = p_partkey) AND p_brand = 'Brand#23' AND p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') AND l_quantity >= 10 AND l_quantity <= 10 + 10 AND p_size >= 1 AND p_size <= 10 AND p_size BETWEEN 1 AND 10 AND l_shipmode IN ('AIR', 'REG AIR') AND l_shipinstruct = 'DELIVER IN PERSON') OR (p_partkey = l_partkey AND p_brand = 'Brand#34' AND p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') AND l_quantity >= 20 AND l_quantity <= 20 + 10 AND p_size >= 1 AND p_size <= 15 AND l_shipmode IN ('AIR', 'REG AIR') AND l_shipinstruct = 'DELIVER IN PERSON');
This query is transformed by common subexpression optimization to the following query, reducing the number joins from three to one.
SELECT SUM (l_extendedprice* (1 - l_discount)) FROM PARTS, LINEITEM WHERE (p_partkey = l_partkey /* these are the four common subexpressions */ AND p_size >= 1 AND l_shipmode IN ('AIR', 'REG AIR') AND l_shipinstruct = 'DELIVER IN PERSON') AND ((p_brand = 'Brand#12' AND p_container IN ( 'SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') AND l_quantity >= 1 AND l_quantity <= 1 + 10 AND p_size <= 5) OR (p_brand = 'Brand#23' AND p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') AND l_quantity >= 10 AND l_quantity <= 10 + 10 AND p_size <= 10) OR (p_brand = 'Brand#34' AND p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') AND l_quantity >= 20 AND l_quantity <= 20 + 10 AND p_size <= 15));
In some cases, the optimizer can use a previously calculated value rather than executing a user-written function. This is only safe for functions that behave in a restricted manner. The function must return the same output return value for any given set of input argument values.
The function's result must not differ because of differences in the content of package variables or the database, or session parameters such as the globalization support parameters. Furthermore, if the function is redefined in the future, then its output return value must be the same as that calculated with the prior definition for any given set of input argument values. Finally, there must be no meaningful side effects to using a precalculated value instead of executing the function again.
The creator of a function can promise to the Oracle server that the function behaves according to these restrictions by using the keyword DETERMINISTIC
when declaring the function with a CREATE
FUNCTION
statement or in a CREATE
PACKAGE
or CREATE
TYPE
statement. The server does not attempt to verify this declaration--even a function that obviously manipulates the database or package variables can be declared DETERMINISTIC
. It is the programmer's responsibility to use this keyword only when appropriate.
Calls to a DETERMINISTIC
function might be replaced by the use of an already calculated value when the function is called multiple times within the same query, or if there is a function-based index or a materialized view defined that includes a relevant call to the function.
See Also:
|
SQL is a very flexible query language; there are often many statements you could use to achieve the same goal. Sometimes, the optimizer (Query Transformer) transforms one such statement into another that achieves the same goal if the second statement can be executed more efficiently.
This section discusses the following topics:
See Also:
"Understanding Joins" for additional information about optimizing statements that contain joins, semi-joins, or anti-joins |
If a query contains a WHERE
clause with multiple conditions combined with OR
operators, then the optimizer transforms it into an equivalent compound query that uses the UNION
ALL
set operator, if this makes the query execute more efficiently:
OR
transformation for IN
-lists or OR
s on the same column; instead, it uses the INLIST
iterator operator.
See Also:
"Understanding Access Paths for the RBO" and "How the CBO Chooses an Access Path" for information on access paths and how indexes make them available |
Example 2-37 shows how an OR
query is transformed into a compound query. In this example, the WHERE
clause contains two conditions combined with an OR
operator.
If there are indexes on both the job_id
and department_id
columns, then the optimizer might transform this query into the following equivalent query:
SELECT * FROM employees WHERE job_id = 'ST_CLERK' UNION ALL SELECT * FROM employees WHERE department_id = 50 AND job_id <> 'ST_CLERK';
When the CBO is deciding whether to make a transformation, the optimizer compares the cost of executing the original query using a full table scan with that of executing the resulting query.
The execution plan for the transformed statement might look like the illustration in Figure 2-1. The shaded boxes indicate steps that physically retrieve data and the clear boxes indicate steps that operate on data returned from the previous step.
To execute the transformed query, Oracle performs the steps in Figure 2-1 in the following order:
job_id
and department_id
columns using the conditions of the component queries. These steps obtain rowids of the rows that satisfy the component queries.If either the job_id
column or the department_id
column is not indexed, then the optimizer does not even consider the transformation, because the resulting compound query would require a full table scan to execute one of its component queries. Executing the compound query with a full table scan in addition to an index scan could not possibly be faster than executing the original query with a full table scan.
Example 2-38 shows the kind of query that is not transformed. The query in this example assumes that there is an index on the last_name
column only:
SELECT * FROM employees WHERE last_name = 'Smith' OR salary > commission_pct*100000;
Transforming this query would result in the compound query in Example 2-39.
SELECT * FROM employees WHERE last_name = 'Smith' UNION ALL SELECT * FROM employees WHERE salary > commission_pct*100000;
Because the condition in the WHERE
clause of the second component query (salary
> commission_pct
) does not make an index available, the compound query requires a full table scan. For this reason, the optimizer does not make the transformation, but chooses a full table scan to execute the original statement.
To optimize a complex statement, the optimizer chooses to do one of the following:
The optimizer transforms a complex statement into a join statement whenever the resulting join statement is guaranteed to return exactly the same rows as the complex statement. This transformation allows Oracle to execute the statement by taking advantage of join optimizer techniques.
Example 2-40 shows how the optimizer uses a join to unnest a subquery. The complex statement in this example selects all rows from the orders
table for which owners appear in the customers
table.
If the customer_id
of the customers
table is a primary key or has a UNIQUE
constraint, then the optimizer can transform the complex query into the following join statement that is guaranteed to return the same data:
SELECT orders.* FROM orders, customers WHERE orders.customer_id = customers.customer_id;
To execute this statement, Oracle performs a nested-loops join operation. The execution plan for this statement might look like Figure 2-2.
If the optimizer cannot transform a complex statement into a join statement, then it chooses execution plans for the parent statement and the subquery as though they were separate statements. Oracle then executes the subquery and uses the rows returned to execute the parent query.
Example 2-41 shows the kind of statement is not transformed. The complex statement in this example returns all rows from the customers
table that have credit limits greater than the average credit limit.
SELECT * FROM customers WHERE credit_limit > (SELECT AVG(credit_limit) FROM customers);
No join statement can perform the function of this statement, so the optimizer does not transform the statement.
Note: Complex queries whose subqueries contain aggregate functions such as |
See Also:
"Understanding Joins" for information on nested loops joins |
To merge the view's query into a referencing query block in the accessing statement, the optimizer replaces the name of the view with the names of its base tables in the query block and adds the condition of the view's query's WHERE
clause to the accessing query block's WHERE
clause.
This optimization applies to select-project-join views, which are views that contain only selections, projections, and joins. This means that these views do not contain items such as set operators, aggregate functions, DISTINCT
, GROUP
BY
, and CONNECT
BY
. See "Mergeable and Nonmergeable Views".
Example 2-42 shows how the CBO merges views. The view in this example is of all employees who work in department 10:
CREATE VIEW emp_10 AS SELECT employee_id, last_name, job_id, manager_id, hire_date, salary, commission_pct, department_id FROM employees WHERE department_id = 10;
The following query accesses the view. The query selects the Ids greater than 170 of employees who work in department 10:
SELECT employee_id FROM emp_10 WHERE employee_id > 170;
The optimizer transforms the query into the following query that accesses the view's base table:
SELECT employee_id FROM employees WHERE department_id = 10 AND employee_id > 170;
If there are indexes on the department_id
or employee_id
columns, then the resulting WHERE
clause makes them available.
The optimizer can merge a view into a referencing query block when the view has one or more base tables, provided the view does not contain any of the following:
UNION
, UNION
ALL
, INTERSECT
, MINUS
)CONNECT
BY
clauseROWNUM
pseudocolumnAVG
, COUNT
, MAX
, MIN
, SUM
) in the select listWhen a view contains one of the following structures, it can be merged into a referencing query block only if Complex View Merging is enabled:
View merging is not possible for a view that has multiple base tables, if it is on the right side of an outer join. However, if a view on the right side of an outer join has only one base table, then the optimizer can use complex view merging, even if an expression in the view can return a nonnull value for a NULL
.
If a query has a CURSOR
expression, then no view merging will take place, even for views which would normally be mergeable. An example is the following:
CREATE VIEW emp_v AS SELECT last_name,employee_id FROM employees; SELECT CURSOR(select * from sys.dual), last_name, employee_id from emp_v;
This query will not merge the emp_v
view.
If a view's query contains a GROUP
BY
clause or DISTINCT
operator in the select list, then the optimizer can merge the view's query into the accessing statement only if complex view merging is enabled. Complex merging of a view with a GROUP
BY
clause is illustrated in Example 2-43.
Complex merging can also be used to merge an IN
subquery into the accessing statement if the subquery is uncorrelated, as shown in Example 2-44. Complex merging is not cost-based; it must be enabled with the initialization parameter OPTIMIZER_FEATURES_ENABLE
or the MERGE
hint. Without this hint or parameter setting, the optimizer uses another approach, pushing predicates.
See Also:
|
The view avg_salary_view
contains the average salaries for each department:
CREATE VIEW avg_salary_view AS SELECT department_id, AVG(salary) AS avg_sal_dept, FROM employees GROUP BY department_id;
If complex view merging is enabled, then the optimizer can transform the following query, which finds the average salaries of departments in London:
SELECT departments.location_id, avg_sal_dept FROM departments, avg_salary_view WHERE departments.department_id = avg_salary_view.department_id AND departments.location_id = 2400;
into the following query:
SELECT departments.loc, AVG(salary) FROM departments, employees WHERE departments.department_id = employees.department_id AND departments.location_id = 2400 GROUP BY departments.rowid, departments.location_id;
The transformed query accesses the view's base table, selecting only the rows of employees who work in London and grouping them by department.
Complex merging can be used for an IN
clause with a noncorrelated subquery, as well as for views. The view min_salary_view
contains the minimum salaries for each department:
CREATE VIEW min_salary_view AS SELECT department_id, MIN(salary) min_sal FROM employees GROUP BY department_id;
If complex merging is enabled, then the optimizer can transform the following query, which finds all employees who earn the minimum salary for their department in London:
SELECT employees.last_name, employees.salary FROM employees, departments WHERE (employees.department_id, employees.salary) IN (select department_id, min_sal from min_salary_view) AND employees.department_id = departments.department_id AND departments.location_id = 2400;
into the following query (where e1
and e2
represent the employees
table as it is referenced in the accessing query block and the view's query block, respectively):
SELECT e1.last_name, e1.salary FROM employees e1, departments, employees e2 WHERE e1.department_id = departments.department_id AND departments.location_id = 2400 AND e1.department_id = e2.department_id GROUP BY e1.rowid, departments.rowid, e1.last_name, e1.salary HAVING e1.salary = MIN(e2.salary);
The optimizer can transform a query block that accesses a nonmergeable view by pushing the query block's predicates inside the view's query. Examples 2-45 and 2-46 illustrate this process.
In Example 2-45, the two_emp_tables
view is the union of two employee tables. The view is defined with a compound query that uses the UNION
set operator.
CREATE VIEW two_emp_tables (employee_id, last_name, job_id, manager_id, hire_date, salary, commission_pct, department_id) AS SELECT employee_id, last_name, job_id, manager_id, hire_date, salary, commission_pct, department_id FROM emp1 UNION SELECT employee_id, last_name, job_id, manager_id, hire_date, salary, commission_pct, department_id FROM emp2;
The following query accesses the view. The query selects the IDs and names of all employees in either table who work in department 50:
SELECT employee_id, last_name FROM two_emp_tables WHERE department_id = 50;
Because the view is defined as a compound query, the optimizer cannot merge the view's query into the accessing query block. Instead, the optimizer can transform the accessing statement by pushing its predicate, the WHERE
clause condition (department_id
= 50), into the view's compound query.
The resulting statement looks like the following:
SELECT employee_id, last_name FROM ( SELECT employee_id, last_name, job_id, manager_id, hire_date, salary, commission_pct, department_id FROM emp1 WHERE department_id = 50 UNION SELECT employee_id, last_name, job_id, manager_id, hire_date, salary, commission_pct, department_id FROM emp2 WHERE department_id = 50 );
If there is an index on the department_id
column, then the resulting WHERE
clauses make it available.
Figure 2-3 shows the execution plan of the resulting statement.
To execute this statement, Oracle performs the steps in Figure 2-3 in the following order:
emp1
and emp2
tables.UNION
-ALL
operation returning all rows returned by either step 5 or step 6, including all copies of duplicates.In Example 2-46, the view emp_group_by_deptno
contains the department number, average salary, minimum salary, and maximum salary of all departments that have employees.
CREATE VIEW emp_group_by_deptno AS SELECT department_id, AVG(salary) avg_sal, MIN(salary) min_sal, MAX(salary) max_sal FROM employees GROUP BY department_id;
The following query selects the average, minimum, and maximum salaries of department 50 from the emp_group_by_deptno
view:
SELECT * FROM emp_group_by_deptno WHERE department_id = 50;
The optimizer transforms the statement by pushing its predicate (the WHERE
clause condition) into the view's query. The resulting statement looks like the following:
SELECT department_id, AVG(salary) avg_sal, MIN(salary) min_sal, MAX(salary) max_sal, FROM employees WHERE department_id = 50 GROUP BY department_id;
If there is an index on the department_id
column, then the resulting WHERE
clause makes it available. Figure 2-4 shows the execution plan for the resulting statement. The execution plan uses an index on the department_id
column.
To execute this statement, Oracle performs the steps in Figure 2-4 in the following order:
emp_department_ix
(an index on the department_id
column of the employees
table) to retrieve the rowids of all rows in the employees
table with a department_id
value of 50.employees
table using the rowids retrieved by step 4.salary
values.The optimizer can transform a query that contains an aggregate function (AVG
, COUNT
, MAX
, MIN
, SUM
) by applying the function to the view's query.
The query in Figure 2-47 accesses the emp_group_by_deptno
view defined in Figure 2-46. The query derives the averages for the average department salary, the minimum department salary, and the maximum department salary from the employee table.
SELECT AVG(avg_sal), AVG(min_sal), AVG(max_sal) FROM emp_group_by_deptno;
The optimizer transforms this statement by applying the AVG
aggregate function to the select list of the view's query:
SELECT AVG(AVG(salary)), AVG(MIN(salary)), AVG(MAX(salary)) FROM employees GROUP BY department_id;
Figure 2-5 shows the execution plan of the resulting statement.
To execute this statement, Oracle performs the steps in Figure 2-5 in the following order:
employees
table.department_id
values and calculates the average, minimum, and maximum salary
value of each group.For a view that is on the right side of an outer join, the optimizer can use one of two methods, depending on how many base tables the view accesses:
The optimizer cannot transform all statements that access views into equivalent statements that access base table(s). For example, if a query accesses a ROWNUM
pseudocolumn in a view, then the view cannot be merged into the query, and the query's predicate cannot be pushed into the view.
To execute a statement that cannot be transformed into one that accesses base tables, Oracle issues the view's query, collects the resulting set of rows, and then accesses this set of rows with the original statement as though it were a table. Example 2-48 illustrates this process.
Consider the emp_group_by_deptno
view defined in Figure 2-46:
CREATE VIEW emp_group_by_deptno AS SELECT department_id, AVG(salary) avg_sal, MIN(salary) min_sal, MAX(salary) max_sal FROM employees GROUP BY department_id;
The following query accesses this view. The query joins the average, minimum, and maximum salaries from each department represented in the view to the name and location of the department in the departments
table.
SELECT emp_group_by_deptno.department_id, avg_sal, min_sal, max_sal, department_name, location_id FROM emp_group_by_deptno, departments WHERE emp_group_by_deptno.department_id = departments.department_id;
Because there is no equivalent statement that accesses only base tables, the optimizer cannot transform this statement. Instead, the optimizer chooses an execution plan that issues the view's query and then uses the resulting set of rows as it would the rows resulting from a table access.
See Also:
"Understanding Joins" for more information on how Oracle performs a nested loops join operation |
Figure 2-6 shows the execution plan for this statement.
To execute this statement, Oracle performs the steps in Figure 2-6 in the following order:
employees
table.salary
values selected by the query for the emp_group_by_deptno
view.department_id
value to perform a unique scan of the dept_id_pk
index.departments
table with the matching department_id
value.To choose the execution plan for a compound query, the optimizer chooses an execution plan for each of its component queries, and then combines the resulting row sources with the union, intersection, or minus operation, depending on the set operator used in the compound query. Examples 2-49, 2-50 and 2-51 illustrate the process.
The query in Example 2-49 uses the UNION
ALL
operator to select all occurrences of all parts in either the orders1
table or the orders2
table.
UNION
ALL
SELECT part FROM orders1 UNION ALL SELECT part FROM orders2;
Figure 2-7 shows the execution plan for this statement:
To execute this statement, Oracle performs the steps in Figure 2-7 in the following order:
orders1
and orders2
tables.UNION
-ALL
operation returning all rows that are returned by either step 2 or step 3 including all copies of duplicates.The query in Example 2-50 uses the UNION
operator to select all parts that appear in either the orders1
or orders2
table.
SELECT part FROM orders1 UNION SELECT part FROM orders2;
Figure 2-8 shows the execution plan for this statement:
This execution plan is identical to the one for the UNION
ALL
operator shown in Figure 2-7, except that in this case, Oracle uses the SORT
operation to eliminate the duplicates returned by the UNION
ALL
operation.
The query in Example 2-51 uses the INTERSECT
operator to select only those parts that appear in both the orders1
and orders2
tables.
SELECT part FROM orders1 INTERSECT SELECT part FROM orders2;
Figure 2-9 shows the execution plan for this statement:
To execute this statement, Oracle performs the steps in Figure 2-9 in the following order: