Oracle® Database Concepts 11g Release 1 (11.1) Part Number B28318-01 |
|
|
View PDF |
This chapter contains an overview of VLDB topics, with emphasis on partitioning as a key component of the VLDB strategy. It covers the following topics:
See Also:
Oracle Database VLDB and Partitioning Guide for more information about VLDB topics such as partitioningPartitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individuals partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects. Also, partitioning is entirely transparent to applications.
Each partition of a table or index must have the same logical attributes, such as column names, datatypes, and constraints, but each partition can have separate physical attributes such as pctfree, pctused, and tablespaces.
Partitioning is useful for many different types of applications, particularly applications that manage large volumes of data. OLTP systems often benefit from improvements in manageability and availability, while data warehousing systems benefit from performance and manageability.
Partitioning offers these advantages:
Partitioning enables data management operations such data loads, index creation and rebuilding, and backup/recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.
Partitioning improves query performance. In many cases, the results of a query can be achieved by accessing a subset of partitions, rather than the entire table. For some queries, this technique (called partition pruning) can provide order-of-magnitude gains in performance.
Partitioning can significantly reduce the impact of scheduled downtime for maintenance operations.
Partition independence for partition maintenance operations lets you perform concurrent maintenance operations on different partitions of the same table or index. You can also run concurrent SELECT
and DML operations against partitions that are unaffected by maintenance operations.
Partitioning increases the availability of mission-critical databases if critical tables and indexes are divided into partitions to reduce the maintenance windows, recovery times, and impact of failures.
Partitioning can be implemented without requiring any modifications to your applications. For example, you could convert a nonpartitioned table to a partitioned table without needing to modify any of the SELECT
statements or DML statements which access that table. You do not need to rewrite your application code to take advantage of partitioning.
This section includes the following topics:
Each row in a partitioned table is unambiguously assigned to a single partition. The partition key is a set of one or more columns that determines the partition for each row. Oracle Database automatically directs insert, update, and delete operations to the appropriate partition through the use of the partition key. A partition key:
Consists of an ordered list of 1 to 16 columns
Cannot contain a LEVEL
, ROWID
, or MLSLABEL
pseudocolumn or a column of type ROWID
Can contain columns that are NULL
able
Tables can be partitioned into up to 1024K-1 separate partitions. Any table can be partitioned except those tables containing columns with LONG
or LONG RAW
datatypes. You can, however, use tables containing columns with CLOB
or BLOB
datatypes.
Note:
To reduce disk use and memory use (specifically, the buffer cache), you can store tables and partitioned tables in a compressed format inside the database. This often leads to a better scaleup for read-only operations. Table compression can also speed up query execution. There is, however, a slight cost in CPU overhead.See Also:
"Table Compression"You can partition index-organized tables by range, list, or hash. Partitioned index-organized tables are very useful for providing improved manageability, availability, and performance for index-organized tables. In addition, data cartridges that use index-organized tables can take advantage of the ability to partition their stored data.
For partitioning an index-organized table:
Partition columns must be a subset of primary key columns
Secondary indexes can be partitioned—locally and globally
OVERFLOW
data segments are always equipartitioned with the table partitions
Oracle provides the following partitioning methods:
Range partitioning maps data to partitions based on ranges of partition key values that you establish for each partition.
Interval partitioning is an extension of range partitioning which instructs the database to automatically create partitions of a specified interval when data inserted into the table exceeds all of the range partitions.
Hash partitioning maps data to partitions based on a hashing algorithm that evenly distributes rows among partitions, giving partitions approximately the same size.
List partitioning enables you to explicitly control how rows map to partitions by specifying a list of discrete values in the description for each partition.
Reference partitioning enables you to partition a table based on the partitioning scheme of the table referenced in its referential constraint.
Composite partitioning is a combination of two partitioning methods to further divide the data into subpartitions:
Composite range-range partitioning partitions data using the range method, and within each partition, subpartitions it using the range method.
Composite range-hash partitioning partitions data using the range method, and within each partition, subpartitions it using the hash method.
Composite range-list partitioning partitions data using the range method, and within each partition, subpartitions it using the list method.
Composite list-range partitioning partitions data using the list method, and within each partition, subpartitions it using the range method.
Composite list-hash partitioning partitions data using the list method, and within each partition, subpartitions it using the hash method.
Composite list-list partitioning partitions data using the list method, and within each partition, subpartitions it using the list method.
System partitioning enables application-controlled partitioning for arbitrary tables.
Just like partitioned tables, partitioned indexes improve manageability, availability, performance, and scalability. They can either be partitioned independently (global indexes) or automatically linked to a table's partitioning method (local indexes). In general, you should use global indexes for OLTP applications and local indexes for data warehousing or DSS applications. Also, whenever possible, you should try to use local indexes because they are easier to manage. When deciding what kind of partitioned index to use, you should consider the following guidelines in order:
If the table partitioning column is a subset of the index keys, use a local index. If this is the case, you are finished. If this is not the case, continue to guideline 2.
If the index is unique, use a global index. If this is the case, you are finished. If this is not the case, continue to guideline 3.
If your priority is manageability, use a local index. If this is the case, you are finished. If this is not the case, continue to guideline 4.
If the application is an OLTP one and users need quick response times, use a global index. If the application is a DSS one and users are more interested in throughput, use a local index.
See Also:
Oracle Database VLDB and Partitioning Guide and Oracle Database Administrator's Guide for more information about partitioned indexes and how to decide which type to useThis section includes the following topics:
Miscellaneous Information about Creating Indexes on Partitioned Tables
Using Partitioned Indexes in Data Warehousing and DSS Applications
Local partitioned indexes are easier to manage than other types of partitioned indexes. They also offer greater availability and are common in DSS environments. The reason for this is equipartitioning: each partition of a local index is associated with exactly one partition of the table. This enables Oracle Database to automatically keep the index partitions in sync with the table partitions, and makes each table-index pair independent. Any actions that make one partition's data invalid or unavailable only affect a single partition.
Local partitioned indexes support more availability when there are partition or subpartition maintenance operations on the table. A type of index called a local nonprefixed index is very useful for historical databases. In this type of index, the partitioning is not on the left prefix of the index columns.
See Also:
Oracle Database VLDB and Partitioning Guide more information about prefixed indexesYou cannot explicitly add a partition to a local index. Instead, new partitions are added to local indexes only when you add a partition to the underlying table. Likewise, you cannot explicitly drop a partition from a local index. Instead, local index partitions are dropped only when you drop a partition from the underlying table.
A local index can be unique. However, in order for a local index to be unique, the partitioning key of the table must be part of the index's key columns. Unique local indexes are useful for OLTP environments.
Oracle Database offers two types of global partitioned index: range partitioned and hash partitioned.
This section includes the following topics:
Global range partitioned indexes are flexible in that the degree of partitioning and the partitioning key are independent from the table's partitioning method. They are commonly used for OLTP environments and offer efficient access to any individual record.
The highest partition of a global index must have a partition bound, all of whose values are MAXVALUE
. This ensures that all rows in the underlying table can be represented in the index. Global prefixed indexes can be unique or nonunique.
You cannot add a partition to a global index because the highest partition always has a partition bound of MAXVALUE
. If you wish to add a new highest partition, use the ALTER INDEX SPLIT PARTITION
statement. If a global index partition is empty, you can explicitly drop it by issuing the ALTER INDEX DROP PARTITION
statement. If a global index partition contains data, dropping the partition causes the next highest partition to be marked unusable. You cannot drop the highest partition in a global index.
Global hash partitioned indexes improve performance by spreading out contention when the index is monotonically growing. In other words, most of the index insertions occur only on the right edge of an index.
By default, the following operations on partitions on a heap-organized table mark all global indexes as unusable:
ADD (HASH) COALESCE (HASH) DROP EXCHANGE MERGE MOVE SPLIT TRUNCATE
These indexes can be maintained by appending the clause UPDATE INDEXES
to the SQL statements for the operation. The two advantages to maintaining global indexes:
The index remains available and online throughout the operation. Hence no other applications are affected by this operation.
The index doesn't have to be rebuilt after the operation.
Example:
ALTER TABLE DROP PARTITION P1 UPDATE INDEXES;
Note:
This feature is supported only for heap-organized tables.See Also:
Oracle Database SQL Language Reference for more information about theUPDATE INDEXES
clauseGlobal nonpartitioned indexes behave just like a nonpartitioned index. They are commonly used in OLTP environments and offer efficient access to any individual record.
You can create bitmap indexes on partitioned tables, with the restriction that the bitmap indexes must be local to the partitioned table. They cannot be global indexes.
Global indexes can be unique. Local indexes can only be unique if the partitioning key is a part of the index key.
Here are a few guidelines for OLTP applications:
Global indexes and unique, local indexes provide better performance than nonunique local indexes because they minimize the number of index partition probes.
Local indexes offer better availability when there are partition or subpartition maintenance operations on the table.
Hash-partitioned global indexes offer better performance by spreading out contention when the index is monotonically growing. In other words, most of the index insertions occur only on the right edge of an index.
Here are a few guidelines for data warehousing and DSS applications:
Local indexes are preferable because they are easier to manage during data loads and during partition-maintenance operations.
Local indexes can improve performance because many index partitions can be scanned in parallel by range queries on the index key.
Partitioning can help you improve performance and manageability. Some topics to keep in mind when using partitioning for these reasons are:
Oracle Database explicitly recognizes partitions and subpartitions. It then optimizes SQL statements to mark the partitions or subpartitions that need to be accessed and eliminates (prunes) unnecessary partitions or subpartitions from access by those SQL statements. In other words, partition pruning is the skipping of unnecessary index and data partitions or subpartitions in a query.
For each SQL statement, depending on the selection criteria specified, unneeded partitions or subpartitions can be eliminated. For example, if a query only involves March sales data, then there is no need to retrieve data for the remaining eleven months. Such intelligent pruning can dramatically reduce the data volume, resulting in substantial improvements in query performance.
The SQL Access Advisor can automatically analyze the schema design for a given workload and recommend indexes, function-based indexes, partitions, and materialized views to create, retain, or drop as appropriate for the workload.
See Also:
Oracle Database Performance Tuning Guide for more information about the SQL Access AdvisorIf the optimizer determines that the selection criteria used for pruning are satisfied by all the rows in the accessed partition or subpartition, it removes those criteria from the predicate list (WHERE
clause) during evaluation in order to improve performance. However, the optimizer cannot prune partitions if the SQL statement applies a function to the partitioning column (with the exception of the TO_DATE
function). Similarly, the optimizer cannot use an index if the SQL statement applies a function to the indexed column, unless it is a function-based index.
Pruning can eliminate index partitions even when the underlying table's partitions cannot be eliminated, but only when the index and table are partitioned on different columns. You can often improve the performance of operations on large tables by creating partitioned indexes that reduce the amount of data that your SQL statements need to access or modify.
Equality, range, LIKE
, and IN
-list predicates are considered for partition pruning with range or list partitioning, and equality and IN
-list predicates are considered for partition pruning with hash partitioning.
There is a partitioned table called cust_orders
. The partition key for cust_orders
is order_date
. Assume that cust_orders
has six months of data, January to June, with a partition for each month of data. If the following query is run:
SELECT SUM(value) FROM cust_orders WHERE order_date BETWEEN '28-MAR-98' AND '23-APR-98';
Partition pruning is achieved by:
First, partition elimination of January, February, May, and June data partitions. Then either:
An index scan of the March and April data partition due to high index selectivity
or
A full scan of the March and April data partition due to low index selectivity
A partition-wise join is a join optimization for joining two tables that are both partitioned along the join column(s). With partition-wise joins, the join operation is broken into smaller joins that are performed sequentially or in parallel. Another way of looking at partition-wise joins is that they minimize the amount of data exchanged among parallel slaves during the execution of parallel joins by taking into account data distribution.