When you have an enormous load of data to copy/merge from one table to another, you are probably concerned about:
- Performance: How fast can the program copy/merge all the data?
- Exception handling: How well does the program deal with errors in the data? Can it highlight bad data if present in the lot, at the same time allow the good data to pass successfully?
In older versions of Oracle, if you had to insert a million rows using INSERT…SELECT and even one of those rows had an error, the entire transaction would fail. For selective exception handling, you had no choice but to manage the transaction procedurally.
Oracle 10G R2+ has a way to fortify DML statements with a "LOG ERRORS INTO" clause, which can deal with errors pretty much like a PL/SQL exception handler would. Read onto know how DML error logging works.
DML Error Logging: Building Blocks
The LOG ERRORS INTO clause gets suffixed to your regular DML statement this way:
<DML statement>
LOG ERRORS INTO<error_logging_table> (<tag>)
REJECT LIMIT <reject limit value>;
Here’s a closer look at the components of DML Error Logging.
1. DML statement
The DML statement could be an INSERT, UPDATE, DELETE or MERGE.
2. Error logging table
The error logging table is designed to hold the errorred rows during the DML operation. The error table may be created in any of the two ways:
- Manually via the regular CREATE TABLE command. This table must follow certain guidelines as described in the next section "Error Logging Table Format".
- Automatically via call to the DBMS_ERRLOG package::
exec dbms_errlog.create_error_log( ‘dml_target_table’, ‘error_logging_table’);
3. Tag
This is an optional value that goes into each row inserted into the error table. In a batch transaction, it helps to tag both the target DML table rows and the error logging table rows with a batch_id for grouping and cross-validation.
4. Reject limit value
An integer that defines the maximum number of errors allowed in the DML statement — if the error count exceeds the reject limit value, the DML statement terminates and all changes made are rolled back.
The default reject limit value is 0.
If you want the DML to go through regardless of the count of errors, set REJECT LIMIT UNLIMITED.
Error Logging Table Format
The DML error logging table should follow these specifications.
1. Mandatory columns
The table must contain these columns:
Column Name | Data Type | Description |
ORA_ERR_NUMBER$ | NUMBER | ORA error number |
ORA_ERR_MESG$ | VARCHAR2(2000) | ORA error message text |
ORA_ERR_ROWID$ | ROWID | Rowid of the row in error, for update and delete; null for insert |
ORA_ERR_OPTYP$ | VARCHAR2(2) | Type of operation: insert (I), update (U), delete (D). In a MERGE, update rows are marked U, insert rows are marked I. |
ORA_ERR_TAG$ | VARCHAR2(2000) | Optional tag value from the DML error logging clause |
DBMS_ERRLOG takes care of including the mandatory columns in the error logging table, in the case of automatic error logging table creation.
When creating the error logging table manually, you need to ensure the mandatory columns are present. These columns can be in any order, but they must appear at the start of the table.
2. Optional columns:
The error logging table can optionally contain a set of columns to hold data from the row that caused the error. The column names of the error logging table must match the column names of the DML target table. If there’s a mismatch, the column is not logged/value is ignored.
If DBMS_ERRORLOG is used to create the error logging table automatically, all the columns from the DML target table get included in the error logging table. In case of a table with several columns, this may be an unnecessary space overhead: in this scenario it’s best to create your own error logging table trimmed to only the essential columns.
With your own error logging table, you can choose which columns to include: in general, the unique index columns are enough. Position them in the table after all the mandatory columns.
3. Error Logging Table Name:
When creating the error logging table automatically, you can specify its name in the DBMS_ERRLOG parameters.
If the error logging table name is unspecified, it is given a default name of this form:
ERR$_ || first 25 characters of <dml_target_table>
Example:
dml_target_table name: EMP
error_logging_table name: ERR$_EMP
When creating the error logging table manually, you can give it any name that follows the standard Oracle table naming guidelines.
Let’s get on with the DML error logging code in action.
Case Study: Copying huge data volume across tables
We’ll build on the problem statement from the bulk DML with FORALL example:
The tables:
- PEOPLE — Source table with columns (ID, NAME, STATUS) from which data is to be read.
- CUSTOMER — Target table with columns (ID, NAME) to which data is to be inserted.
Table CUSTOMER has a unique constraint on columm ID.
Table PEOPLE has no unique constraint, and some data may have IDs repeated.
The requirement:
Copy data from table PEOPLE to CUSTOMER, where the status of the data in PEOPLE is NEW.
Record the errors if they occur during copying. Given the ID repetitions in table PEOPLE, errors are expected due to unique constraint violation in table CUSTOMER.
After copying, update the status column in table PEOPLE: set the successfully copied rows to DONE and the failed rows to ERROR.
The data before copying:
SQL> -- Source DML table PEOPLE SQL> select * from people; ID NAME STATUS ---- ---------- -------- 1 Jon NEW 2 Cersei NEW 2 Khal NEW 3 Gregor NEW 3 Sansa NEW 4 Arya NEW 5 Eddard NEW 7 rows selected. SQL> -- Target DML table CUSTOMER SQL> select * from customer; no rows selected
DML Error Logging: Solution Approach
1. Create an error logging table.
2. INSERT INTO CUSTOMER…SELECT FROM PEOPLE, logging errors into the error logging table.
3. UPDATE PEOPLE status from NEW to DONE/ERROR conditionally with a searched case.
You might ask why we didn’t consider direct-path INSERT (that is, INSERT with /*+ APPEND */ hint), which could go much faster for high data volumes than conventional-path INSERT.
The reason is that with DML error logging, direct-path INSERT will fail when a unique constraint violation occurs, while a conventional-path INSERT will log these errors to the error logging table. Our expected error in this case is the unique constraint violation on table CUSTOMER, so we need to stick to direct-path INSERT…
…unless we are game to try this clever hack.
DML Error Logging in Action
[BEFORE] Data:
SQL> -- Source DML table PEOPLE SQL> select * from people; ID NAME STATUS ---- ---------- -------- 1 Jon NEW 2 Cersei NEW 2 Khal NEW 3 Gregor NEW 3 Sansa NEW 4 Arya NEW 5 Eddard NEW 7 rows selected. SQL> -- Target DML table CUSTOMER SQL> select * from customer; no rows selected
Note that of the seven rows of source data, two IDs (2 and 3) are repeated — this means the INSERT should work for only five of the source rows, and two rows should end up in error.
1. Create the error logging table for DML target table
SQL> -- Create the error logging table for DML target table SQL> exec dbms_errlog.create_error_log( 'CUSTOMER' ); PL/SQL procedure successfully completed.
Since we used the defaults for DBMS_ERRLOG, the error logging table would be named ERR$_CUSTOMER.
2. Verify the structure of the error logging table
SQL> -- Verify the structure of the automatically SQL> -- created error_logging_table for CUSTOMER SQL> desc err$_customer Name Null? Type ----------------------- -------- --------------- ORA_ERR_NUMBER$ NUMBER ORA_ERR_MESG$ VARCHAR2(2000) ORA_ERR_ROWID$ ROWID ORA_ERR_OPTYP$ VARCHAR2(2) ORA_ERR_TAG$ VARCHAR2(2000) ID VARCHAR2(4000) NAME VARCHAR2(4000)
We see that the table has the mandatory columns ending in $, and the other columns (ID, NAME) from the DML target table CUSTOMER.
3. Run the script with DML Error Logging
This script inserts into CUSTOMER via select from PEOPLE where status = NEW. It then updates PEOPLE status to DONE or ERROR, depending on whether a corresponding row exists in the error logging table or not.
SQL> declare 2 -- Id for the run. Not really necessary in this example 3 -- but useful if DML error logging is part of a stored 4 -- proc and called periodically via batch program 5 batch_id number := 999; 6 begin 7 8 -- Insert into CUSTOMER, logging errors along the way 9 insert into customer (id, name) 10 select p.id, p.name 11 from people p 12 where status = 'NEW' 13 log errors into err$_customer (batch_id) 14 reject limit unlimited; 15 16 -- Update status in PEOPLE after copying data 17 -- DONE if successful, ERROR if failed 18 update people p 19 set p.status = case when exists 20 (select 1 21 from err$_customer e 22 -- identify the copied row 23 where p.id = e.id and p.name = e.name 24 and ora_err_tag$ = batch_id) 25 then 'ERROR' 26 else 'DONE' 27 end 28 where status = 'NEW'; 29 end; 30 / PL/SQL procedure successfully completed.
The script ran successfully. Let’s check the errors if any in ERR$_CUSTOMER, and the state of the data in the two tables PEOPLE and CUSTOMER.
4. Check failed rows in the error_logging_table
SQL> -- Values in the error_logging_table SQL> select ora_err_number$ number$ 2 , ora_err_mesg$ mesg$ 3 , ora_err_optyp$ optyp$ 4 , ora_err_tag$ tag$ 5 , id 6 , name 7 from err$_customer; NUMBER$ MESG$ OPTYP$ TAG$ ID NAME ------- ---------------------------- ------ ---- -- ----- 1 ORA-00001: unique constraint I 999 2 Khal (HR.CUSTOMER_UK) violated 1 ORA-00001: unique constraint I 999 3 Sansa (HR.CUSTOMER_UK) violated
As expected, two rows have failed due to unique constraint violation.
[AFTER] Data:
- CUSTOMER has 5 new unique rows
- ERR$_CUSTOMER has 2 error rows with error details
- PEOPLE status correctly indicates which rows were copied successfully, which were not (five DONE, two ERROR)
SQL> -- Source DML table PEOPLE SQL> select * from people; ID NAME STATUS ---- ---------- -------- 1 Jon DONE 2 Cersei DONE 2 Khal ERROR 3 Gregor DONE 3 Sansa ERROR 4 Arya DONE 5 Eddard DONE 7 rows selected. SQL> SQL> -- Target DML table CUSTOMER SQL> select * from customer; ID NAME ---- ---------- 1 Jon 2 Cersei 3 Gregor 4 Arya 5 Eddard
Which to Use: DML Error Logging or FORALL…SAVE EXCEPTIONS?
When both DML Error Logging and FORALL…SAVE EXCEPTIONS give you performance/exception handling benefits for transactions with large data volume, which is the preferred approach?
The question doesn’t have an all-encompassing “this approach” answer – if it did, Oracle would probably not have two ways of doing the same thing. You would need to analyse your requirements/benchmark for both approaches.
DML Error Logging can be very fast if you can use direct-path load (INSERT /*+ APPEND */) and if there are no indexes on the target table. If you must use conventional-path load (as in the cases study above), you may want to evaluate the approach more carefully.
FORALL…SAVE EXCEPTIONS is faster than row-by-row processing, but might not be as fast as direct-path load with DML Error Logging.
Even if you can use direct-path load, there are factors other than speed to consider:
- Locking: direct-path load locks the object while it’s in action; bulk DML does not.
- Free space utilization: direct-path load does not utilise existing free space, which bulk DML does.
- Transaction management: direct-path load needs a commit/rollback right after completion*; bulk DML does not and can support further processing in the same transaction.
*Example: try selecting from a table immediately after direct-path load, without commit/rollback. You get:
ORA-12838: cannot read/modify an object after modifying it in parallel
Summary
This article explains the features of DML Error Logging, with a working example. It also gives guidelines for making a choice between DML Error Logging and bulk DML with SAVE EXCEPTIONS.
{ 1 comment… read it below or add one }
Very helpful and informative .