Oracle provides an abstract SQL data type called XMLType for storing XML data in the database. You can create an XMLType table, or an XMLType column in a relational table, to persist XML data.
Different storage models are available in Oracle to best fit the nature of the XML data and its expected use. These are:
- Binary XML storage
- Structured storage (object-relational)
- Unstructured XML storage (CLOB)
Here’s an overview of each of these XML storage models with the use cases the model is appropriate for.
[This post is from the perspective of Oracle 11g/12c. Oracle XML DB features have evolved considerably with version changes — it’s best to check your version’s XML DB Developer’s Guide for the most relevant details for your needs.]
Binary XML Storage
Also called post-parse persistence, binary XML storage is a compact, flexible, XML schema-aware binary format designed specifically for XML data.
A note on the XML terminology in case "XML schema-aware" is an unfamiliar term for you:
XML Schema, also called XML Schema Definition (XSD), describes the content and structure of a type of XML document. XSDs make manipulation of the XML data easier and more robust in applications – XML files can be validated against an XSD, or XML data can be annotated with type information based on an XSD. This may be done by mapping the XSD-defined types to types in a programming language, or by using the XSD’s information in XML processing languages such as XSLT and XQuery – the latter is known as "schema-awareness".
Binary XML storage was introduced in Oracle 11.1 as a new XMLType storage model. In Oracle 11.2.0.2, binary XML became the default XMLType storage model.
The main advantage of binary XML storage is flexibility: it works with XML data which may or may not be based on an XML schema. You can use it with an XML schema that has high data variability or one that evolves unexpectedly.
Q: When is it appropriate to use binary XML storage?
A: The following use cases are a good fit for binary XML:
- the typical operations on the XML are INSERT or SELECT of the *whole* XML document; filtered/conditional selection of portions of XML is rare
- the XML data is very sparse i.e. has many missing/empty elements
- the XML data does not have an associated XML schema
- the XML data does have an associated XML schema but the schema evolves frequently
- the XML data has more than one associated XML schema
Structured XML storage (object-relational)
In structured XML storage, XMLType data is "shredded" into scalar values for storage in object-relational tables in the database. When an XML schema is registered with Oracle XML DB, SQL objects to store the conforming XML data are automatically generated. Structured XML storage is also called object-relational storage or object-based persistence.
With structured XML storage, Oracle XML DB can execute XQuery-based and XPath-based queries at highly optimized speed using query rewrite. This makes structured XML storage very useful for DML-heavy applications.
Q: When is it appropriate to use structured XML storage?
A: To consider structured XML storage, all of the following conditions must be met:
- the typical operations on the XML are INSERT or SELECT on parts of the XML document; filtered/conditional selection of portions of XML is common
- the XML data is not sparse i.e. does not have many missing/empty elements
- the XML data has an associated XML schema that rigorously specifies the data format (i.e. no constructs such as any and choice)
- the associated XML schema does not evolve frequently
- document fidelity is not necessary, DOM fidelity is sufficient — i.e. if you retrieved the XML document after storing it, it is sufficient to preserve all the information as in the original XML document, not necessary to preserve whitespace.
Unstructured XML storage (CLOB)
In unstructured XML storage, XMLType data is stored in the database as CLOB (Character Large OBjects). Unstructured XML storage is also called CLOB storage or text-based persistence.
CLOB used to be the default XMLType storage model till binary XML ousted it from that rank in Oracle 11.2.0.2. Oracle version 12.1.0.1 onwards, CLOB storage model for XMLType has been deprecated.
Q: When is it appropriate to use unstructured XML storage?
A: Before Oracle 11.1: CLOB storage was considered apt in scenarios where the XML required absolute document fidelity — i.e. if you retrieved the XML document after storing it, you would get exactly the XML document that went in, whitespaces intact etc. Document fidelity came at the expense of query/DML performance.
Oracle 11.1 onwards: Use of unstructured XML storage is not recommended; binary XML storage is recommended instead. If document fidelity is necessary, then a copy of the XML data may be stored in a relational CLOB column.
Summary
This articles gives an overview of three models for storing XML data in Oracle. These are:
- Binary XML storage
- Structured storage (object-relational)
- Unstructured XML storage (CLOB)
In the next articles, we will see a working examples with scripts for binary XML storage and object-relational storage.
For Further Reading
- XML DB on Oracle Forums
- Choice of XMLType Storage and Indexing (Oracle 12.2)
- Storing JSON Data in Oracle