"Wheels within wheels", as Monty Bodkin would say. Extracting a very long string or CLOB from a JSON CLOB (very long => larger than max_string_size of 32767), in a pre-12.2 Oracle database, turned out to be more complex than it appeared at first.
This case study shows how to extract CLOB data from JSON CLOB, in an Oracle database with no/limited JSON parsing features.
CLOB within CLOB: What’s the problem?
Here’s the requirement and the challenge in implementing it. A big JSON CLOB input goes into function extract_payload, with the key name for which to extract the payload. In this example, the keyname is "payload" and it carries text data which could be very big in size. The payload value needs to be extracted and returned as CLOB.
The PL/SQL function’s signature:
CREATE OR REPLACE FUNCTION extract_payload (p_json_clob IN CLOB , p_json_keyname IN VARCHAR2) RETURN CLOB
Sample JSON CLOB input:
[ { "id": "1", "payloadType": "csv", "payload": "Very big payload 100K bytes" } ]
The challenges:
- The database is Oracle version 12.1 => newer features to extract CLOB with json_object_t.get_clob (as done here) are not available.
- The environment is controlled => the developer is not permitted to install third party utilities such as PL/JSON where json.to_clob (as done here) could have been used.
You might ask: Don’t the other Oracle 12c features for querying JSON data (SQL/JSON, dot notation) work?
Good question. Tried those: with CLOB data, got either errors or no value returned.
JSON_QUERY, JSON_VALUE with RETURNING CLOB don’t compile successfully.
Those lead to:
ORA-40444: JSON processing error
With JSON_TABLE, the column datatype mapping does not support CLOB. Compilation goes through, but the code conks off at runtime:
ORA-01460: unimplemented or unreasonable conversion requested
As a last-ditch effort, we saved "payload" data to temporary database table in a CLOB column, and then tried selecting via dot notation into a CLOB variable.
This did not give a compilation or runtime error, but nothing got selected when using very large CLOBs.
What’s the solution then? Here’s what worked for us. [Know of alternatives? Do share them.]
DBMS_LOB Solution: Instr to Find JSON Key Value end points, WriteAppend to CLOB output
Approach:
- Use dbms_lob.instr to find the start and end of the payload, based on the input JSON keyname and delimiters
- Read the JSON input chunk-wise — each chunk just under max string length — into a string buffer
- Use dbms_lob.writeappend to iteratively add the last read string buffer to a CLOB
- After the last chunk has been read and appended [test by comparing with the end delimiter for the payload], return the CLOB
extract_payload function:
-- Function that takes as input (1) a JSON CLOB -- (2) JSON key name for which to extract the value -- Returns the value for the input key name as a CLOB -- Can extract very long string/CLOB CREATE OR REPLACE FUNCTION extract_payload (p_json_clob IN CLOB , p_json_keyname IN VARCHAR2) RETURN CLOB IS l_payload CLOB; l_payload_buf VARCHAR2(32767); l_payload_init PLS_INTEGER; l_payload_end PLS_INTEGER; l_json_keyname VARCHAR2(100) := '"'||p_json_keyname||'":'; n_chunksize PLS_INTEGER := 32766; n_count PLS_INTEGER := 0; BEGIN -- Create temporary LOB to hold the payload dbms_lob.createtemporary(l_payload, FALSE); -- Starting position of data to extract; this must lie within the -- first chunk being read i.e. l_payload_init < n_chunksize l_payload_init := dbms_lob.instr(p_json_clob, l_json_keyname) + length(l_json_keyname) + 1; -- End position of data to extract l_payload_end := dbms_lob.instr(p_json_clob, '"', l_payload_init + 1) -1; -- Length of the data to extract l_payload_len := l_payload_end - l_payload_init + 1; -- Extract the data chunk by chunk and append to l_payload CLOB WHILE TRUE LOOP IF (l_payload_len - (n_chunksize * n_count) >= n_chunksize) THEN -- Extract next chunk of size n_chunksize l_payload_buf := dbms_lob.substr( p_json_clob , n_chunksize , (l_payload_init + (n_chunksize * n_count+1)) ); dbms_lob.writeappend(l_payload, n_chunksize, l_payload_buf); n_count := n_count + 1; ELSE -- Extract final chunk l_payload_buf := dbms_lob.substr( p_json_clob , (l_payload_end - (l_payload_init + (n_chunksize * n_count))) , (l_payload_init + (n_chunksize * n_count+1)) ); dbms_lob.writeappend(l_payload, length(l_payload_buf), l_payload_buf); RETURN l_payload; END IF; END LOOP; END extract_payload;
When compiled:
SQL> -- Function that takes as input (1) a JSON CLOB SQL> -- (2) JSON key name for which to extract the value SQL> -- Returns the value for the input key name as a CLOB SQL> -- Can extract very long string/CLOB SQL> CREATE OR REPLACE FUNCTION extract_payload 2 (p_json_clob IN CLOB 3 , p_json_keyname IN VARCHAR2) 4 RETURN CLOB 5 IS 6 l_payload CLOB; 7 l_payload_buf VARCHAR2(32767); 8 l_payload_init PLS_INTEGER; 9 l_payload_end PLS_INTEGER; 10 l_json_keyname VARCHAR2(100) := '"'||p_json_keyname||'":'; 11 n_chunksize PLS_INTEGER := 32766; 12 n_count PLS_INTEGER := 0; 13 BEGIN 14 -- Create temporary LOB to hold the payload 15 dbms_lob.createtemporary(l_payload, FALSE); 16 17 -- Starting position of data to extract; this must lie within the 18 -- first chunk being read i.e. l_payload_init < n_chunksize 19 l_payload_init := dbms_lob.instr(p_json_clob, l_json_keyname) 20 + length(l_json_keyname) + 1; 21 22 -- End position of data to extract 23 l_payload_end := dbms_lob.instr(p_json_clob, '"', l_payload_init + 1)-1; 24 25 -- Extract the data chunk by chunk and append to l_payload CLOB 26 WHILE TRUE 27 LOOP 28 IF (LENGTH(p_json_clob) - (n_chunksize * n_count) >= n_chunksize) THEN 29 -- Extract next chunk of size n_chunksize 30 l_payload_buf := 31 dbms_lob.substr( 32 p_json_clob 33 , n_chunksize 34 , (l_payload_init + (n_chunksize * n_count+1)) 35 ); 36 dbms_lob.writeappend(l_payload, n_chunksize, l_payload_buf); 37 n_count := n_count + 1; 38 39 ELSE 40 -- Extract final chunk 41 l_payload_buf := 42 dbms_lob.substr( 43 p_json_clob 44 , (l_payload_end - (l_payload_init + (n_chunksize * n_count))) 45 , (l_payload_init + (n_chunksize * n_count+1)) 46 ); 47 48 dbms_lob.writeappend(l_payload, length(l_payload_buf), l_payload_buf); 49 RETURN l_payload; 50 END IF; 51 END LOOP; 52 END extract_payload; 53 / Function created.
Note: JSON key names in the example given are enclosed in double quotes and followed immediately a colon and blank space. There may be variations in your use case, for which you may need to make adjustments in the delimiter position derivation logic.
Testing the Solution: Extract CLOB from CLOB
Case 1: Small payload < 100 characters
SQL> -- Anonymous block to call extract payload SQL> -- Testing with a small payload < 100 characters SQL> DECLARE 2 l_json_clob CLOB := '[ 3 { 4 "id": "1", 5 "payloadType": "csv", 6 "payload": "Small payload < 100 characters" 7 } 8 ]'; 9 l_output_clob CLOB; 10 BEGIN 11 l_output_clob := extract_payload(l_json_clob, 'payload'); 12 13 insert into bigfiles values (l_output_clob); 14 commit; 15 END; 16 / PL/SQL procedure successfully completed. SQL> select substr(data,1,30) data_fragment 2 , dbms_lob.getlength(data) data_length 3 from bigfiles; DATA_FRAGMENT DATA_LENGTH ------------------------------ ----------- Small payload < 100 characters 30
Case 2: Big payload > 32767 characters
SQL> -- Anonymous block to call extract payload SQL> -- Testing with a big payload > 32767 characters SQL> DECLARE 2 l_json_clob CLOB := empty_clob(); 3 l_output_clob CLOB; 4 BEGIN 5 dbms_lob.createtemporary(l_json_clob, true); 6 l_json_clob := '[{ 7 "id": "1", 8 "payloadType": "csv", 9 "payload": "'; 10 11 for i in 1 .. 5 loop 12 l_json_clob := l_json_clob || lpad('x', 32767, 'x'); 13 end loop; 14 l_json_clob := l_json_clob || '"}]'; 15 16 l_output_clob := extract_payload(l_json_clob, 'payload'); 17 dbms_lob.freetemporary(l_json_clob); 18 19 insert into bigfiles values (l_output_clob); 20 commit; 21 END; 22 / SQL> select substr(data,1,30) data_fragment 2 , dbms_lob.getlength(data) data_length 3 from bigfiles; DATA_FRAGMENT DATA_LENGTH ------------------------------ ----------- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 163835 Small payload < 100 characters 30
Summary
This articles demonstrates an approach for extracting a very long string or CLOB from JSON, in a pre-12.2 Oracle database with no/limited support for JSON features.
For older database versions with smaller max string length, reduce the value of n_chunksize (say, set it to < 4000 instead of < 32767) in function extract_payload and you are good to go.
Also read Crop a CLOB with DBMS_LOB.FRAGMENT_DELETE for an alternate approach to solving the same problem.
{ 1 comment… read it below or add one }
What if you have to loop through json array to get the objec as clob?