Friday, August 30, 2024

DBMS_XMLGEN limited workaround for converting LONG columns into CLOBs

If you landed here, it's probably because you hit the problem of dealing with LONG columns.

There are some good articles elsewhere about how to tackle this old problem; my contribution here is to point out some limitations that have apparently been overlooked when using the technique explained below.

It's a mystery to me why, after so many years, we can't get rid of this annoyance once and for all: why not "simply" add an equivalent CLOB column, at least for the data dictionary views?
Come on!

I needed to extract the content of the TEXT column from DBA_VIEWS and DBA_MVIEWS, possibly without having to pass through an INSERT into a table (using the TO_LOB function), which is the best workaround when you deal with static data, for one-off operations.
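
Just for reference, a minimal sketch of that TO_LOB workaround, assuming a throwaway staging table (table name, columns and filter below are purely illustrative):

  -- One-off workaround: copy the LONG column into a CLOB via INSERT ... SELECT.
  -- TO_LOB is only allowed in the select list of an INSERT ... SELECT (or CTAS).
  CREATE TABLE view_sources (
    owner     VARCHAR2(128),
    view_name VARCHAR2(128),
    text_clob CLOB
  );

  INSERT INTO view_sources (owner, view_name, text_clob)
  SELECT owner, view_name, TO_LOB(text)
  FROM   dba_views
  WHERE  owner = 'SCOTT';  -- example filter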

I stumbled upon an old video by Connor McDonald showing how to extract the content of a LONG column by exploiting the XML API DBMS_XMLGEN.GETXMLTYPE. After some adaptation to my case this trick seemed to save the day, and I was almost ready to celebrate when I started hitting errors during further tests.
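
The gist of the technique is more or less the following; I'm sketching it with extract(...).getclobval() rather than EXTRACTVALUE, and the owner/view names are placeholders, so take it as an approximation rather than Connor's exact code:

  -- Wrap the query in DBMS_XMLGEN.GETXMLTYPE, then pull the node content
  -- back out as a CLOB. Assumes a single row and XML-safe content
  -- (but see problem #2 below).
  SELECT dbms_xmlgen.getxmltype(
           'select text from dba_views ' ||
           'where owner = ''SCOTT'' and view_name = ''MY_VIEW'''
         ).extract('/ROWSET/ROW/TEXT/text()').getclobval() AS view_text
  FROM   dual;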

To cut a long story short, eventually I encountered the following problems:

  1. The 19c API documentation for DBMS_XMLGEN.SETCONVERTSPECIALCHARS is incorrect: it mentions a parameter named "conv", but the real parameter name is "replace" (see the sketch after this list). This typo is still present in the latest 23ai documentation.

  2. DBMS_XMLGEN.GETXMLTYPE and DBMS_XMLGEN.GETXML won't perform special character escaping via DBMS_XMLGEN.SETCONVERTSPECIALCHARS if the column type is LONG.
    I was getting parsing errors when using Connor's EXTRACTVALUE technique because the XML document contained stray, unescaped < or > characters coming from the source (as in WHERE conditions inside the view's query).

  3. DBMS_XMLGEN.GETXMLTYPE and DBMS_XMLGEN.GETXML will truncate the content to the first 32K for LONG columns.
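
As for the parameter name in problem #1, here is a minimal sketch of a call that does compile using named notation (the query is just a placeholder):

  DECLARE
    ctx  dbms_xmlgen.ctxhandle;
    doc  CLOB;
  BEGIN
    ctx := dbms_xmlgen.newcontext('select view_name from dba_views where rownum = 1');
    -- named notation: the parameter is really called "replace", not "conv" as documented
    dbms_xmlgen.setconvertspecialchars(ctx, replace => TRUE);
    doc := dbms_xmlgen.getxml(ctx);
    dbms_xmlgen.closecontext(ctx);
    dbms_output.put_line(substr(doc, 1, 200));
  END;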

Problem #1 was easily solved; problem #2 was solved by extracting the data with REGEXP_SUBSTR instead of EXTRACTVALUE, but this was only possible because I was working on an XML document containing a single ROW tag at a time. For multiple rows this solution will not work.

  FUNCTION long2clob
     ( p_qry in clob,      -- query selecting the LONG column; must return a single row!
       p_col in varchar2)  -- column name as it appears in the XML tag (typically uppercase)
  RETURN CLOB
  IS
    c  CLOB;
  BEGIN
    -- DBMS_XMLGEN.GETXML wraps the single row as <ROWSET><ROW>...</ROW></ROWSET>;
    -- capture group 2 of the pattern is the raw content between the <P_COL> tags.
    c := regexp_substr(
           dbms_xmlgen.getxml(p_qry),
           '(<ROW>.*<' || p_col || '>(.*)</' || p_col || '>.*</ROW>)',
           1,     -- start position
           1,     -- first occurrence
           'cn',  -- case-sensitive matching, '.' also matches newlines
           2      -- return capture group 2 (the column content)
         );
    return c;
  END long2clob;
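
A quick usage example (owner and view name are placeholders):

  SELECT long2clob(
           'select text from dba_views ' ||
           'where owner = ''SCOTT'' and view_name = ''MY_VIEW''',
           'TEXT'
         ) AS view_text
  FROM   dual;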

Problem #3 remains, unless your LONG columns are shorter than 32K.
Unfortunately we do have some views exceeding 32K of source, but considering the intended usage of this function I'll probably live with this limitation for the moment.
By the way, SQL Developer won't allow you to edit a view larger than 32K, and to me this sounds like an invitation to avoid such situations.

Finally, I also tried to see what happens when you feed a LONG column to the JSON_OBJECT function; unfortunately it raises the exception:

ORA-40654: Input to JSON generation function has unsupported data type.
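
For the record, this is the kind of attempt that fails (the filter is illustrative):

  -- passing a LONG column (TEXT) to JSON_OBJECT raises ORA-40654
  SELECT json_object('text' VALUE text) AS j
  FROM   dba_views
  WHERE  rownum = 1;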

That's all folks!