Monday, July 31, 2023

In a single-byte character set database non-ASCII characters may cause JSON parse to fail

Always check out the original article at http://www.oraclequirks.com for latest comments, fixes and updates.
declare
  js json_object_t;
begin
  js := json_object_t.parse('{"name":"$DB_NAME - report GEN - MIELE – Riepilogo domande aiuto"}');
end;
Error report -
ORA-40441: JSON syntax error
ORA-06512: at "SYS.JDOM_T", line 4
ORA-06512: at "SYS.JSON_OBJECT_T", line 86
ORA-06512: at line 4
40441. 00000 -  "JSON syntax error"
*Cause:    The provided JavaScript Object Notation (JSON) data had invalid
           syntax and could not be parsed.
*Action:   Provide JSON data with the correct syntax.

The problem here is the presence of a "en dash" character (ASCII 150) after the word MIELE (it looks like a normal hyphen but it's not).

In a single-character database like this one (WE8MSWIN1252), the presence of non-ASCII characters inside JSON strings causes problems at parsing time.

One solution is to convert the strings containing such characters using the function ASCIISTR, that will replace non-ASCII character with their numeric Unicode counterpart:

report GEN SIAG - MIELE \2013 Riepilogo domande aiuto

However such conversion will increase the length of the string and this might lead to further problems.

Converting the strings using CONVERT(<str>, 'US7ASCII') can also fix the parsing failure and keep the size consistent, but it will replace the non-ASCII character with "?" in most cases. 

In a AL32UTF8 character set database the script above will execute without errors.

No comments:

yes you can!

Two great ways to help us out with a minimal effort. Click on the Google Plus +1 button above or...
We appreciate your support!

latest articles