Although I've been developing applications that make extensive use of XML for years, yesterday i was really amazed when i discovered that Oracle functions like EXTRACT and EXTRACTVALUE return a null value when the document being handled includes a default namespace declaration in the topmost element (see the purple string in the document sample below) and you do not explicitly specify this namespace as the third parameter for these functions.
Let's take a sample RSS XML document (many elements omitted for clarity):
select extractValue(xml_content,'/feed/title') title
from xml_documents
where id = 38;
TITLE
------
(null)
How can it be? i can see the title element right there below the feed element!
To make things worse, add up to the picture that the documentation for Oracle (up until 11gR1) is not so clear on this point but you might think that in the end it's just a matter of looking at some examples given in the SQL reference, however you may easily see yourself that the examples for EXTRACT and EXTRACTVALUE do not show the usage of the third parameter and the syntax diagram merely says "namespace string", without giving any insight on the real syntax of this value. Actually, finding a working example in the official documentation is not so straightforward as you need to go to the search page and review all the examples given for the EXTRACT function in the XML set of books, until you find the good one...
As a consequence, initially, i thought it would be enough to specify the namespace string, but all i got was
select extractValue(xml_content,'/feed/title','http://www.w3.org/2005/Atom') titleAt this point i was even more puzzled because even if i got this rather misleading error message, i had the feeling that the culprit was an incorrectly specified namespace string, so i set out on the web in pursue of some working example.
from xml_documents
where id = 38;
ORA-31013: Invalid XPATH expression.
After some research i finally encountered an OTN forum posting where this phantom parameter was eventually specified, so i adapted the example for my case:
select extractValue(xml_content,'/feed/title','xmlns="http://www.w3.org/2005/Atom"') titleSo, finally we got it, but notice the peculiar syntax (in green and blue) and the double enclosing quotes around the namespace string.
from xml_documents
where id = 38;
TITLE
------------------------------------
Annals of Oracle's Improbable Errors
At least we can now draw a preliminary conclusion, the namespace string syntax for EXTRACT, EXTRACTVALUE and UPDATEXML is the following:
'xmlns[:prefix]="namespace"'and in the case of a default namespace there is no prefix.
You may wonder, as i did, why you never came across this problem before today.
May be i always dealt with XML documents without a default namespace declaration like those spawned by Oracle Application Express for instance, when you unload a table in XML format.
Such documents do not pose any problems to the EXTRACT or EXTRACTVALUE functions, so you may live well for years ignoring the namespace parameter.
Note also that a tool like XMLSpy does not ask you to specify a default namespace, for the simple reason it is considered a default value, so, when you specify an XPath expression like /feed/title, it assumes that the elements without a prefix belong to the default namespace.
This situation may easily lead to some confusion if you try out such XPath expression in XMLSpy first and then you use it inside an Oracle XML/SQL expression returning nothing.
I can't say if the behavior depends on a strict interpretation of the XML specification, at any rate it's somewhat curious that one is forced to specify a supposedly default value.
11 comments:
Thank you very much! I ran into exactly the same dilemma!
Much appreciated!!!
Hello Paul,
i found another link that shows some puzzling behavior of GETNAMESPACE() method when a default namespace is involved.
Bye,
Flavio
I had the same issue and after reading ur experience i was able to solve it too, great work :)
Cool stuff! It solved my problem!
Thanks. Exactly what I needed. DP.
Hi Flavio,
I am encountering a similar problem .. where the namespace is specified explicitly.
Since we have specified the namespace as ns0 .. even using it as the optional 3rd parameter in extract does not seem to work :-(
This is what I am using - but it returns null ..
select extract(xmltype(contextdata), '/ErrorReport/Class/text()', 'xmlns:ns0="http://www.xyz.com/pe/EngineTypes"')
Ani,
well, this is easily explained, you are not mentioning ns0 anywhere.
I guess your XPath expression should look like this:
select extract(xmltype(contextdata), '/ns0:ErrorReport/Class/text()', 'xmlns:ns0="http://www.xyz.com/pe/EngineTypes"')
or
select extract(xmltype(contextdata), '/ErrorReport/ns0:Class/text()', 'xmlns:ns0="http://www.xyz.com/pe/EngineTypes"')
depending on the position where the namespace appears in the source document.
Flavio
Thanks!
Nice post very helpful
dbakings
Hi Flavio,
Thanks a lot man.
i have been trying to solve this issue from a week or so.
But your post helped a lot.
Thanks once again.
:):):):)
Thanks a lot. this is very helpful. it saved me lots of time.
Post a Comment