Showing posts with label default namespace declaration. Show all posts
Showing posts with label default namespace declaration. Show all posts

Wednesday, May 07, 2008

Extracting nodes belonging to different namespaces

Always check out the original article at http://www.oraclequirks.com for latest comments, fixes and updates.

As soon as i posted the previous article on the error caused by a missing namespace string parameter in the call to EXTRACT (or EXTRACTVALUE or UPDATEXML), a subliminal question started buzzing in my head.

What happens if i need to extract node(s) whose XPath expression contains elements belonging to different namespaces?
More precisely, how does the namespace parameter look like in such SQL expression?

The answer, fortunately, was found at the first attempt.
Let's take the same document used previously:


select extractValue(
xml_content,
'/feed/openSearch:totalResults',
'xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns="http://www.w3.org/2005/Atom"') val
from xml_documents
where id = 38;

VAL
---
175
As you can see the feed element has a default namespace (no prefix) matched by the blue string and the totalResults element, prefix openSearch, is matched by the green string. Both namespaces appear inside the same string (the famous third parameter of EXTRACT or EXTRACTVALUE), separated by a blank.

I guess one can generalize the namespace syntax as follows:
'xmlns[:prefix]="namespace1" [xmlns[:prefix]="namespace2"] [xmlns[:prefix]=...]'

Specifying the default namespace (sting in purple color in the document above) in the SQL/XML expression is necessary, if you don't do that, you'll get NULL as result:
select extractValue(
xml_content,
'/feed/openSearch:totalResults',
'xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"') val
from xml_documents
where id = 38;

VAL
---
(null)

Oracle's XML SQL functions and the default namespace

Always check out the original article at http://www.oraclequirks.com for latest comments, fixes and updates.

Although I've been developing applications that make extensive use of XML for years, yesterday i was really amazed when i discovered that Oracle functions like EXTRACT and EXTRACTVALUE return a null value when the document being handled includes a default namespace declaration in the topmost element (see the purple string in the document sample below) and you do not explicitly specify this namespace as the third parameter for these functions.

Let's take a sample RSS XML document (many elements omitted for clarity):


select extractValue(xml_content,'/feed/title') title
from xml_documents
where id = 38;

TITLE
------
(null)

How can it be? i can see the title element right there below the feed element!

To make things worse, add up to the picture that the documentation for Oracle (up until 11gR1) is not so clear on this point but you might think that in the end it's just a matter of looking at some examples given in the SQL reference, however you may easily see yourself that the examples for EXTRACT and EXTRACTVALUE do not show the usage of the third parameter and the syntax diagram merely says "namespace string", without giving any insight on the real syntax of this value. Actually, finding a working example in the official documentation is not so straightforward as you need to go to the search page and review all the examples given for the EXTRACT function in the XML set of books, until you find the good one...

As a consequence, initially, i thought it would be enough to specify the namespace string, but all i got was

select extractValue(xml_content,'/feed/title','http://www.w3.org/2005/Atom') title
from xml_documents
where id = 38;

ORA-31013: Invalid XPATH expression.
At this point i was even more puzzled because even if i got this rather misleading error message, i had the feeling that the culprit was an incorrectly specified namespace string, so i set out on the web in pursue of some working example.

After some research i finally encountered an OTN forum posting where this phantom parameter was eventually specified, so i adapted the example for my case:
select extractValue(xml_content,'/feed/title','xmlns="http://www.w3.org/2005/Atom"') title
from xml_documents
where id = 38;

TITLE
------------------------------------
Annals of Oracle's Improbable Errors
So, finally we got it, but notice the peculiar syntax (in green and blue) and the double enclosing quotes around the namespace string.
At least we can now draw a preliminary conclusion, the namespace string syntax for EXTRACT, EXTRACTVALUE and UPDATEXML is the following:
'xmlns[:prefix]="namespace"'
and in the case of a default namespace there is no prefix.

You may wonder, as i did, why you never came across this problem before today.
May be i always dealt with XML documents without a default namespace declaration like those spawned by Oracle Application Express for instance, when you unload a table in XML format.
Such documents do not pose any problems to the EXTRACT or EXTRACTVALUE functions, so you may live well for years ignoring the namespace parameter.

Note also that a tool like XMLSpy does not ask you to specify a default namespace, for the simple reason it is considered a default value, so, when you specify an XPath expression like /feed/title, it assumes that the elements without a prefix belong to the default namespace.
This situation may easily lead to some confusion if you try out such XPath expression in XMLSpy first and then you use it inside an Oracle XML/SQL expression returning nothing.

I can't say if the behavior depends on a strict interpretation of the XML specification, at any rate it's somewhat curious that one is forced to specify a supposedly default value.

yes you can!

Two great ways to help us out with a minimal effort. Click on the Google Plus +1 button above or...
We appreciate your support!

latest articles