ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00200: could not convert from encoding UTF-8 to UCS2
Error at line 1931
I must admit that when I saw this exception for the first time, I thought it would take a long time to sort out.
But, fortunately, things went much better!
For some reason Oracle's XML parser doesn't like a character in the file. However, line 1931, or whatever number appears in the message you got, doesn't normally match the line number you see when you open the XML file in a text editor like UltraEdit or XMLSpy.
Even if you don't know precisely where the offending character is, it will most likely look like some weird glyph. In my case it was a square, and when I opened the XML file with UltraEdit's hex viewer, I could see it was some kind of junk character whose hex code was FDFF.
I don't really know why Oracle rejected it if the file was meant to be UTF-8; I'll investigate the problem later if I have the time.
I tried to think of a way to recognize this kind of situation up front. With the help of an XSL transformation one can probably get rid of these characters or replace them with other symbols. However, for some reason, my old XMLSpy version seems unable to cope with a translate function containing characters represented by their hex code, like &#xFDFF;, or at least that is how the Evaluate XPath menu function behaves.
My idea was to look for elements matching the following expression:
//elem[contains(translate(.,'&#xFDFF;','¿'),'¿')]
But XMLSpy failed to find any element until I copied and pasted the offending character from the XML file directly in place of &#xFDFF;.
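The same element-level search can also be sketched outside XMLSpy. The Python snippet below is only an illustration under a few assumptions: the document parses as XML, the offending character is U+FFFD (as the update further down concludes), and the element name `elem` and the sample document are made up.

```python
import xml.etree.ElementTree as ET

REPLACEMENT_CHAR = "\ufffd"  # assumed offending character (U+FFFD)

def elements_containing(root, tag="elem", char=REPLACEMENT_CHAR):
    """Return the <tag> elements whose text content contains char."""
    # "".join(el.itertext()) plays the role of "." (the string value) in XPath
    return [el for el in root.iter(tag) if char in "".join(el.itertext())]

# Hypothetical sample document; only the second <elem> carries the character
sample = "<root><elem>fine</elem><elem>bro\ufffdken</elem></root>"
hits = elements_containing(ET.fromstring(sample))
print(len(hits))
```

Because the character is written as a Python escape, there is no need to paste the raw glyph into the expression.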
Later I'll try with a real transformation, and if XMLSpy fails, I'll stick to good ole Saxon.
In the meantime, happy searching!
Updated March 6, 2007
PS: Well, if you are unfamiliar with Unicode, UTF-8, UCS-2 and other character encoding issues, I bet you'll find this article very helpful and also very entertaining!
For instance, I am now finally clear about one of the issues: FDFF must be read the other way around, as FFFD in big-endian order, and it represents the so-called replacement character (U+FFFD), a placeholder for a character whose value is unknown or cannot be represented in Unicode.
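The byte-order reading can be checked with a few lines of Python. This is only an illustration; that the hex viewer was showing a little-endian UTF-16 view of the file is an assumption, not something confirmed above.

```python
ch = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

utf16_le = ch.encode("utf-16-le").hex().upper()  # low byte first
utf16_be = ch.encode("utf-16-be").hex().upper()  # high byte first
utf8 = ch.encode("utf-8").hex().upper()          # bytes as stored in a UTF-8 file

print(utf16_le, utf16_be, utf8)  # FDFF FFFD EFBFBD
```

So the same character appears as FDFF or FFFD depending on byte order, and as the three bytes EF BF BD in a file that really is UTF-8.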
What I am still not clear about is whether Oracle should accept this character or not.
I posted a message in the XML DB forum; let's see if the Oracle gurus come up with an answer.
But, at any rate, now I know exactly what to look for in the files.
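Knowing the character, a file can be scanned for it line by line. The following is a rough Python sketch, not a tested tool: the function name and the sample content are hypothetical, and the line numbers it reports are those a text editor would show, which need not match the one in the Oracle message.

```python
import io

def find_replacement_chars(stream, char="\ufffd"):
    """Yield (line_number, column), both 1-based, for each occurrence of char."""
    for line_no, line in enumerate(stream, start=1):
        col = line.find(char)
        while col != -1:
            yield line_no, col + 1
            col = line.find(char, col + 1)

# Hypothetical in-memory sample. For a real file, something like
# open("data.xml", encoding="utf-8", errors="replace") could be passed instead;
# errors="replace" also turns undecodable bytes into U+FFFD, so they get reported too.
sample = io.StringIO("<root>\n<elem>ok</elem>\n<elem>bad \ufffd here \ufffd</elem>\n</root>\n")
positions = list(find_replacement_chars(sample))
print(positions)  # [(3, 11), (3, 18)]
```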
Updated March 7, 2007
I downloaded Saxon-B version 8.9, "upgraded" my original XSLT from 1.0 to 2.0, and now, before outputting the text nodes of my elements, I replace any unwanted character with a more readable string:
replace( . ,'�','** U+FFFD **')
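For readers without an XSLT 2.0 processor at hand, roughly the same substitution can be sketched in Python. This is not the Saxon stylesheet described above, just an assumed equivalent built on the standard library; the function name and sample document are hypothetical, and attribute values are left untouched.

```python
import xml.etree.ElementTree as ET

MARKER = "** U+FFFD **"

def mark_replacement_chars(root, char="\ufffd", marker=MARKER):
    """Replace char with marker in the text and tail of every element, in place."""
    for el in root.iter():
        if el.text:
            el.text = el.text.replace(char, marker)
        if el.tail:
            el.tail = el.tail.replace(char, marker)
    return root

# Hypothetical sample document
sample = "<root><elem>bro\ufffdken</elem></root>"
cleaned = ET.tostring(mark_replacement_chars(ET.fromstring(sample)), encoding="unicode")
print(cleaned)  # <root><elem>bro** U+FFFD **ken</elem></root>
```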
See message translations for ORA-31011, ORA-19202, LPX-00200 and search additional resources.
ORA-31011: Analisi XML non riuscita
ORA-19202: Errore durante XML processing
ORA-31011: fallo en el análisis de XML
ORA-19202: Se ha producido un error en el procesamiento de XML
ORA-31011: Ha fallat l'anàlisi XML
ORA-19202: S'ha produït un error en el processament XML
ORA-31011: Echec d'analyse XML
ORA-19202: Une erreur s'est produite lors du traitement la fonction XML ()
ORA-31011: XML-Parsing nicht erfolgreich
ORA-19202: Fehler bei XML-Verarbeitung aufgetreten
ORA-31011: Η ανάλυση XML απέτυχε
ORA-19202: Παρουσιάστηκε σφάλμα στην επεξεργασία XML
ORA-31011: XML-analyse fejlede
ORA-19202: Fejl opstod ved XML-behandling
ORA-31011: XML-analys misslyckades
ORA-19202: Ett fel uppstod vid XML-bearbetningen
ORA-31011: XML-analysen mislyktes
ORA-19202: Det oppstod en feil i XML-behandlingen
ORA-31011: XML-jäsennys epäonnistui
ORA-19202: Virhe XML-käsittelyssä
ORA-31011: Az XML-elemzés nem sikerült
ORA-19202: Hiba lépett fel az XML-feldolgozás során:
ORA-31011: Nu s-a reuşit analizarea XML
ORA-19202: Eroare la procesarea XML
ORA-31011: Ontleden van XML is mislukt.
ORA-19202: Fout in XML-verwerking ().
ORA-31011: falha na análise XML
ORA-19202: Ocorreu um erro no processamento XML
ORA-31011: Falha na análise de XML
ORA-19202: Ocorrência de erro no processamento de XML
ORA-31011: сбой разбора XML
ORA-19202: Возникла ошибка при обработке XML
ORA-31011: selhala analýza XML
ORA-19202: Vyskytla se chyba při zpracování XML
ORA-31011: Syntaktická analýza XML zlyhala
ORA-19202: Pri spracovaní XML sa vyskytla chyba
ORA-31011: Niepowodzenie analizy składniowej XML
ORA-19202: Wystąpił błąd podczas przetwarzania XML
ORA-31011: XML ayrıştırılamadı
ORA-19202: XML işlenirken hata ortaya çıktı
7 comments:
I've the same problem when I use updateXML with this character ('éàè...').
Have you found a solution?
thanks
Jacques,
Do you mean the horizontal ellipsis character (the three dots) or any of those accented characters?
I also have a problem with XML and accented letters.
The à should be converted, but it is not.
Anonymous,
the only time I had problems with accented characters in an XML file was when the database character set was not AL32UTF8 (it was WE8ISO8859P1).
At that time every "à" was converted into a double character string.
Also the euro symbol was a major problem until the database was migrated to AL32UTF8.
Does this scenario look like yours?
It's the : character that is giving me a problem. Please let me know if you have a solution.
Thanks
-Pradip (pradipc@gmail.com)
Pradip,
you mean you are getting this error because your xml file contains a standard ASCII "colon" character?
I think so, because in the XML I just said
xslprocessor.selectNodes(xmldom.makeNode(l_doc),'/soapenv:Envelope');
and this gives me the error.