Annals of Oracle's Improbable Errors: Handling XLIFF documents generated by Oracle APEX

Tuesday, October 31, 2023

Handling XLIFF documents generated by Oracle APEX

XLIFF files have been around since the first version of Oracle APEX as I remember.

Even if you are not interested in XLIFF files, you may find interesting the code below as I am showcasing the usage of many SQL/XML functions.

Recently I wanted to spare some time while translating some of these files, so I decided to brush up my Oracle XMLDB skills that have been put on hold in the last years owing to the increasing popularity of JSON.

The goal is to take a XLIFF file exported from an APEX application and turn it into relational data for easier handling, then take the data, presumably modified for some reason, and convert them back into a XLIFF file that can be applied to the original application.

This code was created on Oracle 19c.

-- XLIFF structure
/*
<?xml version="1.0" encoding="UTF-8"?>
<!-- 
  ****************** 
  ** Source     :  100
  ** Source Lang:  en
  ** Target     :  108
  ** Target Lang:  it
  ** Filename:     f100_108_en_it.xlf
  ** Generated By: TOOLS_WKS_ADMIN
  ** Date:         26-OCT-2023 15:55:30
  ****************** 
 -->
<xliff version="1.0">
<file original="f100_108_en_it.xlf" source-language="en" target-language="it" datatype="html">
<header></header>
<body>
<trans-unit id="S-5-1-100">
<source>Welcome</source>
<target>Benvenuto</target>
</trans-unit>
...
<trans-unit id="S-458-4595199304859029-100">
<source>Username</source>
<target>Nome utente</target>
</trans-unit>
</body>
</file>
</xliff>
*/


--------------------------------------------------------
--  DDL for Table XLIFF_EXPORT_FILES
--
--  This table holds the XLIFF documents stored as XMLType
--  Created by Byte64 2023/10/27
--------------------------------------------------------

  CREATE TABLE "XLIFF_EXPORT_FILES" 
   (	"ID" NUMBER(*,0) GENERATED ALWAYS AS IDENTITY ORDER NOT NULL, 
	"FILENAME" VARCHAR2(128) NOT NULL, 
	"DATE_INS" DATE DEFAULT SYSDATE, 
	"XLIFF" "XMLTYPE" NOT NULL,
  CONSTRAINT "XLIFF_EXPORT_FILES_PK" PRIMARY KEY ("ID") USING INDEX  ENABLE
   );


--------------------------------------------------------
--  DDL for View V_XLIFF_RELATIONAL
--
--  This view transforms XML into relational data
--  Created by Byte64 2023/10/27
--------------------------------------------------------

  CREATE OR REPLACE FORCE EDITIONABLE VIEW "V_XLIFF_RELATIONAL" ("ID", "ITEM_ID", "SOURCE_LANGUAGE", "TARGET_LANGUAGE", "TRANS_UNIT_ID", "SOURCE_TEXT", "TARGET_TEXT", "DATE_INS") AS 
  select x.id, rownum as item_id, tu.source_language, tu.target_language, t."TRANS_UNIT_ID",t."SOURCE_TEXT",t."TARGET_TEXT", x.date_ins
    from xliff_export_files x,
     xmltable('/xliff/file' passing x.xliff
       columns
         source_language  varchar2(30) path '@source-language',
         target_language  varchar2(30) path '@target-language',
         trans_unit       xmltype      path 'body/trans-unit'
     ) tu,
     xmltable('/trans-unit' passing tu.trans_unit
       columns
         trans_unit_id varchar2(128)  path '@id',
         source_text   varchar2(4000) path 'source',
         target_text   varchar2(4000) path 'target'
     ) t
;     
--------------------------------------------------------
--  DDL for Table XLIFF_RELATIONAL
--
--  This table will contain the data extracted with
--  view V_XLIFF_RELATIONAL
--  Created by Byte64 2023/10/27
--------------------------------------------------------

  CREATE TABLE "XLIFF_RELATIONAL" 
   (	"ID" NUMBER(*,0) NOT NULL, 
	"ITEM_ID" NUMBER(*,0) NOT NULL, 
	"SOURCE_LANGUAGE" VARCHAR2(30) NOT NULL, 
	"TARGET_LANGUAGE" VARCHAR2(30) NOT NULL, 
	"TRANS_UNIT_ID" VARCHAR2(128) NOT NULL, 
	"SOURCE_TEXT" VARCHAR2(4000) NOT NULL, 
	"TARGET_TEXT" VARCHAR2(4000) NOT NULL, 
	"DATE_INS" DATE DEFAULT SYSDATE
   );

--------------------------------------------------------
--  DDL for View V_XLIFF_RELATIONAL_DATA
--
--  This view ensures that the elements are pulled out in
--  the same original order, which could make it easier
--  to spot problems by comparing the original XLIFF
--  with the new one
--------------------------------------------------------

CREATE OR REPLACE FORCE EDITIONABLE VIEW "V_XLIFF_RELATIONAL_DATA" ("ID", "ITEM_ID", "SOURCE_LANGUAGE", "TARGET_LANGUAGE", "TRANS_UNIT_ID", "SOURCE_TEXT", "TARGET_TEXT", "DATE_INS") AS 
  select "ID","ITEM_ID","SOURCE_LANGUAGE","TARGET_LANGUAGE","TRANS_UNIT_ID","SOURCE_TEXT","TARGET_TEXT","DATE_INS" 
from xliff_relational
order by id, item_id
;


--------------------------------------------------------
--  DDL for View V_XLIFF_OUTPUT_FILES
--
--  This view extracts the relational data as XLIFF files 
--  one BLOB for each ID
--  Created by Byte64 2023/10/27
--------------------------------------------------------

CREATE OR REPLACE FORCE EDITIONABLE VIEW "V_XLIFF_OUTPUT_FILES" ("ID", "XLIFF") AS 
  select 
  id,
  xmlserialize(document
    xmlconcat(
      xmlcomment('generated externally'),
      xmlelement("xliff", xmlattributes('1.0' as "version"),
        xmlelement("file", xmlattributes(f.filename as "original", 
                                         (select x1.source_language 
                                            from xliff_relational x1
                                           where x1.id = f.id
                                          fetch first 1 row only) as "source-language", 
                                         (select x1.target_language 
                                            from xliff_relational x1 
                                           where x1.id = f.id
                                          fetch first 1 row only) as "target-language"
                                         ),
          xmlelement("header", null),
          xmlelement("body", (select xmlagg(
                                        xmlelement("trans-unit", xmlattributes(x.trans_unit_id as "id"),
                                          xmlforest(x.source_text as "source", x.target_text as "target")
                                        )
                                     )  from v_xliff_relational_data x
                                       where x.id = f.id
                             )
                    )
        )
      )
    ) as blob
  encoding 'UTF-8'
  version '1.0'    -- this will generate the XML header automatically
  indent size = 2  -- optional indentation for pretty-print
  ) as xliff
  from xliff_export_files f
;

Workflow:

seed the application translation;
export the XLIFF file generated by APEX;
load the XLIFF file into XLIFF_EXPORTED_FILES;
populate XLIFF_RELATIONAL selecting from V_XLIFF_RELATIONAL with the ID created during the previous step;
do some work on the data;
download the resulting file querying V_XLIFF_OUTPUT_FILES by ID;
load the translation back into the Apex Builder;
apply the translation to the application;
publish the application.

Steps 3-6 could be done with the help of a simple APEX application enabling people who are not allowed to work directly in the Application Builder to perform the translation work.

Besides the normal translation work, having the data in a custom table allows to:

leverage existing translations by matching the source text, especially in the case of simple strings like button names, item labels taking the translations from another application;
perform quality checks searching for similar text and make them consistent throughout the whole application, i.e. verify that buttons, item labels, region titles have a consistent format, and so on.
clone a language with minor linguistic variations starting from a complete translation, i.e. English (US) vs English (CA) or Spanish (ES) vs Spanish (MX) and so on.

The XLIFF file generated by the view V_XLIFF_OUTPUT_FILES are pretty printed with an indentation value of 2 spaces, this will cause the file to be somewhat larger than the original even if no change has been made to the target elements. The clause INDENT SIZE = 2 can be removed to keep the file size smaller.

Caveat emptor!

Annals of Oracle's Improbable Errors

Tuesday, October 31, 2023

Handling XLIFF documents generated by Oracle APEX

No comments:

yes you can!

latest articles