Yesterday i needed to find a way to ensure that a list values entered manually by a user were unique.
If these values came from a table, it would be fairly easy to build a query returning DISTINCT values, but in this case values are pulled in as comma separated values.
This type of problems are suitable for set operations using PL/SQL collections because we are dealing with a limited, although unspecified amount of values that are not permanently stored as individual rows in a table.
PL/SQL comes with an array of operators for manipulating collections as sets and you can find a few basic examples in PL/SQL Language Reference manual. The functional description of each set operator can be found in the SQL Reference instead. The links refer to Oracle version 11gR2 but these functions are available at least since version 10gR1. There is also a nice tutorial by Steven Feuerstein discussing these functions in an old Oracle Magazine issue, so you may want to check it out.
It is worth noting that such set operators work only on PL/SQL nested tables, but not on associative arrays or varying arrays and this makes a difference as we shall see.
One of the most common operations involving collections in Oracle Application Express is to take some delimited string (by commas, semicolons, colons or spaces typically) and convert into a collection or viceversa using function APEX_UTIL.STRING_TO_TABLE (or its counterpart APEX_UTIL.TABLE_TO_STRING). Unfortunately the result of function STRING_TO_TABLE is APEX_APPLICATION_GLOBAL.VC_ARR2, that is an associative array, not a PL/SQL table, which means that an attempt to use the SET function on it will lead to the following run time error:
PLS-00306: wrong number or types of arguments in call to 'SET'This error can be puzzling at first because the difference between a PL/SQL table and an associative array indexed by positive integers is very subtle and the only way to say what kind of collection type is APEX_APPLICATION_GLOBAL.VC_ARR2 is to look at its definition, which can be not so straightforward for an unprivileged user. So, once we are aware of the fact that APEX_APPLICATION_GLOBAL.VC_ARR2 is an associative array, we are basically stuck at the original problem.
How do we move forward?
The options are the following:
- stick to the apex STRING_TO_TABLE function and copy the values one by one into a PL/SQL nested table, using a FOR...LOOP structure.
- write you own STRING_TO_TABLE function and return a PL/SQL table instead.
- find another way of quickly transforming the associative array into a PL/SQL table other than #1.
- normalize the values yourself instead of using SET.
- ?
Option #1 saves me from rewriting a function but it forces me to perform a not so efficient extra loop that it's also requiring extra memory allocation.
Option #2 is probably comparable in terms of memory and speed although i hate to "reinvent the wheel", unless this is absolutely necessary. It gives me also the advantage of further customization, like space trimming and case insensitive or sensitive comparison.
Option #3 cannot be performed with CAST, so i guess there is no easy way of doing it unless we fall back to option #1, but i do not exclude it could be done in another fashion.
Option #4 is probably the worst choice because i need to replace an efficient piece of C code with my own PL/SQL logic based on a search loop.
Option #5 is an open question, if you know a better way of doing this, please let me know.
So, after rewriting my custom STR2TAB function, i could finally leverage the power of SET function and simply return the result in the following manner:
DECLAREIn conclusion, apex function APEX_UTIL.STRING_TO_TABLE does an excellent job when you are taking the values as they have been typed in, but if you need to work on the values as a set, then you'll need to come up with your own custom function.
str varchar2(1000) := 'A,B,C,A,D';
TYPE nested_typ IS TABLE OF VARCHAR2(255);
nt1 nested_typ;
nt2 nested_typ;
PROCEDURE print_nested_table(p_nt nested_typ) IS
output VARCHAR2(128);
BEGIN
IF p_nt IS NULL THEN
DBMS_OUTPUT.PUT_LINE('Results:');
RETURN;
END IF;
IF p_nt.COUNT = 0 THEN
DBMS_OUTPUT.PUT_LINE('Results: empty set');
RETURN;
END IF;
FOR i IN p_nt.FIRST .. p_nt.LAST
LOOP
output := output || p_nt(i) || ' ';
END LOOP;
DBMS_OUTPUT.PUT_LINE('Results: ' || output);
END;
FUNCTION str2tab(
p_text in varchar2,
p_separator in varchar2 default ',')
return nested_typ
as
l_string varchar2(32767) := p_text || p_separator;
l_data nested_typ := nested_typ();
n binary_integer;
begin
loop
exit when l_string is null;
n := instr( l_string, p_separator );
l_data.extend;
l_data(l_data.count) := substr(l_string, 1, n-1);
l_string := substr(l_string, n + length(p_separator));
end loop;
return l_data;
END;
BEGIN
nt1 := str2tab(str);
nt2 := set(nt1);
print_nested_table(nt2);
END;
/
Results: A B C D
See message translations for PLS-00306 and search additional resources.
4 comments:
You could also have used a pipelined function for transforming the string into a table.
Then it is possible to call this function in a select statement.
regards,
Mathieu
Hi,
the function does not really require to be declared as pipelined in order to be callable from within a sql query.
In the example i gave, the same code becomes callable from within SQL if i create nested_typ as a true SQL user defined type and i also create the str2tab function as a standalone function. Said that, the pipelined declaration would give me some benefit if i had to deal with values fetched from a table that might be returned in several batches depending on the access path determined by the optimizer. In the case of an input string that is split in several but not-so-many values everything is going to happen in memory so i don't think that a pipelined function would give me any edge over the traditional approach, at any rate, here is also the pipelined version of the code above:
create or replace
type nested_typ is table of varchar2(255);
/
create or replace
FUNCTION str2tab(
p_text in varchar2,
p_separator in varchar2 default ',')
return nested_typ pipelined
as
l_string varchar2(32767) := p_text || p_separator;
l_data nested_typ := nested_typ();
n binary_integer;
begin
loop
exit when l_string is null;
n := instr( l_string, p_separator );
pipe row (substr(l_string, 1, n-1));
l_string := substr(l_string, n + length(p_separator));
end loop;
return;
END;
/
DECLARE
str varchar2(1000) := 'A,B,C,A,D';
nt1 nested_typ;
nt2 nested_typ;
PROCEDURE print_nested_table(p_nt nested_typ) IS
output VARCHAR2(128);
BEGIN
IF p_nt IS NULL THEN
DBMS_OUTPUT.PUT_LINE('Results: ');
RETURN;
END IF;
IF p_nt.COUNT = 0 THEN
DBMS_OUTPUT.PUT_LINE('Results: empty set');
RETURN;
END IF;
FOR i IN p_nt.FIRST .. p_nt.LAST
LOOP
output := output || p_nt(i) || ' ';
END LOOP;
DBMS_OUTPUT.PUT_LINE('Results: ' || output);
END;
BEGIN
select set(str2tab(str))
into nt2
from dual;
print_nested_table(nt2);
END;
/
Results: A B C D
Hi!
Mathieu, using a pipelined/table function in a select statement causes a context switch. And using a select statement in PL/SQL causes another one. So, when you deal with PL/SQL, Flavio's solution is the best (or one of the best).
And when you deal with SQL (e.g. a select statement), you can split a list string into the items with INSTR and SUBSTR functions and hierarchical query. In 10g+ you can use REGEXP_SUBSTR / REGEXP_REPLACE function instead of SUBSTR+INSTR. And in 11gR2+ you can use recursive subquery factoring instead of CONNECT BY.
Less context switch - less overhead.
Good luck,
Alex.
PS: Sorry, my Russian and Ukrainian are much better than English.
Thanks a lot!!!
Post a Comment