The Problem Needs String Hacking

I was writing a PL/SQL function to generate a MERGE sql statement from input data, the name of a table, a list of columns in the input data, and either a list of join columns or if not provided, getting them from the primary key constraint. This is something I’ve done in Perl before relatively easily, and I wanted to turn my hand to doing it in PL/SQL.

My first effort let me down the road of creating a Polymorphic Table Function to parse the CSV data. It was a nice exercise and I produced a workable PTF implementation to parse a clob and generate Oracle column data, but it is too complex for this relatively simple use case.

Next I turned towards doing it all in a set of inline PL/SQL WITH functions where I intended to have it generate the UNION ALL set of rows for the data for the MERGE USING clause. I don’t need to deploy any code in the database that my team may or may not be able to support. It winds up being a script I use to generate deployment code.

Man, what a grind. PL/SQL is very good for database things. It sucks badly for text processing like this. I found myself pining away for Perl operations like split, map and join. I’m not even going to complain about lack of variable interpolation in PL/SQL strings and how you must concatentate everything with hideous syntax because it is what it is. Yet some things we can do something about.

Consider needing to take a nested table of column names and generate the list that goes with the INSERT(…) clause of the MERGE.

    v_s := 'WHEN NOT MATCHED THEN INSERT('||v_arr_cols(1);
    FOR i IN 2..v_arr_cols.COUNT
    LOOP
        v_s := v_s||', '||v_arr_cols(i);
    END LOOP;
    v_s := v_s||')';

Phew, that is ugly. Or how about the MERGE UPDATE clause? We need to write:

WHEN MATCHED THEN UPDATE SET
    t.col1 = q.col1,
    t.col2 = q.col2

I started off writing a custom join function that also took a template string as an argument. It was sort of a meld between the Perl join and map methods.

    FUNCTION join(
        p_arr           sys.ku$_vcnt
        ,p_separator    VARCHAR2 DEFAULT ', '
        ,p_template     VARCHAR2 DEFAULT NULL
    ) RETURN VARCHAR2
    IS
        l_s     VARCHAR2(4000);
        FUNCTION apply_template(
            p_val   VARCHAR2
        )
        RETURN VARCHAR2
        IS
            l_t VARCHAR2(4000);
        BEGIN
            IF p_template IS NULL THEN
                l_t := p_val;
            ELSE 
                l_t := REPLACE(p_template, '__PLACEHOLDER__', p_val);
            END IF;
            RETURN l_t;
        END;
    BEGIN
        IF p_arr.COUNT = 0 THEN
            RETURN NULL;
        END IF;
        l_s := apply_template(p_arr(1));
        FOR i IN 2..p_arr.COUNT
        LOOP
            l_s := l_s||p_separator||apply_template(p_arr(i));
        END LOOP;
        RETURN l_s;
    END -- join
    ;

Now to write the UPDATE portion I have

    v_s := 'WHEN MATCHED THEN UPDATE SET'
        ||join(v_arr_non_pk_cols, p_separator => ',
    '
                ,p_template => 't.__PLACEHOLDER__ = q.__PLACEHOLDER__')
    ;

It did not make me happy though. It was a little too customized.

I also had some utility methods as standalone functions (split_csv and transform_perl_regexp) that really needed a package or user defined type home. I wound up creating a new User Defined Type to hold my Perlish methods. I called it japh_util_udt originally, which comes from the phrase “I’m just another perl hacker.” Most people don’t get that, so I renamed it to perlish_util_udt.

Perlish Utility User Defined Type

You can find it in my plsql_utilities github repository.

From the REAMDE.md in the repository:

perlish_util_udt

It isn’t Perl, but it makes some Perlish things a bit easier in PL/SQL. We also get handy methods for splitting Comma Separated Value (CSV) text into lines and fields, which you can use independent of the Perlish methods, and even one that turns a CSV clob into a private temporary table.

There is valid argument that when you are programming in a language you should use the facilities of that language, and that attempting to layer the techniques of another language upon it is a bad idea. I see the logic and partially agree. I expect those who later must support my work that uses this utility will curse me. Yet PL/SQL really sucks at some string and list related things. This uses valid PL/SQL object techniques to manipulate strings and lists in a way that is familiar to Perl hackers.

A perlish_util_udt object instance holds an arr_varchar2_udt collection attribute which you will use when employing the following member methods;

map
join
sort
get
combine

All member methods except get have static alternatives using arr_varchar2_udt parameters and return types, so you are not forced to use the Object Oriented syntax.

It has static method split_csv (returns arr_varchar2_udt) that formerly lived as a standalone function in the plsql_utilities library as split. We have a static method split_clob_to_lines that returns an arr_varchar2_udt collection of “records” from what is assumed to be a CSV file. It parses for CSV syntax when splitting the lines which means there can be embedded newlines in text fields in a “record”.

There is a static procedure create_ptt_csv that consumes a CLOB containing lines of CSV data and turns it into a private temporary table for your session. The PTT has column names from the first line in the CLOB.

It also has a static method named transform_perl_regexp that has nothing to do with arrays/lists, but is Perlish.

Most of the member methods are chainable which is handy when you are doing a series of operations.

Conclusion

I will work on a full demonstration to generate that deployable MERGE I described in the introduction. Seeing it in real action rather than the contrived cases of the documentation examples will show why I’m excited about it. Maybe PL/SQL can suck a little less for hacking strings and lists.

Lee Lindley

Perl Like Operations in PL/SQL

The Problem Needs String Hacking

Perlish Utility User Defined Type

perlish_util_udt

Conclusion

Related Posts