Friday, August 20, 2021

Inserting into multiple tables in the cloud or not.

 


There is more to the INSERT command than most folks are aware.  

 

My buddy Dan Morgan (The man behind Morgan’s Library) remined me recently of this.  For example, you can use one statement to insert into multiple tables at once. 

 

 

The basic syntax looks like this:

 

 

INSERT

WHEN (<condition>) THEN

INTO <table_name> (<column_list>)

VALUES (<values_list>) 

WHEN (<condition>) THEN

INTO <table_name> (<column_list>)

VALUES (<values_list>)

ELSE

INTO <table_name> (<column_list>)

VALUES (<values_list>)

SELECT <column_list> FROM <table_name>;

 

 

You can have many WHEN conditions.  This syntax only would insert into the first table where condition is true.   An advantage of the one statement is the ELSE part.  This is nice to put stuff in a table that just doesn’t match the other conditions.  This could be tricky with multiple statements.  For my test below, it was simple, the DEPNO is one of four values, 10,20, 30 or NULL for the table.  Hence it was easy to come up with four statements to get all the rows into the four separate tables. I think you can image cases where this wouldn’t be as straight forward.

 

 

If you add ALL after the word INSERT, it would insert into any of the tables where the condition is true.  With the ALL version you could have no WHEN conditions at all and it would insert into all the tables in the statement.  This could be useful when you are moving a row from a wide table (many columns) and you are breaking that up into several narrow tables.  For example, you have a table that is not normalized and you want to normalize it into a set of tables.

 

 

This is super cool of course, but as a performance guy, how does it perform?  I set up a test to see at least in one case how it did. This test takes the good old EMP table and populates 4 other tables based on DEPTNO, I do this 100,000 times to have a reasonable amount of activity. (The code at the bottom if you’d like to try it yourself.)  The results are below from both a 21c super cool autonomous database and from a 12c database on one of my old windows laptop. 

 

 

As I do, each test was run multiple times to weed out any noise to get a good idea of what is happening.  In this test I ran the script with the SETUP part once, then commented the SETUP lines and reran it multiple times, this way the tables are already there and have been inserted into to weed out any noise with the high-water mark and such.  The results below are representative of these runs after the first one. 

 

 

For the 21c Database:

 

 

......................................................

Inserting into 4 tables with one command 100,000 times

......................................................

In hundredths of a second

**** TIME - 3133

**** CPU  - 3105

**** LIO  - 619022

 

 

......................................................

Inserting into 4 tables with four commands 100,000 times

......................................................

In hundredths of a second

**** TIME - 6874

**** CPU  - 6769

**** LIO  - 1230158

 

 

For the 12c Database:

 

 

......................................................

Inserting into 4 tables with one command 100,000 times

......................................................

In hundredths of a second

**** TIME - 1622

**** CPU  - 1111

**** LIO  - 707520

 

 

......................................................

Inserting into 4 tables with four commands 100,000 times

......................................................

In hundredths of a second

**** TIME - 2784

**** CPU  - 2627

**** LIO  - 1308283

 

 

Over all it took a bit more than twice as long to do the task with 4 separate statement rather than one, elapsed time and CPU time.  Not unexpected, I would think that one statement should perform better then several.  But it wasn’t 4 times as long as one might have thought, only about twice as long.   LIOs were just a bit less than twice as many when doing 4 statements rather than one. 

 

 

What does this mean?  It’s likely better to use one statement rather than multiple if you can.  And it might be slightly easier to maintain this code over time.  It does require a slightly different way of thinking about your inserts of course. 

 

 

Code for the test below.  Make sure you use the correct statistic number for the consistent gets statistic.  Its 139 for 12c and earlier, 209 for 21c. 

 

 

-- setup

drop table emp_10 purge;

drop table emp_20 purge;

drop table emp_30 purge;

drop table emp_xx purge;

create table emp_10 as select * from emp where 1=0;

create table emp_20 as select * from emp where 1=0;

create table emp_30 as select * from emp where 1=0;

create table emp_xx as select * from emp where 1=0;

-- end setup 

 

set serveroutput on;

 

 

declare

    x1           varchar2(20);

    l_start_time pls_integer;

    l_start_cpu  pls_integer;

    l_start_cr   pls_integer :=0;

    l_end_cr     pls_integer :=0;

begin

    --the consistent gets statistic is #139 in 12c and #209 in 21c 

    --select value into l_start_cr from v$mystat where STATISTIC# = 139;

    select value into l_start_cr from v$mystat where STATISTIC# = 209;

    l_start_time := DBMS_UTILITY.GET_TIME;

    l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;

    for ii in 1 .. 100000 loop

        INSERT

            WHEN (deptno=10) THEN

            INTO emp_10 (empno,ename,job,mgr,sal,deptno)

            VALUES (empno,ename,job,mgr,sal,deptno) 

            WHEN (deptno=20) THEN

            INTO emp_20 (empno,ename,job,mgr,sal,deptno)

            VALUES (empno,ename,job,mgr,sal,deptno) 

            WHEN (deptno=30) THEN

            INTO emp_30 (empno,ename,job,mgr,sal,deptno)

            VALUES (empno,ename,job,mgr,sal,deptno)

            ELSE

            INTO emp_xx (empno,ename,job,mgr,sal,deptno) 

            VALUES (empno,ename,job,mgr,sal,deptno)

            SELECT * FROM emp;

    end loop;

    DBMS_OUTPUT.put_line ('......................................................');

    DBMS_OUTPUT.put_line ('Inserting into 4 tables with one command 100,000 times');

    DBMS_OUTPUT.put_line ('......................................................');

    DBMS_OUTPUT.put_line ('In hundredths of a second');

    DBMS_OUTPUT.put_line ('**** TIME - '||to_char(DBMS_UTILITY.GET_TIME - l_start_time));

    DBMS_OUTPUT.put_line ('**** CPU  - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));

    --select value into l_end_cr from v$mystat where STATISTIC# = 139;

    select value into l_end_cr from v$mystat where STATISTIC# = 209;

    DBMS_OUTPUT.put_line ('**** LIO  - '||to_char( l_end_cr - l_start_cr));

end;

/

-- delete data from tables 

delete from emp_10;

delete from emp_20;

delete from emp_30;

delete from emp_xx;

 

declare

    x1           varchar2(20);

    l_start_time pls_integer;

    l_start_cpu  pls_integer;

    l_start_cr   pls_integer :=0;

    l_end_cr     pls_integer :=0;

begin

    --the consistent gets statistic is #139 in 12c and #209 in 21c 

    --select value into l_start_cr from v$mystat where STATISTIC# = 139;

    select value into l_start_cr from v$mystat where STATISTIC# = 209;

    l_start_time := DBMS_UTILITY.GET_TIME;

    l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;

    for ii in 1 .. 100000 loop

        INSERT

            INTO emp_10 (empno,ename,job,mgr,sal,deptno)

            SELECT empno,ename,job,mgr,sal,deptno FROM emp where deptno =10;

        INSERT

            INTO emp_20 (empno,ename,job,mgr,sal,deptno)

            SELECT empno,ename,job,mgr,sal,deptno FROM emp where deptno =20;

        INSERT

            INTO emp_30 (empno,ename,job,mgr,sal,deptno)

            SELECT empno,ename,job,mgr,sal,deptno FROM emp where deptno =30;

        INSERT

            INTO emp_xx (empno,ename,job,mgr,sal,deptno) 

            SELECT empno,ename,job,mgr,sal,deptno FROM emp where deptno is null;

    end loop;

    DBMS_OUTPUT.put_line ('......................................................');

    DBMS_OUTPUT.put_line ('Inserting into 4 tables with four commands 100,000 times');

    DBMS_OUTPUT.put_line ('......................................................');

    DBMS_OUTPUT.put_line ('In hundredths of a second');

    DBMS_OUTPUT.put_line ('**** TIME - '||to_char(DBMS_UTILITY.GET_TIME - l_start_time));

    DBMS_OUTPUT.put_line ('**** CPU  - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));

    --select value into l_end_cr from v$mystat where STATISTIC# = 139;

    select value into l_end_cr from v$mystat where STATISTIC# = 209;

    DBMS_OUTPUT.put_line ('**** LIO  - '||to_char( l_end_cr - l_start_cr));

end;

/

 

-- delete data from tables 

delete from emp_10;

delete from emp_20;

delete from emp_30;

delete from emp_xx;