Friday, September 28, 2018

One row to rule them all in the cloud or not


Looking for a single row out of a table it’s always better to use an index and a unique is best.  Yes even for a small table.  As a test of this I have a table that has 10 rows in it and just two columns, a number and a sting of 20 random characters.   I’ll run a select in a loop that gets each row one at a time using the number column which will be unique, 1 to 10.  I’ll do this 100,000 times and capture both elapsed time and CPU time.  Also I’ll capture the number for consistent gets (LIOs) done for each. 

The first go around it will be without an index, with an index, with a unique index and last with a unique covering index.  Here is how it stacked up in my test (I ran the test multiple times and the results were consistent):

......................................................
selecting all 10 rows without an index 100000 times
......................................................
In hundredths of a second
**** TIME - 5669
**** CPU  - 5629
**** LIO  - 7000020

......................................................
selecting all 10 rows with an index 100000 times
......................................................
In hundredths of a second
**** TIME - 4883
**** CPU  - 4852
**** LIO  - 2000011

......................................................
selecting all 10 rows with a unique index 100000 times
......................................................
In hundredths of a second
**** TIME - 4556
**** CPU  - 4536
**** LIO  - 2000004

......................................................
selecting all 10 rows with a unique covering index 100000 times
......................................................
In hundredths of a second
**** TIME - 4516
**** CPU  - 4482
**** LIO  - 1000014

Well there it is.  Over all everything got better as the test went from the no index to index and then to unique index.   The full table scan used for the first query did about seven LIOs per execution, it’s a five block table plus the header block which gets visited twice each scan.  The data of the table fits in one block, but the high water mark is at five.  A full table scan always scans to the high water mark, so we get five LIOs for the table scan.

For the first two index scans there are about two LIOs per execution, one in the index and one in the table.   The entire index is in one index block, which gets the ROWID and then the scan into the table goes to just the one block where the data is.  

The LIOs are not really that much different when it used the unique index but it was faster, why?  A key here is the latching protocol used.   With a unique index oracle is able to use a lighter weight latching protocol  (consistent gets examination) when accessing the index, this ends up taking less time than just using a non-unique index which uses the normal latching protocol (consistent gets). 

The last query used a covering index; this index had both columns of the table in it.  The time is better, not by much, but it’s now down to just one LIO per access.  This is because it just goes to the index and gets everything and never even goes to the table.   By only doing one LIO is really best, this has the best scalability and the lest possible latch contention.   (Also you could use an IOT (Index Organized Table) which would be the same.)

So it is very true that looking up a single row is really faster with an index and best with a unique index even on a 10 row table.  It might not seem like much but doing it thousands or millions of time, the saving will start to make a difference.

Note:  There are some “extra” LIOs when the queries run, most of which are likely caused by any parsing going on.
 
Here is the code for the test if you’d like to try it yourself:

rem testing to get one row out of a table.
rem SEP2018  RVD
rem
set echo on feedback on serveroutput on
drop table iamsmall purge;
create table iamsmall (id_col number, data_col varchar(20));
begin
 for x in 1..10 loop
   insert into iamsmall values (x, (dbms_random.string('A',20)));
 end loop;
end;
/

declare
    x1           varchar2(20);
    l_start_time pls_integer;
    l_start_cpu  pls_integer;
    l_start_cr   pls_integer :=0;
    l_end_cr     pls_integer :=0;
begin
    select value into l_start_cr from v$mystat where STATISTIC# = 139;
    l_start_time := DBMS_UTILITY.GET_TIME;
    l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;
    for ii in 1 .. 100000 loop
     for i in 1 .. 10 loop
        select data_col into x1 from iamsmall where id_col = i;
     end loop;
    end loop;
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('selecting all 10 rows without an index 100000 times');
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('In hundredths of a second');
    DBMS_OUTPUT.put_line ('**** TIME - '||to_char(DBMS_UTILITY.GET_TIME - l_start_time));
    DBMS_OUTPUT.put_line ('**** CPU  - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));
    select value into l_end_cr from v$mystat where STATISTIC# = 139;
    DBMS_OUTPUT.put_line ('**** LIO  - '||to_char( l_end_cr - l_start_cr));
end;
/

rem create a nonunique index on id_col
create index ias_id on iamsmall(id_col);
declare
    x1           varchar2(20);
    l_start_time pls_integer;
    l_start_cpu  pls_integer;
    l_start_cr   pls_integer :=0;
    l_end_cr     pls_integer :=0;
begin
    select value into l_start_cr from v$mystat where STATISTIC# = 139;
    l_start_time := DBMS_UTILITY.GET_TIME;
    l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;
    for ii in 1 .. 100000 loop
     for i in 1 .. 10 loop
        select data_col into x1 from iamsmall where id_col = i;
     end loop;
    end loop;
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('selecting all 10 rows with an index 100000 times');
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('In hundredths of a second');
    DBMS_OUTPUT.put_line ('**** TIME - '||to_char(DBMS_UTILITY.GET_TIME - l_start_time));
    DBMS_OUTPUT.put_line ('**** CPU  - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));
    select value into l_end_cr from v$mystat where STATISTIC# = 139;
    DBMS_OUTPUT.put_line ('**** LIO  - '||to_char( l_end_cr - l_start_cr));
end;
/

rem create a unique index on id_col
drop index ias_id;
create unique index ias_id on iamsmall(id_col);
declare
    x1           varchar2(20);
    l_start_time pls_integer;
    l_start_cpu  pls_integer;
    l_start_cr   pls_integer :=0;
    l_end_cr     pls_integer :=0;
begin
    select value into l_start_cr from v$mystat where STATISTIC# = 139;
    l_start_time := DBMS_UTILITY.GET_TIME;
    l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;
    for ii in 1 .. 100000 loop
     for i in 1 .. 10 loop
        select data_col into x1 from iamsmall where id_col = i;
     end loop;
    end loop;
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('selecting all 10 rows with a unique index 100000 times');
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('In hundredths of a second');
    DBMS_OUTPUT.put_line ('**** TIME - '||to_char(DBMS_UTILITY.GET_TIME - l_start_time));
    DBMS_OUTPUT.put_line ('**** CPU  - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));
    select value into l_end_cr from v$mystat where STATISTIC# = 139;
    DBMS_OUTPUT.put_line ('**** LIO  - '||to_char( l_end_cr - l_start_cr));
end;
/


rem create a unique covering index on id_col,data_col

drop index ias_id;
create unique index ias_id on iamsmall(id_col, data_col);
declare
    x1           varchar2(20);
    l_start_time pls_integer;
    l_start_cpu  pls_integer;
    l_start_cr   pls_integer :=0;
    l_end_cr     pls_integer :=0;
begin
    select value into l_start_cr from v$mystat where STATISTIC# = 139;
    l_start_time := DBMS_UTILITY.GET_TIME;
    l_start_cpu  := DBMS_UTILITY.GET_CPU_TIME;
    for ii in 1 .. 100000 loop
     for i in 1 .. 10 loop
        select data_col into x1 from iamsmall where id_col = i;
     end loop;
    end loop;
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('selecting all 10 rows with a unique covering index 100000 times');
    DBMS_OUTPUT.put_line ('......................................................');
    DBMS_OUTPUT.put_line ('In hundredths of a second');
    DBMS_OUTPUT.put_line ('**** TIME - '||to_char(DBMS_UTILITY.GET_TIME - l_start_time));
    DBMS_OUTPUT.put_line ('**** CPU  - '||to_char(DBMS_UTILITY.GET_CPU_TIME - l_start_cpu));
    select value into l_end_cr from v$mystat where STATISTIC# = 139;
    DBMS_OUTPUT.put_line ('**** LIO  - '||to_char( l_end_cr - l_start_cr));
end;
/

2 comments:

  1. Might want to add a comment about the entirely underutilized single table has cluster....?

    ReplyDelete
  2. Roger that. I think that might call for another post. Coming soon..... :-)

    ReplyDelete