Tuesday, October 21, 2014

2014 East Coast Oracle Users Conference



Hey everyone!  I'll be speaking at this conference with a host of other great speakers.  Come on down and join the fun!



2014 East Coast Oracle Users Conference (ECO) 
November 4 &5 
Sheraton Imperial Hotel & Convention Center 
 Raleigh/Durham, NC

Friday, October 3, 2014

CAGE MATCH: IN vs OR


Many of us know that an IN list is turned into a set of OR’ed predicates by the optimizer.  Up until 11 an IN list and an OR list were pretty much the same.  But in 11 there was a subtle shift that changed the game.

This example is from the Hotsos “Optimizing Oracle SQL, Intensive” course.  When originally written this was to demonstrate how ANDs and ORs work with regard to “short circuit logic”.    That is, when a FALSE is found in an AND list it stops and when a TRUE is found in an OR list is stops.  But something odd started to happen in 11:

First we have a little function.  It writes a line to the screen and returns the word “match”.  The line it writes tells us that it ran for a particular row. The data we use for this is equally complex, a table with 2 columns and 2 rows; a string and a number, the number to represent each row.


create or replace function test_po(a number) return varchar2
is
begin
 dbms_output.put_line('function executed for row '||a);
 return 'match';
end;
/
SQL> desc pred_order
 Name          Null?    Type
 ------------- -------- ------------
 COL1                   VARCHAR2(10)
 COL2                   NUMBER
SQL> select * from pred_order;
 COL1             COL2
 ---------- ----------
 aaaaa               1
 match               2

Here’s a query that selects from this table:

select *
from pred_order
where test_po(col2) in ('xxxxx','yyyyy')
and col1 = 'zzzzz'

Now when we execute this plan in 10 or before (and using IO costing), the output would look like below.  Notice that the function was executed twice for each row, and you can see that in the plan the function is shown twice in the OR list that the IN list was converted to.  So that makes sense:

SQL> alter session set "_optimizer_cost_model"=io;
SQL>
SQL> @pred1
function executed for row 1
function executed for row 1
function executed for row 2
function executed for row 2
SQL>
SQL> @hxplan
Enter .sql file name (without extension): pred1
Enter the display level (TYPICAL, ALL, BASIC, SERIAL) [TYPICAL]  :
Plan hash value: 1987733821

----------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |     9 |     2 |
|*  1 |  TABLE ACCESS FULL| PRED_ORDER |     1 |     9 |     2 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("TEST_PO"("COL2")='xxxxx' OR "TEST_PO"("COL2")='yyyyy')
              AND "COL1"='zzzzz')

Note
-----
   - cpu costing is off (consider enabling it)
SQL>

However do the same thing in 11 or 12 and the output different.  The plan stays the same but the function appears to be only run once:

SQL> @pred1
function executed for row 1
function executed for row 2
SQL>
SQL> @hxplan
Enter .sql file name (without extension): pred1
Enter the display level (TYPICAL, ALL, BASIC, SERIAL) [TYPICAL]  :
Plan hash value: 1987733821

----------------------------------------------------------------
| Id  | Operation      | Name       | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |     9 |     2 |
|*  1 |  TABLE ACCESS FULL| PRED_ORDER |     1 |     9 |     2 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("TEST_PO"("COL2")='xxxxx' OR "TEST_PO"("COL2")='yyyyy')
           AND "COL1"='zzzzz')

Note
-----
   - cpu costing is off (consider enabling it)
SQL>

What I’m pretty sure is happening is some sort of “result cache” is being done.  It sees that it’s the same call with the same value being passed in so it either automatically created a result cache or behind the scenes is putting in a bind like value for the second call.  It does make the comparison, I put the word “match” way down the list and is did find it, so it’s does really do the comparison it just doesn’t make the function call over and over again.

That’s cool.  But watch what happens if I manually change the query to ORs instead of the IN list:

SQL> get pred1_new
  1  select *
  2    from pred_order
  3   where (test_po(col2)='xxxxx' or test_po(col2)='yyyyy')
  4*    and col1 = 'zzzzz'
SQL>
SQL>
SQL> @pred1_new
function executed for row 1
function executed for row 1
function executed for row 2
function executed for row 2
SQL>
SQL> @hxplan
Enter .sql file name (without extension): pred1_new
Enter the display level (TYPICAL, ALL, BASIC, SERIAL) [TYPICAL]  :
Plan hash value: 1987733821

----------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |     9 |     2 |
|*  1 |  TABLE ACCESS FULL| PRED_ORDER |     1 |     9 |     2 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("TEST_PO"("COL2")='xxxxx' OR "TEST_PO"("COL2")='yyyyy')
              AND "COL1"='zzzzz')

Note
-----
   - cpu costing is off (consider enabling it)
SQL>

We are right back to it running the function twice!  HA!  So with the IN list it recognizes that it’s the same call over and over, but when I manually separate it out, the optimizer can’t see that is the same call over and over.

*** BTW – If you’re use CPU costing (which you should be), the optimizer moves the AND predicate (COL1= ‘zzzzz’) first and you see nothing.  This is another part of the exercise to see how predicates move around with CPU costing.

So the bottom line?  INs do have an edge over ORs when it comes to a function call in particular.   There is another advantage which this demo doesn’t show which is the IN list will sort the list, this is very useful for index scans with the IN-List Iterator step, maybe another post about that someday. 

Thursday, September 18, 2014

Fun with your WITH!


Here’s a new feature that might have slipped under the radar for you with 12c.  You can now have PL/SQL functions defined in the WITH clause.  First off this brings some cool functionality to your select statements, but even cooler is that your statement will run faster, a lot faster potential. 

Here’s a simple test to show this.  The ORD9 table has 103,120 rows in 1,390 blocks.  The GMT_ORDER_DATE column represents the order data in the number of seconds since midnight 1 January of the year the order was placed. It has no indexes and does have full (100%) stats.  I have a function defined below which returns a date as it relates to the UNIX epoch.  (Why you would really want to do this, I’m not sure, but it’s here mostly to have it do something.)

create or replace function rtn_date_convert (p_unix_gmt in number)
  return date
  as
   v_date  date;
  begin
     v_date := to_date('01/01/1970','mm/dd/yyyy') + (p_unix_gmt/86400) ;
     RETURN v_date;
  end ;
/

Now I have a select which uses this function as such:


select count(*) from ord9 where rtn_date_convert(gmt_order_date) > sysdate - 3650
/

Next I have a query that does the same thing but has the function defined within a with clause:


with function rtn_date_convert2 (p_unix_gmt in number)
  return date
  as
   v_date  date;
  begin
     v_date := to_date('01/01/1970','mm/dd/yyyy') + (p_unix_gmt/86400) ;
     RETURN v_date;
  end ;
select count(*) from ord9
where rtn_date_convert2(gmt_order_date) > sysdate - 3650
/

I run each several times (more then 10).  Each query “does the same thing”, the plans are identical.  They both do a full scan on the ORD9 table then do an aggregation of the data, just as you’d expect.  LIOs and other stats are the same between the plans; however the one with the internal function runs consistently in less than half the time of the external one (this is output from our harness tool which captures stats and trace files about SQL statements):


                                   withfun:    withfun:
TYPE  NAME                         externalfun internalfun DIFFERENCE
----- ---------------------------- ----------- ----------- ----------
Latch cache buffers chains               2,960       2,964         -4
      row cache objects                    157         151          6
      shared pool                           18          17          1
                                                                    
Stats buffer is pinned count                 0           0          0
      consistent gets                    1,326       1,326          0
      db block changes                       0           0          0
      db block gets                          0           0          0
      execute count                          5           5          0
      index fast full scans (full)           0           0          0
      parse count (hard)                     0           0          0
      parse count (total)                    6           6          0
      physical reads                         0           0          0
      physical writes                        0           0          0
      redo size                              0           0          0
      session logical reads              1,326       1,326          0
      session pga memory                     0           0          0
      session pga memory max                 0           0          0
      session uga memory                     0           0          0
      session uga memory max                 0           0          0
      sorts (disk)                           0           0          0
      sorts (memory)                         0           0          0
      sorts (rows)                           0           0          0
      table fetch by rowid                   0           0          0
      table scan blocks gotten           1,316       1,316          0
      table scans (long tables)              0           0          0
      table scans (short tables)             1           1          0
                                                                     
Time  elapsed time (centiseconds)           63          29         34

withfun:externalfun
STAT #565587040 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=1326 pr=0 pw=0 time=621435 us)'
STAT #565587040 id=2 cnt=103120 pid=1 pos=1 obj=102779 op='TABLE ACCESS FULL ORD9 (cr=1326 pr=0 pw=0 time=608109 us cost=282 size=824960 card=103120)'


withfun:internalfun
STAT #407419720 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=1326 pr=0 pw=0 time=284779 us)'
STAT #407419720 id=2 cnt=103120 pid=1 pos=1 obj=102779 op='TABLE ACCESS FULL ORD9 (cr=1326 pr=0 pw=0 time=272208 us cost=282 size=824960 card=103120)'

 These tests were run on a Windows 12.1.0.1 database, which interestingly had optimizer_features_enable set to 11.2.0.2.  

Friday, September 12, 2014

NO_PUSH_PRED – the non-hint


I doubt this will be a big deal to most folks, just an interesting thing to know.   It certainly appears that the “NO_PUSH_PRED” hint does nothing.  I’ve tried every way I can think of to get this hint to work and no luck.  This test was done on a 12.1.0.1 database but I’ve done this on 11 (and I think 10 also) and can’t get this hint to do anything.  Of course I don’t know when or why someone wouldn’t want a predicate pushed, I can’t think of a good case that it shouldn’t be if it can be.  Sure there are lots of reasons why it shouldn’t be, the overall reason is that it would change the results of the view, which would be very bad.  The optimizer wouldn’t do that anyway so I don’t think we have to worry about that.
I have a simple example:

SQL> create table t1 as
  2  select * from all_users ;
Table created.
SQL> create table t2 as
  2  select * from all_users ;
Table created.
SQL> create or replace view t_view1 as
  2  select t1.user_id, t2.username
  3    from t1, t2
  4   where t1.user_id = t2.user_id;
View created.

Now selecting from the view t_view1 as such:

select /*+ NO_QUERY_TRANSFORMATION */ * from t_view1 v where username = 'OP'

I get this plan, I’m using the NO_QUERY_TRANSFORMATION  hint to keep it from doing view merging which it would do without the hint.  It’s easier to see the pushing for the predicate this way.  You can see that the predicate got pushed into the view resolution. 

Plan hash value: 1448425336

-------------------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |     1 |    79 |     5  (20)| 00:00:01 |
|   1 |  VIEW               | T_VIEW1 |     1 |    79 |     5  (20)| 00:00:01 |
|*  2 |   HASH JOIN         |         |     1 |    17 |     5  (20)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T2      |     1 |    13 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1      |    45 |   180 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."USER_ID"="T2"."USER_ID")
   3 - filter("T2"."USERNAME"='OP')
SQL>

Now I’ll change the query to this:

select /*+ NO_QUERY_TRANSFORMATION NO_PUSH_PRED(v)*/ * from t_view1 v where username = 'OP'

and I get this plan (which is the same one):

Plan hash value: 1448425336

-------------------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |     1 |    79 |     5  (20)| 00:00:01 |
|   1 |  VIEW               | T_VIEW1 |     1 |    79 |     5  (20)| 00:00:01 |
|*  2 |   HASH JOIN         |         |     1 |    17 |     5  (20)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T2      |     1 |    13 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1      |    45 |   180 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."USER_ID"="T2"."USER_ID")
   3 - filter("T2"."USERNAME"='OP')

I even looked at the 10053 trace to see if there was a clue there, nothing.  The hint is there in the plan but nothing like “hint ignored” like I saw with the USE_CONCATE hint.   I tried the table name, I tried using query block names, I tried putting the hint in the view, I tried the hint without a table alias or name, all end up with the same; the predicate gets pushed.

And just for kicks I looked up the example for this hint in the docs (Oracle® Database SQL Language Reference) and this is the query:

SELECT /*+ NO_MERGE(v) NO_PUSH_PRED(v) */ *

  FROM employees e,

       (SELECT manager_id

          FROM employees) v

  WHERE e.manager_id = v.manager_id(+)

    AND e.employee_id = 100;

And guess what?  The plan is the same for that query with or without the hint!  It doesn’t push the predicate for either query. And I’m pretty sure that is because of the outer join between the main query and the inline view. 

The only way I can get it to not push the predicate is to define the view with rownum in it like this, notice in the plan this time the predicate is NOT pushed into the view, and is evaluated after the view is resolved:

create or replace view t_view2 as
  select rownum rno, t1.user_id, t2.username
    from t1, t2
    where t1.user_id = t2.user_id;

select /*+ NO_QUERY_TRANSFORMATION */ * from t_view2 v where username = 'OP'

Plan hash value: 3187783690

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |    45 |  4140 |     5  (20)| 00:00:01 |
|*  1 |  VIEW                | T_VIEW2 |    45 |  4140 |     5  (20)| 00:00:01 |
|   2 |   COUNT              |         |       |       |            |          |
|*  3 |    HASH JOIN         |         |    45 |   765 |     5  (20)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T1      |    45 |   180 |     2   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| T2      |    45 |   585 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("USERNAME"='OP')
   3 - access("T1"."USER_ID"="T2"."USER_ID")

So if anyone can show me a repeatable example of using the NO_PUSH_PRED hint and it working, I’d love to see it!