Friday, September 1, 2017

DYNAMIC_SAMPLING SQL Plan Directive in the cloud or not

With 12, Oracle introduced SQL Plan Directives, which in my view is really giving the optimizer a
chance to learn from its mistakes.   This is a step in the direction of a leaning optimizer.  In a way the optimizer is taking notes on how something ran and if not quite right, this note will help it not make the same mistake again.  It’s limited in 12.1 to just one type DYNAMIC_SAMPLING and in 12.2 another one appears DYNAMIC_SAMPLING_RESULT.  Right now I’m going to focus on the first one.

So when it says that the directive is to do dynamic sampling, is that like doing a table level dynamic sample?  That seems like over kill since the directives will have column names associated with them.  I suspected that they were column based, so I set out to prove this to be true or false.  And the tool I used was the good old 10053 trace, a trace of a hard parse. 

The test bears out that yes it is doing the sampling based on the columns of the directive.  The rest of this is a summary of my test.  If anyone would like to run this test on their own, let me know and I can send you the files to set up the test tables and the like. 

I used the newer DBMS_SQLDIAG.DUMP_TRACE technique to get the 10053 trace.  This is very convenient as I can run the query then ask for a 10053 trace on a given SQL_ID and CHILD_NUMBER.

Here is my test case SQL:
SQL> get jcomp_opt
  1  select /*+ qb_name(opt) */b.object_name, b.object_type, a.username
  2    from allusers_tab a, big_tab b
  3   where a.username = b.owner
  4     and b.object_type = 'PROCEDURE'
  5*    and a.username not in ('SYS','SYSTEM')

ALLUSERS_TAB is quite small with 48 rows and BIG_TAB has 2,422,880 rows.  The “not in” predicate on the BIG_TAB table gives the optimizer some math issues and it over calculates the cardinality for BIG_TAB.  It thinks at first it’s getting about 65,000 rows when in reality it only gets just under 3,000.  Because of this mismatch, after a couple runs I see this for the plan:

SQL> select * from table(dbms_xplan.display_cursor('gnshwskp49773',2, 'allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

SQL_ID  gnshwskp49773, child number 2
-------------------------------------
select /*+ qb_name(opt) */b.object_name, b.object_type, a.username
from allusers_tab a, big_tab b  where a.username = b.owner    and
b.object_type = 'PROCEDURE'    and a.username not in ('SYS','SYSTEM')

Plan hash value: 3435153054

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                 |      1 |        |   2784 |00:00:00.01 |    1773 |
|   1 |  NESTED LOOPS                        |                 |      1 |   1576 |   2784 |00:00:00.01 |    1773 |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| BIG_TAB         |      1 |   1610 |   2784 |00:00:00.01 |    1584 |
|*  3 |    INDEX RANGE SCAN                  | BIG_OBJTYPE_IDX |      1 |   3403 |   3424 |00:00:00.01 |     199 |
|*  4 |   INDEX UNIQUE SCAN                  | USERNAME_PK     |   2784 |      1 |   2784 |00:00:00.01 |     189 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("B"."OWNER"<>'SYS' AND "B"."OWNER"<>'SYSTEM'))
   3 - access("B"."OBJECT_TYPE"='PROCEDURE')
   4 - access("A"."USERNAME"="B"."OWNER")
       filter(("A"."USERNAME"<>'SYS' AND "A"."USERNAME"<>'SYSTEM'))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 2 Sql Plan Directives used for this statement

Key for this test is the note about the use of the Plan Directives.    The directives on the table are:
OBJECT_NAME  DIR_ID               COL_NAME     OBJECT_TYPE  TYPE
------------ -------------------- ------------ ------------ -----------------------
BIG_TAB      11891983782874668880              TABLE        DYNAMIC_SAMPLING_RESULT
BIG_TAB      14819284582793040278 OBJECT_TYPE  COLUMN       DYNAMIC_SAMPLING
BIG_TAB      14819284582793040278              TABLE        DYNAMIC_SAMPLING
BIG_TAB      8668036953221628977  OBJECT_TYPE  COLUMN       DYNAMIC_SAMPLING
BIG_TAB      8668036953221628977  OWNER        COLUMN       DYNAMIC_SAMPLING
BIG_TAB      8668036953221628977               TABLE        DYNAMIC_SAMPLING
BIG_TAB      8700850869231480407               TABLE        DYNAMIC_SAMPLING_RESULT

There are really 4 directives, but only the dynamic sampling ones are of interest for this test.   So what are these dynamic sampling directives really doing?  Well to find out I looked in the 10053 trace of this SQLID and child number and I found three queries being run on the table, one for each first directive, and two on the second:

Plan directive ID 8668036953221628977 has this one (I’ll refer to this one as 977 from now on, the last three digits of the directive ID):
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel  */ NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "B")  */ 1 AS C1 FROM "BIG_TAB" "B" WHERE ("B"."OBJECT_TYPE"='PROCEDURE') AND ("B"."OWNER"<>'SYS') AND ("B"."OWNER"<>'SYSTEM')) innerQuery (objid = 9445728533958271359)

Plan directive ID 14819284582793040278 has these two (I’ll refer to this one as 278):
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel  OPT_ESTIMATE(@"innerQuery", TABLE, "B", ROWS=1610) */ NVL(C1,0), NVL(C2,0), NVL(C3,0) FROM (SELECT /*+ qb_name("innerQuery") INDEX( "B" "BIG_OBJTYPE_IDX")  */ COUNT(*) AS C1, 4294967295 AS C2, COUNT(*) AS C3  FROM "BIG_TAB" "B" WHERE ("B"."OBJECT_TYPE"='PROCEDURE')) innerQuery (objid = 12743291823137504172)

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel  OPT_ESTIMATE(@"innerQuery", TABLE, "B", ROWS=1610) OPT_ESTIMATE(@"innerQuery", INDEX_SCAN, "B", "BIG_OBJTYPE_IDX", ROWS=3424) */ NVL(C1,0), NVL(C2,0), NVL(C3,0) FROM (SELECT /*+ qb_name("innerQuery") INDEX( "B" "BIG_OBJTYPE_IDX")  */ COUNT(*) AS C1, 4294967295 AS C2, COUNT(*) AS C3  FROM "BIG_TAB" "B" WHERE ("B"."OBJECT_TYPE"='PROCEDURE')) innerQuery (objid = 12743291823137504172)

When I ran the query in 977 (after taking out the last bit starting with innerQuery) it’s really just counting up how many rows in BIG_TAB match the given predicates on the two columns.  In this case its 2,784.  This is then used to recalculate the cardinality of BIG_TAB from 65,483 to 1,610 for the full table scan estimate.  This line is in the 10053 trace  after the select doing the count:

Single Tab Card adjusted from 65483.243243 to 1610.000000 due to adaptive dynamic sampling

For 278 it's really about using the index on the OBJECT_TYPE column.  Notice that in these queries (which is done after 977) they both use the 1,610 number as a corrected cardinality with the OPT_ESTIMATE hint in them.  And the second one is also using the value retrieved from the first one (3,424) as a corrected cardinality on the index also with an OPT_ESTIMATE hint.   Clearly the selects associated with the SQL Directives are working together and build off each other.

Both the queries for 278 are doing counts on the table for just the OBJECT_TYPE predicate.  I’m not sure what it does the count twice in both queries as the C1 and C3 columns in the inner query, and 4,294,967,295 literal for C2 is odd as well.  That number only appears in these queries; it’s nowhere else in the trace.  It is the max values for an unsigned 32 bit integer, which is interesting but I’m not sure what that has to do with anything.  For this test, the number both queries come back with is 3,424 for both C1 and C3.  This value is then used for the index cardinality. This line is in the 10053 trace after the two selects doing the counts are done:

Index Card adjusted from 65893.938102 to 3424.000000 due to adaptive dynamic sampling

Maybe the two counts for C1 and C3 can be different in other version of these queries, but here they both are the exact same thing ’COUNT(*)’ so they can’t be different in this version of the query.  I may investigate this more in the future, but my testing this time is done.   

The conclusion of the test is yes, the dynamic sample is really being done on the columns as listed in the directive, also it does just the predicates in the sampling hence it should get very good counts to base the plan on.  The bad news is that this sampling could take some time on really big tables with lots of directives, which is likely why this feature is now turned off by default in 12.2.  Hopefully the great folks out in Redwood Shores will figure out a way to rein this in a bit and make it the great feature it appears to be.