chance to learn from its mistakes. This is a step in the direction of a leaning optimizer. In a way the optimizer is taking notes on how something ran and if not quite right, this note will help it not make the same mistake again. It’s limited in 12.1 to just one type DYNAMIC_SAMPLING and in 12.2 another one appears DYNAMIC_SAMPLING_RESULT. Right now I’m going to focus on the first one.
So when it says that the directive is to do dynamic
sampling, is that like doing a table level dynamic sample? That seems like over kill since the
directives will have column names associated with them. I suspected that they were column based, so I
set out to prove this to be true or false.
And the tool I used was the good old 10053 trace, a trace of a hard
parse.
The test bears out that yes it is doing the sampling based
on the columns of the directive. The
rest of this is a summary of my test. If
anyone would like to run this test on their own, let me know and I can send you
the files to set up the test tables and the like.
I used the newer DBMS_SQLDIAG.DUMP_TRACE technique
to get the 10053 trace. This is very convenient
as I can run the query then ask for a 10053 trace on a given SQL_ID
and CHILD_NUMBER.
Here is my test case SQL:
SQL> get
jcomp_opt
1
select /*+ qb_name(opt) */b.object_name, b.object_type, a.username
2
from allusers_tab a, big_tab b
3
where a.username = b.owner
4
and b.object_type = 'PROCEDURE'
5*
and a.username not in ('SYS','SYSTEM')
ALLUSERS_TAB is quite small with
48 rows and BIG_TAB has 2,422,880 rows. The “not in” predicate on the BIG_TAB
table gives the optimizer some math issues and it over calculates the cardinality
for BIG_TAB. It thinks at first it’s getting about 65,000
rows when in reality it only gets just under 3,000. Because of this mismatch, after a couple runs
I see this for the plan:
SQL>
select * from table(dbms_xplan.display_cursor('gnshwskp49773',2, 'allstats
last'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID gnshwskp49773, child number 2
-------------------------------------
select /*+
qb_name(opt) */b.object_name, b.object_type, a.username
from
allusers_tab a, big_tab b where
a.username = b.owner and
b.object_type
= 'PROCEDURE' and a.username not in
('SYS','SYSTEM')
Plan hash
value: 3435153054
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 2784 |00:00:00.01 | 1773 |
| 1 |
NESTED LOOPS
| | 1 | 1576 |
2784 |00:00:00.01 | 1773 |
|* 2 |
TABLE ACCESS BY INDEX ROWID BATCHED| BIG_TAB |
1 | 1610 | 2784 |00:00:00.01 | 1584 |
|* 3 |
INDEX RANGE SCAN
| BIG_OBJTYPE_IDX | 1 | 3403 |
3424 |00:00:00.01 | 199 |
|* 4 |
INDEX UNIQUE SCAN
| USERNAME_PK | 2784 |
1 | 2784 |00:00:00.01 | 189 |
------------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 -
filter(("B"."OWNER"<>'SYS' AND
"B"."OWNER"<>'SYSTEM'))
3 -
access("B"."OBJECT_TYPE"='PROCEDURE')
4 -
access("A"."USERNAME"="B"."OWNER")
filter(("A"."USERNAME"<>'SYS' AND
"A"."USERNAME"<>'SYSTEM'))
Note
-----
- dynamic statistics used: dynamic sampling
(level=2)
- 2 Sql Plan Directives used for this
statement
Key for this test is the note about the use of the Plan
Directives. The directives on the
table are:
OBJECT_NAME DIR_ID COL_NAME OBJECT_TYPE TYPE
------------
-------------------- ------------ ------------ -----------------------
BIG_TAB 11891983782874668880 TABLE DYNAMIC_SAMPLING_RESULT
BIG_TAB 14819284582793040278 OBJECT_TYPE COLUMN
DYNAMIC_SAMPLING
BIG_TAB 14819284582793040278 TABLE DYNAMIC_SAMPLING
BIG_TAB 8668036953221628977 OBJECT_TYPE
COLUMN DYNAMIC_SAMPLING
BIG_TAB 8668036953221628977 OWNER
COLUMN DYNAMIC_SAMPLING
BIG_TAB 8668036953221628977 TABLE DYNAMIC_SAMPLING
BIG_TAB 8700850869231480407 TABLE DYNAMIC_SAMPLING_RESULT
There are really 4 directives, but only the dynamic sampling
ones are of interest for this test. So what
are these dynamic sampling directives really doing? Well to find out I looked in the 10053 trace
of this SQLID and child number and I found three queries being run on the
table, one for each first directive, and two on the second:
Plan directive ID 8668036953221628977 has this one (I’ll
refer to this one as 977 from now on, the last three digits of the directive ID):
SELECT /*
DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
optimizer_features_enable(default) no_parallel
*/ NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery")
NO_INDEX_FFS( "B") */ 1 AS C1
FROM "BIG_TAB" "B" WHERE ("B"."OBJECT_TYPE"='PROCEDURE')
AND ("B"."OWNER"<>'SYS') AND
("B"."OWNER"<>'SYSTEM')) innerQuery (objid =
9445728533958271359)
Plan directive ID 14819284582793040278 has these two (I’ll
refer to this one as 278):
SELECT /*
DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default)
no_parallel
OPT_ESTIMATE(@"innerQuery", TABLE, "B", ROWS=1610)
*/ NVL(C1,0), NVL(C2,0), NVL(C3,0) FROM (SELECT /*+
qb_name("innerQuery") INDEX( "B" "BIG_OBJTYPE_IDX") */ COUNT(*) AS C1, 4294967295 AS C2, COUNT(*)
AS C3 FROM "BIG_TAB"
"B" WHERE ("B"."OBJECT_TYPE"='PROCEDURE'))
innerQuery (objid = 12743291823137504172)
SELECT /*
DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default)
no_parallel OPT_ESTIMATE(@"innerQuery",
TABLE, "B", ROWS=1610) OPT_ESTIMATE(@"innerQuery",
INDEX_SCAN, "B", "BIG_OBJTYPE_IDX", ROWS=3424) */ NVL(C1,0),
NVL(C2,0), NVL(C3,0) FROM (SELECT /*+ qb_name("innerQuery") INDEX(
"B" "BIG_OBJTYPE_IDX")
*/ COUNT(*) AS C1, 4294967295 AS C2, COUNT(*) AS C3 FROM "BIG_TAB" "B" WHERE
("B"."OBJECT_TYPE"='PROCEDURE')) innerQuery (objid =
12743291823137504172)
When I ran the query in 977 (after taking out the last bit
starting with innerQuery) it’s really just counting up how many rows in BIG_TAB
match the given predicates on the two columns.
In this case its 2,784. This is
then used to recalculate the cardinality of BIG_TAB from 65,483
to 1,610 for the full table scan estimate.
This line is in the 10053 trace after the select doing the count:
Single Tab Card adjusted from 65483.243243 to
1610.000000 due to adaptive dynamic sampling
For 278 it's really about using the index on the OBJECT_TYPE
column. Notice that in these queries
(which is done after 977) they both use the 1,610 number as a corrected
cardinality with the OPT_ESTIMATE hint in them. And the second one is also using the value retrieved
from the first one (3,424) as a corrected cardinality on the index also with an
OPT_ESTIMATE
hint. Clearly the selects associated with the SQL
Directives are working together and build off each other.
Both the queries for 278 are doing counts on the table for just
the OBJECT_TYPE
predicate. I’m not sure what it does the
count twice in both queries as the C1 and C3 columns in the
inner query, and 4,294,967,295 literal for C2 is odd as well. That number only appears in these queries; it’s
nowhere else in the trace. It is the max
values for an unsigned 32 bit integer, which is interesting but I’m not sure
what that has to do with anything. For
this test, the number both queries come back with is 3,424 for both C1
and C3. This value is then used for the index
cardinality. This line is in the 10053 trace after the two selects doing the
counts are done:
Index Card adjusted from 65893.938102 to
3424.000000 due to adaptive dynamic sampling
Maybe the two counts for C1 and C3
can be different in other version of these queries, but here they both are the
exact same thing ’COUNT(*)’ so they can’t be different
in this version of the query. I may investigate
this more in the future, but my testing this time is done.
The conclusion of the test is yes, the dynamic sample is
really being done on the columns as listed in the directive, also it does just
the predicates in the sampling hence it should get very good counts to base the
plan on. The bad news is that this
sampling could take some time on really big tables with lots of directives, which
is likely why this feature is now turned off by default in 12.2. Hopefully the great folks out in Redwood Shores
will figure out a way to rein this in a bit and make it the great feature it
appears to be.