An interesting optimization has been done with Nested Loops in Oracle recently. In 11g you may see plans like this:
SQL> set autotrace traceonly explain
SQL> select /*+ use_nl (d e) */ e.empno, e.sal, d.deptno, d.dname
2 from emp e, dept d
3 where e.deptno = d.deptno ;
Execution Plan
----------------------------------------------------------
Plan hash value: 1688427947
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 312 | 6 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 13 | 312 | 6 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | DEPT | 4 | 52 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | EMP_DEPT_IDX | 4 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| EMP | 3 | 33 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("E"."DEPTNO"="D"."DEPTNO")
filter("E"."DEPTNO" IS NOT NULL)
Why the double Nested Loop? We certainly expect the number of joins to be one less than the total number of tables, so with only 2 tables we'd expect to see just 1 join, not 2.
What's happening here is that Oracle is "carrying" less in the inner loop. It's joining the DEPT table just to the index on EMP. The result of this inner join is a row from the DEPT table and the ROWID from the EMP table, which is much less stuff to work with in the inner loop. Joining the row of DEPT with just the index on EMP is certainly less than working with the entire row from both.
The outer loop then really isn't a nested loop at all, but rather a "lookup" into the EMP table to find the row using the ROWID that it retrieved from the inner loop. Now it has all the data, the row from DEPT and the row from EMP joined together.
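If you're curious, you can put the plan back into the old single-loop shape and compare the two. As far as I know the NO_NLJ_BATCHING hint does this on 11g; it takes the alias of the inner (probed) table:
SQL> select /*+ use_nl (d e) no_nlj_batching(e) */
  2         e.empno, e.sal, d.deptno, d.dname
  3    from emp e, dept d
  4   where e.deptno = d.deptno ;
With that hint in place you should see a single NESTED LOOPS with the TABLE ACCESS BY INDEX ROWID on EMP tucked under it, the way the plan looked before 11g.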
Kinda cool eh?
Monday, November 19, 2012
Which trace file is mine?
A common issue when tracing is finding your trace file after the tracing is done. Back in the bad old days I would run something like "select 'Ric Van Dyke' from dual;" in the session I was tracing, and once I was finished I'd search the trace directory for the file containing "Ric Van Dyke" to find mine. It worked, but today we have a much more elegant way to find a trace file:
SQL> alter session set tracefile_identifier='MY_TRACE';
SQL> exec dbms_monitor.session_trace_enable(null,null,true,true,'ALL_EXECUTIONS');
SQL> select 'hello' from dual;
'HELL
-----
hello
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
C:\app\hotsos\diag\rdbms\hotsos\hotsos\trace>dir *MY_TRACE*
11/15/2012 04:30 PM 10,220 hotsos_ora_5540_MY_TRACE.trc
11/15/2012 04:30 PM 94 hotsos_ora_5540_MY_TRACE.trm
2 File(s) 10,314 bytes
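Another trick on 11g and up (it relies on V$DIAG_INFO, which is new in 11g) is to ask the session itself where its trace file is, before you ever leave the session:
SQL> select value
  2    from v$diag_info
  3   where name = 'Default Trace File';
That returns the full path to your session's trace file, identifier and all.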
You can make this more elaborate. For example, here is an anonymous PL/SQL block I have used in the past to turn on tracing; it dynamically sets TRACEFILE_IDENTIFIER to the current time and user:
declare
   t_trc_file varchar2(256) :=
      'alter session set tracefile_identifier='||chr(39)||
      to_char(sysdate, 'hh24miss')||'_'||user||chr(39);
   t_trc_stat varchar2(256) :=
      'alter session set timed_statistics=true';
   t_trc_size varchar2(256) :=
      'alter session set max_dump_file_size=unlimited';
   t_trc_sql  varchar2(256) :=
      'alter session set events '||chr(39)||
      '10046 trace name context forever, level 12'||chr(39);
begin
   execute immediate t_trc_file; -- set tracefile identifier
   execute immediate t_trc_stat; -- turn on timed statistics
   execute immediate t_trc_size; -- set max dump file size
   execute immediate t_trc_sql;  -- turn on the 10046 trace
end;
/
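Run the block in the session you want traced, do the work you're interested in, and then shut the trace back off. The matching off switch for the 10046 event set above is the standard one:
SQL> alter session set events '10046 trace name context off';
The trace file will carry the time-and-user identifier the block set, so it's easy to spot in the trace directory.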
Friday, November 9, 2012
Is this index being used?
When teaching the Optimizing Oracle SQL, Intensive class I'm often asked how to tell if an index is being used or not. Oracle does have a feature for this called (oddly enough) index monitoring. However, it's less than ideal. First off, it's just a YES or NO switch: either the index has been used since you turned it on or it hasn't. Also, it's not exactly 100% accurate. Sometimes an index will be used and not counted, and other times the way it's used isn't really what you want; for example, collecting stats will typically be counted as "using" the index, which is not the type of use most folks are interested in.
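For completeness, turning index monitoring on and checking it is just a couple of statements. A minimal sketch, assuming an index named EMP_DEPT_IDX in your own schema (V$OBJECT_USAGE only reports on the current user's indexes):
alter index emp_dept_idx monitoring usage;
-- ... let the application run for a while ...
select index_name, monitoring, used, start_monitoring
  from v$object_usage
 where index_name = 'EMP_DEPT_IDX';
alter index emp_dept_idx nomonitoring usage;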
We'd like to know when it's used as part of a query, and we can see this. With a rather simple query you can easily see which indexes are in use, how many times, and even how each index is being used. Something like this does the trick; of course you'd want to change the owner name. The "#SQL" column is the number of different SQL statements currently in the library cache that have used the index, and the "#EXE" column shows the total number of executions.
select count(*) "#SQL", sum(executions) "#EXE",
object_name, options
from v$sql_plan_statistics_all
where operation = 'INDEX' and
object_owner = 'OP'
group by object_name, options
order by object_name, options
/
An example run:
#SQL #EXE OBJECT_NAME OPTIONS
------ ------ ------------------------ ------------------------
4 4 BIG_OBJTYPE_IDX RANGE SCAN
2 4 EMP_EMPNO_PK FULL SCAN
1 3 EMP_UPPER_IDX RANGE SCAN
1 2 LIC_OWN_CTY_LICNO_IDX SKIP SCAN
1 3 SALESORDER_PK FAST FULL SCAN
1 2 SALESORDER_PK FULL SCAN (MIN/MAX)
1 1 SALESORDER_PK RANGE SCAN DESCENDING
1 3 SALESORDER_PK UNIQUE SCAN
2 2 USERNAME_PK FULL SCAN
2 2 USERNAME_PK UNIQUE SCAN
SQL>
From this simple output we can see that BIG_OBJTYPE_IDX is used in four different statements that each ran just once, and that SALESORDER_PK is used in a variety of ways, each of those from a single statement but some run more times than others.
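If you want to know exactly which statements those are, you can drill a little deeper with the same view. A quick sketch; the owner and index name here are just placeholders to change:
select sql_id, executions, options
  from v$sql_plan_statistics_all
 where operation = 'INDEX'
   and object_owner = 'OP'
   and object_name = 'BIG_OBJTYPE_IDX'
 order by sql_id, options
/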
A couple of other things to keep in mind. This will only show what's happened "recently": being a V$ view, the data isn't kept in here for any particular length of time, so in a production environment you might want to run this a few times during the day to capture what's going on. Also, this view is fairly expensive to select from; doing this once in a while is fine, just don't set it up to poll the view every second all day long, or it will likely show up as a "top SQL" statement in every monitoring tool.
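If you do want more history than the library cache holds, one low-tech option is to snapshot the query into a table of your own a few times a day (DBMS_SCHEDULER works fine for this). A minimal sketch; IDX_USE_HIST is a made-up name and the owner is a placeholder:
create table idx_use_hist as
select sysdate snap_time, count(*) sql_cnt, sum(executions) exe_cnt,
       object_name, options
  from v$sql_plan_statistics_all
 where operation = 'INDEX'
   and object_owner = 'OP'
 group by object_name, options
/
-- later snapshots just append another set of rows
insert into idx_use_hist
select sysdate, count(*), sum(executions), object_name, options
  from v$sql_plan_statistics_all
 where operation = 'INDEX'
   and object_owner = 'OP'
 group by object_name, options
/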