Friday, December 7, 2012

Which step completes first? How to read an explain plan.

When reading an explain plan (or execution plan) in Oracle one of the first things you should figure out is which step completes first.  As a general rule this step is best if it's something like a Primary Key index scan.  The goal with a SQL plan is to  "Start Small and Stay Small".  Hence with a PK type look up you will tend to have a plan that starts with the fewest rows.  So let's take a look at a plan and figure out which step is first to complete.  Here is a query, and its explain plan:


select * from emp, dept

where emp.deptno = dept.deptno and sal > 1000 order by empno

SQL> @hxplan
Enter .sql file name (without extension): emp_dept2
Enter the display level (TYPICAL, ALL, BASIC, SERIAL) [TYPICAL]  :
Plan hash value: 77426881

-------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                |    12 |   720 |     6  (34)| 00:00:01 |
|   1 |  SORT ORDER BY                 |                |    12 |   720 |     6  (34)| 00:00:01 |
|   2 |   MERGE JOIN                   |                |    12 |   720 |     5  (20)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | DEPT           |     4 |    80 |     2   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN            | DEPT_DEPTNO_PK |     4 |       |     1   (0)| 00:00:01 |
|*  5 |    SORT JOIN                   |                |    12 |   480 |     3  (34)| 00:00:01 |
|*  6 |     TABLE ACCESS BY INDEX ROWID| EMP            |    12 |   480 |     2   (0)| 00:00:01 |
|*  7 |      INDEX FULL SCAN           | EMP_DEPT_IDX   |    13 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
       filter("EMP"."DEPTNO"="DEPT"."DEPTNO")
   6 - filter("SAL">1000)
   7 - filter("EMP"."DEPTNO" IS NOT NULL)
SQL>



Some folks have been taught that a plan completes "bottom up".  So they would think that the first step to complete is step 7. But it isn't. Order of execution is Top - Down, but order of completion is a sort of "inside out bottom up". 

From an exection point of view, the first thing this query wants to do is a SORT ORDER BY (Step 1), but it can't do that yet.  It needs rows to sort.  So then it wants to do a MERGE JOIN (2), but to do that it would need two sets of rows to merge together. Then it wants to do a TABLE ACCESS BY INDEX ROWID (3), but to to that it needs ROWIDs from an index.  Next is the INDEX FULL SCAN (4).  This means that it is step number 4 that is the first step to complete.  This index scan will return the rows to the table scan so the rows can be retrieved.

Now this is important to understand.  The index scan and the table scan steps are interactive. The index scan gets one ROWID, passes it to the table scan which reads the row, then the index scan gets the next ROWID and so on.  This means the INDEX FULL SCAN (4) is the first to complete but the TABLE ACCESS BY INDEX ROWID (3) finishes very soon after it.

After step 3 completes, the SORT JOIN (5) step would want to go, this would be were the second set of rows would be sorted.  In this case the second set of rows doesn't need to be sorted.  Why?  Because the INDEX FULL SCAN (7) returns the ROWIDS in DETPNO order to the TABLE SCAN BY INDEX ROWID (6). Since the join is on DEPTNO, the sort would have been down on DEPTNO, but the index is on DEPTNO so the rows are already in the order needed for the MERGE JOIN (2).

BTW - The SORT JOIN step is always there for the second row source in a Sort Merge Join, even if the data isn't sorted, like here.  This is because this temp segment is where the second row source is merged with the first row source.  If the first row source needed to be sorted you would see the SORT JOIN step as the last step for that set of rows as well.

Since in this plan both sets of rows come via and index full scan no data is sorted in this plan. The query was able to leverage the sorted data in both the indexes.

Once steps 7, 6 and 5 complete, there are now two sets of sorted data and the MERGE JOIN (2) can complete and finally the SORT ORDER BY (1) can complete.  This sort is by EMPNO (the ORDER BY in the query).  Now the sorted data is passed back to the user (step 0).

So order of competition is:  4-3-7-6-5-2-1-0.

This plan doesn't do a PK look up as it's first step, but in this case the INDEX FULL SCAN is quite good.  The query is joining all the rows from each table so this type of index scan is appropriate for this join.

As an additional question where did this predicate being applied on step 7 come from?

filter("EMP"."DEPTNO" IS NOT NULL)