Way back in Oracle time, before SQL_ID there was just
HASH_VALUE to identify statements in the shared pool. Oracle used a technique that basically takes some
number of charters of the statement and using a magical hashing algorithm computes
a hexadecimal number to represent the statement in the shared pool. What the optimizer would do at parse time is
use this value to see if the statement was already in the shared pool. But, since no hash algorithm is perfect,
there would be collisions where two different statements would have the same hash
value. So the optimizer would then
compare character by character to verify they were the same.
This worked fine but as databases got bigger and SQL
statements got longer, the collision rate went up and the optimizer was
spending too much time looking for queries, so SQL_ID came along in version 10.
This really is much like HASH_VALUE, but uses more characters of the statement
and is also in base 32 numbering. The
SQL_ID looks a lot more like a string then a number, but it’s a number.
Then in 10.2 we started to see PLAN_HASH_VALUE. What is this?
One of the goals of the folk writing the optimizer is to
make it smarter and faster. This
cascades into the run time of the SQL.
If we can either parse or find the already parsed plan faster than we
can run faster. Also why have multiple
plans that really are the same BUT have slightly different statements? Wouldn't it be nice if two (or more) statements that
“look different” but generate the same plan, just have ONE plan to use?
Yes it would be, and that is where
PLAN_HASH_VALUE comes in.
Take a look at this very simple example:
SQL> @empno_test
SQL> alter system flush shared_pool;
System altered.
SQL> set echo on feed on
SQL> exec
dbms_application_info.set_module('LAB1','EMPNO1');
PL/SQL procedure successfully completed.
SQL> select ename, job from emp where empno
= 7902;
ENAME
JOB
---------- ---------
FORD
ANALYST
1 row selected.
SQL> exec
dbms_application_info.set_module('LAB1','EMPNO2');
PL/SQL procedure successfully completed.
SQL> select ename, job from emp where 7902 =
empno;
ENAME
JOB
---------- ---------
FORD
ANALYST
1 row selected.
SQL>
SQL> exec
dbms_application_info.set_module(null,null);
PL/SQL procedure successfully completed.
SQL> set lines 10000
SQL> column action format a10
SQL> column executions heading EXEC format
9999
SQL> column version_count heading VERSIONS
format 9999
SQL> column hash_value format 99999999999
SQL> column sql_text format a50
SQL>
SQL> select action, executions, version_count,
hash_value, sql_id, plan_hash_value, sql_text
2 from v$sqlarea
3 where module = 'LAB1' and command_type != 47
4 order by hash_value;
ACTION
EXEC VERSIONS HASH_VALUE
SQL_ID PLAN_HASH_VALUE SQL_TEXT
---------- ----- -------- ------------
------------- ---------------
--------------------------------------------------
EMPNO2
1 1 77702346 ay1h1g42a396a 4120447789 select ename, job from emp
where 7902 = empno
EMPNO1
1 1 116025303 1cvmsb83fntyr 4120447789 select ename, job from emp
where empno = 7902
2 rows selected.
Notice each statement has its own unique HASH_VALUE and
SQL_ID. And both have the same
PLAN_HASH_VALUE. This means that
although they are truly two different statements in the view of the optimizer, but
they share the same plan. And that plan
is:
SQL> select /*+ gather_plan_statistics*/ ename, job from emp where 7902 = empno;
ENAME JOB
---------- ---------
FORD ANALYST
1 row selected.
ENAME JOB
---------- ---------
FORD ANALYST
1 row selected.
SQL> select * from
table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
SQL_ID
8nuj6ncwusvdt, child number 0
-------------------------------------
select /*+ gather_plan_statistics*/ ename, job
from emp where 7902 =
empno
Plan hash value: 4120447789
--------------------------------------------------------------------------------------------
| Id |
Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 |
SELECT STATEMENT | | |
| 1 (100)| |
| 1
| TABLE ACCESS BY INDEX ROWID| EMP |
1 | 18 | 1
(0)| 00:00:01 |
|* 2
| INDEX UNIQUE SCAN | EMP_EMPNO_PK | 1 |
| 0 (0)|
|
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation
id):
---------------------------------------------------
2 -
access("EMPNO"=7902)
20 rows selected.
A key point is that the two
statements are DIFFERENT. They will each
take up some space of their own in the shared pool in the SGA, but the good
news is they do share the same plan.
The HASH_VALUE and the PLAN_HASH_VALUE are different numbers used differently. The explain plan (see above) shows the PLAN_HASH_VALUE not the HASH_VALUE, be careful not to confuse the two.
Pretty cool, eh?
Awesome!
ReplyDeleteThank you, very good explanation.
ReplyDeletemuch clear, thank you!
ReplyDelete