Tuesday, December 26, 2023

CTEs and UNIONs, Wrap it up!


CTEs (Common Table Expressions) can solve a lot of performance issues.  Maybe not all of them, but a majority of them.  

 

 

I ran into a situation recently that was quite frustrating.  A customer I was working with had created a rather large and complex CTE that was 3 select statements with the UNION ALL operation pulling them together.  I’m a huge fan of the /*+ MATERIALIZE */ hint for CTEs as this will keep the CTE as a separate entity.   Using this hint more times than not is best for performance. 

 

 

But this CTE wouldn’t get materialized.  It was in effect ignoring the hint.  

 

 

Turns out a CTE using UNION or UNION ALL will do this.  I didn’t find any official documentation on this, given that the MATERIALIZE hint is really undocumented anyway, this is not a surprise. 

 

 

A solution I found was to “wrap it” in another select. 

 

 

Here is an example query to illustrate the technique.  Yes, this is a silly query but it works for showing what I’m talking about. 

 

 

WITH loc_cte AS (

    SELECT /*+ MATERIALIZE */

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'NEW YORK'

    UNION 

    SELECT

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'DALLAS'

    UNION 

    SELECT

         ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'CHICAGO'

)

SELECT ename, sal, dname FROM loc_cte

ORDER BY sal, ename;

 

 

For this one the explain plan looks like below.  Notice there is not a materialized object for the CTE, the main query and the CTE have been merged together.  There are remnants of the CTE in here, but key is that the CTE is not being materialized. (The focus here is not what the optimizer did when it transformed the query. Maybe a topic for another day.)

 

 

--------------------------------------------------------------------------------------------

| Id  | Operation                          | Name               | Rows  | Bytes | Time     |

--------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                   |                    |    12 |   348 | 00:00:01 |

|   1 |  SORT ORDER BY                     |                    |    12 |   348 | 00:00:01 |

|   2 |   VIEW                             |                    |    12 |   348 | 00:00:01 |

|   3 |    SORT UNIQUE                     |                    |    12 |   414 | 00:00:01 |

|   4 |     UNION-ALL                      |                    |       |       |          |

|   5 |      NESTED LOOPS                  |                    |     3 |    99 | 00:00:01 |

|   6 |       NESTED LOOPS                 |                    |     5 |    99 | 00:00:01 |

|*  7 |        TABLE ACCESS STORAGE FULL   | DEPT               |     1 |    20 | 00:00:01 |

|*  8 |        INDEX RANGE SCAN            | EMP_DEPT_IDX       |     5 |       | 00:00:01 |

|   9 |       TABLE ACCESS BY INDEX ROWID  | EMP                |     3 |    39 | 00:00:01 |

|  10 |      NESTED LOOPS                  |                    |     9 |   315 | 00:00:01 |

|  11 |       NESTED LOOPS                 |                    |    10 |   315 | 00:00:01 |

|  12 |        VIEW                        | VW_JF_SET$5E11AB46 |     2 |    44 | 00:00:01 |

|  13 |         SORT UNIQUE                |                    |     2 |    40 | 00:00:01 |

|  14 |          UNION-ALL                 |                    |       |       |          |

|* 15 |           TABLE ACCESS STORAGE FULL| DEPT               |     1 |    20 | 00:00:01 |

|* 16 |           TABLE ACCESS STORAGE FULL| DEPT               |     1 |    20 | 00:00:01 |

|* 17 |        INDEX RANGE SCAN            | EMP_DEPT_IDX       |     5 |       | 00:00:01 |

|  18 |       TABLE ACCESS BY INDEX ROWID  | EMP                |     4 |    52 | 00:00:01 |

--------------------------------------------------------------------------------------------

 

 

 

Now the change to the plan is to wrap the CTE in a select.  I’d recommend you don’t use SELECT * for the wrapping select.  I’m not a fan of this syntax as it can cause problems.  

 

 

WITH loc_cte AS (

    select /*+ MATERIALIZE */ ename, sal, dname from (

    SELECT 

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'NEW YORK'

    UNION 

    SELECT

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'DALLAS'

    UNION 

    SELECT

         ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'CHICAGO'

))

SELECT ename, sal, dname FROM loc_cte

ORDER BY sal, ename;

 

 

And now the plan shows the CTE staying as a separate entity (it’s been materialized, Beam me up Scotty!). 

 

 

-----------------------------------------------------------------------------------------------------------

| Id  | Operation                                | Name                        | Rows  | Bytes | Time     |

-----------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                         |                             |    12 |   348 | 00:00:01 |

|   1 |  TEMP TABLE TRANSFORMATION               |                             |       |       |          |

|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_3FD9E6966_7A9C9340 |       |       |          |

|   3 |    VIEW                                  |                             |    12 |   348 | 00:00:01 |

|   4 |     SORT UNIQUE                          |                             |    12 |   828 | 00:00:01 |

|   5 |      UNION-ALL                           |                             |       |       |          |

|   6 |       NESTED LOOPS                       |                             |     3 |    99 | 00:00:01 |

|   7 |        NESTED LOOPS                      |                             |     5 |    99 | 00:00:01 |

|*  8 |         TABLE ACCESS STORAGE FULL        | DEPT                        |     1 |    20 | 00:00:01 |

|*  9 |         INDEX RANGE SCAN                 | EMP_DEPT_IDX                |     5 |       | 00:00:01 |

|  10 |        TABLE ACCESS BY INDEX ROWID       | EMP                         |     3 |    39 | 00:00:01 |

|  11 |       NESTED LOOPS                       |                             |     9 |   315 | 00:00:01 |

|  12 |        NESTED LOOPS                      |                             |    10 |   315 | 00:00:01 |

|  13 |         VIEW                             | VW_JF_SET$B1ADC4B5          |     2 |    44 | 00:00:01 |

|  14 |          SORT UNIQUE                     |                             |     2 |    80 | 00:00:01 |

|  15 |           UNION-ALL                      |                             |       |       |          |

|* 16 |            TABLE ACCESS STORAGE FULL     | DEPT                        |     1 |    20 | 00:00:01 |

|* 17 |            TABLE ACCESS STORAGE FULL     | DEPT                        |     1 |    20 | 00:00:01 |

|* 18 |         INDEX RANGE SCAN                 | EMP_DEPT_IDX                |     5 |       | 00:00:01 |

|  19 |        TABLE ACCESS BY INDEX ROWID       | EMP                         |     4 |    52 | 00:00:01 |

|  20 |   SORT ORDER BY                          |                             |    12 |   348 | 00:00:01 |

|  21 |    VIEW                                  |                             |    12 |   348 | 00:00:01 |

|  22 |     TABLE ACCESS STORAGE FULL            | SYS_TEMP_3FD9E6966_7A9C9340 |    12 |   348 | 00:00:01 |

-----------------------------------------------------------------------------------------------------------

 

 

And there you have it!  Here is the full script if you’d like to run it on your own.  Enjoy! 

 

 

explain plan for

WITH loc_cte AS (

    SELECT /*+ MATERIALIZE */

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'NEW YORK'

    UNION 

    SELECT

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'DALLAS'

    UNION 

    SELECT

         ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'CHICAGO'

)

SELECT ename, sal, dname FROM loc_cte

ORDER BY sal, ename;

 

set lines 4000

set pages 5000

select * from table(DBMS_XPLAN.display(format=>'advanced -cost -projection'));

 

 

explain plan for

WITH loc_cte AS (

    select /*+ MATERIALIZE */ ename, sal, dname from (

    SELECT 

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'NEW YORK'

    UNION 

    SELECT

        ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'DALLAS'

    UNION 

    SELECT

         ename, sal, dname

    FROM

        emp, dept

    WHERE

            emp.deptno = dept.deptno

        AND dept.loc = 'CHICAGO'

))

SELECT ename, sal, dname FROM loc_cte

ORDER BY sal, ename;

 

set lines 4000

set pages 5000

select * from table(DBMS_XPLAN.display(format=>'advanced -cost -projection'));