Thursday, July 22, 2021

Exadata SQL Optimization in the cloud or not.


“We just moved our Oracle system to Exadata, I don’t need to optimize my SQL any more, right?”

 

Chances are that a lot of your SQL will run amazingly faster on Exadata than it did on a non-Exadata machine.  But to think that optimization is no longer needed is a bit naïve at best.  It is still very possible to have SQL that runs much slower than it should on an Exadata machine. 

 

Here are two big things that I’ve seen that can cause a query to run in a less-than-optimal way on an Exadata machine.  

 

1.  Not getting a Smart Scan (offloading).   

 

The number one way that Exadata speeds up a query is the ability to do this thing called a Smart Scan.  This is where some (maybe all) of the predicates are “offloaded” to the storage cells, which scan the blocks and send back only the rows that satisfy those predicates, rather than returning every block to the database server.  

 

Many predicates can be offloaded, but not all.  Generally, the simpler the predicate, the more likely it is to be offloaded.   You can get a list of what can be offloaded with this query: 

 

SELECT DISTINCT name, datatype, offloadable
FROM v$sqlfn_metadata
WHERE offloadable = 'YES'
ORDER BY name;

 

In a 21c database there are 394 operations that can be offloaded.  That puts the odds in your favor, but there could be things you do that aren’t on this list, and those will slow things down, a lot.  
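
If you want to check on a specific function, just filter on its name.  For example, this looks up the two functions that come up later in this post (what is offloadable varies by version, so check on your own system):

-- Is a given function offloadable?  Results vary by Oracle version.
SELECT DISTINCT name, offloadable
FROM v$sqlfn_metadata
WHERE name IN ('LOWER', 'NLSSORT')
ORDER BY name;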

 

An example of something that isn’t offloadable is the function call used to support case-insensitive searching when you set the NLS parameters as follows:

 

NLS_SORT = BINARY_CI   (case-insensitive, but accent-sensitive, binary sort)
NLS_COMP = LINGUISTIC  (sorting and comparison are based on the linguistic rule specified by NLS_SORT)

 

Here is a simple query to show what happens (see the code at the end of this post if you’d like to run it yourself).  The table is the set of owners from the DBA_OBJECTS view.  The query is run once with the default settings for the two parameters, and then again with them set for case-insensitive searching as noted above.  Of interest is the predicate information.  I’m using DBMS_XPLAN.DISPLAY_CURSOR to show the plan. 

 

select /*+ gather_plan_statistics */ count(*) from case_test where owner = 'ric';

 

With the default settings this returns a count of zero (0), since the names are stored in upper case.  The predicate information in the plan looks like this:

 

storage("OWNER"='ric')

filter("OWNER"='ric')

 

Notice the first one is called STORAGE.  This is the one that can be pushed out to the storage cells for the Smart Scan (offloaded).   Depending on how you get the plan, this may just say “access” and not “storage”.  If you see the same predicate with both access and filter on the same step, the first is the one that can be offloaded, and the filter is what’s used when it isn’t offloaded.  (More about this in a moment.)  

 

With the NLS parameters set for case insensitive searching this is the predicate information:  

 

filter(NLSSORT("OWNER",'nls_sort=''BINARY_CI''')=HEXTORAW('72696300'))

 

This returned 44 for the count, which is correct in my simple test.  The problem is that this predicate cannot be offloaded. The access step in the plan is the same “TABLE ACCESS STORAGE FULL”, but the predicate is being applied after the blocks are returned to the buffer cache in the SGA.  As such, this isn’t taking advantage of the Exadata machine. For this test query it hardly mattered, as the table only has about 55,000 rows.  But for tables of significant size, not offloading will be a major problem. 

 

How to solve this simple example?  Change the predicate to something like this:

 

select /*+ gather_plan_statistics */ count(*) from case_test where lower(owner) = 'ric';

 

The predicates used are below, and the first one can be offloaded (it likely wasn’t in this case since all the blocks of the table were in the buffer cache).  This returned a count of 44:

 

storage(LOWER("OWNER")='ric')

filter(LOWER("OWNER")='ric')

 

When else might something not get offloaded?  A Smart Scan is a run-time decision.  You may have done everything correctly to get a Smart Scan, and it still doesn’t happen sometimes.  This is because if the storage cells get overloaded, they will stop doing Smart Scans.   From a code point of view there isn’t really anything you can do about this.  
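
One thing you can do is verify whether a Smart Scan actually happened.  A minimal sketch using the session statistics (these only show meaningful values on an Exadata system): run your query, then look at these two numbers; if the “returned by smart scan” value didn’t grow, the scan wasn’t offloaded.

-- Check this session's offload activity (Exadata only):
SELECT sn.name, ms.value
FROM v$mystat ms
JOIN v$statname sn ON sn.statistic# = ms.statistic#
WHERE sn.name IN (
  'cell physical IO bytes eligible for predicate offload',
  'cell physical IO interconnect bytes returned by smart scan');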

 

At the DBA level, they could spread out the data, mostly by getting more storage cells and redistributing the data.  This can avoid contention at the storage cell level.  It can also be a timing issue: don’t try to run everything at the same time.  Even just a few minutes (or even seconds) between jobs might make all the difference.  

 

2.  Using Indexes.

 

This one is a bit tricky.  Indexes are still useful in an Exadata environment, and necessary for things like primary keys.  To compound the problem, you may be running a mixed workload of transactional queries (for which indexes are still helpful) and data warehouse queries (for which indexes don’t work well).  Knowing which indexes to keep and which to drop is not exactly clear cut. 

 

You will likely need fewer indexes than on non-Exadata.   A really big problem with indexes is that they tend to drive NESTED LOOPS joins (potentially large “stacks” of them), and those can be rather inefficient in Exadata with huge tables.   HASH and SORT MERGE joins tend to be better in Exadata.  Both of these tend to favor full table scans over index scans, and a full table scan can benefit from the Smart Scan.  Not so much with index scans.   
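
As a quick sketch (my_tab and other_tab are made-up tables), you can nudge the optimizer toward full scans and a hash join with hints like these:

-- Ask for full scans and a hash join instead of index-driven nested loops:
SELECT /*+ full(a) full(b) use_hash(a b) */ a.id, b.name
FROM my_tab a
JOIN other_tab b ON b.id = a.id;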

 

Other than the INDEX FAST FULL scan (which works much like a full table scan), offloading doesn’t happen with index scans.  And offloading, as just explained, is the real advantage of using Exadata.  
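
If you do want an index used in the one offload-friendly way, the INDEX_FFS hint asks for an index fast full scan.  A sketch, assuming my_tab has an index named my_tab_pk that can answer the query by itself:

-- Request an index fast full scan, the one index scan that can be offloaded:
SELECT /*+ index_ffs(t my_tab_pk) */ count(*)
FROM my_tab t;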

 

Indexes can still be useful when you are getting a small amount of data from a table. But even here it can be tricky; testing is the key.  This is where you may need some SQL tricks to get the optimizer to not use an index.  Two of my favorites are:

 

COALESCE (NVL doesn’t always work)

In-line views (ILVs), which need the NO_MERGE hint

 

Using COALESCE the code would look something like this:

 

COALESCE(my_tab.id,-99999) = other_tab.id

 

The reason that NVL doesn’t always work is that the optimizer knows about things like NOT NULL constraints.  For example, if the column you’re trying to steer the optimizer away from is a primary key, the optimizer knows it can’t be null, so it will remove the NVL function.  But it won’t remove a COALESCE, because COALESCE can take a whole list of arguments and isn’t rewritten the same way. 
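
Put into a full (made-up) query, the COALESCE trick looks like this; the -99999 is just a value that will never match a real id:

-- COALESCE on my_tab.id disables the index on that column for this join.
SELECT o.name
FROM other_tab o
JOIN my_tab m ON COALESCE(m.id, -99999) = o.id;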

 

To use an ILV it would look something like this:

 

JOIN (select /*+ no_merge */ id, name, status from my_tab) my_tab ON my_tab.id = other_tab.id

 

The NO_MERGE hint is needed because without it, the optimizer will merge the view back into the query, making it just a normal join that uses the very index you’re trying to avoid.  With an ILV there are no indexes on any column, so it has to do a full scan, which is exactly the point of doing this.  It’s really a good idea to include only the columns the query uses in the ILV.  You want to keep the ILV small, both in width (columns) and depth (rows), if possible.
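
Put into a full (again, made-up) query, the ILV pattern looks like this:

-- The no_merge hint keeps the in-line view from being merged away,
-- so my_tab must be scanned in full inside the view.
SELECT o.name, m.status
FROM other_tab o
JOIN (SELECT /*+ no_merge */ id, status
      FROM my_tab) m ON m.id = o.id;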

 

Of course, other approaches would be to make indexes invisible or use hints to force full table scans.  And there are plenty of other “tricks” to make an index unusable for a query, like adding zero to a number or concatenating a NULL to a string.  Whatever technique you use, I highly recommend you put in a comment explaining why the odd code is there.  It might be months or years before you look at this code again, so help yourself and comment anything non-standard. 
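
Those approaches look something like this (the index and column names are made up); note the comments, per the advice above:

-- Make an index invisible so the optimizer ignores it
-- (unless optimizer_use_invisible_indexes is set to TRUE):
ALTER INDEX my_tab_ix1 INVISIBLE;

-- Or disable index use in a single query:
SELECT count(*)
FROM my_tab m
JOIN other_tab o
  ON m.num_col + 0 = o.num_col      -- +0 blocks the index on num_col, full scan intended
 AND m.str_col || '' = o.str_col;   -- ||'' (a NULL in Oracle) blocks the index on str_col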

 

Below is the code I used for the example on case-insensitive searching with the NLS parameters.  If you run it on your system, you’ll likely want to change the owner; I suspect you don’t have a user named RIC that owns any objects in your database. 

 

create table case_test as (select owner from dba_objects);

set LINESIZE 200
set PAGESIZE 2000

show parameter NLS_SORT
show parameter NLS_COMP

select /*+ gather_plan_statistics */ count(*) from case_test where owner = 'ric';

COLUMN PREV_SQL_ID NEW_VALUE PSQLID
COLUMN PREV_CHILD_NUMBER NEW_VALUE PCHILDNO
select PREV_SQL_ID, PREV_CHILD_NUMBER from v$session WHERE audsid = userenv('sessionid')
/
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&psqlid','&PCHILDNO','TYPICAL ALLSTATS LAST'))
/

 

alter session set NLS_SORT = BINARY_CI;
alter session set NLS_COMP = LINGUISTIC;

show parameter NLS_SORT
show parameter NLS_COMP

select /*+ gather_plan_statistics */ count(*) from case_test where owner = 'ric';

COLUMN PREV_SQL_ID NEW_VALUE PSQLID
COLUMN PREV_CHILD_NUMBER NEW_VALUE PCHILDNO
select PREV_SQL_ID, PREV_CHILD_NUMBER from v$session WHERE audsid = userenv('sessionid')
/
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&psqlid','&PCHILDNO','TYPICAL ALLSTATS LAST'))
/

 

alter session set NLS_SORT = BINARY;
alter session set NLS_COMP = BINARY;

show parameter NLS_SORT
show parameter NLS_COMP

select /*+ gather_plan_statistics */ count(*) from case_test where lower(owner) = 'ric';

COLUMN PREV_SQL_ID NEW_VALUE PSQLID
COLUMN PREV_CHILD_NUMBER NEW_VALUE PCHILDNO
select PREV_SQL_ID, PREV_CHILD_NUMBER from v$session WHERE audsid = userenv('sessionid')
/
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&psqlid','&PCHILDNO','TYPICAL ALLSTATS LAST'))
/
