Tuesday, April 20, 2021

SQL Optimization it’s not always about time in the cloud or not


Most often when presented with a Query that needs to be optimized it’s about how long it takes to run.  Something like, “The query is running for over an hour, can we make it run faster?”   But what about a query that runs quickly each time it runs but runs hundreds (or thousands) of times a minute?   Sure, maybe it runs in a second or less, but the cumulative work caused by the query may add up to something that is a problem.  This is commonly referred to as a scaleability issue. 

These types of queries show the issue more with their LIOs (Logical IOs) rather than time.  Here is an example query from many years ago to illustrate the point.   A query much like this was run at a University and on a per run bases it was running in acceptable time.  However, it was showing up in the “Top SQL Report” because it was run very often.  




When it runs the execution plan with stats looks like this:




This was original run way back in version 8 or 9 as I recall, and this version of the code it running on a 21c Autonomous Database, notice the “STORAGE FULL” option for the full table scan.  Even with this way cool stuff, this isn’t really optimal. 

There isn’t a set of steps that is really screaming out “I’m the problem!” in this plan.  None of the operations appear to be taking a lot of time or recourses (like LIOs). 

When looking at a plan like this, one of the first things to do is to figure out “Who’s who in the Zoo?”  Look at the output and look at the tables in the query, do we need them all? 

The output looks like this:





Looking at the tables we have 

CLASS: Class information
STUDENTS: Student information 
GRADES: Grades the students received for a class
STUDENTCLASS: Intersection table for the Many-to-Many relationship between students and class 

Seems clear that the first 3 tables are needed for the output, but what about the STUDENTCLASS table?  It would be necessary if this report was reporting about more than one class, but it isn’t.   The report is only on one class (English 101 in this instance).  

Looking into the tables we would find that the GRADES table has both the STUDENT_ID and the CLASS_ID columns so we can stitch the tables together with GRADES and we don’t need STDENTCLASS at all. 

Also, there is the access to the CLASS table, from the plan we see that the optimizer chose a CARTESIAN join.  Why?  It’s not because there is a missing join, it’s because it thinks it’s only going to get one row from the table, and it does.  Which is odd since it’s doing a full table scan to get the one row.   How could it know that it was only going to get one row? 

The statistics on that table.  Notice that the table has 10 rows and there are 10 distinct values for the class description column. 




OK so that means the CARTESIAN join isn’t a “problem” in the sense that it’s causing a lot of rows to be unnecessarily returned.  But isn’t there a better way to get one row from the table?  Sure there is, use the primary key, the CLASS_ID column. 

The class is being picked from a list of values so we can change the code to send in the CLASS_ID rather than the CLASS_DESC to the query.  This will use the already existing index on the CLASS_ID column and since it’s a unique index, this is a much better way to find the one row.  Always try to use things that already exist before creating something new. 

Now the rewritten query would look something like this (I kept in some of the old stuff so you can easily see the changes.) 




The new execution plan with stats looks like this:




The key here is the LIOs, it was cut a little bit more then 50%.  Yes, the time went down as well.  However, it’s not even noticeable, 411 micro-seconds to 360 micro-seconds cannot be perceived by a human. 

What did happen is by cutting the LIOs in half (22 down to 10) over all contention went down and throughput was improved in a noticeable way.  In the actual production environment, the tables were much larger than this test case.  The basic improvement was the same cutting the LIOs by over 50%.  With these changes, this query dropped off the “Top SQL Report” for the University.