Wednesday, August 26, 2020

Cardinality in the cloud or not

I’ve been teaching and doing SQL optimization for many years now. Given a statement that is 5 lines or 500 lines long, the reason an execution plan is not optimal really comes down to the same thing every time. 

The cardinality calculation is off. 

The key is that it should be in the same magnitude as the actual rows returned. Not the exact same value, that’s cool when it happens but not necessary. The value for cardinality is an estimated number of rows, so it’s expected to not be exact. But when it’s way off, either up or down, things go badly. This is when the optimizer picks to use an index, a full scan, a join method, or a join order that causes the plan to go horribly wrong. 

Recently I was at a company that was moving very large queries from another system to Oracle Exadata. These queries were large in just about every sense of the word. The queries were hundreds of lines long many times and some worked with billions of rows in as well. When the cardinality would go wrong everything after that step in the plan just got worse and worse. It wasn’t uncommon for us to fix that sort of thing and take a query from never finishing to completing in a few minutes or even seconds. 

What kinds of things did we do to fix this? Mostly it had to do with changing the code of the query. It wasn’t common that the stats were wrong. And even if they were all we could do was recommend to the DBA team that the stats appeared to be off and hope they would fix it. 

Mostly it was rewrites of the query. 

 One problem was the over use of OUTER joins. Sure, they are needed, but the folks writing these queries tended to use them all the time. This did a few things; one it more or less forced a join order and it didn’t help the cardinality estimate many times. I fixed many queries but changing OUTER joins to INNER joins. The query got the same result set, of course if it didn’t that isn’t a fix! And could run many magnitudes faster because it would get a better join order and/or get a better cardinality. 

Another problem we banged into was a bug with hybrid histograms. This would kick in mostly with very large sets of data. The histogram would lead the optimizer to think it was getting back a very large set of rows many times and very poor decisions would be made because of that. The customer was still on version 12.1 and I’m pretty sure that bug has been fixed. 

 If you notice that columns with a hybrid histogram is giving very bad cardinality estimates you might want change that. How? Of course, you could drop the histogram (really you can’t, you just recollect stats without creating the histogram on that column), we couldn’t do that. We would typically modify that column in the predicate so it couldn’t use the histogram. Mostly we used NVL or COALESCE on the column. I prefer COALESCE myself because it will work on a column even if it’s a primary key, NVL wouldn’t. 

Sometimes it was expressing the predicate(s) in a different way. The simpler a predicate is the less likely a mistake will happen with the math. This typically requires really understanding the question being asked by the query. If you didn’t write the code, then you’ll likely need to chat with whoever did to find out what is going on. 

 Other times we had to resort to using the CARDINALITY hint to get the optimizer to do the right thing. I talked about this in my post "Hints in the cloud or not" this past March. You might want to check that post out too. It’s a quick read. 

The key with optimizing a query is getting that CARINALITY estimate to be in the right magnitude. Walk the plan “up and out” to find were it goes wrong. Stop there and fix that. Which table and predicates are driving the poor estimate? Really did into what that bit of code is doing, can it be rewritten another way? Sometimes one tiny change to a huge query and all is good. Other times it’s an iterative process, make a change, run it, make another and so on. 

Yes, it’s a lot harder to do this then to add more memory or CPU or other resource. But if you don’t fix it and just “add hardware” to make it go away, all you really did was kick the can down the road and it’s going to come back.

Monday, July 27, 2020

The Rise of SQL Optimization in the Autonomous Age in the cloud or not

In the world of IT, the age of Autonomous computing in upon us.  I’m an Oracle guy by trade and Oracle is very much all about Autonomous.   Machine Learning is the latest buzz in the profession and that has a lot of folks thinking that tuning and optimization will be completely in the hands of the autonomous database.


I tend to disagree.  I believe the need for folk who can write optimal code will be just as needed as we embark on this autonomous path as there ever was, maybe more.  


The reason for this is rather simple.  Someone writing code has a purpose for the code, a result set they are interested in seeing.  The better they know the data they are working with and the better then know what they are looking for, the better they can write something to get the results.  This is why many times in my optimization of code I need to talk to the author about what is it they are trying to get in order to find a more optimal way to get the results they want.


Of course, I and the autonomous database can do things like throw indexes at it or use hints/tricks to get the code to do this or that.  But maybe the code is just not the right code to begin with.  An outer join that doesn’t need to be, a subquery that really isn’t returning anything useful for the final result set, and maybe it’s just returning data that isn’t even used. 


When a query is only tens of lines long, some of these things can be seen with enough looking and testing sometimes.  But when the query is hundreds of lines long and with an execution plan with 1000s lines, it’s difficult for someone who doesn’t know the “why” of the SQL to be able to really optimize it. 


This is where the autonomous database will have some trouble optimizing code, if it can do anything at all. Sure, it can solve the performance issue by throwing more horsepower at it (CPUs, Memory, Parallel and the like).   But when the root cause of the performance issue is that the code it not doing the right thing to begin with, the autonomous database will not know that. 


The optimizer in the autonomous database is going to take a SQL statement as written, it will run it.  Yes, the optimizer has some limited capacity to do rewrites on the SQL, which can be rather good many times.   But will it know that the user actually doesn’t even need the data from these 6 tables in this complex subquery that is used four times slightly differently each time?  No, and that is where the human SQL optimization specialist will still be needed.  She/he can talk to the user(s) of the query and find out what is the real intent and what do they really want. 


How does code end up doing things it shouldn’t do?  Shouldn’t the code know what they are asking and write code to just get that?  Why would someone write code that does extra work for no good reason?  


No one writes code like that on purpose I believe.  What happens more often than not is they don’t write the code from scratch.  They take code that sort-of does what they want and then typically add things to it to get it to do what they want.  This will tend to bloat the code in such a way that it ends up doing much more than it should and gets so complex that it’s hard for anyone to read and fully understand, including the optimizer. 


There is a saying about tuning that goes like this “Tune the question not the query”.   And that is still going to be a human activity for the foreseeable future.   The optimizer doesn’t know the real question being asked, it knows the representation of it as presented by the SQL. 

Monday, July 20, 2020

Baking a Cake in the Cloud or not

If you want an efficient system, don’t bake a cake at home, let the store bake it for you.


I’m a SQL performance guy for the most part.  When I look at SQL a key thing is to understand the nature of the SQL and how it fits in to the application.  This is sort of philosophical, I’m trying to understand the why of the SQL.  And one attribute I look for is how is it being used to process the result set of rows.  


And a killer is the ever-popular row-by-row operation.  This is hardly efficient no matter how you look at it.  This is a philosophical issue at the core, how is the writer of the code is seeing the solution from a grand perspective.  


A good analogy is baking a cake, and getting the ingredients in particular.  The situation is this, you need to bake a cake and have none of the ingredients in your kitchen to bake this cake.  You have a list of the ingredients; flour, sugar, milk, eggs and so on.  How do you go about buying them?


Would you get in your car, drive to the store get the flour, pay for it, get back in your car, drive home put it on the counter, and check off flour from your list.  Then get back I the car, and do this cycle over again for sugar and each ingredient.  I sure hope not.  I suspect you would take the list and go to the store once, in the store get all ingredients in your cart (tram) and pay for the whole lot once and now go home and bake the cake.  


But even this isn’t likely best, why even bake the cake at home at all?  How about have the store do all the for you. Many grocery stores (at least in the USA) can bake cakes right in the store, and they are pretty good at it since they have staff and equipment to do it.  (And there are septicity shops where you can get even more elaborate baking creations, which you could do yourself, maybe on the fourth or fifth try.)


If you all you want is simple cake, you can just pick it up, pay for it and go.  Akin to a simple query in the database.  If you want to have a custom cake you call ahead and they have it ready for you when you arrive.  More like a complex query with lots of tables and maybe some fancy mathematical operations and such.  


Certainly, a lot of folks have gotten past the idea of getting “one row at a time” for their SQL.  But the second part of letting the database do most of the work is where many folks have trouble.  This is mainly due to their philosophy and training on coding, they are very good Java, Python, C++ or like programmers.  To them the database is just data, and they only want to get it from there into their application code.  There are queries bringing back millions of rows and then the application does more processing on them.  This is baking the cake at home, fun as baking a cake is, from a performance point of view it’s not ideal. 


Using a stored PL/SQL packaged procedure to get the work done is best.  Push the business logic into the database inside a PL/SQL block.  The database is really good at this and can likely process those millions of rows faster than your application layer can.  This is buying the already baked and decorated cake from the store.  


As I said at the top, if you want an efficient system, don’t bake a cake at home, let the store bake it for you.

Thursday, June 11, 2020

Lock down in the cloud or not

Well this sure has been a year.  And the odd thing is that for me, not much changed.  I do like to go to my local pub for a brew which I can’t.  Luckily, they have their own caning machine for the beers and I can still get great brews from them. 

On the professional side I’ve been doing a lot of on line training.  Some is better than others and it’s great to see the innovations that have been made in this area over the years.  So far, I’ve only done recorded type training, no interactive instructor giving the class.  

One that I just finished was quite good and I thought it was great that this guy did the recordings in a room at his house.  It was done over several days as far as I can tell and was really good training and the casualness of it, I thought was great.  

I’ve done some training on non-Oracle stuff (Ack!) and the more I did the more I missed Oracle.  I suppose a big part of that is that I’ve been using Oracle of 1000 years so it’s much more “natural” for me.  

But when I hear one of the speakers in this other product say “We’ve been doing this a really long time, since 2006.” 

I’m like, what?  I don’t call 14 years “a really long time”.   Sound like you’re just getting into a rhythm to me.  But I guess in today’s world where things change in a heartbeat, 14 years can seem like a “really long time”.  

The old saying of the more things change the more they stay the same is true too.  This Cloud thing is in way a lot like when I started collage and I “dialed in” to connect on a 1200 BAUD modem using a land line phone to have access to the computer.  Sure, it’s way cooler and faster (and has color!) then what I worked on, but the basic idea is the same.  “It’s someone else’s computer” that I’m doing work on.

It's clearly the way to go for IT at least right now.  The cost reduction for hardware and such alone for most companies makes it worth it.  

Who knows what will happen in the future, I'm sure something else will come along and make replace the Cloud, at least as we know it today.  What that will be I haven't a clue but change is the only constant in the universe, so something will.  

And speaking of how much cooler and faster the new computer systems are, I’m just finding out about this Oracle VirtualBox and that is really cool.  It is great for training and testing something.  And it looks like you can run some pretty big stuff with it.  I’d heard of it but never used one.  Now that I have, I think I’m hooked.  It will be great to have this on my laptop for testing and such. 

Well this was a more rambling post then normal, but this seems to be a rambling kind of time and it is the name of this blog.  Some days I’m not sure what day it is, and I really can’t believe it’s June already.  I hope very one is doing well and staying safe. 

Thursday, April 2, 2020

Working from Home

If you are one of the many folks that are suddenly trying to do your job from home, I’d like to take a moment to pass on some things I’ve learned. I’ve been working from home for years. It’s a different mindset. A couple of things I’ve found that might help you deal with this change:
Stay focused. You can get a lot more done at home then in the office. There isn’t the office politics to deal with which give you more time to do your job. On the flip side the job can take over your life. Set yourself hours for work and stick to it. 
Use technology. There is Skype, Zoom, Teams and other programs that can let you connect coworkers and of course your phone. On a gig recently a coworker would typically call me in the morning and he and I would stay on the phone for an hour or more. Not always talking but it was like he was next to me at the office. We could talk about what we were doing and the typical chit-chat. 
Don’t watch TV while working. Some folks can do this, but really its not good. You’ll get less done and feel like you're rushed when you realize you didn’t get such-and-such done. Some music for background noise is likely better and less distracting. 
Eat meals like you would going to an office or job site. Have a breakfast before you start, minimize snacking and have a dedicated lunch time. This keeps you on schedule and like is better for your health then just snaking all the time. 
Get up and move around now and then. Doing a simple domestic task once in a while to help break up the day a bit is good. Or just walk about and check for the mail. Don’t get into a big project, like cleaning the bathroom or painting a room. You wouldn’t do that when at the office or work site so don’t do that when you’re working. 
If you are the manger of folks that are now working from home, you may have the hardest adjustment. This is complex and totally different way to manage folks. In a nutshell you have to trust that they are doing their job. Lording over them and calling them every minute and checking their status on Skype or whatever is a great way to drive them and you crazy. 
Set up scheduled status calls, keep them short. Give them assignments to do and realistic deadlines. Help them with the technology, as best you can get them things they need. For example, having more than one monitor can really boost performance for folks doing a lot of computer-based work. Be open to help them, be it you actually helping them get something done or helping them find the resources they need to get something done. 
Your management skills will really be tested in ways you’ve never imagined. I’ve managed remote teams and it’s a challenge. Hopefully your team will be in the same time zone, that helps a lot. 
Good luck!

Monday, March 30, 2020

Hints in the cloud or not

Hints aren’t hints; they are directives. 
Don’t think of a hint as some way of encouraging the optimizer to do something, it is a demand to do something.  Generally, hints are only “ignored” when you miss spell it or put in something that can’t be done.  For example, have the FULL hint and an INDEX hint on the same table.  There are also cases when optimizing will transforms a query in such a way that hints in effect disappear from the final query that the optimizer really uses.  This isn’t common in most shops but can happen.  

Hints are best used rarely and must be documented when used.

If you find yourself “always” using a hint, then you have a problem.  Two hints I’ve seen used “out of habit” are the NO_MERGE and MATERILZE hints.  Just because you have sub-query doesn’t mean you need the NO_MERGE.   It might not be getting merge anyway and maybe it’s a good idea if it is.  The same basic issues are true with MATERLIZE on CTEs (Common Table Expressions or Sub-Query Factors).  The MATERLIZE hint tells the optimizer to resolve it as a separate query, and turn the results into a temporary table like structure.  And often this is exactly what we want to have happen with a CTE.  But the same can be true of a CTE, it sometimes wouldn’t be merged into the main query and even if it is, sometimes this is a good thing. 

If your testing validates that the hint helps the optimizer do the right thing and you can’t get it any other way, document it’s use.  This documentation, doesn’t have to be elaborate and is really for you.  Always document the code as if you will be working on this code again in two years and have no recollection of working on this code before.  Help yourself understand what you did today.  

Hints can be a fast fix for a bug.

I know this may be a shock, but there are bugs in the optimizer.  This is where the use of a hint might be a better solution then some elaborate coding trick.  Cardinality can be an issue here, and the CARDINALITY hint can help.  And CTEs again can be a problem.  If the cardinality estimate is way off for a CTE and that CTE is used many times in the main query, then the odds of the main query having a good plan are slim.  

If you can’t rework the code of the CTE to get a better estimate then a CARINAILTY hint can really save the day.  BUT this is a hint that could go bad over time.  Always put in a comment as to when this was used and that it will need to be checked in the future.  There may come a time it wouldn’t be needed or that the value used needs to be updated.  Keep in mind that the number needs to be in the correct magnitude, not exactly right. 

Hint should be viewed as a temporary fix and not the final solution. 

All this leads up to this, using hints are temporary fixes to a problem.  Often, we are tuning around bugs or poor coding.  If you don’t have access to the author of the query then significate rewriting of the code can be difficult or impossible.  Using a few well-placed hints can get the code to at least run in an acceptable amount of time.   This can at least get thru an “emergency” and hopefully there will come a time when the code can be rewritten or the bug fix is applied and the hints are no longer needed.

Monday, November 4, 2019

Aliases in Oracle in the cloud or not

Recently I posted in Twitter ( that you shouldn’t use single character aliases.  And I made the point that using full table names is best.  This of course sparked a good dialog that went on for about two days.  

After tuning hundreds of queries with thousands of lines in each of them, I really think that full table names are best.  Next best is shortening the name by removing non-leading vowels.   And even more important is make aliases unique within the statement.  This means if you use the same table more than once, put a 01, 02 and so on after each use.  

But you say “What if the table name is really long?” Are you really that lazy?  You can’t use copy/paste? 

But you say “My query is really short, using single characters just makes sense.”   No, this is about the habit.  You start doing in a small query, 5-6 lines long.  Then you say, well this one is only 20 lines its OK to use single characters.  Then these queries get pushed together, and before you know it you have a several hundred line query with all these “A” and “E” and “C” and what not in it. Which “A” goes with which table?  

And yes this applies to columns and subqueries as well.  Don’t have subquery “A” unioned with subquery “B”.  Give them a name.  Heck I’m OK with SUBQ01 and SUBQ02.  That is way better to search a huge query for SUBQ01 then looking for “A”.  Trust me, that is painful. 

I’m sticking to my guns on this.  Never use single character aliases.  Shorter names are fine, but keep it close to the name of the table.  Sure for EMPLOYEES using EMP is fine, but not E.  Because you’ll find yourself with several subqueries all using “E” and then you’re looking at the execution plan trying to figure out which sub query this predicate goes with.  And instead of it taking seconds, it’s hunting expedition to figure it out. I’ve had to do this way too many times. 

Of course all is good when it’s your code; you (presumably) knew what you were doing when you wrote it.  What about when someone else is reading your code?  

Or you are reading it 5 years later?  Do you really think you’ll remember what you were doing 5 years ago?  Heck I have trouble working on code from yesterday. 

Help others and yourself, don’t use single character aliases.  Make your code easier to read and debug by being more verbose then terse.  You’ll thank yourself later.