Sunday, February 3, 2019

Changes in the cloud and on the ground

At the start of this year I embarked on a new adventure.  I’m now at Zione Solutions in a consulting role as a Senior Database Administrator and doing what I’ve been teaching for the past so many years.  It’s been quite a change, for one I am now using that strange new join syntax “LEFT OUTER JOIN … ON…” which I had successfully avoided for all these years.  It’s turned out to be harder than I expected to learn.  Also the queries I’ve been tuning are rather huge.  An average query is about 3-400 lines long, and the plans for these will easily be 1000 lines.   These are reporting queries the classic Data Warehouse type queries.  The smaller tables are millions of rows and 10’s of billions in a table is not uncommon.  Also it’s on Exadata.

I’ve seen firsthand that what I’ve been teaching works.  

One of the most important things you can do when tuning a query is; let the numbers lead you to the problem.   Some folks can look at a query and, without running it, make a good guesses as to where the problem will likely be.  And that can work, however why bother?  Run the query and use a tool like SQL Monitor to see how the plan is executing.  In the SQL Monitor report the activity column is a great way to see which step is taking the most percentage of time.   Now you can see exactly where a query is spending all it’s time and then pound that nail down. 

Also it’s about starting and staying small with the rows.  Here is where getting predicates applied at the right time and use of things like the WITH clause can really help.  With Exadata and working with huge volumes of data this concept is still critical to success; however the use of indexes is not a solution more often than not.

Why?  Not because indexes are bad, it’s just that index use tends to serialize the access.  This tends to cause nested loop joins which end up being very expensive many times because the plan will hit the inner table (or index) way too much.  This can be millions of times, even if using a unique index to look up values, doing it over a million times isn’t efficient.  A fast full index scan can work fine, since it works like a full table scan and can be in parallel.

Using subquery factors (also called CTE, Common Table Expression) are very powerful in these queries.  These are those temporary views you define in the WITH clause, typically it’s best to use the MATERIALIZE hint.   This hint forces the optimizer into making the sub query factor a global temp table like structor, without thie hint the optimizer might just merge it back into the query.  This is of course exactly what you don’t want to happen most times. Taking a subquery that is executed even just a couple times and making a sub query factor can be huge.  Also sometimes taking that subquery out of the main query and making it a “stand alone” query can help parallelization.   This can also work well to get that “start small stay small” thing. 

In the team that I’m working in now, it’s speculated that about 60% of the performance problems are solved with using sub query factors (CTEs).

Oh and did I mention LAT (Lateral) views?  Yea these tend to work very poorly in these kinds of queries.    Seeing one or more in these plans is a known killer to performance.  Making them go away can be tricky sometimes.  A typical cause of these is having an OR in the ON clause of an outer join.  These LAT view might work well with smaller data sets, but with large ones these can cause a query to stall at that LAT view for very long periods of time, like hours.  

For me it’s been quite a beginning to the year.  As the saying goes chance is inevitable, except from vending machines.  And this year has had plenty, but still not from vending machines.