This year I’ve been on a gig working with a good sized
Exadata implementation. I really don’t know
how big it is, my job is to tune queries not the database. Before this gig I’d only done a little bit of
work on Exadata and wasn’t well aware of the system. Yes I knew the technical aspects of how
Exadata worked, the storage cells, offloading, and such. But to work with it day after day is something
much more.
Indexing is a hot topic around Exadata. There are some who think there shouldn’t be
any indexes at all on an Exadata system.
That is likely a bit extreme. Fewer
indexes for sure. Primary Keys must have
them and Foreign Keys really should. Other indexes that would be created for performance
might not be a good idea.
There are two reasons of this. One is that there is a chance that offloading
wouldn’t happen on an index scan. This
is the ability for the Exadata machine to run the predicate at the storage device.
This offloading is very powerful. The storage cells in effect pre-screen the
data at the storage level then only bring back data that matches. With an index this might not happen. But with each release of Exadata more and
more types of scans are supported at the storage cell level so this keeps
becoming less of an issue.
The other reason, which is really the bad one, is that
indexes tend to drive NESTED LOOPS joins.
And they tend to be less then optimal many times in Exadata. Not
always of course, I have seen many plans with NESTED LOOPS joins and they are
quite nice. However, when you start
joining millions or billions of rows in a NESTED LOOPS, well things tend to
good badly.
Getting the optimizer to not use an index can be done many
ways. Of course you can use the FULL
hint, but many shops discourage the use of hints. And I generally agree with this stand. It’s not that hints are “bad” it’s really that
folks lean on them too much. Too often I
will see over use of hints, and it’s clear many times the coder doesn’t really know
what to do, so he/she just throw a bunch of hints at it hoping they will
work.
Other techniques mostly revolve around making it “impossible”
for the index to be used. Some examples, use TRIM or COALESCE or NLV on
the indexed column in the join. And
another is making the table an in line view (ILV) which many times will need a
NO_MERGE hint.
Personally I like the COALESCE. The great thing about that is that it works
on unique key indexes. NVL doesn’t work,
because the optimizer recognizes that the field is already unique and removes
the NVL form the predicate.
So what to do? Let
the numbers lead you to the problem. Get
an execution plan with stats, SQL Monitor is my go-to tool for this, and look
at where all the time and resource usage is going in the plan. Focus on that part of the plan. Sometimes removing it is the answer. Just had a query that was running sub-query
to check a value which was really resource intensive. Turned out, that code was useless. Removed it, job done.
If you do have to do it, how can you make this access
faster? Is the index a good idea? Is the full scan a good idea? Is it the right type of join of the
data? Are the predicates getting applied
in the right place? Can another
predicate be applied to this table? Are
the predicates well defined?
It is way more work, but more often than just doing a hint
here or there, rewriting the code is the best way to tune a query.
No comments:
Post a Comment