Wednesday, November 2, 2022

Automatic Plan Management – not the cure all.


There has been a decent amount of buzz out there about Oracle’s Automatic SQL Plan Management.  It’s not exactly new, this feature has its roots in 11.  Starting in 19 for Autonomous databases it is on by default.   Which isn’t necessarily a bad thing.

 

My concern is that folks may make a bad assumption about how this works.  

 

The actually assumption that this feature works on is that the code is well written to begin with.  This isn’t a mechanism  that will be able to rewrite code to make it better, it will just make sure that the plan used for the code doesn’t get worse.  Those are two very different things.  

 

A good example would be code that doesn’t scale well.  The classic example being correlated subqueries in the SELECT LIST (and the WHERE clause can be worse, just not as common).   Any correlated sub query has the potential to run for each row returned by the main query.  If the main query is only returning a relatively small set of rows, these types of queries can run well.  (Yea this is a topic I talk about a lot, because I see it a lot in queries that run poorly.) 

 

However, as the data grows these queries will by nature run slower.

 

Automatic Plan Management can only do so much.  It can try to get the best plan for the code as written.

 

Also, this sort of automation tends to be reactionary, things have to run badly before it knows it’s bad.   An analogy would be driving to a reasonably far away destination.  If you and I were about to get on a highway and see that it is backed up, we could avoid getting on the highway and driving a slightly different route that still keeps on relativity on time to the destination.  The optimizer would just get on the highway (following the plan) and get to the destination late and then realize that was a bad idea.  

 

The bad assumption I referred to earlier that folks might make, is that this feature will make up for poor code.  It can’t.  Bad code will still have a less then optimal plans and this feature will do it’s best to try to use the best plan for it, but it will never be a good as a well written statement.   I use the joke that a car designed to run on square wheels will never run well, no matter how good the suspension and how powerful the engine is, it will be ruff ride and be highly inefficient.  

 

With a well written (and scalable) statement to start with, yes, this feature is likely to help.  But with poor statements to begin with, this may just add a layer of complexity to figuring out why the performance is so bad. 


But with enough beer, you might not notice there is a problem eh?  

Wednesday, August 17, 2022

REGEXP_LIKE and LIKE


I don’t use REGEXP much but ran across this interesting issue just recently.  

Apparently REGEXP_LIKE doesn’t like parentheses in search strings, the LIKE operator doesn’t have an issue with parentheses.  

The full code will be at the bottom of this if you’d like to try it yourself.  I did run this on a super-cool Autonomous Database, version 21.3 so it’s not an old version thing.

What I do in the test is compare a column form one table to another, this is simulating the problem that I ran into recently.  In the string are some parentheses.  And you’ll notice that when these values are in the first string in the REGEXP_LIKE operation, it fails.

The Right-side data will be this (Table called regextest01, yea I was not very creative with my table name for this):

'ABCD(XXX)101',  'ABCD(xxx',  'ABCD)xxxx'

Then the Left-side will be these values one at a time (Table called regextest02, yea I was not very creative with my table names for this example):

'ABCD',  'ABCD(xxx)',  'ABCD(',  'ABCD)'

The first one as you expect works fine for both the REGEXP_LIKE and the LIKE, returning all three matching rows. 

The second one there is no error for REGEXP_LIKE however it returns no rows. The LIKE returns the one you would expect with the value in the first table of 'ABCD(XXX)101'.   Hum that’s odd, it appears that the parentheses inside the string for REGEXP_LIKE are not treated as part of the string but as some sort of parameter or something.  Very odd to me at least.

And the last two error out with:

Error report -

ORA-12725: unmatched parentheses in regular expression

For the REGEXP_LIKE, but work just fine with the LIKE.

Interesting.   So, for me at least, just one more reason to not use REGEXP_LIKE.  Clearly the parentheses are a problem for REGEXP_LIKE and not an issue for LIKE. 

In the script below each comparison is done one at a time.  Just to make it easier to see how each comparison works.  Full code here so you can run this and see it live and in color, well maybe not much color:


/* test code of REGEXP_LIKE testing for parentheses */

CREATE TABLE regextest01 (

    id      NUMBER,

    string01 VARCHAR2(20)

);


CREATE TABLE regextest02 (

    id       NUMBER,

    string02 VARCHAR2(20)

);


insert into regextest01 values (11, 'ABCD(XXX)101');

insert into regextest01 values (22, 'ABCD(xxx');

insert into regextest01 values (33, 'ABCD)xxxx');

insert into regextest02 values (11, 'ABCD');

prompt Value in regextest02: ABCD

prompt Using REXEXP_LIKE:

select * from regextest01 rt01, regextest02 rt02 

where regexp_like(rt01.string01, rt02.string02, 'i');

prompt Using LIKE:

select * from regextest01 rt01, regextest02 rt02 

where upper(rt01.string01) like '%'||upper(rt02.string02)||'%';

/* Change the comparison string to have an opening and closing parentheses in it*/

delete from regextest02;

insert into regextest02 values (11, 'ABCD(xxx)');

prompt Value in regextest02: ABCD(xxx)

prompt Using REXEXP_LIKE:

select * from regextest01 rt01, regextest02 rt02 

where regexp_like(rt01.string01, rt02.string02, 'i');

prompt Using LIKE:

select * from regextest01 rt01, regextest02 rt02 

where upper(rt01.string01) like '%'||upper(rt02.string02)||'%';

/* Change the comparison string to have an opening parentheses in it*/

delete from regextest02;

insert into regextest02 values (11, 'ABCD(');

prompt Value in regextest02: ABCD(

prompt Using REXEXP_LIKE:

select * from regextest01 rt01, regextest02 rt02 

where regexp_like(rt01.string01, rt02.string02, 'i');

prompt Using LIKE:

select * from regextest01 rt01, regextest02 rt02 

where upper(rt01.string01) like '%'||upper(rt02.string02)||'%';

/* Change the comparison string to have an closing parentheses in it*/

delete from regextest02;

insert into regextest02 values (11, 'ABCD)');

prompt Value in regextest02: ABCD)

prompt Using REXEXP_LIKE:

select * from regextest01 rt01, regextest02 rt02 

where regexp_like(rt01.string01, rt02.string02, 'i');

prompt Using LIKE:

select * from regextest01 rt01, regextest02 rt02 

where upper(rt01.string01) like '%'||upper(rt02.string02)||'%';

drop table regextest01 purge;

drop table regextest02 purge;


Wednesday, March 2, 2022

Comments in the cloud or not.

SQL is great and to make it more readable comments are wonderful.  Most code need more comments, some a lot more.  I know you're saying “Hey this code was hard to write, so it should be hard to read!”  Yea, no.  I tend write comments for me, not you.  I might have to maintain this code in the future and I’d like to at least give myself some idea of why I did certain things.

But let’s talk about HOW to do your comments.  Please use the /*  */ syntax more than not.

Why?  Take a look at this fictional code bock:

-- Getting base information for XTLOP processing select col1, --col2, col3 from mycooltable mct, othercooltable oct where  mct.id=oct.id and oct.name in (‘TIM’, ‘KELLYN’,--‘JEFF’,’RICHARD’) -- and mct.abc_id = 42 and mct.working = 101;

So, what is and what isn’t’ a comment?  Yea not exactly clear. Is the whole statement commented out? Was JEFF and RICHARD commented out of the list or just JEFF?  Are both the last two where conditions commented out? 

But Ric, the code wouldn’t look like that!  Oh yes it can.  Depending on how you are getting the code it very well may be just a block of code where line integrity is lost.  I’ve had to deal with this time and again when working on a system that I have limited access to in particular.  Sometime you can make pretty good guesses as to what the code looks like.  

But what if the code had been like this:

/*Getting base information for XTLOP processing*/ select col1, /*col2,*/ col3 from mycooltable mct, othercooltable oct where  mct.id=oct.id and oct.name in (‘TIM’, ‘KELLYN’,/*‘JEFF’,’RICHARD’*/)  /*and mct.abc_id = 42*/ and mct.working = 101;

Now it’s crystal clear what is and what is not commented out.  Yes, it is a bit harder to do the comments with the /* to start and */ to end it, but it does make for much more readable code. 

The example used here is pretty simple and likely not too hard to figure out for the most part. I’ve had to deal with this sort of thing but with hundreds of lines of code.  Not trivial to “parse” something like that.   Comments are good, use them more than not.  And when you do, please use the block comment style with /* and */ rather than the line comment style with the --.  

Now I need to talk with Jeff Smith about changing how SQL Developer comments lines.   Maybe if I send him some muffins, he’ll take interest in my suggestion. 






Tuesday, February 1, 2022

SELECT * in the cloud or not


I was in a recent Tweeter debate about the use of SELECT * within CTEs (Common Table Expressions) in particular.  But really this goes beyond just using them in CTEs. 

 

Sure the * (star or splat as some folks say) is a nice short cut to get all the columns from a table (or set of tables).  But it can be a problem.  

 

For example, take this simple query:

 

select * from emp, dept where emp.deptno = dept.deptno;

 

This works just fine in SQL*Plus (or SQLcl).  There is a duplicate column name in the two tables DEPTNO.  SQL*Plus handles this just fine showing the same column twice.  SQLcl names the second occurrence of the column DEPTNO_1, below output is from SQL*Dev.  Knowing which column is from which table isn’t too hard to figure out (the columns are in order as defined in the tables) but it isn’t necessarily obvious to the casual observer. 

 

 


 

But put that same query in a CURSOR in a PL/SQL block and you got trouble:

 

declare 

   CURSOR EMP_CUR is

   select * from emp, dept where emp.deptno = dept.deptno;

begin

   FOR EMP_REC in EMP_CUR

   LOOP

   DBMS_OUTPUT.PUTLINE('Name: '||EMP_REC.ENAME);

   END LOOP;

end;

/

 

Error report -

ORA-06550: line 5, column 19:

PLS-00402: alias required in SELECT list of cursor to avoid duplicate column names

ORA-06550: line 5, column 4:

PL/SQL: Statement ignored

 

PL/SQL doesn’t like duplicate column names and will not run.  Bummer.  This can be a huge problem when working on a query and you’re embedding it into a testing script that is calling the SQL within a PL/SQL block.  And imagine a query that is 100s of lines with many subqueries.  This can be a nightmare to find the duplicate columns.  

 

Yes, I’ve had this exact problem in the past.  

 

More than once.  

 

Not fun. 

 

Then there is the issue of CTEs.  Why not have CTEs with SELECT * FROM? Even if we don’t have the duplicate column issue, this can make the code harder to debug.  It can be difficult to figure out “where did this column come from?” Think of a set of CTEs that are, in effect, processed in a sequential fashion.  When each one has SELECT * FROM  as its main select list and has a prior CTE (or more than one) in the FROM.  It can be quite a trick to figure out where a column enters into the mix.

 

Also, this certainly can be wasteful, and might even be a problem.  The issue is memory.  Each CTE is often materialized, meaning it becomes a structure that is akin to a temporary table.  This table like structure takes up space in the PGA (Program Global Area, not Professional Golf Association). 

 

It may not seem like much but each column takes up space within the row, and that space is multiplied by the number of rows in the return set.  Sure, when I say do a SELECT * FROM the good old EMP table with only 9 columns and 14 rows, it doesn’t really make a difference if I were to leave out some columns.  

 

But we are rarely optimizing a query with tables that small, either in columns or rows.  Say you have a table with 80+ columns, some are the very popular VARCHAR2(4000), and many have at least some data in them, and to top it off, you don’t use any of them in your query.  With a CTE returning 100s of thousands into the millions of rows, that’s a lot of wasted space.  And that’s just one table. 

 

Also remember that it’s unlikely that many of our queries are just run by one person.  Take that query and have a couple 100 (or 1000) folks running it at more or less then same time.  This makes that extra (unnecessary) memory usage a problem. 

 

And just to add to the fun, there is the problem of someone altering the table to add a column (or columns) that your query wasn’t expecting which could break your code.  Typically, this will be an output issue even if the query still executes without error.  And how about they add a column that now causes a duplicate column name that wasn’t there before! Ouch! 

 

Of course dropping a column from a table that is used in your query is a problem whether naming the columns or using the star. 

 

But Ric! It’s just so easy to use SELECT * FROM!  Yes, I understand it many take a bit of work and time to get that column list tweaked just right. But it sure can make working on the query later easier and can save on memory which can be seriously helpful in the running of the SQL on the system over time.


I know you may be thinking something like "this code I'm writing is a short simple query, it's no big deal to use the  SELECT * FROM."  That might be true, but this is also about the habit, like using your blinkers on your car, best to get in the right habit of using the blinkers every time you make a turn.  It's better for everyone.  


Monday, January 10, 2022

I’m back! (I never really left.)



Pretty much my entire professional career has been Oracle related.  The first time I heard of Oracle as a company I was in college.  The professor of the RDMS class I was taking mentioned that he read about Oracle and thought they had a good product.  (We studied Ingres in class, I think it’s still around.)  

 

My first use of Oracle was version 4 or 5 (not real sure) doing some in house training at a small consulting company.  Then a few years later I went in full on with Oracle 6 at Ford.  I also supported a 5 database at the time.  From then on, it was Oracle stuff.  

 

I started working at Oracle Education (Now Oracle University) as an instructor in 1995.  I did several other gigs in Oracle including Education Manger, Curriculum Developer and Technical Manger in consulting.  I went to Hotsos in 2005 doing mostly teaching but also some consulting.  Hotsos went out of business in December 2018, a dark day indeed.  I went to Zione Solutions as a consultant doing several gigs there.  

 

And now I’m back at Oracle as a Sr. Principal Software Engineer.  (Hum, do I get a cool Railroad Engineer hat for this? I’ll have to check on that.)

 

It’s been a heck of a ride so far, and more to come! 

Wednesday, January 5, 2022

NO_DATA_FOUND in the cloud or not.

 


This is pretty basic for PL/SQL but worth a quick review.   

 

When you do a SELECT … INTO in a PL/SQL block, if it returns no rows the exception NO_DATA_FOUND is raised.    

 

Why is this important?  Consider this block of code:

 

DECLARE

    today DATE;

BEGIN

    SELECT

        sysdate

    INTO today

    FROM

        dual

    WHERE

        1 = 2;

    IF today IS NULL THEN

        RETURN;

    END IF;

    dbms_output.put_line(today);

END;

/

 

This is a simplified example of something I just ran into.  The idea was that the select was getting a value from the table in a variable.  If nothing was returned, exit out of the procedure, otherwise continue processing.  Of course, the predicate wasn’t “1=2” in the code I was reviewing, this is to simulate the data not existing in the table. 

 

Important Note: The column being selected is a NOT NULL column, so the returned value will never be null so long as the condition was TURE.  It would only be “NULL” if the condition was FALSE, as in, no rows returned.

 

What does happen when this is run, is that the unhandled exception NO_DATA_FOUND is returned to the calling environment.  The IF checking for the NULL is not run when there is no data returned, the exception is raised and the execution immediately returns to the calling environment in an error state. 

 

ORA-01403: no data found

ORA-06512: at line 4

01403. 00000 -  "no data found"

*Cause:    No data was found from the objects.

*Action:   There was no data from the objects which may be due to end of fetch.

 

Not likely the intent of the coder. Maybe, but given that IF right after the SELECT … INTO it seems highly unlikely that this was the intention.  If it was the intention to have it exit in error, then the IF shouldn’t be there. 

 

How to fix this? The most direct way is to handle the exception locally like this.

 

DECLARE

    today DATE;

BEGIN

    BEGIN

        SELECT

            sysdate

        INTO today

        FROM

            dual

        WHERE

            1 = 2;

 

    EXCEPTION

        WHEN no_data_found THEN

            RETURN;

    END;

 

    dbms_output.put_line(today);

END;

/

 

The double “BEGIN” in this example looks odd, the code I was working with this was a large procedure (3200+ lines long) so it was just a small part of the code.  

 

The intent here with surrounding the SELECT … INTO with an anonymous block is to handle the exception locally.  Now the block runs with:

 

PL/SQL procedure successfully completed.

 

Which is likely what the original coder was intending to have happen.   There are other things to consider, you could have an exception handler at the bottom of the code to handle all NO_DATA_FOUND exceptions rained, assuming you want to do the same thing if any of the SELECT … INTO statements.  

 

The advantage of the technique I show here, you can handle each one on its own and decided what to do for each one separately.  Which might be nothing.  To do nothing,  just put in NULL; in place of the RETURN; in the example I have above. 


Have a great 2022 everyone!!