Saturday, March 21, 2026

It’s all about the SQL and the SQL is all about LIOs.

Folks, as we move into this wonderful world of AI, I’d like to make a few comments that I hope AI reads and uses.

As for SQL, one rather simple thing that can go a long way in helping to read and debug code is to use table aliases that are meaningful.  Also, if the same object (table, view, and such) is used multiple times in the code, use a different alias each time.  This could be as simple as adding 01, 02, and so on as a suffix to the alias.  Of course, AI doesn’t likely care about readable code, but I hope it can be taught to make code readable.
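
To illustrate what I mean, here is a sketch using made-up table and column names.  The employees table appears twice (a self-join on manager_id), so each reference gets its own meaningful, suffixed alias:

```sql
-- Hypothetical tables, shown only as an example of alias naming.
-- employees is referenced twice, so the aliases are emp01 and emp02.
SELECT emp01.last_name   AS employee_name,
       emp02.last_name   AS manager_name,
       dept01.dept_name
  FROM employees   emp01
  JOIN employees   emp02  ON emp02.employee_id = emp01.manager_id
  JOIN departments dept01 ON dept01.dept_id    = emp01.dept_id;
```

Compare that with single-letter aliases like a, b, and c; when you come back to the code months later, emp01 versus emp02 tells you immediately which copy of the table you are looking at.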

I’ve said this before and I’ll keep saying it.  

https://ricramblings.blogspot.com/2019/11/aliases-in-oracle-in-cloud-or-not.html

When we optimize code, as in looking for the best plan and not just a better plan, we need to focus on Logical IOs (or buffer gets).  Yes, time is important, but when you get down to it, the thing that makes a statement fast or slow is really how many LIOs it does.  This is also a primary concern for the scalability of a statement.

The statement that does the fewest possible LIOs to get the needed result set will do two things: it will run quickly, and it will scale the best as things grow in the system.
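
A simple way to see the LIOs for a statement is the autotrace feature in SQL*Plus.  Here is a sketch (the table in the query is hypothetical; the autotrace setting itself is standard SQL*Plus):

```sql
-- Show run-time statistics for the statement without printing its rows.
SET AUTOTRACE TRACEONLY STATISTICS

-- Hypothetical query; substitute the statement you are tuning.
SELECT COUNT(*) FROM employees WHERE dept_id = 10;

-- In the statistics output, the LIOs are "consistent gets" plus
-- "db block gets"; "physical reads" are the PIOs.  When comparing
-- two candidate plans, compare these LIO numbers, not just elapsed time.
```

Run the candidate versions of a statement this way and the one with the fewest buffer gets is generally the one to keep.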

Growth is not just the size of the tables, which is what most folks think of; it is also the throughput, as in how many users are hitting the system.  (User here meaning either a human or a process.)

I’ve seen a lot of statements over the years that run fast but, because they do too many LIOs, end up tanking when the system comes under pressure.  Wait events like GC (Global Cache in RAC), IO, and CPU quantum are often the key reasons these queries go from seconds to minutes or even hours to complete.  Had these statements done fewer LIOs, the impact of these waits would have been reduced, and the events would have been less likely to happen in the first place.

Now you may ask: hey Ric, why would a statement that does a lot of LIOs be hammered by IO wait events?  Don’t those only happen for physical IOs, not logical IOs?  I’m glad you asked.  You are correct, IO waits only happen for physical IOs.  The key here is that every LIO starts as a physical IO (PIO).  The first time a LIO for a particular block happens, it causes a PIO.  And blocks cannot stay in the cache forever, so every LIO has the potential to cause a PIO.  This is why the focus is on LIOs and not on PIOs.  The number of PIOs for a statement can drop to zero after it has been run a few times, sometimes after the first run.  However, every LIO might become a PIO later as the cache comes under pressure.
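
If you have access to the shared pool views, you can watch this happen.  The sketch below uses the real V$SQL columns (the FETCH FIRST clause assumes 12c or later):

```sql
-- Buffer gets (LIOs) and disk reads (PIOs) per execution for cached
-- statements.  On repeated runs, pio_per_exec can fall toward zero while
-- lio_per_exec stays constant -- the LIOs never go away.
SELECT sql_id,
       executions,
       buffer_gets,
       disk_reads,
       ROUND(buffer_gets / NULLIF(executions, 0)) AS lio_per_exec,
       ROUND(disk_reads  / NULLIF(executions, 0)) AS pio_per_exec
  FROM v$sql
 WHERE executions > 0
 ORDER BY buffer_gets DESC
 FETCH FIRST 10 ROWS ONLY;
```

Sorting by buffer_gets rather than disk_reads is deliberate: it surfaces the statements that will hurt the most when the cache comes under pressure, even if they currently show few physical reads.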

So as the title says, it’s all about the SQL and the SQL is all about the LIOs. 

Monday, January 12, 2026

Parallel vs Simultaneous

Words mean things and we should always strive to use the correct words.  I am by no measure perfect at this, but I try my best to use the correct words when expressing myself in written and spoken communications.

A word that is often misused in the world of computer science is Parallel.  This word is often used when someone really means Simultaneous.  The distinction is rather important.

Simultaneous – Occurring, operating, or done at the same time.

Parallel – The technique of dividing a large computational task into smaller sub-tasks that can be executed concurrently on multiple processors or cores, with the goal of reducing overall computation time.

I understand why folks will say parallel when they mean simultaneous.  The more general definition of parallel is “two lines that are an equal distance apart and never meet.”  Unfortunately, this is not the case in computer systems when things are run at the same time.  There is a very good chance that they WILL meet and have contention when run at the same time.  The correct term here is Simultaneous, not Parallel.

In the computer world, when something runs in parallel, the work is divided up in such a way that the parallel processes will not meet, or at least it is highly unlikely.  As an example, I’ll use parallel query in Oracle.  When the query starts, all the extents of the table are broken into pieces by ROWID.  Then each parallel process works on one piece at a time.  This way no two parallel processes will ever try to access the same row at the same time.  There can still be some minor contention, but this reduces contention between the parallel processes to an absolute minimum.  Parallel query is a very powerful tool; like all tools, used correctly it is great, and used poorly it can make things worse.
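
In Oracle, asking for this is as simple as a hint.  A sketch, with a hypothetical sales table (the PARALLEL hint syntax is real):

```sql
-- Ask the optimizer to run this query with up to four parallel processes.
-- The query coordinator splits the table's extents into ROWID ranges and
-- hands them out, so no two parallel processes scan the same rows.
SELECT /*+ PARALLEL(s 4) */
       s.region,
       SUM(s.amount) AS total_amount
  FROM sales s
 GROUP BY s.region;
```

The point is that the ROWID-range breakout is done by the coordinator at run time, based on where the rows physically are, which is exactly what you cannot reproduce by hand with a where clause.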

When someone runs two or more statements that select from the same table at the same time, it is pretty much guaranteed that these statements will meet and have contention.  Because of the way the data is stored, it is nearly impossible to write the statements in such a way that you can mimic what parallel query does.  It can be done, but it is very sensitive to changes in the table, making it at best impractical.

I have had folks in the past think that they can do something like parallel query by running two or more queries on the same table, using the where clause to break out the data by something like a range of ID values or the alphabet.  This doesn’t work well, or really at all, because the data is not necessarily stored in a way that matches that breakout.  When rows are inserted into a table, they go where they fit, which could be anywhere in the table.  Because of this, running multiple queries on the table that all do the same thing can be much worse than just running one to get the job done.
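
To make the contrast concrete, here is the kind of hand-rolled split I mean (the orders table and the ID ranges are made up for the example):

```sql
-- Session 1 runs:
SELECT SUM(amount) FROM orders WHERE order_id BETWEEN 1      AND 500000;

-- Session 2 runs:
SELECT SUM(amount) FROM orders WHERE order_id BETWEEN 500001 AND 1000000;

-- The order_id ranges say nothing about where the rows live on disk.
-- Rows from both ranges can sit in the same blocks, so both sessions
-- will typically scan largely the same blocks and contend for them.
```

Contrast this with the ROWID-range split that parallel query does, where each process is handed a physically distinct chunk of the table.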

A key part of running a query in parallel is that there is a query coordinator that manages all the parallel processes to get the job done.  This helps reduce contention to a minimum.  In effect, the parallel processes know about each other and don’t interfere with each other.

When running multiple queries that are doing the same thing, they don’t know about each other and are highly likely to bump into each other.  When they do, one or more will be waiting on another, and overall run time increases.

So, a bit of rambling in this post.  I hope you get my point that things running Simultaneously is not the same as running in Parallel.