Friday, July 20, 2012

Does the block size really matter?


A classic debate over the years in Oracle Land is that of block size.  There are two main camps in this debate: smaller block sizes for transactional systems and larger ones for data warehouse systems.


The argument for a smaller block size in a transactional system is that with fewer rows per block there is a better ratio of interested transaction list (ITL) entries in the block header to rows.  An ITL entry is needed to lock one or more rows in the block, and a transactional system generally has many sessions doing single-row (or very-few-row) DML concurrently, so a low rows-to-ITL ratio is good.  With just a few ITL entries in any given block there should be enough to lock all the rows that need locking.  If there were a lot of rows per block and many sessions tried to lock different rows in the same block at the same time, the block could run out of ITL entries and sessions would have to wait.
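As a rough illustration (the table name and settings here are made up, not part of the tests below), the INITRANS attribute is what pre-allocates ITL slots in each block header, so a transactional table expected to see concurrent DML against the same blocks might be created along these lines:

    CREATE TABLE orders_oltp (
       order_id  NUMBER,
       status    VARCHAR2(10)
    )
    INITRANS 4      -- pre-allocate 4 ITL slots per block
    PCTFREE  10;    -- leave free space so extra ITL slots can be added if needed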


For a data warehouse things are generally the opposite.  Queries tend to be full table scans and there is little (if any) concurrent DML, so having many rows per ITL entry isn't as big a deal.  Some systems have only one process doing scheduled DML operations; since just that one process is running, a single ITL entry per block is fine.


This argument is valid and worth considering in your design, but using it as the sole reason for picking a block size is rather limited.  It is difficult to see whether the ratio of rows to ITL entries is actually causing issues.  The real issue foremost on most folks' minds is performance, as in: does the block size make a difference in how fast a query runs?
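If you do want to check whether an ITL shortage is actually hurting, Oracle keeps a segment-level counter for it.  A rough sketch of the kind of query that shows it, assuming you have access to V$SEGMENT_STATISTICS:

    SELECT owner, object_name, value AS itl_waits
    FROM   v$segment_statistics
    WHERE  statistic_name = 'ITL waits'
    AND    value > 0
    ORDER  BY value DESC;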


The basic argument here is twofold.  First, for index scans, the claim is that the larger the block size the shorter the index (its height, BLEVEL) and hence the faster the index.  Second, for full table scans, the larger the block the faster the scan, since more rows fit per block and there are fewer blocks to read.
Let's take a look at some numbers to see how these arguments stack up.  Granted this is a simple example, but whether or not we see a performance gain in a simple example is a pretty good indicator of what will happen in a complex one.  First, the index question.
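For reference, the index height and leaf block counts quoted below come from the optimizer statistics; something like the following shows them (the index name is just a placeholder):

    SELECT index_name, blevel, leaf_blocks
    FROM   user_indexes
    WHERE  index_name = 'MY_TEST_IX';   -- placeholder name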


Our test data is a table in an 8K block size tablespace: 9,254,528 rows, 131,840 blocks and about 1G in size.  We will do an index range scan on this table in three tests, with the index built in tablespaces of 4K, 8K and 16K block sizes (a sketch of how such tablespaces can be created follows the stats).  The index stats for each are:
4K – Levels 4, Leaf blocks 197,987
8K – Levels 3, Leaf blocks 95,873 (reduction of 52%)
16K – Levels 2, Leaf blocks 47,177 (reduction of 51%, and 76% from the 4K)
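As promised, the tablespace sketch: a tablespace with a non-default block size needs a matching buffer cache before it can be created.  The names, sizes and file paths here are assumptions, not the actual test setup:

    ALTER SYSTEM SET db_16k_cache_size = 256M;   -- cache for the 16K blocks

    CREATE TABLESPACE ts_16k
       DATAFILE '/u01/oradata/ts_16k_01.dbf' SIZE 2G
       BLOCKSIZE 16K;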


The run time statistics of each of the index scans:
                         4K Block    8K Block    16K Block
    Consistent Gets        14,017      13,904       13,846
    Buffer Pinned Count    12,065      12,065       12,065
    Time*                  22,172      22,297       22,165
(* The time is a representative time from multiple runs, in microseconds)
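For those wondering where these counters come from: they are session statistics, and one way to capture them is to snapshot the session's statistics before and after the query and take the difference.  A rough sketch of the snapshot query:

    SELECT sn.name, ms.value
    FROM   v$mystat   ms
    JOIN   v$statname sn ON sn.statistic# = ms.statistic#
    WHERE  sn.name IN ('consistent gets', 'buffer is pinned count');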


Notice that the performance was hardly affected by the block size change.  Almost no matter how you look at it, performance wasn't significantly changed by the block size.  The Buffer Pinned Count stayed the same, which makes perfect sense: it is the number of rows retrieved from the table, and the same ROWIDs would be retrieved from the index regardless of the block size.


So, what can one say about a full table scan?  Here I recreated the table in tablespaces with 4K, 8K and 16K block sizes, with no indexes.


The table had 9,254,784 rows in all three incarnations.  The number of blocks in each table was:
4K - 269,824
8K - 131,840  (51% reduction from 4K)
16K - 65,216  (50% reduction from 8K, 75% reduction from 4K)


The size of the table in megabytes didn't change a lot: 1,054M for 4K, 1,030M for 8K and 1,019M for 16K.  That's roughly a 2% drop at each step, and just over a 3% drop from 4K to 16K.
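The block counts and sizes above can be pulled straight from the dictionary; for example (the table name is a placeholder, and the segment size may differ slightly from the stats-based block count since it reflects allocated space):

    SELECT table_name, num_rows, blocks
    FROM   user_tables
    WHERE  table_name = 'MY_TEST_TAB';   -- blocks/num_rows come from gathered stats

    SELECT ROUND(bytes/1024/1024) AS size_mb
    FROM   user_segments
    WHERE  segment_name = 'MY_TEST_TAB';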


The run time statistics of a full table scan for each block size:
                       4K Block      8K Block     16K Block
    Consistent Gets     534,811       261,354       121,494
    Time*            11,590,478    15,411,618    14,280,009
(* The time is a representative time from multiple runs, in microseconds)


Not surprisingly, the LIOs (consistent gets) dropped right in line with the smaller number of blocks in the table.  This is rather nice, and likely good for reducing contention in particular.  However, it is interesting to note that the time went up as the block size got larger.  A reasonable explanation is that although the scan reads fewer blocks, each block is larger and hence overall takes longer to read.


So what's the bottom line?  It seems rather doubtful that a larger block size will have much (if any) impact on performance.  It does seem that it can reduce the overall size of objects (tables and indexes), which in itself is a good thing.  However, don't expect the performance of your application to change significantly just because you use a larger block size.