Monday, January 22, 2018

Does performance optimization still matter in the cloud or not?

Computers get faster, massively parallel processing is now standard, spinning disks are disappearing, and gigabit networks are common. So with all this improvement in hardware speed, do we need to optimize our code anymore?

If you’ve been in the industry for a while, you’ve seen this pattern before. Hardware gets faster and cheaper, and what was perceived as a problem before now seems to be solved with faster, better machines. And that would be true if we stayed with the same software and data load of times gone by. But we don’t.

A good example of this is just data volume. Many years ago, a 20-megabyte hard drive was considered big. (My first hard drive was as big as a bread box and held 5 megabytes. Yes, five.) Newer, faster, smaller drives came out. Then it was easy to store hundreds of megabytes, then gigabytes, and now terabytes are commonplace. What did this do? In the past a company might only store a year or so of data online, as they couldn’t afford the storage for more. Now they can easily store 10 or more years of data.

And what did that do? It demanded more of the computer to process all that data: faster CPUs, and more of them, were needed. And code had to be more efficient. In the Oracle world I work in, this drove different ways to store data, partitioning for example. But the code also needed to access the data the right way. The right index and predicate could mean the difference between getting a response in a reasonable amount of time and never having the query complete.
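To make the index and predicate point concrete, here is a minimal sketch. It is an illustration only: it uses SQLite through Python's built-in sqlite3 module rather than Oracle, and the table sales, the index sales_date_idx, and the helper plan are made-up names. The exact plan text differs between SQLite versions and looks nothing like an Oracle plan, but the idea carries over: the same question can be answered by an index lookup or by reading every row, depending on what exists and how the predicate is written.

```python
import sqlite3


def plan(conn, sql):
    """Return the plan steps SQLite reports for a query, joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row is (id, parent, notused, detail); the detail text describes the step.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, sale_date TEXT, amount REAL)"
)
# Hypothetical data: one sale per day for 28 days of each month of 2017.
conn.executemany(
    "INSERT INTO sales (sale_date, amount) VALUES (?, ?)",
    [(f"2017-{m:02d}-{d:02d}", m * d) for m in range(1, 13) for d in range(1, 29)],
)

# Same question asked two ways: a plain predicate on the column, and a
# predicate that wraps the column in a function.
query_direct = "SELECT SUM(amount) FROM sales WHERE sale_date = '2017-06-15'"
query_wrapped = (
    "SELECT SUM(amount) FROM sales WHERE substr(sale_date, 1, 10) = '2017-06-15'"
)

plan_no_index = plan(conn, query_direct)      # no index yet: every row is read

conn.execute("CREATE INDEX sales_date_idx ON sales (sale_date)")

plan_with_index = plan(conn, query_direct)    # the index can now be searched
plan_wrapped = plan(conn, query_wrapped)      # function on the column: index ignored

total_direct = conn.execute(query_direct).fetchone()[0]
total_wrapped = conn.execute(query_wrapped).fetchone()[0]

print("no index:      ", plan_no_index)
print("with index:    ", plan_with_index)
print("wrapped column:", plan_wrapped)
```

Both queries return the same total, which is the trap: the wrapped version is correct, it just does far more work to get there. On a few hundred rows nobody notices. On ten years of data, it is the difference described above.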

Today we’re seeing this all over again. We have “big data” now, and we want to process this data in the blink of an eye, to slice and dice sales data to get that marketing edge on the competition, or to make that decision on what to invest in, or to answer a thousand other questions. We will continue to ask for more, and want it faster, from our data as we accumulate more of it.

All the things I mentioned at the start are what give us even a chance of making this happen. What they don’t mean is that we can write sloppy code and just hope that the hardware will be fast enough to make up for it.

It’s not good enough that code gets the right results; it still has to do that efficiently. The right indexes still matter, the right predicates still matter, and the right statistics for the optimizer still matter. In Oracle land the database engine can do a lot of amazing things. But it still isn’t as good as we are. We can write better code, and we should all strive for that in every piece of code.