Monday, January 12, 2026

Parallel vs Simultaneous


Words mean things and we should always strive to use the correct words.  I am by no measure perfect at this, but I try my best to use the correct words when expressing myself in written and spoken communications.


A word that is often misused in the world of computer science is Parallel.  This word is often used when someone really means Simultaneous.  The meaning is rather important.  



Simultaneous – Occurring, operating, or done at the same time.

 

 

Parallel –  Parallelization is the technique of dividing a large computational task into smaller sub-tasks that can be executed concurrently on multiple processors or cores, with the goal of reducing overall computation time.



I understand why folks will say parallel when they mean simultaneous.  The more general definition of parallel is “two lines that are equal distance apart and never meet”.  Unfortunately, this is not the case in computer systems when things are run at the same time.  There is a very good chance that  they WILL meet and have contention when run at the same time.  The correct term here is Simultaneous not Parallel. 

 

In the computer world when something runs in parallel, the work done is divided up in such a way that the parallel processes will not meet or at least it is highly unlikely.   As an example, I’ll use parallel query in Oracle.  When the query starts all the extents of a table will be broken out into pieces by ROWID.  Then each parallel process works on one piece at a time.  This way no two parallel processes will ever try to access the same row at the same time.  There still can be some minor contention but this has minimized contention between the parallel processes to an absolute minimum. Parallel query is a very powerful tool, like all tools used correctly it is great.  Used poorly it can make things worse. 

 

When someone runs two statements (or more) that selects from the same table at the same time, there is pretty much a guarantee that these statements will meet and will have contention.  Because of the way the data is stored, it is nearly impossible to write statements in such a way that you can mimic what parallel query does.  It can be done, but is very sensitive to changes in the table making it at best impractical to do. 


I have had folks in the past thinking that they can do something like parallel query by running 2 or more queries on the same table and using the where clause to break out the data by something like a range of ID values or the alphabet.   This doesn’t work well, or really at all.  Because the data is not necessarily stored in such a way that this breakout works.  When rows are inserted into a table, they go where they fit.  Which could be anywhere in the table.   Because of this, when running multiple queries on the table all doing the same thing, can be much worse than just running one to get the job done.   

 

A key part of running a query in parallel is that there is a query coordinator that manages all the parallel processes to get the job done.  This helps to reduce contention to the minimum.  In effect the parallel processes know about each other and don’t interfere with each other. 


When running multiple queries that are doing the same thing, they don’t know about each other and are highly likely to bump into each other.  When they do, one or more will be waiting on another one and over all run time increases. 


So, a bit of rambling in this post.  I hope you get my point that things running Simultaneously is not the same as running in Parallel. 

No comments:

Post a Comment