PostgreSQL Syntax is a Mystery Only Exceeded by its Power

I just ran across this rather powerful PostgreSQL example statement from the manual. It reads:

"Increment the sales count of the salesperson who manages the account for Acme Corporation, and record the whole updated row along with current time in a log table:"

WITH upd AS (
  UPDATE employees
  SET sales_count = sales_count + 1
  WHERE id = (
    SELECT sales_person
    FROM accounts
    WHERE name = 'Acme Corporation'
  )
  RETURNING *
)
INSERT INTO employees_log
SELECT *, current_timestamp
FROM upd;

This database keeps amazing me. A single statement allows for updating one table, taking the updated record(s), including all generated values, as input for an insert into another table. But at the same time, PostgreSQL does not yet implement the SQL:2003 MERGE statement. Crazy database…
UPSERTing, a.k.a. MERGE, is a complicated subject for truly ACID-compliant databases. The databases that implement MERGE generally sacrifice something (performance or consistency). Thus, blindly using MERGE just because the database supports it is not necessarily a good idea.
However, Postgres could offer some solution that fits maybe 80% of cases. That said, I see the use cases of upserting as very divided: stats that matter, like money, where you will sacrifice speed, versus stats that don't matter, like Facebook likes.
That being said, “upserting” is very useful for real-time analytics, and I have run into Postgres issues with transaction/lock performance. I think this is where databases like VoltDB or NoSQL options like Redis really shine (they can do it, generally because they are in-memory and synchronous).
Yeah, a powerful MERGE execution touching millions of records is certainly not a good idea. But if you know you’ll operate only on one record (if it exists), then there is not that big of a performance loss. I particularly like the H2 MERGE syntax, which is really a very simple UPSERT.
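For reference, a single-row upsert in H2's MERGE syntax looks roughly like this (the table and column names here are made up for illustration):

```sql
-- H2 syntax (not PostgreSQL): inserts the row if no row with the
-- given KEY value exists, otherwise updates the existing row.
MERGE INTO page_views (page_id, view_count)
KEY (page_id)
VALUES (42, 1);
```

Compared to the full SQL:2003 MERGE, there is no separate source table and no WHEN MATCHED / WHEN NOT MATCHED branches, which is why it reads like a plain single-row UPSERT.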
Do you have any experience with VoltDB or other of Stonebraker’s NewSQL stores?
The complex part of UPSERT is retaining ACID while still having good performance and concurrency. These are the concerns which prevented UPSERT from being included in 9.4.
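The concurrency problem is visible in the classic pre-9.5 workaround: the PostgreSQL manual suggested emulating UPSERT with a PL/pgSQL retry loop, precisely because a naive UPDATE-then-INSERT can race with concurrent transactions. A sketch along those lines, reusing the employees table from the post above (the function name and columns are illustrative):

```sql
-- Illustrative PL/pgSQL sketch of the classic pre-9.5 upsert loop.
-- If the UPDATE matches no row, try to INSERT; if a concurrent
-- transaction inserted the same key first, the unique_violation
-- is caught and the loop retries the UPDATE.
CREATE OR REPLACE FUNCTION upsert_sales(p_id int, p_count int)
RETURNS void AS $$
BEGIN
  LOOP
    UPDATE employees SET sales_count = p_count WHERE id = p_id;
    IF FOUND THEN
      RETURN;
    END IF;
    BEGIN
      INSERT INTO employees (id, sales_count) VALUES (p_id, p_count);
      RETURN;
    EXCEPTION WHEN unique_violation THEN
      -- Another transaction won the race; loop and UPDATE again.
    END;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Getting this pattern both correct under concurrency and fast is exactly the hard part that kept a native UPSERT out of 9.4.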
Interesting. I wasn’t aware this was considered for PostgreSQL 9.4. Can you cite a reference where this decision was made?
In pragmatic terms, how would one update/insert into a 100M-record table in chunks, loading the data from an external non-RDBMS system?
There is no prior knowledge of which records should be inserted and which updated.
I tried the following:
1. Create a staging table, truncate it, and load the data into it using INSERT statements.
2. Insert into the permanent table where not exists, and update the permanent table where exists, from the staging table. Result: deadlocks, as the process was trying to handle 1024-record chunks with a couple of parallel threads.
With such a huge merge operation, a good idea is to use a “loader table”, or staging table as you called it, i.e. a table in which you first load all those 100M records, e.g. via the PostgreSQL COPY command (http://www.postgresql.org/docs/current/static/sql-copy.html), and then transfer the records to the target table 1000 records at a time (using LIMIT), before committing. The rule here is:
If you’re running parallel transfer threads, you should partition your data to avoid contention as much as possible. There’s always an attribute by which you can partition the work across your threads.
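A chunked transfer along those lines could look roughly like this. This is a sketch, not a drop-in solution: the table names, columns, file path, and chunk size are all illustrative, and each chunk statement would be committed separately by the driving application:

```sql
-- 1. Bulk-load everything into the staging table first:
COPY staging_accounts FROM '/path/to/data.csv' WITH (FORMAT csv);

-- 2. Per committed chunk: pull up to 1000 rows out of the staging
--    table, update the target rows that already exist, and insert
--    the rest. Data-modifying CTEs keep this in one statement.
WITH chunk AS (
  DELETE FROM staging_accounts
  WHERE id IN (SELECT id FROM staging_accounts LIMIT 1000)
  RETURNING *
), updated AS (
  UPDATE accounts a
  SET name = c.name
  FROM chunk c
  WHERE a.id = c.id
  RETURNING a.id
)
INSERT INTO accounts (id, name)
SELECT id, name FROM chunk
WHERE id NOT IN (SELECT id FROM updated);
```

Deleting the processed rows out of the staging table doubles as the chunking cursor, so the loop simply repeats until the staging table is empty.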
Also, consider upgrading to PostgreSQL 9.5, which finally supports UPSERT.
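With 9.5's INSERT … ON CONFLICT, the per-row update-or-insert decision collapses into a single statement, which also removes the deadlock-prone two-step logic (table and column names here are illustrative):

```sql
-- PostgreSQL 9.5+ UPSERT: insert new rows, update on duplicate key.
-- EXCLUDED refers to the row that was proposed for insertion.
INSERT INTO accounts (id, name)
SELECT id, name FROM staging_accounts
ON CONFLICT (id)
DO UPDATE SET name = EXCLUDED.name;
```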
Hi Lukas,
is there a way to express the above SQL statement via jOOQ's DSL? For example, something like:
I’m asking because “with(…).as(Select)” only takes a Select object, and I can’t find a way to convert the UPDATE statement into a Select.
Regards
Not yet:
– https://github.com/jOOQ/jOOQ/issues/4474
– https://github.com/jOOQ/jOOQ/issues/5689
Thanks for the fast response.