Creating a Microsoft Excel Style Pivot Table With Grand Totals in SQL

This answer to a beautiful Stack Overflow question I’ve given recently needs further explanation in a blog post. When working with Microsoft Excel, we can create beautiful and also very insightful Pivot Tables with grand totals. What are they? This is best explained visually.

Assuming you have this normalised form for your raw data. As in the question, it’s an inventory table in a bike shop:

Now, in order to analyse our inventory, we’d love to pivot the above normalised representation to the following non-normalised representation, and we’d also like to display the grand totals to learn how many bikes of each type we have, and how many bikes of each colour, and how many bikes in total:

There are tons of great tutorials out there explaining how to do this with Microsoft Excel. What we care about is:

How to do this with SQL

We’re using two SQL features for this:

Let’s create some data first. I’m going to use SQL Server syntax for most of this blog post. At the end, there will be a full solution for SQL Server, Oracle, and PostgreSQL:

WITH Bikes AS (
  SELECT * FROM (
    VALUES ('Mountain Bikes', 'Black'), 
           ('Mountain Bikes', 'Black'),
           ('Mountain Bikes', 'Silver'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Black'),
           ('Road Bikes', 'Yellow'),
           ('Touring Bikes', 'Blue'),
           ('Touring Bikes', 'Blue'),
           ('Touring Bikes', 'Yellow')
  ) AS Bikes (Name, Colour)
)
SELECT * FROM Bikes

This simply produces an in-memory table representation of our original, normalised data set.

Now, the first step is to create the following totals and grand totals:

  • Total bikes per name and colour
  • (Grand) total bikes per name
  • (Grand) total bikes per colour
  • (Grand) total bikes

In this particular case, we can use CUBE(), which forms all the possible GROUPING SETS combinations:

SELECT Name, Colour, COUNT(*) AS Total
FROM Bikes
GROUP BY CUBE (Name, Colour)
ORDER BY Name, Colour

The result looks like this:

Name            Colour  Total
-----------------------------
NULL            NULL    11
NULL            Black   3
NULL            Blue    2
NULL            Red     3
NULL            Silver  1
NULL            Yellow  2
Mountain Bikes  NULL    3
Mountain Bikes  Black   2
Mountain Bikes  Silver  1
Road Bikes      NULL    5
Road Bikes      Black   1
Road Bikes      Red     3
Road Bikes      Yellow  1
Touring Bikes   NULL    3
Touring Bikes   Blue    2
Touring Bikes   Yellow  1

Excellent! All the (grand) totals are now in the result set. Notice that we could have manually written this using the following, much more tedious syntax:

SELECT Name, Colour, COUNT(*) AS Total
FROM Bikes
GROUP BY Name, Colour
UNION ALL
SELECT Name, NULL, COUNT(*) AS Total
FROM Bikes
GROUP BY Name
UNION ALL
SELECT NULL, Colour, COUNT(*) AS Total
FROM Bikes
GROUP BY Colour
UNION ALL
SELECT NULL, NULL, COUNT(*) AS Total
FROM Bikes
ORDER BY Name, Colour

So, CUBE() (and ROLLUP() and GROUPING SETS()) is just syntax sugar for the above more verbose UNION ALL representation, with the additional important difference that very likely you’re going to get a much more optimal execution plan using CUBE():

Than using manual UNION ALL:

The result would be similar in Oracle and other databases.

This isn’t surprising. We can aggregate all the grand totals in one go with CUBE() (in fact, the “grand grand total” is calculated separately in this case), whereas it’s hard for the optimiser to prove that the UNION ALL version is really the same thing and the individual subqueries can be factored out.

Before we move on, just a slight improvement, let’s rename the grand totals from NULL to Total and wrap the thing in a derived table T:

SELECT *
FROM (
  SELECT 
    COALESCE(Name, 'Total') AS Name, 
    COALESCE(Colour, 'Total') AS Colour, 
    COUNT(*) AS Count
  FROM Bikes
  GROUP BY CUBE (Name, Colour)
) AS t

Now, pivot this representation into a more readable one

The data still looks normalised with repeating names and colours in the result tables. Let’s pivot it using … wait for it … the PIVOT clause (available in Oracle and SQL Server).

The PIVOT clause is a bit funky. It can be appended to any table expression (including derived tables) to pivot it. It will apply an implicit GROUP BY operation and generate a set of aggregated SELECT columns. When we pivot our previous derived table T:

SELECT *
FROM t
PIVOT (
  SUM(Count) FOR Colour IN (
    Red, Blue, Black, Silver, Yellow, 
    Grey, Multi, Uncoloured, Total
  )
) AS p

Then we’re getting the following, nice-looking result:

Name            Red     Blue    Black   Silver  Yellow  Grey    Multi   Uncoloured  Total
-----------------------------------------------------------------------------------------
Mountain Bikes  NULL    NULL    2       1       NULL    NULL    NULL    NULL        3
Road Bikes      3       NULL    1       NULL    1       NULL    NULL    NULL        5
Touring Bikes   NULL    2       NULL    NULL    1       NULL    NULL    NULL        3
Total           3       2       3       1       2       NULL    NULL    NULL        11

That’s almost the desired result – all that’s missing is some null handling. How does it work? We have the following syntax:

[ table ] PIVOT (
  [ aggregate function(s) ] FOR [ column(s) ] IN ( [ values ] )
)

Where

  • The [ table ] is the table being pivoted
  • The [ column(s) ] are the columns from the [ table ] being grouped, as in any ordinary GROUP BY clause
  • The [ values ] are the values of the [ column(s) ], for which filtered aggregations are made
  • The [ aggregate function(s) ] are the aggregations that are made per [ column(s) ] (group) and per [ value ] (filter)

This syntax is Oracle and SQL Server specific. Oracle can do a bit more than SQL Server. If this syntax is not available in your database, you can write it out manually again (just like the above CUBE() to get this (select with your mouse to remove colouring):

SELECT
  Name,
  SUM(CASE WHEN Colour = 'Red'        THEN Count END) AS Red,
  SUM(CASE WHEN Colour = 'Blue'       THEN Count END) AS Blue,
  SUM(CASE WHEN Colour = 'Black'      THEN Count END) AS Black,
  SUM(CASE WHEN Colour = 'Silver'     THEN Count END) AS Silver,
  SUM(CASE WHEN Colour = 'Yellow'     THEN Count END) AS Yellow,
  SUM(CASE WHEN Colour = 'Grey'       THEN Count END) AS Grey,
  SUM(CASE WHEN Colour = 'Multi'      THEN Count END) AS Multi,
  SUM(CASE WHEN Colour = 'Uncoloured' THEN Count END) AS Uncoloured,
  SUM(CASE WHEN Colour = 'Total'      THEN Count END) AS Total
FROM t
GROUP BY Name

There should be no performance penalty in the manually written version (although, as always, do check). More details about this in a previous article.

Putting it all together

Here’s the complete query in SQL Server:

WITH Bikes(Name, Colour) AS (
  SELECT * FROM (
    VALUES ('Mountain Bikes', 'Black'), 
           ('Mountain Bikes', 'Black'),
           ('Mountain Bikes', 'Silver'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Black'),
           ('Road Bikes', 'Yellow'),
           ('Touring Bikes', 'Blue'),
           ('Touring Bikes', 'Blue'),
           ('Touring Bikes', 'Yellow')
  ) AS Bikes(Name, Colour)
)
SELECT
  Name, 
  COALESCE(Red, 0) AS Red, 
  COALESCE(Blue, 0) AS Blue, 
  COALESCE(Black, 0) AS Black, 
  COALESCE(Silver, 0) AS Silver, 
  COALESCE(Yellow, 0) AS Yellow, 
  COALESCE(Grey, 0) AS Grey, 
  COALESCE(Multi, 0) AS Multi, 
  COALESCE(Uncoloured, 0) AS Uncoloured, 
  Total
FROM (
  SELECT 
    COALESCE(Name, 'Total') AS Name, 
    COALESCE(Colour, 'Total') AS Colour, 
    COUNT(*) AS Count
  FROM Bikes
  GROUP BY CUBE (Name, Colour)
) AS t
PIVOT (
  SUM(Count) FOR Colour IN (
    Red, Blue, Black, Silver, Yellow, 
    Grey, Multi, Uncoloured, Total
  )
) AS p
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Or Oracle:

WITH Bikes(Name, Colour) AS (
  SELECT 'Mountain Bikes', 'Black'  FROM dual UNION ALL
  SELECT 'Mountain Bikes', 'Black'  FROM dual UNION ALL
  SELECT 'Mountain Bikes', 'Silver' FROM dual UNION ALL
  SELECT 'Road Bikes',     'Red'    FROM dual UNION ALL
  SELECT 'Road Bikes',     'Red'    FROM dual UNION ALL
  SELECT 'Road Bikes',     'Red'    FROM dual UNION ALL
  SELECT 'Road Bikes',     'Black'  FROM dual UNION ALL
  SELECT 'Road Bikes',     'Yellow' FROM dual UNION ALL
  SELECT 'Touring Bikes',  'Blue'   FROM dual UNION ALL
  SELECT 'Touring Bikes',  'Blue'   FROM dual UNION ALL
  SELECT 'Touring Bikes',  'Yellow' FROM dual
)
SELECT
  Name, 
  COALESCE(Red, 0) AS Red, 
  COALESCE(Blue, 0) AS Blue, 
  COALESCE(Black, 0) AS Black, 
  COALESCE(Silver, 0) AS Silver, 
  COALESCE(Yellow, 0) AS Yellow, 
  COALESCE(Grey, 0) AS Grey, 
  COALESCE(Multi, 0) AS Multi, 
  COALESCE(Uncoloured, 0) AS Uncoloured, 
  Total
FROM (
  SELECT 
    COALESCE(Name, 'Total') AS Name, 
    COALESCE(Colour, 'Total') AS Colour, 
    COUNT(*) AS Count
  FROM Bikes
  GROUP BY CUBE (Name, Colour)
) t
PIVOT (
  SUM(Count) FOR Colour IN (
    'Red' AS Red, 
    'Blue' AS Blue, 
    'Black' AS Black, 
    'Silver' AS Silver, 
    'Yellow' AS Yellow, 
    'Grey' AS Grey, 
    'Multi' AS Multi, 
    'Uncoloured' AS Uncoloured, 
    'Total' AS Total
  )
) p
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Or PostgreSQL:

WITH Bikes(Name, Colour) AS (
  SELECT * FROM (
    VALUES ('Mountain Bikes', 'Black'), 
           ('Mountain Bikes', 'Black'),
           ('Mountain Bikes', 'Silver'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Red'),
           ('Road Bikes', 'Black'),
           ('Road Bikes', 'Yellow'),
           ('Touring Bikes', 'Blue'),
           ('Touring Bikes', 'Blue'),
           ('Touring Bikes', 'Yellow')
  ) AS Bikes(Name, Colour)
)
SELECT
  Name,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Red'       ), 0) AS Red,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Blue'      ), 0) AS Blue,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Black'     ), 0) AS Black,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Silver'    ), 0) AS Silver,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Yellow'    ), 0) AS Yellow,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Grey'      ), 0) AS Grey,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Multi'     ), 0) AS Multi,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Uncoloured'), 0) AS Uncoloured,
  COALESCE(SUM(Count) FILTER (WHERE Colour = 'Total'     ), 0) AS Total
FROM (
  SELECT 
    COALESCE(Name, 'Total') AS Name, 
    COALESCE(Colour, 'Total') AS Colour, 
    COUNT(*) AS Count
  FROM Bikes
  GROUP BY CUBE (Name, Colour)
) AS t
GROUP BY Name
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Or MySQL (which doesn’t support CUBE, only ROLLUP, thus slightly tweaked PostgreSQL variant):

WITH Bikes(Name, Colour) AS (
  SELECT 'Mountain Bikes', 'Black'  UNION ALL
  SELECT 'Mountain Bikes', 'Black'  UNION ALL
  SELECT 'Mountain Bikes', 'Silver' UNION ALL
  SELECT 'Road Bikes',     'Red'    UNION ALL
  SELECT 'Road Bikes',     'Red'    UNION ALL
  SELECT 'Road Bikes',     'Red'    UNION ALL
  SELECT 'Road Bikes',     'Black'  UNION ALL
  SELECT 'Road Bikes',     'Yellow' UNION ALL
  SELECT 'Touring Bikes',  'Blue'   UNION ALL
  SELECT 'Touring Bikes',  'Blue'   UNION ALL
  SELECT 'Touring Bikes',  'Yellow'
)
SELECT
  Name,
  COALESCE(SUM(CASE WHEN Colour = 'Red'        THEN Count END), 0) AS Red,
  COALESCE(SUM(CASE WHEN Colour = 'Blue'       THEN Count END), 0) AS Blue,
  COALESCE(SUM(CASE WHEN Colour = 'Black'      THEN Count END), 0) AS Black,
  COALESCE(SUM(CASE WHEN Colour = 'Silver'     THEN Count END), 0) AS Silver,
  COALESCE(SUM(CASE WHEN Colour = 'Yellow'     THEN Count END), 0) AS Yellow,
  COALESCE(SUM(CASE WHEN Colour = 'Grey'       THEN Count END), 0) AS Grey,
  COALESCE(SUM(CASE WHEN Colour = 'Multi'      THEN Count END), 0) AS Multi,
  COALESCE(SUM(CASE WHEN Colour = 'Uncoloured' THEN Count END), 0) AS Uncoloured,
  COALESCE(SUM(CASE WHEN Name != 'Total' OR Colour != 'Total' THEN Count END), 0) AS Total
FROM (
  SELECT 
    COALESCE(Name, 'Total') AS Name, 
    COALESCE(Colour, 'Total') AS Colour, 
    COUNT(*) AS Count
  FROM Bikes
  GROUP BY Colour, Name WITH ROLLUP
) AS t
GROUP BY Name
ORDER BY CASE Name WHEN 'Total' THEN 1 ELSE 0 END, Name

Conclusion

Whenever working with data and SQL, try finding an elegant solution with SQL. There are many tools for a variety of data processing and data presentation use-cases. Here are some other cool, related reads from our blog:

The Top 10 Productivity Booster Techs for Programmers

This is the list we’ve all been waiting for. The top 10 productivity booster techs for programmers that – once you’ve started using them – you can never do without them any longer.

Here it is:

1. Git

logo@2x Before, there were various version control systems. Better ones, worse ones. But somehow they all felt wrong in one way or another.

Came along Git (and GitHub, EGit). Once you’re using this miraculous tool, it’s hard to imagine that you’ll ever meet a better VCS again.

You’ve never used Git? Get started with this guide.

2. Stack Overflow

stackoverflow

No kidding. Have you ever googled for anything tech-related back in 2005? Or altavista’d something back in 2000? Or went to FidoNet in search for answers in 1995? It was horrible. The top results always consisted in boring forum discussions with lots of un-experts and script kiddies claiming wrong things.

These forums still exist, but they don’t turn up on page 1 of Google search results.

Today, any time you search for something, you’ll have 2-3 hits per top 10 from Stack Overflow. And chances are, you’ll look no further because those answers are 80% wonderful! That’s partially because of Stack Overflow’s cunning reputation system, but also partially because of Stack Overflow’s even more cunning SEO rewarding system. (I already got 98 announcer, 19 booster, and 5 publicist badges. Yay).

While Stack Overflow allows its more active user to pursue their vanity (see above ;-) ), all the other users without any accounts will continue to flock in, finding perfect answers and clicking on very relevant ads.

Thumbs up for Stack Overflow and their awesome business model.

3. Office 365

excel We’re a small startup. Keeping costs low is of the essence. With Office 365, we only pay around $120 per user for a full-fledged Office 2013 suite, integrated with Microsoft Onedrive, Sharepoint, Exchange, Access, and much more.

In other words, we get enterprise-quality office software for the price of what students used to pay, before.

And do note, Office 2013 is better than any other Microsoft (or Libre) Office suite before. While not a 100% Programmer thing, it’s still an awesome tool chain for a very competitive price.

4. IntelliJ

intellij

While Eclipse is great (and free), IntelliJ IDEA, and also phpStorm for those unfortunate enough to write PHP are just subtly better in almost every aspect of an IDE. You can try their free community edition any time, but beware, you probably won’t switch back. And then you probably won’t be able to evade the Ultimate edition for long ;-)

5. PostgreSQL

pg PostgreSQL claims to be the world’s most advanced Open Source database, and we think it’s also one of the most elegant, easy, standards-compliant databases. It is really the one database that makes working with SQL fun.

We believe that within a couple of years, there’s a real chance of PostgreSQL not only beating commercial databases in terms of syntax but also in terms of performance.

Any time you need a data storage system with a slight preference for SQL-based ones, just make PostgreSQL your default choice. You won’t be missing any feature in that database.

Let’s hear it for PostgreSQL.

6. Java

duke Java is almost 20 years old, but it’s still the #1 or #2 language on the TIOBE index (sharing ranks with C), for very good reasons:

  • It’s robust
  • It’s mature
  • It works everywhere (almost, really too bad it has never succeeded in the browser)
  • It runs on the best platform ever, the JVM
  • It is Open Source
  • It has millions of tools, libraries, extensions, and applications

While some languages may seem a bit more modern or sexy or geeky, Java has and will always rule them all in terms of popularity. It is a first choice and with Java 8, things have improved even more.

7. jOOQ

jooq-logo-black-100x80 Now, learning this from the jOOQ blog is really unexpected and a shocker, but we think that jOOQ fits right into this programmer’s must-have top-10 tool chain. Most jOOQ users out there have never returned back to pre-jOOQ tools, as they’ve found writing SQL in Java as simple as never before.

Given that we’ve had Java and PostgreSQL before, there’s only this one missing piece gluing the two together in the most sophisticated way.

And besides, no one wants to hack around with the JDBC API, these days, do they?

8. Less CSS

less When you try Less CSS for the first time, you’ll think that

Why isn’t CSS itself like this!?

And you’re right. It feels just like CSS the way it should have always been. All the things that you have always hated about CSS (repetitiveness, verbosity, complexity) are gone. And if you’re using phpStorm or some other JetBrains product (see above), you don’t even have to worry about compiling it to CSS.

As an old HTML-table lover who doesn’t care too much about HTML5, layout, and all that, using Less CSS makes me wonder if I should finally dare creating more fancy websites!

Never again without Less CSS.

9. jQuery

jqueryWhat Less CSS is for CSS, jQuery is for JavaScript. Heck, so many junior developers on Stack Overflow don’t even realise that jQuery is just a JavaScript library. They think it is the language, because we’ve grown to use it all over the place.

Yes, sometimes, jQuery can be overkill as is indicated by this slightly cynical website: http://vanilla-js.com

joox-logo-blackBut it helps so much abstracting all the DOM manipulation in a very fluent way. If only all libraries were written this way.

Do note that we’ve also published a similar library for Java, in case you’re interested in jQuery-style DOM XML manipulation. Along with Java 8’s new lambda expressions, manipulating the DOM becomes a piece of cake.

10. C8H10N4O2

764px-Caffeine.svgC8H10N4O2 (more commonly known as Caffeine) is probably the number one productivity booster for programmers.

Some may claim that there’s such a thing like the Ballmer Peak. That might be true, but the Caffeine Peak has been proven times and again.

Have Dilbert’s view on the matter:

http://dilbert.com/strips/comic/2006-10-19/

More productivity boosters

We’re certainly not the only ones believing that there is such a thing as a programmer-productivity-booster. Enjoy this alternative list by Troy Topnik here for more insight:

http://www.activestate.com/blog/2010/03/top-ten-list-productivity-boosters-programmers

The Power of Spreadsheets in a Reactive, RESTful API

Being mostly a techie, I’ve recently and admittedly been deceived by my own Dilbertesque attitude when I stumbled upon this buzzword-filled TechCrunch article about Espresso Logic. Ever concerned about my social media reputation (e.g. reddit and hackernews karma), I thought it would be witty to put a link on those platforms titled:

Just found this article on TechCrunch. Reads like a markov-chain-generated series of buzzwords.

With such a catchy headline, the post quickly skyrocketed – and like many other redditors, my thoughts were with Geek and Poke:

But like a few other redditors, I couldn’t resist clicking through to the actual product that claims to implement “reactive programming” through a REST and JSON API. And I’m frankly impressed by the ideas behind this product. For once, the buzzwords are backed by software implementing them very nicely! Let’s first delve into…

Reactive Programming

Reactive programming is a term that has gained quite some traction recently around Typesafe, the company behind Akka. It has also gained additional traction since Erik Meijer (creator of LINQ) has left Microsoft to fully dedicate his time to his new company Applied Duality. With those brilliant minds sharply on the topic, we’ll certainly hear more about the Reactive Manifesto in the near future.

excelBut in fact, every manager knows the merits of “reactive programming” already as they’re working with the most reactive and probably the most awesome software on the planet: Microsoft Excel, a device whose mystery is only exceeded by its power. Think about how awesome Excel is. You have hundreds of rules, formulas, cell-interdependencies. And any time you change a value, the whole spreadsheet magically updates itself. That’s Reactive Programming.

The power of reactive programming lies in its expressiveness. With only very little expressive logic, you can express what otherwise needs dozens of lines of SQL, or hundreds of lines of Java.

Espresso Logic

With this in mind, I started to delve into Espresso Logic’s free trial. Note, that I’m the kind of impatient person who wants quick results without reading the docs. In case you work the other way round, there are some interesting resources to get you started:

Anyway, the demo ships with a pre-installed MySQL database containing what looks like a typical E-Commerce schema containing customer, employee, lineitem, product, purchaseorder, and purchaseorder_audit tables:

The schema browsing view in Espresso Logic

The schema browsing view in Espresso Logic

So I get schema navigation information (such as parent / child relationships) and an overview of rules. These rules look like triggers calculating sums or validating things. We’ll get to these rules later on.

Live API

So far, things are as expected. The UI is maybe a bit edgy, as the product only exists since late 2013. But what struck me as quite interesting is what Espresso Logic calls the Live API. With a couple of clicks, I can assemble a REST Resource tree structure from various types of resources, such as database tables. The Espresso Designer will then almost automatically join tables to produce trees like this one:

The Resource Tree view of Espresso Logic

The Resource Tree view of Espresso Logic

Notice how I can connect child entities to their parents quite easily. Now, this API is still a bit limited. For instance, I couldn’t figure out how to drag-and-drop a reporting relationship where I calculate the order amount per customer and product. However, I can switch the Resource Type from “Normal” to “SQL” to achieve just that with a plain old GROUP BY and aggregate function.

I started to grasp that I’m actually managing and developing a RESTful API based on the available database resources! A little further down the menu, I then found the “Quick Ref” item, which helped me understand how to call this API:

A quick API reference

A quick API reference

So, each of the previously defined resources is exposed through a URL as I’d expect from any RESTful API. What looks really nice is that I have built-in API versioning and an API key. Note, it is strongly discouraged from an OWASP point of view to pass API keys around in GET requests. This is just a use-case for a quick-start demo and for the odd developer test. Do not use this in production!

Anyway, I called the URL in my browser with the API key as parameter (going against my own rules):

https://eval.espressologic.com/rest/[my-user]/demo/v1/AllCustomers?auth=[my-key]:1

And I got a JSON document like this:

[
  {
    "@metadata": {
      "href": "https://eval.espressologic.com/rest/[my-user]/demo/v1/Customers/Alpha%20and%20Sons",
      "checksum": "A:cf1f4fb79e8e7142"
    },
    "Name": "Alpha and Sons",
    "Balance": 105,
    "CreditLimit": 900,
    "links": [
    ],
    "Orders": [
      {
        "@metadata": {
          "href": "https://eval.espressologic.com/rest/[my-user]/demo/v1/Customers.Orders/6",
          "checksum": "A:0bf14e2d58cc97b5"
        },
        "OrderNumber": 6,
        "TotalAmount": 70,
        "Paid": false,
        "Notes": "Pack with care - fragile merchandise",
        "links": [
        ], ...

Notice how each resource has a link and a checksum. The checksum is needed for optimistic locking, which is built-in, should you choose to concurrently update any of the above resources. Notice also, how the nested resource Orders is referenced as Customers.Orders. I can also access it directly by calling the above URL.

Live Logic / Reactive Programming

So far so good. Similar things have been implemented in a variety of software. For instance, Adobe Experience Manager / Apache Sling intuitively exposes the JCR repository through REST as well. But where the idea behind Espresso Logic really started fascinating me is when I clicked on “Live Logic”, and I was exposed to a preconfigured set of rules that are applied to the data:

The rules view

The rules view

I’ve quickly skimmed through the manual to see if I understood correctly. These rules actually resemble the kind of rules that I can enter in any spreadsheet software. For instance, it appears as though the customer.balance column is calculated as the sum of all purchaseorder.amount_total having a paid value of false, and so on.

So, if I continue through this rule-chain I’ll wind up with lineitem.product_price being the shared dependency of all other calculated values. When changing that value, a whole set of updates should run through my rule set to finally change the customer.balance:

changing lineitem.product_price
-> changes lineitem.amount
  -> changes purchaseorder.amount_total
    -> changes customer.balance

Depending how much of a console hacker you are, you might want to write your own PUT call using curl, or you can leverage the REST Lab from the Espresso Designer, which helps you get all the parameters right. So, assuming we want to change a line item from the previous call:

{
  "@metadata": {
    "href": "https://eval.espressologic.com/rest/[my_user]/demo/v1/Customers.Orders.LineItems/11",
    "checksum": "A:2e3d8cb0bff42763"
  },
  "lineitem_id": 11,
  "ProductNumber": 2,
  "OrderNumber": 6,
  "Quantity": 2,
  "Price": 25,
  ...

Let’s just try to update that to have a price of 30:

Using the REST lab to execute PUT requests

Using the REST lab to execute PUT requests

And you can see in the response, there is a transaction summary, which shows that the Customers.Orders.TotalAmount has changed from 50 to 60, the Customers.Balance has changed from 105 to 95, and an audit record has been written. The audit record itself is also defined by a rule like any other rule. But there’s also an ordinary log file that shows what really happened when I ran this PUT request:

The log view showing all the INSERTs and UPDATEs executed

The log view showing all the INSERTs and UPDATEs executed

Imagine having to put all those INSERT and UPDATE statements into a correct order yourself, and correctly manage caching, and transactions! Instead, all we have done is define some rules. For a complete overview of what rule types are available, consider this page of the Live Logic manual

Out of scope features for this post

… So far, we’ve had a look at the most obvious features of Espresso Logic. There are more, though. A couple of examples:

Server-side JavaScript

If rules cannot express it, JavaScript can. There are various points of the application where you can inject your JavaScript snippets, e.g. for validation, more complex rule expressions, request and response transformation, etc. Although we haven’t tried it, it reads like row-based triggers written in JavaScript.

Stored procedure support

The people behind Espresso Logic are “legacy-embracing” people, just like us at Data Geekery. Their target audience might already have thousands of complex stored procedures with lots of business logic in them. Those should not be rewritten in JavaScript. But just like tables, views, and REST resources, they are exposed through the REST API, taking GET parameters for IN parameters and returning JSON for OUT parameters and cursors.

From a jOOQ perspective, it’s pretty awesome to see that someone else is taking stored procedures as seriously as we do.

Row / column level security

There is a built-in user and role management module that allows you to provide centrally-managed, fine-grained access control to your data. Not many databases support row-level security like the Oracle database, for instance. So having this kind of feature in your platform really adds value to many RDBMS integrations. Some further resources on that topic:

Conclusion: Querying vs. updating vs. rule-based persistence

On our jOOQ blog and our marketing websites (e.g. hibernate-alternative.com), we always advocate two main use-cases when operating on databases:

  • Querying: You have very complex queries to calculate things like reports. For this, SQL (e.g. through jOOQ) is perfect
  • Updating: You have a very complex domain model with lots of items and deltas that you want to persist in one go. For this, Hibernate / ORMs are perfect

But today, Espresso Logic has shown to us that there is yet another use-case. One that is covered by reactive programming (or “spreadsheet-programming“) techniques. And that’s:

  • Rule-based persistence: You have a very complex domain model with lots of items and lots of rules which you want to validate, calculate, and keep in sync all the time. For this, both SQL and ORMs are solutions at the wrong level of abstraction.

This “new” use-case is actually quite common in a lot of enterprise applications where complex business rules are currently spelled out in millions of lines of imperative code that is very hard to decipher and even harder to validate / modify. How can you reverse-engineer your business rules from millions of lines of legacy code, written in COBOL?

At Data Geekery, we’re always looking out for brand new tech. Espresso Logic is a young startup with a new product. Yet, as originally mentioned, they’re a startup with seed funding, a very compelling and innovative idea, and a huge market of legacy COBOL applications that wants to start delving into “sexy” new technologies, such as RESTful APIs, JSON, reactive programming. It might just work! If you haven’t seen enough, go work through this tutorial, which covers advanced examples such as a “bill of materials price rollup”, “bill of materials kit explosion”, “budget rollup”, “audit salary chagnes” and more.

We’ll certainly keep an eye out for future improvements to the Espresso Logic platform!