The Power of Spreadsheets in a Reactive, RESTful API

Being mostly a techie, I’ve recently and admittedly been deceived by my own Dilbertesque attitude when I stumbled upon this buzzword-filled TechCrunch article about Espresso Logic. Ever concerned about my social media reputation (e.g. reddit and hackernews karma), I thought it would be witty to put a link on those platforms titled:

Just found this article on TechCrunch. Reads like a markov-chain-generated series of buzzwords.

With such a catchy headline, the post quickly skyrocketed – and like many other redditors, my thoughts were with Geek and Poke:

Geek and Poke on Big Data
Geek and Poke on Big Data. Image licensed CC-BY 3.0

But like a few other redditors, I couldn’t resist clicking through to the actual product that claims to implement “reactive programming” through a REST and JSON API. And I’m frankly impressed by the ideas behind this product. For once, the buzzwords are backed by software implementing them very nicely! Let’s first delve into…

Reactive Programming

Reactive programming is a term that has gained quite some traction recently around Typesafe, the company behind Akka. It has also gained additional traction since Erik Meijer (creator of LINQ) has left Microsoft to fully dedicate his time to his new company Applied Duality. With those brilliant minds sharply on the topic, we’ll certainly hear more about the Reactive Manifesto in the near future.

excelBut in fact, every manager knows the merits of “reactive programming” already as they’re working with the most reactive and probably the most awesome software on the planet: Microsoft Excel, a device whose mystery is only exceeded by its power. Think about how awesome Excel is. You have hundreds of rules, formulas, cell-interdependencies. And any time you change a value, the whole spreadsheet magically updates itself. That’s Reactive Programming.

The power of reactive programming lies in its expressiveness. With only very little expressive logic, you can express what otherwise needs dozens of lines of SQL, or hundreds of lines of Java.

Espresso Logic

With this in mind, I started to delve into Espresso Logic’s free trial. Note, that I’m the kind of impatient person who wants quick results without reading the docs. In case you work the other way round, there are some interesting resources to get you started:

Anyway, the demo ships with a pre-installed MySQL database containing what looks like a typical E-Commerce schema containing customer, employee, lineitem, product, purchaseorder, and purchaseorder_audit tables:

The schema browsing view in Espresso Logic
The schema browsing view in Espresso Logic

So I get schema navigation information (such as parent / child relationships) and an overview of rules. These rules look like triggers calculating sums or validating things. We’ll get to these rules later on.

Live API

So far, things are as expected. The UI is maybe a bit edgy, as the product only exists since late 2013. But what struck me as quite interesting is what Espresso Logic calls the Live API. With a couple of clicks, I can assemble a REST Resource tree structure from various types of resources, such as database tables. The Espresso Designer will then almost automatically join tables to produce trees like this one:

The Resource Tree view of Espresso Logic
The Resource Tree view of Espresso Logic

Notice how I can connect child entities to their parents quite easily. Now, this API is still a bit limited. For instance, I couldn’t figure out how to drag-and-drop a reporting relationship where I calculate the order amount per customer and product. However, I can switch the Resource Type from “Normal” to “SQL” to achieve just that with a plain old GROUP BY and aggregate function.

I started to grasp that I’m actually managing and developing a RESTful API based on the available database resources! A little further down the menu, I then found the “Quick Ref” item, which helped me understand how to call this API:

A quick API reference
A quick API reference

So, each of the previously defined resources is exposed through a URL as I’d expect from any RESTful API. What looks really nice is that I have built-in API versioning and an API key. Note, it is strongly discouraged from an OWASP point of view to pass API keys around in GET requests. This is just a use-case for a quick-start demo and for the odd developer test. Do not use this in production!

Anyway, I called the URL in my browser with the API key as parameter (going against my own rules):

https://eval.espressologic.com/rest/[my-user]/demo/v1/AllCustomers?auth=[my-key]:1

And I got a JSON document like this:

[
  {
    "@metadata": {
      "href": "https://eval.espressologic.com/rest/[my-user]/demo/v1/Customers/Alpha%20and%20Sons",
      "checksum": "A:cf1f4fb79e8e7142"
    },
    "Name": "Alpha and Sons",
    "Balance": 105,
    "CreditLimit": 900,
    "links": [
    ],
    "Orders": [
      {
        "@metadata": {
          "href": "https://eval.espressologic.com/rest/[my-user]/demo/v1/Customers.Orders/6",
          "checksum": "A:0bf14e2d58cc97b5"
        },
        "OrderNumber": 6,
        "TotalAmount": 70,
        "Paid": false,
        "Notes": "Pack with care - fragile merchandise",
        "links": [
        ], ...

Notice how each resource has a link and a checksum. The checksum is needed for optimistic locking, which is built-in, should you choose to concurrently update any of the above resources. Notice also, how the nested resource Orders is referenced as Customers.Orders. I can also access it directly by calling the above URL.

Live Logic / Reactive Programming

So far so good. Similar things have been implemented in a variety of software. For instance, Adobe Experience Manager / Apache Sling intuitively exposes the JCR repository through REST as well. But where the idea behind Espresso Logic really started fascinating me is when I clicked on “Live Logic”, and I was exposed to a preconfigured set of rules that are applied to the data:

The rules view
The rules view

I’ve quickly skimmed through the manual to see if I understood correctly. These rules actually resemble the kind of rules that I can enter in any spreadsheet software. For instance, it appears as though the customer.balance column is calculated as the sum of all purchaseorder.amount_total having a paid value of false, and so on.

So, if I continue through this rule-chain I’ll wind up with lineitem.product_price being the shared dependency of all other calculated values. When changing that value, a whole set of updates should run through my rule set to finally change the customer.balance:

changing lineitem.product_price
-> changes lineitem.amount
  -> changes purchaseorder.amount_total
    -> changes customer.balance

Depending how much of a console hacker you are, you might want to write your own PUT call using curl, or you can leverage the REST Lab from the Espresso Designer, which helps you get all the parameters right. So, assuming we want to change a line item from the previous call:

{
  "@metadata": {
    "href": "https://eval.espressologic.com/rest/[my_user]/demo/v1/Customers.Orders.LineItems/11",
    "checksum": "A:2e3d8cb0bff42763"
  },
  "lineitem_id": 11,
  "ProductNumber": 2,
  "OrderNumber": 6,
  "Quantity": 2,
  "Price": 25,
  ...

Let’s just try to update that to have a price of 30:

Using the REST lab to execute PUT requests
Using the REST lab to execute PUT requests

And you can see in the response, there is a transaction summary, which shows that the Customers.Orders.TotalAmount has changed from 50 to 60, the Customers.Balance has changed from 105 to 95, and an audit record has been written. The audit record itself is also defined by a rule like any other rule. But there’s also an ordinary log file that shows what really happened when I ran this PUT request:

The log view showing all the INSERTs and UPDATEs executed
The log view showing all the INSERTs and UPDATEs executed

Imagine having to put all those INSERT and UPDATE statements into a correct order yourself, and correctly manage caching, and transactions! Instead, all we have done is define some rules. For a complete overview of what rule types are available, consider this page of the Live Logic manual

Out of scope features for this post

… So far, we’ve had a look at the most obvious features of Espresso Logic. There are more, though. A couple of examples:

Server-side JavaScript

If rules cannot express it, JavaScript can. There are various points of the application where you can inject your JavaScript snippets, e.g. for validation, more complex rule expressions, request and response transformation, etc. Although we haven’t tried it, it reads like row-based triggers written in JavaScript.

Stored procedure support

The people behind Espresso Logic are “legacy-embracing” people, just like us at Data Geekery. Their target audience might already have thousands of complex stored procedures with lots of business logic in them. Those should not be rewritten in JavaScript. But just like tables, views, and REST resources, they are exposed through the REST API, taking GET parameters for IN parameters and returning JSON for OUT parameters and cursors.

From a jOOQ perspective, it’s pretty awesome to see that someone else is taking stored procedures as seriously as we do.

Row / column level security

There is a built-in user and role management module that allows you to provide centrally-managed, fine-grained access control to your data. Not many databases support row-level security like the Oracle database, for instance. So having this kind of feature in your platform really adds value to many RDBMS integrations. Some further resources on that topic:

Conclusion: Querying vs. updating vs. rule-based persistence

On our jOOQ blog and our marketing websites (e.g. hibernate-alternative.com), we always advocate two main use-cases when operating on databases:

  • Querying: You have very complex queries to calculate things like reports. For this, SQL (e.g. through jOOQ) is perfect
  • Updating: You have a very complex domain model with lots of items and deltas that you want to persist in one go. For this, Hibernate / ORMs are perfect

But today, Espresso Logic has shown to us that there is yet another use-case. One that is covered by reactive programming (or “spreadsheet-programming“) techniques. And that’s:

  • Rule-based persistence: You have a very complex domain model with lots of items and lots of rules which you want to validate, calculate, and keep in sync all the time. For this, both SQL and ORMs are solutions at the wrong level of abstraction.

This “new” use-case is actually quite common in a lot of enterprise applications where complex business rules are currently spelled out in millions of lines of imperative code that is very hard to decipher and even harder to validate / modify. How can you reverse-engineer your business rules from millions of lines of legacy code, written in COBOL?

At Data Geekery, we’re always looking out for brand new tech. Espresso Logic is a young startup with a new product. Yet, as originally mentioned, they’re a startup with seed funding, a very compelling and innovative idea, and a huge market of legacy COBOL applications that wants to start delving into “sexy” new technologies, such as RESTful APIs, JSON, reactive programming. It might just work! If you haven’t seen enough, go work through this tutorial, which covers advanced examples such as a “bill of materials price rollup”, “bill of materials kit explosion”, “budget rollup”, “audit salary chagnes” and more.

We’ll certainly keep an eye out for future improvements to the Espresso Logic platform!

SQL and NoSQL are Really Just Two Sides of the Same Coin

In a recent debate about NoSQL vs. SQL on Hackernews, I was made aware of a quite amusing paper by Erik Meijer and Gavin Bierman. Remember, Erik Meijer has brought LINQ to the .NET universe, a formidable unified query DSL whose main purpose was to unify typesafe querying against XML, SQL, and object-oriented data structures.

It is important to note that LINQ surfaced shortly before the NoSQL hype really caught on, so NoSQL data structures (e.g. key / value stores, document stores, graph databases) were not yet in scope for LINQ, and LINQ providers might have had to tweak the odd implementation detail to fully match the LINQ API.

What Erik Meijer and Gavin Bierman are claiming in their article (which was also discussed intensively on Hackernews) is the fact that SQL and NoSQL are duals of each other, i.e. two sides of the same coin. In their quest to unify all query languages, the LINQ people would obviously love to simplify things to such a level. To us, who are focusing on SQL only via jOOQ, this seems more like a plea for LINQ than anything else. It would be all too nice if things were as easy as a simple duality, specifically given the fact that Erik Meijer has now also created a company called Applied Duality Inc… What we found very interesting, though, is Erik’s and Gavin’s hint about the relational model and other models inversing arrows between relationships. When child records point to parent records in the relational model, parent objects point to child objects in object-oriented design, or in XML.

But history will teach us where these things go. I currently don’t see a second E.F. Codd to solve the complexity introduced with the new abundance of NoSQL data stores – yet. But maybe, we’ll eventually remember Erik Meijer as the new E.F. Codd, bringing NoSQL to full circle.

Where’s the Self-Confidence when Advertising Java 8, Oracle?

I have often wondered, why the team around Brian Goetz has been heading towards a “decent compromise” so strongly from the beginning, both from a marketing AND technical point of view, instead of adding more boldness to how Java 8 is advertised. At Devoxx Belgium 2013, Brian Goetz seems to have really sold his accomplishments completely under value, according to this interesting article. Having extensively followed the lambda-dev mailing list, I can only stress how little the creators of Java 8 loved their new defender methods feature, for instance.

Java 8 is what we have all been waiting for, for so long! After all, the introduction of lambda expressions and defender methods (equally impactful, if not as often advertised!) is one of the most significant improvements to the Java language since the very beginnings.

Given the tremendous success of LINQ in .NET, I have recently contemplated whether Java 8, lambda expressions and the Streams API might actually be an equally interesting approach to adding features to an ecosystem, compared to the “scariness” of comprehensions and monads as understood by LINQ: https://blog.jooq.org/2013/11/02/does-java-8-still-need-linq-or-is-it-better-than-linq/

While my article above certainly wasn’t well received with the .NET community (and even Erik Meijer himself smirked at it), it did get quite a bit of love from the Java community. In other words, the Java community is ready for Java 8 goodness. Let’s hope Oracle will start advertising it as the cool new thing that it is.

Does Java 8 Still Need LINQ? Or is it Better than LINQ?

Photo by Ade Oshineye
Erik Meijer, Tye Dye Expert. Photo by Ade Oshineye. Licensed under CC-BY-SA

LINQ was one of the best things that happened to the .NET software engineering ecosystem in a long time. With its introduction of lambda expressions and monads in Visual Studio 2008, it had catapulted the C# language way ahead of Java, which was at version 6 at the time, still discussing the pros and cons of generic type erasure. This achievement was mostly due to and accredited to Erik Meijer, a Dutch computer scientist and tye-dye expert who is now off to entirely other projects.

Where is Java now?

With the imminent release of Java 8 and JSR-355, do we still need LINQ? Many attempts of bringing LINQ goodness to Java have been made since the middle of the last decade. At the time, Quaere and Lambdaj seemed to be a promising implementation on the library level (not the language level). In fact, a huge amount of popular Stack Overflow questions hints at how many Java folks were (and still are!) actually looking for something equivalent:

Interestingly, “LINQ” has even made it into EL 3.0!

But do we really need LINQ?

LINQ has one major flaw, which is advertised as a feature, but in our opinion, will inevitably lead to the “next big impedance mismatch”. LINQ is inspired by SQL and this is not at all a good thing. LINQ is most popular for LINQ-to-Objects, which is a fine way of querying collections in .NET. The success of Haskell or Scala, however, has shown that the true functional nature of “collection querying” tends to employ entirely other terms than SELECT, WHERE, GROUP BY, or HAVING.  They use terms like “fold”, “map”, “flatMap”, “reduce”, and many many more. LINQ, on the other hand, employs a mixture of GROUP BY and terms like “skip”, “take” (instead of OFFSET and FETCH).

In fact, nothing could be further from the functional truth than a good old SQL partitioned outer join, grouping set, or framed window function. These constructs are mere declarations of what a SQL developer would like to see as a result. They’re not self-contained functions, which actually contain the logic to be executed in any given context. Moreover, window functions can be used only in SELECT and ORDER BY clauses, which is obvious when thinking in a declarative way, but which is also very weird if you don’t have the SQL context. Specifically, a window function in a SELECT clause influences the whole execution plan and the way indexes are employed to pre-fetch the right data.

Conversely, true functional programming can do so much more to in-memory collections than SQL ever can. Using a SQLesque API for collection querying was a cunning decision to trick “traditional” folks into adopting functional programming. But the hopes that collection and SQL table querying could be interfused were disappointed, as such constructs will not produce the wanted SQL execution plans.

But what if I am doing SQL?

It’s simple. When you do SQL, you have two essential choices.

  • Do it “top-down”, putting most focus on your Java domain model. In that case, use Hibernate / JPA for querying and transform Hibernate results using the Java 8 Streams API.
  • Do it “bottom-up”, putting most focus on your SQL / relational domain model. In that case, use JDBC or jOOQ and again, transform your results using the Java 8 Streams API.

This is illustrated more in detail here: http://www.hibernate-alternative.com

Don’t look back. Embrace the future!

While .NET was “ahead” of Java for a while, this was not due to LINQ itself. This was mainly due to the introduction of lambda expressions and the impact lambdas had on *ALL* APIs. LINQ is just one example of how such APIs could be constructed, although LINQ got most of the credit.

But I’m much more excited about Java 8’s new Streams API, and how it will embrace some functional programming in the Java ecosystem. A very good blog post by Informatech illustrates, how common LINQ expressions translate to Java 8 Streams API expressions.

So, don’t look back. Stop envying .NET developers. With Java 8, we will not need LINQ or any API that tries to imitate LINQ on the grounds of “unified querying”, which is a better sounding name for what is really the “query target impedance mismatch”. We need true SQL for relational database querying, and we need the Java 8 Streams API for functional transformations of in-memory collections. That’s it. Go Java 8!

Heavyweights Martin Odersky, Erik Meijer and Roland Kuhn Team up for a Coursera Course

Erik Meijer (famously known for LINQ, lots of other .NET goodies, and tie-dye shirts of timeless beauty) teams up with Typesafe‘s Martin Odersky (Scala Language) and Roland Kuhn (Akka) to bring you a 7-week-course on the Principles of Reactive Programming, starting on November 4, 2013.

This cooperation of sharp minds can only mean good things, far beyond the scope of this concrete course. Sign up for the Coursera course here:
https://www.coursera.org/course/reactive

Curious what Erik Meijer is up to these days? Check out his new company Applied Duality, Inc’s website:
http://www.applied-duality.com

The slogan is quite cheeky:

After being stuck for > 40 years in a first-order flat model of rigid tabular data locked up in a closed world, it is time for a developer revolution. It is time to embrace monads and duality to lift the database industry to the next level.

It seems like every great scientist and every great salesman is trying to lift the database industry to “the next level”, these days. Exciting times for database professionals!