Archive by Author | lukaseder

Java 8 Friday: Optional Will Remain an Option in Java


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Optional: A new Option in Java

So far, we’ve been pretty thrilled with all the additions to Java 8. All in all, this is a revolution more than anything before. But there are also one or two sore spots. One of them is how Java will never really get rid of

Null: The billion dollar mistake tweet this

In a previous blog post, we have explained the merits of NULL handling in the Ceylon language, which has found one of the best solutions to tackle this issue – at least on the JVM which is doomed to support the null pointer forever. In Ceylon, nullability is a flag that can be added to every type by appending a question mark to the type name. An example:

void hello() {
    String? name = process.arguments.first;
    String greeting;
    if (exists name) {
        greeting = "Hello, ``name``!";
    }
    else {
        greeting = "Hello, World!";
    }
    print(greeting);
}

That’s pretty slick. Combined with flow-sensitive typing, you will never run into the dreaded NullPointerException again:

Recently in the Operating Room. By Geek and Poke

Recently in the Operating Room. By Geek and Poke

Other languages have introduced the Option type. Most prominently: Scala. Java 8 now also introduced the Optional type (as well as the OptionalInt, OptionalLong, OptionalDouble types – more about those later on)

How does Optional work?

The main point behind Optional is to wrap an Object and to provide convenience API to handle nullability in a fluent manner. This goes well with Java 8 lambda expressions, which allow for lazy execution of operations. An example:

Optional<String> stringOrNot = Optional.of("123");

// This String reference will never be null
String alwaysAString =
    stringOrNot.orElse("");

// This Integer reference will be wrapped again
Optional<Integer> integerOrNot = 
    stringOrNot.map(Integer::parseInt);

// This int reference will never be null
int alwaysAnInt = stringOrNot
        .map(s -> Integer.parseInt(s))
        .orElse(0);

There are certain merits to the above in fluent APIs, specifically in the new Java 8 Streams API, which makes extensive use of Optional. For example:

Arrays.asList(1, 2, 3)
      .stream()
      .findAny()
      .ifPresent(System.out::println);

The above piece of code will print any number from the Stream onto the console, but only if such a number exists.

Old API is not retrofitted

For obvious backwards-compatibility reasons, the “old API” is not retrofitted. In other words, unlike Scala, Java 8 doesn’t use Optional all over the JDK. In fact, the only place where Optional is used is in the Streams API. As you can see in the Javadoc, usage is very scarce:

http://docs.oracle.com/javase/8/docs/api/java/util/class-use/Optional.html

This makes Optional a bit difficult to use. We’ve already blogged about this topic before. Concretely, the absence of an Optional type in the API is no guarantee of non-nullability. This is particularly nasty if you convert Streams into collections and collections into streams.

The Java 8 Optional type is treacherous tweet this

Parametric polymorphism

The worst implication of Optional on its “infected” API is parametric polymorphism, or simply: generics. When you reason about types, you will quickly understand that:

// This is a reference to a simple type:
Number s;

// This is a reference to a collection of
// the above simple type:
Collection<Number> c;

Generics are often used for what is generally accepted as composition. We have a Collection of String. With Optional, this compositional semantics is slightly abused (both in Scala and Java) to “wrap” a potentially nullable value. We now have:

// This is a reference to a nullable simple type:
Optional<Number> s;

// This is a reference to a collection of 
// possibly nullable simple types
Collection<Optional<Number>> c;

So far so good. We can substitute types to get the following:

// This is a reference to a simple type:
T s;

// This is a reference to a collection of
// the above simple type:
Collection<T> c;

But now enter wildcards and use-site variance. We can write

// No variance can be applied to simple types:
T s;

// Variance can be applied to collections of
// simple types:
Collection<? extends T> source;
Collection<? super T> target;

What do the above types mean in the context of Optional? Intuitively, we would like this to be about things like Optional<? extends Number> or Optional<? super Number>. In the above example we can write:

// Read a T-value from the source
T s = source.iterator().next();

// ... and put it into the target
target.add(s);

But this doesn’t work any longer with Optional

Collection<Optional<? extends T>> source;
Collection<Optional<? super T>> target;

// Read a value from the source
Optional<? extends T> s = source.iterator().next();

// ... cannot put it into the target
target.add(s); // Nope

… and there is no other way to reason about use-site variance when we have Optional and subtly more complex API.

If you add generic type erasure to the discussion, things get even worse. We no longer erase the component type of the above Collection, we also erase the type of virtually any reference. From a runtime / reflection perspective, this is almost like using Object all over the place!

Generic type systems are incredibly complex even for simple use-cases. Optional makes things only worse. It is quite hard to blend Optional with traditional collections API or other APIs. Compared to the ease of use of Ceylon’s flow-sensitive typing, or even Groovy’s elvis operator, Optional is like a sledge-hammer in your face.

Be careful when you apply it to your API!

Primitive types

One of the main reasons why Optional is still a very useful addition is the fact that the “object-stream” and the “primitive streams” have a “unified API” by the fact that we also have OptionalInt, OptionalLong, OptionalDouble types.

In other words, if you’re operating on primitive types, you can just switch the stream construction and reuse the rest of your stream API usage source code, in almost the same way. Compare these two chains:

// Stream and Optional
Optional<Integer> anyInteger = 
Arrays.asList(1, 2, 3)
      .stream()
      .filter(i -> i % 2 == 0)
      .findAny();
anyInteger.ifPresent(System.out::println);

// IntStream and OptionalInt
OptionalInt anyInt =
Arrays.stream(new int[] {1, 2, 3})
      .filter(i -> i % 2 == 0)
      .findAny();
anyInt.ifPresent(System.out::println);

In other words, given the scarce usage of these new types in JDK API, the dubious usefulness of such a type in general (if retrofitted into a very backwards-compatible environment) and the implications generics erasure have on Optional we dare say that

The only reason why this type was really added is to provide a more unified Streams API for both reference and primitive types tweet this

That’s tough. And makes us wonder, if we should finally get rid of primitive types altogether.

Oh, and…

Optional isn’t Serializable.

Nope. Not Serializable. Unlike ArrayList, for instance. For the usual reason:

Making something in the JDK serializable makes a dramatic increase in our maintenance costs, because it means that the representation is frozen for all time. This constrains our ability to evolve implementations in the future, and the number of cases where we are unable to easily fix a bug or provide an enhancement, which would otherwise be simple, is enormous. So, while it may look like a simple matter of “implements Serializable” to you, it is more than that. The amount of effort consumed by working around an earlier choice to make something serializable is staggering.

Citing Brian Goetz, from:

http://mail.openjdk.java.net/pipermail/jdk8-dev/2013-September/003276.html

Want to discuss Optional? Read these threads on reddit:

Stay tuned for more exciting Java 8 stuff published in this blog series.

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

CUME_DIST(), a Lesser-Known SQL Gem


When doing reporting or statistics with SQL, you better know your window functions. There are many of them, and few SQL developers know about them.

CUME_DIST() is one such function. We’ve recently re-discovered it on Stack Overflow. The following query yields two times the same result for fraction1 and fraction2:

SELECT 
  ename, 
  CUME_DIST() OVER (ORDER BY ename) fraction1,
  ROWNUM / (MAX(ROWNUM) OVER()) fraction2
FROM emp
ORDER BY ename

The above query then yields:

|  ENAME | FRACTION1 | FRACTION2 |
|--------|-----------|-----------|
|  ALLEN |      0.08 |      0.08 |
|  BLAKE |      0.17 |      0.17 |
|  CLARK |      0.25 |      0.25 |
|   FORD |      0.33 |      0.33 |
|  JAMES |      0.42 |      0.42 |
|  JONES |       0.5 |       0.5 |
|   KING |      0.58 |      0.58 |
| MARTIN |      0.67 |      0.67 |
| MILLER |      0.75 |      0.75 |
|  SMITH |      0.83 |      0.83 |
| TURNER |      0.92 |      0.92 |
|   WARD |         1 |         1 |

… as can be seen in this SQLFiddle. In plain English, the CUME_DIST() (or cumulative distribution) of a value within a group of values helps you see how far “advanced” a value is in the ordering of the whole result set – or of a partition thereof.

The second expression using ROWNUM informally explains this with an equivalent expression. The value is always strictly greater than zero and smaller or equal to 1:

0 < CUME_DIST() OVER(ORDER BY ename) <= 1

Note that Oracle (and the SQL standard) also support CUME_DIST() as an “ordered aggregate function”, “ordered set function” or “hypothetical set function”:

SELECT 
  ename, 
  CUME_DIST(ename)
    WITHIN GROUP (ORDER BY ename) fraction
FROM emp
GROUP BY ename
ORDER BY ename

The standard specifies the above as:

<hypothetical set function> ::=
    <rank function type> <left paren>
        <hypothetical set function value expression list> 
            <right paren>
        <within group specification>

<within group specification> ::=
    WITHIN GROUP <left paren> 
        ORDER BY <sort specification list> 
            <right paren>

jOOQ also supports the cumeDist() window function, and the upcoming jOOQ 3.4 will also support the ordered aggregate function.

… and you, you should definitely make this nice function a part of your SQL vocabulary.

Java 8 Friday: Java 8 Will Revolutionize Database Access


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. For our Java 8 series, we’re honoured to host a very relevant guest post by Dr. Ming-Yee Iu.

Dr. Ming-Yee Iu completed a PhD on Database Queries in Java at EPFL. He has created the open source project Jinq to demonstrate some new techniques for supporting database queries in Java.

Our editorial note:

Ever since Erik Meijer has introduced LINQ to the .NET ecosystem, us Java folks have been wondering whether we could have the same. We’ve blogged about this topic before, a couple of times:

While most LINQesque APIs in the Java ecosystem operate as internal domain-specific languages like jOOQ, some try to tackle the integration on a bytecode level, like JaQu.

JINQ formalises runtime bytecode transformations through what Dr. Ming-Yee Iu calls symbolic execution. We find this very interesting to a point that we wonder if we should start building a JINQ-to-jOOQ JINQ provider, where the expressive power of the Java 8 Streams API could be combined with our great SQL standardisation and transformation features…?

Convince yourselves:

Java 8 Goodie: Java 8 Will Revolutionize Database Access

Java 8 is finally here! After years of waiting, Java programmers will finally get support for functional programming in Java. Functional programming support helps streamline existing code while providing powerful new capabilities to the Java language. One area that will be disrupted by these new features is how programmers work with databases in Java. Functional programming support opens up exciting new possibilities for simpler yet more powerful database APIs. Java 8 will enable new ways to access databases that are competitive with those of other programming languages such as C#’s LINQ.

The Functional Way of Working With Data

Java 8 not only adds functional-support to the Java language, but it extends the Java collection classes with new functional ways of working with data. Traditionally, working with large amounts of data in Java requires a lot of loops and iterators.

For example, suppose you have a collection of Customer objects:

Collection<Customer> customers;

If you were only interested in the customers from Belgium, you would have to iterate over all the customers and save the ones you wanted.

Collection<Customer> belgians = new ArrayList<>();
for (Customer c : customers) {
    if (c.getCountry().equals("Belgium"))
        belgians.add(c);
}

This takes five lines of code. It is also poorly abstracted. What happens if you have 10 million customers, and you want to speed up the code by filtering it in parallel using two threads? You would have to rewrite everything to use futures and a lot of hairy multi-threaded code.

With Java 8, you can write the same code in one line. With its support for functional programming, Java 8 lets you write a function saying which customers you are interested in (those from Belgium) and then to filter collections using that function. Java 8 has a new Streams API that lets you do this.

customers.stream().filter(
    c -> c.getCountry().equals("Belgium")
);

Not only is the Java 8 version of the code shorter, but the code is easier to understand as well. There is almost no boilerplate. The code calls the method filter(), so it’s clear that this code is used for filtering customers. You don’t have to spend your time trying to decipher the code in a loop to understand what it is doing with its data.

And what happens if you want to run the code in parallel? You just have to use a different type of stream.

customers.parallelStream().filter(
    c -> c.getCountry().equals("Belgium")
);

What’s even more exciting is that this functional-style of code works with databases as well!

The Functional Way of Working with Databases

Traditionally, programmers have needed to use special database query languages to access the data in databases. For example, below is some JDBC code for finding all the customers from Belgium:

PreparedStatement s = con.prepareStatement(
      "SELECT * "
    + "FROM Customer C "
    + "WHERE C.Country = ? ");
s.setString(1, "Belgium");
ResultSet rs = s.executeQuery();

Much of the code is in the form a string, which the compiler can’t check for errors and which can lead to security problems due to sloppy coding. There is also a lot of boilerplate code that makes writing database access code quite tedious. Tools such as jOOQ solve the problem of error-checking and security by providing a database query language that can be written using special Java libraries. Or you can use tools such as object-relational mappers to hide a lot of boring database code for common access patterns, but if you need to write non-trivial database queries, you will still need to use a special database query language again.

With Java 8, it’s possible to write database queries using the same functional-style used when working with the Streams API. For example, Jinq is an open source project that explores how future database APIs can make use of functional programming. Here is a database query written using Jinq:

customers.where(
    c -> c.getCountry().equals("Belgium")
);

This code is almost identical to the code using the Streams API. In fact, future versions of Jinq will let you write queries directly using the Streams API. When the code is run, Jinq will automatically translate the code into a database query like the JDBC query shown before.

So without having to learn a new database query language, you can write efficient database queries. You can use the same style of code you would use for Java collections. You also don’t need a special Java compiler or virtual machine. All of this code compiles and runs using the normal Java 8 JDK. If there are errors in your code, the compiler will find them and report them to you, just like normal Java code.

Jinq supports queries that can be as complicated as SQL92. Selection, projection, joins, and subqueries are all supported. The algorithm for translating Java code into database queries is also very flexible in what code it will accept and translate. For example, Jinq has no problem translating the code below into a database query, despite its complexity.

customers
    .where( c -> c.getCountry().equals("Belgium") )
    .where( c -> {
        if (c.getSalary() < 100000)
            return c.getSalary() < c.getDebt();
        else
            return c.getSalary() < 2 * c.getDebt();
        } );

As you can see, the functional programming support in Java 8 is well-suited for writing database queries. The queries are compact, and complex queries are supported.

Inner Workings

But how does this all work? How can a normal Java compiler translate Java code into database queries? Is there something special about Java 8 that makes this possible?

The key to supporting these new functional-style database APIs is a type of bytecode analysis called symbolic execution. Although your code is compiled by a normal Java compiler and run in a normal Java virtual machine, Jinq is able to analyze your compiled Java code when it is run and construct database queries from them. Symbolic execution works best when analyzing small functions, which are common when using the Java 8 Streams API.

The easiest way to understand how this symbolic execution works is with an example. Let’s examine how the following query is converted by Jinq into the SQL query language:

customers
    .where( c -> c.getCountry().equals("Belgium") )

Initially, the customers variable is a collection that represents this database query

SELECT *
  FROM Customers C

Then, the where() method is called, and a function is passed to it. In this where() method, Jinq opens the .class file of the function and gets the compiled bytecode for the function to analyze. In this example, instead of using real bytecode, let’s just use some simple instructions to represent the bytecode of the function:

  1. d = c.getCountry()
  2. e = “Belgium”
  3. e = d.equals(e)
  4. return e

Here, we pretend that the function has been compiled by the Java compiler into four instructions. This is what Jinq sees when the where() method is called. How can Jinq make sense of this code?

Jinq analyzes the code by executing it. Jinq doesn’t run the code directly though. It runs the code ‘abstractly’. Instead of using real variables and real values, Jinq uses symbols to represent all values when executing the code. This is why the analysis is called symbolic execution.

Jinq executes each instruction and keeps track of all the side-effects or all the things that the code changes in the state of the program. Below is a diagram showing all the side-effects that Jinq finds when it executes the four lines of code using symbolic execution.

Symbolic execution example

Symbolic execution example

In the diagram, you can see how after the first instruction runs, Jinq finds two side-effects: the variable d has changed and the method Customer.getCountry() has been called. With symbolic execution, the variable d is not given a real value like “USA” or “Denmark”. It is assigned the symbolic value of c.getCountry().

After all the instructions have been executed symbolically, Jinq prunes the side-effects. Since the variables d and e are local variables, any changes to them are discarded after the function exits, so those side-effects can be ignored. Jinq also knows that the methods Customer.getCountry() and String.equals() do not modify any variables or show any output, so those method calls can also be ignored. From this, Jinq can conclude that executing the function produces only one effect: it returns c.getCountry().equals("Belgium").

Once Jinq has understood what the function passed to it in the where() method does, it can then merge this knowledge with the database query underlying the customers collection to create a new database query.

Generating a database query

Generating a database query

And that’s how Jinq generates database queries from your code. The use of symbolic execution means that this approach is quite robust to the different code patterns outputted by different Java compilers. If Jinq ever encounters code with side-effects that can’t be emulated using a database query, Jinq will leave your code untouched. Since everything is written using normal Java code, Jinq can just run that code directly instead, and your code will produce the expected results.

This simple translation example should have given you an idea of how the query translation works. You should feel confident that these algorithms can correctly generate database queries from your code.

An Exciting Future

I hope I have given you a taste for how Java 8 enables new ways of working with databases in Java. The functional programming support in Java 8 allows you write database code in a similar way to writing code for working with Java collections. Hopefully, existing database APIs will soon be extended to support these styles of queries.

To play with a prototype for these new types of queries, you can visit http://www.jinq.org

A Better Query Language than SQL


Leland Richardson, Founder of Tech.Pro has recently published a very interesting article about BQL, his visions of a better query language (than SQL). The deciding feat of his new language proposal is the fact that it is really a superset of SQL itself.

SQL is a very rich and expressive language to query relational databases. But it is awkward in many aspects, and a lot of people perceive it to be evolving only slowly – even if that is not true, considering the pace of SQL standards. But the standard is one thing, the implementations another – especially in the enterprise. When we blog about SQL, we’re constantly surprised ourselves, how awesome the PostgreSQL dialect is. But often, PostgreSQL actually just implements the standard. So there is hope that we’re getting somewhere.

Nonetheless, in Leland’s article, there are a couple of ideas worth picking up. From our point of view, these are mainly:

Flexibility in ordering the SELECT clause and the table expression

In SQL, SELECT is always the first keyword. It must be expressed before the table expression. We’ve shown in a previous article that this is quite confusing for many SQL users. While the existing syntax should continue to exist, it would be good to be able to inverse the SELECT clause and the table expression.

FROM table
WHERE predicate
GROUP BY columns
SELECT columns

Remember, the table expression contains FROM, WHERE, GROUP BY clauses, as well as vendor-specific CONNECT BY clauses and others:

<query specification> ::=
  SELECT [ <set quantifier> ] 
    <select list> <table expression>

This language feature is already available in LINQ, by the way.

Implicit KEY JOINs

This feature is also available in jOOQ, using the ON KEY clause. Note that Sybase also supports ON KEY joins:

from post 
key join user
key join comment
select *

Named projections

This is one of the features we really wish that the SQL language had. However, we wouldn’t count on specifying projections in a dedicated syntax. We had rather use an extension to the table expression syntax, allowing for a table to produce “side-tables” as such:

from dbo.users
with projection as (
  firstName, lastName, phoneNumber, email
)
select projection.*

In the above example, projection is really nothing else than another table expression that is derived from the users table. From a SQL syntax semantics, this would be extremely powerful, because such projections would inherit all syntactic features of a regular table. We’ve blogged about this before, when we called that feature “common column expressions”.

Conclusion

Leland has lots of other ideas. He’s just at the beginning of a project that will still need a lot of refinement. The feedback he got on reddit, however, is rather good. Clearly, there is a lot of potential in creating “BQL” for SQL what

Let’s see where this endeavour leads. We’ll certainly be keeping an eye out for BQL’s next steps.

The Power of Spreadsheets in a Reactive, RESTful API


Being mostly a techie, I’ve recently and admittedly been deceived by my own Dilbertesque attitude when I stumbled upon this buzzword-filled TechCrunch article about Espresso Logic. Ever concerned about my social media reputation (e.g. reddit and hackernews karma), I thought it would be witty to put a link on those platforms titled:

Just found this article on TechCrunch. Reads like a markov-chain-generated series of buzzwords.

With such a catchy headline, the post quickly skyrocketed – and like many other redditors, my thoughts were with Geek and Poke:

But like a few other redditors, I couldn’t resist clicking through to the actual product that claims to implement “reactive programming” through a REST and JSON API. And I’m frankly impressed by the ideas behind this product. For once, the buzzwords are backed by software implementing them very nicely! Let’s first delve into…

Reactive Programming

Reactive programming is a term that has gained quite some traction recently around Typesafe, the company behind Akka. It has also gained additional traction since Erik Meijer (creator of LINQ) has left Microsoft to fully dedicate his time to his new company Applied Duality. With those brilliant minds sharply on the topic, we’ll certainly hear more about the Reactive Manifesto in the near future.

excelBut in fact, every manager knows the merits of “reactive programming” already as they’re working with the most reactive and probably the most awesome software on the planet: Microsoft Excel, a device whose mystery is only exceeded by its power. Think about how awesome Excel is. You have hundreds of rules, formulas, cell-interdependencies. And any time you change a value, the whole spreadsheet magically updates itself. That’s Reactive Programming.

The power of reactive programming lies in its expressiveness. With only very little expressive logic, you can express what otherwise needs dozens of lines of SQL, or hundreds of lines of Java.

Espresso Logic

With this in mind, I started to delve into Espresso Logic’s free trial. Note, that I’m the kind of impatient person who wants quick results without reading the docs. In case you work the other way round, there are some interesting resources to get you started:

Anyway, the demo ships with a pre-installed MySQL database containing what looks like a typical E-Commerce schema containing customer, employee, lineitem, product, purchaseorder, and purchaseorder_audit tables:

The schema browsing view in Espresso Logic

The schema browsing view in Espresso Logic

So I get schema navigation information (such as parent / child relationships) and an overview of rules. These rules look like triggers calculating sums or validating things. We’ll get to these rules later on.

Live API

So far, things are as expected. The UI is maybe a bit edgy, as the product only exists since late 2013. But what struck me as quite interesting is what Espresso Logic calls the Live API. With a couple of clicks, I can assemble a REST Resource tree structure from various types of resources, such as database tables. The Espresso Designer will then almost automatically join tables to produce trees like this one:

The Resource Tree view of Espresso Logic

The Resource Tree view of Espresso Logic

Notice how I can connect child entities to their parents quite easily. Now, this API is still a bit limited. For instance, I couldn’t figure out how to drag-and-drop a reporting relationship where I calculate the order amount per customer and product. However, I can switch the Resource Type from “Normal” to “SQL” to achieve just that with a plain old GROUP BY and aggregate function.

I started to grasp that I’m actually managing and developing a RESTful API based on the available database resources! A little further down the menu, I then found the “Quick Ref” item, which helped me understand how to call this API:

A quick API reference

A quick API reference

So, each of the previously defined resources is exposed through a URL as I’d expect from any RESTful API. What looks really nice is that I have built-in API versioning and an API key. Note, it is strongly discouraged from an OWASP point of view to pass API keys around in GET requests. This is just a use-case for a quick-start demo and for the odd developer test. Do not use this in production!

Anyway, I called the URL in my browser with the API key as parameter (going against my own rules):

https://eval.espressologic.com/rest/[my-user]/demo/v1/AllCustomers?auth=[my-key]:1

And I got a JSON document like this:

[
  {
    "@metadata": {
      "href": "https://eval.espressologic.com/rest/[my-user]/demo/v1/Customers/Alpha%20and%20Sons",
      "checksum": "A:cf1f4fb79e8e7142"
    },
    "Name": "Alpha and Sons",
    "Balance": 105,
    "CreditLimit": 900,
    "links": [
    ],
    "Orders": [
      {
        "@metadata": {
          "href": "https://eval.espressologic.com/rest/[my-user]/demo/v1/Customers.Orders/6",
          "checksum": "A:0bf14e2d58cc97b5"
        },
        "OrderNumber": 6,
        "TotalAmount": 70,
        "Paid": false,
        "Notes": "Pack with care - fragile merchandise",
        "links": [
        ], ...

Notice how each resource has a link and a checksum. The checksum is needed for optimistic locking, which is built-in, should you choose to concurrently update any of the above resources. Notice also, how the nested resource Orders is referenced as Customers.Orders. I can also access it directly by calling the above URL.

Live Logic / Reactive Programming

So far so good. Similar things have been implemented in a variety of software. For instance, Adobe Experience Manager / Apache Sling intuitively exposes the JCR repository through REST as well. But where the idea behind Espresso Logic really started fascinating me is when I clicked on “Live Logic”, and I was exposed to a preconfigured set of rules that are applied to the data:

The rules view

The rules view

I’ve quickly skimmed through the manual to see if I understood correctly. These rules actually resemble the kind of rules that I can enter in any spreadsheet software. For instance, it appears as though the customer.balance column is calculated as the sum of all purchaseorder.amount_total having a paid value of false, and so on.

So, if I continue through this rule-chain I’ll wind up with lineitem.product_price being the shared dependency of all other calculated values. When changing that value, a whole set of updates should run through my rule set to finally change the customer.balance:

changing lineitem.product_price
-> changes lineitem.amount
  -> changes purchaseorder.amount_total
    -> changes customer.balance

Depending how much of a console hacker you are, you might want to write your own PUT call using curl, or you can leverage the REST Lab from the Espresso Designer, which helps you get all the parameters right. So, assuming we want to change a line item from the previous call:

{
  "@metadata": {
    "href": "https://eval.espressologic.com/rest/[my_user]/demo/v1/Customers.Orders.LineItems/11",
    "checksum": "A:2e3d8cb0bff42763"
  },
  "lineitem_id": 11,
  "ProductNumber": 2,
  "OrderNumber": 6,
  "Quantity": 2,
  "Price": 25,
  ...

Let’s just try to update that to have a price of 30:

Using the REST lab to execute PUT requests

Using the REST lab to execute PUT requests

And you can see in the response, there is a transaction summary, which shows that the Customers.Orders.TotalAmount has changed from 50 to 60, the Customers.Balance has changed from 105 to 95, and an audit record has been written. The audit record itself is also defined by a rule like any other rule. But there’s also an ordinary log file that shows what really happened when I ran this PUT request:

The log view showing all the INSERTs and UPDATEs executed

The log view showing all the INSERTs and UPDATEs executed

Imagine having to put all those INSERT and UPDATE statements into a correct order yourself, and correctly manage caching, and transactions! Instead, all we have done is define some rules. For a complete overview of what rule types are available, consider this page of the Live Logic manual

Out of scope features for this post

… So far, we’ve had a look at the most obvious features of Espresso Logic. There are more, though. A couple of examples:

Server-side JavaScript

If rules cannot express it, JavaScript can. There are various points of the application where you can inject your JavaScript snippets, e.g. for validation, more complex rule expressions, request and response transformation, etc. Although we haven’t tried it, it reads like row-based triggers written in JavaScript.

Stored procedure support

The people behind Espresso Logic are “legacy-embracing” people, just like us at Data Geekery. Their target audience might already have thousands of complex stored procedures with lots of business logic in them. Those should not be rewritten in JavaScript. But just like tables, views, and REST resources, they are exposed through the REST API, taking GET parameters for IN parameters and returning JSON for OUT parameters and cursors.

From a jOOQ perspective, it’s pretty awesome to see that someone else is taking stored procedures as seriously as we do.

Row / column level security

There is a built-in user and role management module that allows you to provide centrally-managed, fine-grained access control to your data. Not many databases support row-level security like the Oracle database, for instance. So having this kind of feature in your platform really adds value to many RDBMS integrations. Some further resources on that topic:

Conclusion: Querying vs. updating vs. rule-based persistence

On our jOOQ blog and our marketing websites (e.g. hibernate-alternative.com), we always advocate two main use-cases when operating on databases:

  • Querying: You have very complex queries to calculate things like reports. For this, SQL (e.g. through jOOQ) is perfect
  • Updating: You have a very complex domain model with lots of items and deltas that you want to persist in one go. For this, Hibernate / ORMs are perfect

But today, Espresso Logic has shown to us that there is yet another use-case. One that is covered by reactive programming (or “spreadsheet-programming“) techniques. And that’s:

  • Rule-based persistence: You have a very complex domain model with lots of items and lots of rules which you want to validate, calculate, and keep in sync all the time. For this, both SQL and ORMs are solutions at the wrong level of abstraction.

This “new” use-case is actually quite common in a lot of enterprise applications where complex business rules are currently spelled out in millions of lines of imperative code that is very hard to decipher and even harder to validate / modify. How can you reverse-engineer your business rules from millions of lines of legacy code, written in COBOL?

At Data Geekery, we’re always looking out for brand new tech. Espresso Logic is a young startup with a new product. Yet, as originally mentioned, they’re a startup with seed funding, a very compelling and innovative idea, and a huge market of legacy COBOL applications that wants to start delving into “sexy” new technologies, such as RESTful APIs, JSON, reactive programming. It might just work! If you haven’t seen enough, go work through this tutorial, which covers advanced examples such as a “bill of materials price rollup”, “bill of materials kit explosion”, “budget rollup”, “audit salary chagnes” and more.

We’ll certainly keep an eye out for future improvements to the Espresso Logic platform!

A SQL query DSL for Scala by ScalikeJDBC


There are a tremendous amount of SQL APIs natively written in Scala. Manuel Bernhardt has summarised a nice collection in his a post. Another collection of Scala SQL APIs can be seen in this Stack Overflow question.

One API that we want to focus on in particular is ScalikeJDBC (licensed ASL 2.0), which has recently published a SQL query DSL API similar to that of jOOQ. See the full documentation here:

http://scalikejdbc.org/documentation/query-dsl.html

A couple of examples:

val orders: List[Order] = withSQL {
  select
    .from(Order as o)
    .innerJoin(Product as p).on(o.productId, p.id)
    .leftJoin(Account as a).on(o.accountId, a.id)
    .where.eq(o.productId, 123)
    .orderBy(o.id).desc
    .limit(4)
    .offset(0)
  }.map(Order(o, p, a)).list.apply()

The above example looks very similar to jOOQ code, except that the SELECT DSL seems to be a bit more rigid than jOOQ’s. For instance, it is not immediately obvious how to connect several complex predicates in that WHERE clause, or if complex predicates are available at all.

What’s really nice, however, is their way of leveraging Scala language features to provide a very fluent way of constructing dynamic SQL, as can be seen in this example:

def findOrder(id: Long, accountRequired: Boolean) = 
withSQL {
  select
    .from[Order](Order as o)
    .innerJoin(Product as p).on(o.productId, p.id)
    .map { sql =>
      if (accountRequired) 
        sql.leftJoin(Account as a)
           .on(o.accountId, a.id)
      else 
        sql
    }.where.eq(o.id, 13)
  }.map { rs =>
    if (accountRequired) 
      Order(o, p, a)(rs) 
    else 
      Order(o, p)(rs)
  }.single.apply()

From how we understand things, the map method that is invoked in the middle of the SQL statement (between innerJoin and where) can transform the intermediate DSL state using a lambda expression that allows for appending a leftJoin if needed. Obviously, this can be done in a more procedural fashion as well, by assigning that intermediate DSL state to a local variable.

The need for SQL query DSLs

We’ve blogged about many of these similar SQL query DSLs in the past. The fact that they constantly pop up in various APIs is no coincidence. SQL is a very typesafe and composable language that is hard to use dynamically through string-based APIs such as JDBC, ODBC, etc.

Having a typesafe internal domain-specific language model SQL in a host language like Java or Scala brings great advantages. But the disadvantages may shine through quickly, when the DSL is not carefully crafted in a completely foreseeable way. Take the following ScalikeJDBC QueryDSL example, for instance:

val ids = withSQL {
  select(o.result.id).from(Order as o)
    .where(sqls.toAndConditionOpt(
      productId.map(id => sqls.eq(o.productId, id)),
      accountId.map(id => sqls.eq(o.accountId, id))
    ))
    .orderBy(o.id)
}.map(_.int(1)).list.apply()

This toAndConditionOpt method is really unexpected and doesn’t follow the principle of least astonishment.

This is why jOOQ’s API design is based on a formal BNF that closely mimicks SQL itself. Read more about that here.

Java 8 Friday Goodies: Lean Concurrency


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Java 8 Goodie: Lean Concurrency

Someone once said that (unfortunately, we don’t have the source anymore):

Junior programmers think concurrency is hard.
Experienced programmers think concurrency is easy.
tweet thisSenior programmers think concurrency is hard.

That is quite true. But on the bright side, Java 8 will at least improve things by making it easier to write concurrent code with lambdas and the many improved APIs. Let’s have a closer look:

Java 8 improving on JDK 1.0 API

java.lang.Thread has been around from the very beginning in JDK 1.0. So has java.lang.Runnable, which is going to be annotated with FunctionalInterface in Java 8.

It is almost a no-brainer how we can finally submit Runnables to a Thread from now on. Let’s assume we have a long-running operation:

public static int longOperation() {
    System.out.println("Running on thread #"
       + Thread.currentThread().getId());

    // [...]
    return 42;
}

We can then pass this operation to Threads in various ways, e.g.

Thread[] threads = {

    // Pass a lambda to a thread
    new Thread(() -> {
        longOperation();
    }),

    // Pass a method reference to a thread
    new Thread(ThreadGoodies::longOperation)
};

// Start all threads
Arrays.stream(threads).forEach(Thread::start);

// Join all threads
Arrays.stream(threads).forEach(t -> {
    try { t.join(); }
    catch (InterruptedException ignore) {}
});

As we’ve mentioned in our previous blog post, it’s a shame that lambda expressions did not find a lean way to work around checked exceptions. None of the newly added functional interfaces in the java.util.function package allow for throwing checked exceptions, leaving the work up to the call-site.

jool-logo-blackIn our last post, we’ve thus published jOOλ (also jOOL, jOO-Lambda), which wraps each one of the JDK’s functional interfaces in an equivalent functional interface that allows for throwing checked exceptions. This is particularly useful with old JDK APIs, such as JDBC, or the above Thread API. With jOOλ, we can then write:

// Join all threads
Arrays.stream(threads).forEach(Unchecked.consumer(
    t -> t.join()
));

Java 8 improving on Java 5 API

Java’s multi-threading APIs had been pretty dormant up until the release of Java 5′s awesome ExecutorService. Managing threads had been a burden, and people needed external libraries or a J2EE / JEE container to manage thread pools. This has gotten a lot easier with Java 5. We can now submit a Runnable or a Callable to an ExecutorService, which manages its own thread-pool.

Here’s an example how we can leverage these Java 5 concurrency APIs in Java 8:

ExecutorService service = Executors
    .newFixedThreadPool(5);

Future[] answers = {
    service.submit(() -> longOperation()),
    service.submit(ThreadGoodies::longOperation)
};

Arrays.stream(answers).forEach(Unchecked.consumer(
    f -> System.out.println(f.get())
));

Note, how we again use an UncheckedConsumer from jOOλ to wrap the checked exception thrown from the get() call in a RuntimeException.

Parallelism and ForkJoinPool in Java 8

Now, the Java 8 Streams API changes a lot of things in terms of concurrency and parallelism. In Java 8, you can write the following, for instance:

Arrays.stream(new int[]{ 1, 2, 3, 4, 5, 6 })
      .parallel()
      .max()
      .ifPresent(System.out::println);

While it isn’t necessary in this particular case, it’s still interesting to see that the mere calling of parallel() will run the IntStream.max() reduce operation on all available threads of your JDK’s internal ForkJoinPool without you having to worry about the involved ForkJoinTasks. This can be really useful, as not everybody welcomed the JDK 7 ForkJoin API the complexity it has introduced.

Read more about Java 8′s parallel streams in this interesting InfoQ article.

More on Java 8

Parallelism was one of the main driving forces behind the new Streams API. Being able to just set a flag called parallel() on a Stream is marvellous in many situations.

In the last example, we’ve seen the OptionalInt.ifPresent() method that takes an IntConsumer argument to be executed if the previous reduce operation succeeded.

Other languages such as Scala have known an “Option” type to improve NULL handling. We’ve blogged about Optional before, and we’ll reiterate the Java 8 Optional type in the context of Java 8 Streams, so stay tuned!

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

The 10 Commandments of Programming


Patterns 34:29

As Turing descended from Mount Compute – with the two iPads of the testimony in his hands as he descended the mountain – he did not realize that the skin of his blog shone as a result of his Compiling the Code.

Patterns 35:1

Turing assembled the entire Geek community and said to them, “These are the things that the Compiler has commanded you to do:”

  1. Thou shalt not GOTO
  2. Thou shalt not TODO
  3. Thou shalt not catch all
  4. Thou shalt not <br>
  5. Thou shalt not label thy code to break or continue
  6. Thou shalt not bear false witness or side effects in getters
  7. Thou shalt not neglect your curly braces
  8. Thou shalt not desire the Unsafe
  9. Thou shalt not covet your neighbor’s private fields or methods
  10. Thou shalt not deceive with cleverness in names

But the Geek community did not obey, and the Compiler was not amused:

Disclaimer

Neither Turing nor the Compiler may not have actually said these things

jOOQ Newsletter: March 12, 2014


Subscribe to the newsletter here

Tweet of the Day

Our customers, users, and followers are sharing their love for jOOQ to the world. Here are:

Dominik Dorn who finds jOOQ awesome. As simple as that!

Mariusz Nosiński who cannot believe he hasn’t discovered jOOQ before.

Thanks for the shouts, guys!

jOOQ and Scala

In our last newsletter, we’ve given you a brief outlook about the new features we’re planning on integrating into jOOQ 3.4. But the jOOQ labs are also working hard on a more long-term roadmap. One important item on that roadmap is our Scala integration.

Already today, jOOQ and Scala can be integrated effortlessly thanks to Scala’s various language features. Writing jOOQ-SQL in Scala isn’t far away from “the real thing”. Would you have guessed that the following is actual jOOQ code?

select (
  BOOK.ID * BOOK.AUTHOR_ID,
  BOOK.ID + BOOK.AUTHOR_ID * 3 + 4,
  BOOK.TITLE || " abc" || " xy"
)
from BOOK
leftOuterJoin (
  select (x.ID, x.YEAR_OF_BIRTH)
  from x
  limit 1
)
on BOOK.AUTHOR_ID === x.ID
where (BOOK.ID <> 2)
and (BOOK.TITLE in ("O Alquimista", "Brida"))
fetch

Together with the very active Vienna Scala User Group, we’re going to challenge this integration in April, to gather new ideas for an exciting future! Scala currently doesn’t have any appropriate SQL framework that fully embraces SQL as a first-class language. We’re going to be there to fill this gap in 2014.

If you’re in Vienna on April 7, be sure not to miss out on this event!

Community zone

jOOQ is not only about jOOQ-the-product™. jOOQ is a whole experience for Java and SQL developers. As a frequent reader of this newsletter, you are already part of this experience. An early adopter of what will continue to evolve into a great technology in the near future.

Proof can be seen in a comparison we got from MailChimp, between the jOOQ newsletter and the industry average for such newsletters. These numbers are from the February 26 mailing:

As a reader of our newsletter, you are among the most active and most interested peers of our industry. It makes us proud to have you on board as our readers, customers, followers.

But let us reach even more like-minded people! Do you have ideas? Contributions? Blog posts that you want to share? Drop us an E-Mail. Join the community. Share the experience with your friends and co-workers. Together, we’ll make the Java and SQL experience awesome!

Java Zone – how to write SQL in Java 8

Java 8 is a game changer for everyone in our industry. Many other platforms have known lambda expressions and functional programming for a long time. Finally, we Java folks can join the fun they have been having with their languages.

We’ve recently published an overview of a variety of ways to interact with plain SQL from Java using JDBC, using jOOλ (also jOOL, jOO-Lambda, our latest Open Source product to improve the Java 8 lambda expression experience), using jOOQ, using Spring JDBC, and using Apache DbUtils.

You can find this overview, which is part of our Java 8 Friday blog series here:
http://www.jooq.org/java-8-and-sql.

But Java 8 is not only about lambda expressions, of course. A very comprehensive list of resources can be found in this excellent assembly:
http://www.baeldung.com/java8.

Upcoming Events

Have you missed any of our previous jOOQ talks? Soon you’ll get another chance to hear us talk about jOOQ or SQL in general in any of these upcoming events:

We’re also happy to present jOOQ at in-house events and conferences this month

  • March 25, 2014: SBB, Swiss Federal Railways (German, about jOOQ)
  • March 29, 2014: Trivadis TechEvent (English, about jOOQ)

Do you work for any of the above companies? Don’t miss our talk! Do you work for an other company and would like to host such a talk at your venue? Contact us!

Stay informed about 2014 events on www.jooq.org/news.

Please, Run That Calculation in Your RDBMS


There’s one thing that you can do terribly wrong when working with RDBMS. And that thing is not running your calculations in the database, when you should.

We’re not advocating to blindly move all business logic into the database, but when I see a Stack Overflow question like this, I feel the urge to gently remind you of the second item in our popular 10 Common Mistakes Java Developers Make When Writing SQL.

Gently reminding you of the "right way"

Gently reminding you of the second item.
Some background on this illustration here (in German).tweet this

The Stack Overflow question essentially boils down to this (liberally quoted):

From the following medium-sized table, I wish to count the number of documents with status 0 or 1 per application ID:

AppID | DocID | DocStatus 
------+-------+----------
1     | 100   | 0
1     | 101   | 1    
2     | 200   | 0    
2     | 300   | 1
...   | ...   | ...

Should I use Hibernate for that?

And the answer: NO! Don’t use Hibernate for that (unless you mean native querying). You should use SQL for that. Es-Queue-El! You have so many trivial options to make your SQL Server help you run this query in a fraction of the time it would take if you loaded all that data into Java memory before aggregating!

For instance (using SQL Server):

Using GROUP BY

This is the most trivial one, but it might not return result in exactly the way you wanted, i.e. different aggregation results are in different rows:

SELECT [AppID], [DocStatus], count(*)
FROM [MyTable]
GROUP BY [AppID], [DocStatus]

Example on SQLFiddle, returning something like

| APPID | DOCSTATUS | COLUMN_2 |
|-------|-----------|----------|
|     1 |         0 |        2 |
|     2 |         0 |        3 |
|     1 |         1 |        3 |
|     2 |         1 |        2 |

Using nested selects

This is probably the solution that this particular user was looking for. They probably want each aggregation in a separate column, and one very generic way to achieve this is by using nested selects. Note that this solution might prove to be a bit slow in some databases that have a hard time optimising these things

SELECT [AppID],
       (SELECT count(*) FROM [MyTable] [t2]
        WHERE [t1].[AppID] = [t2].[AppID]
        AND [DocStatus] = 0) [Status_0],
       (SELECT count(*) FROM [MyTable] [t2]
        WHERE [t1].[AppID] = [t2].[AppID]
        AND [DocStatus] = 1) [Status_1]
FROM [MyTable] [t1]
GROUP BY [AppID]

Example on SQLFiddle, returning something like

| APPID | STATUS_0 | STATUS_1 |
|-------|----------|----------|
|     1 |        2 |        3 |
|     2 |        3 |        2 |

Using SUM()

This solution is probably the optimal one. It is equivalent to the previous one with nested selects, although it only works for simple queries, whereas the nested selects version is more versatile.

SELECT [AppID],
       SUM(IIF([DocStatus] = 0, 1, 0)) [Status_0],
       SUM(IIF([DocStatus] = 1, 1, 0)) [Status_1]
FROM [MyTable] [t1]
GROUP BY [AppID]

Example on SQLFiddle, same result as before

Using PIVOT

This solution is for the SQL Aficionados among yourselves. It uses the T-SQL PIVOT clause!

SELECT [AppID], [0], [1]
FROM (
    SELECT [AppID], [DocStatus]
    FROM [MyTable]
) [t]
PIVOT (
    count([DocStatus]) 
    FOR [DocStatus] 
    IN ([0], [1])
) [pvt]

SQL aficionados use PIVOT tweet this

Example on SQLFiddle, same result as before

Conclusion

You may freely choose your weapon among the above suggestions, and I’m sure there are more alternatives. All of them will outperform any Java-based aggregation implementation by orders of magnitude, even for trivially small data sets for sure. We’ll say this time and again, and we’ll quote Gavin King time and again for the same thing:

Just because you’re using Hibernate, doesn’t mean you have to use it for everything. A point I’ve been making for about ten years now.

And in our words:

Use SQL whenever appropriate! And that is much more often than you might think!

Follow

Get every new post delivered to your Inbox.

Join 1,234 other followers

%d bloggers like this: