Tag Archive | jooq

Don’t be Fooled by Generics and Backwards-Compatibility. Use Generic Generic Types

I’ve recently had a very interesting discussion with Sebastian Gruber from Ergon, a very early jOOQ customer, whom we’re in close touch with. Talking to Sebastian has lead our engineering team to the conclusion that we should completely rewrite the jOOQ API. Right now, we already have lots of generics for various purposes, e.g.

  • Generics for column types, such as
    interface Field<T> { ... }
    Field<String> field = BOOK.TITLE;
  • Generics for table types, such as
    interface Table<R extends Record> { ... }
    Table<BookRecord> books = BOOK;
  • Combined generics where both <T> and <R> are used
  • … and much more

Sometimes, you just cannot anticipate how many different generic types you’ll need on your classes and interfaces two years down the line, and the problem with Java is: You can generify your classes only exactly once. Let’s assume that you’ve always had a type like this:

class Foo {}

Now you happen to know that you need two generic type parameters right now:

// Still compatible
class Foo<Bar, Baz> {}

That’ll work and all the existing client code will still compile, with a rawtype warning. But once you’ve published Foo<Bar, Baz>, you can no longer add more type variables to it, or remove them. Every modification will break client code!

// Breaking change
class Foo<Bar, Baz, Fizz> {}

The solution: Generic generic types

We don’t want to place that burden upon our customers, the heavy burden of backwards-incompatibility. This is why we’re now publishing our next release of jOOQ with a new feature that we call generic generic types. How does it work? It’s easy. We’ve learned from the best database designers who have already been using generic column types all along. In SQL, if you run into this kind of problem, you’d simply write:

    bar int,
    baz int,
    fizz int,

    generic_1 varchar(4000),
    generic_2 varchar(4000),
    generic_3 varchar(4000),
    generic_4 varchar(4000),
    -- [...]

Now your SQL schema is safe for ages to come. We’ll do the same in Java:

class Foo<

    // [...]
> {}

We’ll thus generify all our types to have exactly 256 generic type parameters. 256 was the sensible limit that MS Access chose for the number of possible columns. That way, our customers will only have to upgrade to the new version of jOOQ once and from then on, generic type backwards-compatibility will be guaranteed forever.

Happy coding!

Hack up a Simple JDBC ResultSet Cache Using jOOQ’s MockDataProvider

Some queries shouldn’t hit the database all the time. When you query for master data (such as system settings, languages, translations, etc.), for instance, you may want to avoid sending the same silly query (and the results) over the wire all the time. For example:

SELECT * FROM languages

Most databases maintain buffer caches to accelerate these queries, so you don’t always hit the disk. Some databases maintain result set caches per cursor, or their JDBC drivers might even implement result set caches directly in the driver – a little known feature in Oracle, for instance:

SELECT /*+ RESULT_CACHE */ * FROM languages

But you may not be using Oracle, and because patching JDBC is a pain, you might have resorted to implementing the cache one or two layers up in the data access or service layer:

class LanguageService {
    private Cache cache;

    List<Language> getLanguages() {
        List<Language> result = cache.get();

        if (result == null) {
            result = doGetLanguages();

        return result;

Doing it in the JDBC layer, instead

While this might work fine on a per-service-and-method level, it might quickly become tedious when you query only parts of those results. E.g. what happens when you add an additional filter? Should you cache that query as well? Should you perform the filter on the cache, or hit the database at least once per filter?

class LanguageService {
    private Cache cache;

    List<Language> getLanguages() { ... }
    List<Language> getLanguages(Country country) {
        // Another cache?
        // Query the cache only and delegate to
        //     getLanguages()?
        // Or don't cache this at all?

wouldn’t it be nice if we had a cache of the form:

Map<String, ResultSet> cache;

… which caches re-usable JDBC ResultSets (or better: jOOQ Results) and returns the same results every time an identical query string is encountered.

Use jOOQ’s MockDataProvider for this

jOOQ ships with a MockConnection, which implements the JDBC Connection API for you, mocking all other objects, such as PreparedStatement, ResultSet, etc. We’ve already introduced this useful tool for unit testing in a previous blog post.

But you can “mock” your connection also in order to implement a cache! Consider the following, very simple MockDataProvider:

class ResultCache implements MockDataProvider {
    final Map<String, Result<?>> cache = 
        new ConcurrentHashMap<>();
    final Connection connection;

    ResultCache(Connection connection) {
        this.connection = connection;

    public MockResult[] execute(MockExecuteContext ctx)
    throws SQLException {
        Result<?> result;

        // Add more sophisticated caching criteria
        if (ctx.sql().contains("from language")) {

            // We're using this very useful new Java 8
            // API for atomic cache value calculation
            result = cache.computeIfAbsent(
                sql -> DSL.using(connection).fetch(

        // All other queries go to the database
        else {
            result = DSL.using(connection).fetch(

        return new MockResult[] { 
            new MockResult(result.size(), result)

Obviously, this is a very simplistic example. A real cache would involve invalidation (time-based, update-based, etc.) as well as more selective caching criteria than just matching on from language.

But the fact is that using the above ResultCache, we can now wrap all JDBC connections and prevent hitting the database more than once for all queries that query from the language table! An example using jOOQ API:

DSLContext normal = DSL.using(connection);
DSLContext cached = DSL.using(
    new MockConnection(new ResultCache(connection))

// This executs a select count(*) from language query
assertEquals(4, cached.fetchCount(LANGUAGE));
assertEquals(4, normal.fetchCount(LANGUAGE));

// Let's add another language (using normal config):
LanguageRecord lang = normal.newRecord(LANGUAGE);

// Checking again on the language table:
assertEquals(4, cached.fetchCount(LANGUAGE));
assertEquals(5, normal.fetchCount(LANGUAGE));

The cache works like a charm! Note that the current cache implementation is merely SQL string based (as it should be). If you modify the SQL string even only slightly, you’ll experience another cache miss and the query goes back to the database:

// This query is not the same as the cached one, it
// fetches two count(*) expressions. Thus we go back
// to the database and get the latest result.
assertEquals(5, (int) cached

// This still has the "stale" previous result
assertEquals(4, cached.fetchCount(LANGUAGE));


Caching is hard. Very hard. Apart from concurrency, naming things and off-by-one errors, it’s one of the three hardest problems in software.

This article doesn’t recommend to implement a cache at the JDBC level. You may or may not make that decision yourself. But when you do, then you can see how easy it is to implement such a cache using jOOQ.

jOOQ is the best way to write SQL in Java

And the best thing is that you don’t have to use jOOQ in all of your application. You can use it just for this particular use-case (and for mocking JDBC), and continue using JDBC, MyBatis, Hibernate, etc, as long as you patch other framework’s JDBC Connections with the jOOQ MockConnection.

jOOQ vs. Hibernate: When to Choose Which

Hibernate has become a de-facto standard in the Java ecosystem, and after the fact, also an actual JavaEE standard implementation if standards matter to you, and if you put the JCP on the same level with ISO, ANSI, IEEE, etc.

This article does not intended to discuss standards, but visions. Hibernate shares JPA’s vision of ORM. jOOQ shares SQL’s vision of powerful querying, so for the sake of the argument, let’s use Hibernate / JPA / ORM interchangeably, much like jOOQ / JDBC / SQL.

The question why should anyone not use Hibernate these days always shows up frequently – precisely because Hibernate is a de-facto standard, and the first framework choice in many other frameworks such as Grails (which uses GORM, which again uses Hibernate).

However, even Gavin King, the creator of Hibernate, doesn’t believe that Hibernate should be used for everything:


If that’s the case, are there any objective decision helping points that you could consider, when to use an ORM and when to use SQL?

Discussing on a high level

First off, let’s bring this discussion to a higher level. Instead of deciding between Hibernate and jOOQ as concrete implementations of their own domains, let’s think about ORM vs. SQL, and their different use-cases.

When deciding between an ORM (e.g. Hibernate) and SQL (e.g. jOOQ), the driving question that you should ask yourself is not the question of project complexity. Some of our most demanding customers are using jOOQ on medium-sized schemas with thousands of tables / views. Often, those schemas are extremely normalised and sometimes even deployed on as many as six different RDBMS. jOOQ was specifically designed to work in these scenarios, while keeping the simple use-case in mind as well.

So, instead of thinking about project complexity, ask yourself the following questions:

  • 1. Will your data model drive your application design, or will your application design drive your data model(s)?

    A main aspect here is the question whether you “care” about your database in the sense of whether it might survive your application. Very often, applications come and go. They may be re-written in Python / JavaScript, etc. 5 years down the line. Or you have multiple applications accessing the same database: Your Java application, some Perl scripts, stored procedures, etc. If this is the case, database design is a priority in your project, and jOOQ works extremely well in these setups.

    If you don’t necessarily “care” about your database in the sense that you just want to “persist” your Java domain somewhere, and this happens to be a relational database, then Hibernate might be a better choice – at least in early stages of your project, because you can easily generate your database schema from your Entity model.

  • 2. Will you do mostly complex reading and simple writing, or will you engage in complex writing?

    SQL really shines when reading is complex. When you join many tables, when you aggregate data in your database, when you do reporting, when you do bulk reading and writing. You think of your data in terms of set theory, e.g. your data as a whole. Writing CRUD with SQL is boring, though. This is why jOOQ also provides you with an ActiveRecord-style API that handles the boring parts, when you’re operating on single tables (Jason mentioned this).

    If, however, your writing becomes complex, i.e. you have to load a complex object graph with 20 entities involved into memory, perform optimistic locking on it, modify it in many different ways and then persist it again in one go, then SQL / jOOQ will not help you. This is what Hibernate has originally been created for.


I believe that data is forever. You should *always* assume that your database survives your application. It is much easier to rewrite (parts of) an application than to migrate a database. Having a clean and well-designed database schema will always pay off down the line of a project, specifically of a complex project. See also our previous article about the fallacy of “schemaless” databases.

Also, most projects really do 90% reading and 10% writing, writing often not being complex (2-3 tables modified within a transaction). This means that most of the time, the complexity solved by Hibernate / JPA’s first and second level caches is not needed. People often misunderstand these features and simply turn off caching, flushing Hibernate’s cache to the server all the time, and thus using Hibernate in the wrong way.

If, however, you’re undecided about the above two axes of decision, you can go the middle way and use jOOQ only for reporting, batch processing, etc. and use Hibernate for your CRUD – in a CQRS (Command Query Responsibility Segregation: http://martinfowler.com/bliki/CQRS.html) style. There are also quite a few jOOQ users who have chosen this path.

Further reading

Integrating jOOQ with Grails Featuring the UWS-jOOQ Plugin


Grails is a web framework aimed to boost development productivity. One of the main features is domain centric database schema generation. Applications built with Grails are able to update existing schema just before they start. To do this, Grails is using built-in domain mappers or migrations in more advanced cases. The goal of the UWS-jOOQ Grails-plugin is to integrate jOOQ into the existing Grails lifecycle in order to leverage features of jOOQ without compromising the ones provided by Grails.

This article is part of a series brought to you by the Germany based jOOQ integration partner UWS Software Service (UWS). UWS is specialised in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

Why should I use jOOQ with Grails?

In enterprise applications we often face issues with Hibernate performance, lack of support of some statements or just too much hassle caused by the Hibernate model. Hibernate’s Query Language HQL often is not sophisticated enough to cope with some requirements. This forces us to use plain SQL, which is not bad and helps solving specific business problems. However in big projects where a larger group of people is involved and a product continuously evolves, type-safety is very precious but is thrown away when using plain SQL. That’s the moment where the jOOQ framework excels and the UWS-jOOQ Grails-Plugin comes into the game.

How can I Integrate jOOQ into Grails?

We tried to provide a simple integration of jOOQ into Grails using Grails built-in dependency resolution. Just add the following line to the plugins section of your BuildConfig.groovy:

compile ':uws-jooq:0.1.1'

Add plugin configuration to your Config.groovy:

jooq.dataSource = ['dataSource']
jooq.xmlConfigDir = "src/resources/jooq/"
jooq.generatedClassOutputDirectory = 'src/java'
jooq.generatedClassPackageName.dataSource = 'ie.uws.example'

As in this example the plugin allows to configure the datasources to be used and also some of the key-paths. Eventually you want to also take a look at a customized version of the Config.groovy in our sample integration project.

Next jOOQ needs an xml configuration file, which can be generated by the plugin using the following command. The plugin will use your existing datasources and their configuration to generate the jOOQ configuration:

grails jooq-generate-config

Now all the configuration is ready and it is time to get to one of the main features of jOOQ which is type-safe SQL. With the following command jOOQ will generate Java-classes which you want to use when writing SQL as they will give you compiler-based autocompletion:

grails jooq-init

Now that everything is in place, let’s say you would like to insert a new record into your database via jOOQ in one of your Controllers. It’s as simple as that:

class ExampleController {
  JooqService jooqService

  def insert() {
    DSLContext dsl = jooqService.dataSource	
    BookRecord record = dsl.newRecord(Tables.BOOK)
    record.author = "John"
    record.name = "Uws"
    record.version = 1

How does the integration of jOOQ with Grails work under the hood?

In the example above you noticed the JooqService which is dependency-injected by Grails. The JooqService is your entrypoint when it comes to using jOOQ within Grails as it is able to pick your datasource and provide the jOOQ DSL context for you. When you have multiple datasources it also allows you to select a different datasource just by providing the name of it:

DSLContext dsl = jooqService.dataSource_custom

Note that autocompletion won’t tell you about the existence of a dataSource_custom field but JooqService will handle that for you.

In version 0.1 we added JooqService but DSLContext have to generated based on your databases schema. So it is important to execute jooq-init command every time you change your Grails domain model since this command compiles your code and executes all migrations so that latest Java-classes are generated on the latest database-schema. Thanks to this approach it is possible to generate structures even from an in-memory H2 database which will not be available right after the application will be shut down.

Best-practices for using jOOQ with Grails

Integrate legacy databases

You might face the situation where you have to connect to some legacy database using the Grails framework. It’s doable, for sure, but you have to create the right Hibernate mapping first or – with some luck – let Hibernate generate the right one for you. With this plugin you can just let jOOQ generate its Java-classes and you’re ready to communicate with the database using a fully type-safe DSL.

Let a database schema change break your code

It’s one of the most precious gifts when you know that something will break early. With using jOOQ in Grails it will happen during compilation time. When you are executing jooq-init, your application is compiled and the Java-classes are regenerated using the latest database schema. If the generated classes change, you will be notified that your code is not able to compile anymore. You can fix your SQL statements and ensure that your application won’t break during runtime.

Keep generated classes in your version control system

We recommend you to check in jOOQs generated classes into your VCS along with the rest of your applications source code. When you are using jOOQ classes to communicate with the database it’s mandatory for compilation to have those classes already defined. Do you use a different approach? Please let us know!


We’re planning to simplify integration even more and attach jooq-init into regular grails compilation process. Also we’d like to make our plugin harder to misuse (see jooqService section) and add the possibility to use jOOQ not only in services or controllers but also in plain Java classes.

Contribution to UWS-jOOQ Grails-Plugin

This software is distributed under the Apache License, Version 2.0. We want to keep this software free and provide services for people
who integrate jOOQ and Grails. If you’re interested in this project feel free to submit issues or pull requests to the project’s git repository.

Further reading

The following links provide additional information about the UWS-jOOQ Grails-Plugin :

General information about UWS or jOOQ can be found here:

jOOQ – Ein alternativer Weg mit Java und SQL zu arbeiten

We’ve published an article in the German magazine www.java-aktuell.de, which is published by the iJUG e.V..

You can read and download the article free of charge from our blog!

In Java gibt es kein Standard-API, das die Ausdrucksstärke und Mächtigkeit von SQL direkt unterstützt. Alle Aufmerksamkeit ist auf objekt-relationales Mapping und andere höhere Abstraktionslevel gerichtet, beispielsweise OQL, HQL, JPQL, CriteriaQuery. jOOQ ist ein dual-lizenziertes Open-Source-Produkt, das diese Lücke füllt. Es implementiert SQL als typsichere domänen-spezifische Sprache direkt in Java und ist eine gute Wahl für Java-Applikationen, in denen SQL und herstellerspezifische Datenbankfunktionalität wichtig sind. Es zeigt, wie eine moderne domänenspezifische Sprache die Entwicklerproduktivität stark erhöhen kann, indem SQL direkt in Java eingebettet ist.

jOOQ vs. Slick – Pros and Cons of Each Approach

Every framework introduces a new compromise. A compromise that is introduced because the framework makes some assumptions about how you’d like to interact with your software infrastructure.

An example of where this compromise has struck users recently is the discussion “Are Slick queries generally isomorphic to the SQL queries?“. And, of course, the answer is: No. What appears to be a simple Slick query:

val salesJoin = sales 
      join purchasers 
      join products 
      join suppliers on {
  case (((sale, purchaser), product), supplier) =>
    sale.productId === product.id &&
    sale.purchaserId === purchaser.id &&
    product.supplierId === supplier.id

… turns into a rather large monster with tons of derived tables that are totally unnecessary, given the original query (formatting is mine):

select x2.x3, x4.x5, x2.x6, x2.x7 
from (
    select x8.x9 as x10, 
           x8.x11 as x12, 
           x8.x13 as x14, 
           x8.x15 as x7, 
           x8.x16 as x17, 
           x8.x18 as x3, 
           x8.x19 as x20, 
           x21.x22 as x23, 
           x21.x24 as x25, 
           x21.x26 as x6 
    from (
        select x27.x28 as x9,
               x27.x29 as x11, 
               x27.x30 as x13, 
               x27.x31 as x15, 
               x32.x33 as x16, 
               x32.x34 as x18, 
               x32.x35 as x19 
        from (
            select x36."id" as x28, 
                   x36."purchaser_id" as x29, 
                   x36."product_id" as x30, 
                   x36."total" as x31 
            from "sale" x36
        ) x27 
        inner join (
            select x37."id" as x33, 
                   x37."name" as x34, 
                   x37."address" as x35 
	    from "purchaser" x37
        ) x32 
        on 1=1
    ) x8 
    inner join (
        select x38."id" as x22, 
               x38."supplier_id" as x24, 
               x38."name" as x26 
        from "product" x38
    ) x21
    on 1=1
) x2 
inner join (
    select x39."id" as x40, 
           x39."name" as x5, 
           x39."address" as x41 
    from "supplier" x39
) x4 
on ((x2.x14 = x2.x23) 
and (x2.x12 = x2.x17)) 
and (x2.x25 = x4.x40) 
where x2.x7 >= ?

Christopher Vogt, a former Slick maintainer and still actively involved member of the Slick community, explains the above in the following words:

This means that Slick relies on your database’s query optimizer to be able to execute the sql query that Slick produced efficiently. Currently that is not always the case in MySQL

One of the main ideas behind Slick, according to Christopher, is:

Slick is not a DSL that allows you to build exactly specified SQL strings. Slick’s Scala query translation allows for re-use and composition and using Scala as the language to write your queries. It does not allow you to predict the exact sql query, only the semantics and the rough structure.

Slick vs. jOOQ

Since Christopher later on also compared Slick with jOOQ, I allowed myself to chime in and to add my two cents:

From a high level (without actual Slick experience) I’d say that Slick and jOOQ embrace compositionality equally well. I’ve seen crazy queries of several 100s of lines of [jOOQ] SQL in customer code, composed over several methods. You can do that with both APIs.

On the other hand, as Chris said: Slick has a focus on Scala collections, jOOQ on SQL tables.

  • From a conceptual perspective (= in theory), this focus shouldn’t matter.
  • From a type safety perspective, Scala collections are easier to type-check than SQL tables and queries because SQL as a language itself is rather hard to type-check given that the semantics of various of the advanced SQL clauses alter type configurations rather implicitly (e.g. outer joins, grouping sets, pivot clauses, unions, group by, etc.).
  • From a practical perspective, SQL itself is only an approximation of the original relational theories and has attained a life of its own. This may or may not matter to you.

I guess in the end it really boils down to whether you want to reason about Scala collections (queries are better integrated / more idiomatic with your client code) or about SQL tables (queries are better integrated / more idiomatic with your database).

At this point, I’d like to add another two cents to the discussion. Customers don’t buy the product that you’re selling. They never do. In the case of Hibernate, customers and users were hoping to be able to forget SQL forever. The opposite is true. As Gavin King himself (the creator of Hibernate) had told me:


Because customers and users had never listened to Gavin (and to other ORM creators), we now have what many call the object-relational impedance mismatch. A lot of unjustified criticism has been expressed against Hibernate and JPA, APIs which are simply too popular for the limited scope they really cover.

With Slick (or C#’s LINQ, for that matter), a similar mismatch is impeding integrations, if users abuse these tools for what they believe to be a replacement for SQL. Slick does a great job at modelling the relational model directly in the Scala language. This is wonderful if you want to reason about relations just like you reason about collections. But it is not a SQL API. To illustrate how difficult it is to overcome these limitations, you can browse the issue tracker or user group to learn about:

We’ll simply call this:

The Functional-Relational Impedance Mismatch

SQL is much more

Markus Winand (the author of the popular SQL Performance Explained) has recently published a very good presentation about “modern SQL”, an idea that we fully embrace at jOOQ:

We believe that APIs that have been trying to hide the SQL language from general purpose languages like Java, Scala, C# are missing out on a lot of the very nice features that can add tremendous value to your application. jOOQ is an API that fully embraces the SQL language, with all its awesome features (and with all its quirks). You obviously may or may not agree with that.

We’ll leave this article open ended, hoping you’ll chime in to discuss the benefits and caveats of each approach. Of staying close to Scala vs. staying close to SQL.

As a small teaser, however, I’d like to announce a follow-up article showing that there is no such thing as an object-relational impedance mismatch. You (and your ORM) are just not using SQL correctly. Stay tuned!

Transform Your SQL Data into Charts Using jOOQ and JavaFX

In the recent past, we’ve shown how Java 8 and functional programming will bring a new perspective to Java developers when it comes to functional data transformation of SQL data using jOOQ and Java 8 lambdas and Streams.

Today, we take this a step further and transform the data into JavaFX XYChart.Series to produce nice-looking bar charts from our data.

Setting up the database

We’re going to be using a small subset of the World Bank’s Open Data again, in a PostgreSQL database. The data that we’re using is this here:



CREATE TABLE world.countries (
  code CHAR(2) NOT NULL,
  year INT NOT NULL,
  gdp_per_capita DECIMAL(10, 2) NOT NULL,
  govt_debt DECIMAL(10, 2) NOT NULL
INSERT INTO world.countries
VALUES ('CA', 2009, 40764, 51.3),
       ('CA', 2010, 47465, 51.4),
       ('CA', 2011, 51791, 52.5),
       ('CA', 2012, 52409, 53.5),
       ('DE', 2009, 40270, 47.6),
       ('DE', 2010, 40408, 55.5),
       ('DE', 2011, 44355, 55.1),
       ('DE', 2012, 42598, 56.9),
       ('FR', 2009, 40488, 85.0),
       ('FR', 2010, 39448, 89.2),
       ('FR', 2011, 42578, 93.2),
       ('FR', 2012, 39759,103.8),
       ('GB', 2009, 35455,121.3),
       ('GB', 2010, 36573, 85.2),
       ('GB', 2011, 38927, 99.6),
       ('GB', 2012, 38649,103.2),
       ('IT', 2009, 35724,121.3),
       ('IT', 2010, 34673,119.9),
       ('IT', 2011, 36988,113.0),
       ('IT', 2012, 33814,131.1),
       ('JP', 2009, 39473,166.8),
       ('JP', 2010, 43118,174.8),
       ('JP', 2011, 46204,189.5),
       ('JP', 2012, 46548,196.5),
       ('RU', 2009,  8616,  8.7),
       ('RU', 2010, 10710,  9.1),
       ('RU', 2011, 13324,  9.3),
       ('RU', 2012, 14091,  9.4),
       ('US', 2009, 46999, 76.3),
       ('US', 2010, 48358, 85.6),
       ('US', 2011, 49855, 90.1),
       ('US', 2012, 51755, 93.8);

(see also this article here about another awesome set of SQL queries against the above data)

What we want to do now is plot the two sets of values in two different bar charts:

  • Each country’s GDP per capita in each year between 2009-2012
  • Each country’s debt as a percentage of its GDP in each year between 2009-2012

This will then create 8 series with four data points for each series in both charts. In addition to the above, we’d like to order the series among themselves by the average projected value between 2009-2012, such that the series – and thus the countries – can be compared easily.

This is obviously easier to explain visually via the resulting chart than in text, so stay tuned until the end of the article.

Collecting the data with jOOQ and JavaFX

The query that we would write to calculate the above data series would look as follows in plain SQL:

join (
        avg(COUNTRIES.GOVT_DEBT) avg 
    group by 
) c1 
order by 
    avg asc, 

In other words, we’ll simply select the relevant columns from the COUNTRIES table, and we’ll self-join the average projected value per country such that we can order the result by that average. The same query could be written using window functions. We’ll get to that later on.

The code that we’ll write to create such a bar chart with jOOQ and JavaFX is the following:

CategoryAxis xAxis = new CategoryAxis();
NumberAxis yAxis = new NumberAxis();
yAxis.setLabel("% of GDP");

BarChart<String, Number> bc = 
    new BarChart<>(xAxis, yAxis);
bc.setTitle("Government Debt");

    // SQL data transformation, executed in the DB
    // -------------------------------------------
               name("c1", COUNTRIES.CODE.getName()), 

       // order countries by their average 
       // projected value

       // The result produced by the above statement
       // looks like this:
       // +----+----+---------+
       // |year|code|govt_debt|
       // +----+----+---------+
       // |2009|RU  |     8.70|
       // |2010|RU  |     9.10|
       // |2011|RU  |     9.30|
       // |2012|RU  |     9.40|
       // |2009|CA  |    51.30|
       // +----+----+---------+

    // Java data transformation, executed in app memory
    // ------------------------------------------------

       // Group results by year, keeping sort 
       // order in place

       // The generic type of this is inferred...
       // Stream<Entry<Integer, Result<
       //     Record3<BigDecimal, String, Integer>>
       // >>

       // Map entries into { Year -> Projected value }
       .map(entry -> new XYChart.Series<>(

           // Map records into a chart Data
           entry.getValue().map(country -> 
                new XYChart.Data<String, Number>(

The interesting thing here is really that we can fetch data from the database, and later on, transform it into JavaFX data structures all in one go. The whole thing is almost a single Java statement.

SQL and Java is cleanly separated

As we’ve blogged on this blog before, there is a very important difference when comparing the above approach to LINQ or to JPQL’s DTO fetching capabilities. The SQL query is cleanly separated from the Java in-memory data transformation, even if we express the whole transformation in one single statement.

We want to be as precise as possible when expressing our SQL query for the database to be able to calculate the optimal execution plan. Only once we have materialised our data set, the Java 8 Stream transformation will kick in.

The importance of this is made clear when we change the above SQL-92 compatible query with a SQL-1999 compatible one that makes use of awesome window functions. The jOOQ part of the above statement could be replaced by the following query:


… or in SQL:

order by
        over (partition by COUNTRIES.CODE),

As you can see, staying in control of you actual SQL statement is of the essence when you run such reports. There is no way you could have refactored ordering via nested selects into a much more efficient ordering via window functions as easily as this. Let alone refactoring dozens of lines of Java sorting logic.

Yep. It’s hard to beat the beauty of window functions

If we add some additional JavaFX boilerplate to put the chart into a Pane, a Scene, and a Stage, we’ll get these nice-looking charts below:

SQL with JavaFX and jOOQ

Play with it yourself

You can download and run the above example yourself. Simply download the following example and run mvn clean install:


How to Emulate the MEDIAN() Aggregate Function Using Inverse Distribution Functions

Some databases are awesome enough to implement the MEDIAN() aggregate function. Remember that the MEDIAN() is sligthly different from (and often more useful than) the MEAN() or AVG() (average).

While the average is calculated as the SUM(exp) / COUNT(exp), the MEDIAN() tells you that 50% of all values in the sample are higher than the MEDIAN() whereas the other 50% of the set are lower than the MEDIAN().

So, in other words, if you take the following query:

WITH t(value) AS (

… then both average and median are the same:

avg   median
2     2

But if you heavily skew your data like this:

WITH t(value) AS (

Then your average will also be skewed, whereas your median will still indicate where most of the values are in your sample

avg      median
34.333   2

The above sample is of course statistically insignificant, but you can easily see that the effect can be dramatic and relevant, if you have more data:

The skewing effect is very important in statistics and in order to make any interesting claim about anything, using percentiles is most often more useful than using averages. Take the average income vs. the median income in a country, for instance. While the average income in the U.S. (and in many other countries) has been steadily increasing, the median income has seen a decline over the past decade. This is due to wealth being heavily skewed towards the super-rich more and more.

This blog is not about politics but about Java and SQL, so let’s get back into calculating the actual facts.

Using precentiles in SQL

As we’ve seen before, the MEDIAN() divides a sample into two equally-sized groups and takes the value “between” those two groups. This particular value is also called the 50th percentile because 50% of all values in the sample are lower than the MEDIAN(). We can thus establish:

  • MIN(exp): The 0-percentile
  • MEDIAN(exp): The 50th-percentile
  • MAX(exp): The 100th-percentile

All of the above are special cases of percentiles, and while MIN() and MAX() are supported in all SQL databases (and the SQL standard), MEDIAN() is not in the SQL standard and only supported by the following jOOQ databases:

  • Oracle
  • Sybase SQL Anywhere

There is another way of calculating the MEDIAN() in particular and any sort of percentile in general in the SQL standard, and since PostgreSQL 9.4 also in PostgreSQL using …

Ordered-set aggregate functions

Interestingly, apart from window functions, you can also specify ORDER BY clauses to certain aggregate functions that aggregate data based on ordered sets.

One such function is the SQL standard percentile_cont function, which takes the percentile as an argument, and then accepts an additional WITHIN GROUP clause that takes an ORDER BY clause as an argument. These particular ordered-set functions are also called inverse distribution functions, because we want to find where a particular percentile is located in the distribution of all values in the sample (if you’re not scared by the math, check out the wikipedia article)

So, in PostgreSQL 9.4+, the MEDIAN() function can be emulated like this:

WITH t(value) AS (
  SELECT 100
  percentile_cont(0.5) WITHIN GROUP (ORDER BY value)

This interesting syntax is standardised and may be known to some of you from Oracle’s LISTAGG(), which allows to aggregate values into concatenated strings:

WITH t(value) AS (
  listagg(value, ', ') WITHIN GROUP (ORDER BY value)

This query yields simply:

1, 2, 100

On a side-note: LISTAGG() is, of course, completely useless, because it returns VARCHAR2, which again has a max length of 4000 in Oracle. Useless…

Emulation out-of-the-box with jOOQ

As always, jOOQ will emulate these kinds of things out of the box. You can either use the DSL.median() function, or with the upcoming jOOQ 3.6, the new DSL.percentileCont() function to produce the same value:


jOOQ is the best way to write SQL in Java

The Awesome PostgreSQL 9.4 / SQL:2003 FILTER Clause for Aggregate Functions

Sometimes when aggregating data with SQL, we’d love to add some additional filters. For instance, consider the following world bank data:

GDP per capita (current US$)

          2009    2010    2011    2012
CA      40,764  47,465  51,791  52,409	
DE      40,270  40,408  44,355  42,598	
FR      40,488  39,448  42,578  39,759	
GB      35,455  36,573  38,927  38,649	
IT      35,724  34,673  36,988  33,814	
JP      39,473  43,118  46,204  46,548	
RU       8,616  10,710  13,324  14,091	
US      46,999  48,358  49,855  51,755	

And the table structure:

CREATE TABLE countries (
  code CHAR(2) NOT NULL,
  year INT NOT NULL,
  gdp_per_capita DECIMAL(10, 2) NOT NULL

Now, let’s assume we’d like to find the number of countries with a GDP higher than 40,000 for each year.

With standard SQL:2003, and now also with the newly released PostgreSQL 9.4, we can now take advantage of the new FILTER clause, which allows us to write the following query:

  count(*) FILTER (WHERE gdp_per_capita >= 40000)

The above query will now yield:

year   count
2012   4
2011   5
2010   4
2009   4

And that’s not it! As always, you can use any aggregate function also as a window function simply by adding an OVER() clause to the end:

    FILTER (WHERE gdp_per_capita >= 40000) 
    OVER   (PARTITION BY year)

The result would then look something like this:

year   code   gdp_per_capita   count
2009   CA           40764.00       4
2009   DE           40270.00       4
2009   FR           40488.00       4
2009   GB           35455.00       4

jOOQ 3.6 will also support the new FILTER clause for aggregate functions

Good news for jOOQ users. You can write the same query with jOOQ intuitively as such:


… and


And the best thing is that jOOQ (as usual) emulates the above clause for you if you’re not using PostgreSQL. The equivalent query would be:

  count(CASE WHEN gdp_per_capita >= 40000 THEN 1 END)

jOOQ: The best way to write SQL in Java

Read more about what’s new in PostgreSQL 9.4 here

Really Too Bad that Java 8 Doesn’t Have Iterable.stream()

This is one of the more interesting recent Stack Overflow questions:

Why does Iterable not provide stream() and parallelStream() methods?

At first, it might seem intuitive to make it straight-forward to convert an Iterable into a Stream, because the two are really more or less the same thing for 90% of all use-cases.

Granted, the expert group had a strong focus on making the Stream API parallel capable, but anyone who works with Java every day will notice immediately, that Stream is most useful in its sequential form. And an Iterable is just that. A sequential stream with no guarantees with respect to parallelisation. So, it would only be intuitive if we could simply write:


In fact, subtypes of Iterable do have such methods, e.g.


Brian Goetz himself gave an answer to the above Stack Overflow question. The reasons for this omittance are rooted in the fact that some Iterables might prefer to return an IntStream instead of a Stream. This really seems to be a very remote reason for a design decision, but as always, omittance today doesn’t mean omittance forever. On the other hand, if they had introduced Iterable.stream() today, and it turned out to be a mistake, they couldn’t have removed it again.

Well, primitive types in Java are a pain and they did all sorts of bad things to generics in the first place, and now to Stream as well, as we have to write the following, in order to turn an Iterable into a Stream:

Stream s = StreamSupport.stream(iterable.spliterator(), false);

Brian Goetz argues that this is “easy”, but I would disagree. As an API consumer, I experience a lot of friction in productivity because of:

  • Having to remember this otherwise useless StreamSupport type. This method could very well have been put into the Stream interface, because we already have Stream construction methods in there, such as Stream.of().
  • Having to remember the subtle difference between Iterator and Spliterator in the context of what I believe has nothing to do with parallelisation. It may well be that Spliterators will become popular eventually, though, so this doubt is for the magic 8 ball to address.
  • In fact, I have to repeat the information that there is nothing to be parallelised via the boolean argument false

Parallelisation really has such a big weight in this new API, even if it will cover only around 5%-10% of all functional collection manipulation operations. While sequential processing was not the main design goal of the JDK 8 APIs, it is really the main benefit for all of us, and the friction around APIs related to sequential processing should be as low as possible.

The method above should have just been called

Stream s = Stream.stream(iterable);

It could be implemented like this:

public static<T> Stream<T> stream(Iterable<T> i) {
    return StreamSupport.stream(i.spliterator(), false);

Obviously with convenience overloads that allow for the additional specialisations, like parallelisation, or passing a Spliterator

But again, if Iterable had its own stream() default method, an incredible number of APIs would be so much better integrated with Java 8 out of the box, without even supporting Java 8 explicitly!

Take jOOQ for instance. jOOQ still supports Java 6, so a direct dependency is not possible. However, jOOQ’s ResultQuery type is an Iterable. This allows you to use such queries directly inline in foreach loops, as if you were writing PL/SQL:


FOR book IN (
  SELECT * FROM books ORDER BY books.title
  -- Do things with book


for (BookRecord book : 
) {
  // Do things with book

Now imagine the same thing in Java 8:

   .map / reduce / findAny, etc...

Unfortunately, the above is currently not possible. You could, of course, eagerly fetch all the results into a jOOQ Result, which extends List:

   .map / reduce / findAny, etc...

But it’s one more method to call (every time), and the actual stream semantics is broken, because the fetch is done eagerly.

Complaining on a high level

This is, of course, complaining on a high level, but it would really be great if a future version of Java, e.g. Java 9, would add this missing method to the Iterable API. Again, 99% of all use-cases will want the Stream type to be returned, not the IntStream type. And if they do want that for whatever obscure reason (much more obscure than many evil things from old legacy Java APIs, looking at you Calendar), then why shouldn’t they just declare an intStream() method. After all, if someone is crazy enough to write Iterable<Integer> when they’re really operating on int primitive types, they’ll probably accept a little workaround.


Get every new post delivered to your Inbox.

Join 3,379 other followers

%d bloggers like this: