Archive | java RSS for this section

Asynchronous SQL Execution with jOOQ and Java 8’s CompletableFuture


Reactive programming is the new buzzword, which essentially just means asynchronous programming or messaging.

Fact is that functional syntax greatly helps with structuring asynchronous execution chains, and today, we’ll see how we can do this in Java 8 using jOOQ and the new CompletableFuture API.

In fact, things are quite simple:

// Initiate an asynchronous call chain
CompletableFuture

    // This lambda will supply an int value
    // indicating the number of inserted rows
    .supplyAsync(() -> DSL
        .using(configuration)
        .insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
        .values(3, "Hitchcock")
        .execute()
    )
                          
    // This will supply an AuthorRecord value
    // for the newly inserted author
    .handleAsync((rows, throwable) -> DSL
        .using(configuration)
        .fetchOne(AUTHOR, AUTHOR.ID.eq(3))
    )

    // This should supply an int value indicating
    // the number of rows, but in fact it'll throw
    // a constraint violation exception
    .handleAsync((record, throwable) -> {
        record.changed(true);
        return record.insert();
    })
    
    // This will supply an int value indicating
    // the number of deleted rows
    .handleAsync((rows, throwable) -> DSL
        .using(configuration)
        .delete(AUTHOR)
        .where(AUTHOR.ID.eq(3))
        .execute()
    )

    // This tells the calling thread to wait for all
    // chained execution units to be executed
    .join();

What did really happen here? Nothing out of the ordinary. There are 4 execution blocks:

  1. One that inserts a new AUTHOR
  2. One that fetches that same AUTHOR again
  3. One that re-inserts the newly fetched AUTHOR (throwing an exception)
  4. One that ignores the thrown exception and delets the AUTHOR again

Finally, when the execution chain is established, the calling thread will join the whole chain using the CompletableFuture.join() method, which is essentially the same as the Future.get() method, except that it doesn’t throw any checked exception.

Comparing this to other APIs

Other APIs like Scala’s Slick have implemented similar things via “standard API”, such as calls to flatMap(). We’re currently not going to mimick such APIs as we believe that the new Java 8 APIs will become much more idiomatic to native Java speakers. Specifically, when executing SQL, getting connection pooling and transactions right is of the essence. The semantics of asynchronously chained execution blocks and how they relate to transactions is very subtle. If you want a transaction to span more than one such block, you will have to encode this yourself via jOOQ’s Configuration and its contained ConnectionProvider.

Blocking JDBC

Obviously, there will always be one blocking barrier to such solutions, and that is JDBC itself – which is very hard to turn into an asynchronous API. In fact, few databases really support asynchronous query executions and cursors, as most often, a single database session can only be used by a single thread for a single query at a time.

We’d be very interested to learn about your asynchronous SQL querying requirements, so feel free to comment on this post!

How Nashorn Impacts API Evolution on a New Level


Following our previous article about how to use jOOQ with Java 8 and Nashorn, one of our users discovered a flaw in using the jOOQ API as discussed here on the user group. In essence, the flaw can be summarised like so:

Java code

package org.jooq.nashorn.test;

public class API {
    public static void test(String string) {
        throw new RuntimeException("Don't call this");
    }

    public static void test(Integer... args) {
        System.out.println("OK");
    }
}

JavaScript code

var API = Java.type("org.jooq.nashorn.test.API");
API.test(1); // This will fail with RuntimeException

After some investigation and the kind help of Attila Szegedi, as well as Jim Laskey (both Nashorn developers from Oracle), it became clear that Nashorn disambiguates overloaded methods and varargs differently than what an old Java developer might expect. Quoting Attila:

Nashorn’s overload method resolution mimics Java Language Specification (JLS) as much as possible, but allows for JavaScript-specific conversions too. JLS says that when selecting a method to invoke for an overloaded name, variable arity methods can be considered for invocation only when there is no applicable fixed arity method.

I agree that variable arity methods can be considered only when there is no applicable fixed arity method. But the whole notion of “applicable” itself is completely changed as type promotion (or coercion / conversion) using ToString, ToNumber, ToBoolean is preferred over what intuitively appear to be “exact” matches with varargs methods!

Let this sink in!

Given that we now know how Nashorn resolves overloading, we can see that any of the following are valid workarounds:

Explicitly calling the test(Integer[]) method using an array argument:

This is the simplest approach, where you ignore the fact that varargs exist and simply create an explicit array

var API = Java.type("org.jooq.nashorn.test.API");
API.test([1]);

Explicitly calling the test(Integer[]) method by saying so:

This is certainly the safest approach, as you’re removing all ambiguity from the method call

var API = Java.type("org.jooq.nashorn.test.API");
API["test(Integer[])"](1);

Removing the overload:

public class AlternativeAPI1 {
    public static void test(Integer... args) {
        System.out.println("OK");
    }
}

Removing the varargs:

public class AlternativeAPI3 {
    public static void test(String string) {
        throw new RuntimeException("Don't call this");
    }

    public static void test(Integer args) {
        System.out.println("OK");
    }
}

Providing an exact option:

public class AlternativeAPI4 {
    public static void test(String string) {
        throw new RuntimeException("Don't call this");
    }

    public static void test(Integer args) {
        test(new Integer[] { args });
    }

    public static void test(Integer... args) {
        System.out.println("OK");
    }
}

Replacing String by CharSequence (or any other “similar type”):

Now, this is interesting:

public class AlternativeAPI5 {
    public static void test(CharSequence string) {
        throw new RuntimeException("Don't call this");
    }

    public static void test(Integer args) {
        System.out.println("OK");
    }
}

Specifically, the distinction between CharSequence and String types appears to be very random from a Java perspective in my opinion.

Agreed, implementing overloaded method resolution in a dynamically typed language is very hard, if even possible. Any solution is a compromise that will introduce flaws at some ends. Or as Attila put it:

As you can see, no matter what we do, something else would suffer; overloaded method selection is in a tight spot between Java and JS type systems and very sensitive to even small changes in the logic.

True! But not only is overload method selection very sensitive to even small changes. Using Nashorn with Java interoperability is, too! As an API designer, over the years, I have grown used to semantic versioning, and the many subtle rules to follow when keeping an API source compatible, behavior compatible – and if ever possible – to a large degree also binary compatible.

Forget about that when your clients are using Nashorn. They’re on their own. A newly introduced overload in your Java API can break your Nashorn clients quite badly. But then again, that’s JavaScript, the language that tells you at runtime that:

['10','10','10','10'].map(parseInt)

… yields

[10, NaN, 2, 3]

… and where

++[[]][+[]]+[+[]] === "10"

yields true! (sources here)

For more information about JavaScript, please visit this introductory tutorial

This is the Final Discussion!


Pun intended… Let’s discuss Java final.

Recently, our popular blog post “10 Subtle Best Practices when Coding Java” had a significant revival and a new set of comments as it was summarised and linked from JavaWorld. In particular, the JavaWorld editors challenged our opinion about the Java keyword “final“:

More controversially, Eder takes on the question of whether it’s ever safe to make methods final by default:

“If you’re in full control of all source code, there’s absolutely nothing wrong with making methods final by default, because:”

  • “If you do need to override a method (do you really?), you can still remove the final keyword”
  • “You will never accidentally override any method anymore”

Yes, indeed. All classes, methods, fields and local variables should be final by default and mutable via keyword.

Here are fields and local variables:

    int finalInt   = 1;
val int finalInt   = 2;
var int mutableInt = 3;

Whether the Scala/C#-style val keyword is really necessary is debatable. But clearly, in order to modify a field / variable ever again, we should have a keyword explicitly allowing for it. The same for methods – and I’m using Java 8’s default keyword for improved consistency and regularity:

class FinalClass {
    void finalMethod() {}
}

default class ExtendableClass {
            void finalMethod      () {}
    default void overridableMethod() {}
}

That would be the perfect world in our opinion, but Java goes the other way round making default (overridable, mutable) the default and final (non-overridable, immutable) the explicit option.

Fair enough, we’ll live with that

… and as API designers (from the jOOQ API, of course), we’ll just happily put final all over the place to at least pretend that Java had the more sensible defaults mentioned above.

But many people disagree with this assessment, mostly for the same reason:

As someone who works mostly in osgi environments, I could not agree more, but can you guarantee that another api designer felt the same way? I think it’s better to preempt the mistakes of api designers rather than preempt the mistakes of users by putting limits on what they can extend by default. – eliasv on reddit

Or…

Strongly disagree. I would much rather ban final and private from public libraries. Such a pain when I really need to extend something and it cannot be done.

Intentionally locking the code can mean two things, it either sucks, or it is perfect. But if it is perfect, then nobody needs to extend it, so why do you care about that.

Of course there exists valid reasons to use final, but fear of breaking someone with a new version of a library is not one of them. – meotau on reddit

Or also…

I know we’ve had a very useful conversation about this already, but just to remind other folks on this thread: much of the debate around ‘final’ depends on the context: is this a public API, or is this internal code? In the former context, I agree there are some good arguments for final. In the latter case, final is almost always a BAD idea. – Charles Roth on our blog

All of these arguments tend to go into one direction: “We’re working on crappy code so we need at least some workaround to ease the pain.”

But why not think about it this way:

The API designers that all of the above people have in mind will create precisely that horrible API that you’d like to patch through extension. Coincidentally, the same API designer will not reflect on the usefulness and communicativeness of the keyword final, and thus will never use it, unless required by the Java language. Win-win (albeit crappy API, shaky workarounds and patches).

The API designers that want to use final for their API will reflect a lot on how to properly design APIs (and well-defined extension points / SPIs), such that you will never worry about something being final. Again, win-win (and an awesome API).

Plus, in the latter case, the odd hacker will be kept from hacking and breaking your API in a way that will only lead to pain and suffering, but that’s not really a loss.

Final interface methods

For the aforementioned reasons, I still deeply regret that final is not possible in Java 8 interfaces. Brian Goetz has given an excellent explanation why this has been decideed upon like that. In fact, the usual explanation. The one about this not being the main design goal for the change ;-)

But think about the consistency, the regularity of the language if we had:

default interface ImplementableInterface {
            void abstractMethod   () ;
            void finalMethod      () {}
    default void overridableMethod() {}
}

(Ducks and runs…)

Or, more realistically with our status quo of defaulting to default:

interface ImplementableInterface {
          void abstractMethod   () ;
    final void finalMethod      () {}
          void overridableMethod() {}
}

Finally

So again, what are your (final) thoughts on this discussion?

If you haven’t heard enough, consider also reading this excellent post by Dr. David Pearce, author of the whiley programming language

When the Java 8 Streams API is not Enough


Java 8 was – as always – a release of compromises and backwards-compatibility. A release where the JSR-335 expert group might not have agreed upon scope or feasibility of certain features with some of the audience. See some concrete explanations by Brian Goetz about why …

But today we’re going to focus on the Streams API’s “short-comings”, or as Brian Goetz would probably put it: things out of scope given the design goals.

Parallel Streams?

Parallel computing is hard, and it used to be a pain. People didn’t exactly love the new (now old) Fork / Join API, when it was first shipped with Java 7. Conversely, and clearly, the conciseness of calling Stream.parallel() is unbeatable.

But many people don’t actually need parallel computing (not to be confused with multi-threading!). In 95% of all cases, people would have probably preferred a more powerful Streams API, or perhaps a generally more powerful Collections API with lots of awesome methods on various Iterable subtypes.

Changing Iterable is dangerous, though. Even a no-brainer as transforming an Iterable into a Stream via a potential Iterable.stream() method seems to risk opening pandora’s box!.

Sequential Streams!

So if the JDK doesn’t ship it, we create it ourselves!

Streams are quite awesome per se. They’re potentially infinite, and that’s a cool feature. Mostly – and especially with functional programming – the size of a collection doesn’t really matter that much, as we transform element by element using functions.

If we admit Streams to be purely sequential, then we could have any of these pretty cool methods as well (some of which would also be possible with parallel Streams):

  • cycle() – a guaranteed way to make every stream infinite
  • duplicate() – duplicate a stream into two equivalent streams
  • foldLeft() – a sequential and non-associative alternative to reduce()
  • foldRight() – a sequential and non-associative alternative to reduce()
  • limitUntil() – limit the stream to those records before the first one to satisfy a predicate
  • limitWhile() – limit the stream to those records before the first one not to satisfy a predicate
  • maxBy() – reduce the stream to the maximum mapped value
  • minBy() – reduce the stream to the minimum mapped value
  • partition() – partition a stream into two streams, one satisfying a predicate and the other not satisfying the same predicate
  • reverse() – produce a new stream in inverse order
  • skipUntil() – skip records until a predicate is satisified
  • skipWhile() – skip records as long as a predicate is satisfied
  • slice() – take a slice of the stream, i.e. combine skip() and limit()
  • splitAt() – split a stream into two streams at a given position
  • unzip() – split a stream of pairs into two streams
  • zip() – merge two streams into a single stream of pairs
  • zipWithIndex() – merge a stream with its corresponding stream of indexes into a single stream of pairs

jOOλ’s new Seq type does all that

All of the above is part of jOOλ. jOOλ (pronounced “jewel”, or “dju-lambda”, also written jOOL in URLs and such) is an ASL 2.0 licensed library that emerged from our own development needs when implementing jOOQ integration tests with Java 8. Java 8 is exceptionally well-suited for writing tests that reason about sets, tuples, records, and all things SQL.

But the Streams API just slightly feels insufficient, so we have wrapped JDK’s Streams into our own Seq type (Seq for sequence / sequential Stream):

// Wrap a stream in a sequence
Seq<Integer> seq1 = seq(Stream.of(1, 2, 3));

// Or create a sequence directly from values
Seq<Integer> seq2 = Seq.of(1, 2, 3);

We’ve made Seq a new interface that extends the JDK Stream interface, so you can use Seq fully interoperably with other Java APIs – leaving the existing methods unchanged:

public interface Seq<T> extends Stream<T> {

    /**
     * The underlying {@link Stream} implementation.
     */
    Stream<T> stream();
	
	// [...]
}

Now, functional programming is only half the fun if you don’t have tuples. Unfortunately, Java doesn’t have built-in tuples and while it is easy to create a tuple library using generics, tuples are still second-class syntactic citizens when comparing Java to Scala, for instance, or C# and even VB.NET.

Nonetheless…

jOOλ also has tuples

We’ve run a code-generator to produce tuples of degree 1-8 (we might add more in the future, e.g. to match Scala’s and jOOQ’s “magical” degree 22).

And if a library has such tuples, the library also needs corresponding functions. The essence of these TupleN and FunctionN types is summarised as follows:

public class Tuple3<T1, T2, T3>
implements 
    Tuple, 
	Comparable<Tuple3<T1, T2, T3>>, 
	Serializable, Cloneable {
    
    public final T1 v1;
    public final T2 v2;
    public final T3 v3;
	
	// [...]
}

and

@FunctionalInterface
public interface Function3<T1, T2, T3, R> {

    default R apply(Tuple3<T1, T2, T3> args) {
        return apply(args.v1, args.v2, args.v3);
    }

    R apply(T1 v1, T2 v2, T3 v3);
}

There are many more features in Tuple types, but let’s leave them out for today.

On a side note, I’ve recently had an interesting discussion with Gavin King (the creator of Hibernate) on reddit. From an ORM perspective, Java classes seem like a suitable implementation for SQL / relational tuples, and they are indeed. From an ORM perspective.

But classes and tuples are fundamentally different, which is a very subtle issue with most ORMs – e.g. as explained here by Vlad Mihalcea.

Besides, SQL’s notion of row value expressions (i.e. tuples) is quite different from what can be modelled with Java classes. This topic will be covered in a subsequent blog post.

Some jOOλ examples

With the aforementioned goals in mind, let’s see how the above API can be put to work by example:

zipping

// (tuple(1, "a"), tuple(2, "b"), tuple(3, "c"))
Seq.of(1, 2, 3).zip(Seq.of("a", "b", "c"));

// ("1:a", "2:b", "3:c")
Seq.of(1, 2, 3).zip(
    Seq.of("a", "b", "c"), 
    (x, y) -> x + ":" + y
);

// (tuple("a", 0), tuple("b", 1), tuple("c", 2))
Seq.of("a", "b", "c").zipWithIndex();

// tuple((1, 2, 3), (a, b, c))
Seq.unzip(Seq.of(
    tuple(1, "a"),
    tuple(2, "b"),
    tuple(3, "c")
));

This is already a case where tuples have become very handy. When we “zip” two streams into one, we want a wrapper value type that combines both values. Classically, people might’ve used Object[] for quick-and-dirty solutions, but an array doesn’t indicate attribute types or degree.

Unfortunately, the Java compiler cannot reason about the effective bound of the <T> type in Seq<T>. This is why we can only have a static unzip() method (instead of an instance one), whose signature looks like this:

// This works
static <T1, T2> Tuple2<Seq<T1>, Seq<T2>> 
    unzip(Stream<Tuple2<T1, T2>> stream) { ... }
	
// This doesn't work:
interface Seq<T> extends Stream<T> {
    Tuple2<Seq<???>, Seq<???>> unzip();
}

Skipping and limiting

// (3, 4, 5)
Seq.of(1, 2, 3, 4, 5).skipWhile(i -> i < 3);

// (3, 4, 5)
Seq.of(1, 2, 3, 4, 5).skipUntil(i -> i == 3);

// (1, 2)
Seq.of(1, 2, 3, 4, 5).limitWhile(i -> i < 3);

// (1, 2)
Seq.of(1, 2, 3, 4, 5).limitUntil(i -> i == 3);

Other functional libraries probably use different terms than skip (e.g. drop) and limit (e.g. take). It doesn’t really matter in the end. We opted for the terms that are already present in the existing Stream API: Stream.skip() and Stream.limit()

Folding

// "abc"
Seq.of("a", "b", "c").foldLeft("", (u, t) -> t + u);

// "cba"
Seq.of("a", "b", "c").foldRight("", (t, u) -> t + u);

The Stream.reduce() operations are designed for parallelisation. This means that the functions passed to it must have these important attributes:

But sometimes, you really want to “reduce” a stream with functions that do not have the above attributes, and consequently, you probably don’t care about the reduction being parallelisable. This is where “folding” comes in.

A nice explanation about the various differences between reducing and folding (in Scala) can be seen here.

Splitting

// tuple((1, 2, 3), (1, 2, 3))
Seq.of(1, 2, 3).duplicate();

// tuple((1, 3, 5), (2, 4, 6))
Seq.of(1, 2, 3, 4, 5, 6).partition(i -> i % 2 != 0)

// tuple((1, 2), (3, 4, 5))
Seq.of(1, 2, 3, 4, 5).splitAt(2);

The above functions all have one thing in common: They operate on a single stream in order to produce two new streams, that can be consumed independently.

Obviously, this means that internally, some memory must be consumed to keep buffers of partially consumed streams. E.g.

  • duplication needs to keep track of all values that have been consumed in one stream, but not in the other
  • partitioning needs to fast forward to the next value that satisfies (or doesn’t satisfy) the predicate, without losing all the dropped values
  • splitting might need to fast forward to the split index

For some real functional fun, let’s have a look at a possible splitAt() implementation:

static <T> Tuple2<Seq<T>, Seq<T>> 
splitAt(Stream<T> stream, long position) {
    return seq(stream)
          .zipWithIndex()
          .partition(t -> t.v2 < position)
          .map((v1, v2) -> tuple(
              v1.map(t -> t.v1),
              v2.map(t -> t.v1)
          ));
}

… or with comments:

static <T> Tuple2<Seq<T>, Seq<T>> 
splitAt(Stream<T> stream, long position) {
    // Add jOOλ functionality to the stream
    // -> local Type: Seq<T>
    return seq(stream)
	
    // Keep track of stream positions
    // with each element in the stream
    // -> local Type: Seq<Tuple2<T, Long>>
          .zipWithIndex()
	  
    // Split the streams at position
    // -> local Type: Tuple2<Seq<Tuple2<T, Long>>,
    //                       Seq<Tuple2<T, Long>>>
          .partition(t -> t.v2 < position)
		  
    // Remove the indexes from zipWithIndex again
    // -> local Type: Tuple2<Seq<T>, Seq<T>>
          .map((v1, v2) -> tuple(
              v1.map(t -> t.v1),
              v2.map(t -> t.v1)
          ));
}

Nice, isn’t it? A possible implementation for partition(), on the other hand, is a bit more complex. Here trivially with Iterator instead of the new Spliterator:

static <T> Tuple2<Seq<T>, Seq<T>> partition(
        Stream<T> stream, 
        Predicate<? super T> predicate
) {
    final Iterator<T> it = stream.iterator();
    final LinkedList<T> buffer1 = new LinkedList<>();
    final LinkedList<T> buffer2 = new LinkedList<>();

    class Partition implements Iterator<T> {

        final boolean b;

        Partition(boolean b) {
            this.b = b;
        }

        void fetch() {
            while (buffer(b).isEmpty() && it.hasNext()) {
                T next = it.next();
                buffer(predicate.test(next)).offer(next);
            }
        }

        LinkedList<T> buffer(boolean test) {
            return test ? buffer1 : buffer2;
        }

        @Override
        public boolean hasNext() {
            fetch();
            return !buffer(b).isEmpty();
        }

        @Override
        public T next() {
            return buffer(b).poll();
        }
    }

    return tuple(
        seq(new Partition(true)), 
        seq(new Partition(false))
    );
}

I’ll let you do the exercise and verify the above code.

Get and contribute to jOOλ, now!

All of the above is part of jOOλ, available for free from GitHub. There is already a partially Java-8-ready, full-blown library called functionaljava, which goes much further than jOOλ.

Yet, we believe that all what’s missing from Java 8’s Streams API is really just a couple of methods that are very useful for sequential streams.

In a previous post, we’ve shown how we can bring lambdas to String-based SQL using a simple wrapper for JDBC (of course, we still believe that you should use jOOQ instead).

Today, we’ve shown how we can write awesome functional and sequential Stream processing very easily, with jOOλ.

Stay tuned for even more jOOλ goodness in the near future (and pull requests are very welcome, of course!)

Look no Further! The Final Answer to “Where to Put Generated Code?”


This recent question on Stack Overflow made me think.

Why does jOOQ suggest to put generated code under “/target” and not under “/src”?

… and I’m about to give you the final answer to “Where to Put Generated Code?”

This isn’t only about jOOQ

Even if you’re not using jOOQ, or if you’re using jOOQ but without the code generator, there might be some generated source code in your project. There are many tools that generate source code from other data, such as:

  • The Java compiler (ok, byte code, not strictly source code. But still code generation)
  • XJC, from XSD files
  • Hibernate from .hbm.xml files, or from your schema
  • Xtend translates Xtend code to Java code
  • You could even consider data transformations, like XSLT
  • many more…

In this article, we’re going to look at how to deal with jOOQ-generated code, but the same thoughts apply also to any other type of code generated from other code or data.

Now, the very very interesting strategic question that we need to ask ourselves is: Where to put that code? Under version control, like the original data? Or should we consider generated code to be derived code that must be re-generated all the time?

The answer is nigh…

It depends!

Nope, unfortunately, as with many other flame-wary discussions, this one doesn’t have a completely correct or wrong answer, either. There are essentially two approaches:

Considering generated code as part of your code base

When you consider generated code as part of your code base, you will want to:

  • Check in generated sources in your version control system
  • Use manual source code generation
  • Possibly use even partial source code generation

This approach is particularly useful when your Java developers are not in full control of or do not have full access to your database schema (or your XSD or your Java code, etc.), or if you have many developers that work simultaneously on the same database schema, which changes all the time. It is also useful to be able to track side-effects of database changes, as your checked-in database schema can be considered when you want to analyse the history of your schema.

With this approach, you can also keep track of the change of behaviour in the jOOQ code generator, e.g. when upgrading jOOQ, or when modifying the code generation configuration.

When you use this approach, you will treat your generated code as an external library with its own lifecycle.

The drawback of this approach is that it is more error-prone and possibly a bit more work as the actual schema may go out of sync with the generated schema.

Considering generated code as derived artefacts

When you consider generated code to be derived artefacts, you will want to:

  • Check in only the actual DDL, i.e. the “original source of truth” (e.g. controlled via Flyway)
  • Regenerate jOOQ code every time the schema changes
  • Regenerate jOOQ code on every machine – including continuous integration machines, and possibly, if you’re crazy enough, on production

This approach is particularly useful when you have a smaller database schema that is under full control by your Java developers, who want to profit from the increased quality of being able to regenerate all derived artefacts in every step of your build.

This approach is fully supported by Maven, for instance, which foresees special directories (e.g. target/generated-sources), and phases (e.g. <phase>generate-sources</phase>) specifically for source code generation.

The drawback of this approach is that the build may break in perfectly “acceptable” situations, when parts of your database are temporarily unavailable.

Pragmatic approach

Some of you might not like that answer, but there is also a pragmatic approach, a combination of both. You can consider some code as part of your code base, and some code as derived. For instance, jOOQ-meta’s generated sources (used to query the dictionary views / INFORMATION_SCHEMA when generating jOOQ code) are put under version control as few jOOQ contributors will be able to run the jOOQ-meta code generator against all supported databases. But in many integration tests, we re-generate the sources every time to be sure the code generator works correctly.

Huh!

Conclusion

I’m sorry to disappoint you. There is no final answer to whether one approach or the other is better. Pick the one that offers you more value in your specific situation.

In case you’re choosing your generated code to be part of the code base, read this interesting experience report on the jOOQ User Group by Witold Szczerba about how to best achieve this.

Integrating jOOQ with PostgreSQL: Partitioning


Introduction

jOOQ is a great framework when you want to work with SQL in Java without having too much ORM in your way. At the same time, it can be integrated into many environments as it is offering you support for many database-specific features. One such database-specific feature is partitioning in PostgreSQL. Partitioning in PostgreSQL is mainly used for performance reasons because it can improve query performance in certain situations. jOOQ has no explicit support for this feature but it can be integrated quite easily as we will show you.

This article is brought to you by the Germany based jOOQ integration partner UWS Software Service (UWS). UWS is specialised in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

Partitioning in PostgreSQL

With the partitioning feature of PostgreSQL you have the possibility of splitting data that would form a huge table into multiple separate tables. Each of the partitions is a normal table which inherits its columns and constraints from a parent table. This so-called table inheritance can be used for “range partitioning” where, for example, the data from one range does not overlap the data from another range in terms of identifiers, dates or other criteria.

Like in the following example, you can have partitioning for a table “author” that shares the same foreign-key of a table “authorgroup” in all its rows.

CREATE TABLE author (
  authorgroup_id int,
  LastName varchar(255)
);

CREATE TABLE author_1 (
  CONSTRAINT authorgroup_id_check_1
    CHECK ((authorgroup_id = 1))
) INHERITS (author);

CREATE TABLE author_2 (
  CONSTRAINT authorgroup_id_check_2
    CHECK ((authorgroup_id = 2))
) INHERITS (author);

...

As you can see, we set up inheritance and – in order to have a simple example – we just put one constraint checking that the partitions have the same “authorgroup_id”. Basically, this results in the “author” table only containing table and column definitions, but no data. However, when querying the “author” table, PostgreSQL will really query all the inheriting “author_n” tables returning a combined result.

A trivial approach to using jOOQ with partitioning

In order to work with the partitioning described above, jOOQ offers several options. You can use the default way which is to let jOOQ generate one class per table. In order to insert data into multiple tables, you would have to use different classes. This approach is used in the following snippet:

// add
InsertQuery query1 = dsl.insertQuery(AUTHOR_1);
query1.addValue(AUTHOR_1.ID, 1);
query1.addValue(AUTHOR_1.LAST_NAME, "Nowak");
query1.execute();

InsertQuery query2 = dsl.insertQuery(AUTHOR_2);
query2.addValue(AUTHOR_2.ID, 1);
query2.addValue(AUTHOR_2.LAST_NAME, "Nowak");
query2.execute();

// select
Assert.assertTrue(dsl
    .selectFrom(AUTHOR_1)
    .where(AUTHOR_1.LAST_NAME.eq("Nowak"))
    .fetch().size() == 1);

Assert.assertTrue(dsl
    .selectFrom(AUTHOR_2)
    .where(AUTHOR_2.LAST_NAME.eq("Nowak"))
    .fetch().size() == 1);

You can see that multiple classes generated by jOOQ need to be used, so depending on how many partitions you have, generated classes can pollute your codebase. Also, imagine that you eventually need to iterate over partitions, which would be cumbersome to do with this approach. Another approach could be that you use jOOQ to build fields and tables using string manipulation but that is error prone again and prevents support for generic type safety. Also, consider the case where you want true data separation in terms of multi-tenancy.

You see that there are some considerations to do when working with partitioning. Fortunately jOOQ offers various ways of working with partitioned tables, and in the following we’ll compare approaches, so that you can choose the one most suitable for you.

Using jOOQ with partitioning and multi-tenancy

JOOQ’s runtime-schema mapping is often used to realize database environments, such that for example during development, one database is queried but when deployed to production, the queries are going to another database. Multi-tenancy is another recommended use case for runtime-schema mapping as it allows for strict partitioning and for configuring your application to only use databases or tables being configured in the runtime-schema mapping. So running the same code would result in working with different databases or tables depending on the configuration, which allows for true separation of data in terms of multi-tenancy.

The following configuration taken from the jOOQ documentation is executed when creating the DSLContext so it can be considered a system-wide setting:

Settings settings = new Settings()
  .withRenderMapping(new RenderMapping()
  .withSchemata(
      new MappedSchema().withInput("DEV")
                        .withOutput("MY_BOOK_WORLD")
                        .withTables(
      new MappedTable().withInput("AUTHOR")
                       .withOutput("AUTHOR_1"))));

// Add the settings to the Configuration
DSLContext create = DSL.using(
  connection, SQLDialect.ORACLE, settings);

// Run queries with the "mapped" configuration
create.selectFrom(AUTHOR).fetch();

// results in SQL:
// “SELECT * FROM MY_BOOK_WORLD.AUTHOR_1”

Using this approach you can map one table to one partition permanently eg. “AUTHOR” to “AUTHOR_1” for environment “DEV”. In another environment you could choose to map “AUTHOR” table to “AUTHOR_2”.

Runtime-schema mapping only allows you to map to exactly one table on a per-query basis, so you could not handle the use case where you would want to manipulate more than one table partition. If you would like to have more flexibility you might want to consider the next approach.

Using jOOQ with partitioning and without multi-tenancy

If you need to handle multiple table partitions without having multi-tenancy, you need a more flexible way of accessing partitions. The following example shows how you can do it in a dynamic and type safe way, avoiding errors and being usable in the same elegant way you are used to by jOOQ:

// add
for(int i=1; i<=2; i++) {
  Builder part = forPartition(i);
  InsertQuery query = dsl.insertQuery(part.table(AUTHOR));
  query.addValue(part.field(AUTHOR.ID), 1);
  query.addValue(part.field(AUTHOR.LAST_NAME), "Nowak");
  query.execute();
}

// select

for(int i=1; i<=2; i++) {
  Builder part = forPartition(i);
  Assert.assertTrue(dsl
      .selectFrom(part.table(AUTHOR))
      .where(part.field(AUTHOR.LAST_NAME).eq("Nowak"))
      .fetch()
      .size() == 1);
}

What you can see above is that the partition numbers are abstracted away so that you can use “AUTHOR” table instead of “AUTHOR_1”. Thus, your code won’t be polluted with many generated classes. Another thing is that the partitioner object is initialized dynamically so you can use it for example in a loop like above. Also it follows the Builder pattern so that you can operate on it like you are used to by jOOQ.

The code above is doing exactly the same as the first trivial snippet, but there are multiple benefits like type safe and reusable access to partitioned tables.

Integration of jOOQ partitioning without multi-tenancy into a Maven build process (optional)

If you are using Continuous-Integration you can integrate the solution above so that jOOQ is not generating tables for the partitioned tables. This can be achieved using a regular expression that excludes certain table names when generating Java classes. When using Maven, your integration might look something like this:

<generator>
  <name>org.jooq.util.DefaultGenerator</name>
  <database>
    <name>org.jooq.util.postgres.PostgresDatabase</name>
    <includes>.*</includes>
    <excludes>.*_[0-9]+</excludes>
    <inputSchema>${db.schema}</inputSchema>
  </database>
  <target>
    <packageName>com.your.company.jooq</packageName>
    <directory>target/generated-sources/jooq</directory>
  </target>
</generator>

Then it’s just calling mvn install and jOOQ maven plugin will be generating the database schema in compilation time.

Integrating jOOQ with PostgreSQL: Partitioning

This article described how jOOQ in combination with the partitioning feature of PostgreSQL can be used to implement multi-tenancy and improve database performance. PostgreSQL’s documentation states that for partitioning “the benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.”

Achieving support for partitioning with jOOQ is as easy as adding configuration or a small utility class, jOOQ is then able to support partitioning with or without multi-tenancy and without sacrificing type safety. Apart from Java-level integration, the described solution also smoothly integrates into your build and test process.

You may want to look at the sources of the partitioner utility class which also includes a test-class so that you can see the behavior and integration in more detail.

Please let us know if you need support for this or other jOOQ integrations within your environment. UWS Software Service (UWS) is an official jOOQ integration partner.

The 10 Most Annoying Things Coming Back to Java After Some Days of Scala


So, I’m experimenting with Scala because I want to write a parser, and the Scala Parsers API seems like a really good fit. After all, I can implement the parser in Scala and wrap it behind a Java interface, so apart from an additional runtime dependency, there shouldn’t be any interoperability issues.

After a few days of getting really really used to the awesomeness of Scala syntax, here are the top 10 things I’m missing the most when going back to writing Java:

1. Multiline strings

That is my personal favourite, and a really awesome feature that should be in any language. Even PHP has it: Multiline strings. As easy as writing:

println ("""Dear reader,

If we had this feature in Java,
wouldn't that be great?

Yours Sincerely,
Lukas""")

Where is this useful? With SQL, of course! Here’s how you can run a plain SQL statement with jOOQ and Scala:

println(
  DSL.using(configuration)
     .fetch("""
            SELECT a.first_name, a.last_name, b.title
            FROM author a
            JOIN book b ON a.id = b.author_id
            ORDER BY a.id, b.id
            """)
)

And this isn’t only good for static strings. With string interpolation, you can easily inject variables into such strings:

val predicate =
  if (someCondition)
    "AND a.id = 1"
  else
    ""

println(
  DSL.using(configuration)
      // Observe this little "s"
     .fetch(s"""
            SELECT a.first_name, a.last_name, b.title
            FROM author a
            JOIN book b ON a.id = b.author_id
            -- This predicate is the referencing the
            -- above Scala local variable. Neat!
            WHERE 1 = 1 $predicate
            ORDER BY a.id, b.id
            """)
)

That’s pretty awesome, isn’t it? For SQL, there is a lot of potential in Scala.

jOOQ: The Best Way to Write SQL in Scala

2. Semicolons

I sincerely haven’t missed them one bit. The way I structure code (and probably the way most people structure code), Scala seems not to need semicolons at all. In JavaScript, I wouldn’t say the same thing. The interpreted and non-typesafe nature of JavaScript seems to indicate that leaving away optional syntax elements is a guarantee to shoot yourself in the foot. But not with Scala.

val a = thisIs.soMuchBetter()
val b = no.semiColons()
val c = at.theEndOfALine()

This is probably due to Scala’s type safety, which would make the compiler complain in one of those rare ambiguous situations, but that’s just an educated guess.

3. Parentheses

This is a minefield and leaving away parentheses seems dangerous in many cases. In fact, you can also leave away the dots when calling a method:

myObject method myArgument

Because of the amount of ambiguities this can generate, especially when chaining more method calls, I think that this technique should be best avoided. But in some situations, it’s just convenient to “forget” about the parens. E.g.

val s = myObject.toString

4. Type inference

This one is really annoying in Java, and it seems that many other languages have gotten it right, in the meantime. Java only has limited type inference capabilities, and things aren’t as bright as they could be.

In Scala, I could simply write

val s = myObject.toString

… and not care about the fact that s is of type String. Sometimes, but only sometimes I like to explicitly specify the type of my reference. In that case, I can still do it:

val s : String = myObject.toString

5. Case classes

I think I’d fancy writing another POJO with 40 attributes, constructors, getters, setters, equals, hashCode, and toString

— Said no one. Ever

Scala has case classes. Simple immutable pojos written in one-liners. Take the Person case class for instance:

case class Person(firstName: String, lastName: String)

I do have to write down the attributes once, agreed. But everything else should be automatic.

And how do you create an instance of such a case class? Easily, you don’t even need the new operator (in fact, it completely escapes my imagination why new is really needed in the first place):

Person("George", "Orwell")

That’s it. What else do you want to write to be Enterprise-compliant?

Side-note

OK, some people will now argue to use project lombok. Annotation-based code generation is nonsense and should be best avoided. In fact, many annotations in the Java ecosystem are simple proof of the fact that the Java language is – and will forever be – very limited in its evolution capabilities. Take @Override for instance. This should be a keyword, not an annotation. You may think it’s a cosmetic difference, but I say that Scala has proven that annotations are pretty much always the wrong tool. Or have you seen heavily annotated Scala code, recently?

6. Methods (functions!) everywhere

This one is really one of the most useful features in any language, in my opinion. Why do we always have to link a method to a specific class? Why can’t we simply have methods in any scope level? Because we can, with Scala:

// "Top-level", i.e. associated with the package
def m1(i : Int) = i + 1

object Test {

    // "Static" method in the Test instance
    def m2(i : Int) = i + 2
    
    def main(args: Array[String]): Unit = {

        // Local method in the main method
        def m3(i : Int) = i + 3
        
        println(m1(1))
        println(m2(1))
        println(m3(1))
    }
}

Right? Why shouldn’t I be able to define a local method in another method? I can do that with classes in Java:

public void method() {
    class LocalClass {}

    System.out.println(new LocalClass());
}

A local class is an inner class that is local to a method. This is hardly ever useful, but what would be really useful is are local methods.

These are also supported in JavaScript or Ceylon, by the way.

7. The REPL

Because of various language features (such as 6. Methods everywhere), Scala is a language that can easily run in a REPL. This is awesome for testing out a small algorithm or concept outside of the scope of your application.

In Java, we usually tend to do this:

public class SomeRandomClass {

    // [...]
  
    public static void main(String[] args) {
        System.out.println(SomeOtherClass.testMethod());
    }

    // [...]
}

In Scala, I would’ve just written this in the REPL:

println(SomeOtherClass.testMethod)

Notice also the always available println method. Pure gold in terms of efficient debugging.

8. Arrays are NOT (that much of) a special case

In Java, apart from primitive types, there are also those weird things we call arrays. Arrays originate from an entirely separate universe, where we have to remember quirky rules originating from the ages of Capt Kirk (or so)

array

Yes, rules like:

// Compiles but fails at runtime
Object[] arrrrr = new String[1];
arrrrr[0] = new Object();

// This works
Object[] arrrr2 = new Integer[1];
arrrr2[0] = 1; // Autoboxing

// This doesn't work
Object[] arrrr3 = new int[];

// This works
Object[] arr4[] = new Object[1][];

// So does this (initialisation):
Object[][] arr5 = { { } };

// Or this (puzzle: Why does it work?):
Object[][] arr6 = { { new int[1] } };

// But this doesn't work (assignment)
arr5 = { { } };

Yes, the list could go on. With Scala, arrays are less of a special case, syntactically speaking:

val a = new Array[String](3);
a(0) = "A"
a(1) = "B"
a(2) = "C"
a.map(v => v + ":")

// output Array(A:, B:, C:)

As you can see, arrays behave much like other collections including all the useful methods that can be used on them.

9. Symbolic method names

Now, this topic is one that is more controversial, as it reminds us of the perils of operator overloading. But every once in a while, we’d wish to have something similar. Something that allows us to write

val x = BigDecimal(3);
val y = BigDecimal(4);
val z = x * y

Very intuitively, the value of z should be BigDecimal(12). That cannot be too hard, can it? I don’t care if the implementation of * is really a method called multiply() or whatever. When writing down the method, I’d like to use what looks like a very common operator for multiplication.

By the way, I’d also like to do that with SQL. Here’s an example:

select ( 
  AUTHOR.FIRST_NAME || " " || AUTHOR.LAST_NAME,
  AUTHOR.AGE - 10
)
from AUTHOR
where AUTHOR.ID > 10
fetch

Doesn’t that make sense? We know that || means concat (in some databases). We know what the meaning of - (minus) and > (greater than) is. Why not just write it?

The above is a compiling example of jOOQ in Scala, btw.

jOOQ: The Best Way to Write SQL in Scala

Attention: Caveat

There’s always a flip side to allowing something like operator overloading or symbolic method names. It can (and will be) abused. By libraries as much as by the Scala language itself.

10. Tuples

Being a SQL person, this is again one of the features I miss most in other languages. In SQL, everything is either a TABLE or a ROW. few people actually know that, and few databases actually support this way of thinking.

Scala doesn’t have ROW types (which are really records), but at least, there are anonymous tuple types. Think of rows as tuples with named attributes, whereas case classes would be named rows:

  • Tuple: Anonymous type with typed and indexed elements
  • Row: Anonymous type with typed, named, and indexed elements
  • case class: Named type with typed and named elements

In Scala, I can just write:

// A tuple with two values
val t1 = (1, "A")

// A nested tuple
val t2 = (1, "A", (2, "B"))

In Java, a similar thing can be done, but you’ll have to write the library yourself, and you have no language support:

class Tuple2<T1, T2> {
    // Lots of bloat, see missing case classes
}

class Tuple3<T1, T2, T3> {
    // Bloat bloat bloat
}

And then:

// Yikes, no type inference...
Tuple2<Integer, String> t1 = new Tuple2<>(1, "A");

// OK, this will certainly not look nice
Tuple3<Integer, String, Tuple2<Integer, String>> t2 =
    new Tuple3<>(1, "A", new Tuple2<>(2, "B"));

jOOQ makes extensive use of the above technique to bring you SQL’s row value expressions to Java, and surprisingly, in most cases, you can do without the missing type inference as jOOQ is a fluent API where you never really assign values to local variables… An example:

DSL.using(configuration)
   .select(T1.SOME_VALUE)
   .from(T1)
   .where(
      // This ROW constructor is completely type safe
      row(T1.COL1, T1.COL2)
      .in(select(T2.A, T2.B).from(T2))
   )
   .fetch();

Conclusion

This was certainly a pro-Scala and slightly contra-Java article. Don’t get me wrong. By no means, I’d like to migrate entirely to Scala. I think that the Scala language goes way beyond what is reasonable in any useful software. There are lots of little features and gimmicks that seem nice to have, but will inevitably blow up in your face, such as:

  • implicit conversion. This is not only very hard to manage, it also slows down compilation horribly. Besides, it’s probably utterly impossible to implement semantic versioning reasonably using implicit, as it is probably not possible to foresee all possible client code breakage through accidental backwards-incompatibility.
  • local imports seem great at first, but their power quickly makes code unintelligible when people start partially importing or renaming types for a local scope.
  • symbolic method names are most often abused. Take the parser API for instance, which features method names like ^^, ^^^, ^?, or ~!

Nonetheless, I think that the advantages of Scala over Java listed in this article could all be implemented in Java as well:

  • with little risk of breaking backwards-compatibility
  • with (probably) not too big of an effort, JLS-wise
  • with a huge impact on developer productivity
  • with a huge impact on Java’s competitiveness

In any case, Java 9 will be another promising release, with hot topics like value types, declaration-site variance, specialisation (very interesting!) or ClassDynamic

With these huge changes, let’s hope there’s also some room for any of the above little improvements, that would add more immediate value to every day work.

jOOQ Tip of the Day: Reuse Bind Values


jOOQ implements your SQL statements as AST (Abstract Syntax Tree). This means that your SQL statement is modelled in a non-text form prior to serialising it as a textual SQL statement to your JDBC driver.

One advantage of this is that you can freely manipulate this AST any way you want. This can be done using jOOQ’s SQL transformation capabilities, or in some cases much more simply directly in your client code.

Imagine, for instance, that you have a SQL statement where several bind values should be identical. This is a frequent problem in SQL, as SQL is verbose and repetitive. For instance:

-- Both "1" should in fact be the same value
SELECT 1
FROM   TABLE
WHERE  TABLE.COL < 1

-- Both "2" should in fact be the same value
SELECT 2
FROM   TABLE
WHERE  TABLE.COL < 2

With jOOQ, what you can do is this:

// Create a single bind value reference
Param<Integer> value = val(1);

// And use that reference several times in your query:
Select<?> query =
DSL.using(configuration)
   .select(value.as("a"))
   .from(TABLE)
   .where(TABLE.COL.lt(value));

assertEquals(1, (int) query.fetchOne("a"));

// Now, for the second query, simply change the value
value.setValue(2);

// ... and execute the query again
assertEquals(2, (int) query.fetchOne("a"));

As a jOOQ developer, you’re directly manipulating your SQL statement’s AST. Nothing keeps you from turning that AST into a directed graph (beware of cycles, of course), to improve your SQL expressiveness.

jOOQ: The best way to write SQL in Java

Keeping things DRY: Method overloading


A good clean application design requires discipline in keeping things DRY:

Everything has to be done once.
Having to do it twice is a coincidence.
Having to do it three times is a pattern.

— An unknown wise man

Now, if you’re following the Xtreme Programming rules, you know what needs to be done, when you encounter a pattern:

refactor mercilessly

Because we all know what happens when you don’t:

Not DRY: Method overloading

One of the least DRY things you can do that is still acceptable is method overloading – in those languages that allow it (unlike Ceylon, JavaScript). Being an internal domain-specific language, the jOOQ API makes heavy use of overloading. Consider the type Field (modelling a database column):

public interface Field<T> {

    // [...]

    Condition eq(T value);
    Condition eq(Field<T> field);
    Condition eq(Select<? extends Record1<T>> query);
    Condition eq(QuantifiedSelect<? extends Record1<T>> query);

    Condition in(Collection<?> values);
    Condition in(T... values);
    Condition in(Field<?>... values);
    Condition in(Select<? extends Record1<T>> query);

    // [...]

}

So, in certain cases, non-DRY-ness is inevitable, also to a given extent in the implementation of the above API. The key rule of thumb here, however, is to always have as few implementations as possible also for overloaded methods. Try calling one method from another. For instance these two methods are very similar:

Condition eq(T value);
Condition eq(Field<T> field);

The first method is a special case of the second one, where jOOQ users do not want to explicitly declare a bind variable. It is literally implemented as such:

@Override
public final Condition eq(T value) {
    return equal(value);
}

@Override
public final Condition equal(T value) {
    return equal(Utils.field(value, this));
}

@Override
public final Condition equal(Field<T> field) {
    return compare(EQUALS, nullSafe(field));
}

@Override
public final Condition compare(Comparator comparator, Field<T> field) {
    switch (comparator) {
        case IS_DISTINCT_FROM:
        case IS_NOT_DISTINCT_FROM:
            return new IsDistinctFrom<T>(this, nullSafe(field), comparator);

        default:
            return new CompareCondition(this, nullSafe(field), comparator);
    }
}

As you can see:

  • eq() is just a synonym for the legacy equal() method
  • equal(T) is a more specialised, convenience form of equal(Field<T>)
  • equal(Field<T>) is a more specialised, convenience form of compare(Comparator, Field<T>)
  • compare() finally provides access to the implementation of this API

All of these methods are also part of the public API and can be called by the API consumer, directly, which is why the nullSafe() check is repeated in each method.

Why all the trouble?

The answer is simple.

  • There is only very little possibility of a copy-paste error throughout all the API.
  • … because the same API has to be offered for ne, gt, ge, lt, le
  • No matter what part of the API happens to be integration-tested, the implementation itself is certainly covered by some test.
  • This way, it is extremely easy to provide users with a very rich API with lots of convenience methods, as users do not want to remember how these more general-purpose methods (like compare()) really work.

The last point is particularly important, and because of risks related to backwards-compatibility, not always followed by the JDK, for instance. In order to create a Java 8 Stream from an Iterator, you have to go through all this hassle, for instance:

// Aagh, my fingers hurt...
   StreamSupport.stream(iterator.spliterator(), false);
// ^^^^^^^^^^^^^                 ^^^^^^^^^^^    ^^^^^
//       |                            |           |
// Not Stream!                        |           |
//                                    |           |
// Hmm, Spliterator. Sounds like      |           |
// Iterator. But what is it? ---------+           |
//                                                |
// What's this true and false?                    |
// And do I need to care? ------------------------+

When, intuitively, you’d like to have:

// Not Enterprise enough
iterator.stream();

In other words, subtle Java 8 Streams implementation details will soon leak into a lot of client code, and many new utility functions will wrap these things again and again.

See Brian Goetz’s explanation on Stack Overflow for details.

On the flip side of delegating overload implementations, it is of course harder (i.e. more work) to implement such an API. This is particularly cumbersome if an API vendor also allows users to implement the API themselves (e.g. JDBC). Another issue is the length of stack traces generated by such implementations. But we’ve shown before on this blog that deep stack traces can be a sign of good quality.

Now you know why.

Takeaway

The takeaway is simple. Whenever you encounter a pattern, refactor. Find the most common denominator, factor it out into an implementation, and see that this implementation is hardly ever used by delegating single responsibility steps from method to method.

By following these rules, you will:

  • Have less bugs
  • Have a more convenient API

Happy refactoring!

Java 8 Friday: More Functional Relational Transformation


In the past, we’ve been providing you with a new article every Friday about what’s new in Java 8. It has been a very exciting blog series, but we would like to focus again more on our core content, which is Java and SQL. We will still be occasionally blogging about Java 8, but no longer every Friday (as some of you have already notice).

In this last, short post of the Java 8 Friday series, we’d like to re-iterate the fact that we believe that the future belongs to functional relational data transformation (as opposed to ORM). We’ve spent about 20 years now using the object-oriented software development paradigm. Many of us have been very dogmatic about it. In the last 10 years, however, a “new” paradigm has started to get increasing traction in programming communities: Functional programming.

Functional programming is not that new, however. Lisp has been a very early functional programming language. XSLT and SQL are also somewhat functional (and declarative!). As we’re big fans of SQL’s functional (and declarative!) nature, we’re quite excited about the fact that we now have sophisticated tools in Java to transform tabular data that has been extracted from SQL databases. Streams!

SQL ResultSets are very similar to Streams

As we’ve pointed out before, JDBC ResultSets and Java 8 Streams are quite similar. This is even more true when you’re using jOOQ, which replaces the JDBC ResultSet by an org.jooq.Result, which extends java.util.List, and thus automatically inherits all Streams functionality. Consider the following query that allows fetching a one-to-many relationship between BOOK and AUTHOR records:

Map<Record2<String, String>, 
    List<Record2<Integer, String>>> booksByAuthor =

// This work is performed in the database
// --------------------------------------
ctx.select(
        BOOK.ID,
        BOOK.TITLE,
        AUTHOR.FIRST_NAME,
        AUTHOR.LAST_NAME
    )
   .from(BOOK)
   .join(AUTHOR)
   .on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
   .orderBy(BOOK.ID)
   .fetch()

// This work is performed in Java memory
// -------------------------------------
   .stream()

   // Group BOOKs by AUTHOR
   .collect(groupingBy(

        // This is the grouping key      
        r -> r.into(AUTHOR.FIRST_NAME, 
                    AUTHOR.LAST_NAME),

        // This is the target data structure
        LinkedHashMap::new,

        // This is the value to be produced for each
        // group: A list of BOOK
        mapping(
            r -> r.into(BOOK.ID, BOOK.TITLE),
            toList()
        )
    ));

The fluency of the Java 8 Streams API is very idiomatic to someone who has been used to writing SQL with jOOQ. Obviously, you can also use something other than jOOQ, e.g. Spring’s JdbcTemplate, or Apache Commons DbUtils, or just wrap the JDBC ResultSet in an Iterator…

What’s very nice about this approach, compared to ORM is the fact that there is no magic happening at all. Every piece of mapping logic is explicit and, thanks to Java generics, fully typesafe. The type of the booksByAuthor output is complex, and a bit hard to read / write, in this example, but it is also fully descriptive and useful.

The same functional transformation with POJOs

If you aren’t too happy with using jOOQ’s Record2 tuple types, no problem. You can specify your own data transfer objects like so:

class Book {
    public int id;
    public String title;

    @Override
    public String toString() { ... }

    @Override
    public int hashCode() { ... }

    @Override
    public boolean equals(Object obj) { ... }
}

static class Author {
    public String firstName;
    public String lastName;

    @Override
    public String toString() { ... }

    @Override
    public int hashCode() { ... }

    @Override
    public boolean equals(Object obj) { ... }
}

With the above DTO, you can now leverage jOOQ’s built-in POJO mapping to transform the jOOQ records into your own domain classes:

Map<Author, List<Book>> booksByAuthor =
ctx.select(
        BOOK.ID,
        BOOK.TITLE,
        AUTHOR.FIRST_NAME,
        AUTHOR.LAST_NAME
    )
   .from(BOOK)
   .join(AUTHOR)
   .on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
   .orderBy(BOOK.ID)
   .fetch()
   .stream()
   .collect(groupingBy(

        // This is the grouping key      
        r -> r.into(Author.class),
        LinkedHashMap::new,

        // This is the grouping value list
        mapping(
            r -> r.into(Book.class),
            toList()
        )
    ));

Explicitness vs. implicitness

At Data Geekery, we believe that a new time has started for Java developers. A time where Annotatiomania™ (finally!) ends and people stop assuming all that implicit behaviour through annotation magic. ORMs depend on a huge amount of specification to explain how each annotation works with each other annotation. It is hard to reverse-engineer (or debug!) this kind of not-so-well-understood annotation-language that JPA has brought to us.

On the flip side, SQL is pretty well understood. Tables are an easy-to-handle data structure, and if you need to transform those tables into something more object-oriented, or more hierarchically structured, you can simply apply functions to those tables and group values yourself! By grouping those values explicitly, you stay in full control of your mapping, just as with jOOQ, you stay in full control of your SQL.

This is why we believe that in the next 5 years, ORMs will lose relevance and people start embracing explicit, stateless and magicless data transformation techniques again, using Java 8 Streams.

Follow

Get every new post delivered to your Inbox.

Join 2,190 other followers

%d bloggers like this: