Java 8 Friday Goodies: Lambdas and Sorting


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: Lambdas and Sorting

Sorting arrays, and collections is an awesome use-case for Java 8’s lambda expression for the simple reason that Comparator has always been a @FunctionalInterface all along since its introduction in JDK 1.2. We can now supply Comparators in the form of a lambda expression to various sort() methods.

For the following examples, we’re going to use this simple Person class:

static class Person {
    final String firstName;
    final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public String toString() {
        return "Person{" +
                "firstName='" + firstName + '\'' +
                ", lastName='" + lastName + '\'' +
                '}';
    }
}

Obviously, we could add natural sorting to Person as well by letting it implement Comparable, but lets focus on external Comparators. Consider the following list of Person, whose names are generated with some online random name generator:

List<Person> people =
Arrays.asList(
    new Person("Jane", "Henderson"),
    new Person("Michael", "White"),
    new Person("Henry", "Brighton"),
    new Person("Hannah", "Plowman"),
    new Person("William", "Henderson")
);

We probably want to sort them by last name and then by first name.

Sorting with Java 7

A “classic” Java 7 example of such a Comparator is this:

people.sort(new Comparator<Person>() {
  @Override
  public int compare(Person o1, Person o2) {
    int result = o1.lastName.compareTo(o2.lastName);

    if (result == 0)
      result = o1.firstName.compareTo(o2.firstName);

    return result;
  }
});
people.forEach(System.out::println);

And the above would yield:

Person{firstName='Henry', lastName='Brighton'}
Person{firstName='Jane', lastName='Henderson'}
Person{firstName='William', lastName='Henderson'}
Person{firstName='Hannah', lastName='Plowman'}
Person{firstName='Michael', lastName='White'}

Sorting with Java 8

Now, let’s translate the above to equivalent Java 8 code:

Comparator<Person> c = (p, o) ->
    p.lastName.compareTo(o.lastName);

c = c.thenComparing((p, o) ->
    p.firstName.compareTo(o.firstName));

people.sort(c);
people.forEach(System.out::println);

The result is obviously the same. How to read the above? First, we assign a lambda expression to a local Person Comparator variable:

Comparator<Person> c = (p, o) ->
    p.lastName.compareTo(o.lastName);

Unlike Scala, C#, or Ceylon which know type inference from an expression towards a local variable declaration through a val keyword (or similar), Java performs type inference from a variable (or parameter, member) declaration towards an expression that is being assigned.

In other, more informal words, type inference is performed from “left to right”, not from “right to left”. This makes chaining Comparators a bit cumbersome, as the Java compiler cannot delay type inference for lambda expressions until you pass the comparator to the sort() method.

Once we have assigned a Comparator to a variable, however, we can fluently chain other comparators through thenComparing():

c = c.thenComparing((p, o) ->
    p.firstName.compareTo(o.firstName));

And finally, we pass it to the List‘s new sort() method, which is a default method implemented directly on the List interface:

default void sort(Comparator<? super E> c) {
    Collections.sort(this, c);
}

Workaround for the above limitation

While Java’s type inference “limitations” can turn out to be a bit frustrating, we can work around type inference by creating a generic IdentityComparator:

class Utils {
    static <E> Comparator<E> compare() {
        return (e1, e2) -> 0;
    }
}

With the above compare() method, we can write the following fluent comparator chain:

people.sort(
    Utils.<Person>compare()
         .thenComparing((p, o) -> 
              p.lastName.compareTo(o.lastName))
         .thenComparing((p, o) -> 
              p.firstName.compareTo(o.firstName))
);

people.forEach(System.out::println);

Extracting keys

This can get even better. Since we’re usually comparing the same POJO / DTO value from both Comparator arguments, we can provide them to the new APIs through a “key extractor” function. This is how it works:

people.sort(Utils.<Person>compare()
      .thenComparing(p -> p.lastName)
      .thenComparing(p -> p.firstName));
people.forEach(System.out::println);

So, given a Person p we provide the API with a function extracting, for instance, p.lastName. And in fact, once we use key extractors, we can omit our own utility method, as the libraries also have a comparing() method to initiate the whole chain:

people.sort(
    Comparator.comparing((Person p) -> p.lastName)
          .thenComparing(p -> p.firstName));
people.forEach(System.out::println);

Again, we need to help the compiler as it cannot infer all types, even if in principle, the sort() method would provide enough information in this case. To learn more about Java 8’s generalized type inference, see our previous blog post.

Conclusion

As with Java 5, the biggest improvements of the upgrade can be seen in the JDK libraries. When Java 5 brought typesafety to Comparators, Java 8 makes them easy to read and write (give or take the odd type inference quirk).

Java 8 is going to revolutionise the way we program, and next week, we will see how Java 8 impacts the way we interact with SQL.

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

Introducing CQLC – a Query DSL for Cassandra’s CQL in Go, Inspired by jOOQ


This morning, we’ve had very nice feedback about our work on the jOOQ User Group by Ben Hood, who has been a long-time jOOQ user and ideas/bug report contributor. He has taken inspiration from our software to build CQLC, a fluent API / query DSL for Cassandra’s CQL written in Go. Citing Ben:

I think JOOQ is the best thing since sliced bread when it comes to dealing with SQL databases on the JVM. I’ve been writing some CQL based apps in Go, and I’ve really missed JOOQ. So I decided to build a JOOQ for Go/Cassandra.

cqlc generates Go code from your Cassandra schema so that you can write type safe CQL statements in Go with a natural query syntax.

It’s aimed at people using CQL in Golang apps who are looking to reduce boilerplate code. […] I think it reinforces the pragmatism of the underlying concept behind JOOQ.

The project is available here:

http://relops.com/cqlc/

I’ve written a blog post to explain what motivated me to do this here:

http://relops.com/blog/2014/01/25/cqlc/

That’s great news for Cassandra and CQL! NoSQL databases will not profit from jOOQ directly any time soon – unless they implement actual SQL. We have studied Cassandra’s CQL syntax in the past and we’ve come to the conclusion that formal jOOQ support would be overkill, similar to JCR-SQL2 support or CMIS SQL support. 90% of the jOOQ API wouldn’t work.

So if your a Go developer using Cassandra, do have a look at CQLC!

Java 8 Friday Goodies: The New New I/O APIs


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: The New New I/O APIs

In a previous blog post from this series, we have shown how Java 8’s lambda expressions improve on the existing (yet outdated) JDK 1.2 I/O API, mainly by helping you express java.io.FileFilter instances as lambda expressions.

Many readers have rightfully pointed out that much of the java.io API has been superseded by Java 7’s java.nio API, where “N” stands for “New” (I know. New. Old. Old-2. Old-2-FIXME. Old-2-TODO…). But things get even better with Java 8. We’re calling it the New New I/O APIs (NNIO), though jOOQ community members have suggested to call it “Enterprise IO”:

Back to more constructive blogging. Let’s have a short walk (pun intended, see Files.walk()) around the improved Java 8 NIO features. Let’s first have a look at the new methods in java.nio.Files. It’s actually quite awesome that we can finally just list contents of a Path! In Java 8 we would use the newly introduced Files.list(), which returns a lazy Stream of files:

Files.list(new File(".").toPath())
     .forEach(System.out::println);

The output I get is this:

.\.git
.\.gitignore
.\.idea
.\java8-goodies.iml
.\LICENSE.txt
.\pom.xml
.\README.txt
.\src
.\target

Remember that forEach() is a “terminal method”, i.e. a method that consumes the stream. You mustn’t call any further methods on such a Stream.

We could also skip all hidden files and list only the first three “regular” files like this:

Files.list(new File(".").toPath())
     .filter(p -> !p.getFileName()
                    .toString().startsWith("."))
     .limit(3)
     .forEach(System.out::println);

The new output I get is this one:

.\java8-goodies.iml
.\LICENSE.txt
.\pom.xml

Now, that’s already pretty awesome. Can it get better? Yes it can. You can also “walk” a whole file hierarchy by descending into directories using the new Files.walk() method. Here’s how:

Files.walk(new File(".").toPath())
     .filter(p -> !p.getFileName()
                    .toString().startsWith("."))
     .forEach(System.out::println);

Unfortunately, the above will create a Stream of Paths excluding all the hidden files and directories, but their descendants are still listed. So we get:

Omitted:
.\.git

But listed:
.\.git\COMMIT_EDITMSG
.\.git\config
.\.git\description
[...]

It is easy to understand why this happens. Files.walk() returns a (lazy) Stream of all descendant files. The call to .filter() will remove the ones that are hidden from the Stream, but this has no influence on any recursive algorithm that might apply in the implementation of walk(). Frankly, this is a bit disappointing. We cannot leverage Java 7’s Files.walkFileTree() method, because the receiving FileVisitor type is not a @FunctionalInterface

We can, however, inefficiently work around this limitation with the following trivial logic:

Files.walk(new File(".").toPath())
     .filter(p -> !p.toString()
                    .contains(File.separator + "."))
     .forEach(System.out::println);

This now yields the expected

.
.\java8-goodies.iml
.\LICENSE.txt
.\pom.xml
.\README.txt
.\src
.\src\main
.\src\main\java
.\src\main\java\org
.\src\main\java\org\jooq
[...]

Good news, however, is the new Files.lines() method. The following example shows how we can easily read line by line from a file, trimming each line (removing indentation) and filtering out the empty ones:

Files.lines(new File("pom.xml").toPath())
     .map(s -> s.trim())
     .filter(s -> !s.isEmpty())
     .forEach(System.out::println);

The above yields:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.jooq</groupId>
<artifactId>java8-goodies</artifactId>
<version>1.0-SNAPSHOT</version>
[...]

Conclusion

Clearly, the notion of lazy evaluation will produce a big amount of confusion in the community, similar to the fact that a Stream can be consumed only once. We are placing a bet that the Java 8 Streams API will be the single biggest source of new Stack Overflow questions.

Nonetheless, the Streams API will be awesome, and next week on the Java 8 Friday series, we’ll see how we can leverage lambda expressions and Streams to sort things, before we’ll see how Java 8 will improve our database interactions!

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

jOOQ Newsletter: January 22, 2014


Subscribe to the newsletter here

Tweet of the Day

We are contributing this new section of the newsletter to our followers, users, and customers. Here are:

Jose M. Arranz who has had plans to build jOOQ, when he happily discovered that jOOQ already exists

Majid Azimi who wishes for jOOQ to become the new de facto standard in all languages. (we wish for the same, shocker, I know)

jOOQ 3.3 Preview

We’re closing in on releasing jOOQ 3.3 towards the beginning of February 2014, which is an exciting release for both existing and new jOOQ users. Apart from many defects fixed, there are now also

… and much more. jOOQ Open Source Edition users can download a preview from GitHub, commercial users can request a pre-built download directly from us.

The Data Geekery Business Case at RedHat’s opensource.com

At Data Geekery, we’re in close touch with various communities, among which those by Oracle or RedHat. RedHat has been selling Open Source software as a business model for a long time. In the enterprise, apart from the flagship RHEL, RedHat is also providing support for a variety of other stacks, such as the JBoss platform, or cloud computing solutions

We’re thrilled to present to you our feature article on RedHat’s opensource.com platform, where we have published an article on our own commercial Open Source business case:

http://opensource.com/business/14/1/how-to-transition-open-source-to-revenue

This is the first part of a series of blog posts. In the next part, we’re going to talk about the five lessons learned when making a business of Open Source, so stay tuned!

Community Work

In the last few weeks, there had been a couple of excellent blog posts by jOOQ community members, which we do not want to keep from you. In December, Gregor Riegler has written a witty rant about the Annotation Nightmare, which we’re increasingly suffering from in the Java ecosystem. Declarative programming at its best.

In January 2014, Petri Kainulainen has started writing a series of excellent blog posts and tutorials for new jOOQ users. The first two posts he has written can be seen here:

Petri is a very active blogger, whom we have been following for a while now. We’re eagerly looking forward to future posts about jOOQ from him.

There are more goodies. Johannes Bühler has contributed new functionality around the loader API and JSON, whereas Darren Shepherd who is a very active contributor to Apache CloudStack has been very active on the jOOQ User Group as well, discovering many issues. Thank you very much for all your help, guys!

Upcoming Events

In January, we have been visiting the JUG-HH in Hamburg for our talk about jOOQ. This was a very exciting and welcoming event with a heavily “JPA / JEE – biased” audience. Thrilling to see how jOOQ (and SQL!) finds high resonance in this kind of industry!

Here is an overview of our upcoming events. We’re very happy to talk at geecon this year!

Stay informed about 2014 events on www.jooq.org/news.

SQL Zone – DEFAULT values

We’re always surprised ourselves at the depth and breadth of the SQL language, and we love share our discoveries with you. Few people know that in SQL (in the standard and in many SQL dialects), you can use the DEFAULT keyword in INSERT and UPDATE statements. Read more about DEFAULT VALUES on our blog.

SQL Zone – JDBC Batch

In heavy-throughput environments, SQL users rely on vendor-specific tools, such as Oracle’s SQL*Loader. In slightly less performance-critical environments, JDBC batch operations are still good enough. But how good are they compared to standalone statements? The answer is: very good. But not in all databases / JDBC drivers. We have found a very interesting benchmark by James Sutherland, which we want to share with you.

Java 8 Friday Goodies: Lambdas and XML


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: Lambdas and XML

There isn’t too much that Java 8 can do to the existing SAX and DOM APIs. The SAX ContentHandler has too many abstract methods to qualify as a @FunctionalInterface, and DOM is a huge, verbose API specified by w3c, with little chance of adding new extension methods.

Luckily there is a small Open Source library called jOOX that allows for processing the w3c standard DOM API through a wrapper API that mimicks the popular jQuery library. jQuery leverages JavaScript’s language features by allowing users to pass functions to the API for DOM traversal. The same is the case with jOOX. Let’s have a closer look:

Assume that we’re using the following pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.jooq</groupId>
  <artifactId>java8-goodies</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.jooq</groupId>
      <artifactId>joox</artifactId>
      <version>1.2.0</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <fork>true</fork>
          <maxmem>512m</maxmem>
          <meminitial>256m</meminitial>
          <encoding>UTF-8</encoding>
          <source>1.8</source>
          <target>1.8</target>
          <debug>true</debug>
          <debuglevel>lines,vars,source</debuglevel>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Let’s assume we wanted to know all the involved artifacts in Maven’s groupId:artifactId:version notation. Here’s how we can do that with jOOX and lambda expressions:

$(new File("./pom.xml")).find("groupId")
                        .each(ctx -> {
    System.out.println(
        $(ctx).text() + ":" +
        $(ctx).siblings("artifactId").text() + ":" +
        $(ctx).siblings("version").text()
    );
});

Executing the above yields:

org.jooq:java8-goodies:1.0-SNAPSHOT
org.jooq:joox:1.2.0
org.apache.maven.plugins:maven-compiler-plugin:2.3.2

Let’s assume we only wanted to display those artifacts that don’t have SNAPSHOT in their version numbers. Simply add a filter:

$(new File("./pom.xml"))
    .find("groupId")
    .filter(ctx -> $(ctx).siblings("version")
                         .matchText(".*-SNAPSHOT")
                         .isEmpty())
    .each(ctx -> {
        System.out.println(
        $(ctx).text() + ":" +
        $(ctx).siblings("artifactId").text() + ":" +
        $(ctx).siblings("version").text());
    });

This will now yield

org.jooq:joox:1.2.0
org.apache.maven.plugins:maven-compiler-plugin:2.3.2

We can also transform the XML content. For instance, if the target document doesn’t need to be a POM, we could replace the matched groupId elements by an artifical artifact element that contains the artifact name in Maven notation. Here’s how to do this:

$(new File("./pom.xml"))
    .find("groupId")
    .filter(ctx -> $(ctx).siblings("version")
                         .matchText(".*-SNAPSHOT")
                         .isEmpty())
    .content(ctx ->
        $(ctx).text() + ":" +
        $(ctx).siblings("artifactId").text() + ":" +
        $(ctx).siblings("version").text()
    )
    .rename("artifact")
    .each(ctx -> System.out.println(ctx));

The above puts new content in place of the previous one through .content(), and then renames the groupId elements to artifact, before printing out the element. The result is:

<artifact>org.jooq:joox:1.2.0</artifact>
<artifact>org.apache.maven.plugins:maven-compiler-plugin:2.3.2</artifact>

More goodies next week

What becomes immediately obvious is the fact that the lambda expert group’s choice of making all SAMs (Single Abstract Method interfaces) eligible for use with lambda expressions adds great value to pre-existing APIs. Quite a clever move.

But there are also new APIs. Last week, we have discussed how the existing JDK 1.2 File API can be improved through the use of lambdas. Some of our readers have expressed their concerns that the java.io API has been largely replaced by java.nio (nio as in New I/O). Next week, we’ll have a look at Java 8’s java.nnio API (for new-new I/O 😉 ) and how it relates to the Java 8 Streams API.

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

What you Didn’t Know About JDBC Batch


In our previous blog post “10 Common Mistakes Java Developers Make When Writing SQL“, we have made a point about batching being important when inserting large data sets. In most databases and with most JDBC drivers, you can get a significant performance improvement when running a single prepared statement in batch mode as such:

PreparedStatement s = connection.prepareStatement(
    "INSERT INTO author(id, first_name, last_name)"
  + "  VALUES (?, ?, ?)");

s.setInt(1, 1);
s.setString(2, "Erich");
s.setString(3, "Gamma");
s.addBatch();

s.setInt(1, 2);
s.setString(2, "Richard");
s.setString(3, "Helm");
s.addBatch();

s.setInt(1, 3);
s.setString(2, "Ralph");
s.setString(3, "Johnson");
s.addBatch();

s.setInt(1, 4);
s.setString(2, "John");
s.setString(3, "Vlissides");
s.addBatch();

int[] result = s.executeBatch();

Or with jOOQ:

create.batch(
        insertInto(AUTHOR, ID, FIRST_NAME, LAST_NAME)
       .values((Integer) null, null, null))
      .bind(1, "Erich", "Gamma")
      .bind(2, "Richard", "Helm")
      .bind(3, "Ralph", "Johnson")
      .bind(4, "John", "Vlissides")
      .execute();

What you probably didn’t know, however, is how dramatic the improvement really is and that JDBC drivers like that of MySQL don’t really support batching, whereas Derby, H2, and HSQLDB don’t really seem to benefit from batching. James Sutherland has assembled this very interesting benchmark on his Java Persistence Performance blog, which can be summarised as such:

Database Performance gain when batched
DB2 503%
Derby 7%
H2 20%
HSQLDB 25%
MySQL 5%
MySQL 332% (with rewriteBatchedStatements=true)
Oracle 503%
PostgreSQL 325%
SQL Server 325%

The above table shows the improvement when comparing each database against itself for INSERT, not databases against each other. Regardless of the actual results, it can be said that batching is never worse than not batching for the data set sizes used in the benchmark.

See the full article here to see a more detailed interpretation of the above benchmark results, as well as results for UPDATE statements:
http://java-persistence-performance.blogspot.ch/2013/05/batch-writing-and-dynamic-vs.html

Do You Want to be a Better Software Developer?


Bloggers are a different breed. They’re spending a lot of time investigating issues in a systematic way that is presentable to others. And then they share – mostly just for the fun of it and for the rewarding feeling sharing gives them. Whenever we google for a technical issue, chances are high that we stumble upon such a blog post.

One of the best blogs out there is Petri Kainulainen’s Do You Want to be a Better Software Developer? Petri has also written a book about Spring Data, which is available from Amazon, O’Reilly and Packt Publishing.

Most recently, I have found his two Maven-related tutorials very useful and well-written:

Also, in 2013, he has written an extensive series of blog posts titled “What I Learned This Week”. Some examples:

Petri’s blog is certainly one that you should follow. His posts are very well structured and quite complete. Currently, he’s also writing an extensive series about jOOQ, which is a very useful additional resource for new jOOQ users.

Thanks for all this great content, Petri!