Archive | java RSS for this section

SQL Developer’s “Securely” Encrypted Passwords


Recently, while at one of our customers’ site, the customer and I needed to get access to a database. On my machine, I had stored the password, but the customer obviously didn’t want to rely on my machine, and the password itself is hashed, so we couldn’t guess it. But guess what? Yes we can! I googled a bit, and incredibly, I found instructions to write the following little utility programme, which I’m licensing to you under the terms of the ASL 2.0:

DISCLAIMER: This program is BY NO MEANS intended for you to do any harm. You could have found this information anywhere else on the web. Please use this ONLY to recover your own “lost” passwords. Like I did.

Note also, this only works with SQL Developer versions less than 4.

import java.io.File;
import java.security.GeneralSecurityException;

import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SQLDeveloperDecrypter {
    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("  Usage 1: " + SQLDeveloperDecrypter.class.getName() + " 0501F83890..... (a single encrypted password)");
            System.err.println("  Usage 2: " + SQLDeveloperDecrypter.class.getName() + " C:\\Users\\...... (the path to the connections.xml file)");
            System.err.println();
            System.err.println("  Pass the password hash code from your connections.xml file. The file might be located at (example)");
            System.err.println("  C:\\Users\\[User]\\AppData\\Roaming\\SQL Developer\\system2.1.1.64.45\\o.jdeveloper.db.connection.11.1.1.2.36.55.30");

            System.exit(-1);
        }

        if (args[0].startsWith("05")) {
            System.out.println(decryptPassword(args[0]));
        }
        else {
            File file = new File(args[0]);
            if (file.isDirectory())
                file = new File(file, "connections.xml");

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(file.toURI().toString());

            // The relevant structure is:
            //
            // <Reference name="connection name">
            //   <RefAddresses>
            //     <StringRefAddr addrType="password">
            //       <Contents>057D3DE2...

            XPathFactory xPathfactory = XPathFactory.newInstance();
            XPath xpath = xPathfactory.newXPath();
            XPathExpression expr = xpath.compile("//StringRefAddr[@addrType='password']/Contents");

            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);

                System.out.println("Connection name     : " +
                    ((Element) e.getParentNode().getParentNode().getParentNode()).getAttribute("name")
                );

                System.out.println("Password (encrypted): " +
                    e.getTextContent()
                );

                System.out.println("Password (decrypted): " +
                    decryptPassword(e.getTextContent())
                );

                System.out.println();
            }
        }
    }

    // From: http://stackoverflow.com/a/140861
    public static byte[] hexStringToByteArray(String s) {
        int len = s.length();
        byte[] data = new byte[len / 2];
        for (int i = 0; i < len; i += 2) {
            data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                                 + Character.digit(s.charAt(i+1), 16));
        }
        return data;
    }

    // From: http://stackoverflow.com/a/3109774
    public static String decryptPassword(String result) throws GeneralSecurityException {
          return new String(decryptPassword(hexStringToByteArray(result)));
    }

    public static byte[] decryptPassword(byte[] result) throws GeneralSecurityException {
        byte constant = result[0];
        if (constant != 5) {
            throw new IllegalArgumentException();
        }

        byte[] secretKey = new byte[8];
        System.arraycopy(result, 1, secretKey, 0, 8);

        byte[] encryptedPassword = new byte[result.length - 9];
        System.arraycopy(result, 9, encryptedPassword, 0, encryptedPassword.length);

        byte[] iv = new byte[8];
        for (int i = 0; i < iv.length; i++) {
            iv[i] = 0;
        }

        Cipher cipher = Cipher.getInstance("DES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(secretKey, "DES"), new IvParameterSpec(iv));
        return cipher.doFinal(encryptedPassword);
    }
}

Parts of the source code were borrowed, from here and here. In other words, virtually any hacker could’ve come up with the above programme. And the output? This:

Connection name     : SAKILA
Password (encrypted): 0517CB1A41E3C2CC3A3163234A6A8E92F8
Password (decrypted): SAKILA

Connection name     : TEST
Password (encrypted): 05B03F45511F83F6CD4D322C9E173B5A94
Password (decrypted): TEST

Wonderful! All the passwords on my machine are now recovered in constant time (no brute force).

Does this make you think? I hope that your DBA doesn’t store their passwords in SQL Developer. On a laptop. Which they forget in the train. With access to your customers’ credit card information.

In the meantime, though, I’m glad I could recover the “lost” password for my client ;-)

Java 8 Friday: Most Internal DSLs are Outdated


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Most Internal DSLs are Outdated

That’s quite a statement from a vendor of one of the most advanced internal DSLs currently on the market. Let me explain:

Languages are hard

Learning a new language (or API) is hard. You have to understand all the keywords, the constructs, the statement and expression types, etc. This is true both for external DSLs, internal DSLs and “regular” APIs, which are essentially internal DSLs with less fluency.

When using JUnit, people have grown used to using hamcrest matchers. The fact that they’re available in six languages (Java, Python, Ruby, Objective-C, PHP, Erlang) makes them somewhat of a sound choice. As a domain-specific language, they have established idioms that are easy to read, e.g.

assertThat(theBiscuit, equalTo(myBiscuit));
assertThat(theBiscuit, is(equalTo(myBiscuit)));
assertThat(theBiscuit, is(myBiscuit));

When you read this code, you will immediately “understand” what is being asserted, because the API reads like prosa. But learning to write code in this API is harder. You will have to understand:

  • Where all of these methods are coming from
  • What sorts of methods exist
  • Who might have extended hamcrest with custom Matchers
  • What are best practices when extending the DSL

For instance, in the above example, what exactly is the difference between the three? When should I use one and when the other? Is is() checking for object identity? Is equalTo() checking for object equality?

The hamcrest tutorial goes on with examples like these:

public void testSquareRootOfMinusOneIsNotANumber() {
    assertThat(Math.sqrt(-1), is(notANumber()));
}

You can see that notANumber() apparently is a custom matcher, implemented some place in a utility:

public class IsNotANumber
extends TypeSafeMatcher<Double> {

  @Override
  public boolean matchesSafely(Double number) {
    return number.isNaN();
  }

  public void describeTo(Description description) {
    description.appendText("not a number");
  }

  @Factory
  public static <T> Matcher<Double> notANumber() {
    return new IsNotANumber();
  }
}

While this sort of DSL is very easy to create, and probably also a bit fun, it is dangerous to start delving into writing and enhancing custom DSLs for a simple reason. They’re in no way better than their general-purpose, functional counterparts – but they’re harder to maintain. Consider the above examples in Java 8:

Replacing DSLs with Functions

Let’s assume we have a very simple testing API:

static <T> void assertThat(
    T actual, 
    Predicate<T> expected
) {
    assertThat(actual, expected, "Test failed");
}

static <T> void assertThat(
    T actual, 
    Predicate<T> expected, 
    String message
) {
    assertThat(() -> actual, expected, message);
}

static <T> void assertThat(
    Supplier<T> actual, 
    Predicate<T> expected
) {
    assertThat(actual, expected, "Test failed");
}

static <T> void assertThat(
    Supplier<T> actual, 
    Predicate<T> expected, 
    String message
) {
    if (!expected.test(actual.get()))
        throw new AssertionError(message);
}

Now, compare the hamcrest matcher expressions with their functional equivalents:

// BEFORE
// ---------------------------------------------
assertThat(theBiscuit, equalTo(myBiscuit));
assertThat(theBiscuit, is(equalTo(myBiscuit)));
assertThat(theBiscuit, is(myBiscuit));

assertThat(Math.sqrt(-1), is(notANumber()));

// AFTER
// ---------------------------------------------
assertThat(theBiscuit, b -> b == myBiscuit);
assertThat(Math.sqrt(-1), n -> Double.isNaN(n));

With lambda expressions, and a well-designed assertThat() API, I’m pretty sure that you won’t be looking for the right way to express your assertions with matchers any longer.

Note that unfortunately, we cannot use the Double::isNaN method reference, as that would not be compatible with Predicate<Double>. For that, we’d have to do some primitive type magic in the assertion API, e.g.

static void assertThat(
    double actual, 
    DoublePredicate expected
) { ... }

Which can then be used as such:

assertThat(Math.sqrt(-1), Double::isNaN);

Yeah, but…

… you may hear yourself saying, “but we can combine matchers with lambdas and streams”. Yes, of course we can. I’ve just done so now in the jOOQ integration tests. I want to skip the integration tests for all SQL dialects that are not in a list of dialects supplied as a system property:

String dialectString = 
    System.getProperty("org.jooq.test-dialects");

// The string must not be "empty"
assumeThat(dialectString, not(isOneOf("", null)));

// And we check if the current dialect() is
// contained in a comma or semi-colon separated
// list of allowed dialects, trimmed and lowercased
assumeThat(
    dialect().name().toLowerCase(),

    // Another matcher here
    isOneOf(stream(dialectString.split("[,;]"))
        .map(String::trim)
        .map(String::toLowerCase)
        .toArray(String[]::new))
);

… and that’s pretty neat, too, right?

But why don’t I just simply write:

// Using Apache Commons, here
assumeThat(dialectString, StringUtils::isNotEmpty);
assumeThat(
    dialect().name().toLowerCase(),
    d -> stream(dialectString.split("[,;]"))
        .map(String::trim)
        .map(String::toLowerCase())
        .anyMatch(d::equals)
);

No Hamcrest needed, just plain old lambdas and streams!

Now, readability is a matter of taste, of course. But the above example clearly shows that there is no longer any need for Hamcrest matchers and for the Hamcrest DSL. Given that within the next 2-3 years, the majority of all Java developers will be very used to using the Streams API in every day work, but not very used to using the Hamcrest API, I urge you, JUnit maintainers, to deprecate the use of Hamcrest in favour of Java 8 APIs.

Is Hamcrest now considered bad?

Well, it has served its purpose in the past, and people have grown somewhat used to it. But as we’ve already pointed out in a previous post about Java 8 and JUnit Exception matching, yes, we do believe that we Java folks have been barking up the wrong tree in the last 10 years.

The lack of lambda expressions has lead to a variety of completely bloated and now also slightly useless libraries. Many internal DSLs or annotation-magicians are also affected. Not because they’re no longer solving the problems they used to, but because they’re not Java-8-ready. Hamcrest’s Matcher type is not a functional interface, although it would be quite easy to transform it into one. In fact, Hamcrest’s CustomMatcher logic should be pulled up to the Matcher interface, into default methods.

Things dont’ get better with alternatives, like AssertJ, which create an alternative DSL that is now rendered obsolete (in terms of call-site code verbosity) through lambdas and the Streams API.

If you insist on using a DSL for testing, then probably Spock would be a far better choice anyway.

Other examples

Hamcrest is just one example of such a DSL. This article has shown how it can be almost completely removed from your stack by using standard JDK 8 constructs and a couple of utility methods, which you might have in JUnit some time soon, anyway.

Java 8 will bring a lot of new traction into last decade’s DSL debate, as also the Streams API will greatly improve the way we look at transforming or building data. But many current DSLs are not ready for Java 8, and have not been designed in a functional way. They have too many keywords for things and concepts that are hard to learn, and that would be better modelled using functions.

An exception to this rule are DSLs like jOOQ or jRTF, which are modelling actual pre-existing external DSLs in a 1:1 fashion, inheriting all the existing keywords and syntax elements, which makes them much easier to learn in the first place.

What’s your take?

What is your take on the above assumptions? What is your favourite internal DSL, that might vanish or that might be completely transformed in the next five years because it has been obsoleted by Java 8?

Stay tuned for more Java 8 Friday articles here on this blog.

QueryDSL vs. jOOQ. Feature Completeness vs. Now More Than Ever


This week, Timo Westkämper from QueryDSL has announced feature completeness on the QueryDSL user group, along with his call for contributions and increased focus on bugfixes and documentation.

Timo and us, we have always been in close contact, observing each other’s products. In the beginning of jOOQ in 2009, QueryDSL was ahead of us.

But we learned quickly and removed all of our shortcomings such that jOOQ and QueryDSL were quickly at eye level by 2011. Ever since, we have been taking inspiration from one another, as in the end, we have had similar goals. Today, whenever someone is looking for a querying DSL, people tend to recommend either of our tools:

QueryDSL is often a good choice in JPA-based environments, while jOOQ is mostly the best choice in SQL-based environments, although jOOQ has already been given some credit in JPA-based environments as well:

Anyway, today, we’d like to congratulate Timo to his new job, and to QueryDSL’s feature completeness.

jOOQ, on the other hand, is far from feature complete.

jOOQ is what SQLJ should have been from the beginning.

We’re only at the beginning. Java and SQL are the two platforms that are used by most of the developers on this planet. According to db-engines.com, almost every popular DBMS is a SQL-based relational DBMS. According to TIOBE, Java currently ranks #2 among all languages.

We strongly believe that all of these developers are in dire need for better SQL integration into the Java language. While ORMs and JPA are very well integrated, SQL is not, and that is what we are working on. jOOQ will be feature complete when the Java compiler can natively compile actual SQL code and SQL code fragments into jOOQ, which will serve as its backing AST model for further SQL transformation.

Until we reach that goal, we’ll be adding support for more SQL goodness. A small selection of things that we already support, beyond QueryDSL’s “feature completeness”:

  • Table-valued functions
  • PIVOT tables
  • DDL (with jOOQ 3.4)
  • MERGE statement
  • Derived tables and derived column lists
  • Row value expressions
  • Flashback query
  • Window functions
  • Ordered aggregate functions
  • Common table expressions (with jOOQ 3.4)
  • Object-oriented PL/SQL
  • User-defined types
  • Hierarchical SQL
  • Custom SQL transformation
  • 16 supported RDBMS (even MS Access!)
  • … you name it

Our roadmap is full of great ideas. There’s plenty of work, so let’s get going! Join us, your partner for…

jOOQ is the best way to write SQL in Java

Yet Another 10 Common Mistakes Java Developers Make When Writing SQL (You Won’t BELIEVE the Last One)


(Sorry for that click-bait heading. Couldn’t resist ;-) )

We’re on a mission. To teach you SQL. But mostly, we want to teach you how to appreciate SQL. You’ll love it!

Getting SQL right or wrong shouldn’t be about that You’re-Doing-It-Wrong™ attitude that can be encountered often when evangelists promote their object of evangelism. Getting SQL right should be about the fun you’ll have once you do get it right. The things you start appreciating when you notice that you can easily replace 2000 lines of slow, hard-to-maintain, and ugly imperative (or object-oriented) code with 300 lines of lean functional code (e.g. using Java 8), or even better, with 50 lines of SQL.

We’re glad to see that our blogging friends have started appreciating SQL, and most specifically, window functions after reading our posts. For instance, take

So, after our previous, very popular posts:

… we’ll bring you:

Yet Another 10 Common Mistakes Java Developer Make When Writing SQL

And of course, this doesn’t apply to Java developers alone, but it’s written from the perspective of a Java (and SQL) developer. So here we go (again):

1. Not Using Window Functions

After all that we’ve been preaching, this must be our number 1 mistake in this series. Window functions are probably the coolest SQL feature of them all. They’re so incredibly useful, they should be the number one reason for anyone to switch to a better database, e.g. PostgreSQL:

If free and/or Open Source is important to you, you have absolutely no better choice than using PostgreSQL (and you’ll even get to use the free jOOQ Open Source Edition, if you’re a Java developer).

And if you’re lucky enough to work in an environment with Oracle or SQL Server (or DB2, Sybase) licenses, you get even more out of your new favourite tool.

We won’t repeat all the window function goodness in this section, we’ve blogged about them often enough:

The Cure:

Remove MySQL. Take a decent database. And start playing with window functions. You’ll never go back, guaranteed.

2. Not declaring NOT NULL constraints

This one was already part of a previous list where we claimed that you should add as much metadata as possible to your schema, because your database will be able to leverage that metadata for optimisations. For instance, if your database knows that a foreign key value in BOOK.AUTHOR_ID must also be contained exactly once in AUTHOR.ID, then a whole set of optimisations can be achieved in complex queries.

Now let’s have another look at NOT NULL constraints. If you’re using Oracle, NULL values will not be part of your index. This doesn’t matter if you’re expressing an IN constraint, for instance:

SELECT * FROM table
WHERE value IN (
  SELECT nullable_column FROM ...
)

But what happens with a NOT IN constraint?

SELECT * FROM table
WHERE value NOT IN (
  SELECT nullable_column FROM ...
)

Due to SQL’s slightly unintuitive way of handling NULL, there is a slight risk of the second query unexpectedly not returning any results at all, namely if there is at least one NULL value as a result from the subquery. This is true for all databases that get SQL right.

But because the index on nullable_column doesn’t contain any NULL values, Oracle has to look up the complete content in the table, resulting in a FULL TABLE SCAN. Now that is unexpected! Details about this can be seen in this article.

The Cure:

Carefully review all your nullable, yet indexed columns, and check if you really cannot add a NOT NULL constraint to those columns.

The Tool:

If you’re using Oracle, use this query to detect all nullable, yet indexed columns:

SELECT
  i.table_name,
  i.index_name,
  LISTAGG(
    LPAD(i.column_position,  2) || ': ' || 
    RPAD(i.column_name    , 30) || ' '  ||
    DECODE(t.nullable, 'Y', '(NULL)', '(NOT NULL)'), 
    ', '
  ) WITHIN GROUP (ORDER BY i.column_position) 
    AS "NULLABLE columns in indexes"
FROM user_ind_columns i
JOIN user_tab_cols t
ON (t.table_name, t.column_name) = 
  ((i.table_name, i.column_name))
WHERE EXISTS (
  SELECT 1
  FROM user_tab_cols t
  WHERE (t.table_name, t.column_name, t.nullable) = 
       ((i.table_name, i.column_name, 'Y'       ))
)
GROUP BY i.table_name, i.index_name
ORDER BY i.index_name ASC;

Example output:

TABLE_NAME | INDEX_NAME   | NULLABLE columns in indexes
-----------+--------------+----------------------------
PERSON     | I_PERSON_DOB | 1: DATE_OF_BIRTH (NULL)

And then, fix it!

(Accidental criticism of Maven is irrelevant here ;-) )

If you’re curious about more details, see also these posts:

3. Using PL/SQL Package State

Now, this is a boring one if you’re not using Oracle, but if you are (and you’re a Java developer), be very wary of PL/SQL package state. Are you really doing what you think you’re doing?

Yes, PL/SQL has package-state, e.g.

CREATE OR REPLACE PACKAGE pkg IS
  -- Package state here!
  n NUMBER := 1;

  FUNCTION next_n RETURN NUMBER;
END pkg;

CREATE OR REPLACE PACKAGE BODY pkg IS
  FUNCTION next_n RETURN NUMBER
  IS
  BEGIN
    n := n + 1;
    RETURN n;
  END next_n;
END pkg;

Wonderful, so you’ve created yourself an in-memory counter that generates a new number every time you call pkg.next_n. But who owns that counter? Yes, the session. Each session has their own initialised “package instance”.

But no, it’s probably not the session you might have thought of.

We Java developers connect to databases through connection pools. When we obtain a JDBC Connection from such a pool, we recycle that connection from a previous “session”, e.g. a previous HTTP Request (not HTTP Session!). But that’s not the same. The database session (probably) outlives the HTTP Request and will be inherited by the next request, possibly from an entirely different user. Now, imagine you had a credit card number in that package…?

Not The Cure:

Nope. Don’t just jump to using SERIALLY_REUSABLE packages

CREATE OR REPLACE PACKAGE pkg IS
  PRAGMA SERIALLY_REUSABLE;
  n NUMBER := 1;

  FUNCTION next_n RETURN NUMBER;
END pkg;

Because:

  • You cannot even use that package from SQL, now (see ORA-06534).
  • Mixing this PRAGMA with regular package state from other packages just makes things a lot more complex.

So, don’t.

Not The Cure:

I know. PL/SQL can be a beast. It often seems like such a quirky language. But face it. Many things run much much faster when written in PL/SQL, so don’t give up, just yet. Dropping PL/SQL is not the solution either.

The Cure:

At all costs, try to avoid package state in PL/SQL. Think of package state as of static variables in Java. While they might be useful for caches (and constants, of course) every now and then, you might not actually access that state that you wanted. Think about load-balancers, suddenly transferring you to another JVM. Think about class loaders, that might have loaded the same class twice, for some reason.

Instead, pass state as arguments through procedures and functions. This will avoid side-effects and make your code much cleaner and more predictable.

Or, obviuously, persist state to some table.

4. Running the same query all the time

Master data is boring. You probably wrote some utility to get the latest version of your master data (e.g. language, locale, translations, tenant, system settings), and you can query it every time, once it is available.

At all costs, don’t do that. You don’t have to cache many things in your application, as modern databases have grown to be extremely fast when it comes to caching:

  • Table / column content
  • Index content
  • Query / materialized view results
  • Procedure results (if they’re deterministic)
  • Cursors
  • Execution plans

So, for your average query, there’s virtually no need for an ORM second-level cache, at least from a performance perspective (ORM caches mainly fulfil other purposes, of course).

But when you query master data, i.e. data that never changes, then, network latency, traffic and many other factors will impair your database experience.

The Cure:

Please do take 10 minutes, download Guava, and use its excellent and easy to set up cache, that ships with various built-in invalidation strategies. Choose time-based invalidation (i.e. polling), choose Oracle AQ or Streams, or PostgreSQL’s NOTIFY for event-based invalidation, or just make your cache permanent, if it doesn’t matter. But don’t issue an identical master data query all the time.

… This obviously brings us to

5. Not knowing about the N+1 problem

You had a choice. At the beginning of your software product, you had to choose between:

So, obviously, you chose an ORM, because otherwise you wouldn’t be suffering from “N+1″. What does “N+1″ mean?

The accepted answer on this Stack Overflow question explains it nicely. Essentially, you’re running:

SELECT * FROM book

-- And then, for each book:
SELECT * FROM author WHERE id = ?
SELECT * FROM author WHERE id = ?
SELECT * FROM author WHERE id = ?

Of course, you could go and tweak your hundreds of annotations to correctly prefetch or eager fetch each book’s associated author information to produce something along the lines of:

SELECT * 
FROM   book
JOIN   author 
  ON   book.author_id = author.id

But that would be an awful lot of work, and you’ll risk eager-fetching too many things that you didn’t want, resulting in another performance issue.

Maybe, you could upgrade to JPA 2.1 and use the new @NamedEntityGraph to express beautiful annotation trees like this one:

@NamedEntityGraph(
    name = "post",
    attributeNodes = {
        @NamedAttributeNode("title"),
        @NamedAttributeNode(
            value = "comments", 
            subgraph = "comments"
        )
    },
    subgraphs = {
        @NamedSubgraph(
            name = "comments",
            attributeNodes = {
                @NamedAttributeNode("content")
            }
        )
    }
)

The example was taken from this blog post by Hantsy Bai. Hantsy then goes on explaining that you can use the above beauty through the following statement:

em.createQuery("select p from Post p where p.id=:id",
               Post.class)
  .setHint("javax.persistence.fetchgraph", 
           postGraph)
  .setParameter("id", this.id)
  .getResultList()
  .get(0);

Let us all appreciate the above application of JEE standards with all due respect, and then consider…

The Cure:

You just listen to the wise words at the beginning of this article and replace thousands of lines of tedious Java / Annotatiomania™ code with a couple of lines of SQL. Because that will also likely help you prevent another issue that we haven’t even touched yet, namely selecting too many columns as you can see in these posts:

Since you’re already using an ORM, this might just mean resorting to native SQL – or maybe you manage to express your query with JPQL. Of course, we agree with Alessio Harri in believing that you should use jOOQ together with JPA:

The Takeaway:

While the above will certainly help you work around some real world issues that you may have with your favourite ORM, you could also take it one step further and think about it this way. After all these years of pain and suffering from the object-relational impedance mismatch, the JPA 2.1 expert group is now trying to tweak their way out of this annotation madness by adding more declarative, annotation-based fetch graph hints to JPQL queries, that no one can debug, let alone maintain.

The alternative is simple and straight-forward SQL. And with Java 8, we’ll add functional transformation through the Streams API. That’s hard to beat.

But obviuosly, your views and experiences on that subject may differ from ours, so let’s head on to a more objective discussion about…

6. Not using Common Table Expressions

While common table expressions obviously offer readability improvements, they may also offer performance improvements. Consider the following query that I have recently encountered in a customer’s PL/SQL package (not the actual query):

SELECT round (
  (SELECT amount FROM payments WHERE id = :p_id)
    * 
  (
    SELECT e.bid
    FROM   currencies c, exchange_rates e
    WHERE  c.id     = 
      (SELECT cur_id FROM payments WHERE id = :p_id)
    AND    e.cur_id = 
      (SELECT cur_id FROM payments WHERE id = :p_id)
    AND    e.org_id = 
      (SELECT org_id FROM payments WHERE id = :p_id)
  ) / (
    SELECT c.factor
    FROM   currencies c, exchange_rates e
    WHERE  c.id     = 
      (SELECT cur_id FROM payments WHERE id = :p_id)
    AND    e.cur_id = 
      (SELECT cur_id FROM payments WHERE id = :p_id)
    AND    e.org_id = 
      (SELECT org_id FROM payments WHERE id = :p_id)
  ), 0
) 
INTO amount 
FROM dual;

So what does this do? This essentially converts a payment’s amount from one currency into another. Let’s not delve into the business logic too much, let’s head straight to the technical problem. The above query results in the following execution plan (on Oracle):

------------------------------------------------------
| Operation                         | Name           |
------------------------------------------------------
| SELECT STATEMENT                  |                |
|  TABLE ACCESS BY INDEX ROWID      | PAYMENTS       |
|   INDEX UNIQUE SCAN               | PAYM_PK        |
|   NESTED LOOPS                    |                |
|    INDEX UNIQUE SCAN              | CURR_PK        |
|     TABLE ACCESS BY INDEX ROWID   | PAYMENTS       |
|      INDEX UNIQUE SCAN            | PAYM_PK        |
|    TABLE ACCESS BY INDEX ROWID    | EXCHANGE_RATES |
|     INDEX UNIQUE SCAN             | EXCH_PK        |
|      TABLE ACCESS BY INDEX ROWID  | PAYMENTS       |
|       INDEX UNIQUE SCAN           | PAYM_PK        |
|      TABLE ACCESS BY INDEX ROWID  | PAYMENTS       |
|       INDEX UNIQUE SCAN           | PAYM_PK        |
|     NESTED LOOPS                  |                |
|      TABLE ACCESS BY INDEX ROWID  | CURRENCIES     |
|       INDEX UNIQUE SCAN           | CURR_PK        |
|        TABLE ACCESS BY INDEX ROWID| PAYMENTS       |
|         INDEX UNIQUE SCAN         | PAYM_PK        |
|      INDEX UNIQUE SCAN            | EXCH_PK        |
|       TABLE ACCESS BY INDEX ROWID | PAYMENTS       |
|        INDEX UNIQUE SCAN          | PAYM_PK        |
|       TABLE ACCESS BY INDEX ROWID | PAYMENTS       |
|        INDEX UNIQUE SCAN          | PAYM_PK        |
|  FAST DUAL                        |                |
------------------------------------------------------

The actual execution time is negligible in this case, but as you can see, the same objects are accessed again and again within the query. This is a violation of Common Mistake #4: Running the same query all the time.

The whole thing would be so much easier to read, maintain, and for Oracle to execute, if we had used a common table expression. From the original source code, observe the following thing:

-- We're always accessing a single payment:
  FROM payments WHERE id = :p_id

-- Joining currencies and exchange_rates twice:
  FROM currencies c, exchange_rates e

So, let’s factor out the payment first:

-- "payment" contains only a single payment
-- But it contains all the columns that we'll need
-- afterwards
WITH payment AS (
    SELECT cur_id, org_id, amount
    FROM   payments
    WHERE  id = :p_id
)
SELECT round(p.amount * e.bid / c.factor, 0)

-- Then, we simply don't need to repeat the
-- currencies / exchange_rates joins twice
FROM   payment p
JOIN   currencies c     ON  p.cur_id = c.id
JOIN   exchange_rates e ON  e.cur_id = p.cur_id
                        AND e.org_id = p.org_id

Note, that we’ve also replaced table lists with ANSI JOINs as suggested in our previous list

You wouldn’t believe it’s the same query, would you? And what about the execution plan? Here it is!

---------------------------------------------------
| Operation                      | Name           |
---------------------------------------------------
| SELECT STATEMENT               |                |
|  NESTED LOOPS                  |                |
|   NESTED LOOPS                 |                |
|    NESTED LOOPS                |                |
|     FAST DUAL                  |                |
|     TABLE ACCESS BY INDEX ROWID| PAYMENTS       |
|      INDEX UNIQUE SCAN         | PAYM_PK        |
|    TABLE ACCESS BY INDEX ROWID | EXCHANGE_RATES |
|     INDEX UNIQUE SCAN          | EXCH_PK        |
|   TABLE ACCESS BY INDEX ROWID  | CURRENCIES     |
|    INDEX UNIQUE SCAN           | CURR_PK        |
---------------------------------------------------

No doubt that this is much much better.

The Cure:

If you’re lucky enough and you’re using one of those databases that supports window functions, chances are incredibly high (100%) that you also have common table expression support. This is another reason for you to migrate from MySQL to PostgreSQL, or appreciate the fact that you can work on an awesome commercial database.

Common table expressions are like local variables in SQL. In every large statement, you should consider using them, as soon as you feel that you’ve written something before.

The Takeaway:

Some databases (e.g. PostgreSQL, or SQL Server) also support common table expressions for DML statements. In other words, you can write:

WITH ...
UPDATE ...

This makes DML incredibly more powerful.

7. Not using row value expressions for UPDATEs

We’ve advertised the use of row value expressions in our previous listing. They’re very readable and intuitive, and often also promote using certain indexes, e.g. in PostgreSQL.

But few people know that they can also be used in an UPDATE statement, in most databases. Check out the following query, which I again found in a customer’s PL/SQL package (simplified again, of course):

UPDATE u
SET n = (SELECT n + 1    FROM t WHERE u.n = t.n),
    s = (SELECT 'x' || s FROM t WHERE u.n = t.n),
    x = 3;

So this query takes a subquery as a data source for updating two columns, and the third column is updated “regularly”. How does it perform? Moderately:

-----------------------------
| Operation          | Name |
-----------------------------
| UPDATE STATEMENT   |      |
|  UPDATE            | U    |
|   TABLE ACCESS FULL| U    |
|   TABLE ACCESS FULL| T    |
|   TABLE ACCESS FULL| T    |
-----------------------------

Let’s ignore the full table scans, as this query is constructed. The actual query could leverage indexes. But T is accessed twice, i.e. in both subqueries. Oracle didn’t seem to be able to apply scalar subquery caching in this case.

To the rescue: row value expressions. Let’s simply rephrase our UPDATE to this:

UPDATE u
SET (n, s) = ((
      SELECT n + 1, 'x' || s FROM t WHERE u.n = t.n
    )),
    x = 3;

Let’s ignore the funny, Oracle-specific double-parentheses syntax for the right hand side of such a row value expression assignment, but let’s appreciate the fact that we can easily assign a new value to the tuple (n, s) in one go! Note, we could have also written this, instead, and assign x as well:

UPDATE u
SET (n, s, x) = ((
      SELECT n + 1, 'x' || s, 3 
      FROM t WHERE u.n = t.n
    ));

As you will have expected, the execution plan has also improved, and T is accessed only once:

-----------------------------
| Operation          | Name |
-----------------------------
| UPDATE STATEMENT   |      |
|  UPDATE            | U    |
|   TABLE ACCESS FULL| U    |
|   TABLE ACCESS FULL| T    |
-----------------------------

The Cure:

Use row value expressions. Where ever you can. They make your SQL code incredibly more expressive, and chances are, they make it faster, as well.

Note that the above is supported by jOOQ’s UPDATE statement. This is the moment we would like to make you aware of this cheap, in-article advertisement:

jOOQ - The best way to write SQL in Java

;-)

8. Using MySQL when you could use PostgreSQL

To some, this may appear to be a bit of a hipster discussion. But let’s consider the facts:

  • MySQL claims to be the “most popular Open Source database”.
  • PostgreSQL claims to be the “most advanced Open Source database”.

Let’s consider a bit of history. MySQL has always been very easy to install, maintain, and it has had a great and active community. This has lead to MySQL still being the RDBMS of choice with virtually every web hoster on this planet. Those hosters also host PHP, which was equally easy to install, and maintain.

BUT!

We Java developers tend to have an opinion about PHP, right? It’s summarised by this image here:

The PHP Hammer

The PHP Hammer

Well, it works, but how does it work?

The same can be said about MySQL. MySQL has always worked somehow, but while commercial databases like Oracle have made tremendous progress both in terms of query optimisation and feature scope, MySQL has hardly moved in the last decade.

Many people choose MySQL primarily because of its price (USD $ 0.00). But often, the same people have found MySQL to be slow and quickly concluded that SQL is slow per se – without evaluating the options. This is also why all NoSQL stores compare themselves with MySQL, not with Oracle, the database that has been winning the Transaction Processing Performance Council’s (TPC) benchmarks almost forever. Some examples:

While the last article bluntly adds “(and other RDBMS)” it doesn’t go into any sort of detail whatsoever, what those “other RDBMS” do wrong. It really only compares MongoDB with MySQL.

The Cure:

We say: Stop complaining about SQL, when in fact, you’re really complaining about MySQL. There are at least four very popular databases out there that are incredibly good, and millions of times better than MySQL. These are:

(just kidding about the last one, of course)

The Takeaway:

Don’t fall for agressive NoSQL marketing. 10gen is an extremely well-funded company, even if MongoDB continues to disappoint, technically.

The same is true for Datastax.

Both companies are solving a problem that few people have. They’re selling us niche products as commodity, making us think that our real commodity databases (the RDBMS) no longer fulfil our needs. They are well-funded and have big marketing teams to throw around with blunt claims.

In the mean time, PostgreSQL just got even better, and you, as a reader of this blog / post, are about to bet on the winning team :-)

… just to cite Mark Madsen once more:

The Disclaimer:

This article has been quite strongly against MySQL. We don’t mean to talk badly about a database that perfectly fulfils its purpose, as this isn’t a black and white world. Heck, you can get happy with SQLite in some situations. MySQL, being the cheap and easy to use, easy to install commodity database. We just wanted to make you aware of the fact, that you’re expressly choosing the cheap, not-so-good database, rather than the cheap, awesome one.

9. Forgetting about UNDO / REDO logs

We have claimed that MERGE statements or bulk / batch updates are good. That’s correct, but nonetheless, you should be wary when updating huge data sets in transactional contexts. If your transaction “takes too long”, i.e. if you’re updating 10 million records at a time, you will run into two problems:

  • You increase the risk of race conditions, if another process is also writing to the same table. This may cause a rollback on their or on your transaction, possibly making you roll out the huge update again
  • You cause a lot of concurrency on your system, because every other transaction / session, that wants to see the data that you’re about to update, will have to temporarily roll back all of your updates first, before they reach the state on disk that was there before your huge update. That’s the price of ACID.

One way to work around this issue is to allow for other sessions to read uncommitted data.

Another way to work around this issue is to frequently commit your own work, e.g. after 1000 inserts / updates.

In any case, due to the CAP theorem, you will have to make a compromise. Frequent commits will produce the risk of an inconsistent database in the event of the multi-million update going wrong after 5 million (committed) records. A rollback would then mean to revert all database changes towards a backup.

The Cure:

There is no definitive cure to this issue. But beware that you are very very rarely in a situation where it is OK to simply update 10 million records of a live and online table outside of an actual scheduled maintenance window. The simplest acceptable workaround is indeed to commit your work after N inserts / updates.

The Takeaway:

By this time, NoSQL aficionados will claim (again due to excessive marketing by aforementioned companies) that NoSQL has solved this by dropping schemas and typesafety. “Don’t update, just add another property!” – they said.

But that’s not true!

First off, I can add columns to my database without any issue at all. An ALTER TABLE ADD statement is executed instantly on live databases. Filling the column with data doesn’t bother anyone either, because no one reads the column yet (remember, don’t SELECT * !). So adding columns in RDBMS is as cheap as adding JSON properties to a MongoDB document.

But what about altering columns? Removing them? Merging them?

It is simply not true that denormalisation takes you anywhere far. Denormalisation is always a short-term win for the developer. Hardly a long-term win for the operations teams. Having redundant data in your database for the sake of speeding up an ALTER TABLE statement is like sweeping dirt under the carpet.

Don’t believe the marketers. And while you’re at it, perform some doublethink and forget that we’re SQL tool vendors ourselves ;-) Here’s again the “correct” message:

jOOQ - The best way to write SQL in Java

10. Not using the BOOLEAN type correctly

This is not really a mistake per se. It’s just again something that hardly anyone knows. When the SQL:1999 standard introduced the new BOOLEAN data type, they really did it right. Because before, we already had something like booleans in SQL. We’ve had <search condition> in SQL-92, which are essentially predicates for use in WHERE, ON, and HAVING clauses, as well as in CASE expressions.

SQL:1999, however, simply defined the new <boolean value expression> as a regular <value expression>, and redefined the <search condition> as such:

<search condition> ::=
    <boolean value expression>

Done! Now, for most of us Java / Scala / etc. developers, this doesn’t seem like such an innovation. Heck it’s a boolean. Obviuosly it can be interchangeably used as predicate and as variable.

But in the mind-set of the keyword-heavy SQL folks who have taken inspiration from COBOL when designing the language, this was quite a step forward.

Now, what does this mean? This means that you can use any predicate also as a column! For instance:

SELECT a, b, c
FROM (
  SELECT EXISTS (SELECT ...) a,
         MY_COL IN (1, 2, 3) b,
         3 BETWEEN 4 AND 5   c
  FROM   MY_TABLE
) t
WHERE a AND b AND NOT(c)

This is a bit of a dummy query, agreed, but are you aware of how powerful this is?

Luckily, again, PostgreSQL fully supports this (unlike Oracle, which still doesn’t have any BOOLEAN data type in SQL).

The Cure:

Every now and then, using BOOLEAN types feels very right, so do it! You can transform boolean value expressions into predicates and predicates into boolean value expressions. They’re the same. This makes SQL all so powerful.

Conclusion

SQL has evolved steadily over the past years through great standards like SQL:1999, SQL:2003, SQL:2008 and now SQL:2011. It is the only surviving mainstream declarative language, now that XQuery can be considered pretty dead for the mainstream. It can be easily mixed with procedural languages, as PL/SQL and T-SQL (and other procedural dialects) have shown. It can be easily mixed with object-oriented or functional languages, as jOOQ has shown.

At Data Geekery, we believe that SQL is the best way to query data. You don’t agree with any of the above? That’s fine, you don’t have to. Sometimes, even we agree with Winston Churchill who is known to have said:

tweet thisSQL is the worst form of database querying, except for all the other forms.

But as Yakov Fain has recently put it:

You can run from SQL, but you can’t hide

So, let’s better get back to work and learn this beast! Thanks for reading.

Java 8 Friday: Better Exceptions


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Better Exceptions

I had the idea when I stumbled upon JUnit GitHub issue #706, which is about a new method proposal:

ExpectedException#expect(Throwable, Callable)

One suggestion was to create an interceptor for exceptions like this.

assertEquals(Exception.class, 
    thrown(() -> foo()).getClass());
assertEquals("yikes!", 
    thrown(() -> foo()).getMessage());

On the other hand, why not just add something completely new along the lines of this?

// This is needed to allow for throwing Throwables
// from lambda expressions
@FunctionalInterface
interface ThrowableRunnable {
    void run() throws Throwable;
}

// Assert a Throwable type
static void assertThrows(
    Class<? extends Throwable> throwable,
    ThrowableRunnable runnable
) {
    assertThrows(throwable, runnable, t -> {});
}

// Assert a Throwable type and implement more
// assertions in a consumer
static void assertThrows(
    Class<? extends Throwable> throwable,
    ThrowableRunnable runnable,
    Consumer<Throwable> exceptionConsumer
) {
    boolean fail = false;
    try {
        runnable.run();
        fail = true;
    }
    catch (Throwable t) {
        if (!throwable.isInstance(t))
            Assert.fail("Bad exception type");

        exceptionConsumer.accept(t);
    }

    if (fail)
        Assert.fail("No exception was thrown");
}

So the above methods both assert that a given throwable is thrown from a given runnable – ThrowableRunnable to be precise, because most functional interfaces, unfortunately, don’t allow for throwing checked exceptions. See this article for details.

We’re now using the above hypothetical JUnit API as such:

assertThrows(Exception.class, 
    () -> { throw new Exception(); });

assertThrows(Exception.class, 
    () -> { throw new Exception("Message"); },
    e  -> assertEquals("Message", e.getMessage()));

In fact, we could even go further and declare an exception swallowing helper method like this:

// This essentially swallows exceptions
static void withExceptions(
    ThrowableRunnable runnable
) {
    withExceptions(runnable, t -> {});
}

// This delegates exception handling to a consumer
static void withExceptions(
    ThrowableRunnable runnable,
    Consumer<Throwable> exceptionConsumer
) {
    try {
        runnable.run();
    }
    catch (Throwable t) {
        exceptionConsumer.accept(t);
    }
}

This is useful to swallow all sorts of exceptions. The following two idioms are thus equivalent:

try {
    // This will fail
    assertThrows(SQLException.class, () -> {
        throw new Exception();
    });
}
catch (Throwable t) {
    t.printStackTrace();
}

withExceptions(
    // This will fail
    () -> assertThrows(SQLException.class, () -> {
        throw new Exception();
    }),
    t -> t.printStackTrace()
);

Obviuously, these idioms aren’t necessarily more useful than an actual try .. catch .. finally block, specifically also because it does not support proper typing of exceptions (at least not in this example), nor does it support the try-with-resources statement.

Nonetheless, such utility methods will come in handy every now and then.

Next week

Stay tuned for more Java 8 goodness on this blog when we continue our Java 8 Friday series with great new examples.

Java 8 Friday: API Designers, be Careful


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Lean Functional API Design

With Java 8, API design has gotten a whole lot more interesting, but also a bit harder. As a successful API designer, it will no longer suffice to think about all sorts of object-oriented aspects of your API, you will now also need to consider functional aspects of it. In other words, instead of simply providing methods like:

void performAction(Parameter parameter);

// Call the above:
object.performAction(new Parameter(...));

… you should now think about whether your method arguments are better modelled as functions for lazy evaluation:

// Keep the existing method for convenience
// and for backwards compatibility
void performAction(Parameter parameter);

// Overload the existing method with the new
// functional one:
void performAction(Supplier<Parameter> parameter);

// Call the above:
object.performAction(() -> new Parameter(...));

This is great. Your API can be Java-8 ready even before you’re actually targeting Java 8. But if you’re going this way, there are a couple of things to consider.

JDK dependency

The above example makes use of the JDK 8 Supplier type. This type is not available before the JDK 8, so if you’re using it, you’re going to limit your APIs use to the JDK 8. If you want to continue supporting older Java versions, you’ll have to roll your own supplier, or maybe use Callable, which has been available since Java 5:

// Overload the existing method with the new
// functional one:
void performAction(Callable<Parameter> parameter);

// Call the above:
object.performAction(() -> new Parameter(...));

One advantage of using Callable is the fact that your lambda expressions (or “classic” Callable implementations, or nested / inner classes) are allowed to throw checked exceptions. We’ve blogged about another possibility to circumvent this limitation, here.

Overloading

While it is (probably) perfectly fine to overload these two methods

void performAction(Parameter parameter);
void performAction(Supplier<Parameter> parameter);

… you should stay wary when overloading “more similar” methods, like these ones:

void performAction(Supplier<Parameter> parameter);
void performAction(Callable<Parameter> parameter);

If you produce the above API, your API’s client code will not be able to make use of lambda expressions, as there is no way of disambiguating a lambda that is a Supplier from a lambda that is a Callable. We’ve also mentioned this in a previous blog post.

“void-compatible” vs “value-compatible”

I’ve recently (re-)discovered this interesting early JDK 8 compiler bug, where the compiler wasn’t able to disambiguate the following:

void run(Consumer<Integer> consumer);
void run(Function<Integer, Integer> function);

// Remember, the above types are roughly:
interface Consumer<T> {
    void accept(T t);
//  ^^^^ void-compatible
}

interface Function<T, R> {
    R apply(T t);
//  ^ value-compatible
}

The terms “void-compatible” and “value-compatible” are defined in the JLS §15.27.2 for lambda expressions. According to the JLS, the following two calls are not ambiguous:

// Only run(Consumer) is applicable
run(i -> {});

// Only run(Function) is applicable
run(i -> 1);

In other words, it is safe to overload a method to take two “similar” argument types, such as Consumer and Function, as lambda expressions used to express method arguments will not be ambiguous.

This is quite useful, because having an optional return value is very elegant when you’re using lambda expressions. Consider the upcoming jOOQ 3.4 transaction API, which is roughly summarised as such:


// This uses a "void-compatible" lambda
ctx.transaction(c -> {
    DSL.using(c).insertInto(...).execute();
    DSL.using(c).update(...).execute();
});

// This uses a "value-compatible" lambda
Integer result =
ctx.transaction(c -> {
    DSL.using(c).update(...).execute();
    DSL.using(c).delete(...).execute();

    return 42;
});

In the above example, the first call resolves to TransactionalRunnable whereas the second call resolves to TransactionalCallable whose API are like these:

interface TransactionalRunnable {
    void run(Configuration c) throws Exception;
}

interface TransactionalCallable<T> {
    T run(Configuration c) throws Exception;
}

Note, though, that as of JDK 1.8.0_05 and Eclipse Kepler (with the Java 8 support patch), this ambiguity resolution does not yet work because of these bugs:

So, in order to stay on the safe side, maybe you could just simply avoid overloading.

Generic methods are not SAMs

Do note that “SAM” interfaces that contain a single abstract generic method are NOT SAMs in the sense for them to be eligible as lambda expression targets. The following type will never form any lambda expression:

interface NotASAM {
    <T> void run(T t);
}

This is specified in the JLS §15.27.3

A lambda expression is congruent with a function type if all of the following are true:

  • The function type has no type parameters.
  • [ ... ]

What do you have to do now?

If you’re an API designer, you should now start writing unit tests / integration tests also in Java 8. Why? For the simple reason that if you don’t you’ll get your API wrong in subtle ways for those users that are actually using it with Java 8. These things are extremely subtle. Getting them right takes a bit of practice and a lot of regression tests. Do you think you’d like to overload a method? Be sure you don’t break client API that is calling the original method with a lambda.

That’s it for today. Stay tuned for more awesome Java 8 content on this blog.

Java 8 Friday: Language Design is Subtle


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Language Design is Subtle

It’s been a busy week for us. We have just migrated the jOOQ integration tests to Java 8 for two reasons:

  • We want to be sure that client code compiles with Java 8
  • We started to get bored of writing the same old loops over and over again

The trigger was a loop where we needed to transform a SQLDialect[] into another SQLDialect[] calling .family() on each array element. Consider:

Java 7

SQLDialect[] families = 
    new SQLDialect[dialects.length];
for (int i = 0; i < families.length; i++)
    families[i] = dialects[i].family();

Java 8

SQLDialect[] families = 
Stream.of(dialects)
      .map(d -> d.family())
      .toArray(SQLDialect[]::new);

OK, it turns out that the two solutions are equally verbose, even if the latter feels a bit more elegant. :-)

And this gets us straight into the next topic:

Backwards-compatibility

For backwards-compatibility reasons, arrays and the pre-existing Collections API have not been retrofitted to accommodate all the useful methods that Streams now have. In other words, an array doesn’t have a map() method, just as much as List doesn’t have such a method. Streams and Collections/arrays are orthogonal worlds. We can transform them into each other, but they don’t have a unified API.

This is fine in everyday work. We’ll get used to the Streams API and we’ll love it, no doubt. But because of Java being extremely serious about backwards compatibility, we will have to think about one or two things more deeply.

Recently, we have published a post about The Dark Side of Java 8. It was a bit of a rant, although a mild one in our opinion (and it was about time to place some criticism, after all the praise we’ve been giving Java 8 in our series, before ;-) ). First off, that post triggered a reaction by Edwin Dalorzo from our friends at Informatech. (Edwin has written this awesome post comparing LINQ and Java 8 Streams, before). The criticism in our article evolved around three main aspects:

  • Overloading getting more complicated (see also this compiler bug)
  • Limited support for method modifiers on default methods
  • Primitive type “API overloads” for streams and functional interfaces

A response by Brian Goetz

I then got a personal mail from no one less than Brian Goetz himself (!), who pointed out a couple of things to me that I had not yet thought about in this way:

I still think you’re focusing on the wrong thing. Its not really the syntax you don’t like; its the model — you don’t want “default methods”, you want traits, and the syntax is merely a reminder that you didn’t get the feature you wanted. (But you’d be even more confused about “why can’t they be final” if we dropped the “default” keyword!) But that’s blaming the messenger (where here, the keyword is the messenger.)

Its fair to say “this isn’t the model I wanted”. There were many possible paths in the forest, and it may well be the road not taken was equally good or better.

This is also what Edwin had concluded. Default methods were a necessary means to tackle all the new API needed to make Java 8 useful. If Iterator, Iterable, List, Collection, and all the other pre-existing interfaces had to be adapted to accommodate lambdas and Streams API interaction, the expert group would have needed to break an incredible amount of API. Conversely, without adding these additional utility methods (see the awesome new Map methods, for instance!), Java 8 would have been only half as good.

And that’s it.

Even if maybe, some more class building tools might have been useful, they were not in the center of focus for the expert group who already had a lot to do to get things right. The center of focus was to provide a means for API evolution. Or in Brian Goetz’s own words:

Reaching out to the community

It’s great that Brian Goetz reaches out to the community to help us get the right picture about Java 8. Instead of explaining rationales about expert group decisions in private messages, he then asked me to publicly re-ask my questions again on Stack Overflow (or lambda-dev), such that he can then publicly answer them. For increased publicity and greater community benefit, I chose Stack Overflow. Here are:

The amount of traction these two questions got in no time shows how important these things are to the community, so don’t miss reading through them!

“Uncool”? Maybe. But very stable!

Java may not have the “cool” aura that node.js has. You may think about JavaScript-the-language whatever you want (as long as it contains swear words), but from a platform marketing perspective, Java is being challenged for the first time in a long time – and being “uncool” and backwards-compatible doesn’t help keeping developers interested.

But let’s think long-term, instead of going with trends. Having such a great professional platform like the Java language, the JVM, the JDK, JEE, and much more, is invaluable. Because at the end of the day, the “uncool” backwards-compatibility can also be awesome. As mentioned initially, we have upgraded our integration tests to Java 8. Not a single compilation error, not a single bug. Using Eclipse’s BETA support for Java 8, I could easily transform anonymous classes into lambdas and write awesome things like these upcoming jOOQ 3.4 nested transactions (API not final yet):

ctx.transaction(c1 -> {
    DSL.using(c1)
       .insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
       .values(3, "Doe")
       .execute();

    // Implicit savepoint here
    try {
        DSL.using(c1).transaction(c2 -> {
            DSL.using(c2)
               .update(AUTHOR)
               .set(AUTHOR.FIRST_NAME, "John")
               .where(AUTHOR.ID.eq(3))
               .execute();

            // Rollback to savepoint
            throw new MyRuntimeException("No");
        });
    }

    catch (MyRuntimeException ignore) {}

    return 42;
});

So at the end of the day, Java is great. Java 8 is a tremendous improvement over previous versions, and with great people in the expert groups (and reaching out to the community on social media), I trust that Java 9 will be even better. In particular, I’m looking forward to learning about how these two projects evolve:

Although, again, I am really curious how they will pull these two improvements off from a backwards-compatibility perspective, and what caveats we’ll have to understand, afterwards. ;-)

Anyway, let’s hope the expert groups will continue to provide public feedback on Stack Overflow. Stay tuned for more awesome Java 8 content on this blog.

Three-State Booleans in Java


Every now and then, I miss SQL’s three-valued BOOLEAN semantics in Java. In SQL, we have:

  • TRUE
  • FALSE
  • UNKNOWN (also known as NULL)

Every now and then, I find myself in a situation where I wish I could also express this UNKNOWN or UNINITIALISED semantics in Java, when plain true and false aren’t enough.

Implementing a ResultSetIterator

For instance, when implementing a ResultSetIterator for jOOλ, a simple library modelling SQL streams for Java 8:

SQL.stream(stmt, Unchecked.function(r ->
    new SQLGoodies.Schema(
        r.getString("FIELD_1"),
        r.getBoolean("FIELD_2")
    )
))
.forEach(System.out::println);

In order to implement a Java 8 Stream, we need to construct an Iterator, which we can then pass to the new Spliterators.spliteratorUnknownSize() method:

StreamSupport.stream(
  Spliterators.spliteratorUnknownSize(iterator, 0), 
  false
);

Another example for this can be seen here on Stack Overflow.

When implementing the Iterator interface, we must implement hasNext() and next(). Note that with Java 8, remove() now has a default implementation, so we don’t need to implement it any longer.

While most of the time, a call to next() is preceded by a call to hasNext() exactly once, nothing in the Iterator contract requires this. It is perfectly fine to write:

if (it.hasNext()) {
    // Some stuff

    // Double-check again to be sure
    if (it.hasNext() && it.hasNext()) {

        // Yes, we're paranoid
        if (it.hasNext())
            it.next();
    }
}

How to translate the Iterator calls to backing calls on the JDBC ResultSet? We need to call ResultSet.next().

We could make the following translation:

  • Iterator.hasNext() == !ResultSet.isLast()
  • Iterator.next() == ResultSet.next()

But that translation is:

  • Expensive
  • Not dealing correctly with empty ResultSets
  • Not implemented in all JDBC drivers (Support for the isLast method is optional for ResultSets with a result set type of TYPE_FORWARD_ONLY)

So, we’ll have to maintain a flag, internally, that tells us:

  • If we had already called ResultSet.next()
  • What the result of that call was

Instead of creating a second variable, why not just use a three-valued java.lang.Boolean. Here’s a possible implementation from jOOλ:

class ResultSetIterator<T> implements Iterator<T> {

    final Supplier<? extends ResultSet>  supplier;
    final Function<ResultSet, T>         rowFunction;
    final Consumer<? super SQLException> translator;

    /**
     * Whether the underlying {@link ResultSet} has
     * a next row. This boolean has three states:
     * <ul>
     * <li>null:  it's not known whether there 
     *            is a next row</li>
     * <li>true:  there is a next row, and it
     *            has been pre-fetched</li>
     * <li>false: there aren't any next rows</li>
     * </ul>
     */
    Boolean hasNext;
    ResultSet rs;

    ResultSetIterator(
        Supplier<? extends ResultSet> supplier, 
        Function<ResultSet, T> rowFunction, 
        Consumer<? super SQLException> translator
    ) {
        this.supplier = supplier;
        this.rowFunction = rowFunction;
        this.translator = translator;
    }

    private ResultSet rs() {
        return (rs == null) 
             ? (rs = supplier.get()) 
             :  rs;
    }

    @Override
    public boolean hasNext() {
        try {
            if (hasNext == null) {
                hasNext = rs().next();
            }

            return hasNext;
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
    }

    @Override
    public T next() {
        try {
            if (hasNext == null) {
                rs().next();
            }

            return rowFunction.apply(rs());
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
        finally {
            hasNext = null;
        }
    }
}

As you can see, the hasNext() method locally caches the hasNext three-valued boolean state only if it was null before. This means that calling hasNext() several times will have no effect until you call next(), which resets the hasNext cached state.

Both hasNext() and next() advance the ResultSet cursor if needed.

Readability?

Some of you may argue that this doesn’t help readability. They’d introduce a new variable like:

boolean hasNext;
boolean hasHasNextBeenCalled;

The trouble with this is the fact that you’re still implementing three-valued boolean state, but distributed to two variables, which are very hard to name in a way that is truly more readable than the actual java.lang.Boolean solution. Besides, there are actually four state values for two boolean variables, so there is a slight increase in the risk of bugs.

Every rule has its exception. Using null for the above semantics is a very good exception to the null-is-bad histeria that has been going on ever since the introduction of Option / Optional

In other words: Which approach is best? There’s no TRUE or FALSE answer, only UNKNOWN ;-)

Be careful with this

However, as we’ve discussed in a previous blog post, you should avoid returning null from API methods if possible. In this case, using null explicitly as a means to model state is fine because this model is encapsulated in our ResultSetIterator. But try to avoid leaking such state to the outside of your API.

Java 8 Friday: Let’s Deprecate Those Legacy Libs


At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

For the last two Fridays, we’ve been off for our Easter break, but now we’re back with another fun article:

Let’s Deprecate Those Legacy Libs

d8938bef47ea2f62ed0543dd9e35a483Apart from Lambdas and extension methods, the JDK has also been enhanced with a lot of new library code, e.g. the Streams API and much more. This means that we can critically review our stacks and – to the great joy of Doctor Deprecator – throw out all the garbage that we no longer need.

Here are a couple of them, just to name a few:

LINQ-style libraries

There are lots of libraries that try to emulate LINQ (i.e. the LINQ-to-Collections part). We’ve already made our point before, because we now have the awesome Java 8 Streams API. 5 years from today, no Java developer will be missing LINQ any longer, and we’ll all be Streams-masters with Oracle Certified Streams Developer certifications hanging up our walls.

Don’t get me wrong. This isn’t about LINQ or Streams being better. They’re pretty much the same. But since we now have Streams in the JDK, why worry about LINQ? Besides, the SQLesque syntax for collection querying was misleading anyway. SQL itself is much more than Streams will ever be (or needs to be).

So let’s list a couple of LINQesque APIs, which we’ll no longer need:

LambdaJ

This was a fun attempt at emulating closures in Java through arcane and nasty tricks like ThreadLocal. Consider the following code snippet (taken from here):

// This lets you "close over" the
// System.out.println method
Closure println = closure(); { 
  of(System.out).println(var(String.class));
}

// in order to use it like so:
println.apply("one");
println.each("one", "two", "three");

Nice idea, although that semi-colon after closure(); and before that pseudo-closure-implementation block, which is not really a closure body… all of that seems quite quirky ;-)

Now, we’ll write:

Consumer<String> println = System.out::println;

println.accept("one");
Stream.of("one", "two", "three").forEach(println);

No magic here, just plain Java 8.

Let’s hear it one last time for Mario Fusco and Lambdaj.

Linq4j

Apparently, this is still being developed actively… Why? Do note that the roadmap also has a LINQ-to-SQL implementation in it, including:

Parser support. Either modify a Java parser (e.g. OpenJDK), or write a pre-processor. Generate Java code that includes expression trees.

Yes, we’d like to have such a parser for jOOQ as well. It would allow us to truly embed SQL in Java, similar to SQLJ, but typesafe. But if we have the Streams API, why not implement something like Streams-to-SQL?

We cannot say farewell to Julian Hyde‘s Linq4j just yet, as he’s still continuing work. But we believe that he’s investing in the wrong corner.

Coolection

This is a library with a fun name, and it allows for doing things like…

from(animals).where("name", eq("Lion"))
             .and("age", eq(2))
             .all();

from(animals).where("name", eq("Dog"))
             .or("age", eq(5))
             .all();

But why do it this way, when you can write:

animals.stream()
       .filter(a -> a.name.equals("Lion")
                 && a.age == 2)
       .collect(toList());

animals.stream()
       .filter(a -> a.name.equals("Dog")
                 || a.age == 5)
       .collect(toList());

Let’s hear it for Wagner Andrade. And then off to the bin

Half of Guava

Guava has been pretty much a dump for all sorts of logic that should have been in the JDK in the first place. Take com.google.guava.base.Joiner for instance. It is used for string-joining:

Joiner joiner = Joiner.on("; ").skipNulls();
. . .
return joiner.join("Harry", null, "Ron", "Hermione");

No need, any more. We can now write:

Stream.of("Harry", null, "Ron", "Hermione")
      .filter(s -> s != null)
      .collect(joining("; "));

Note also that the skipNulls flag and all sorts of other nice-to-have utilities are no longer necessary as the Streams API along with lambda expressions allows you to decouple the joining task from the filtering task very nicely.

Convinced? No?

What about:

And then, there’s the whole set of Functional stuff that can be thrown to the bin as well:

https://code.google.com/p/guava-libraries/wiki/FunctionalExplained

Of course, once you’ve settled on using Guava throughout your application, you won’t remove its usage quickly. But on the other hand, let’s hope that parts of Guava will be deprecated soon, in favour of an integration with Java 8.

JodaTime

Now, this one is a no-brainer, as the popular JodaTime library got standardised into the java.time packages. This is great news.

Let’s hear it for “Joda” Stephen Colebourne and his great work for the JSR-310.

Apache commons-io

The java.nio packages got even better with new methods that nicely integrate with the Streams API (or not). One of the main reasons why anyone would have ever used Apache Commons IO was the fact that it is horribly tedious to read files prior to Java 7 / 8. I mean, who would’ve enjoyed this piece of code (from here):

try (RandomAccessFile file = 
     new RandomAccessFile(filePath, "r")) {
    byte[] bytes = new byte[size];
    file.read(bytes);
    return new String(bytes); // encoding?? ouch!
}

Over this one?

List<String> lines = FileUtils.readLines(file);

But forget the latter. You can now use the new methods in java.nio.file.Files, e.g.

List<String> lines = Files.readAllLines(path);

No need for third-party libraries any longer!

Serialisation

Throw it all out, for there is JEP 154 deprecating serialisation. Well, it wasn’t accepted, but we could’ve surely removed about 10% of our legacy codebase.

A variety of concurrency APIs and helpers

With JEP 155, there had been a variety of improvements to concurrent APIs, e.g. to ConcurrentHashMaps (we’ve blogged about it before), but also the awesome LongAdders, about which you can read a nice article over at the Takipi blog.

Haven’t I seen a whole com.google.common.util.concurrent package over at Guava, recently? Probably not needed anymore.

JEP 154 (Serialisation) wasn’t real

It was an April Fools’ joke, of course…

Base64 encoders

How could this take so long?? In 2003, we’ve had RFC 3548, specifying Base16, Base32, and Base64 data encodings, which was in fact based upon base 64 encoding specified in RFC 1521, from 1993, or RFC 2045 from 1996, and if we’re willing to dig further into the past, I’m sure we’ll find earlier references to this simple idea of encoding binary data in text form.

Now, in 2014, we finally have JEP 135 as a part of the JavaSE8, and thus (you wouldn’t believe it): java.util.Base64.

Off to the trash can with all of these libraries!

… gee, it seems like everyone and their dog worked around this limitation, prior to the JDK 8…

More?

Provide your suggestions in the comments! We’re curious to hear your thoughts (with examples!)

Conclusion

As any Java major release, there is a lot of new stuff that we have to learn, and that allows us to remove third-party libraries. This is great, because many good concepts have been consolidated into the JDK, available on every JVM without external dependencies.

Disclaimer: Not everything in this article was meant seriously. Many people have created great pieces of work in the past. They have been very useful, even if they are somewhat deprecated now. Keep innovating, guys! :-)

Want to delve more into the many new things Java 8 offers? Go have a look over at the Baeldung blog, where this excellent list of Java 8 resources is featured:

http://www.baeldung.com/java8

… and stay tuned for our next Java 8 Friday blog post, next week!

We’re Hacking JDBC, so You Don’t Have To


We love working with JDBC

Said no one. Ever.

On a more serious note, JDBC is actually a very awesome API, if you think about it. It is probably also one of the very reasons Java has become the popular platform it is today. Before the JDK 1.1, and before ODBC (and that’s a very long time ago) it was hard to imagine any platform that would standardise database access at all. Heck, SQL itself was hardly even standardised at the time and along came Java with JDBC, a simple API with only few items that you have to know of in every day work:

  • Connection: the object that models all your DB interactions
  • PreparedStatement: the object that lets you execute a statement
  • ResultSet: the object that lets you fetch data from the database

That’s it!

Back to reality

That was the theory. In practice, enterprise software operating on top of JDBC quickly evolved towards this:

Hacking JDBC. Image copyright information on this page

JDBC is one of the last resorts for Java developers, where they can feel like real hackers, hacking this very stateful, very verbose, very arcane API in many ways. Pretty much everyone operating on JDBC will implement wrappers around the API to prevent at least:

  • Common syntax errors
  • Bind variable index mismatches
  • Dynamic SQL construction
  • Edge cases around the usage LOBs
  • Resource handling and closing
  • Array and UDT management
  • Stored procedure abstraction

… and so much more.

So while everyone is doing the above infrastructure work, they’re not working on their business logic. And pretty much everyone does these things, when working with JDBC. Hibernate and JPA do not have most these problems, but they’re not SQL APIs any longer, either.

Here are a couple of examples that we have been solving inside of jOOQ, so you don’t have to:

How to fetch generated keys in some databases

case DERBY:
case H2:
case MARIADB:
case MYSQL: {
    try {
        listener.executeStart(ctx);
        result = ctx.statement().executeUpdate();
        ctx.rows(result);
        listener.executeEnd(ctx);
    }

    // Yes. Not all warnings may have been consumed yet
    finally {
        consumeWarnings(ctx, listener);
    }

    // Yep. Should be as simple as this. But it isn't.
    rs = ctx.statement().getGeneratedKeys();

    try {
        List<Object> list = new ArrayList<Object>();

        // Some JDBC drivers seem to illegally return null
        // from getGeneratedKeys() sometimes
        if (rs != null) {
            while (rs.next()) {
                list.add(rs.getObject(1));
            }
        }

        // Because most JDBC drivers cannot fetch all
        // columns, only identity columns
        selectReturning(ctx.configuration(), list.toArray());
        return result;
    }
    finally {
        JDBCUtils.safeClose(rs);
    }
}

How to handle BigInteger and BigDecimal

else if (type == BigInteger.class) {
    // The SQLite JDBC driver doesn't support BigDecimals
    if (ctx.configuration().dialect() == SQLDialect.SQLITE) {
        return Convert.convert(rs.getString(index),
                               (Class) BigInteger.class);
    }
    else {
        BigDecimal result = rs.getBigDecimal(index);
        return (T) (result == null ? null :
                    result.toBigInteger());
    }
}
else if (type == BigDecimal.class) {
    // The SQLite JDBC driver doesn't support BigDecimals
    if (ctx.configuration().dialect() == SQLDialect.SQLITE) {
        return Convert.convert(rs.getString(index),
                               (Class) BigDecimal.class);
    }
    else {
        return (T) rs.getBigDecimal(index);
    }
}

How to fetch all exceptions from SQL Server

switch (configuration.dialect().family()) {
    case SQLSERVER:
        consumeLoop: for (;;)
            try {
                if (!stmt.getMoreResults() &&
                     stmt.getUpdateCount() == -1)
                    break consumeLoop;
            }
            catch (SQLException e) {
                previous.setNextException(e);
                previous = e;
            }
}

Convinced?

This is nasty code. And we have more examples of nasty code here, or in our source code.

All of these examples show that when working with JDBC, you’ll write code that you don’t want to / shouldn’t have to write in your application. This is why…

we have been hacking JDBC, so you don’t have to

Follow

Get every new post delivered to your Inbox.

Join 1,628 other followers

%d bloggers like this: