An MS Access SQL Transformation Odyssey

Recently, we’ve added support for the MS Access database in jOOQ 3.3. This may well have been our hardest integration so far. The MS Access database has its ways, and many of them, too. But luckily, jOOQ’s internal SQL transformation abilities are already very advanced, as we’ve shown previously in a blog post about the emulation of row value expression IN predicates.

In this post here, we want to show you how we can emulate a multi-value INSERT statement transparently through various steps until it works also for MS Access.

Consider the following standard SQL-92 INSERT statement:

INSERT INTO books (author, title)
VALUES ('George Orwell', '1984'),
       ('Leo Tolstoy', 'War and Peace');

With jOOQ, we can trivially write it like this:

DSL.using(configuration)
   .insertInto(BOOKS, AUTHOR, TITLE)
   .values("George Orwell", "1984")
   .values("Leo Tolstoy", "War and Peace")
   .execute();

See the manual for more details. The above multi-record INSERT syntax is supported by a variety of databases, but not by these ones:

  • Firebird
  • Ingres
  • MS Access
  • Oracle
  • SQLite
  • Sybase Adaptive Server Enterprise

But luckily, the above syntax can be emulated using INSERT .. SELECT

INSERT INTO books (author, title)
SELECT 'George Orwell', '1984'
UNION ALL
SELECT 'Leo Tolstoy', 'War and Peace';

Note that some databases require a FROM clause in most SQL statements. So does MS Access. A simple way to emulate what Oracle calls DUAL with MS Access is this:

INSERT INTO books (author, title)
SELECT 'George Orwell', '1984'
FROM (SELECT count(*) FROM MSysResources) AS DUAL
UNION ALL
SELECT 'Leo Tolstoy', 'War and Peace';
FROM (SELECT count(*) FROM MSysResources) AS DUAL

For simplicity, let’s just assume that DUAL actually exists:

INSERT INTO books (author, title)
SELECT 'George Orwell', '1984'
FROM DUAL
UNION ALL
SELECT 'Leo Tolstoy', 'War and Peace';
FROM DUAL

But this syntax is not supported, again, by MS Access as you can see in the manual.

It reads:

Syntax
Multiple-record append query:

INSERT INTO target [(field1[, field2[, …]])] 
[IN externaldatabase]
SELECT field1[, field2[, …]
FROM tableexpression

tableexpression: The name of the table or tables from which records are inserted. This argument can be a single table name or a compound resulting from an INNER JOIN, LEFT JOIN, or RIGHT JOIN operation or a saved query.

There’s clearly no room for UNION ALL clauses, but we can use a “saved query” in the FROM clause. Given our original intent, this would roughly translate to:

INSERT INTO books (author, title)
SELECT *
FROM (
  SELECT 'George Orwell', '1984'
  FROM DUAL
  UNION ALL
  SELECT 'Leo Tolstoy', 'War and Peace';
  FROM DUAL
)

… or so you might think. Unfortunately, the above attempt results into this error message:

SELECT * cannot be used in an INSERT INTO query when the source or destination table contains a multi-valued field

So we need to explicitly select each column from our newly created derived table. But those columns don’t have names (yet). The standard way to assign names to a derived table’s columns is by using derived column lists, which rename both the table and all of its columns in one go. We’ve previously blogged about this here.

For our SQL statement, this means:

INSERT INTO books (author, title)
SELECT a, b
FROM (
  SELECT 'George Orwell', '1984'
  FROM DUAL
  UNION ALL
  SELECT 'Leo Tolstoy', 'War and Peace';
  FROM DUAL
) t(a, b)

By now, you might have guessed that our SQL transformation odyssey is not over yet. MS Access does not support derived column lists, and it’s in good company. These databases don’t support them either:

  • H2
  • Ingres
  • MariaDB
  • MS Access
  • MySQL
  • Oracle
  • SQLite

This means that we need to emulate yet another SQL feature, which results in the following query:

INSERT INTO books (author, title)
SELECT a, b
FROM (
  -- This subselect defines column names
  SELECT '' AS a, '' AS b
  FROM DUAL
  WHERE 1 = 0
  UNION ALL
  -- This subselect provides our data
  SELECT *
  FROM (
    SELECT 'George Orwell', '1984'
    FROM DUAL
    UNION ALL
    SELECT 'Leo Tolstoy', 'War and Peace';
    FROM DUAL
  ) t
) t

And we’re set. Almost. Let’s substitute the actual DUAL expression back into the query

INSERT INTO books (author, title)
SELECT a, b
FROM (
  -- This subselect defines column names
  SELECT '' AS a, '' AS b
  FROM (SELECT count(*) FROM MSysResources) AS DUAL
  WHERE 1 = 0
  UNION ALL
  -- This subselect provides our data
  SELECT *
  FROM (
    SELECT 'George Orwell', '1984'
    FROM (SELECT count(*) FROM MSysResources) AS DUAL
    UNION ALL
    SELECT 'Leo Tolstoy', 'War and Peace';
    FROM (SELECT count(*) FROM MSysResources) AS DUAL
  ) t
) t

Now ain’t that a beauty!? tweet this

Real life scenario

In real life, you would probably work around this limitation in one way or another because no one manually writes (or reads and maintains) such SQL. You would probably resort to multiple executions of single-record INSERT statements, to batch processing, or whatever. But in real life, you would also support SQL Server or Oracle or some other database in addition to MS Access, and you’d constantly run into these kinds of issues which you’d have to patch manually. This can get very frustrating!

Unless, of course, you use jOOQ and forget about the above details in order to simply write the following, intuitive, standard SQL statement:

DSL.using(configuration)
   .insertInto(BOOKS, AUTHOR, TITLE)
   .values("George Orwell", "1984")
   .values("Leo Tolstoy", "War and Peace")
   .execute();

… which works precisely like this in all of jOOQ’s 16 supported RDBMS. What are you waiting for?

When All Else Fails: Using “the Unsafe”

Sometimes you have to hack. You just have to. Don’t listen to XKCD. You don’t always regret hacking. On our blog, we’ve shown a couple of hacks before:

But we’ve just been scratching the surface. Our friends at ZeroTurnaround / RebelLabs have recently published an awesome article about how to use “the Unsafe”. The sun.misc.Unsafe class to directly access memory in Java. While the first page introduces us to the Unsafe object itself and how to access it through reflection …

public static Unsafe getUnsafe() {
    try {
        Field f = Unsafe.class
            .getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    } catch (Exception e) { 
        /* ... */ 
    }
}

… subsequent sections nicely explain how to map “unsafe” memory access methods to addressing a Class in memory, of objects in memory

// If you're daring, go manipulate the heap directly!
Object helperArray[] = new Object[1];
helperArray[0] = targetObject;
long baseOffset = 
    unsafe.arrayBaseOffset(Object[].class);
long addressOfObject =
    unsafe.getLong(helperArray, baseOffset);

However, don’t think it’s so easy. In order to manipulate the Java heap directly, you will need to understand a lot about the various fields and flags in class headers, and you’ll always need to remember to distinguish between 32-bit and 64-bit JVMs.

This particular article was written at RebelLabs by Serkan Özal whose Open Source profile indicates that he’s a real hacker and “the Unsafe” is his home away from home.

Free Java Profiling with Oracle Java Mission Control

Do you profile your code using JProfiler or YourKit? You should, because their licenses are worth every penny when you can find a very hidden and subtle bottleneck deep down in your application.

For instance, the following chart shows nicely that there was a significant performance issue in jOOQ related to reflection:

Before...

Before…

I’ll spare you the details about the fix, but the RecordMapper.map() method is where the magic happens. The DefaultRecordMapperProvider.provide() method is really just boring initialisation. No way should it gobble up 96% of the benchmark. So the fix was applied and we profile again:

After... Yay!

After… Yay!

As you can see, the RecordMapper.map() is now using up most of the CPU time (the whole thing accelerated from 134 seconds down to 1.4 seconds in a benchmark)

Now, profilers cost some money and if money for the license is an issue for you, then there is good news! Since JDK 7u40, Oracle now ships their Oracle Java Mission Control (JMC) for its Hotspot VM and you can use it for free in development environments (not in production). But that’s more than enough!

JMC isn’t quite as powerful yet as JProfiler or YourKit. But again, you’re doing this because you want to save money, and that’s better than just staring at the JMX console or throwing random thread dumps around your console.

For more information, visit the following website:
http://download.oracle.com/technology/products/missioncontrol/updatesites/base/5.2.0/eclipse/

Java 8 Friday Goodies: Map Enhancements

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub. tweet this

Java 8 Goodie: Map Enhancements

In previous posts, we’ve already dealt with a couple of new Streams features, for instance when sorting. Most API improvements are indeed part of the new Streams API. But a few nice methods were also added to java.util.List and most importantly, to java.util.Map. If you want a quick overview of feature additions, go to the JDK8 Javadoc and click on the new “Default Methods” tab:

java.util.Map default methods

java.util.Map default methods

For backwards-compatibility reasons, all new methods added to Java interfaces are in fact default methods. So we have a couple of exciting new additions!

compute() methods

Often, we fetch a value from a map, make some calculations on it and put it back into the map. This can be verbose and hard to get right if concurrency is involved. With Java 8, we can pass a BiFunction to the new compute(), computeIfAbsent(), or computeIfPresent() methods and have the Map implementation handle the semantics of replacing a value.

The following example shows how this works:

// We'll be using this simple map
// Unfortunately, still no map literals in Java 8..
Map<String, Integer> map = new HashMap<>();
map.put("A", 1);
map.put("B", 2);
map.put("C", 3);

// Compute a new value for the existing key
System.out.println(map.compute("A", 
    (k, v) -> v == null ? 42 : v + 41));
System.out.println(map);

// This will add a new (key, value) pair
System.out.println(map.compute("X", 
    (k, v) -> v == null ? 42 : v + 41));
System.out.println(map);

The output of the above program is this:

42
{A=42, B=2, C=3}
42
{A=42, B=2, C=3, X=42}

This is really useful for ConcurrentHashMap, which ships with the following guarantee:

The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.

forEach() method

This is a really nice goodie which lets you pass a method reference or a lambda to receive (key, value) pairs one by one. A trivial example would be this:

map.forEach((k, v) -> 
    System.out.println(k + "=" + v));

Its output being:

A=1
B=2
C=3

merge() method

Now this one is really not so easy to understand. The Javadoc uses this example here:

map.merge(key, msg, String::concat)

Given the following contract:

If the specified key is not already associated with a value or is associated with null, associates it with the given value. Otherwise, replaces the value with the results of the given remapping function, or removes if the result is null.

So, the above code translates to the following atomic operation:

String value = map.get(key);
if (value == null)
    map.put(key, msg);
else
    map.put(key, value.concat(msg));

This is certainly not an everyday functionality and might just have leaked from an implementation to the top-level API. Additionally, if the map already contains null (so, null values are OK), and your remappingFunction returns null, then the entry is removed. That’s quite unexpected. Consider the following program:

map.put("X", null);
System.out.println(map.merge(
    "X", null, (v1, v2) -> null));
System.out.println(map);

Its output is:

null
{A=1, B=2, C=3}

Update: I first wrote the above code first with JDK 8 build 116. With build 129, things have changed completely again. First off, the value passed to merge() is not allowed to be null. Secondly. nullvalues are treated by merge() just like absent values. To produce the same output, we’ll write:

map.put("X", 1);
System.out.println(map.merge(
    "X", 1, (v1, v2) -> null));
System.out.println(map);

This merge() operation has thus removed a value from the map. That’s probably OK because the semantics of “merge” is often a combination of INSERT, UPDATE, and DELETE if we’re using SQL-speak. And a somewhat reasonable way to indicate that a value should be removed is to return null from such a function.

But the map is allowed to contain null values, which can never be inserted into the map using merge(). tweet this

getOrDefault()

This is a no-brainer. Right? Right! Wrong!

Unfortunately, there are two types of Maps. Those supporting null keys and/or values and those who don’t support nulls. While the previous merge() method didn’t distinguish between a map not containing a key and a map containing a key with a null value, this new getOrDefault() only returns the default when the key is not contained. It won’t protect you from a NullPointerException:

map.put("X", null);
try {
  System.out.println(map.getOrDefault("X", 21) + 21);
}
catch (NullPointerException nope) {
  nope.printStackTrace();
}

That’s quite a bummer. In general, it can be said the Map API has become even more complex with respect to nulls. tweet this

Trivial additions

There are a few more methods, like putIfAbsent() (pulled up from ConcurrentHashMap, remove() (with key and value arguments), replace().

Conclusion

All in all, it can be said that a lot of atomic operations have made it to the top-level Map API, which is good. But then again, the pre-existing confusion related to the semantics of null in maps has deepened. The terminologies “present” vs. “absent”, “contains”, “default” don’t necessarily help clarifying these things, which is surprisingly against the rules of keeping an API consistent and most importantly, regular. Thus as a consumer of this API, ideally, you should keep null out of maps, both as keys and as values!

Next week in this blog series, we’re going to look at how Java 8 will allow you to define local transactional scope very easily, so stay tuned!

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

No CROSS JOIN in MS Access

For the upcoming jOOQ 3.3, we’re now integrating support for the MS Access database through the JDBC-ODBC bridge, which is included in the JDK up until Java SE 7. Note that it will be removed in Java 8! Alternative access to access databases (pun intended) can be obtained through a hack involving ucanaccess, which is basically combining the HSQLDB parser with Jackcess, a low-level I/O library for MS Access database files.

MS Access is still an immensely popular relational database, ranking in the top 10 at db-engines.com’s DBMS ranking. Yet it has quirks. Many of them. One is the fact that there is no formal CROSS JOIN operation. Remember, most databases support explicit CROSS JOINing as such:

SELECT     p1.name player1, 
           p2.name player2
FROM       player p1 
CROSS JOIN player p2

The above query will generate all the pairings between two players.
The same can be written in pre-ANSI SQL-92 with comma-separated table lists:

SELECT     p1.name player1, 
           p2.name player2
FROM       player p1, 
           player p2

The first syntax, however, is more powerful and more expressive as it can be used in nested JOIN expressions and it shows your intent more clearly. A nice example was given here in a previous blog post.

How to work around a missing CROSS JOIN

Usually, missing support for CROSS JOIN can be emulated trivially using an INNER JOIN with a TRUE predicate as such:

SELECT     p1.name player1, 
           p2.name player2
FROM       player p1 
JOIN       player p2
ON         1 = 1

This is what jOOQ does for the Sybase Adaptive Server Enterprise database. But this doesn’t work for MS Access, because the JOIN operation there explicitly requires column references from either table on either side. The documentation reads:

Syntax:

FROM table1 INNER JOIN table2 
ON table1.field1 compopr table2.field2

This is quite a bummer from many points of view, not only for CROSS JOIN emulation. Given that any ANSI-92 JOIN syntax can be transformed into an ANSI-86 join expression (table list in the FROM clause and all predicates in the WHERE clause), it is also a bit surprising.

A simple workaround that seems to work for some use-cases is to take any numeric column from either table, and multiply it by zero:

SELECT     p1.name player1, 
           p2.name player2
FROM       player p1 
JOIN       player p2
ON         p1.id * 0 = p2.id * 0

But if the database itself is already this quirky, I suspect that it might not be able to optimise the above SQL.

In short…

MS Access does not support CROSS JOIN. For the time being, try to work around it using comma-separated table lists, while we work out more sophisticated SQL transformation in jOOQ.

jOOQ Newsletter: February 12, 2014

Subscribe to the newsletter here

Tweet of the Day

Our followers, users and customers are shouting to the world. Here are:

The Data Geekery Business Case at RedHat’s opensource.com

In our last newsletter, we have promoted our cooperation with RedHat’s content marketing team regarding our Open Source vendor business model. In the mean time, we have published another article on their platform about the 5 lessons learned for any business transitioning to a revenue-based model with Open Source. We recommend this interesting read to anyone who plans to make money with Open Source:

http://opensource.com/business/14/1/5-lessons-open-source-revenue-based-model

Our cooperation with other Open Source projects is a crucial part of our marketing strategy. Our ASL 2.0 / commercial dual licensing model might prove to be a viable use-case also for other Open Source projects building on top of jOOQ. Examples of such projects that are currently investigating an integration are

Also in the future, we’ll make an effort to better integrate with such projects to help spread the good jOOQ news to the world.

Community Work

It looks like great community-contributed work doesn’t stop appearing around jOOQ. We’ve worked together with Loiane Groner who has written this great tutorial about jOOQ in Portuguese. Loiane is consultant working for IBM and a well-known writer of books that mostly talk about Sencha Ext JS. She has also been promoting MyBatis in the past. We’re looking forward to more from her!

Ben Hood has been a long-term jOOQ aficionado who had been very active on the user group. He has taken inspiration from jOOQ to build his own DSL for Apache Cassandra, calling it CQLC, which is great news for the Cassandra community!Now, we’re looking forward to other APIs that cover MongoDB or Neo4j!

And just recently, Breandan Considine (who has lost ten pounds because of jOOQ) published great video tutorial for jOOQ, Gradle and IntelliJ IDEA users

Upcoming Events

In January, we have been visiting probably the largest JUG in Germany, theRheinJUG in Düsseldorf and also the JUGM in Munich. The German-speakers among you can watch the whole RheinJUG presentation. The audience was very very interactive and has left the talk with thousands of insights to continue discussing, as our talks inspire not only jOOQ business, but also SQL love!

Here is an overview of our upcoming events.

With our new SQL-talk, we’ll no longer just spread some jOOQ love, but also some SQL love in general. We believe that SQL deserves more presence in today’s software engineering talks, and who would be better to talk about SQL than us? Are you interested in hosting this talk at your company? Contact us!

Stay informed about 2014 events on www.jooq.org/news.

Java Zone – Java 8

In this section of the newsletter, we usually talk about SQL in the “SQL Zone”, but our other beloved technology is Java and exciting times are ahead in 2014 for us Java folks. This is why we have started a new series in our blog, the Java 8 Friday Goodies.

In this series, we talk about little everyday things that improve / impair / or just change with Java 8 when we code. Some examples:

If you haven’t already, sign up for the Java 8 Friday Goodies blog category!

Why Everyone Hates Operator Overloading

… no, don’t tell me you like Perl. Because you don’t. You never did. It does horrible things. It makes your code look like…

Perl made heavy use of operator overloading and used operators for a variety of things. A similar tendency can be seen in C++ and Scala. See also people comparing the two. So what’s wrong with operator overloading?

People never agreed whether Scala got operator overloading right or wrong:

Usually, people then cite the usual suspects, such as complex numbers (getting things right):

class Complex(val real:Int, 
              val imaginary:Int) {
    def +(operand:Complex):Complex = {
        new Complex(real + operand.real, 
                    imaginary + operand.imaginary)
    }
 
    def *(operand:Complex):Complex = {
        new Complex(real * operand.real - 
                    imaginary * operand.imaginary,
            real * operand.imaginary + 
            imaginary * operand.real)
    }
}

The above will now allow for adding and multiplying complex numbers, and there’s absolutely nothing wrong with that:

val c1 = new Complex(1, 2)
val c2 = new Complex(2, -3)
val c3 = c1 + c2
 
val res = c1 + c2 * c3

But then, there are these weirdo punctuation things that make average programmers simply go mad:

 ->
 ||=
 ++=
 <=
 _._
 ::
 :+=

Don’t believe it? Check out this graph library!

To the above, we say:

Operator Overloading? Meh

How operator overloading should be

Operator overloading can be good, but mostly isn’t. In Java, we’re all missing better ways to interact with BigDecimal and similar types:

// How it is:
bigdecimal1.add(bigdecimal2.multiply(bigdecimal3));

// How it should be:
bigdecimal1 + bigdecimal2 * bigdecimal3

Of course, operator precedence would take place as expected. Unlike C++ or Scala, ideal operator overloading would simply map common operators to common method names. Nothing more. No one really wants API developers to come up with fancy ##-%>> operators.

While Ceylon, Groovy, and Xtend implemented this in a somewhat predictable and useful way, Kotlin is probably the language that has implemented the best standard operator overloading mechanism into their language. Their documentation states:

Binary operations

Expression Translated to
a + b a.plus(b)
a – b a.minus(b)
a * b a.times(b)
a / b a.div(b)
a % b a.mod(b)
a..b a.rangeTo(b)

That looks pretty straightforward. Now check this out:

“Array” access

Symbol Translated to
a[i] a.get(i)
a[i, j] a.get(i, j)
a[i_1, …, i_n] a.get(i_1, …, i_n)
a[i] = b a.set(i, b)
a[i, j] = b a.set(i, j, b)
a[i_1, …, i_n] = b a.set(i_1, …, i_n, b)

Now, I really don’t see a single argument against the above. This goes on, and unfortunately, Java 8 has missed this train, as method references cannot be assigned to variables and invoked like JavaScript functions (although, that’s not too late for Java 9+):

Method calls

Symbol Translated to
a(i) a.invoke(i)
a(i, j) a.invoke(i, j)
a(i_1, …, i_n) a.invoke(i_1, …, i_n)

Simply beautiful!

Conclusion

We’ve recently blogged about Ceylon’s awesome language features. But the above Kotlin features are definitely a killer and would remove any other sorts of desires to introduce operator overloading in Java for good.

Let’s hope future Java versions take inspiration from Kotlin, a language that got operator overloading right.