jOOQ vs. Hibernate: When to Choose Which

Hibernate has become a de-facto standard in the Java ecosystem and, after the fact, also an actual Java EE standard implementation – if standards matter to you, and if you put the JCP on the same level as ISO, ANSI, or IEEE.

This article is not intended to discuss standards, but visions. Hibernate shares JPA's vision of ORM. jOOQ shares SQL's vision of powerful querying, so for the sake of the argument, let's use Hibernate / JPA / ORM interchangeably, much like jOOQ / JDBC / SQL.

The question of why anyone should not use Hibernate these days shows up frequently – precisely because Hibernate is a de-facto standard and the first framework choice in many other frameworks, such as Grails (which uses GORM, which in turn uses Hibernate).

However, even Gavin King, the creator of Hibernate, doesn't believe that Hibernate should be used for everything.

If that's the case, are there any objective criteria that help you decide when to use an ORM and when to use SQL?

Discussing on a high level

First off, let’s bring this discussion to a higher level. Instead of deciding between Hibernate and jOOQ as concrete implementations of their own domains, let’s think about ORM vs. SQL, and their different use-cases.

When deciding between an ORM (e.g. Hibernate) and SQL (e.g. jOOQ), the driving question that you should ask yourself is not the question of project complexity. Some of our most demanding customers are using jOOQ on medium-sized schemas with thousands of tables / views. Often, those schemas are extremely normalised and sometimes even deployed on as many as six different RDBMS. jOOQ was specifically designed to work in these scenarios, while keeping the simple use-case in mind as well.

So, instead of thinking about project complexity, ask yourself the following questions:

  • 1. Will your data model drive your application design, or will your application design drive your data model(s)?

    A main aspect here is whether you "care" about your database, in the sense that it might survive your application. Very often, applications come and go. They may be rewritten in Python, JavaScript, etc. five years down the line. Or you have multiple applications accessing the same database: your Java application, some Perl scripts, stored procedures, etc. If that is the case, database design is a priority in your project, and jOOQ works extremely well in these setups.

    If you don’t necessarily “care” about your database in the sense that you just want to “persist” your Java domain somewhere, and this happens to be a relational database, then Hibernate might be a better choice – at least in early stages of your project, because you can easily generate your database schema from your Entity model.

  • 2. Will you do mostly complex reading and simple writing, or will you engage in complex writing?

    SQL really shines when reading is complex: when you join many tables, when you aggregate data in your database, when you do reporting, when you do bulk reading and writing. You think of your data in terms of set theory, i.e. of your data as a whole. Writing CRUD with SQL is boring, though. This is why jOOQ also provides you with an ActiveRecord-style API that handles the boring parts when you're operating on single tables (see the sketch after this list).

    If, however, your writing becomes complex, i.e. you have to load a complex object graph involving 20 entities into memory, perform optimistic locking on it, modify it in many different ways and then persist it again in one go, then SQL / jOOQ will not help you. This is what Hibernate was originally created for.
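
To illustrate the ActiveRecord-style CRUD API mentioned above, here is a minimal sketch using jOOQ's UpdatableRecord. The BOOK table and BookRecord class are hypothetical jOOQ-generated artefacts, not part of the original discussion:

DSLContext ctx = DSL.using(connection, SQLDialect.ORACLE);

// Create: a new, attached record for the (hypothetical) BOOK table
BookRecord book = ctx.newRecord(BOOK);
book.setTitle("SQL Performance Explained");
book.store();   // issues an INSERT, as the record is new

// Read and update: the record tracks which fields have changed
book = ctx.fetchOne(BOOK, BOOK.ID.eq(book.getId()));
book.setTitle("SQL Performance Explained, 2nd Edition");
book.store();   // issues an UPDATE of the changed TITLE column only

// Delete
book.delete();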

Opinion

I believe that data is forever. You should *always* assume that your database survives your application. It is much easier to rewrite (parts of) an application than to migrate a database. Having a clean and well-designed database schema will always pay off over the lifetime of a project, especially a complex one. See also our previous article about the fallacy of "schemaless" databases.

Also, most projects really do 90% reading and 10% writing, writing often not being complex (2-3 tables modified within a transaction). This means that most of the time, the complexity solved by Hibernate / JPA’s first and second level caches is not needed. People often misunderstand these features and simply turn off caching, flushing Hibernate’s cache to the server all the time, and thus using Hibernate in the wrong way.

If, however, you’re undecided about the above two axes of decision, you can go the middle way and use jOOQ only for reporting, batch processing, etc. and use Hibernate for your CRUD – in a CQRS (Command Query Responsibility Segregation: http://martinfowler.com/bliki/CQRS.html) style. There are also quite a few jOOQ users who have chosen this path.

Further reading

10 Java Articles Everyone Must Read

One month ago, we published a list of 10 SQL Articles Everyone Must Read – a list of articles that we believe add exceptional value to readers of the jOOQ blog. The jOOQ blog focuses on both Java and SQL, so it is only natural that today, one month later, we're publishing an equally exciting list of 10 Java articles everyone must read.

Note that by “must read”, we may not specifically mean the particular linked article only, but also other works from the same authors, who have been regular bloggers over the past years and never failed to produce new interesting content!

Here goes…

1. Brian Goetz: “Stewardship: the Sobering Parts”

The first blog post is actually not a blog post, but a recording of a very interesting talk by Brian Goetz on Oracle's stewardship of Java. On the jOOQ blog, we've been slightly critical of 1-2 features of the Java language in the past, e.g. when comparing it to Scala or Ceylon.

Brian makes good points about why it would not be a good idea for Java to become just as "modern" as quickly as other languages. A must-watch for every Java developer (around 1h).

2. Aleksey Shipilёv: The Black Magic of (Java) Method Dispatch

In recent years, the JVM has seen quite a few improvements, including invokedynamic that arrived in Java 7 as a prerequisite for Java 8 lambdas, as well as a great tool for other, more dynamic languages built on top of the JVM, such as Nashorn.

invokedynamic is only a small, "high level" puzzle piece in the advanced trickery performed by the JVM. What really happens under the hood when you call methods? How are they resolved and optimised by the JIT? The sub-title of Aleksey's article reveals what it is really about:

“Everything you wanted to know about Black Deviously Surreptitious Magic in low-level performance engineering”

Definitely not a simple read, but a great post to learn about the power of the JVM.

Read Aleksey's "The Black Magic of (Java) Method Dispatch"

3. Oliver White: Java Tools and Technologies Landscape for 2014

We're already in 2015, but this report by Oliver White (at the time head of ZeroTurnaround's RebelLabs) was exceptionally well executed and touches on pretty much everything related to the Java ecosystem.

Read Oliver's "Java Tools and Technologies Landscape for 2014"

4. Peter Lawrey: Java Lambdas and Low Latency

Where Aleksey introduced us to some performance semantics of the JVM, Peter takes things one step further, talking about low latency in Java 8. We could have picked many other useful little blog posts from Peter's blog, which is all about low-latency, high-performance computing on the JVM, sometimes even involving advanced off-heap trickery.

Read Peter's "Java Lambdas and Low Latency"

5. Nicolai Parlog: Everything You Need To Know About Default Methods

Nicolai is a newcomer to the Java blogosphere, and a very promising one, too. His well-researched articles go in depth on some interesting facts related to Java 8, digging out old e-mails from the expert group's mailing list and explaining the decisions that led to what we call Java 8 today.

Read Nicolai's "Everything You Need To Know About Default Methods"

6. Lukas Eder: 10 Things You Didn’t Know About Java

This list wouldn't be complete without a list that we wrote ourselves on the jOOQ blog. Java is an old beast, turning 20 years old in 2015. This old beast has a lot of secrets and caveats that many people have forgotten or never thought about. We've uncovered them for you:

Read Lukas's "10 Things You Didn't Know About Java"

7. Edwin Dalorzo: Why There Is Interface Pollution in Java 8

Edwin has responded to our blog posts a couple of times in the past with very well-researched and thoroughly thought-through articles, in particular about Java 8 features, e.g. comparing Java 8 Streams with LINQ (something that we've done ourselves as well).

This particular article explains why there are so many different and differently named functional interfaces in Java 8.

Read Edwin's "Why There Is Interface Pollution in Java 8"

8. Vlad Mihalcea: How Does PESSIMISTIC_FORCE_INCREMENT Lock Mode Work

When Java talks to databases, many people default to using Hibernate for convenience (see also 3. Oliver White: Java Tools and Technologies Landscape for 2014). Hibernate’s main vision, however, is not to add convenience – you can get that in many other ways as well. Hibernate’s main vision is to provide powerful means of navigating and persisting an object graph representation of your RDBMS’s data model, including various ways of locking.

Vlad is an extremely proficient Hibernate user who runs an ongoing blog series about how Hibernate works. We've picked a recent, well-researched article about locking, but we strongly suggest you read the other articles as well:

Read Vlad's "How Does PESSIMISTIC_FORCE_INCREMENT Lock Mode Work"

9. Petri Kainulainen: Writing Clean Tests

This isn’t a purely Java-related blog post, although it is written from the perspective of a Java developer. Modern development involves testing – automatic testing – and lots of it. Petri has written an interesting blog series about writing clean tests in Java – you shouldn’t miss his articles!

Read Petri's "Writing Clean Tests"

10. Eugen Paraschiv: Java 8 Resources Collection

If you don’t already have at least 9 open tabs with interesting stuff to read after this list, get ready for a browser tab explosion! Eugen Paraschiv who maintains baeldung.com has been collecting all sorts of very interesting resources related to Java 8 in a single link collection. You should definitely bookmark this collection and check back frequently for interesting changes:

Read Eugen's "Java 8 Resources Collection"

Many other articles

There are, of course, many other very good articles providing deep insight into useful Java tricks. If you find you’ve encountered an article that would nicely complement this list, please leave a link and description in the comments section. Future readers will appreciate the additional insight.

Leaky Abstractions, or How to Bind Oracle DATE Correctly with Hibernate

We've recently published an article about how to bind the Oracle DATE type correctly in SQL / JDBC and jOOQ. The article got a bit of traction on reddit, with an interesting remark by Vlad Mihalcea, who frequently blogs about Hibernate, JPA, transaction management, and connection pooling on his blog. Vlad pointed out that this problem can also be solved with Hibernate, and we're going to look into this shortly.

What is the problem with Oracle DATE?

The problem presented in the previous article deals with the fact that if a query filters on Oracle DATE columns:

// rental_date is of type DATE and there's an index on it
PreparedStatement stmt = connection.prepareStatement(
    "SELECT * " + 
    "FROM rentals " +
    "WHERE rental_date > ? AND rental_date < ?");

… and we’re using java.sql.Timestamp for our bind values:

stmt.setTimestamp(1, start);
stmt.setTimestamp(2, end);

… then the execution plan turns very bad, with a FULL TABLE SCAN or perhaps an INDEX FULL SCAN, even though we should have gotten a regular INDEX RANGE SCAN.

-------------------------------------
| Id  | Operation          | Name   |
-------------------------------------
|   0 | SELECT STATEMENT   |        |
|*  1 |  FILTER            |        |
|*  2 |   TABLE ACCESS FULL| RENTAL |
-------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(:1<=:2)
   2 - filter((INTERNAL_FUNCTION("RENTAL_DATE")>=:1 AND 
              INTERNAL_FUNCTION("RENTAL_DATE")<=:2))

This is because the database column is widened from Oracle DATE to Oracle TIMESTAMP via this INTERNAL_FUNCTION(), rather than truncating the java.sql.Timestamp value to Oracle DATE.
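
For comparison, the plain JDBC workaround shown in the previous article is to cast the bind variable back to DATE explicitly. A minimal sketch:

// Explicitly casting the TIMESTAMP bind values back to DATE
// prevents the INTERNAL_FUNCTION() widening of the column
PreparedStatement stmt = connection.prepareStatement(
    "SELECT * " +
    "FROM rentals " +
    "WHERE rental_date > CAST(? AS DATE) " +
    "AND rental_date < CAST(? AS DATE)");

stmt.setTimestamp(1, start);
stmt.setTimestamp(2, end);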

More details about the problem itself can be seen in the previous article.

Preventing this INTERNAL_FUNCTION() with Hibernate

You can fix this with Hibernate’s proprietary API, using a org.hibernate.usertype.UserType.

Assuming that we have the following entity:

@Entity
public class Rental {

    @Id
    @Column(name = "rental_id")
    public Long rentalId;

    @Column(name = "rental_date")
    public Timestamp rentalDate;
}

And now, let's run this query (I'm using the Hibernate API, not JPA, for the example):

List<Rental> rentals =
session.createQuery("from Rental r where r.rentalDate between :from and :to")
       .setParameter("from", Timestamp.valueOf("2000-01-01 00:00:00.0"))
       .setParameter("to", Timestamp.valueOf("2000-10-01 00:00:00.0"))
       .list();

The execution plan that we’re now getting is again inefficient:

-------------------------------------
| Id  | Operation          | Name   |
-------------------------------------
|   0 | SELECT STATEMENT   |        |
|*  1 |  FILTER            |        |
|*  2 |   TABLE ACCESS FULL| RENTAL |
-------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(:1<=:2)
   2 - filter((INTERNAL_FUNCTION("RENTAL0_"."RENTAL_DATE")>=:1 AND 
              INTERNAL_FUNCTION("RENTAL0_"."RENTAL_DATE")<=:2))

The solution is to add this @Type annotation to all relevant columns…

@Entity
@TypeDefs(
    value = @TypeDef(
        name = "oracle_date", 
        typeClass = OracleDate.class
    )
)
public class Rental {

    @Id
    @Column(name = "rental_id")
    public Long rentalId;

    @Column(name = "rental_date")
    @Type(type = "oracle_date")
    public Timestamp rentalDate;
}

and register the following, simplified UserType:

import java.io.Serializable;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.sql.Types;
import java.util.Objects;

import oracle.sql.DATE;

import org.hibernate.engine.spi.SessionImplementor;
import org.hibernate.usertype.UserType;

public class OracleDate implements UserType {

    @Override
    public int[] sqlTypes() {
        return new int[] { Types.TIMESTAMP };
    }

    @Override
    public Class<?> returnedClass() {
        return Timestamp.class;
    }

    @Override
    public Object nullSafeGet(
        ResultSet rs, 
        String[] names, 
        SessionImplementor session, 
        Object owner
    )
    throws SQLException {
        return rs.getTimestamp(names[0]);
    }

    @Override
    public void nullSafeSet(
        PreparedStatement st, 
        Object value, 
        int index, 
        SessionImplementor session
    )
    throws SQLException {
        // The magic is here: oracle.sql.DATE!
        // Guard against null, as the method contract requires
        if (value == null)
            st.setNull(index, Types.TIMESTAMP);
        else
            st.setObject(index, new DATE(value));
    }

    // The other method implementations are omitted
}

This will work because using the vendor-specific oracle.sql.DATE type will have the same effect on your execution plan as explicitly casting the bind variable in your SQL statement, as shown in the previous article: CAST(? AS DATE). The execution plan is now the desired one:

------------------------------------------------------
| Id  | Operation                    | Name          |
------------------------------------------------------
|   0 | SELECT STATEMENT             |               |
|*  1 |  FILTER                      |               |
|   2 |   TABLE ACCESS BY INDEX ROWID| RENTAL        |
|*  3 |    INDEX RANGE SCAN          | IDX_RENTAL_UQ |
------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(:1<=:2)
   3 - access("RENTAL0_"."RENTAL_DATE">=:1 
          AND "RENTAL0_"."RENTAL_DATE"<=:2)

If you want to reproduce this issue, just query any Oracle DATE column with a java.sql.Timestamp bind value through JPA / Hibernate, and get the execution plan as indicated here.

Don’t forget to flush shared pools and buffer caches to enforce the calculation of new plans between executions, because the generated SQL is the same each time.
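
On Oracle, assuming you have the required privileges, that means something along the lines of:

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;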

Can I do it with JPA 2.1?

At first sight, it looks like the new converter feature in JPA 2.1 (which works just like jOOQ’s converter feature) should be able to do the trick. We should be able to write:

import java.sql.Timestamp;

import javax.persistence.AttributeConverter;
import javax.persistence.Converter;

import oracle.sql.DATE;

@Converter
public class OracleDateConverter 
implements AttributeConverter<Timestamp, DATE>{

    @Override
    public DATE convertToDatabaseColumn(Timestamp attribute) {
        return attribute == null ? null : new DATE(attribute);
    }

    @Override
    public Timestamp convertToEntityAttribute(DATE dbData) {
        return dbData == null ? null : dbData.timestampValue();
    }
}

This converter can then be used with our entity:

import java.sql.Timestamp;

import javax.persistence.Column;
import javax.persistence.Convert;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Rental {

    @Id
    @Column(name = "rental_id")
    public Long rentalId;

    @Column(name = "rental_date")
    @Convert(converter = OracleDateConverter.class)
    public Timestamp rentalDate;
}

But unfortunately, this doesn’t work out of the box as Hibernate 4.3.7 will think that you’re about to bind a variable of type VARBINARY:

// From org.hibernate.type.descriptor.sql.SqlTypeDescriptorRegistry

    public <X> ValueBinder<X> getBinder(JavaTypeDescriptor<X> javaTypeDescriptor) {
        if ( Serializable.class.isAssignableFrom( javaTypeDescriptor.getJavaTypeClass() ) ) {
            return VarbinaryTypeDescriptor.INSTANCE.getBinder( javaTypeDescriptor );
        }

        return new BasicBinder<X>( javaTypeDescriptor, this ) {
            @Override
            protected void doBind(PreparedStatement st, X value, int index, WrapperOptions options)
                    throws SQLException {
                st.setObject( index, value, jdbcTypeCode );
            }
        };
    }

Of course, we can probably somehow tweak this SqlTypeDescriptorRegistry to create our own "binder", but then we're back to Hibernate-specific API. This particular implementation is probably a "bug" on the Hibernate side, which has been registered here, for the record:

https://hibernate.atlassian.net/browse/HHH-9553

Conclusion

Abstractions are leaky on all levels, even if they are deemed a “standard” by the JCP. Standards are often a means of justifying an industry de-facto standard in hindsight (with some politics involved, of course). Let’s not forget that Hibernate didn’t start as a standard and massively revolutionised the way the standard-ish J2EE folks tended to think about persistence, 14 years ago.

In this case we have:

  • Oracle SQL, the actual implementation
  • The SQL standard, which specifies DATE quite differently from Oracle
  • ojdbc, which extends JDBC to allow for accessing Oracle features
  • JDBC, which follows the SQL standard with respect to temporal types
  • Hibernate, which offers proprietary API in order to access Oracle SQL and ojdbc features when binding variables
  • JPA, which again follows the SQL standard and JDBC with respect to temporal types
  • Your entity model

As you can see, the actual implementation (Oracle SQL) leaked up right into your own entity model, either via Hibernate's UserType or via JPA's Converter. From then on, it will hopefully be shielded off from your application (until it isn't), allowing you to forget about this nasty little Oracle SQL detail.

Any way you turn it, if you want to solve real customer problems (i.e. the significant performance issue at hand), then you will need to resort to vendor-specific API from Oracle SQL, ojdbc, and Hibernate – instead of pretending that the SQL, JDBC, and JPA standards are the bottom line.

But that's probably alright. For most projects, the resulting implementation lock-in is totally acceptable.

Look no Further! The Final Answer to “Where to Put Generated Code?”

This recent question on Stack Overflow made me think.

Why does jOOQ suggest to put generated code under “/target” and not under “/src”?

… and I’m about to give you the final answer to “Where to Put Generated Code?”

This isn’t only about jOOQ

Even if you’re not using jOOQ, or if you’re using jOOQ but without the code generator, there might be some generated source code in your project. There are many tools that generate source code from other data, such as:

  • The Java compiler (ok, byte code, not strictly source code. But still code generation)
  • XJC, from XSD files
  • Hibernate from .hbm.xml files, or from your schema
  • Xtend, which translates Xtend code to Java code
  • You could even consider data transformations, like XSLT
  • many more…

In this article, we’re going to look at how to deal with jOOQ-generated code, but the same thoughts apply also to any other type of code generated from other code or data.

Now, the very very interesting strategic question that we need to ask ourselves is: Where to put that code? Under version control, like the original data? Or should we consider generated code to be derived code that must be re-generated all the time?

The answer is nigh…

It depends!

Nope. Unfortunately, as with many other flame-war-prone discussions, this one doesn't have a completely correct or wrong answer, either. There are essentially two approaches:

Considering generated code as part of your code base

When you consider generated code as part of your code base, you will want to:

  • Check in generated sources in your version control system
  • Use manual source code generation
  • Possibly even use partial source code generation

This approach is particularly useful when your Java developers are not in full control of or do not have full access to your database schema (or your XSD or your Java code, etc.), or if you have many developers that work simultaneously on the same database schema, which changes all the time. It is also useful to be able to track side-effects of database changes, as your checked-in generated code can be consulted when you want to analyse the history of your schema.

With this approach, you can also keep track of the change of behaviour in the jOOQ code generator, e.g. when upgrading jOOQ, or when modifying the code generation configuration.

When you use this approach, you will treat your generated code as an external library with its own lifecycle.

The drawback of this approach is that it is more error-prone and possibly a bit more work, as the actual schema may go out of sync with the generated code.

Considering generated code as derived artefacts

When you consider generated code to be derived artefacts, you will want to:

  • Check in only the actual DDL, i.e. the “original source of truth” (e.g. controlled via Flyway)
  • Regenerate jOOQ code every time the schema changes
  • Regenerate jOOQ code on every machine – including continuous integration machines, and possibly, if you’re crazy enough, on production

This approach is particularly useful when you have a smaller database schema that is under full control by your Java developers, who want to profit from the increased quality of being able to regenerate all derived artefacts in every step of your build.

This approach is fully supported by Maven, for instance, which foresees special directories (e.g. target/generated-sources) and phases (e.g. <phase>generate-sources</phase>) specifically for source code generation.
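
For instance, a minimal sketch of wiring the jOOQ code generator into Maven's generate-sources phase might look like this (version, JDBC, and generator details omitted; consult the jOOQ manual for the full set of options):

<plugin>
  <groupId>org.jooq</groupId>
  <artifactId>jooq-codegen-maven</artifactId>
  <executions>
    <execution>
      <!-- Re-generate jOOQ code as part of every build -->
      <phase>generate-sources</phase>
      <goals>
        <goal>generate</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- JDBC and generator configuration omitted -->
    <generator>
      <target>
        <!-- Derived artefacts live under target/, not src/ -->
        <directory>target/generated-sources/jooq</directory>
      </target>
    </generator>
  </configuration>
</plugin>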

The drawback of this approach is that the build may break in perfectly “acceptable” situations, when parts of your database are temporarily unavailable.

Pragmatic approach

Some of you might not like that answer, but there is also a pragmatic approach, a combination of both. You can consider some code as part of your code base, and some code as derived. For instance, jOOQ-meta’s generated sources (used to query the dictionary views / INFORMATION_SCHEMA when generating jOOQ code) are put under version control as few jOOQ contributors will be able to run the jOOQ-meta code generator against all supported databases. But in many integration tests, we re-generate the sources every time to be sure the code generator works correctly.

Conclusion

I’m sorry to disappoint you. There is no final answer to whether one approach or the other is better. Pick the one that offers you more value in your specific situation.

In case you’re choosing your generated code to be part of the code base, read this interesting experience report on the jOOQ User Group by Witold Szczerba about how to best achieve this.

Stop Unit Testing Database Code

Writing tests that use an actual database is hard.

Period.

Now that this has been established, let's have a look at a blog post by Marco Behler, in which he elaborates on various options for testing database code with respect to transactionality. Testing database transactions is even harder than testing database code in general. Marco lists a couple of options for tweaking these tests to make them "easier" to write.

One of the options is:

3. Set the flush-mode to FlushMode.ALWAYS for your tests

(note, this is about testing Hibernate code).

Marco puts this option in parentheses, because he's not 100% convinced that it's a good idea to test behaviour that differs from production behaviour. Here's our take on that:

Stop Unit Testing Database Code

By the time you start thinking about tweaking your test setup to achieve “simpler” transaction behaviour (or worse, use mocks for your database), you’re pretty much doomed. You start creating an alternative system that heavily deviates from your productive system. This essentially means:

  • that results from tests against your test system have (almost) no meaning
  • that your tests will not cover some of the most complex aspects of your productive system
  • that you will start spending way too much time on tweaking tests rather than implementing useful business logic

Instead, focus on writing integration tests that test your business logic on a very high level, i.e. on a “service layer” level, if your architecture has such a thing. If you’re using EJB, this would probably be on a session bean level. If you’re using REST or SOAP, this would be on a REST or SOAP service level. If you’re using HTTP (the retro name for REST), it would be on an HTTP service level.

Here are the advantages of this approach:

  • You might not get 100% coverage, but you will get 80% coverage for the parts that really matter (with 20% of the effort). Your UI (or external system) doesn't call the database in the quirkiest of forms either. It calls your services. Why would you test anything other than your services?
  • Your tests are very easy to maintain. You have a public API to your UI (or external system). It's formal and easy to understand. It's a black box with well-defined input and output parameters, which makes it easy to read / write tests.

Databases are stateful. Obviously

What you have to do is let go of this idea that your database will ever participate in a “unit” (as in “unit” test). Units are pretty stateless, and thus it is very easy to write mutually independent unit tests for functional algorithms, for instance.

Databases couldn’t be any less stateless. The whole idea of a database is to manage state. And that’s very complicated and completely opposite to what any unit test can ever model. Many of the most meaningful state transitions span several database interactions, or even transactions, or maybe even services. For instance, it may be important that the CREATE_USER service invocation be immediately followed by an invocation of CHANGE_PASSWORD. You can only integration-test that on a service layer. Don’t believe it? What if CREATE_USER depends on an external LDAP system? Or complex security logic in stored procedures? Your integration test’s got that covered.

Takeaway

Writing tests that use an actual database is hard.

Yes. That won’t change. But your perception may. Don’t try to tweak things around this fact. Create a well-known test database. Reset it between tests. And write integration tests on a very high level. The 20/80 cost/benefit ratio will leave you no better choice.
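
A minimal sketch of what such a high-level integration test might look like, assuming JUnit and a hypothetical UserService backed by a well-known test database (UserService, TestDatabase, and User are illustrative names, not from any specific library):

import static org.junit.Assert.assertTrue;

import org.junit.Before;
import org.junit.Test;

public class UserServiceIntegrationTest {

    // Hypothetical service under test, talking to a real test database
    UserService userService = new UserService();

    @Before
    public void resetDatabase() throws Exception {
        // Restore the well-known state before every test,
        // e.g. by re-running DDL and test data scripts
        TestDatabase.reset();
    }

    @Test
    public void testCreateUserThenChangePassword() {
        // Exercise the service API, not the database internals
        User user = userService.createUser("alice", "initial-pw");
        userService.changePassword(user.getId(), "new-pw");

        // Assert on the observable, service-level outcome
        assertTrue(userService.authenticate("alice", "new-pw"));
    }
}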

Stay tuned for another blog post on this blog about how we integration-test the jOOQ API against 16 actual RDBMS.

MyBatis’ Alternative Transaction Management

On the jOOQ user group, we're often asked how to perform transaction management with jOOQ. And we have an easy answer ready: you don't do that with jOOQ. You choose your favourite transaction management API, be it JTA, Spring's transaction management, or plain JDBC.

And the above list is far from being exhaustive. Transaction management is something very delicate, and it certainly should not be imposed by a library whose main purpose is not transaction management, because any such library / framework will provide you with at most a very leaky abstraction of its transaction model. In other words, if you just slightly want to deviate from “the standard” model (e.g. as imposed by Hibernate), you will suffer greatly, as soon as you want to run 2-3 queries outside of Hibernate – e.g. batch or reporting statements through jOOQ.

MyBatis’ Alternative Transaction Management

MyBatis is a SQL templating engine that provides a couple of features beyond what alternative templating engines, such as Velocity or StringTemplate, offer. One of these features built on top of templating is precisely transaction management, as can be seen in the docs.

From what we can read in the docs, it looks as though MyBatis' transaction managers can be overridden by Spring, for instance. However, it is not easy to see how this is done – in particular given that MyBatis also solves connection pooling (for which there are very viable alternatives, such as c3p0 and DBCP) and mapping (which could be solved more easily with custom transformers, such as those offered by Spring's JdbcTemplate or jOOQ's RecordMapper).
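
As an aside, here is a minimal sketch of such custom mapping with jOOQ's RecordMapper. The BOOK table, BookRecord, and Book POJO are hypothetical:

// Mapping jOOQ records to a custom POJO via the RecordMapper SPI
List<Book> books = ctx
    .selectFrom(BOOK)
    .fetch(new RecordMapper<BookRecord, Book>() {
        @Override
        public Book map(BookRecord record) {
            return new Book(record.getId(), record.getTitle());
        }
    });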

Like many frameworks, MyBatis tries to solve problems outside its core scope, which is SQL templating. While this may be a good thing, as you rely on a single dependency only, it is also quite a lock-in in case you have a more complex model. In the case of transaction management, we believe this was not a good idea on MyBatis' part.

Thoughts from MyBatis users?

jOOQ Newsletter December 13, 2013

subscribe to the newsletter here

A jOOQ Runtime Only Distribution

Several of our customers have made us aware of the fact that they don’t really need the jOOQ code generator, only the SQL builder / query DSL API and possibly the SQL execution functionality. In the next weeks, we’ll be working towards a new “jOOQ Runtime Only Distribution” that will ship at a lower price. Concretely, we’ll be giving those customers a 25% discount on the regular distributions for any of jOOQ Express, jOOQ Professional, and jOOQ Enterprise.

Obviously, we recommend using jOOQ's code generator to profit from the full feature scope that we offer.

License Amendments

We have updated our commercial license to improve your legal relationship with Data Geekery. Essentially, these things have been amended:

  • Definitions: We have added the previously missing definition of what a "Minor Defect" is
  • 6.2 Distribution Right: The distribution right section now grants you a perpetual license to distribute the software, instead of a time-limited one. This will allow you to continue to distribute, embed, and use jOOQ in your End-user Application even if you terminate your Developer Workstation License Agreement with us. You will, however, still need to license jOOQ in order to maintain your End-user Application.
  • 7.1.1 Remedial Services: The remedial services section now formally defines how Minor Defects are remedied

As this updated license grants new rights to our customers, it shall be in effect immediately also for existing jOOQ 3.2 customers.

Upcoming Events

We have recently presented jOOQ at the Java2Days conference in Sofia, Bulgaria, as well as at the Java User Group Berlin-Brandenburg. Both talks enjoyed high attendance, with lots of SQL developers challenging us with interesting questions. Of course, jOOQ answers most of those questions already, as Manuel Bernhardt, another speaker, observed.

If you want us to talk about jOOQ or SQL near you, do not hesitate to contact us.

Stay informed about other upcoming 2014 events on www.jooq.org/news. If you're looking for the slides, they're available for free under a CC BY-SA 3.0 license on SlideShare.

Using jOOQ with Hibernate

While jOOQ can be seen as a popular alternative to Hibernate, there is no stopping you from using both frameworks in the same application. While Hibernate heavily improves everyday CRUD operations, jOOQ heavily improves writing SQL in Java. The two goals can be orthogonal, and thus it may make perfect sense to combine the two.

Vlad Mihalcea, an enthusiastic jOOQ user, is writing a series of blog posts on his blog about how to use jOOQ with Hibernate.

Gavin King himself chimed in on Google+ and confirmed once again that it was never his intention for Hibernate to be used for everything.

If you want to share your own experience of your jOOQ / Spring, jOOQ / Hibernate, jOOQ / Anything integration, let us know!

SQL Zone – “Lightning Fast SQL with Proper Indexing”

Understanding SQL and the history of SQL is of the essence when you want to get the best out of your relational database. This has recently been nicely explained by Markus Winand at the Oredev conference 2013. Luckily, his presentation has been recorded for free review.

More about the history of SQL can be seen in this interesting article about Codd's Relational Vision – Has NoSQL Come Full Circle? It confirms what we have been evangelising for a while ourselves: NoSQL is not a new technology. It is more of a return to pre-Codd times, when people did not have a powerful relational model to abstract their storage implementation away from their application. The article claims that Codd's greatest achievement was that he overcame the deficiencies of early databases:

  • Access dependencies
  • Order dependencies
  • Index dependencies

Read the full article for a very interesting historic insight into the relational model, and why NoSQL might not be a good answer for most general problems.

How to Integrate jOOQ with Hibernate

While jOOQ can be an alternative to Hibernate, it doesn’t have to entirely replace Hibernate. Many users have reported positive experiences when combining jOOQ with Hibernate, letting Hibernate do the tedious CRUD work, and jOOQ the complex querying and reporting through its sophisticated, yet intuitive query DSL.

Vlad Mihalcea, who has recently been blogging about interesting SQL and transaction model topics, has now published a very nice tutorial on how to use JPA-annotated entities as a basis for source code generation in jOOQ, through the following steps:

  • Write the annotated entities
  • Generate HSQLDB DDL from those entities
  • Execute the DDL in an HSQLDB instance
  • Run the jOOQ code generator to reverse-engineer the schema

With the above, he’s ready to query entities through JPA/Hibernate or jOOQ in no time.
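
A minimal sketch of the DDL generation step, assuming Hibernate 4's SchemaExport tool and a hypothetical annotated Rental entity (the entities and output path are illustrative, not taken from Vlad's tutorial):

import org.hibernate.cfg.Configuration;
import org.hibernate.dialect.HSQLDialect;
import org.hibernate.tool.hbm2ddl.SchemaExport;

// Collect the JPA-annotated entities
Configuration configuration = new Configuration();
configuration.addAnnotatedClass(Rental.class);
configuration.setProperty("hibernate.dialect",
    HSQLDialect.class.getName());

// Write the HSQLDB DDL to a script, which can then be run against
// an HSQLDB instance before executing the jOOQ code generator
SchemaExport export = new SchemaExport(configuration);
export.setOutputFile("target/schema.sql");
export.setDelimiter(";");
export.create(true, false); // write the script, don't execute it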

Read the full blog post here:
http://vladmihalcea.wordpress.com/2013/12/06/jooq-facts-from-jpa-annotations-to-jooq-table-mappings/

Deep Stack Traces Can be a Sign for Good Code Quality

The term “leaky abstractions” has been around for a while. Coining it is most often attributed to Joel Spolsky, who wrote this often-cited article about it. I’ve now stumbled upon another interpretation of a leaky abstraction, measured by the depth of a stack trace:

Leaky Abstractions as understood by Geek and Poke (Licensed CC-BY)

So, long stack traces are bad according to Geek & Poke. I’ve seen this argument before on Igor Polevoy’s blog (he’s the creator of ActiveJDBC, a Java implementation of the popular Ruby ActiveRecord query interface). Much like Joel Spolsky’s argumentation was often used to criticise ORMs, Igor’s argument was also used to compare ActiveJDBC with Hibernate. I’m citing:

One might say: so what, why do I care about the size of dependencies, depth of stack trace, etc. I think a good developer should care about these things. The thicker the framework, the more complex it is, the more memory it allocates, the more things can go wrong.

I completely agree that a framework with a certain amount of complexity tends to have longer stack traces. So if we run these axioms through our mental Prolog processors:

  • if Hibernate is a leaky abstraction, and
  • if Hibernate is complex, and
  • if complexity leads to long stack traces, then
  • leaky abstractions and long stack traces correlate

I wouldn’t go as far as claiming there’s a formal, causal connection. But a correlation seems logical.

But these things aren't necessarily bad. In fact, long stack traces can be a good sign in terms of software quality. They can mean that the internals of a piece of software show a high degree of cohesion and of DRY-ness, which again means that there is little risk for subtle bugs deep down in your framework. Remember that high cohesion and high DRY-ness lead to a large portion of the code being extremely relevant within the whole framework, which again means that any low-level bug will pretty much blow up the whole framework, as it will lead to everything going wrong. If you do test-driven development, you'll be rewarded by noticing immediately that your silly mistake fails 90% of your test cases.

A real-world example

Let’s use jOOQ as an example to illustrate this, as we’re already comparing Hibernate and ActiveJDBC. Some of the longest stack traces in a database access abstraction can be achieved by putting a breakpoint at the interface of that abstraction with JDBC. For instance, when fetching data from a JDBC ResultSet.

Utils.getFromResultSet(ExecuteContext, Class<T>, int) line: 1945
Utils.getFromResultSet(ExecuteContext, Field<U>, int) line: 1938
CursorImpl$CursorIterator$CursorRecordInitialiser.setValue(AbstractRecord, Field<T>, int) line: 1464
CursorImpl$CursorIterator$CursorRecordInitialiser.operate(AbstractRecord) line: 1447
CursorImpl$CursorIterator$CursorRecordInitialiser.operate(Record) line: 1
RecordDelegate<R>.operate(RecordOperation<R,E>) line: 119
CursorImpl$CursorIterator.fetchOne() line: 1413
CursorImpl$CursorIterator.next() line: 1389
CursorImpl$CursorIterator.next() line: 1
CursorImpl<R>.fetch(int) line: 202
CursorImpl<R>.fetch() line: 176
SelectQueryImpl<R>(AbstractResultQuery<R>).execute(ExecuteContext, ExecuteListener) line: 274
SelectQueryImpl<R>(AbstractQuery).execute() line: 322
T_2698Record(UpdatableRecordImpl<R>).refresh(Field<?>...) line: 438
T_2698Record(UpdatableRecordImpl<R>).refresh() line: 428
H2Test.testH2T2698InsertRecordWithDefault() line: 931

Compared to ActiveJDBC's stack traces, that's quite a bit more, but still less than Hibernate's (which uses lots of reflection and instrumentation). And it involves rather cryptic inner classes with quite a bit of method overloading. How should we interpret that? Let's go through this bottom-up (or top-down in the stack trace).

CursorRecordInitialiser

The CursorRecordInitialiser is an inner class that encapsulates the initialisation of a Record by a Cursor, and it ensures that the relevant parts of the ExecuteListener SPI are covered in a single place. It is the gateway to JDBC's various ResultSet methods. It is a generic internal RecordOperation implementation that is called by…

RecordDelegate

… a RecordDelegate. While the class name is pretty meaningless, its purpose is to shield and wrap all direct record operations in such a way that a central implementation of the RecordListener SPI can be achieved. This SPI can be implemented by client code to listen to active record lifecycle events (see the sketch below). The price for keeping the implementation of this SPI DRY is a couple of elements on the stack trace, as such callbacks are the standard way to implement closures in the Java language. But keeping this logic DRY guarantees that no matter how a Record is initialised, the SPI will always be invoked. There are (almost) no forgotten corner-cases.
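
For illustration, a minimal sketch of a client-side listener, assuming jOOQ 3.x's DefaultRecordListener and DefaultRecordListenerProvider (the logging behaviour is an invented example):

// A listener extending DefaultRecordListener, overriding only
// the lifecycle events of interest
public class InsertLoggingListener extends DefaultRecordListener {

    @Override
    public void insertStart(RecordContext ctx) {
        // Invoked centrally for every record insert, no matter
        // how the record was initialised
        System.out.println("Inserting: " + ctx.record());
    }
}

// Registering the listener with a jOOQ Configuration
Configuration configuration = new DefaultConfiguration()
    .set(connection)
    .set(SQLDialect.ORACLE)
    .set(new DefaultRecordListenerProvider(
        new InsertLoggingListener()));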

But we were initialising a Record in…

CursorImpl

… a CursorImpl, an implementation of a Cursor. This might appear odd, as jOOQ Cursors are used for “lazy fetching”, i.e. for fetching Records one-by-one from JDBC.

On the other hand, the SELECT query from this stack trace simply refreshes a single UpdatableRecord, jOOQ’s equivalent of an active record. Yet, still, all the lazy fetching logic is executed just as if we were fetching a large, complex data set. This is again to keep things DRY when fetching data. Of course, around 6 levels of stack trace could have been saved by simply reading the single record as we know there can be only one. But again, any subtle bug in the cursor will likely show up in some test case, even in a remote one like the test case for refreshing records.

Some may claim that all of this is wasting memory and CPU cycles. The opposite is more likely to be true. Modern JVM implementations are so good at managing and garbage-collecting short-lived objects and method calls that the slight additional complexity imposes almost no additional work on your runtime environment.

TL;DR: Long stack traces may indicate high cohesion and DRY-ness

The claim that a long stack trace is a bad thing is not necessarily correct. A long stack trace is what happens when complex frameworks are well implemented. Complexity will inevitably lead to "leaky abstractions". But only well-designed complexity will lead to long stack traces.

Conversely, short stack traces can mean two things:

  • Lack of complexity: The framework is simple, with few features. This matches Igor’s claim for ActiveJDBC, as he is advertising ActiveJDBC as a “simple framework”.
  • Lack of cohesion and DRY-ness: The framework is poorly written, and probably has poor test coverage and lots of bugs.

Tree data structures

As a final note, it’s worth mentioning that another case where long stack traces are inevitable is when tree structures / composite pattern structures are traversed using visitors. Anyone who has ever debugged XPath or XSLT will know how deep these traces are.

Why Staying in Control of Your SQL is so Important

Lots of blog posts and research papers are written about the topics of scaling up and scaling out. This interesting blog post, for instance, sheds some light on the two strategies with respect to physical maintenance costs, such as cooling and electricity consumption. Certainly non-negligible aspects for very large systems.

But before solving problems at a very big scale, consider much simpler SQL tuning mechanisms. Very often, when you have a bottleneck in your application, it is at the database layer. This fact is used by many NoSQL evangelists to promote their products, claiming that scaling out is much easier with NoSQL databases. This might be true, but ask yourself: Do you need a system that works under heavy load? Or do you have a plain performance problem?

In other words: Do you have a 5'000'000-concurrent-users problem? Or do you have a request-takes-more-than-3-seconds problem? Because if you suffer from the latter, you probably need to scale neither out nor up. Your "traditional" architecture is probably quite fine, but your database / SQL queries aren't. The popular use-the-index-luke.com website features an interesting article about the two top performance problems caused by ORM tools. They are:

  • The infamous N+1 selects problem
  • The hardly-known Index-Only Scan

Both problems result from the fact that popular ORMs generate SQL in a way that developers can hardly influence. People often claim that tools like Hibernate generate better SQL than the average developer. This is true to the extent that the average developer might never care to actually learn to write better SQL – which in turn leads to the above problems.
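
To make the first problem concrete, here is a minimal sketch contrasting the N+1 access pattern with a single join, written with jOOQ against hypothetical generated AUTHOR and BOOK tables:

// The N+1 anti-pattern: one query for all authors, plus one
// additional query (round trip) per author for their books
for (AuthorRecord author : ctx.selectFrom(AUTHOR).fetch()) {
    ctx.selectFrom(BOOK)
       .where(BOOK.AUTHOR_ID.eq(author.getId()))
       .fetch();
}

// Staying in control of your SQL: one query, one round trip
ctx.select()
   .from(AUTHOR)
   .join(BOOK).on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
   .fetch();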

Hibernate is very good at generating 70% of your application's boring CRUD SQL. At the same time, Hibernate never claimed to be a replacement for SQL, as you will have to resort to native SQL for the remaining 30% of the time. Should you then use Hibernate's native SQL API? Or an alternative like jOOQ?

The important thing is to get back in control of your SQL when performance matters. Know your indexes. Know your database meta data. And use a tool that allows you to write precisely the SQL statement you want. Learning better SQL will help you save lots of money on operations costs as you might not need to scale out nor up.