Applying Queueing Theory to Dynamic Connection Pool Sizing with FlexyPool

I’m very happy to have another interesting blog post by Vlad Mihalcea on the jOOQ blog, this time about his Open Source library flexypool. Read his previous jOOQ Tuesdays post on Hibernate here.

Vlad is a Hibernate developer advocate and he’s the author of the popular book High Performance Java Persistence, and he knows 1-2 things about connection pooling.

vladmihalcea

Introduction

Back in 2014, I was working as a software architect, and our team was building a real-estate platform which was composed of multiple nodes, as depicted in the following diagram:

databaseintegrationpoint

This is a classic enterprise architecture layout. The database is replicated to provide better throughout and availability in case of node failures. There are front-end nodes that deliver the website content. There are also many back-end nodes as well, like email schedulers or data import batch processors.

All these nodes require database connectivity, either to a Master node, for read-write transactions or to the Slave nodes, for read-only transactions.

Because acquiring database connections is an expensive process, each system node uses its own connection pool. By reusing physical database connections, the connection acquisition is very fast, therefore reducing the overall transaction response time.

Not only that a connection pool can reduce transaction response time, but it can level up traffic spikes as well. Without a connection pool, during a traffic spike, a front-end nodes might acquire all database connections, leaving the back-end processors with no database connectivity.

The connection pool, having a maximum number of database connections, allows the connections to queue whenever a traffic spike is happening. Therefore, during a traffic spike, the transaction response time will increase due to the queuing mechanism, but this is way better than taking down the whole system.

For these two reasons, the connection pool is a very good choice in many enterprise systems.

Based on the underlying hardware resources, a relational database can only offer a limited number of connections. For this reason, we must be very careful when choosing the pool size for each particular system node.

Connection pool sizing

I was the lucky person to get the task of figuring out how many connections should we allocate for each system node in our real-estate platform. Since I graduated Electronics and Telecommunications, I remembered that we learned about a similar problem when having to provision telecommunications networks. Agner Krarup Erlang invented Queuing theory for solving this problem, and I was curious if we could also find the right pool size by applying Erlang queuing models.

I was not the only one trying to apply the Queuing theory principles to software systems. Percona has a very interesting study: Forecasting MySQL Scalability with the actual service time in a system that is affected by a myriad of variables.

In the end, I realized that the best way to tackle this problem is to constant measuring and adjustments. For this reason, I needed a tool to capture database connection metrics, as well as a way to adjust a given connection pool while the enterprise system is running.

And, that’s how FlexyPool was born.

Basically, FlexyPool is a DataSource Proxy that stands in front of the actual JDBC DataSource or other proxies (e.g. statement logging).

datasourceproxyarchitecture

FlexyPool supports a great variety of stand-alone connection pools:

And it collects the following metrics:

  • concurrent connections histogram
  • concurrent connection requests histogram
  • data source connection acquiring time histogram
  • connection lease time histogram
  • maximum pool size histogram
  • total connection acquiring time histogram
  • overflow pool size histogram
  • retries attempts histogram

For instance, the concurrent connection count metric gives you an insight into how many connections are required by a certain application under a given traffic load:

concurrentconnectioncount

The connection acquisition metric tells you how much time it takes to obtain a database connection from the pool:

connectionacquire

The connection lease time allows you to spot long-running transactions, which are undesirable in high-performance OLTP applications:

connectionlease

For the stand-alone connection pools, FlexyPool can increment the pool size beyond the maximum capacity, as it offers an overflow buffer. The benefit of this overflow buffer is that it allows you to increase the pool size only when the incoming traffic causes a certain connection acquisition timeout.

Although FlexyPool can also monitor Java EE connection pools, it cannot increase the pool size in Java EE environments since the DataSource is an application server managed resource.

Conclusion

Because enterprise systems evolve, so does the underlying data access patterns. For this reasons, monitoring the underlying database connection usage is a very important metric, which needs to be monitored on a regular basis. FlexyPool builds on top of CodaHale and Dropwizard Metrics, so you can easily integrate it with well-known Application Performance Monitoring tools, such as Graphite or Grafana.

FlexyPool is open-source, and it uses an Apache license 2.0. You can find it the project repository on GitHub, and all the released dependencies are available on Maven Central, so it’s very easy to integrate it in your own project.

FkexyPool is powering many enterprise systems, like Etuovi, Mitch&Mates, and ScentBird. If you decide to use it in your current enterprise system, and you are willing to provide a testimonial, you can win a free copy of my High-Performance Java Persistence book.

jOOQ Tuesdays: Thorben Janssen Shares his Hibernate Performance Secrets

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

thorben-janssen

I’m very excited to feature today Thorben Janssen who has spent most of his professional life with Hibernate.

Thorben, with your blog and training, you are one of the few daring “annotatioficionados” as we like to call them, who risks diving deep into JPA’s more sophisticated annotations – like @SqlResultSetMapping. What is your experience with JPA’s advanced, declarative programming style?

From my point of view, the declarative style of JPA is great and a huge problem at the same time.

If you know what you’re doing, you just add an annotation, set a few properties and your JPA implementation takes care of the rest. That makes it very easy to use complex features and avoids a lot of boilerplate code.

But it can also become a huge issue, when someone is not that familiar with JPA and just copies a few annotations from stack overflow and hopes that it works.

It will work in most of the cases. JPA and Hibernate are highly optimized and handle suboptimal code and annotations quite well. At least as long as it is tested with one user on a local machine. But that changes quickly when the code gets deployed to production and several hundred or thousand users use it in parallel. These issues get then often posted on stack overflow or other forums together with a complaint about the bad performance of Hibernate…

Your training goes far beyond these rather esoteric use-cases and focuses on JPA / Hibernate performance. What are three things every ORM user should know about JPA / SQL performance?

Only three things? I could talk about a lot more things related to JPA and Hibernate performance.

The by far most important one is to remember that your ORM framework is using SQL to store your data in a relational database. That seems to be pretty obvious, but you can avoid the most common performance issues by analyzing and optimizing the executed SQL statements. One example for that is the popular n+1 select issue which you can easily find and fix as I show in my free, 3-part video course.

Another important thing is that no framework or specification provides a good solution for every problem. JPA and Hibernate make it very easy to insert and update data into a relational database. And they provide a set of advanced features for performance optimizations, like caching or the ordering of statements to improve the efficiency of JDBC batches.

But Hibernate and JPA are not a good fit for applications that have to perform a lot of very complex queries for reporting or data mining use cases. The feature set of JPQL is too limited for these use cases. You can, of course, use native queries to execute plain SQL, but you should have a look at other frameworks if you need a lot of these queries.

So, always make sure that your preferred framework is a good fit for your project.

The third thing you should keep in mind is that you should prefer lazy fetching for the relationships between your entities. This prevents Hibernate from executing additional SQL queries to initialize the relationships to other entities when it gets an entity from the database. Most use cases don’t need the related entities, and the additional queries slow down the application. And if one of your use cases uses the relationships, you can use FETCH JOIN statements or entity graphs to initialize them with the initial query.

This approach avoids the overhead of unnecessary SQL queries for most of your use cases and allows you to initialize the relationships if you need them.

These are the 3 most important things you should keep in mind, if you want to avoid performance problems with Hibernate. If you want to dive deeper into this topic, have a look at my Hibernate Performance Tuning Online Training. The next one starts on 23th July.

What made you focus your training mostly on Hibernate, rather than also on EclipseLink / OpenJPA, or just plain SQL / jOOQ? Do you have plans to extend to those topics?

To be honest, that decision was quite easy for me. I’m working with Hibernate for about 15 years now and used it in a lot of different projects with very different requirements. That gives me the experience and knowledge about the framework, which you need if you want to optimize its performance. I also tried EclipseLink but not to the same extent as Hibernate.

And I also asked my readers which JPA implementation they use, and most of them told me that they either use plain JPA or Hibernate. That made it pretty easy to focus on Hibernate.

I might integrate jOOQ into one of my future trainings. Because as I said before, Hibernate and JPA are a good solution if you want to create or update data or if your queries are not too complex. As soon as your queries get complex, you have to use native queries with plain SQL. In these cases, jOOQ can provide some nice benefits.

What’s the advantage of your online training over a more classic training format, where people meet physically – both for you and for your participants?

The good thing about a classroom training is that you can discuss your questions with other students and the instructor. But it also requires you to be in a certain place at a certain time which creates additional costs, requires you to get out of your current projects and keeps you away from home.

With the Hibernate Performance Tuning Online Training, I want to provide a similar experience to a classroom training in which you study with other students and ask your questions but without having to travel somewhere. You can watch my training videos and do the exercises from your office or home and meet with me, and other students in the forum or group coaching calls to discuss your questions.

So you get the best of both worlds without declaring any travel expenses 😉

Your blog also includes a weekly digest of all things happening in the Java ecosystem called Java Weekly. What are the biggest insights into our ecosystem that you’ve gotten out of this work, yourself?

The Java ecosystem is always changing and improving, and you need to learn constantly if you want to stay up to date. One way to do that is to read good blog posts. And there are A LOT of great, small blogs out there written by very experienced Java developers who like to share their knowledge. You just have to find them. That’s probably the biggest insight I got.

I read a lot about Java and Java EE each week (that’s probably the only advantage of a 1.5-hour commute with public transportation) and present the most interesting ones every Monday in a new issue of Java Weekly.

It is all about the JDBC Basics

We’re very happy to announce a guest post by Marco Behler, who has been blogging about jOOQ in the past.

img31Marco started out in programming (reverse-engineering, actually) and now mainly programmes on the JVM in his day-to-day work. He also always had a sweet tooth for strategy and marketing. Marco Behler GmbH is the result of that hybrid role.

It is all about the JDBC Basics

It is one of the days.

You are reading the Spring documentation’s @Transactional section and still don’t understand the difference between logical and physical transaction scopes. Simultaneously your app throws an
LazyInitializationException and you have no idea why. To top it off you see spontaneous database deadlocks in production and you suspect your connection pool is leaking connections..somehow.

Know what most likely would have helped instead of banging your head against the wall? Spending a couple (literally) of hours on learning the JDBC basics. Let’s find out why:

What are the JDBC basics?

The basics are opening up/closing database connections and then working with transactions. Also understanding how deadlocks, pessimistic and optimistic locking work on a plain JDBC level. A bit of isolation levels and savepoints and then directly on to connection pools and jdbc driver logging. That’s it. Seriously.

Why are the basics so important?

Everything you will encounter in frameworks like Spring, Hibernate, jOOQ etc. builds up on these basics. For example, there are a gazillion topics on the internet regarding Hibernate’s LazyInitializationException and I was scared of that particular exception myself many years ago. But what else would you expect trying to query the database without having a connection to the database open (which is basically all that this exeception is) ?

The same with Spring’s “transaction framework”. There is so much content, or shall we say (F)ear/(U)ncertainty/(D)oubt, out there on how to open up transactions with spring, be it programmatically, with annotations or xml. But what if you knew that under the hood, there is only one way (and actually one line of code) to open up transactions in the JDBC world?

Let me not even get started on the various (mis)configurations of connection pools you see in production in the wild. Or the unawareness of JDBC (driver) logging, which usually leads to debugging in the wild. All basics, which you can master in a couple of hours and which will help you for a lifetime!

Why do people not just learn the basics?

In every middle-sized project there is a ton of technologies involved and there usually is no clear-cut path on how to learn all of them or how they all work together. It simply takes a lot of time and effort to dig through everything.

There’s JPA sessions and JDBC connections and then Spring somehow provides those transactional proxies in 5 different ways and then some other colleague just put jOOQ into the mix, but then somehow my session doesn’t flush and my objects don’t get persisted and the HibernateTransactionManager is not working as expected.

With all of this, I would also hope for my database transactions just to commit – god forbid what happens on rollback 🙂

But in the end, everything technology mentioned is just a layer on top of JDBC. If you understand transactions or deadlocks or savepoints on the basic level, then Spring or Hibernate or jOOQ will not throw you off.

So what do you recommend ?

If you want to get miles ahead in your day-to-day database programming, you have to start with the basics. Step-by-Step. And then you will see most of your problems automatically evaporate.

Out of my extensive database programming experience, I created an ebook with a ton of ready-to-run exercises, which will take you from Java database novice to expert. At your own pace. You can literally copy the source code of every chapter into your IDE, run it and (hopefully) learn from it. It covers plain JDBC, Spring, Hibernate, jOOQ (soon) and also distributed transactions.

You can read the whole book for free online here, and I would love to get your feedback! I would really like to let the community feedback flow back into future editions of the book. In addition, If you like what you see and the exercises help you, you can also show your support by getting a paid digital version (pdf, epub, mobi).

In any case…

…learn your JDBC basics – and you will profit from them for a lifetime!

Type Safe Queries for JPA’s Native Query API

When you’re using JPA – sometimes – JPQL won’t do the trick and you’ll have to resort to native SQL. From the very beginning, ORMs like Hibernate kept an open “backdoor” for these cases and offered a similar API to Spring’s JdbcTemplate, to Apache DbUtils, or to jOOQ for plain SQL. This is useful as you can continue using your ORM as your single point of entry for database interaction.

However, writing complex, dynamic SQL using string concatenation is tedious and error-prone, and an open door for SQL injection vulnerabilities. Using a type safe API like jOOQ would be very useful, but you may find it hard to maintain two different connection, transaction, session models within the same application just for 10-15 native queries.

But the truth is:

You an use jOOQ for your JPA native queries!

That’s true! There are several ways to achieve this.

Fetching tuples (i.e. Object[])

The simplest way will not make use of any of JPA’s advanced features and simply fetch tuples in JPA’s native Object[] form for you. Assuming this simple utility method:

public static List<Object[]> nativeQuery(
    EntityManager em, 
    org.jooq.Query query
) {

    // Extract the SQL statement from the jOOQ query:
    Query result = em.createNativeQuery(query.getSQL());

    // Extract the bind values from the jOOQ query:
    List<Object> values = query.getBindValues();
    for (int i = 0; i < values.size(); i++) {
        result.setParameter(i + 1, values.get(i));
    }

    return result.getResultList();
}

Using the API

This is all you need to bridge the two APIs in their simplest form to run “complex” queries via an EntityManager:

List<Object[]> books =
nativeQuery(em, DSL.using(configuration)
    .select(
        AUTHOR.FIRST_NAME, 
        AUTHOR.LAST_NAME, 
        BOOK.TITLE
    )
    .from(AUTHOR)
    .join(BOOK)
        .on(AUTHOR.ID.eq(BOOK.AUTHOR_ID))
    .orderBy(BOOK.ID));

books.forEach((Object[] book) -> 
    System.out.println(book[0] + " " + 
                       book[1] + " wrote " + 
                       book[2]));

Agreed, not a lot of type safety in the results – as we’re only getting an Object[]. We’re looking forward to a future Java that supports tuple (or even record) types like Scala or Ceylon.

So a better solution might be the following:

Fetching entities

Let’s assume you have the following, very simple entities:

@Entity
@Table(name = "book")
public class Book {

    @Id
    public int id;

    @Column(name = "title")
    public String title;

    @ManyToOne
    public Author author;
}

@Entity
@Table(name = "author")
public class Author {

    @Id
    public int id;

    @Column(name = "first_name")
    public String firstName;

    @Column(name = "last_name")
    public String lastName;

    @OneToMany(mappedBy = "author")
    public Set<Book> books;
}

And let’s assume, we’ll add an additional utility method that also passes a Class reference to the EntityManager:

public static <E> List<E> nativeQuery(
    EntityManager em, 
    org.jooq.Query query,
    Class<E> type
) {

    // Extract the SQL statement from the jOOQ query:
    Query result = em.createNativeQuery(
        query.getSQL(), type);

    // Extract the bind values from the jOOQ query:
    List<Object> values = query.getBindValues();
    for (int i = 0; i < values.size(); i++) {
        result.setParameter(i + 1, values.get(i));
    }

    // There's an unsafe cast here, but we can be sure
    // that we'll get the right type from JPA
    return result.getResultList();
}

Using the API

This is now rather slick, just put your jOOQ query into that API and get JPA entities back from it – the best of both worlds, as you can easily add/remove nested collections from the fetched entities as if you had fetched them via JPQL:

List<Author> authors =
nativeQuery(em,
    DSL.using(configuration)
       .select()
       .from(AUTHOR)
       .orderBy(AUTHOR.ID)
, Author.class); // This is our entity class here

authors.forEach(author -> {
    System.out.println(author.firstName + " " + 
                       author.lastName + " wrote");
    
    books.forEach(book -> {
        System.out.println("  " + book.title);

        // Manipulate the entities here. Your
        // changes will be persistent!
    });
});

Fetching EntityResults

If you’re extra-daring and have a strange affection for annotations, or you just want to crack a joke for your coworkers just before you leave on vacation, you can also resort to using JPA’s javax.persistence.SqlResultSetMapping. Imagine the following mapping declaration:

@SqlResultSetMapping(
    name = "bookmapping",
    entities = {
        @EntityResult(
            entityClass = Book.class,
            fields = {
                @FieldResult(name = "id", column = "b_id"),
                @FieldResult(name = "title", column = "b_title"),
                @FieldResult(name = "author", column = "b_author_id")
            }
        ),
        @EntityResult(
            entityClass = Author.class,
            fields = {
                @FieldResult(name = "id", column = "a_id"),
                @FieldResult(name = "firstName", column = "a_first_name"),
                @FieldResult(name = "lastName", column = "a_last_name")
            }
        )
    }
)

Essentially, the above declaration maps database columns (@SqlResultSetMapping -> entities -> @EntityResult -> fields -> @FieldResult -> column) onto entities and their corresponding attributes. With this powerful technique, you can generate entity results from any sort of SQL query result.

Again, we’ll be creating a small little utility method:

public static <E> List<E> nativeQuery(
    EntityManager em, 
    org.jooq.Query query,
    String resultSetMapping
) {

    // Extract the SQL statement from the jOOQ query:
    Query result = em.createNativeQuery(
        query.getSQL(), resultSetMapping);

    // Extract the bind values from the jOOQ query:
    List<Object> values = query.getBindValues();
    for (int i = 0; i < values.size(); i++) {
        result.setParameter(i + 1, values.get(i));
    }

    // This implicit cast is a lie, but let's risk it
    return result.getResultList();
}

Note that the above API makes use of an anti-pattern, which is OK in this case, because JPA is not a type safe API in the first place.

Using the API

Now, again, you can pass your type safe jOOQ query to the EntityManager via the above API, passing the name of the SqlResultSetMapping along like so:

List<Object[]> result =
nativeQuery(em,
    DSL.using(configuration
       .select(
           AUTHOR.ID.as("a_id"),
           AUTHOR.FIRST_NAME.as("a_first_name"),
           AUTHOR.LAST_NAME.as("a_last_name"),
           BOOK.ID.as("b_id"),
           BOOK.AUTHOR_ID.as("b_author_id"),
           BOOK.TITLE.as("b_title")
       )
       .from(AUTHOR)
       .join(BOOK).on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
       .orderBy(BOOK.ID)), 
    "bookmapping" // The name of the SqlResultSetMapping
);

result.forEach((Object[] entities) -> {
    JPAAuthor author = (JPAAuthor) entities[1];
    JPABook book = (JPABook) entities[0];

    System.out.println(author.firstName + " " + 
                       author.lastName + " wrote " + 
                       book.title);
});

The result in this case is again an Object[], but this time, the Object[] doesn’t represent a tuple with individual columns, but it represents the entities as declared by the SqlResultSetMapping annotation.

This approach is intriguing and probably has its use when you need to map arbitrary results from queries, but still want managed entities. We can only recommend Thorben Janssen‘s interesting blog series about these advanced JPA features, if you want to know more:

Conclusion

Choosing between an ORM and SQL (or between Hibernate and jOOQ, in particular) isn’t always easy.

  • ORMs shine when it comes to applying object graph persistence, i.e. when you have a lot of complex CRUD, involving complex locking and transaction strategies.
  • SQL shines when it comes to running bulk SQL, both for read and write operations, when running analytics, reporting.

When you’re “lucky” (as in – the job is easy), your application is only on one side of the fence, and you can make a choice between ORM and SQL. When you’re “lucky” (as in – ooooh, this is an interesting problem), you will have to use both. (See also Mike Hadlow’s interesting article on the subject)

The message here is: You can! Using JPA’s native query API, you can run complex queries leveraging the full power of your RDBMS, and still map results to JPA entities. You’re not restricted to using JPQL.

Side-note

While we’ve been critical with some aspects of JPA in the past (read How JPA 2.1 has become the new EJB 2.0 for details), our criticism has been mainly focused on JPA’s (ab-)use of annotations. When you’re using a type safe API like jOOQ, you can provide the compiler with all the required type information easily to construct results. We’re convinced that a future version of JPA will engage more heavily in using Java’s type system, allowing a more fluent integration of SQL, JPQL, and entity persistence.

jOOQ Tuesdays: Vlad Mihalcea Gives Deep Insight into SQL and Hibernate

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

vlad_mihalcea

We have the pleasure of talking to Vlad Mihalcea in this third edition who will be telling us about the skills developers need to acquire when working with Java, SQL, and Hibernate.

Hi Vlad – You’re blog explodes with excellent posts about Hibernate. It looks like you love digging deep into the most popular persistence API in the market, right?

I really mean when saying that “teaching is my way of learning” and to master a certain technology, you have to go beyond the reference documentation. Hibernate has been around for 10 years now and there’s a plethora of projects built on top of it. The Hibernate Master Class focuses on some proven ORM design patterns, like concurrency control, caching and batching.

You’ve recently told me about your realisation of the lack of SQL insight in our industry. How did that come to be?

The Object-Relational mismatch is only the tip of the iceberg, when it comes to accessing data. The biggest problem we face in enterprise systems, is the Enterprise-Database developer mismatch.

A developer knows about the programming languages, design patterns and application architecturing, but database skills are always attributed to the Database Administrator role. This is a very dangerous assumption.

It’s as if we developed on Linux without ever wanting to learn how the operating system works, relying solely on the System Administrator knowledge. If you develop enterprise applications, you have no escape but learning how a database works. Reading the excellent “SQL Performance Explained” book, made me realize how little I knew about the inner-workings of relational database systems. This book is meant for developers and it’s a must-read for every enterprise developer professional.

What can we do to improve the situation for our industry? Is there a chance for a tighter integration of JPA and SQL? Or specifically, of Hibernate and jOOQ?

First, it’s the mindset that needs to change. We need to acknowledge that there’s no such thing as a one-size-fits-all framework, and that applies to database access as well. When I write unit tests, I don’t limit myself to JUnit. I also use Mockito and Hamcrest, a testing stack being a better alternative.

JPA excels when writing data, because you can the INSERT/UPDATE statements are automatically updated, whenever the persistence model changes. The implicit and explicit locking allow us to protect against lost updates, especially in long conversation workflows.

But while abstracting the SQL write statements is a doable task, when it comes to reading data, nothing can beat native SQL. The most commonly-used RDBMS have implemented non-standard data access techniques (window functions, Common Table Expressions, PIVOT) and the SQL-92 JPA abstraction layer can only focus on common functionalities. That’s why native querying is unavoidable on almost any enterprise system.

jOOQ has done a very good job promoting SQL knowledge into the Java ecosystem. Java is ruling the enterprise software development and SQL skills have always been the Achilles heel of most enterprise development teams.

While you can fire native queries from JPA, there’s no support for dynamic native query building. jOOQ allows you to build type-safe dynamic native queries, strengthening your application against SQL-injection attacks. jOOQ can be integrated with JPA, as I already proven on my blog, and the JPA-jOOQ combo can provide a solid data access stack.

Tell us a little bit about your Hibernate Master Class, and your personal blogging strategy.

The Hibernate Master Class blog series is actually a book in the making. Because I work a full-time job, it’s difficult to commit to a fixed writing schedule, so I can only write as much as my spare times allows me.

Once all topics are covered, I’ll turn all this info into a book, that I’m going to self-publish, following the “SQL Performance Explained” example.

[ Edit ] The book has been finished and is available here:

https://leanpub.com/high-performance-java-persistence

Where will you be in 5 years?

I enjoy both software architecture, as well as writing about it. I will continue on this journey and see where the wind will carry me.

jOOQ vs. Hibernate: When to Choose Which

Hibernate has become a de-facto standard in the Java ecosystem, and after the fact, also an actual JavaEE standard implementation if standards matter to you, and if you put the JCP on the same level with ISO, ANSI, IEEE, etc.

This article does not intended to discuss standards, but visions. Hibernate shares JPA’s vision of ORM. jOOQ shares SQL’s vision of powerful querying, so for the sake of the argument, let’s use Hibernate / JPA / ORM interchangeably, much like jOOQ / JDBC / SQL.

The question why should anyone not use Hibernate these days always shows up frequently – precisely because Hibernate is a de-facto standard, and the first framework choice in many other frameworks such as Grails (which uses GORM, which again uses Hibernate).

However, even Gavin King, the creator of Hibernate, doesn’t believe that Hibernate should be used for everything:

gavin-king

If that’s the case, are there any objective decision helping points that you could consider, when to use an ORM and when to use SQL?

Discussing on a high level

First off, let’s bring this discussion to a higher level. Instead of deciding between Hibernate and jOOQ as concrete implementations of their own domains, let’s think about ORM vs. SQL, and their different use-cases.

When deciding between an ORM (e.g. Hibernate) and SQL (e.g. jOOQ), the driving question that you should ask yourself is not the question of project complexity. Some of our most demanding customers are using jOOQ on medium-sized schemas with thousands of tables / views. Often, those schemas are extremely normalised and sometimes even deployed on as many as six different RDBMS. jOOQ was specifically designed to work in these scenarios, while keeping the simple use-case in mind as well.

So, instead of thinking about project complexity, ask yourself the following questions:

  • 1. Will your data model drive your application design, or will your application design drive your data model(s)?

    A main aspect here is the question whether you “care” about your database in the sense of whether it might survive your application. Very often, applications come and go. They may be re-written in Python / JavaScript, etc. 5 years down the line. Or you have multiple applications accessing the same database: Your Java application, some Perl scripts, stored procedures, etc. If this is the case, database design is a priority in your project, and jOOQ works extremely well in these setups.

    If you don’t necessarily “care” about your database in the sense that you just want to “persist” your Java domain somewhere, and this happens to be a relational database, then Hibernate might be a better choice – at least in early stages of your project, because you can easily generate your database schema from your Entity model.

  • 2. Will you do mostly complex reading and simple writing, or will you engage in complex writing?

    SQL really shines when reading is complex. When you join many tables, when you aggregate data in your database, when you do reporting, when you do bulk reading and writing. You think of your data in terms of set theory, e.g. your data as a whole. Writing CRUD with SQL is boring, though. This is why jOOQ also provides you with an ActiveRecord-style API that handles the boring parts, when you’re operating on single tables (Jason mentioned this).

    If, however, your writing becomes complex, i.e. you have to load a complex object graph with 20 entities involved into memory, perform optimistic locking on it, modify it in many different ways and then persist it again in one go, then SQL / jOOQ will not help you. This is what Hibernate has originally been created for.

Opinion

I believe that data is forever. You should *always* assume that your database survives your application. It is much easier to rewrite (parts of) an application than to migrate a database. Having a clean and well-designed database schema will always pay off down the line of a project, specifically of a complex project. See also our previous article about the fallacy of “schemaless” databases.

Also, most projects really do 90% reading and 10% writing, writing often not being complex (2-3 tables modified within a transaction). This means that most of the time, the complexity solved by Hibernate / JPA’s first and second level caches is not needed. People often misunderstand these features and simply turn off caching, flushing Hibernate’s cache to the server all the time, and thus using Hibernate in the wrong way.

If, however, you’re undecided about the above two axes of decision, you can go the middle way and use jOOQ only for reporting, batch processing, etc. and use Hibernate for your CRUD – in a CQRS (Command Query Responsibility Segregation: http://martinfowler.com/bliki/CQRS.html) style. There are also quite a few jOOQ users who have chosen this path.

Further reading

10 Java Articles Everyone Must Read

One month ago, we’ve published a list of 10 SQL Articles Everyone Must Read. A list of articles that we believe would add exceptional value to our readers on the jOOQ blog. The jOOQ blog is a blog focusing on both Java and SQL, so it is only natural that today, one month later, we’re publishing an equally exciting list of 10 Java articles everyone must read.

Note that by “must read”, we may not specifically mean the particular linked article only, but also other works from the same authors, who have been regular bloggers over the past years and never failed to produce new interesting content!

Here goes…

1. Brian Goetz: “Stewardship: the Sobering Parts”

The first blog post is actually not a blog post but a recording of a very interesting talk by Brian Goetz on Oracle’s stewardship of Java. On the jOOQ blog, we’ve been slightly critical about 1-2 features of the Java language in the past, e.g. when comparing it to Scala, or Ceylon.

Brian makes good points about why it would not be a good idea for Java to become just as “modern” as quickly as other languages. A must-watch for every Java developer (around 1h)

2. Aleksey Shipilёv: The Black Magic of (Java) Method Dispatch

In recent years, the JVM has seen quite a few improvements, including invokedynamic that arrived in Java 7 as a prerequisite for Java 8 lambdas, as well as a great tool for other, more dynamic languages built on top of the JVM, such as Nashorn.

invokedynamic is only a small, “high level” puzzle piece in the advanced trickery performed by the JVM. What really happens under the hood when you call methods? How are they resolved, optimised by the JIT? Aleksey’s article sub-title reveals what the article is really about:

“Everything you wanted to know about Black Deviously Surreptitious Magic in low-level performance engineering”

Definitely not a simple read, but a great post to learn about the power of the JVM.

Read Aleksey’s “The Black Magic of (Java) Method Dispatch

3. Oliver White: Java Tools and Technologies Landscape for 2014

We’re already in 2015, but this report by Oliver White (at the time head of ZeroTurnaround’s RebelLabs) had been exceptionally well executed and touches pretty much everything related to the Java ecosystem.

Read Oliver’s “Java Tools and Technologies Landscape for 2014

4. Peter Lawrey: Java Lambdas and Low Latency

When Aleksey has introduced us to some performance semantics in the JVM, Peter takes this one step further, talking about low latency in Java 8. We could have picked many other useful little blog posts from Peter’s blog, which is all about low-latency, high performance computing on the JVM, sometimes even doing advanced off-heap trickery.

Read Peter’s “Java Lambdas and Low Latency

5. Nicolai Parlog: Everything You Need To Know About Default Methods

Nicolai is a newcomer in the Java blogosphere, and a very promising one, too. His well-researched articles go in-depth about some interesting facts related to Java 8, digging out old e-mails from the expert group’s mailing list, explaining the decisions they made to conclude with what we call Java 8 today.

Read Nicolai’s “Everything You Need To Know About Default Methods

6. Lukas Eder: 10 Things You Didn’t Know About Java

This list wouldn’t be complete without listing another list that we wrote ourselves on the jOOQ blog. Java is an old beast with 20 years of history this year in 2015. This old beast has a lot of secrets and caveats that many people have forgotten or never thought about. We’ve uncovered them for you:

Read Lukas’s “10 Things You Didn’t Know About Java

7. Edwin Dalorzo: Why There Is Interface Pollution in Java 8

Edwin has been responding to our own blog posts a couple of times in the past with very well researched and thoroughly thought through articles, in particular about Java 8 related features, e.g. comparing Java 8 Streams with LINQ (something that we’ve done ourselves, as well).

This particular article explains why there are so many different and differently named functional interfaces in Java 8.

Read Edwin’s “Why There Is Interface Pollution in Java 8

8. Vlad Mihalcea: How Does PESSIMISTIC_FORCE_INCREMENT Lock Mode Work

When Java talks to databases, many people default to using Hibernate for convenience (see also 3. Oliver White: Java Tools and Technologies Landscape for 2014). Hibernate’s main vision, however, is not to add convenience – you can get that in many other ways as well. Hibernate’s main vision is to provide powerful means of navigating and persisting an object graph representation of your RDBMS’s data model, including various ways of locking.

Vlad is an extremely proficient Hibernate user, who has a whole blog series on how Hibernate works going. We’ve picked a recent, well-researched article about locking, but we strongly suggest you read the other articles as well:

Read Vlad’s “How Does PESSIMISTIC_FORCE_INCREMENT Lock Mode Work

9. Petri Kainulainen: Writing Clean Tests

This isn’t a purely Java-related blog post, although it is written from the perspective of a Java developer. Modern development involves testing – automatic testing – and lots of it. Petri has written an interesting blog series about writing clean tests in Java – you shouldn’t miss his articles!

Read Petri’s “Writing Clean Tests

10. Eugen Paraschiv: Java 8 Resources Collection

If you don’t already have at least 9 open tabs with interesting stuff to read after this list, get ready for a browser tab explosion! Eugen Paraschiv who maintains baeldung.com has been collecting all sorts of very interesting resources related to Java 8 in a single link collection. You should definitely bookmark this collection and check back frequently for interesting changes:

Read Eugen’s “Java 8 Resources Collection

Many other articles

There are, of course, many other very good articles providing deep insight into useful Java tricks. If you find you’ve encountered an article that would nicely complement this list, please leave a link and description in the comments section. Future readers will appreciate the additional insight.