Using Testcontainers to Generate jOOQ Code

Database first is at the core of jOOQ’s design. jOOQ has been made primarily for classic systems the database is always there and always has been and will never leave. This is because we think “data have mass”

This not only translates to moving logic closer to the data (see our previous posts about the cost of JDBC round trips or generating vendor agnostic procedural logic), but also avoiding moving data around between migrations (e.g. of RDBMS products).

Compared to how “heavy” data is, applications and UIs come and go. Speaking of go, maybe you’ll replace all of your Java code tomorrow for some go code. But you will keep the database if it isn’t trivial.

With this in mind, jOOQ assumes you have a pre-existing schema, which you mange with Flyway or Liquibase, and then you use jOOQ to reverse engineer your updated schema using the code generator.

The old days

In the old days, setting up an Oracle instance was very heavy, and also hard. I remember working at a company where we had shared development and test instances. The schema was always in flux. We couldn’t assume a stable dev version.

As such, pointing the jOOQ code generator towards a live database used to be a bit of a challenge, which is why jOOQ offers alternative, connection-free code generation modes, including:

  • The JPADatabase, if you have a pre-existing JPA entity based meta model.
  • The XMLDatabase, if you have some form of XML version of your schema, which you can XSL transform to jOOQ’s format
  • The DDLDatabase, which can interpret your DDL scripts, e.g. the ones you pass to Flyway, or the ones produced by pg_dump.
  • The LiquibaseDatabase, which simulates a Liquibase database migration and uses the simulated database output as a source for meta information of the code generator

But all of the above have the same limitation. You can’t really use many vendor-specific features, such as advanced stored procedures, data types, etc.

A modern approach using testcontainers

Ideally, unless you’re supporting several RDBMS products (most people don’t), you should work only with your production database product, say PostgreSQL.

Thanks to testcontainers.org, it’s very easy to programmatically, or configuratively, start up a PostgreSQL instance of any version in a Docker container. If you have a SQL script that contains your database, you can supply it to the testcontainers JDBC URL, e.g. like this:

jdbc:tc:postgresql:13:///sakila?TC_TMPFS=/testtmpfs:rw&TC_INITSCRIPT=file:${basedir}/src/main/resources/postgres-sakila-schema.sql

For more information, see their docs about JDBC support. Now, add the testcontainers dependency on your project classpath, e.g.

<dependency>
  <groupId>org.testcontainers</groupId>
  <artifactId>postgresql</artifactId>
</dependency>

And use the ContainerDatabaseDriver instead of the actual PostgreSQL driver for your code generation configuration in jOOQ, e.g. when using Maven:

<plugin>
  <groupId>org.jooq</groupId>
  <artifactId>jooq-codegen-maven</artifactId>

  <executions>
    <execution>
      <id>java-generator</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>generate</goal>
      </goals>

      <configuration>
        <jdbc>
          <driver>
            org.testcontainers.jdbc.ContainerDatabaseDriver
          </driver>
          <url>${db.url}</url>
          <user>${db.username}</user>
          <password>${db.password}</password>
        </jdbc>
        <generator>
          <database>
            <inputSchema>public</inputSchema>
          </database>
          <target>
            <packageName>org.jooq.example.tc.db</packageName>
            <directory>src/main/java</directory>
          </target>
        </generator>
      </configuration>
    </execution>
  </executions>

  <dependencies>

    <!-- Junit seems a transitive dependency of testcontainers? -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.1</version>
    </dependency>
  </dependencies>
</plugin>

As simple as that! Check out the jOOQ-testcontainers-example for a runnable example that uses testcontainers for code generation using the above approach.

Adding database change management

A real world example would be using again Flyway or Liquibase, etc. to apply a complete database migration to your PostgreSQL instance inside of testcontainers, prior to generating code and/or running your integration tests.

This just slightly complicates things, but doesn’t produce any impossible problems. Instead of creating throwaway containers with a single TC_INITSCRIPT, you will now have to make sure the following steps are executed consecutively in your build somehow:

  1. A testcontainers instance of your database is started
  2. A Flyway or Liquibase migration is run inside of that database
  3. The jOOQ code generator reverse engineers that database
  4. Optionally, your integration tests also reuse the database of that container

Of course you should integration test your code! But for the sake of this discussion, that might be an optional step, as you may have different preferences on how to run those tests, e.g. more globally than just for this module. But in our example, let’s include the tests.

You can find the full example using testcontainers/flyway/jOOQ here.

Start the testcontainers instance

Unfortunately, testcontainers doesn’t ship any Maven / Gradle plugins yet to invoke container lifecycle management during a build. I’ve created a feature request for that here, which you should upvote: https://github.com/testcontainers/testcontainers-java/issues/4397.

But we can easily help ourselves by using the ever powerful Maven escape hatch that is the groovy-maven-plugin (the ideas are the same for gradle):

<plugin>
  <groupId>org.codehaus.gmaven</groupId>
  <artifactId>groovy-maven-plugin</artifactId>
  <version>2.1.1</version>
  <executions>
    <execution>
      <!-- Start the container in any phase before the actual code
           generation is required, i.e. at the latest in
           generate-sources -->
      <phase>generate-sources</phase>
      <goals>
        <goal>execute</goal>
      </goals>
      <configuration>
        <source>
          db = new org.testcontainers.containers.PostgreSQLContainer(
                  "postgres:latest")
            .withUsername("${db.username}")
            .withDatabaseName("postgres")
            .withPassword("${db.password}");
            
          db.start();

          // After you've started the container, collect its generated
          // JDBC URL (which contains a random port)
          project.properties.setProperty('db.url', db.getJdbcUrl());
        </source>
      </configuration>
    </execution>
  </executions>
  
  <dependencies>
    <dependency>
      <groupId>org.testcontainers</groupId>
      <artifactId>postgresql</artifactId>
      <version>1.15.2</version>
    </dependency>
  </dependencies>
</plugin>

So, that starts a container and keeps it running until the build terminates. I won’t show a graceful shutdown, because it’s not needed for the example, but you could implement that as well, of course.

Now, migrate your database

The above database is empty. Now to run the migration, in the example using Flyway, but it will be the same thing with Liquibase.

<plugin>
  <groupId>org.flywaydb</groupId>
  <artifactId>flyway-maven-plugin</artifactId>
  <version>7.14.0</version>
  <executions>
    <execution>

      <!-- We run the migration in the same phase, before jOOQ's
           code generation -->
      <phase>generate-sources</phase>
      <goals>
        <goal>migrate</goal>
      </goals>
      <configuration>

        <!-- This URL has been set by groovy, above -->
        <url>${db.url}</url>
        <user>${db.username}</user>
        <password>${db.password}</password>
        <locations>
          <location>
            filesystem:src/main/resources/db/migration
          </location>
        </locations>
      </configuration>
    </execution>
  </executions>
</plugin>

Add all the additional complexity of your migration in this configuration if you like. jOOQ wouldn’t know anything about it.

Now, generate the code

Again, nothing special here:

<plugin>
  <groupId>org.jooq</groupId>
  <artifactId>jooq-codegen-maven</artifactId>

  <executions>
    <execution>
      <id>java-generator</id>

      <!-- Same phase as above, but the previous plugins have already
           executed, so we're generating the db post migration -->
      <phase>generate-sources</phase>
      <goals>
        <goal>generate</goal>
      </goals>

      <configuration>
        <jdbc>

          <!-- Again, this URL has been set by groovy, above -->
          <url>${db.url}</url>
          <user>${db.username}</user>
          <password>${db.password}</password>
        </jdbc>
        <generator>
          <database>
            <inputSchema>public</inputSchema>
          </database>
          <target>
            <packageName>org.jooq.example.db</packageName>
          </target>
        </generator>
      </configuration>
    </execution>
  </executions>
</plugin>

And finally, optionally, integration test

If you want to re-use the above container with migrated database also in your integration tests, you could just pass along the generated JDBC URL to the maven-surefire-plugin as follows:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>

      <!-- Again, this URL has been set by groovy, above -->
      <db.url>${db.url}</db.url>
      <db.username>${db.username}</db.username>
      <db.password>${db.password}</db.password>
    </systemPropertyVariables>
  </configuration>
</plugin>

There are many ways to achieve the same thing, this is one of them that works decently out of the box. You can check out a full example from github here, and play around with it:

https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples/jOOQ-testcontainers-flyway-example

More about testcontainers

To learn more about testcontainers, see our interview with Richard North here.

jOOQ Tuesdays: Richard North Makes Database Testing More Reproducible with Testcontainers

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

I’m very excited to feature today Richard North, creator of Testcontainers, a promising new database and UI testing tool for reproducible tests.

Hi Richard – you work at Skyscanner, which means you have tons of travel data to work with. What’s the most exciting thing about working with this data?

Skyscanner is a really data-led company, using data at all levels in decision-making. The volume of data we gather and process really helps us understand what travellers want and serve them better. For example, our destination recommender system helps with people discover new and interesting places to go, based upon a vast amount of data, algorithms and experiments. But it’s interesting how this varies from recommendations in internet media companies – there are far fewer possible destinations than books, songs and movies, yet users’ reasons for travelling and tastes can be more nuanced and varied.

There’s yet more data-oriented work under the surface, too – for example, the infrastructure needed to gather and analyse such large amounts of data, and the running of experiments to help us improve. There are a lot of smart people working hard to make this all happen, and it’s an exciting place to be!

You’ve created an increasingly popular testing framework, Testcontainers. What made you do it? What itch does it scratch?

Well, I think it’s something that scratches several itches at the same time – things that previously we only had isolated solutions to. The common element, though, is reproducibility of test environments. Of all my time developing and writing tests for JVM-based systems, it’s always been the non-JVM dependencies that caused the most complexity, unreliability and maintenance overhead.

I remember my first day as a developer, years ago: I was given a desktop machine and 2 days’ worth of step-by-step instructions that I needed to follow – just so that I’d be able to develop and run tests with all dependencies in place. A few months later I had to repeat the same task many times over when building new CI servers.

A lot has changed since then in terms of how we deploy and manage our production infrastructure, and thankfully Docker has done a lot to further bring prod-parity to developers’ machines.

Testcontainers started out as my effort to bring the full power of Docker to integrated testing on the JVM in two areas that I’ve experienced the most pain: testing against a clean, representative database, and making browser-based selenium testing more reproducible, both for developers and on CI.

Mostly being curious about testing databases, your documentation mentions Testcontainers as an alternative to using H2 as a test database. What are the disadvantages of emulating a database e.g. with H2? Did you make any personal experience with that?

Yes, definitely – it was one of the tipping point factors that triggered me to create Testcontainers. I do think H2 is a fantastic piece of work in what it manages to deliver, and it’s something I’ve used on a number of projects to good effect.

However, compatibility with real databases has often been a sticking point. Back in 2015, before I started Testcontainers, we were struggling with a few MySQL features that didn’t have equivalents in H2. We were facing the unpleasant prospect of having to constrain our implementation to what H2 would allow us to test against. It became fairly obvious that there was a gap in the market for an H2-like tool that was actually a facade to a Docker-based database container – and Testcontainers was born.

How do you think of mocking the database at any layer, including the DAO layer, service layer, etc.?

I’m all in favour of keeping tests small, light and layered, and using mocks to accomplish this. This might sound strange coming from somebody who has developed an integrated testing tool, but it’s true!

Still, I feel that we need to be pragmatic about how we approach automated tests and how we make sure we’re testing the right thing – especially when crossing boundaries. Are we testing how this code behaves against reality, or are we testing against our own (potentially false) understanding of how external components work?

My feeling is that it’s quite straightforward to mock layers of your system that you yourself wrote, or where you can easily jump into the source code, a spec or documentation. With an external component, you can still produce a mock that behaves how you expect, or how you witness the real thing behaving. But does that mock continue to represent the real thing, especially after accretion of other features, or the additional perils of state that a database entails – schema changes and actual data?

My ideal is to mock the data access layer for consumption by higher layers, but to be quite careful about what the data access layer itself talks to in my tests. It should probably be a real database. Hopefully Testcontainers is one tool that helps make this particular thing a little less painful – so that when you find yourself needing to do this, there’s a way to do it easily.

What’s the biggest challenge you’ve faced when testing databases, or other things?

It’s not databases, but I’d say that by far the hardest testing challenge I’ve faced as a developer is mobile apps, especially iOS. I’ve always enjoyed mobile development  as a whole, but when switching from a Java server-side/web project to mobile, it really feels like you’re going back in time. Some of the challenges are harder – such as asynchronicity and platform APIs that make it harder to structure software in a testable way. But it also feels like the tooling is much further behind, and until quite recently received far less attention. I feel the net result has been that developers have been discouraged from investing in automated tests, which is sad given that we know how valuable they can be.

Things are getting better, but I do greatly prefer the testing aspects of working on server-side JVM projects. For all its difficulties, we are actually quite lucky!