Look no Further! The Final Answer to “Where to Put Generated Code?”

This recent question on Stack Overflow made me think.

Why does jOOQ suggest to put generated code under “/target” and not under “/src”?

… and I’m about to give you the final answer to “Where to Put Generated Code?”

This isn’t only about jOOQ

Even if you’re not using jOOQ, or if you’re using jOOQ but without the code generator, there might be some generated source code in your project. There are many tools that generate source code from other data, such as:

  • The Java compiler (OK, it generates byte code rather than source code, but that is still code generation)
  • XJC, from XSD files
  • Hibernate, from .hbm.xml files or from your schema
  • Xtend, which translates Xtend code to Java code
  • Data transformations such as XSLT, if you want to stretch the definition
  • Many more…

In this article, we’re going to look at how to deal with jOOQ-generated code, but the same considerations apply to any other type of code generated from other code or data.

Now, the really interesting strategic question that we need to ask ourselves is: Where to put that code? Under version control, like the original data? Or should we consider generated code to be derived code that must be re-generated all the time?

The answer is nigh…

It depends!

Nope. Unfortunately, as with many other discussions that tend to end in flame wars, this one has no completely right or wrong answer either. There are essentially two approaches:

Considering generated code as part of your code base

When you consider generated code as part of your code base, you will want to:

  • Check generated sources into your version control system
  • Use manual source code generation
  • Possibly even use partial source code generation

This approach is particularly useful when your Java developers are not in full control of, or do not have full access to, your database schema (or your XSD, or your Java code, etc.), or when many developers work simultaneously on the same database schema, which changes all the time. It also lets you track the side effects of database changes, because the checked-in generated schema is available whenever you want to analyse the history of your schema.

With this approach, you can also keep track of the change of behaviour in the jOOQ code generator, e.g. when upgrading jOOQ, or when modifying the code generation configuration.

When you use this approach, you will treat your generated code as an external library with its own lifecycle.

The drawback of this approach is that it is more error-prone and possibly a bit more work, as the actual schema may go out of sync with the generated schema.
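One way to notice such drift early is to record a fingerprint of the schema next to the checked-in generated code and to compare it in your build. The following is only a minimal sketch under assumptions that are not part of jOOQ: it assumes your schema is also available as DDL scripts, and the class `SchemaFingerprint` and its methods are hypothetical names made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

/** Hypothetical helper, not part of jOOQ: fingerprints the DDL that generated code was built from. */
public final class SchemaFingerprint {

    private SchemaFingerprint() {}

    /** Returns the SHA-256 hex digest of the given DDL scripts, in the order given. */
    public static String of(List<String> ddlScripts) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String script : ddlScripts) {
                // Normalise line endings so that Windows and Unix checkouts agree.
                String normalised = script.replace("\r\n", "\n");
                digest.update(normalised.getBytes(StandardCharsets.UTF_8));
                // Separator byte, so that moving text between scripts changes the digest.
                digest.update((byte) 0);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True if the fingerprint recorded at generation time still matches the current DDL. */
    public static boolean isUpToDate(String recordedFingerprint, List<String> ddlScripts) {
        return of(ddlScripts).equals(recordedFingerprint.trim());
    }
}
```

You could write the result of `SchemaFingerprint.of(...)` to a file that is checked in together with the generated sources, and let a test fail when `isUpToDate(...)` returns false. This only detects that the DDL has changed since the last generation; it says nothing about whether a live database matches that DDL.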

Considering generated code as derived artefacts

When you consider generated code to be derived artefacts, you will want to:

  • Check in only the actual DDL, i.e. the “original source of truth” (e.g. controlled via Flyway)
  • Regenerate jOOQ code every time the schema changes
  • Regenerate jOOQ code on every machine – including continuous integration machines, and possibly, if you’re crazy enough, on production

This approach is particularly useful when you have a smaller database schema that is under the full control of your Java developers, who want to profit from the increased quality that comes with being able to regenerate all derived artefacts in every step of the build.

This approach is fully supported by Maven, for instance, which provides special directories (e.g. target/generated-sources) and phases (e.g. <phase>generate-sources</phase>) specifically for source code generation.
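What makes such a directory "derived" is that it can be thrown away and rebuilt at any time. As a rough illustration only, here is a sketch of that idea in plain Java. The class `GeneratedSources` and its methods are hypothetical and are not part of Maven or jOOQ; in a real Maven build, the plugin bound to the generate-sources phase and `mvn clean` take care of this for you.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper that treats a generated-sources directory as disposable. */
public final class GeneratedSources {

    private GeneratedSources() {}

    /** Resolves the conventional Maven location, e.g. target/generated-sources/jooq. */
    public static Path outputDirectory(Path projectRoot, String generatorName) {
        return projectRoot.resolve("target").resolve("generated-sources").resolve(generatorName);
    }

    /** Deletes the directory with everything below it, then recreates it empty. */
    public static Path reset(Path directory) throws IOException {
        if (Files.exists(directory)) {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(directory)) {
                // Deepest paths first, so files are deleted before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path path : paths) {
                Files.delete(path);
            }
        }
        return Files.createDirectories(directory);
    }
}
```

Calling `GeneratedSources.reset(GeneratedSources.outputDirectory(root, "jooq"))` before each generator run guarantees that no stale class survives from a table that has since been dropped. The subdirectory name "jooq" is only an example.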

The drawback of this approach is that the build may break in perfectly “acceptable” situations, for instance when parts of your database are temporarily unavailable.

Pragmatic approach

Some of you might not like that answer, but there is also a pragmatic approach: a combination of both. You can consider some code as part of your code base, and some code as derived. For instance, jOOQ-meta’s generated sources (used to query the dictionary views / INFORMATION_SCHEMA when generating jOOQ code) are put under version control, as few jOOQ contributors will be able to run the jOOQ-meta code generator against all supported databases. But in many integration tests, we re-generate the sources every time to be sure the code generator works correctly.

Huh!

Conclusion

I’m sorry to disappoint you. There is no final answer to whether one approach or the other is better. Pick the one that offers you more value in your specific situation.

If you choose to make generated code part of your code base, read this interesting experience report on the jOOQ User Group by Witold Szczerba about how best to achieve this.
