Please, Java. Do Finally Support Multiline String Literals

I understand the idea of Java-the-language being rather hard to maintain in a backwards-compatible way. I understand the idea of JDK API, such as the collections, to be rather tough not to break. Yes.

I don’t understand why Java still doesn’t have multiline string literals.

How often do you write JDBC code (or whatever other external language or markup, say, JSON or XML you want to embed in Java) like this?

try (PreparedStatement s = connection.prepareStatement(
    "SELECT * "
  + "FROM my_table "
  + "WHERE a = b "
)) {
    ...
}

What’s the issue?

  • Syntax correctness, i.e. don’t forget to add a whitespace at the end of each line
  • Style in host language vs style in external language, sure the above code looks “nicely” formatted in Java, but it’s not formatted for the consuming server side
  • SQL injection, didn’t we teach our juniors not to perform this kind of string concatenation in SQL, to prevent SQL injection? Sure, the above is still safe, but what keeps a less experienced maintainer from embedding, accidentally, user input?

Today, I was working with some code written in Xtend, a very interesting language that compiles into Java source code. Xtend is exceptionally useful for templating (e.g. for generating jOOQ’s Record1Record22 API). I noticed another very nice feature of multi line strings:

The lack of need for escaping!

Multi line strings in Xtend are terminated by triple-apostrophes. E.g.

// Xtend
val regex = '''import java\.lang\.AutoCloseable;'''

Yes, the above is a valid Java regular expression. I’m escaping the dots when matching imports of the AutoCloseable type. I don’t have to do this tedious double-escaping that I have to do in ordinary strings to tell the Java compiler that the backslash is really a backslash, not Java escaping of the following character:

// Java
String regex = "import java\\.lang\\.AutoCloseable;";

So… Translated to our original SQL example, I would really like to write this, instead:

try (PreparedStatement s = connection.prepareStatement(
    '''SELECT *
       FROM my_table
       WHERE a = b'''
)) {
    ...
}

With a big nice-to-have plus: String interpolation (even PHP has it)!

String tableName = "my_table";
int b = 1;
try (PreparedStatement s = connection.prepareStatement(
    '''SELECT *
       FROM ${tableName}
       WHERE a = ${b}'''
)) {
    ...
}

Small but very effective improvement

This would be a very small (in terms of language complexity budget: Just one new token) but very effective improvement for all of us out there who are embedding an external language (SQL, XML, XPath, Regex, you name it) in Java. We do that a lot. And we hate it.

It doesn’t have to be as powerful as Xtend’s multiline string literals (which really rock with their whitespace management for formatting, and templating expressions). But it would be a start.

Please, make this a New Year’s resolution! :)

Look no Further! The Final Answer to “Where to Put Generated Code?”

This recent question on Stack Overflow made me think.

Why does jOOQ suggest to put generated code under “/target” and not under “/src”?

… and I’m about to give you the final answer to “Where to Put Generated Code?”

This isn’t only about jOOQ

Even if you’re not using jOOQ, or if you’re using jOOQ but without the code generator, there might be some generated source code in your project. There are many tools that generate source code from other data, such as:

  • The Java compiler (ok, byte code, not strictly source code. But still code generation)
  • XJC, from XSD files
  • Hibernate from .hbm.xml files, or from your schema
  • Xtend translates Xtend code to Java code
  • You could even consider data transformations, like XSLT
  • many more…

In this article, we’re going to look at how to deal with jOOQ-generated code, but the same thoughts apply also to any other type of code generated from other code or data.

Now, the very very interesting strategic question that we need to ask ourselves is: Where to put that code? Under version control, like the original data? Or should we consider generated code to be derived code that must be re-generated all the time?

The answer is nigh…

It depends!

Nope, unfortunately, as with many other flame-wary discussions, this one doesn’t have a completely correct or wrong answer, either. There are essentially two approaches:

Considering generated code as part of your code base

When you consider generated code as part of your code base, you will want to:

  • Check in generated sources in your version control system
  • Use manual source code generation
  • Possibly use even partial source code generation

This approach is particularly useful when your Java developers are not in full control of or do not have full access to your database schema (or your XSD or your Java code, etc.), or if you have many developers that work simultaneously on the same database schema, which changes all the time. It is also useful to be able to track side-effects of database changes, as your checked-in database schema can be considered when you want to analyse the history of your schema.

With this approach, you can also keep track of the change of behaviour in the jOOQ code generator, e.g. when upgrading jOOQ, or when modifying the code generation configuration.

When you use this approach, you will treat your generated code as an external library with its own lifecycle.

The drawback of this approach is that it is more error-prone and possibly a bit more work as the actual schema may go out of sync with the generated schema.

Considering generated code as derived artefacts

When you consider generated code to be derived artefacts, you will want to:

  • Check in only the actual DDL, i.e. the “original source of truth” (e.g. controlled via Flyway)
  • Regenerate jOOQ code every time the schema changes
  • Regenerate jOOQ code on every machine – including continuous integration machines, and possibly, if you’re crazy enough, on production

This approach is particularly useful when you have a smaller database schema that is under full control by your Java developers, who want to profit from the increased quality of being able to regenerate all derived artefacts in every step of your build.

This approach is fully supported by Maven, for instance, which foresees special directories (e.g. target/generated-sources), and phases (e.g. <phase>generate-sources</phase>) specifically for source code generation.

The drawback of this approach is that the build may break in perfectly “acceptable” situations, when parts of your database are temporarily unavailable.

Pragmatic approach

Some of you might not like that answer, but there is also a pragmatic approach, a combination of both. You can consider some code as part of your code base, and some code as derived. For instance, jOOQ-meta’s generated sources (used to query the dictionary views / INFORMATION_SCHEMA when generating jOOQ code) are put under version control as few jOOQ contributors will be able to run the jOOQ-meta code generator against all supported databases. But in many integration tests, we re-generate the sources every time to be sure the code generator works correctly.

Huh!

Conclusion

I’m sorry to disappoint you. There is no final answer to whether one approach or the other is better. Pick the one that offers you more value in your specific situation.

In case you’re choosing your generated code to be part of the code base, read this interesting experience report on the jOOQ User Group by Witold Szczerba about how to best achieve this.

Why Everyone Hates Operator Overloading

… no, don’t tell me you like Perl. Because you don’t. You never did. It does horrible things. It makes your code look like…

Designing Perl 6
Designing Perl 6 as found on the awesome Perl Humour page

Perl made heavy use of operator overloading and used operators for a variety of things. A similar tendency can be seen in C++ and Scala. See also people comparing the two. So what’s wrong with operator overloading?

People never agreed whether Scala got operator overloading right or wrong:

Usually, people then cite the usual suspects, such as complex numbers (getting things right):

class Complex(val real:Int, 
              val imaginary:Int) {
    def +(operand:Complex):Complex = {
        new Complex(real + operand.real, 
                    imaginary + operand.imaginary)
    }
 
    def *(operand:Complex):Complex = {
        new Complex(real * operand.real - 
                    imaginary * operand.imaginary,
            real * operand.imaginary + 
            imaginary * operand.real)
    }
}

The above will now allow for adding and multiplying complex numbers, and there’s absolutely nothing wrong with that:

val c1 = new Complex(1, 2)
val c2 = new Complex(2, -3)
val c3 = c1 + c2
 
val res = c1 + c2 * c3

But then, there are these weirdo punctuation things that make average programmers simply go mad:

 ->
 ||=
 ++=
 <=
 _._
 ::
 :+=

Don’t believe it? Check out this graph library!

To the above, we say:

Operator Overloading? Meh

How operator overloading should be

Operator overloading can be good, but mostly isn’t. In Java, we’re all missing better ways to interact with BigDecimal and similar types:

// How it is:
bigdecimal1.add(bigdecimal2.multiply(bigdecimal3));

// How it should be:
bigdecimal1 + bigdecimal2 * bigdecimal3

Of course, operator precedence would take place as expected. Unlike C++ or Scala, ideal operator overloading would simply map common operators to common method names. Nothing more. No one really wants API developers to come up with fancy ##-%>> operators.

While Ceylon, Groovy, and Xtend implemented this in a somewhat predictable and useful way, Kotlin is probably the language that has implemented the best standard operator overloading mechanism into their language. Their documentation states:

Binary operations

Expression Translated to
a + b a.plus(b)
a – b a.minus(b)
a * b a.times(b)
a / b a.div(b)
a % b a.mod(b)
a..b a.rangeTo(b)

That looks pretty straightforward. Now check this out:

“Array” access

Symbol Translated to
a[i] a.get(i)
a[i, j] a.get(i, j)
a[i_1, …, i_n] a.get(i_1, …, i_n)
a[i] = b a.set(i, b)
a[i, j] = b a.set(i, j, b)
a[i_1, …, i_n] = b a.set(i_1, …, i_n, b)

Now, I really don’t see a single argument against the above. This goes on, and unfortunately, Java 8 has missed this train, as method references cannot be assigned to variables and invoked like JavaScript functions (although, that’s not too late for Java 9+):

Method calls

Symbol Translated to
a(i) a.invoke(i)
a(i, j) a.invoke(i, j)
a(i_1, …, i_n) a.invoke(i_1, …, i_n)

Simply beautiful!

Conclusion

We’ve recently blogged about Ceylon’s awesome language features. But the above Kotlin features are definitely a killer and would remove any other sorts of desires to introduce operator overloading in Java for good.

Let’s hope future Java versions take inspiration from Kotlin, a language that got operator overloading right.

Fast File System Operations with Xtend, Lambdas, and ThreadPools

Recently, I’ve blogged about 10 Subtle Best Practices when Coding Java, and I have mentioned that you should start writing SAMs (Single Abstract Method) now, in order to be prepared for Java 8. But there’s another language gem out there, which comes in handy every once in a while, and that’s Eclipse Xtend. Xtend is a “dialect” of the Java language, compiling into Java source code, which then compiles into byte code.

Here’s a quickie showing how easily recursive file system operations can be done with Xtend, Lambdas, and ThreadPools.

class Transform {

  // This is the thread pool performing
  // all the "hard" work
  static ExecutorService ex;

  def static void main(String[] args) {
    // Initialise the thread pool with
    // something meaningful
    ex = Executors::newFixedThreadPool(4);

    // Pass the root directory to the
    // transform method
    val in = new File(...);

    // Recurse into the file transformation
    transform(in);
  }

  def static transform(File in) {

    // Calculate the target file name
    val out = new File(...);

    // Recurse into directories
    if (in.directory) {

      // Pass a FileFilter in the form of an
      // Xtend lambda expression
      for (file : in.listFiles[path |
             !path.name.endsWith(".class")
          && !path.name.endsWith(".zip")
          && !path.name.endsWith(".jar")
        ]) {
        transform(file);
      }
    }
    else {
      // Pass an Xtend lambda expression to
      // the ExecutorService
      ex.submit[ |
        // Read and write could be implemented
        // in Apache Commons IO
        write(out, transform(read(in)));
      ];
    }
  }

  def static transform(String content) {
    // Do the actual string transformation
  }
}

Granted, with Java 8, we’ll get lambdas as well, and that’s awesome. But Xtend has a couple of other nice features that can be seen above:

  • Passing lambdas to a couple of JDK methods, such as File.listFiles() or ExecutorService.submit()
  • Local variable type inference using valvar, or for
  • Method return type inference using def
  • Ability to omit parentheses when passing a lambda to a method
  • Calling getters and setters by convention, e.g. path.name, instead of path.getName(), or in.directory, instead of in.isDirectory()
  • You could also omit semi-colons, although I don’t personally think that’s a good idea.

Xtend is Java 8 already now, which can be very useful for scripts like the above