Don’t Extract Everything Into a Method

Every now and then, I tweet something like this, just to piss off some clean coders:

Apart from the obvious trolling factor (why can’t I ever resist?), I do think there’s something thought provoking in such a tweet.

First off, given how rare break and continue statements are in Java code, many people probably don’t know that you can break out of an if-else block, or any labelled block, for that matter.

Secondly, we’ve been following cargo cults about clean code so much that we completely forget about some more obscure language features of our favourite languages that would be so handy in edge cases. Like these ones!

Breaking out of loops

Most people know how this works:

for (int i = 0;; i++)

    // Stop at 42
    if (i == 42)
        break;

    // Ignore even numbers
    else if (i % 2 == 0)
        continue;

    // Main action
    else
        System.out.println("i = " + i);

Of course, this code is bogus just to illustrate the syntax. A much simpler, refactored, equivalent version is this:

// Stop criteria in loop, duh
for (int i = 0; i < 42; i++)

    // Inverse if and else for even numbers
    if (i % 2 != 0)
        System.out.println("i = " + i);

This refactoring was obvious and trivial because the loop is so simple. But sometimes it isn’t, or sometimes the break or continue statement makes a loop just simpler to read (“simplicity” like “beauty” is in the eye of the beholder, of course). Parsers are the most obvious example, but also, for example, consider jOOQ’s AbstractRecord.compareTo() method, which contains this loop:

for (int i = 0; i < size(); i++) {
    final Object thisValue = get(i);
    final Object thatValue = that.get(i);

    if (thisValue == null && thatValue == null)
        continue;
    else if (thisValue == null)
        return 1;
    else if (thatValue == null)
        return -1;
    else {
        // ... Actually interesting comparisons
    }
}

// If we got through the above loop, the two records are equal
return 0;

jOOQ Records are compared with each other like SQL Records (if they are comparable, i.e. of equal degree): From the the first attribute to the last.

The interesting bit here is the continue statement, that skips to the next attribute, if both attributes are null.

Of course, if you are a feverish hater of such explicit, goto-style control flow, you would probably refactor this to something “much better”, namely:

for (int i = 0; i < size(); i++) {
    final Object thisValue = get(i);
    final Object thatValue = that.get(i);

    if (!(thisValue == null && thatValue == null)) {
        if (thisValue == null)
            return 1;
        else if (thatValue == null)
            return -1;
        else {
            // ... Actually interesting comparisons
        }
    }
}

// If we got through the above loop, the two records are equal
return 0;

Drawbacks:

  • There’s one more level of indentation
  • The if-else branch series is no longer “regular”, i.e. some conditions lead to quite different control flow than others
  • This worked fine for a single continue statement, but we might complicate this loop and add a second continue statement, or better: A nested loop that also contains break or continue statements…

With loops, the choice of using break or continue is sometimes “obvious”. Refactoring the imperative programming style to something more “elegant” might take quite a while of thinking, introduce regressions, and (ghasp) make the code more complex and less readable. Yes, I said it. Sometimes, imperative style is simpler (not better, just simpler).

Of course, this isn’t necessarily true in this particular example, so don’t haunt me 😉

Breaking out of an if

While breaking out of loops (or continuing them from in the middle) is somewhat common in complicated loops, breaking out of an if is certainly less common. In fact, you can break out of any labeled statement in Java (label-less breaks are possible with loops only):

// No-op
label1:
break label1;

// More interesting, Russian Roulette:
label2: {
    System.out.println("Pulling trigger");

    if (Math.random() < 0.16) {
        System.out.println("Click");
        break label2;
    }

    System.out.println("Whew");
}

So, why not break out of an if-else?

public class FunWithBreak {
    public static void main(String[] args) {
        label:
        if (args.length > 0) {
            System.out.println("We got args!");

            for (String arg : args)
                if ("ignore-args".equals(arg))
                    break label;

            for (String arg : args)
                System.out.println("Arg : " + arg);
        }
        else {
            System.out.println("No args");
        }

        System.out.println("End of program");
    }
}

Now run this program from the commandline

> java FunWithBreak
No args
End of program

> java FunWithBreak a b c
We got args!
Arg : a
Arg : b
Arg : c
End of program

> java FunWithBreak a b c ignore-args
We got args!
End of program

It’s kind of neat, no? The break in the middle of the if statement. It makes sense because there’s an early abort condition for the if-block, and only for the if-block.

Alternatives

Someone responded to my original tweet, that I should refactor the code and factor out a method for the if block, which would allow for replacing the break by return. Here’s how that would work:

public class FunWithBreak {
    public static void main(String[] args) {
        if (args.length > 0)
            nameThisLater(args);
        else
            System.out.println("No args");

        System.out.println("End of program");
    }

    // TODO find a meaningful name for this method later
    private static void nameThisLater(String[] args) {
        System.out.println("We got args!");

        for (String arg : args)
            if ("ignore-args".equals(arg))
                return;

        for (String arg : args)
            System.out.println("Arg : " + arg);
    }
}

Now, that certainly looks more common, which doesn’t mean it’s better. We’ve now factored out a useless method that is called only exactly once, and that is inlined (hopefully) by the JIT to become the original code again.

The advantage is that our original main() method got a bit simpler to read (once we’ve found a meaningful name for the method), but again, simplicity is in the eye of the beholder. But the price we pay here is that we have to pass all local state as method arguments. Simple in this case (single local variable args), but that can become quite complex, especially when we have local mutable collections that need to be passed around.

Besides, others might argue that one shouldn’t return from the middle of a method. They are proponents of “single-return-statement methods”.

Sure, why not. Here’s how that would look:

public class FunWithBreak {
    public static void main(String[] args) {
        if (args.length > 0) {
            ifBlock(args);
        }
        else {
            System.out.println("No args");
        }

        System.out.println("End of program");
    }

    private static void ifBlock(String[] args) {
        System.out.println("We got args!");

        boolean ignoreArgs = false;
        for (String arg : args) {
            if ("ignore-args".equals(arg)) {
                ignoreArgs = true;
                break; // We need break again!
            }
        }

        if (!ignoreArgs)
            for (String arg : args)
                System.out.println("Arg : " + arg);
    }
}

Is it really better?

I don’t know. I’ll just stick with the original break-out-of-if.

Bonus

Surprise your coworkers with an even more unusual placement of your label. Between the if statement and the ensuing block. After all, the curly braces are just an ordinary block, not part of the if statement

public class FunWithBreak {
    public static void main(String[] args) {
        if (args.length > 0) label: {
            System.out.println("We got args!");

            for (String arg : args)
                if ("ignore-args".equals(arg))
                    break label;

            for (String arg : args)
                System.out.println("Arg : " + arg);
        }
        else {
            System.out.println("No args");
        }

        System.out.println("End of program");
    }
}

Second bonus

Now that we’ve gotten all hooked on imperative control flow… What does this print?

for (int i = 0; i < 3; i++) {

    hmmm:
    try {
        System.out.println("Try      : " + i);
        if (i % 2 == 0)
            break hmmm;
        else
            throw new RuntimeException("Hmmm");
    }
    finally {
        System.out.println("Finally  : " + i);
        if (i < 2)
            continue;
    }

    System.out.println("Loop end : " + i);
}

Solution (select text below)

Try      : 0
Finally  : 0
Try      : 1
Finally  : 1
Try      : 2
Finally  : 2
Loop end : 2

Rare Uses of a “ControlFlowException”

Control flows are a “relict” from imperative programming, which has leaked into various other programming paradigms, including Java’s object oriented paradigm. Apart from the useful and ubiquitous branch and loop structures, there are also primitives (e.g. GOTO) and non-locals (e.g. exceptions). Let’s have a closer look at these controversial control flow techniques.

GOTO

goto is a reserved word in the Java language. goto is also a valid instruction in JVM bytecode. Yet, in Java, it isn’t easily possible to peform goto operations. One example taken from this Stack Overflow question can be seen here:

Jumping forward

label: {
    // do stuff
    if (check) break label;
    // do more stuff
}

In bytecode:

    2  iload_1 [check]
    3  ifeq 6          // Jumping forward
    6  ..

Jumping backward

label: do {
    // do stuff
    if (check) continue label;
    // do more stuff
    break label;
} while(true);

In bytecode:

     2  iload_1 [check]
     3  ifeq 9
     6  goto 2          // Jumping backward
     9  ..

Of course, these tricks are useful only in very very rare occasions, and even then, you might want to re-consider. Because we all know what happens when we use goto in our code:

One little goto...

Drawing taken from xkcd: http://xkcd.com/292/

Breaking out of control flows with exceptions

Exceptions are a good tool to break out of a control flow structure in the event of an error or failure. But regular jumping downwards (without error or failure) can also be done using exceptions:

try {
    // Do stuff
    if (check) throw new Exception();
    // Do more stuff
}
catch (Exception notReallyAnException) {}

This feels just as kludgy as the tricks involving labels, mentioned before.

Legitimate uses of exceptions for control flow:

However, there are some other very rare occasions, where exceptions are a good tool to break out of a complex, nested control flow (without error or failure). This may be the case when you’re parsing an XML document using a SAXParser. Maybe, your logic is going to test the occurrence of at least three <check/> elements, in case of which you may want to skip parsing the rest of the document. Here is how to implement the above:

Create a ControlFlowException:

package com.example;

public class ControlFlowException 
extends SAXException {}

Note that usually, you might prefer a RuntimeException for this, but the SAX contracts require handler implementations to throw SAXException instead.

Use that ControlFlowException in a SAX handler:

package com.example;

import java.io.File;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class Parse {
  public static void main(String[] args) 
  throws Exception {
    SAXParser parser = SAXParserFactory
        .newInstance()
        .newSAXParser();

    try {
      parser.parse(new File("test.xml"),
          new Handler());
      System.out.println(
          "Less than 3 <check/> elements found.");
    } catch (ControlFlowException e) {
      System.out.println(
          "3 or more <check/> elements found.");
    }
  }

  private static class Handler 
  extends DefaultHandler {

    int count;

    @Override
    public void startElement(
        String uri, 
        String localName, 
        String qName,
        Attributes attributes) {
      
      if ("check".equals(qName) && ++count >= 3)
        throw new ControlFlowException();
    }
  }
}

When to use exceptions for control flow:

The above practice seems reasonable with SAX, as SAX contracts expect such exceptions to happen, even if in this case, they’re not exceptions but regular control flow. Here are some indications about when to use the above practice in real world examples:

  • You want to break out of a complex algorithm (as opposed to a simple block).
  • You can implement “handlers” to introduce behaviour into complex algorithms.
  • Those “handlers” explicitly allow throwing exceptions in their contracts.
  • Your use case does not pull the weight of actually refactoring the complex algorithm.

A real-world example: Batch querying with jOOQ

In jOOQ, it is possible to “batch store” a collection of records. Instead of running a single SQL statement for every record, jOOQ collects all SQL statements and executes a JDBC batch operation to store them all at once.

As each record encapsulates its generated SQL rendering and execution for a given store() call in an object-oriented way, it would be quite tricky to extract the SQL rendering algorithm in a reusable way, without breaking (or exposing) too many things. Instead, jOOQ’s batch operation implements this simple pseudo-algorithm:

// Pseudo-code attaching a "handler" that will
// prevent query execution and throw exceptions
// instead:
context.attachQueryCollector();

// Collect the SQL for every store operation
for (int i = 0; i < records.length; i++) {
  try {
    records[i].store();
  }

  // The attached handler will result in this
  // exception being thrown rather than actually
  // storing records to the database
  catch (QueryCollectorException e) {

    // The exception is thrown after the rendered
    // SQL statement is available
    queries.add(e.query());                
  }
}

A real-world example: Exceptionally changing behaviour

Another example from jOOQ shows how this technique can be useful to introduce exceptional behaviour that is applicable only in rare cases. As explained in issue #1520, some databases have a limitation regarding the number of possible bind values per statement. These are:

  • SQLite: 999
  • Ingres 10.1.0: 1024
  • Sybase ASE 15.5: 2000
  • SQL Server 2008: 2100

In order to circumvent this limitation, it will be necessary for jOOQ to inline all bind values, once the maximum has been reached. As jOOQ’s query model heavily encapsulates SQL rendering and variable binding behaviour by applying the composite pattern, it is not possible to know the number of bind values before traversing a query model tree. For more details about jOOQ’s query model architecture, consider this previous blog post:

https://blog.jooq.org/2012/04/10/the-visitor-pattern-re-visited

So the solution is to render the SQL statement and count bind values that are effectively going to be rendered. A canonical implementation would be this pseudo code:

String sql;

query.renderWith(countRenderer);
if (countRenderer.bindValueCount() > maxBindValues) {
  sql = query.renderWithInlinedBindValues();
}
else {
  sql = query.render();
}

As can be seen, a canonical implementation will need to render the SQL statement twice. The first rendering is used only to count the number of bind values, whereas the second rendering will generate the true SQL statement. The problem here is that the exceptional behaviour should only be put in place, once the exceptional event (too many bind values) occurs. A much better solution is to introduce a “handler” that counts bind values in a regular “rendering attempt”, throwing a ControlFlowException for those few exceptional “attempts” where the number of bind values exceeds the maximum:

// Pseudo-code attaching a "handler" that will
// abort query rendering once the maximum number
// of bind values was exceeded:
context.attachBindValueCounter();
String sql;
try {

  // In most cases, this will succeed:
  sql = query.render();
}
catch (ReRenderWithInlinedVariables e) {
  sql = query.renderWithInlinedBindValues();
}

The second solution is better, because:

  • We only re-render the query in the exceptional case.
  • We don’t finish rendering the query to calculate the actual count, but abort early for re-rendering. I.e. we don’t care if we have 2000, 5000, or 100000 bind values.

Conclusion

As with all exceptional techniques, remember to use them in the right moment. If in doubt, think again.