Don’t Overdo the “Principle of Least Astonishment” Cargo Cult

As we all agree, GOTO is evil, right? Relevant XKCD

Or even funnier: New Intern Knows Best

Of course, GOTO isn’t evil

Of course, somewhere deep down in our professional selves, we know that GOTO isn’t evil, it’s just a very basic processor instruction that was available since the early days of assembly code. GOTO is everywhere. Take the following Java code, for instance:

public class Goto {
    public static void main(String[] args) {
        if (args.length == 0)
            System.out.println("No args");
        else
            System.out.println("Args");
    }
}

The bytecode that is generated from the above logic is this:

  // Method descriptor #15 ([Ljava/lang/String;)V
  // Stack: 2, Locals: 1
  public static void main(java.lang.String[] args);
     0  aload_0 [args]
     1  arraylength
     2  ifne 16
     5  getstatic java.lang.System.out : java.io.PrintStream [16]
     8  ldc  [22]
    10  invokevirtual java.io.PrintStream.println(java.lang.String) : void [24]
    13  goto 24
    16  getstatic java.lang.System.out : java.io.PrintStream [16]
    19  ldc  [30]
    21  invokevirtual java.io.PrintStream.println(java.lang.String) : void [24]
    24  return

There are two jumps. Byte code (like assembly code) doesn’t know anything about local scope nesting. There’s just a stream of instructions and we’re jumping around them. If goto were possible in Java (the keyword is reserved, so who knows about a future Java…?), we could write this:

public class Goto {
    public static void main(String[] args) {
        if (args.length != 0)
            goto args;

        System.out.println("No args");
        goto end;

        args:
        System.out.println("Args");

        end:
        return;
    }
}

Of course, the if-else construct is more readable and less error prone. Some of the biggest problems we’ve had with true GOTO instructions in the past is that they tend to lead to spaghetti code. In particular, notice how we’re not jumping from the “args statement” to the “end statement”. We’re just “falling through” like in a “bad” switch statement:

switch (i) {
    case 1:
    case 2:
        System.out.println("1 or 2");
        // no break
    case 3:
        System.out.println("3 (or 1 or 2!)");
}

If you will, the switch statement is very similar to a set of GOTO instructions.

The argument today is about readability, not executability

The case against “classic” GOTO is a case in favour of correctness.

These days, however, we don’t have to squeeze our code into 16KB of RAM anymore, these problems are problems of the past. So we’ve started “engineering” our code (whatever that means 😉 ) according to “best practices”, most of which involve more humans than machines. We don’t really need “GOTO” anymore, since we have higher forms of control flow abstractions.

Java has “GOTO” in the form of labelled statements and break and continue statements. For instance:

Jumping forward

label: {
  // do stuff
  if (check) break label;
  // do more stuff
}

In bytecode:

2  iload_1 [check]
3  ifeq 6          // Jumping forward
6  ..

Jumping backward

label: do {
  // do stuff
  if (check) continue label;
  // do more stuff
  break label;
} while(true);

In bytecode:

 2  iload_1 [check]
 3  ifeq 9
 6  goto 2          // Jumping backward
 9  ..

Just recently, I’ve blogged about labelled statements making a case against extracting everything into a method. Why? Because a labelled statement is a (bit unusual) way of creating a local “method”. Squint really hard for a moment and let your imagination run wild with this:

Labelled statement Method
Label name Method name
Local variables prior to label Method arguments
Break statement Return statement
Continue statement Recursion (remember: squint 🙂 )

So, if we admit the above, then labelled statements in Java are almost like one-shot local methods (oh how I wish Java had actual local methods). There’s no risk of doing things wrong like with GOTO. There’s no risk of any accidents in jumping to the wrong location, because we still have locally nested scope and everything is well defined. So, there’s no correctness argument against using labelled statements.

Principle of least astonishment

Why aren’t they used more often? Probably, because we rarely write such complicated imperative code where labelled statements become really useful. I do from time to time, especially inside jOOQ’s parser logic. Labelled continue is especially useful to break out of an inner loop, continuing the outer loop.

Labelled breaks (as mentioned in that previous article) are less commonly useful. Usually, it is easy to refactor an if-else branch into a more readable version (e.g. by inverting if and else) than by breaking out of it. But sometimes, breaking is the more readable version, e.g. because it leads to a more consistent indentation where each branch is at the same level of indentation.

Remember the analogy with methods? Break is the same as return. Every time you want to return early from a method, you could break early out of a statement. Why not? Does it have to be a new method every time? Certainly not. Sometimes, jumping to a method is more distractive than jumping locally (again, as long as Java doesn’t have local methods).

The important thing here is that simplicity and readability, just like beauty, are in the eye of the beholder. We’re in an area where we cannot clearly say that something is correct or wrong. Better or worse. Because again, unlike GOTO, breaking out of labelled statements yields no correctness risk.

So, people start arguing in favour of the “principle of least astonishment” (especially in the comments of the DZone version of my post). Sure. Labelled breaks are a bit harder to read, because we hardly ever see them in our code. That doesn’t mean they’re bad. We hardly ever use synchronized these days (with JDK library support having become much better), but that doesn’t mean synchronized is bad and shouldn’t be used. It’s rare, and thus we might get astonished. We don’t usually use exceptions as control flow signals, but why shouldn’t we? It is important to remind ourselves that the principle of least astonishment is a general guideline that allows for exceptions.

After all, astonishment is what makes us stop and think a bit more thoroughly about code. We might even learn something new, like the fact that we can break out of labelled statements.

I like Ryan James Spencer’s way of putting it:

Learning vim… It has become a joke that programmers don’t know how to exit vim:

(of course, it’s simply “:q!”, how hard can it be 😉 )

The point is, yes this is surprising, astonishing, unexpected. If you’re not using vim frequently. If you’re using it frequently, it’s obvious, just like any other funky IDE keyboard shortcut. Or like using the middle mouse button to close tabs in almost any program (that was a revelation).

In order to be productive, we shouldn’t be astonished constantly. We should mostly be in our technical / infrastructure comfort zones, because most of us write business logic, and we want to spend our precious time and brain cells on that, not on the infrastructure logic. Everyone likes SQL. No one likes working around funky JDBC edge cases.

Another interesting reply by Chris Martin:

Absolutely. That was the point of my previous article. Breaking out of labelled statements is cool every now and then. It might be astonishing if you don’t write complex algorithms (like parsers) every day. It is not at all astonishing if you do.

So, please. Be open. Stop cargo culting. Use labelled statements and break out of them. Every once in a while, when you think that makes your code clearer.

Don’t Extract Everything Into a Method

Every now and then, I tweet something like this, just to piss off some clean coders:

Apart from the obvious trolling factor (why can’t I ever resist?), I do think there’s something thought provoking in such a tweet.

First off, given how rare break and continue statements are in Java code, many people probably don’t know that you can break out of an if-else block, or any labelled block, for that matter.

Secondly, we’ve been following cargo cults about clean code so much that we completely forget about some more obscure language features of our favourite languages that would be so handy in edge cases. Like these ones!

Breaking out of loops

Most people know how this works:

for (int i = 0;; i++)

    // Stop at 42
    if (i == 42)
        break;

    // Ignore even numbers
    else if (i % 2 == 0)
        continue;

    // Main action
    else
        System.out.println("i = " + i);

Of course, this code is bogus just to illustrate the syntax. A much simpler, refactored, equivalent version is this:

// Stop criteria in loop, duh
for (int i = 0; i < 42; i++)

    // Inverse if and else for even numbers
    if (i % 2 != 0)
        System.out.println("i = " + i);

This refactoring was obvious and trivial because the loop is so simple. But sometimes it isn’t, or sometimes the break or continue statement makes a loop just simpler to read (“simplicity” like “beauty” is in the eye of the beholder, of course). Parsers are the most obvious example, but also, for example, consider jOOQ’s AbstractRecord.compareTo() method, which contains this loop:

for (int i = 0; i < size(); i++) {
    final Object thisValue = get(i);
    final Object thatValue = that.get(i);

    if (thisValue == null && thatValue == null)
        continue;
    else if (thisValue == null)
        return 1;
    else if (thatValue == null)
        return -1;
    else {
        // ... Actually interesting comparisons
    }
}

// If we got through the above loop, the two records are equal
return 0;

jOOQ Records are compared with each other like SQL Records (if they are comparable, i.e. of equal degree): From the the first attribute to the last.

The interesting bit here is the continue statement, that skips to the next attribute, if both attributes are null.

Of course, if you are a feverish hater of such explicit, goto-style control flow, you would probably refactor this to something “much better”, namely:

for (int i = 0; i < size(); i++) {
    final Object thisValue = get(i);
    final Object thatValue = that.get(i);

    if (!(thisValue == null && thatValue == null)) {
        if (thisValue == null)
            return 1;
        else if (thatValue == null)
            return -1;
        else {
            // ... Actually interesting comparisons
        }
    }
}

// If we got through the above loop, the two records are equal
return 0;

Drawbacks:

  • There’s one more level of indentation
  • The if-else branch series is no longer “regular”, i.e. some conditions lead to quite different control flow than others
  • This worked fine for a single continue statement, but we might complicate this loop and add a second continue statement, or better: A nested loop that also contains break or continue statements…

With loops, the choice of using break or continue is sometimes “obvious”. Refactoring the imperative programming style to something more “elegant” might take quite a while of thinking, introduce regressions, and (ghasp) make the code more complex and less readable. Yes, I said it. Sometimes, imperative style is simpler (not better, just simpler).

Of course, this isn’t necessarily true in this particular example, so don’t haunt me 😉

Breaking out of an if

While breaking out of loops (or continuing them from in the middle) is somewhat common in complicated loops, breaking out of an if is certainly less common. In fact, you can break out of any labeled statement in Java (label-less breaks are possible with loops only):

// No-op
label1:
break label1;

// More interesting, Russian Roulette:
label2: {
    System.out.println("Pulling trigger");

    if (Math.random() < 0.16) {
        System.out.println("Click");
        break label2;
    }

    System.out.println("Whew");
}

So, why not break out of an if-else?

public class FunWithBreak {
    public static void main(String[] args) {
        label:
        if (args.length > 0) {
            System.out.println("We got args!");

            for (String arg : args)
                if ("ignore-args".equals(arg))
                    break label;

            for (String arg : args)
                System.out.println("Arg : " + arg);
        }
        else {
            System.out.println("No args");
        }

        System.out.println("End of program");
    }
}

Now run this program from the commandline

> java FunWithBreak
No args
End of program

> java FunWithBreak a b c
We got args!
Arg : a
Arg : b
Arg : c
End of program

> java FunWithBreak a b c ignore-args
We got args!
End of program

It’s kind of neat, no? The break in the middle of the if statement. It makes sense because there’s an early abort condition for the if-block, and only for the if-block.

Alternatives

Someone responded to my original tweet, that I should refactor the code and factor out a method for the if block, which would allow for replacing the break by return. Here’s how that would work:

public class FunWithBreak {
    public static void main(String[] args) {
        if (args.length > 0)
            nameThisLater(args);
        else
            System.out.println("No args");

        System.out.println("End of program");
    }

    // TODO find a meaningful name for this method later
    private static void nameThisLater(String[] args) {
        System.out.println("We got args!");

        for (String arg : args)
            if ("ignore-args".equals(arg))
                return;

        for (String arg : args)
            System.out.println("Arg : " + arg);
    }
}

Now, that certainly looks more common, which doesn’t mean it’s better. We’ve now factored out a useless method that is called only exactly once, and that is inlined (hopefully) by the JIT to become the original code again.

The advantage is that our original main() method got a bit simpler to read (once we’ve found a meaningful name for the method), but again, simplicity is in the eye of the beholder. But the price we pay here is that we have to pass all local state as method arguments. Simple in this case (single local variable args), but that can become quite complex, especially when we have local mutable collections that need to be passed around.

Besides, others might argue that one shouldn’t return from the middle of a method. They are proponents of “single-return-statement methods”.

Sure, why not. Here’s how that would look:

public class FunWithBreak {
    public static void main(String[] args) {
        if (args.length > 0) {
            ifBlock(args);
        }
        else {
            System.out.println("No args");
        }

        System.out.println("End of program");
    }

    private static void ifBlock(String[] args) {
        System.out.println("We got args!");

        boolean ignoreArgs = false;
        for (String arg : args) {
            if ("ignore-args".equals(arg)) {
                ignoreArgs = true;
                break; // We need break again!
            }
        }

        if (!ignoreArgs)
            for (String arg : args)
                System.out.println("Arg : " + arg);
    }
}

Is it really better?

I don’t know. I’ll just stick with the original break-out-of-if.

Bonus

Surprise your coworkers with an even more unusual placement of your label. Between the if statement and the ensuing block. After all, the curly braces are just an ordinary block, not part of the if statement

public class FunWithBreak {
    public static void main(String[] args) {
        if (args.length > 0) label: {
            System.out.println("We got args!");

            for (String arg : args)
                if ("ignore-args".equals(arg))
                    break label;

            for (String arg : args)
                System.out.println("Arg : " + arg);
        }
        else {
            System.out.println("No args");
        }

        System.out.println("End of program");
    }
}

Second bonus

Now that we’ve gotten all hooked on imperative control flow… What does this print?

for (int i = 0; i < 3; i++) {

    hmmm:
    try {
        System.out.println("Try      : " + i);
        if (i % 2 == 0)
            break hmmm;
        else
            throw new RuntimeException("Hmmm");
    }
    finally {
        System.out.println("Finally  : " + i);
        if (i < 2)
            continue;
    }

    System.out.println("Loop end : " + i);
}

Solution (select text below)

Try      : 0
Finally  : 0
Try      : 1
Finally  : 1
Try      : 2
Finally  : 2
Loop end : 2