Don’t Overdo the “Principle of Least Astonishment” Cargo Cult

As we all agree, GOTO is evil, right? See the relevant XKCD or, even funnier, “New Intern Knows Best”.

Of course, GOTO isn’t evil

Of course, somewhere deep down in our professional selves, we know that GOTO isn’t evil. It’s just a very basic processor instruction that has been available since the early days of assembly code. GOTO is everywhere. Take the following Java code, for instance:

public class Goto {
    public static void main(String[] args) {
        if (args.length == 0)
            System.out.println("No args");
        else
            System.out.println("Args");
    }
}

The bytecode that is generated from the above logic is this:
  // Method descriptor #15 ([Ljava/lang/String;)V
  // Stack: 2, Locals: 1
  public static void main(java.lang.String[] args);
     0  aload_0 [args]
     1  arraylength
     2  ifne 16
     5  getstatic java.lang.System.out : java.io.PrintStream [16]
     8  ldc  [22]
    10  invokevirtual java.io.PrintStream.println(java.lang.String) : void [24]
    13  goto 24
    16  getstatic java.lang.System.out : java.io.PrintStream [16]
    19  ldc  [30]
    21  invokevirtual java.io.PrintStream.println(java.lang.String) : void [24]
    24  return
There are two jumps. Bytecode (like assembly code) knows nothing about local scope nesting. There’s just a stream of instructions, and we jump around within it. If goto were possible in Java (the keyword is reserved, so who knows about a future Java…?), we could write this:

public class Goto {
    public static void main(String[] args) {
        if (args.length != 0)
            goto args;

        System.out.println("No args");
        goto end;

        args:
        System.out.println("Args");

        end:
        return;
    }
}

Of course, the if-else construct is more readable and less error prone. One of the biggest problems we’ve had with true GOTO instructions in the past is that they tend to lead to spaghetti code. In particular, notice how we’re not jumping from the “args statement” to the “end statement”. We’re just “falling through” like in a “bad” switch statement:

switch (i) {
    case 1:
    case 2:
        System.out.println("1 or 2");
        // no break
    case 3:
        System.out.println("3 (or 1 or 2!)");
}

If you will, the switch statement is very similar to a set of GOTO instructions.
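To make the fall-through behaviour observable, here is a minimal, runnable sketch of the switch above. The class name `FallThroughDemo` and the method `run` are made up for illustration; instead of printing directly, the method collects the messages so the effect of the missing break is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
final class FallThroughDemo {

    // Collects the messages that the switch from the text would print for i.
    static List<String> run(int i) {
        List<String> out = new ArrayList<>();
        switch (i) {
            case 1:
            case 2:
                out.add("1 or 2");
                // no break: execution falls through into case 3
            case 3:
                out.add("3 (or 1 or 2!)");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(1)); // [1 or 2, 3 (or 1 or 2!)]
        System.out.println(run(3)); // [3 (or 1 or 2!)]
        System.out.println(run(4)); // []
    }
}
```

Each case label behaves like a jump target: once control has arrived at a label, execution simply continues downwards until a break or the end of the switch.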

The argument today is about readability, not executability

The case against “classic” GOTO is a case in favour of correctness. These days, however, we no longer have to squeeze our code into 16KB of RAM; those problems are problems of the past. So we’ve started “engineering” our code (whatever that means ;) ) according to “best practices”, most of which are more about humans than machines. We don’t really need “GOTO” anymore, since we have higher-level control flow abstractions. Java has “GOTO” in the form of labelled statements and break and continue statements. For instance:

Jumping forward

label: {
  // do stuff
  if (check) break label;
  // do more stuff
}

In bytecode:
2  iload_1 [check]
3  ifeq 6          // Jumping forward
6  ..
Jumping backward

label: do {
  // do stuff
  if (check) continue label;
  // do more stuff
  break label;
} while(true);

In bytecode:
 2  iload_1 [check]
 3  ifeq 9
 6  goto 2          // Jumping backward
 9  ..
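The two snippets above are fragments. As a rough sketch, they could be made runnable as follows; the class `LabelJumps`, the trace string and the `retries` counter are assumptions added purely to make the jumps visible, they are not from the original examples.

```java
// Hypothetical demo class, not part of the original article.
final class LabelJumps {

    // Jumping forward: "break label" leaves the labelled block early.
    static String forward(boolean check) {
        StringBuilder trace = new StringBuilder();
        label: {
            trace.append("stuff");
            if (check) break label;   // skips the rest of the block
            trace.append(",more");
        }
        return trace.toString();
    }

    // Jumping backward: "continue label" skips the rest of the body,
    // re-evaluates while(true) and starts the loop body again.
    static int backward(int retries) {
        int attempts = 0;
        label: do {
            attempts++;
            if (attempts <= retries) continue label;
            break label;              // leaves the loop once no retry is needed
        } while (true);
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println(forward(true));   // stuff
        System.out.println(forward(false));  // stuff,more
        System.out.println(backward(3));     // 4
    }
}
```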
Just recently, I blogged about labelled statements, making a case against extracting everything into a method. Why? Because a labelled statement is a (slightly unusual) way of creating a local “method”. Squint really hard for a moment and let your imagination run wild with this:
Labelled statement              | Method
--------------------------------|-------------------------------------
Label name                      | Method name
Local variables prior to label  | Method arguments
Break statement                 | Return statement
Continue statement              | Recursion (remember: squint :) )
So, if we admit the above, then labelled statements in Java are almost like one-shot local methods (oh how I wish Java had actual local methods). There’s no risk of doing things wrong like with GOTO. There’s no risk of any accidents in jumping to the wrong location, because we still have locally nested scope and everything is well defined. So, there’s no correctness argument against using labelled statements.
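The analogy can be sketched side by side. In the following illustration, the class `LocalMethodSketch` and the "describe" logic are invented for the purpose of the comparison; the first variant uses a labelled block as a one-shot local "method", the second extracts a real method where return plays the role of break.

```java
// Hypothetical demo class, not part of the original article.
final class LocalMethodSketch {

    // Variant 1: a labelled block. The label is the "method name",
    // the local variable input is the "argument", break is the "return".
    static String describeWithLabel(String input) {
        String result;
        describe: {
            if (input == null) { result = "null"; break describe; }
            if (input.isEmpty()) { result = "empty"; break describe; }
            result = "text of length " + input.length();
        }
        return "Input is " + result;
    }

    // Variant 2: the same logic extracted into a separate method.
    static String describeWithMethod(String input) {
        return "Input is " + describe(input);
    }

    private static String describe(String input) {
        if (input == null) return "null";
        if (input.isEmpty()) return "empty";
        return "text of length " + input.length();
    }

    public static void main(String[] args) {
        System.out.println(describeWithLabel(""));     // Input is empty
        System.out.println(describeWithMethod("abc")); // Input is text of length 3
    }
}
```

Both variants are equally well defined: the compiler checks that result is assigned on every path out of the labelled block, just as it checks that every path through describe returns a value.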

Principle of least astonishment

Why aren’t they used more often? Probably because we rarely write imperative code complicated enough for labelled statements to become really useful. I do from time to time, especially inside jOOQ’s parser logic. Labelled continue is especially useful to break out of an inner loop, continuing the outer loop. Labelled breaks (as mentioned in that previous article) are less commonly useful. Usually, it is easier to refactor an if-else branch into a more readable version (e.g. by inverting if and else) than to break out of it. But sometimes, breaking is the more readable version, e.g. because it leads to a more consistent indentation where each branch is at the same level of indentation.

Remember the analogy with methods? Break is the same as return. Every time you want to return early from a method, you could break early out of a statement. Why not? Does it have to be a new method every time? Certainly not. Sometimes, jumping to a method is more distracting than jumping locally (again, as long as Java doesn’t have local methods).

The important thing here is that simplicity and readability, just like beauty, are in the eye of the beholder. We’re in an area where we cannot clearly say that something is correct or wrong, better or worse. Because again, unlike GOTO, breaking out of labelled statements yields no correctness risk.

So, people start arguing in favour of the “principle of least astonishment” (especially in the comments of the DZone version of my post). Sure. Labelled breaks are a bit harder to read, because we hardly ever see them in our code. That doesn’t mean they’re bad. We hardly ever use synchronized these days (with JDK library support having become much better), but that doesn’t mean synchronized is bad and shouldn’t be used. It’s rare, and thus we might get astonished. We don’t usually use exceptions as control flow signals, but why shouldn’t we?
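As an illustration of the labelled continue case mentioned above (leaving an inner loop while continuing the outer one), here is a small sketch. It is not taken from jOOQ’s parser; the class `ContinueOuter` and the "rows without negatives" task are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
final class ContinueOuter {

    // Returns the indexes of all rows that contain no negative number.
    static List<Integer> rowsWithoutNegatives(int[][] rows) {
        List<Integer> result = new ArrayList<>();
        outer:
        for (int r = 0; r < rows.length; r++) {
            for (int value : rows[r]) {
                if (value < 0)
                    continue outer; // abandon the inner loop, move on to the next row
            }
            result.add(r);          // only reached if the inner loop ran to completion
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] rows = { {1, 2, 3}, {4, -5, 6}, {}, {-1} };
        System.out.println(rowsWithoutNegatives(rows)); // [0, 2]
    }
}
```

Without the label, the usual alternatives are a boolean flag that is checked after the inner loop, or an extracted helper method. Whether either of those reads better is exactly the kind of judgement call discussed here.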
It is important to remind ourselves that the principle of least astonishment is a general guideline that allows for exceptions. After all, astonishment is what makes us stop and think a bit more thoroughly about code. We might even learn something new, like the fact that we can break out of labelled statements. I like Ryan James Spencer’s way of putting it:
Learning vim… It has become a joke that programmers don’t know how to exit vim:
(of course, it’s simply “:q!”, how hard can it be ;) ) The point is, yes this is surprising, astonishing, unexpected. If you’re not using vim frequently. If you’re using it frequently, it’s obvious, just like any other funky IDE keyboard shortcut. Or like using the middle mouse button to close tabs in almost any program (that was a revelation).

In order to be productive, we shouldn’t be astonished constantly. We should mostly be in our technical / infrastructure comfort zones, because most of us write business logic, and we want to spend our precious time and brain cells on that, not on the infrastructure logic. Everyone likes SQL. No one likes working around funky JDBC edge cases.

Another interesting reply by Chris Martin:
Absolutely. That was the point of my previous article. Breaking out of labelled statements is cool every now and then. It might be astonishing if you don’t write complex algorithms (like parsers) every day. It is not at all astonishing if you do. So, please. Be open. Stop cargo culting. Use labelled statements and break out of them. Every once in a while, when you think that makes your code clearer.

6 thoughts on “Don’t Overdo the “Principle of Least Astonishment” Cargo Cult”

  1. Isn’t this post shooting a bit past its mark? I always thought the “Principle of Least Astonishment” was something meant for the design level. For example, if your machine has three buttons, and the first is “make toast”, the second “make coffee”, then the third is NOT supposed to be “irrevocably call wreckers to destroy my house”. That would be against The Principle in question. This applies to function/API design too: “removeDirectory(File f)” would violate The Principle by behaving like this for example: f indicates a file: Does nothing. f indicates a directory: recursively deletes the directory. f is null: empties the whole disk. It’s about continuity and “staying within a general neighborhood of behavior”.

    There are other principles and aphorisms to tell the developer to avoid spaghetti code, avoid ambiguous and unreadable code, avoid code that Certainly Cannot be Used in High-Assurance Scenarios or otherwise avoid falling back to pre-Wirth-ian times.

    1. Sure, that’s another way of putting it. Yet, people try to apply this principle when they encounter a language feature that they don’t know, deducing from this particular ignorance that everyone would be astonished by said feature.

      1. Every time I use a label, I feel a bit dirty. At the same time, I always prefer `continue` and `break` to an additional nesting level. Should I see a psychologist?

        And sure, not knowing the language is a good way to be astonished again and again. Thank you for clearing my ignorance concerning `break`ing out of a non-loop block. I just wonder what a `continue` would do. ;)

        I hope to see not only GOTO but also COMEFROM in Java soon. ;)

        1. I feel dirty when I use synchronized. But then I remember, feeling dirty (in Software) often correlates with doing something rarely, and thus not profiting from confirmation bias that it is the right thing™.

          Continue can only be used in loops.

          I wasn’t aware of COMEFROM. Sounds like the real-world FORTRAN ENTRY statement. Both sound quite useful ;)
