10 Tips on How to be a Great Programmer

I was recently asked in an interview about my opinion on how to be a great programmer. That’s an interesting question, and I think we can all be great programmers, regardless of our talent, if we follow a couple of rules that – I believe – should be common sense. In fact, most of these rules don’t apply only to programmers, but to any professional.

Not everything in this list is meant to be taken entirely seriously. Some things are just my opinion and yours may vary, and not all descriptions of programmer personalities match real-world situations I may have encountered, so if in doubt, please do not take offense. I didn’t mean you 🙂

Here they are:

1. Learn how to ask questions

There are essentially these types of programmers asking questions:

  • The perfectionist: Especially when asking a question about some open source tool, they may have already debugged through the code and found the real cause of their problem. But even if not, the perfectionist will write an introduction to the problem, steps to reproduce, potentially a suggested workaround, and as I said, potentially a suggested fix. In fact, the perfectionist doesn’t have questions. Only answers.
  • The chatterbox: This person will not really ask questions. Rather, they arrange their thoughts publicly and occasionally put a rhetorical question mark here and there. What may appear to be a question is really just a stream of thoughts, and if you hold off on answering, they may either find the answer themselves, or ask the real question in email number 37. Or not. Or maybe, if we try it this way. You know what? It turned out that the requirement is completely wrong, I solved it with some other technique. Oh, actually, I changed libraries entirely. Hehe. No more questions.
  • The slacker: Here’s the code. What’s wrong? Halp plz.
  • The manager: For this type, time is money. The questions must be short and the answers must come ASAP. There’s something ironic about this approach, because by keeping the questions short (read: incomplete, not concise), a lot of important details are missing, which only leads to requests for details. Then the manager (naturally disappointed because the reply is not an answer but a new question) will again send a short message and demand an answer with even more urgency. This can go back and forth for quite a while. In the end, it may take 1-2 weeks until a question that could have been answered on day one actually gets its answer.
  • The complainer: This person doesn’t ask questions. They complain. Until the question goes away. Perhaps. If it doesn’t, all the better. More reasons to complain.

By now, it should be clear that a well-prepared question (concise, simple, short, yet with enough details) will yield much better answers. If you know exactly what you want to learn with your question, chances are you’ll get exactly what you wanted.

2. Learn how to avoid asking questions

In fact, mostly, it’s better to try to avoid asking questions. Perhaps you can figure it out yourself? Not always, of course. Many things, you simply cannot know and by asking a domain expert, you’ll find the quickest and most efficient path to success. But often, trying something yourself has these benefits:

  • You learn it the “hard way”, which is the way that sticks in our memory much better – we’ll remember what we’ve learned
  • It’s more rewarding to figure out stuff yourself
  • You don’t create “noise”. Remember the chatterbox? Unless the person you’re asking is routinely answering questions (and thus postponing their answers), they may not see through your stream of thoughts and will try to answer each individual incomplete “question”. That doesn’t help anyone.
  • By postponing a question (for a while, at least), you can gather more relevant information that you can provide to someone who might be able to answer it. Think of the “perfectionist”. Spend more time looking for details first, then ask the question.
  • You’ll get better at asking questions by training to ask good questions. And that takes a bit more time.

3. Don’t leave broken windows

There was a very interesting article on reddit recently about not leaving broken windows. The essence of the article is to never compromise on quality. To never become a slacker. To never leave … broken windows. Here’s a quote from the article:

When we take certain shortcuts to deliver something in the shortest period of time, and our code reflects how careless we’ve been, developers coming after us (from the same team, a future team, and even ourselves!), will reach an important conclusion: it’s not important to pay enough attention to the code we produce. Little by little our application will start deteriorating in what becomes an unstoppable process.

This isn’t about being a perfectionist, in fact. Sometimes, fixing broken windows can be postponed (and sometimes it must be, of course). But often, by allowing windows to break and stay broken, we’re introducing a situation where no one is happy. Not us programmers, not our customers, not our users, not our project managers. This is an attitude thing and thus at the core of being professional. As Benjamin Franklin said:

The bitterness of poor quality remains long after the sweetness of low price is forgotten

That’s true for everything. In this case, “low price” is the quick win we might get by implementing something in a sloppy way.

4. Software ought to be deterministic. Aim for it!

In an ideal world, everything in software is “deterministic”. We’d all be functional programmers, writing side-effect free, pure functions. Like String.contains(). No matter how many times you execute the following:

assertTrue("abcde".contains("bc"));

… the result is always the same, expected result. The universe and all its statefulness will have absolutely no impact on this computation. It’s deterministic.

We can do that in our own programs, too, not just in standard libraries. We can try to write side-effect free, deterministic modules as often as possible. This isn’t really a matter of what technology we choose. Deterministic programming can be done in any language – even if functional languages have more tools to prevent accidental side effects through more sophisticated type systems. But the example I’ve shown is a Java example. Object orientation permits determinism. Heck, even procedural languages like PL/SQL allow for determinism. In fact, a function must be deterministic if it is to be used in an index:

CREATE INDEX upper_first_name ON customer (upper (first_name));
-- Deterministic function here: -----------^^^^^^^^^^^^^^^^^^

So again, this is a matter of discipline. You could see a side-effectful procedure / method / “function” as a “broken window”. Perhaps it was easier to keep the side effect around, hoping that eventually it can be removed. But that’s usually a lie. The price will be paid later, when non-determinism suddenly strikes. And it will.
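
To illustrate, here’s a minimal, hypothetical Java sketch (the names are made up for this example): the same logic, written once with its dependencies made explicit, and once with a hidden dependency on global state:

import java.time.LocalDate;

public class Determinism {

    // Deterministic: the result depends only on the arguments
    static String greeting(String name, LocalDate today) {
        return "Hello " + name + ", it is " + today;
    }

    // Non-deterministic "broken window": hidden dependency on the system clock
    static String greeting(String name) {
        return greeting(name, LocalDate.now());
    }
}

The first version can be tested and reasoned about in isolation. The second will eventually surprise someone, e.g. in a test suite that happens to run over midnight.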

5. Expect the unexpected

Murphy’s Law is something we programmers should observe all the time. Everything can break. And it will. Come on, as software engineers, we know it will break. Because our world is not deterministic, and neither are the business requirements that we’re implementing. We can implement tip #4 (determinism) only until it is no longer possible. From then on, we’ll inevitably enter the world of non-determinism (the “real world”), where stuff will go wrong. So, aim for it. Expect the unexpected. Train your inner pessimist to foresee all kinds of things.

How to write that pessimistic code in a concise manner is another story, of course. And how to distinguish things that will fail (and need to be dealt with) from things that might fail (and don’t need to be dealt with), that takes some practice 🙂

6. Never cargo cult. Never follow dogma. Always embrace: “It depends”

Everything you’ve been taught is potentially wrong. Including (or probably: especially) when someone really popular says it. Here’s a nice quote:

Our profession is full of hypocrisy. We like to think of ourselves as mathematicians, where only the purest of ideas persist, and they’re necessarily correct.

That’s wrong. Our profession is built on top of mathematics, but unless you go into the funky world of category theory or relational algebra (and even then, I doubt everything is “correct”), you’re in the pragmatic world of real-world business requirements. And that’s, quite frankly, very far from perfect. Let’s look at some of the most popular programming languages out there:

  • C
  • Java
  • SQL

Do you really think that these languages resemble mathematics in the least bit? If so, let’s discuss segmentation faults, Java generics, or SQL three-valued logic. These languages are platforms built by pragmatists. There is some really cool theoretical background in all of them, but ultimately, they had to do the job.

The same is true for everything built on top of languages. Libraries. Frameworks. Heck, even architectures. Design patterns. Nothing is right or wrong. Everything is a tool that was designed for some context. Think of the tool in its context. Never think of the tool as a standalone raison d’être. We’re not doing art for art’s sake.

So, say no to unquestioned:

  • XML
  • JSON
  • Functional programming
  • Object oriented programming
  • Design patterns
  • Microservices
  • Three tier architectures
  • DDD
  • TDD
  • In fact: *DD

All of these are nice tools given a certain context. They don’t always hold true. By staying curious and thinking out of the box, you’ll be a better programmer and know when to use which one of these tools.

7. Do it

True, there are luminaries out there who outperform everyone.

But most programmers are simply “good”. Or they have the potential to be “good”. How to be a good programmer? By doing it. Great software wasn’t written in a day, and popular people aren’t the only heroes of our time. I’ve met many great programmers no one knows about, because they lead private lives and solve the private problems of small companies.

But great programmers all have one thing in common: They just do it. They practice. They work every day to get better and better.

Want to get better at SQL? Do it! Try to write a SQL statement with some new feature in it, every day. Use window functions. Grouping sets. Recursion. Partitioned outer join. The MODEL and/or MATCH_RECOGNIZE clauses. It doesn’t have to ship to production every time, but the practice will be worth it.
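
For instance, a throwaway practice query of that kind might look like this (the payments table is made up for the exercise), using a window function to compute a running total per customer:

-- Hypothetical practice schema: payments(customer_id, payment_date, amount)
SELECT customer_id,
       payment_date,
       amount,
       SUM(amount) OVER (
         PARTITION BY customer_id
         ORDER BY payment_date
       ) AS running_total
FROM payments
ORDER BY customer_id, payment_date;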

8. Focus on one subject (in the long run)

There might be only very few “good” full stack developers out there. Most full stack developers will be rather mediocre at everything. Sure, a small team might need only a few of them, and they can cover a lot of business logic to quickly bootstrap a piece of new software. But everything will be quite sloppy and will only “kinda work”. Perhaps that’s good enough for the minimum viable product phase, but in the long run, there will be more complex problems that a full stack developer will not have the time to properly analyse (or foresee!).

Focus on one subject mostly, and get really good at that. A specialist will always be in demand as long as that specialist’s niche exists, and many niches will outlive us all (hello COBOL or SQL). So, do your career a favour and do one thing really well, rather than many things “just ok”.

9. Play

While you should mostly focus on one subject, you shouldn’t forget the other subjects completely. You may never be really really really good at all of SQL, scaling up, scaling out, low level performance, CSS (who’s good at that anyway!?), object orientation, requirements engineering, architecture, etc. all at once (see tip #8). It’s just not possible.

But you should at least understand the essence of each of these. You should understand when SQL is the right choice (and when it isn’t). When low level performance tuning is important (and when it isn’t). How CSS works in principle. The advantage of object orientation, FP, etc.

You should spend some time playing with these (and many more) concepts and technologies to better understand why they’re important. To know when to apply them, and then perhaps to find an expert to actually execute the work.

By playing around with new paradigms and technologies, you’ll open up your mind to entirely different ways of thinking, and chances are, you’ll be able to use that in your every day work in one way or another.

10. Keep it simple, stupid

Albert Einstein said:

Everything should be made as simple as possible, but no simpler

No one is able to handle huge complexity. Not in software, not in any other aspect of life. Complexity is the killer of good software and thus simplicity is the enabler. Easy to understand. Hard to implement. Simplicity is something that takes a lot of time and practice to recognise and produce. There are many rules you can follow, of course.

One of the simplest rules is to have methods/functions with only a few parameters. Let’s look at that. Surely, the aforementioned String.contains() method qualifies. We can write "abcde".contains("bcd") and without reading any documentation, everyone will immediately understand what this does and why. The method does one thing and one thing only. There are no complicated context/settings/other arguments that can be passed to the method. There are no “special cases”, nor any caveats.

Again, producing simplicity in a library is much easier than doing that in business logic. Can we achieve it? Perhaps. By practicing. By refactoring. But like great software, simplicity is not built in a day.

(Protip: Conway’s Law applies. It is utterly impossible to write good, simple software in an environment where the business is super complex. Either you like complexity and ugly legacy, or you’d better get out of that business.)

The Open-Closed Principle is Often Not What You Think it Is

jOOQ is a library that loves making everything internal final and package private. We have tons of classes like these:

final class Concat extends AbstractFunction<String> {
    // ...
}

The class implements the semantics of SQL string concatenation. Clearly, you shouldn’t need to tamper with it (or even know about it), because it is “protected” behind the corresponding public API in the DSL class:

// You can see this:
public class DSL {

    // You can see this but not override it:
    public static Field<String> concat(Field<?>... fields) {

        // But you cannot do this, yourself:
        return new Concat(nullSafe(fields));
    }
}

Now, in the past decades, there have been a lot of software design movements that were contrary to the concept of encapsulation in some ways.

A fun-to-read example of “slightly” (i.e. completely) exaggerated advocacy of extreme application of object orientation is Yegor Bugayenko’s blog:

http://www.yegor256.com

Through exaggeration, he makes some really interesting points that make you think. Of course, you have to be able to accept the hyperboles as non-facts. Not everyone can do that, so don’t get angry reading 😉

Let’s look at the open-closed principle

The open-closed principle claims, according to Wikipedia:

In object-oriented programming, the open/closed principle states “software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification”; that is, such an entity can allow its behaviour to be extended without modifying its source code.

This is a very desirable aspect of some software entities. For instance, it is always true for an SPI (Service Provider Interface), by design, of course. Let’s read the Wikipedia definition of an SPI:

Service Provider Interface (SPI) is an API intended to be implemented or extended by a third party. It can be used to enable framework extension and replaceable components

Perfect. For instance, a jOOQ Converter is an SPI. We’ve just published a recent post about how to use the Converter API in a strategy pattern style with lambdas – the strategy pattern works really well with SPIs.

In fact, the strategy pattern isn’t even strictly an object oriented feature; you can get it for free in functional programming without giving it a fancy name. It’s just an ordinary higher order function.
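
As a minimal sketch (the names are made up for illustration), the “strategy” is nothing but a function argument:

import java.util.function.IntBinaryOperator;

public class Strategies {

    // The "strategy" is just a higher order function parameter
    static int combine(int a, int b, IntBinaryOperator strategy) {
        return strategy.applyAsInt(a, b);
    }

    public static void main(String[] args) {
        // The caller picks the strategy, no class hierarchy required
        System.out.println(combine(3, 4, (x, y) -> x + y)); // 7
        System.out.println(combine(3, 4, Math::max));       // 4
    }
}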

Another fine example of what could be considered an SPI is an Iterable. While Iterable subtypes like List are more often used as APIs (user is the consumer) rather than SPIs (user is the implementor), the Iterable API itself is more of a way of providing the functionality required to run code inside of a foreach loop. For instance, jOOQ’s ResultQuery implements Iterable, which allows it to be used in a foreach loop:

for (MyTableRecord rec : DSL
    .using(configuration)
    .selectFrom(MY_TABLE)
    .orderBy(MY_TABLE.COLUMN)) { // Automatic execution, fetching
 
    doThingsWithRecord(rec);
}

So, clearly, it can be said that:

  • Iterable follows the open-closed principle as it models an entity that is open for extension (I can produce my own iterable semantics – see the sketch after this list), but closed for modification (I won’t ever modify the Java compiler and/or the foreach loop semantics)
  • The Liskov substitution principle is also followed trivially, as the foreach loop doesn’t care at all about how I implement my Iterable, as long as it behaves like one (providing an Iterator)
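
Here’s what “producing my own iterable semantics” might look like, as a minimal sketch (the Countdown type is made up for illustration):

import java.util.Iterator;
import java.util.stream.IntStream;

class Countdown implements Iterable<Integer> {
    private final int from;

    Countdown(int from) {
        this.from = from;
    }

    // That's all the foreach loop needs from us
    @Override
    public Iterator<Integer> iterator() {
        return IntStream.iterate(from, i -> i - 1)
                        .limit(from)
                        .boxed()
                        .iterator();
    }
}

// Usage:
// for (int i : new Countdown(3))
//     System.out.println(i); // 3, 2, 1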

That was easy

But when does it not apply?

In a lot of situations. For instance, jOOQ is in many ways not designed for object oriented extension. You simply should not:

  • Mock the concat() function.
    You might be tempted to do so, as you might think that you need to unit test everything, including third party libraries, and then you have to mock out the string concatenation feature inside of your database. But it doesn’t work. The DSL.concat() method is static, and the implementation hidden. No way you could replace it with ordinary means (there are some dirty tricks).

    But hold on for a second. Why are you even doing this? Aren’t integration tests the better way here? Do you really have time (and want to spend it) on replacing entire complex implementations with your mocks? I don’t think so. That hardly ever works.

  • Modify the concatenation behaviour for some use-case.
    While you may think that sometimes you’d just like to tweak an implementation a little bit to get a quick win, that is certainly not the intent of the authors of the open-closed principle or the Liskov substitution principle. We as API designers don’t want you to extend all of our functionality. As simple as that. Why? Because we want you to get in touch with us to help us improve our software for everyone, rather than tweaking something for a quick win.

Let this sink in – especially the latter.

The premise that everything should be object oriented and everything should be extensible is wrong. Object orientation (and all the philosophies connected to it) is a tool. It’s a very powerful tool, for instance, when we as API/SPI designers want to allow users to extend our software (mostly through SPIs). And we spend a lot of time thinking about really good, generic, useful, powerful SPIs that solve 99% of all extensibility problems in a way that we can control and keep backwards compatible.

And sometimes, yes, we did not foresee a justified request for extensibility. Nothing is perfect. You have a feature request, and we cannot implement it right away. Then you start exploring. You look into ways you can inject some behaviour into jOOQ. And as we Java developers like object orientation, we look into writing subclasses to override existing behaviour. That’s what we were taught. That’s what we’re doing all the time. That’s what the combination of the open-closed principle and the Liskov substitution principle suggests.

Let me shock you for a moment.

Haskell (and many other languages) doesn’t support subtype polymorphism

Yes. There are entire ecosystems out there that don’t have the luxury of bikeshedding over whether a class that cannot be (easily) extended through subtype polymorphism and overriding of methods must be ill-designed. Entire ecosystems that never worry about something being final, and thus “closed for extension” (through subtype polymorphism).

Alternative definitions

Given the historic context, both principles are very interesting things. But their object-oriented context is something we should free our minds of. Here are better definitions:

  • open-closed principle:
    Systems should strive for openness for extension, but not at any price. Some parts of a system / module / perhaps class should be open for extension. Those parts should be very well designed and kept very backwards compatible. And the vendor of those parts should listen to its consumers to better identify the required extension points. Consumers, on the other hand, shouldn’t blindly assume that everything can be extended. If they’re extending random parts (through unexpected subtype polymorphism), then they’re hacking in the same way as if they were actually modifying the system / parts. There’s no more benefit to extending, then.
  • Liskov substitution principle:
    Subtype polymorphism is just a tool, and in 2017, we have long since started to understand that it’s the wrong tool for many things. The composition-over-inheritance movement has shown that we’ve come to regret the subtype polymorphism hype of the 90s. So, forget about your mocks through subtype overriding. Start looking for alternative interpretations of this principle. I like Jessica Kerr’s take:

    Therefore, the Liskov Substitution Principle says, “Don’t surprise people.”

    That’s a much better credo to follow than one that is strictly related to an aspect of object orientation, in particular to subtype polymorphism.

Conclusion

Yes. Package-private, final classes mean you cannot extend them. The open-closed principle is “violated”. Because that part of the system was not designed for you to know about (it’s encapsulated).

Sometimes, you think that if you could just override such an entity, you might get a quick win and inject your desired behaviour into a third party library / entity / class / module / system. My claim here is that, mostly, you’ll deeply regret your desire for a quick win later on. You shouldn’t argue about open-closed or Liskov substitution. These principles simply don’t apply here. In particular, they do not apply to badly designed legacy software. Once software is “badly designed”, no principles will help you.

Instead, do get in touch with the vendor if you run into a bump. There’s always an interesting idea for a great new feature hidden in such a limitation. And for the time being, accept that your overriding of what was not meant to be overridden is just the same thing as actually modifying that entity. You’re patching the library. Let’s do that and move on.

A Nice API Design Gem: Strategy Pattern With Lambdas

With Java 8 lambdas being available to us as a programming tool, there is a “new” and elegant way of constructing objects. I put “new” in quotes, because it’s not new. It used to be called the strategy pattern, but as I’ve written on this blog before, many GoF patterns will no longer be implemented in their classic OO way, now that we have lambdas.

A simple example from jOOQ

jOOQ knows a simple type called Converter. It’s a simple SPI, which allows users to implement custom data types and inject data type conversion into jOOQ’s type system. The interface looks like this:

public interface Converter<T, U> {
    U from(T databaseObject);
    T to(U userObject);
    Class<T> fromType();
    Class<U> toType();
}

Users will have to implement 4 methods:

  • Conversion from a database (JDBC) type T to the user type U
  • Conversion from the user type U to the database (JDBC) type T
  • Two methods providing a Class reference, to work around generic type erasure

Now, an implementation that converts hex strings (database) to integers (user type):

public class HexConverter implements Converter<String, Integer> {

    @Override
    public Integer from(String hexString) {
        return hexString == null 
            ? null 
            : Integer.parseInt(hexString, 16);
    }

    @Override
    public String to(Integer number) {
        return number == null 
            ? null 
            : Integer.toHexString(number);
    }

    @Override
    public Class<String> fromType() {
        return String.class;
    }

    @Override
    public Class<Integer> toType() {
        return Integer.class;
    }
}

That wasn’t difficult to write, but it’s quite boring to write this much boilerplate:

  • Why do we need to give this class a name?
  • Why do we need to override methods?
  • Why do we need to handle nulls ourselves?

Now, we could write some object oriented libraries, e.g. abstract base classes that take care of at least the fromType() and toType() methods, but much better: the API designer can provide a “constructor API”, which allows users to provide “strategies”, which is just a fancy name for “function”. One function (i.e. lambda) for each of the four methods. For example:

public interface Converter<T, U> {
    ...

    static <T, U> Converter<T, U> of(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return new Converter<T, U>() { ... boring code here ... };
    }

    static <T, U> Converter<T, U> ofNullable(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return of(
            fromType,
            toType,

            // Boring null handling code here
            t -> t == null ? null : from.apply(t),
            u -> u == null ? null : to.apply(u)
        );
    }
}

From now on, we can easily write converters in a functional way. For example, our HexConverter would become:

Converter<String, Integer> converter =
Converter.ofNullable(
    String.class,
    Integer.class,
    s -> Integer.parseInt(s, 16),
    Integer::toHexString
);

Wow! This is really nice, isn’t it? This is the pure essence of what it means to write a Converter. No more overriding, null handling, type juggling, just the bidirectional conversion logic.

Other examples

A more famous example is the JDK 8 Collector.of() constructor, without which it would be much more tedious to implement a collector. For example, if we want to find the second largest element in a stream… easy!

for (int i : Stream.of(1, 8, 3, 5, 6, 2, 4, 7)
                   .collect(Collector.of(
    () -> new int[] { Integer.MIN_VALUE, Integer.MIN_VALUE },
    (a, t) -> {
        if (a[0] < t) {
            a[1] = a[0];
            a[0] = t;
        }
        else if (a[1] < t)
            a[1] = t;
    },
    (a1, a2) -> {
        throw new UnsupportedOperationException(
            "Say no to parallel streams");
    }
)))
    System.out.println(i);

Run this, and you get:

8
7

Bonus exercise: Make the collector parallel capable by implementing the combiner correctly. In a sequential-only scenario, we don’t need it (until we do, of course…).
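
In case you want to try it: a combiner that merges two int[] accumulators might look like this (a sketch; it keeps the top two of the four candidate values, and would replace the throwing lambda above):

// Hypothetical combiner for the collector above
(a1, a2) -> {
    int[] top = { Integer.MIN_VALUE, Integer.MIN_VALUE };
    for (int v : new int[] { a1[0], a1[1], a2[0], a2[1] }) {
        if (v > top[0]) {
            top[1] = top[0];
            top[0] = v;
        }
        else if (v > top[1])
            top[1] = v;
    }
    return top;
}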

Conclusion

The concrete examples are nice examples of API usage, but the key message is this:

If you have an interface of the form:

interface MyInterface {
    void myMethod1();
    String myMethod2();
    void myMethod3(String value);
    String myMethod4(String value);
}

Then, just add a convenience constructor to the interface, accepting Java 8 functional interfaces like this:

// You write this boring stuff
interface MyInterface {
    static MyInterface of(
        Runnable function1,
        Supplier<String> function2,
        Consumer<String> function3,
        Function<String, String> function4
    ) {
        return new MyInterface() {
            @Override
            public void myMethod1() {
                function1.run();
            }

            @Override
            public String myMethod2() {
                return function2.get();
            }

            @Override
            public void myMethod3(String value) {
                function3.accept(value);
            }

            @Override
            public String myMethod4(String value) {
                return function4.apply(value);
            }
        };
    }
}

As an API designer, you write this boilerplate only once. And your users can then easily write things like these:

// Your users write this awesome stuff
MyInterface.of(
    () -> { ... },
    () -> "hello",
    v -> { ... },
    v -> "world"
);

Easy! And your users will love you forever for this.

Should I Implement the Arcane Iterator.remove() Method? Yes You (Probably) Should

An interesting question was asked on reddit’s /r/java recently:

Should Iterators be used to modify a custom Collection?

Paraphrasing the question: The author wondered whether a custom java.util.Iterator that is returned from a mutable Collection.iterator() method should implement the weird Iterator.remove() method.

A totally understandable question.

What does Iterator.remove() do?

Few people ever use this method at all. For instance, if you want to implement a generic way to remove null values from an arbitrary Collection, this would be the most generic approach:

Collection<Integer> collection =
Stream.of(1, 2, null, 3, 4, null, 5, 6)
      .collect(Collectors.toCollection(ArrayList::new));

System.out.println(collection);

Iterator<Integer> it = collection.iterator();
while (it.hasNext())
    if (it.next() == null)
        it.remove();

System.out.println(collection);

The above program will print:

[1, 2, null, 3, 4, null, 5, 6]
[1, 2, 3, 4, 5, 6]

Somehow, this API usage does feel dirty. An Iterator seems to be useful to … well … iterate over its backing collection. It’s really weird that it also allows for modifying it. It’s even weirder that it only offers removal. E.g., we cannot add a new element before or after the current element of iteration, or replace it.

Luckily, Java 8 provides us with a much better method on the Collection API directly, namely Collection.removeIf(Predicate).

The above iteration code can be re-written as such:

collection.removeIf(Objects::isNull);

OK, now should I implement remove() on my own iterators?

Yes, you should – if your custom collection is mutable. For a very simple reason. Check out the default implementation of Collection.removeIf():

default boolean removeIf(Predicate<? super E> filter) {
    Objects.requireNonNull(filter);
    boolean removed = false;
    final Iterator<E> each = iterator();
    while (each.hasNext()) {
        if (filter.test(each.next())) {
            each.remove();
            removed = true;
        }
    }
    return removed;
}

As I said. The most generic way to remove specific elements from a Collection is precisely to go by its Iterator.remove() method and that’s precisely what the JDK does. Subtypes like ArrayList may of course override this implementation because there’s a more performant alternative, but in general, if you write your own custom, modifiable collection, you should implement this method.
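
For illustration, here’s a minimal sketch of such a custom, modifiable collection (the class is made up; it simply forwards to a backing ArrayList):

import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class MyCollection<E> extends AbstractCollection<E> {
    private final List<E> backing = new ArrayList<>();

    @Override
    public boolean add(E e) {
        return backing.add(e);
    }

    @Override
    public int size() {
        return backing.size();
    }

    @Override
    public Iterator<E> iterator() {
        Iterator<E> delegate = backing.iterator();
        return new Iterator<E>() {
            @Override
            public boolean hasNext() {
                return delegate.hasNext();
            }

            @Override
            public E next() {
                return delegate.next();
            }

            // Without this override, Iterator.remove() throws
            // UnsupportedOperationException, and the inherited
            // Collection.removeIf() breaks with it
            @Override
            public void remove() {
                delegate.remove();
            }
        };
    }
}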

And enjoy the ride into Java’s peculiar, historic caveats for which we all love the language.

How to Write a Quick and Dirty Converter in jOOQ

One of jOOQ‘s most powerful features is the capability of introducing custom data types, pretending the database actually understands them. For instance, when working with SQL TIMESTAMP types, users mostly want to use the new JSR-310 LocalDateTime, rather than the JDBC java.sql.Timestamp type.

In jOOQ 3.9+, this is a no-brainer, as we’ve finally introduced the <javaTimeTypes> flag to automatically generate JSR 310 types instead of JDBC types. But sometimes, you want some custom conversion behaviour, so you write a Converter.

To the rescue come our new jOOQ 3.9+ converter constructors, which essentially take two lambdas to construct a converter for you. For instance:

Converter<Timestamp, LocalDateTime> converter =
Converter.of(
    Timestamp.class,
    LocalDateTime.class,
    t -> t == null ? null : t.toLocalDateTime(),
    u -> u == null ? null : Timestamp.valueOf(u)
);

And you’re set! Even easier, if you don’t need any special null encoding (as above), just write this equivalent converter, instead:

Converter<Timestamp, LocalDateTime> converter =
Converter.ofNullable(
    Timestamp.class,
    LocalDateTime.class,
    Timestamp::toLocalDateTime,
    Timestamp::valueOf
);

Where’s that useful? The code generator needs a concrete converter class, so you cannot use this approach with the code generator, but there are many other places in the jOOQ API where converters are useful, including when you write plain SQL like this:

DSL.field(
    "my_table.my_timestamp", 
    SQLDataType.TIMESTAMP.asConvertedDataType(
        Converter.ofNullable(...)
));
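
Filled in with the hypothetical hex string converter from earlier, that might look like this (the column name is made up):

Field<Integer> field = DSL.field(
    "my_table.my_hex_number",
    SQLDataType.VARCHAR.asConvertedDataType(
        Converter.ofNullable(
            String.class,
            Integer.class,
            s -> Integer.parseInt(s, 16),
            Integer::toHexString
        )
    ));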

How to Prevent JDBC Resource Leaks with JDBC and with jOOQ

In a recent consulting gig, I was analysing a client’s connection pool issue in a production system, where during some peak loads, all the Java processes involving database interactions just started queueing up until nothing really worked anymore. No exceptions, though, and when the peak load was gone in the evening, everything returned back to normal. The database load looked pretty healthy at the time, so no actual database problem was involved – the problem had to be on the client side.

Weblogic operations teams quickly identified the connection pool to be the bottleneck. All the connections were constantly allocated to some client process. The immediate thought was: a resource leak is happening, and it didn’t show before because this was an exceptional situation: around the beginning of the new year, everyone wanted to download their electronic documents from the bank (and some new features introduced many more document related database calls).

The obvious problem

That particular system still runs a lot of legacy code in Java 6 style, which means there are tons of code elements of the following kind:

Connection connection = null;
try {

  // Get the connection from the pool through JNDI
  connection = JDBCHelper.getConnection();
}
finally {

  // Release the connection
  JDBCHelper.close(connection);  
}

While the above code is perfectly fine, and 99% of all database interactions were of the above type, there was an occasional instance of someone badly copy-pasting some code and doing something like this:

Connection connection = JDBCHelper.getConnection();
PreparedStatement stmt = null;

try {
  stmt = connection.prepareStatement("SELECT ...");
}
finally {

  // Release the statement
  JDBCHelper.close(stmt);
}

// But the connection is never released

Sometimes, things were even more subtle, as a utility method expected a connection like this:

// Utility method doesn't have to close the connection:
public void databaseCalls(Connection connection) {
  PreparedStatement stmt = null;

  try {
    stmt = connection.prepareStatement("SELECT ...");
  }
  finally {

    // Release the statement
    JDBCHelper.close(stmt);
  }
}

public void businessLogic() {
  // Oops, subtle connection leak
  databaseCalls(JDBCHelper.getConnection());
}

Thoroughly fixing these things

There’s a quick fix to all these problems. The easiest fix is to just continue rigorously using the JDBCHelper.close() method (or just call connection.close() with appropriate error handling) every time. But apparently, that’s not easy enough, as there will always be a non-vigilant developer (or a junior developer who doesn’t know these things) who will get it wrong, who will simply forget things.

I mean, even the official JDBC tutorial gets it “wrong” on their first page:
https://docs.oracle.com/javase/tutorial/jdbc/overview/index.html

The bad example being:

public void connectToAndQueryDatabase(
    String username, String password) {

    Connection con = DriverManager.getConnection(
                         "jdbc:myDriver:myDatabase",
                         username,
                         password);

    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery(
        "SELECT a, b, c FROM Table1");

    while (rs.next()) {
        int x = rs.getInt("a");
        String s = rs.getString("b");
        float f = rs.getFloat("c");
    }
}

All resources leak in this example!

Of course, it’s just an example, and of course, it’s not a terrible situation, because resources can usually clean up themselves when they go out of scope, i.e. when the GC kicks in. But as software engineers we shouldn’t rely on that, and as the productive issues have shown, there are always edge cases, where precisely this lack of vigilance will cause great harm. After all,

It works on my machine

… is simply not an excuse. We should design our software for productive use.

Fix #1: Use try-with-resources. Always

If you want to stay on the safe side, always follow this rule:

The scope that acquires the resource, closes the resource

As long as you’re working with JDBC, save yourself the trouble of writing those JDBCUtilities classes that close non-null resources and safely catch exceptions that may arise. Just use try-with-resources, all the time. For instance, take the example from the Oracle JDBC tutorial, which should read:

public void connectToAndQueryDatabase(
     String username, String password) {

    // All of these resources are allocated in this method. Thus,
    // this method's responsibility is to also close / free all
    // these resources.
    try (Connection con = DriverManager.getConnection(
            "jdbc:myDriver:myDatabase", username, password);
         Statement stmt = con.createStatement();
         ResultSet rs = stmt.executeQuery(
            "SELECT a, b, c FROM Table1")) {

        while (rs.next()) {
            int x = rs.getInt("a");
            String s = rs.getString("b");
            float f = rs.getFloat("c");
        }
    }
}

This already feels that much better and cleaner, doesn’t it? All the resources are acquired in the above method, and the try-with-resources block will close all of them when they go out of scope. It’s just syntax sugar for something we’ve been doing manually all the time. But now, we will (hopefully) never again forget!

Of course, you could introduce automatic leak detection in your integration tests, because it’s rather easy to proxy the JDBC DataSource and count all connection acquisitions and closings. An example can be seen in this post:
The best way to detect database connection leaks
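
A minimal sketch of that idea (not production code; error handling omitted, names made up) could use a JDK dynamic proxy around the DataSource and count open connections:

import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

public class LeakCounter {
    public static final AtomicInteger OPEN = new AtomicInteger();

    public static DataSource wrap(DataSource delegate) {
        return (DataSource) Proxy.newProxyInstance(
            DataSource.class.getClassLoader(),
            new Class<?>[] { DataSource.class },
            (proxy, method, args) -> {
                Object result = method.invoke(delegate, args);

                // Count acquisitions, and wrap the connection to
                // count the matching close() calls
                if ("getConnection".equals(method.getName())) {
                    OPEN.incrementAndGet();
                    Connection con = (Connection) result;
                    return Proxy.newProxyInstance(
                        Connection.class.getClassLoader(),
                        new Class<?>[] { Connection.class },
                        (p, m, a) -> {
                            if ("close".equals(m.getName()))
                                OPEN.decrementAndGet();
                            return m.invoke(con, a);
                        });
                }
                return result;
            });
    }
}

In an integration test, you could then assert that LeakCounter.OPEN is back to zero after each test method.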

Fix #2: Use jOOQ, which manages resources for you

Historically, JDBC works on lazy resources that are kept around for a while. The assumption in 1997 (when JDBC was introduced) was that database interactions were relatively slow and it made sense to fetch and process one record at a time, even for moderately sized result sets.

In fact, it was even common to abort fetching records from a cursor when we’ve had enough results and close it eagerly before consuming all the rows.

Today, these assumptions are (mostly) no longer true, and jOOQ (like other, more modern database APIs) inverts the lazy/eager API default behaviour. In jOOQ, the JDBC types have the following corresponding counterparts:

  • JDBC DataSource / Connection => jOOQ ConnectionProvider:
    jOOQ doesn’t know the concept of an “open connection” like JDBC. jOOQ only has this ConnectionProvider which works in a similar way to JDBC’s / JavaEE’s DataSource. The semantics here is that the connection / session is “managed” and jOOQ will acquire / release it once per statement. This happens automatically, so users don’t have to worry about any connection resource.
  • JDBC Statement (and subtypes) => jOOQ Query:
    While the JDBC statement (especially the PreparedStatement) is a resource that binds some server-side objects, such as an execution plan, for instance, jOOQ again doesn’t have such a resourceful thing. The Query just wraps the SQL string (or AST) and bind variables. All resources are created lazily only when the query is actually executed – and released immediately after execution. Again, users don’t have to worry about any statement resource.
  • JDBC ResultSet => jOOQ Result:
    The JDBC ResultSet corresponds to a server-side cursor, another object that possibly binds quite a few resources, depending on your fetch mode. Again, in jOOQ, no resources are bound / exposed, because jOOQ by default eagerly fetches your entire result set – the assumption being that a low-level optimisation here doesn’t add much value for moderately sized result sets.

With the above inverted defaults (from lazy to eager resource allocation / freeing), the jOOQ-ified Oracle JDBC tutorial code would look like this:

Working with a standalone Connection

public void connectToAndQueryDatabase(
    String username, String password) {

    // If you're using a standalone connection, you can pass that
    // one to jOOQ, but you're still responsible of closing it
    // again:
    try (Connection con = DriverManager.getConnection(
            "jdbc:myDriver:myDatabase", username, password)) {

        // There is no statement resource anymore, and the result
        // is fetched eagerly from the database, so you don't have
        // to worry about it
        for (Record record : DSL.using(con).fetch(
                "SELECT a, b, c FROM Table1")) {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        }
    }
}

Working with a connection pool / DataSource

// You probably have some means of injecting / discovering
// a JDBC DataSource, e.g. from Spring, or from your JavaEE
// container, etc.
@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // With a DataSource, jOOQ will automatically acquire and
    // close the JDBC Connection for you, so the last remaining
    // resource has also disappeared from your client code.
    for (Record record : DSL
           .using(ds, SQLDialect.ORACLE)
           .fetch("SELECT a, b, c FROM Table1")) {
        int x = record.get("a", int.class);
        String s = record.get("b", String.class);
        float f = record.get("c", float.class);
    }
}

With jOOQ, all resource management is automatic, by default, because by default, you don’t want to worry about this low level stuff. It’s not 1997 anymore. The JDBC API really is too low level for most use-cases.

If you do want to optimise resource management and not fetch everything eagerly, you can, of course. jOOQ will allow you to fetch your results lazily, in two ways:

Using a Cursor

@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // jOOQ's Cursor type is a resource, just like JDBC's
    // ResultSet. It actually keeps a reference to an open
    // ResultSet, internally. This is an opt-in
    // feature, though, only to be used if desired.
    try (Cursor<Record> cursor = DSL
            .using(ds, SQLDialect.ORACLE)
            .fetchLazy("SELECT a, b, c FROM Table1")) {

        for (Record record : cursor) {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        }
    }
}

Using a Java 8 Stream (lazy, resourceful version)

@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // This can also work with a stream
    try (Stream<Record> stream = DSL
        .using(ds, SQLDialect.ORACLE)
        .fetchStream("SELECT a, b, c FROM Table1")) {

        stream.forEach(record -> {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        });
    }
}

Unfortunately, there are no auto-closing streams in Java, which is why we have to resort to using the try-with-resources statement, breaking the fluency of jOOQ’s API.

Do note, though, that you can use the Stream API in an eager fashion:

Using a Java 8 Stream (eager version)

@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // Fetch the jOOQ Result eagerly into memory, then stream it
    // Again, no resource management
    DSL.using(ds, SQLDialect.ORACLE)
       .fetch("SELECT a, b, c FROM Table1")
       .stream()
       .forEach(record -> {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        });
}

Conclusion

Developers, unfortunately, often suffer from

Works on my machine

This leads to problems that can be discovered only in production, under load. When it comes to resources, it is important to constantly remind ourselves that …

The scope that acquires the resource, closes the resource

JDBC (and the JDK’s IO APIs) “unfortunately” deal with resources on a very low level. That way, their default behaviour is very resource-efficient. For instance, when you only need to read a file header, you don’t load the entire file into memory through the InputStream. You can explicitly, manually, load only the first few lines.

But in many applications, this default and its low level nature gets in the way of correctness (accidental resource leaks are easy to create) and convenience (a lot of boilerplate code needs to be written).

With database interactions, it’s usually best to migrate your JDBC code towards a more modern API like jOOQ, which abstracts resource handling away in its API and inverts the lazy/eager semantics: Eager by default, lazy on demand.

More information about the differences between jOOQ and JDBC can be seen here, in the manual.

Do You Really Have to Name Everything in Software?

This is one of software engineering’s oldest battles. No, I’m not talking about where to put curly braces, or whether to use tabs or spaces. I mean the eternal battle between nominal typing and structural typing.

This article is inspired by a very vocal blogger who eloquently reminds us to …

[…] Please Avoid Functional Vomit

Read the full article here:
https://dzone.com/articles/using-java-8-please-avoid-functional-vomit

What’s the post really about?

It is about naming things. As we all know:

There are only two hard things in Computer Science: cache invalidation and naming things.

— Phil Karlton

Now, for some reason, there is a group of people who wants constant pain and suffering by explicitly naming everything, including rather abstract concepts and algorithmic components, such as compound predicates. Those people like nominal typing and all the features that are derived from it. What is nominal typing (as opposed to structural typing)?

Structural typing

SQL is a good example to study the two worlds. When you write SQL statements, you’re creating structural row types all the time. For instance, when you write:

SELECT first_name, last_name
FROM customer

… what you’re really doing is you’re creating a new rowtype of the structure (in pseudo-SQL):

TYPE (
  first_name VARCHAR,
  last_name VARCHAR
)

The type has the following properties:

  • It is a tuple or record (as always in SQL)
  • It contains two attributes or columns
  • Those two attributes / columns are called first_name and last_name
  • Their type is VARCHAR

This is a structural type, because the SQL statement that produces the type only declares the type’s structure implicitly, by producing a set of column expressions.

In Java, we know lambda expressions, which are (incomplete) structural types, such as:

// A type that can check for i to be even
i -> i % 2 == 0

Nominal typing

Nominal typing takes things one step further. In SQL, nominal typing is perfectly possible as well; for instance, in the above statement, we selected from a well-known table named customer. Nominal typing assigns a name to a structural type (and possibly stores the type somewhere, for reuse).

If we want to name our (first_name, last_name) type, we could do things like:

-- By using a derived table:
SELECT *
FROM (
  SELECT first_name, last_name
  FROM customer
) AS people

-- By using a common table expression:
WITH people AS (
  SELECT first_name, last_name
  FROM customer
)
SELECT *
FROM people

-- By using a view
CREATE VIEW people AS
SELECT first_name, last_name
FROM customer

In all cases, we’ve assigned the name people to the structural type (first_name, last_name). The only difference being the scope for which the name (and the corresponding content) is defined.

In Java, we can only use lambda expressions once we assign them to a typed name, either by using an assignment, or by passing the expression to a method that takes a named type argument:

// Naming the lambda expression itself
Predicate<Integer> p = i -> i % 2 == 0;

// Passing the lambda expression to a method
Stream.of(1, 2, 3)
      .filter(i -> i % 2 == 0);

Back to the article

The article claims that giving a name to things is always better. For instance, the author proposes giving a name to what we would commonly refer to as a “predicate”:

// original, less clear code
if (barrier.value() > LIMIT && barrier.value() > 0) {

// extracted out to helper function. More code, more clear
if (barrierHasPositiveLimitBreach()) {

So, the author thinks that extracting a rather trivial predicate into an external function is better because a future reader of such code will better understand what’s going on. At least in the article’s opinion. Let’s refute this claim for the sake of the argument:

  • The proposed name is verbose and requires quite some thinking.
  • What does breach mean?
  • Is breach the same as >= or the same as >?
  • Is LIMIT a constant? From where?
  • Where is barrier? Who owns it?
  • What does the verb “has” mean, here? Does it depend on something outside of barrier? E.g. some shared state?
  • What happens if there’s a negative limit?

By naming the predicate (remember, naming things is hard), the OP has added several layers of cognitive complexity for the reader, while quite possibly introducing subtle bugs, because probably both LIMIT and barrier should be function arguments, rather than global (im)mutable state that the function assumes to be there.

The name introduced several concepts (“to have a breach”, “positive limit”, “breach”) that are not well defined and need some deciphering. How do we decipher it? Probably by looking inside the function and reading the actual code. So what do we gain? Better reuse, perhaps? But is this really reusable?

Finally, there is a (very slight) risk of introducing a performance penalty by the additional indirection. If we translate this to SQL, we could have written a stored function and then queried:

SELECT *
FROM orders -- Just an assumption here
WHERE barrier_has_positive_limit_breach(orders.barrier)

If this was some really complicated business logic depending on a huge number of things, perhaps extracting the function might’ve been worthwhile. But in this particular case, is it really better than:

SELECT *
FROM orders
WHERE barrier > :limit AND barrier > 0

or even

SELECT *
FROM orders
WHERE barrier > GREATEST(:limit, 0)

Conclusion

There are some people in our industry who constantly want to see the world in black and white. As soon as they’ve had one small success story (e.g. reusing a very common predicate 4-5 times by extracting it into a function), they derive a general rule that this approach is always superior.

They struggle with the notion of “it depends”. Nominal typing and structural typing are both very interesting concepts. Structural typing is extremely powerful, whereas nominal typing helps us humans keep track of complexity. In SQL, we’ve always liked to structure our huge SQL statements, e.g. in nameable views. Likewise, Java programmers structure their code in nameable classes and methods.

But it should be immediately clear to anyone reading the linked article that the author seems to like hyperboles and probably wasn’t really serious, given the silly example he came up with. The message he’s conveying is wrong, because it claims that naming things is always better. It’s not true.

Be pragmatic. Name things where it really helps. Don’t name things where it doesn’t. Or as Leon Bambrick amended Phil Karlton’s quote:

There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors

Here’s my advice to you, dear nominal-typing-loving blogger. There are only two ways of typing: nominal typing and structural typing. And “it depends” typing.