jOOQ 3.10 Supports JPA AttributeConverter

One of the cooler hidden features in jOOQ is the JPADatabase, which allows for reverse engineering a pre-existing set of JPA-annotated entities to generate jOOQ code.

For instance, you could write these entities here:

@Entity
public class Actor {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    public Integer actorId;

    @Column
    public String firstName;

    @Column
    public String lastName;

    @ManyToMany(fetch = LAZY, mappedBy = "actors", 
        cascade = CascadeType.ALL)
    public Set<Film> films = new HashSet<>();

    public Actor(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

@Entity
public class Film {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    public Integer filmId;

    @Column
    public String title;

    @Column(name = "RELEASE_YEAR")
    @Convert(converter = YearConverter.class)
    public Year releaseYear;

    @ManyToMany(fetch = LAZY, cascade = CascadeType.ALL)
    public Set<Actor> actors = new HashSet<>();

    public Film(String title, Year releaseYear) {
        this.title = title;
        this.releaseYear = releaseYear;
    }
}

// Imagine also a Language entity here...

(Just a simple example. Let’s not discuss the caveats of @ManyToMany mapping).

For more info, the full example can be found on Github:

Now observe the fact that we’ve gone through all the trouble of mapping the database type INT for the RELEASE_YEAR column to the cool JSR-310 java.time.Year type for convenience. This has been done using a JPA 2.1 AttributeConverter, which simply looks like this:

public class YearConverter 
implements AttributeConverter<Year, Integer> {

    @Override
    public Integer convertToDatabaseColumn(Year attribute) {
        return attribute == null ? null : attribute.getValue();
    }

    @Override
    public Year convertToEntityAttribute(Integer dbData) {
        return dbData == null ? null : Year.of(dbData);
    }
}

Using jOOQ’s JPADatabase

Now, the JPADatabase in jOOQ allows you to simply configure the input entities (e.g. their package names) and generate jOOQ code from it. This works behind the scenes with this algorithm:

  • Spring is used to discover all the annotated entities on the classpath
  • Hibernate is used to generate an in-memory H2 database from those entities
  • jOOQ is used to reverse-engineer this H2 database again to generate jOOQ code

This works pretty well for most use-cases as the JPA annotated entities are already very vendor-agnostic and do not provide access to many vendor-specific features. We can thus perfectly easily write the following kind of query with jOOQ:

ctx.select(
        ACTOR.FIRSTNAME,
        ACTOR.LASTNAME,
        count().as("Total"),
        count().filterWhere(LANGUAGE.NAME.eq("English"))
          .as("English"),
        count().filterWhere(LANGUAGE.NAME.eq("German"))
          .as("German"),
        min(FILM.RELEASE_YEAR),
        max(FILM.RELEASE_YEAR))
   .from(ACTOR)
   .join(FILM_ACTOR)
     .on(ACTOR.ACTORID.eq(FILM_ACTOR.ACTORS_ACTORID))
   .join(FILM)
     .on(FILM.FILMID.eq(FILM_ACTOR.FILMS_FILMID))
   .join(LANGUAGE)
     .on(FILM.LANGUAGE_LANGUAGEID.eq(LANGUAGE.LANGUAGEID))
   .groupBy(
        ACTOR.ACTORID,
        ACTOR.FIRSTNAME,
        ACTOR.LASTNAME)
   .orderBy(ACTOR.FIRSTNAME, ACTOR.LASTNAME, ACTOR.ACTORID)
   .fetch()

(more info about the awesome FILTER clause here)

In this example, we’re also using the LANGUAGE table, which we omitted in the article. The output of the above query is something along the lines of:

+---------+---------+-----+-------+------+----+----+
|FIRSTNAME|LASTNAME |Total|English|German|min |max |
+---------+---------+-----+-------+------+----+----+
|Daryl    |Hannah   |    1|      1|     0|2015|2015|
|David    |Carradine|    1|      1|     0|2015|2015|
|Michael  |Angarano |    1|      0|     1|2017|2017|
|Reece    |Thompson |    1|      0|     1|2017|2017|
|Uma      |Thurman  |    2|      1|     1|2015|2017|
+---------+---------+-----+-------+------+----+----+

As we can see, this is a very suitable combination of jOOQ and JPA. JPA was used to insert the data through JPA’s useful object graph persistence capabilities, whereas jOOQ is used for reporting on the same tables.

Now, since we already wrote this nice AttributeConverter, we certainly want to apply it also to the jOOQ query and get the java.time.Year data type also in jOOQ, without any additional effort.

jOOQ 3.10 auto conversion

In jOOQ 3.10, we don’t have to do anything anymore. The existing JPA converter will automatically mapped to a jOOQ converter as the generated jOOQ code reads:

// Don't worry about this generated code
public final TableField<FilmRecord, Year> RELEASE_YEAR = 
    createField("RELEASE_YEAR", org.jooq.impl.SQLDataType.INTEGER, 
        this, "", new JPAConverter(YearConverter.class));

… which leads to the previous jOOQ query now returning a type:

Record7<String, String, Integer, Integer, Integer, Year, Year>

Luckily, this was rather easy to implement as the Hibernate meta model allows for navigating the binding between entities and tables very conveniently as described in this article here:

How to get the entity mapping to database table binding metadata from Hibernate

More similar features are coming up in jOOQ 3.11, e.g. when we look into reverse engineering JPA @Embedded types as well. See https://github.com/jOOQ/jOOQ/issues/6518

If you want to run this example, do check out our jOOQ/JPA example on GitHub:

A Curious Java Language Feature and How it Produced a Subtle Bug

Java’s visibility rules are tricky at times. Do you know what this will print?

package p;

import static p.A.x;

class A {
    static String x = "A.x";
}

class B {
    String x = "B.x";
}

class C {
    String x = "C.x";

    class D extends B {
        void m() {
            System.out.println(x);
        }
    }
}

public class X {
    public static void main(String[] args) {
        new C().new D().m();
    }
}

It will print (highlight to see the solution):

B.x

Because:

The super type B's members hide the enclosing type C's members, 
which again hide the static import from A.

How can this lead to bugs?

The problem isn’t that the above code is tricky per se. When you write this logic, everything will work as expected. But what happens if you change things? For instance, if you mark the super type’s attribute as private:

package p;

import static p.A.x;

class A {
    static String x = "A.x";
}

class B {
    private String x = "B.x"; // Change here
}

class C {
    String x = "C.x";

    class D extends B {
        void m() {
            System.out.println(x);
        }
    }
}

public class X {
    public static void main(String[] args) {
        new C().new D().m();
    }
}

Now, suddenly, B.x is no longer visible from within method m(), so the rules are now:

Enclosing member hides static import

And we get the result of

C.x

Of course, we can change this again to the following code:

package p;

import static p.A.x;

class A {
    static String x = "A.x";
}

class B {
    private String x = "B.x";
}

class C {
    String xOld = "C.x"; // Change here

    class D extends B {
        void m() {
            System.out.println(x);
        }
    }
}

public class X {
    public static void main(String[] args) {
        new C().new D().m();
    }
}

As we all know, 50% of variables that are no longer needed are renamed to “old”.

Now, in this final version, there’s only one possible meaning of x inside of m(), and that’s the statically imported A.x, thus the output is:

A.x

Subtleties across a larger code base

These refactorings have shown that making something less visible is dangerous because a sub type might have depended on it, but for some freak coincidence, there was another member in a less “important” scope by the same name that now jumps in and keeps you from getting a compiler error.

The same can happen if you make something that was private more visible. Suddenly, it might be in scope on all its subtypes, even if you didn’t want it to be in scope.

Likewise, with the static import, we could run into a similar issue. When your code depends on a static import, it might suddenly be hidden by some member of the same name, e.g. in a super type. The author of the change might not even notice, because they’re not looking at your subtype.

Conclusion

The conclusion is, once more, not to rely on subtyping too much. If you can make your classes final, no one will ever override them and accidentally “profit” from your newly added members.

Besides that, every time you do a visibility change of your members, be very careful about potential members that are named the same accidentally.

How I Incorrectly Fetched JDBC ResultSets. Again.

You know JDBC, right? It’s that really easy, concise API that we love to use to work with virtually any database, relational or not. It has essentially three types that you need to care about:

All the other types some sort of utilities.

Now, with the above three, we can do really nice and lean Java/SQL coding as follows:

try (Connection c = datasource.getConnection();
     Statement s = c.createStatement();
     ResultSet r = s.executeQuery("SELECT 'hello'")
) {
    while (r.next())
        System.out.println(r.getString(1));
}

Output:

hello

OK? Super easy.

Unless…

Unless you want to write generic JDBC code, because you don’t know what the SQL string is. It could be a SELECT statement. It could be and UPDATE. It could be DDL. It could be a statement batch (several statements). It could call triggers and stored procedures, which again produce nice things like warnings, exceptions, update counts, and additional result sets.

You know, the sort of thing that might come flying in to a generic utility method like jOOQ’s ResultQuery.fetchMany().

(Don’t think this couldn’t happen to you as well. SQL Server triggers are really mean things!)

For this, let’s consider the correct way to execute the following simple statement batch that works wonderfully in SQL Server:

raiserror('e0', 0, 2, 3);
create table t(i int);
raiserror('e1', 5, 2, 3);
insert into t values (1);
raiserror('e2', 10, 2, 3);
insert into t values (2);
raiserror('e3', 15, 2, 3);
select * from t;
drop table t;
raiserror('e4', 16, 2, 3);

The result is:

And obviously

For your convenience, I have pre-formatted the above String into a Java String variable, which is already the first problem, because Java STILL doesn’t have multi-line strings (gaah):

String sql =
    "\n raiserror('e0', 0, 2, 3);"
  + "\n create table t(i int);"
  + "\n raiserror('e1', 5, 2, 3);"
  + "\n insert into t values (1);"
  + "\n raiserror('e2', 10, 2, 3);"
  + "\n insert into t values (2);"
  + "\n raiserror('e3', 15, 2, 3);"
  + "\n select * from t;"
  + "\n drop table t;"
  + "\n raiserror('e4', 16, 2, 3);";

Now see, we might be inclined to just copy paste some JDBC snippet off some website (e.g. this blog, and take its first snippet) and execute it as such:

try (
    Statement s = c.createStatement();
    ResultSet r = s.executeQuery(sql)
) {
    while (r.next())
        System.out.println(r.getString(1));
}

Yeah. What’ll happen if we do this?

Rats:

com.microsoft.sqlserver.jdbc.SQLServerException: e3
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1547)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:857)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:757)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7151)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2689)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:224)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:204)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(SQLServerStatement.java:659)
	at SQLServer.main(SQLServer.java:80)

e3? What on earth? So what happened with my statement batch? Did it execute? Only until the middle? Or did I get to the end as well?

OK, quite obviously, we have to do this more carefully. We cannot use Statement.executeQuery() here, because we don’t know whether we’ll get a result set. In fact, we got an exception, but not the first one.

Let’s try something else. Let’s try this:

try (Statement s = c.createStatement()) {
    System.out.println(s.execute(sql));
}

That just yields:

false

Okaaay, did anything execute in the database at all? No more exceptions… Let me have a look at the SQL Server Profiler…

Nope, the entire batch got executed. (Could’ve just removed the DROP TABLE statement and checked the contents of table T in SQL Server Management Studio, of course).

Huh, quite a different result, depending on what method we’re calling. Does that scare you? Does your ORM get this right? (jOOQ didn’t but this is now fixed).

OK, let’s read the Javadoc on Statement.execute()

It says:

Executes the given SQL statement, which may return multiple results. In some (uncommon) situations, a single SQL statement may return multiple result sets and/or update counts. Normally you can ignore this unless you are (1) executing a stored procedure that you know may return multiple results or (2) you are dynamically executing an unknown SQL string.
The execute method executes an SQL statement and indicates the form of the first result. You must then use the methods getResultSet or getUpdateCount to retrieve the result, and getMoreResults to move to any subsequent result(s).

Huh, OK. Statement.getResultSet() and getUpdateCount() must be used, and then getMoreResults()

The getMoreResults() method also has this interesting bit of information:

There are no more results when the following is true:

// stmt is a Statement object
((stmt.getMoreResults() == false) && (stmt.getUpdateCount() == -1))

Interesting. -1. I guess we can be very happy that at least it’s not returning null or a punch in your face.

So, let’s try this again:

  • We first have to call execute()
  • If it’s true, we fetch getResultSet()
  • If it’s false, we check getUpdateCount()
  • If that was -1, we can stop

Or, in code:

fetchLoop:
for (int i = 0, updateCount = 0; i < 256; i++) {
    boolean result = (i == 0)
        ? s.execute(sql)
        : s.getMoreResults();

    if (result)
        try (ResultSet rs = s.getResultSet()) {
            System.out.println("Result      :");

            while (rs.next())
                System.out.println("  " + rs.getString(1));
        }
    else if ((updateCount = s.getUpdateCount()) != -1)
        System.out.println("Update Count: " + updateCount);
    else
        break fetchLoop;
}

Beautiful! Some remarks:

  • Note how the loop stops after 256 iterations. Never trust these infinite streaming APIs, there’s always a bug somewhere, trust me
  • The boolean value return from Statement.execute() and Statement.getMoreResults() is the same. We can assign it to a variable inside the loop and call execute only on the first iteration
  • If true, fetch the result set
  • If false, check the update count
  • If that was -1, stop

Run time!

Update Count: 0
Update Count: 1
Update Count: 1
com.microsoft.sqlserver.jdbc.SQLServerException: e3
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1547)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getMoreResults(SQLServerStatement.java:1270)
	at SQLServer.main(SQLServer.java:83)

Crap. But did it execute completely? Yes it did, but we didn’t get that sweet result set after e3, because of that exception. But at least, we now got 3 update counts. But wait a second, why didn’t we get e0, e1, and e2?

AHA, they’re warnings, not exceptions. Funky SQL Server decided that everything below some severity level is a warning. Whatever.

Anyway, let’s fetch those warnings as well!

fetchLoop:
for (int i = 0, updateCount = 0; i < 256; i++) {
    boolean result = (i == 0)
        ? s.execute(sql)
        : s.getMoreResults();

    // Warnings here
    SQLWarning w = s.getWarnings();
    for (int j = 0; j < 255 && w != null; j++) {
        System.out.println("Warning     : " + w.getMessage());
        w = w.getNextWarning();
    }

    // Don't forget this
    s.clearWarnings();

    if (result)
        try (ResultSet rs = s.getResultSet()) {
            System.out.println("Result      :");

            while (rs.next())
                System.out.println("  " + rs.getString(1));
        }
    else if ((updateCount = s.getUpdateCount()) != -1)
        System.out.println("Update Count: " + updateCount);
    else
        break fetchLoop;
}

Great, so now we get all the warnings e0, e1, e2, and the exception e3, along with the update counts:

Warning     : e0
Update Count: 0
Warning     : e1
Update Count: 1
Warning     : e2
Update Count: 1
com.microsoft.sqlserver.jdbc.SQLServerException: e3
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1547)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getMoreResults(SQLServerStatement.java:1270)
	at SQLServer.main(SQLServer.java:82)

That’s more like our batch. But we’re still aborting after e3. How can we get the result set? Easy! Just ignore the exception, right? 🙂

And while we’re at it, let’s use ResultSetMetaData to read the unknown result set type.

fetchLoop:
for (int i = 0, updateCount = 0; i < 256; i++) {
    try {
        boolean result = (i == 0)
            ? s.execute(sql)
            : s.getMoreResults();

        SQLWarning w = s.getWarnings();
        for (int j = 0; j < 255 && w != null; j++) {
            System.out.println("Warning     : " + w.getMessage());
            w = w.getNextWarning();
        }

        s.clearWarnings();

        if (result)
            try (ResultSet rs = s.getResultSet()) {
                System.out.println("Result      :");
                ResultSetMetaData m = rs.getMetaData();

                while (rs.next())
                    for (int c = 1; c <= m.getColumnCount(); c++)
                        System.out.println(
                            "  " + m.getColumnName(c) +
                            ": " + rs.getInt(c));
            }
        else if ((updateCount = s.getUpdateCount()) != -1)
            System.out.println("Update Count: " + updateCount);
        else
            break fetchLoop;
        }
    catch (SQLException e) {
        System.out.println("Exception   : " + e.getMessage());
    }
}

There, that’s more like it:

Warning     : e0
Update Count: 0
Warning     : e1
Update Count: 1
Warning     : e2
Update Count: 1
Exception   : e3
Result      :
  i: 1
  i: 2
Update Count: 0
Exception   : e4

Now we’ve executed the entire batch in a super generic way with JDBC

Gah, I want this to be easier

Of course you do, which is why there is jOOQ. jOOQ has the really nice fetchMany() methods, which can execute random SQL strings to get a mixture of:

  • Update counts
  • Result sets
  • Exceptions / Warnings (jOOQ 3.10+ only)

For example, we can write:

// Use this new setting to indicate that we don't want to throw
//  exceptions, but collect them, as we've seen above
DSLContext ctx = DSL.using(c, 
  new Settings().withThrowExceptions(THROW_NONE));

// And then, simply:
System.out.println(ctx.fetchMany(sql));

The result is of the form:

Warning: SQL [...]; e0
Update count: 0
Warning: SQL [...]; e1
Update count: 1
Warning: SQL [...]; e2
Update count: 1
Exception: SQL [...]; e3
Result set:
+----+
|   i|
+----+
|   1|
|   2|
+----+
Update count: 0
Exception: SQL [...]; e4

Excellent!

What we didn’t cover

Oh, tons of things, but I need material for future blog posts, too, right?

  • We only discussed SQL Server so far
  • We didn’t discuss the fact that SQLException.getNextException() doesn’t work here
  • We didn’t discuss how we can combine this with OUT parameters (eegh, at what moment do we fetch those)
  • We didn’t discuss the fact that some JDBC drivers don’t implement this correctly (looking at you, Oracle)
  • We didn’t go into the depths of how JDBC drivers don’t implement ResultSetMetaData correctly
  • We didn’t cover the performance overhead of fetching warnings, e.g. in MySQL
  • … and much more

So, are you still writing JDBC code yourself? 🙂

A Basic Programming Pattern: Filter First, Map Later

In recent days, I’ve seen a bit too much of this:

someCollection
    .stream()
    .map(e -> someFunction(e))
    .collect(Collectors.toList())
    .subList(0, 2);

Something is very wrong with the above example. Can you see it? No? Let me rename those variables for you.

hugeCollection
    .stream()
    .map(e -> superExpensiveMapping(e))
    .collect(Collectors.toList())
    .subList(0, 2);

Better now? Exactly. The above algorithm is O(N) when it could be O(1):

hugeCollection
    .stream()
    .limit(2)
    .map(e -> superExpensiveMapping(e))
    .collect(Collectors.toList());

(Let’s assume the lack of explicit ordering is irrelevant)

I’m working mostly with SQL and helping companies tune their SQL (check out our training, btw) and as such, I’m always very eager to reduce the algorithmic complexity of queries. To some people, this seems like wizardry, but honestly, most of the time, it just boils down to adding a well placed index.

Why?

Because an index reduces the algorithmic complexity of a query algorithm from O(N) (scanning the entire table for matches) to O(log N) (scanning a B-tree index for the same matches).

Sidenote: Reducing algorithmic complexity is almost never premature optimisation. As your data set grows, bad complexity will always bite you!

The same is true for the above examples. Why would you ever traverse and transform an entire list of a huge amount of elements (N) with an expensive operation (superExpensiveMapping), when you really only need to do this only for the first two values?

Conclusion

SQL is a declarative language where the query optimiser gets this right automatically for you: It will (almost) always filter the data first (WHERE clause), and only then transform it (JOIN, GROUP BY,
SELECT
, etc).

Just as in SQL, when we hand-write our queries using Streams (and also in imperative programming), do always:

Filter First, Map Later

Your production system will thank you.

Note, another interesting case with flatMap() is documented here.

ORMs Should Update “Changed” Values, Not Just “Modified” Ones

In this article, I will establish how the SQL language and its implementations distinguish between changed values and modified values, where a changed value is a value that has been “touched”, but not necessarily modified, i.e. the value might be the same before and after the change.

Many ORMs, unfortunately, either update all of a record’s values, or only the modified ones. The first can be inefficient, and the latter can be wrong. Updating the changed values would be correct.

Note that you may have a different definition of changed and modified. For this article, let’s just assume that the above definition is as valid as it is useful.

Introduction

A very interesting discussion was triggered recently by Vlad Mihalcea who was looking for an answer to this interesting question:

What’s the overhead of updating all columns, even the ones that haven’t changed?

Apart from the question being very interesting from a performance perspective, the tweet also inspired functional aspects of a distinction between updating all columns vs. updating some columns, which I’ll summarise in this article.

What’s the Problem?

The problem is one that all ORM vendors need to solve: ORMs have a client side representation of the relational model, and that representation is cached (or “out of sync”) for a user to change and then persist again. The problem is now how to re-synchronise the client side representation with the server side representation in a consistent and correct way.

Sidenote: By ORM I understand any tool that maps from a client side representation of your database schema to the database schema itself, regardless if the product supports full-fledged JPA-style object graph persistence, or “merely” implements an “active record” pattern, such as jOOQ 3.x (I find that distinction a bit academic).

All such ORMs have a client side representation of a database record, for instance given the following table (I’m going to be using PostgreSQL syntax):

CREATE TABLE customer (
  customer_id SERIAL8     NOT NULL PRIMARY KEY,
  first_name  VARCHAR(50) NOT NULL,
  last_name   VARCHAR(50) NOT NULL
)

You’re going to have a client side representation as the following (using Java, e.g. jOOQ or JPA):

// jOOQ generated UpdatableRecord
public class CustomerRecord 
extends UpdatableRecordImpl<CustomerRecord> {

  public CustomerRecord setCustomerId(Long customerId) { ... }
  public Long getCustomerId() { ... }
  public CustomerRecord setFirstName(String firstName) { ... }
  public String getFirstName() { ... }

  ...
}

// JPA annotated entity
@Entity
public class Customer {

  @Id
  @GeneratedValue(strategy = IDENITITY)
  public long customerId;

  @Column
  public String firstName;

  ...
}

In principle, these two approaches are the same thing with the distinction that jOOQ explicitly governs all UpdatableRecord interactions through type inheritance, whereas JPA makes this dependency more implicit through annotations:

  • jOOQ – explicit behavioural dependency between entity and jOOQ logic
  • JPA – implicit behavioural dependency between entity and JPA entity manager

In principle, the distinction is just a matter of taste, a programming style: Explicit vs. declarative.

But from a practical perspective, the JPA implementation lacks an important feature when it comes to synching the state back to the database. It cannot reflect change, only modification.

How to synch the state back to the database?

Let’s assume we have a customer called John Doe:

INSERT INTO customer (first_name, last_name)
VALUES ('John', 'Doe');

And that customer now changes their names to John Smith. We have several options of sending that update to the database, through “PATCH” or “PUT” semantics – terminology used by Morgan Tocker in another tweet in that discussion:

-- PATCH
UPDATE customer SET last_name = 'Smith' WHERE id = ? 

-- PUT
UPDATE customer 
SET first_name = 'John',
    last_name = 'Smith'
WHERE customer_id = ? 

A “PATCH” operation sends only the changed values back to the server, whereas a “PUT” operation sends the entire entity back to the server.

Discussion – Semantics.

In favour of PUT

The two operations are semantically very different. If another session attempts to rename this customer to Jane Doe concurrently (and without optimistic locking being in place), then the PATCH operation might result in an inconsistent outcome (Jane Smith), whereas the PUT operation would still produce one of the expected results, depending on what write is executed first:

-- PATCH result: Jane Smith
-- PATCH 1
UPDATE customer SET last_name = 'Smith' WHERE customer_id = ? 

-- PATCH 2
UPDATE customer SET first_name = 'Jane' WHERE customer_id = ? 

-- PUT result: Jane Doe
-- PUT 1
UPDATE customer 
SET first_name = 'John',
    last_name = 'Smith'
WHERE customer_id = ? 

-- PUT 2
UPDATE customer 
SET first_name = 'Jane',
    last_name = 'Doe'
WHERE customer_id = ? 

This is one of the reasons why Hibernate, as a JPA implementation, always implements PUT semantics by default, sending all the columns at once. You can opt out of this by using the @DynamicUpdate, which will only update modified values (not “changed” values, I’ll explain this distinction later).

This makes perfect sense in such a trivial setup, but it is a short-sighted solution, when the table has many more columns. We’ll see right away why:

In favour of PATCH

One size doesn’t fit all. Sometimes, you do want concurrent updates to happen, and you do want to implement PATCH semantics, because sometimes, two concurrent updates do not work against each other. Take the following example using an enhancement of the customer table.

Business is asking us to collect some aggregate metrics for each customer. The number of clicks they made on our website, as well as the number of purchases they made:

CREATE TABLE customer (
  customer_id SERIAL8     NOT NULL PRIMARY KEY,
  first_name  VARCHAR(50) NOT NULL,
  last_name   VARCHAR(50) NOT NULL,

  clicks      BIGINT      NOT NULL DEFAULT 0,
  purchases   BIGINT      NOT NULL DEFAULT 0
)

And, of course, once you agree that the above design is a suitable one, you’ll immediately agree that here, PATCH semantics is more desirable than PUT semantics:

-- Updating clicks
UPDATE customer SET clicks = clicks+1 WHERE customer_id = ? 

-- Updating purchases
UPDATE customer SET purchases = purchases+1 WHERE customer_id = ? 

Not only do we update only an individual column, we’re doing it entirely in SQL, including the calculation. With this approach, we do not even need optimistic locking to guarantee update correctness, as we’re not using any client side cached version of the customer record, which could be out of date and would need optimistic (or worse: pessimistic) locking.

If we implemented this differently, using client side calculation of the updated clicks / purchases counters…

-- Updating clicks
UPDATE customer 
SET clicks = ? 
WHERE customer_id = ? 

-- Updating purchases
UPDATE customer 
SET purchases = ? 
WHERE customer_id = ? 

… then we’d need one of these techniques:

  • Pessimistic locking: Nope, won’t work. We could still get incorrect updates
  • Optimistic locking: Indeed, any update would need to be done on a versioned customer record, so if there are two concurrent updates, one of them will fail and could try again. This guarantees data integrity, but will probably make this functionality very painful, because a lot of click updates are probably done in a short amount of time, and they would need to be repeated until they work!
  • Client side synchronisation: Of course, we could prevent concurrency for these updates on the client side, making sure that only one concurrent process ever updates click counts (for a given customer). We could implement a click count update queue for this.

All of the above options have significant drawbacks, the easiest solution is really to just increment the counter directly in the database.

And don’t forget, if you choose a bind-variable based solution, and opt for updating ALL the columns, rather than just the changed one, your first_name / last_name updates might conflict with these counter updates as well, making things even more complicated.

Partial PUT (or compound PATCH)

In fact, from a semantics perspective, if you do want to use an ORM to update an entity, you should think about a “partial PUT” semantics, which separates the different entity elements in “sub entities”. From a relational perspective, of course, no such thing as a subentity exists. The above example should be normalised into this, and we would have much less concurrency issues:

CREATE TABLE customer (
  customer_id SERIAL8     NOT NULL PRIMARY KEY,
  first_name  VARCHAR(50) NOT NULL,
  last_name   VARCHAR(50) NOT NULL
);

CREATE TABLE customer_clicks
  customer_id BIGINT NOT NULL PRIMARY KEY REFERENCES customer,
  clicks      BIGINT NOT NULL DEFAULT 0
);

CREATE TABLE customer_purchases
  customer_id BIGINT NOT NULL PRIMARY KEY REFERENCES customer,
  purchases   BIGINT NOT NULL DEFAULT 0
);

This way, the previously mentioned PUT semantics would not create situations where individual, semantically unrelated updates (updates to names, updates to clicks) would interfere with each other. We would only need to make sure that e.g. two competing updates to clicks are correctly serialised.

Practically, we often don’t design our databases this way, either for convenience reasons, for optimised storage, for optimised querying (see also our article when normalisation and surrogate keys hurt performance).

jOOQ’s “changed” value semantics

So that “sub entity” is really just a logical thing, which can be represented either as a logically separate entity in JPA, or we can use jOOQ, which works a bit differently here. In jOOQ, we can change an UpdatableRecord only partially, and that partial change is sent to the server:

CustomerRecord customer = ctx
    .selectFrom(CUSTOMER)
    .where(CUSTOMER.CUSTOMER_ID.eq(customerId))
    .fetchOne();

customer.setFirstName("John");
customer.setLastName("Smith");

assertTrue(customer.changed(CUSTOMER.FIRST_NAME));
assertTrue(customer.changed(CUSTOMER.LAST_NAME));
assertFalse(customer.changed(CUSTOMER.CLICKS));
assertFalse(customer.changed(CUSTOMER.PURCHASES));

customer.store();

assertFalse(customer.changed(CUSTOMER.FIRST_NAME));
assertFalse(customer.changed(CUSTOMER.LAST_NAME));
assertFalse(customer.changed(CUSTOMER.CLICKS));
assertFalse(customer.changed(CUSTOMER.PURCHASES));

This will send the following statement to the server:

UPDATE customer
SET first_name = ?,
    last_name = ?
WHERE customer_id = ?

Optionally, just as with JPA, you can turn on optimistic locking on this statement. The important thing here is that the clicks and purchases columns are left untouched, because they were not changed by the client code. This is different from JPA, which either sends all the values by default, or if you specify @DynamicUpdate in Hibernate, it would send only the last_name column, because while first_name was changed it was not modified.

My definition:

  • changed: The value is “touched”, its state is “dirty” and the state needs to be synched to the database, regardless of modification.
  • modified: The value is different from its previously known value. By necessity, a modified value is always changed.

As you can see, these are different things, and it is quite hard for a JPA-based API like Hibernate to implement changed semantics because of the annotation-based declarative nature of how entities are defined. We’d need some sophisticated instrumentation to intercept all data changes even when the values have not been modified (I didn’t make those attributes public by accident).

Without this distinction, however, it is unreasonable to use @DynamicUpdate in Hibernate, as we might run into that situation we didn’t want to run into, where we get a customer called “Jane Smith” – or we use optimistic locking, in case of which there’s not much point in using @DynamicUpdate.

The database perspective

From a database perspective, it is also important to distinguish between change and modification semantics. In the answer I gave on Stack Exchange, I’ve illustrated two situations:

INSERTs and DEFAULT values

Thus far, we’ve discussed only UPDATE statements, but similar reasoning may be made for INSERT as well. These two statements are the same:

INSERT INTO t (a, b)    VALUES (?, ?);
INSERT INTO t (a, b, c) VALUES (?, ?, DEFAULT);

This one, however, is different:

INSERT INTO t (a, b, c) VALUES (?, ?, ?);

In the first case, a DEFAULT clause (e.g. timestamp generation, identity generation, trigger value generation, etc.) may apply to the column c. In the second case, the value c is provided explicitly by the client.

Languages like Java do not have any way to represent this distinction between

  • NULL (which is usually, but not always, the DEFAULT) in SQL
  • an actual DEFAULT

This can only be achieved when an ORM implements changed semantics, like jOOQ does. When you create a customer with jOOQ, then clicks and purchases will have their DEFAULT applied:

CustomerRecord c1 = ctx.newRecord(CUSTOMER);
c1.setFirstName("John");
c1.setLastName("Doe");
c1.store();

CustomerRecord c2 = ctx.newRecord(CUSTOMER);
c2.setFirstName("Jane");
c2.setLastName("Smith");
c2.setClicks(1);
c2.setPurchases(1);
c2.store();

Resulting SQL:

-- c1.store();
INSERT INTO customer (first_name, last_name)
VALUES (?, ?);

-- c2.store();
INSERT INTO customer (first_name, last_name, clicks, purchases)
VALUES (?, ?, ?, ?);

In both cases, that’s what the user tells jOOQ to do, so jOOQ will generate a query accordingly.

Back to UPDATE statements

Consider the following example using Oracle triggers:

CREATE TABLE x (a INT PRIMARY KEY, b INT, c INT, d INT);

INSERT INTO x VALUES (1, 1, 1, 1);

CREATE OR REPLACE TRIGGER t
  BEFORE UPDATE OF c, d -- Doesn't fire on UPDATE OF b!
  ON x
BEGIN
  IF updating('c') THEN
    dbms_output.put_line('Updating c');
  END IF;
  IF updating('d') THEN
    dbms_output.put_line('Updating d');
  END IF;
END;
/

SET SERVEROUTPUT ON
UPDATE x SET b = 1 WHERE a = 1;
UPDATE x SET c = 1 WHERE a = 1;
UPDATE x SET d = 1 WHERE a = 1;
UPDATE x SET b = 1, c = 1, d = 1 WHERE a = 1;

It results in the following output:

table X created.
1 rows inserted.
TRIGGER T compiled
1 rows updated.
1 rows updated.
Updating c

1 rows updated.
Updating d

1 rows updated.
Updating c
Updating d

As you can see, the trigger doesn’t fire when we update only column b, which it is not interested in. Again, this goes in the direction of distinguishing between changed and modified values, where a trigger fires only when a value is changed (but not necessarily modified).

Now, if an ORM will always update all the columns, this trigger will not work correctly. Sure, we can compare :OLD.b and :NEW.b, but that would check for modification, not change, and it might be costly to do so for large strings!

Speaking of costs…

Performance

Statement caching: Weakly in favour of PUT

While one of the reasons the Hibernate team mentioned in favour of updating all the columns is improved cursor cache performance (fewer distinct SQL statements need to be parsed by the database as there are fewer distinct update configurations), I suggest that this “premature optimisation” is negligible. If a client application runs dynamic updates (in the jOOQ sense, where changed values are updated, not just modified values), then chances that the possible SQL statements that need to be parsed will explode are slim to non-existent.

I would definitely like to see real-world benchmarks on this topic!

Batching: Weakly in favour of PUT

When you want to batch tons of update statements from JDBC, then indeed, you will need to ensure that they all have the exact same SQL string. However, this is not a good argument in favour of using PUT semantics and updating all columns.

I’m saying “not good”, because such a batched update should still only consider a subset of the columns for update, not all the columns. And that subset should be determined on aggregated changed flags, not data modification.

Index updates: In favour of PATCH (depending on the database)

Most databases optimise index updates to ignore indexes whose columns have not been changed. Oracle also doesn’t update indexes whose columns have not been modified, in case of which PUT and PATCH semantics both work the same way from an indexing perspective. Other databases may not work this way, where PATCH semantics is favourable.

But even if the optimisation is in place, the old and the new values have to be compared for equality (i.e. to see if a modification took place). You don’t want to compare millions of strings per second if there’s no need to do so! Check out Morgan Tocker’s interesting answer on Stack Exchange, from a MySQL perspective

So, why not just prevent expensive modification checks by telling the database what has changed, instead?

UNDO overhead: In favour of PATCH

Every statement has a footprint on the UNDO / REDO logs. As I’ve shown above, the statements are semantically different in many ways, so if your statement is bigger (more columns are updated), then the impact on the UNDO / REDO log is bigger as well. This can have drastic effects depending on the size of your table / columns:

Don’t forget that this can also affect backup performance!

More performance related information in this blog post:

https://jonathanlewis.wordpress.com/2007/01/02/superfluous-updates

Note: While these bits of information were mostly Oracle-specific, common sense dictates that other RDBMS will behave in similar ways.

Conclusion

With all these negative aspects to including unnecessary columns for update through an ORM compared to the almost negligible benefits, I’d say that users should move forward and completely avoid this mess. Here’s how:

  • jOOQ optimises this out of the box, if users set the changed values explicitly. Beware that when you “load” a POJO into a Record, it will set all the columns to changed, which may or may not be the desired effect!
  • Hibernate allows for @DynamicUpdate, which may work incorrectly as we have minimal “PATCH” semantics based on modified values, not on changed values. However, JPA allows for declaring more than one entity per table, which might certainly be a valid option for this kind of problem
  • Normalisation is always an option, with its own trade offs. The clicks and purchases columns could be externalised in separate tables, if this benefits the overall design.
  • More often than not, writing an UPDATE with SQL directly is the best choice. As we’ve seen in this article, the counters should be updated with expressions of the form clicks = clicks + 1, which circumvents most problems exposed in this article.

In short, as Michael Simons said:

And we all do feel very dirty when we write SELECT *, right? So we should at least be wary of updating all the columns as well.

Using Kotlin’s Apply Function for Dynamic SQL with jOOQ

It was hard to limit ourselves to 10 Nice Examples of Writing SQL in Kotlin With jOOQ, recently, because the Kotlin language has many nice little features that really help a lot when working with Java libraries. We’ve talked about the nice with() stdlib function, which allows to “import” a namespace for a local scope or closure:

with (AUTHOR) {
    ctx.select(FIRST_NAME, LAST_NAME)
       .from(AUTHOR)
       .where(ID.lt(5))
       .orderBy(ID)
       .fetch {
           println("${it[FIRST_NAME]} ${it[LAST_NAME]}")
       }
}

In the above example, the AUTHOR table is made available as the this reference in the closure following the with function, which works exactly like JavaScript’s with(). Everything in AUTHOR is available, without dereferencing it from AUTHOR.

Apply is very similar

A very similar feature is made available through apply(), although with different syntactic implications. Check out this Stack Overflow question for some details about with() vs. apply() in Kotlin.

When using jOOQ, apply() is most useful for dynamic SQL. Imagine you have local variables indicating whether some parts of a query should be added to the query:

val filtering = true;
val joining = true;

These boolean variables would be evaluated dynamically, of course. filtering specifies whether a dynamic filter / where clause is needed, whereas joining specifies whether an additional JOIN is required.

So, the following query will select authors, and:

  • if “filtering”, we’re selecting only author ID = 1
  • if “joining”, we’ll join the books table and count the number of books per author

Both of these predicates are independent. Enter the game: apply():

ctx.select(
      a.FIRST_NAME, 
      a.LAST_NAME, 
      if (joining) count() else value(""))
   .from(a)
   .apply { if (filtering) where(a.ID.eq(1)) }
   .apply { if (joining) join(b).on(a.ID.eq(b.AUTHOR_ID)) }
   .apply { if (joining) groupBy(a.FIRST_NAME, a.LAST_NAME) }
   .orderBy(a.ID)
   .fetch {
       println(it[a.FIRST_NAME] + " " + 
               it[a.LAST_NAME] +
               (if (joining) " " + it[count()] else ""))
   }

That’s neat! See, the jOOQ API doesn’t specify any apply() method / function, yet you can chain the apply() function to the jOOQ API as if it were natively supported.

Like with(), apply() makes a reference available to a closure as this, so it doesn’t have to be referenced explicitly anymore. Which means, we can write neat things like

   .apply { if (filtering) where(a.ID.eq(1)) }

Where a where() clause is added only if we’re filtering!

Of course, jOOQ (or any other query builder) lends itself to this kind of dynamic SQL, and it can be done in Java too:
https://www.jooq.org/doc/latest/manual/sql-building/dynamic-sql

But the Kotlin-specific fluent integration using apply() is exceptionally neat. Well done, Kotlin!

Side-note

This only works because the jOOQ DSL API of jOOQ 3.x is mutable and every operation returns the same this reference as was kindly pointed out by Ilya Ryzhenkov

In the future (e.g. version 4.0), we’re planning on making the jOOQ API more immutable – mutability is a historic legacy (although, often, it’s the desired behaviour for a query builder).

More nice Kotlin/jOOQ tricks in this article here.

10 Tips on How to be a Great Programmer

I was recently asked in an interview about my opinion on how to be a great programmer. That’s an interesting question, and I think we can all be great programmers, regardless of our talent, if we follow a couple of rules that – I believe – should be common sense. In fact, these rules don’t all apply to programmers only, but to any professional.

Not everything in this list is meant to be taken entirely seriously, some things are just my opinions and yours may vary, and not all descriptions of programmer personalities match real-world situations I may have encountered, so if in doubt, please do not take offense. I didn’t mean you 🙂

Here they are:

1. Learn how to ask questions

There are essentially these types of programmers asking questions:

  • The perfectionist: Especially when asking a question about some open source tool, they may have already debugged through the code and found the real cause of their problem. But even if not, the perfectionist will write an introduction to the problem, steps to reproduce, potentially a suggested workaround, and as I said, potentially a suggested fix. In fact, the perfectionist doesn’t have questions. Only answers.
  • The chatterbox: This person will not really ask questions. Rather, they arrange their thoughts publicly and occasionally put a rhetorical question mark here and there. What may appear to be a question is really just a stream of thoughts and if you wait with the answer, they may either find the answer themselves, or ask the real question in email number 37. Or not. Or maybe, if we try it this way. You know what? It turned out that the requirement is completely wrong, I solved it with some other technique. Oh, actually, I changed libraries entirely. Hehe. No more questions.
  • The slacker: Here’s the code. What’s wrong? Halp plz.
  • The manager: For this type, time is money. The questions must be short and the answers ASAP. There’s something ironic about this approach, because by keeping the questions short (read: incomplete, not concise), most often, a lot of important details are missing, which only leads to requests for details. Then, the manager (naturally being disappointed because the reply is not an answer but a new question) will send yet again a short message and demand an answer with even more urgency. This can go back and forth for quite a while. In the end, it may take 1-2 weeks until what could’ve been the answer will actually be answered.
  • The complainer: This person doesn’t ask questions. They complain. Until the question goes away. Perhaps. If it doesn’t all the better. More reasons to complain.

By now, it should be clear that a well prepared question (concise, simple, short, yet with enough details) will yield much better answers. If you know what exactly you want to learn with your question, chances are, you’ll get exactly what you wanted.

2. Learn how to avoid asking questions

In fact, mostly, it’s better to try to avoid asking questions. Perhaps you can figure it out yourself? Not always, of course. Many things, you simply cannot know and by asking a domain expert, you’ll find the quickest and most efficient path to success. But often, trying something yourself has these benefits:

  • You learn it the “hard way” which is the way that sticks to our memory much better – we’ll remember what we’ve learned
  • It’s more rewarding to figure out stuff yourself
  • You don’t create “noise”. Remember the chatterbox? Unless the person you’re asking the question(s) is routinely answering questions (and thus postponing their answer), they may not see through your stream of thoughts and try to answer each individual incomplete “question”. That doesn’t help anyone.
  • By postponing a question (for a while, at least), you can gather more relevant information that you can provide to someone who might be able to answer your question. Think of the “perfectionist”. Spend more time looking for details first, then answer the question.
  • You’ll get better at asking questions by training to ask good questions. And that takes a bit more time.

3. Don’t leave broken windows

There was a very interesting article on reddit, recently about not leaving broken windows. The essence of the article is to never compromise on quality. To never become a slacker. To never leave … broken windows. Here’s a quote from the article

When we take certain shortcuts to deliver something in the shortest period of time, and our code reflects how careless we’ve been, developers coming after us (from the same team, a future team, and even ourselves!), will reach an important conclusion: it’s not important to pay enough attention to the code we produce. Little by little our application will start deteriorating in what it becomes an unstoppable process.

This isn’t about being a perfectionist, in fact. Sometimes, fixing broken windows can be postponed (and must be, of course). Oftentimes, by allowing windows to break and stay broken, we’re introducing a situation where no one is happy. Not us the programmers, not our customers, not our users, not our project managers. This is an attitude thing and thus at the core of being professional. How did Benjamin Franklin say?

The bitterness of poor quality remains long after the sweetness of low price is forgotten

That’s true for everything. In this case, “low price” is the quick win we might get by implementing something in a sloppy way.

4. Software ought to be deterministic. Aim for it!

In an ideal world, everything in software is “deterministic”. We’d all be functional programmers, writing side-effect free, pure functions. Like String.contains(). No matter how many times you execute the following:

assertTrue("abcde".contains("bc"));

… the result is always the same, expected result. The universe and all its statefulness will have absolutely no impact on this computation. It’s deterministic.

We can do that, too in our own programs, not just in standard libraries. We can try to write side-effect free, deterministic modules for as often as possible. This isn’t really a matter of what technology we choose. Deterministic programming can be done in any language – even if functional languages have more tools to prevent accidental side-effects through more sophisticated type systems. But the example I’ve shown is a Java example. Object orientation permits determinism. Heck, procedural languages like PL/SQL allow for determinism. It’s a requirement for a function to be deterministic, if it is to be used in an index:

CREATE INDEX upper_first_name ON customer (upper (first_name));
-- Deterministic function here: -----------^^^^^^^^^^^^^^^^^^

So again, this is a matter of discipline. You could see a side-effectful procedure / method / “function” as a “broken window”. Perhaps it was easier to maintain a side-effect, hoping that eventually, it can be removed. But that’s usually a lie. The price will be paid later, when non-determinism suddenly strikes. And it will.

5. Expect the unexpected

That previous link is key. Murphy’s Law is something we programmers should observe all the time. Everything can break. And it will. Come on, as software engineers, we know it will break. Because our world is not deterministic, neither are the business requirements that we’re implementing. We can implement tip #4 (determinism) only until it is no longer possible. From then on, we’ll inevitably enter the world of non-determinism (the “real world”), where stuff will go wrong. So, aim for it. Expect the unexpected. Train your inner pessimist to foresee all kinds of things.

How to write that pessimistic code in a concise manner is another story, of course. And how to distinguish things that will fail (and need to be dealt with) from things that might fail (and don’t need to be dealt with), that takes some practice 🙂

6. Never cargo cult. Never follow dogma. Always embrace: “It depends”

Everything you’ve been taught is potentially wrong. Including (or probably: especially) when someone really popular says it. Here’s a nice quote:

Our profession is full of hypocrisy. We like to think of ourselves as mathematicians, where only the purest of ideas persist, and they’re necessarily correct.

That’s wrong. Our profession is built on top of mathematics, but unless you go into the funky world of category theory or relational algebra (and even then, I doubt everything is “correct”), you’re in the pragmatic world of real-world business requirements. And that’s, quite frankly, very far from perfect. Let’s look at some of the most popular programming languages out there:

  • C
  • Java
  • SQL

Do you really think that these languages resemble mathematics in the least bit? If so, let’s discuss segmentation faults, Java generics, or SQL three-valued logic. These languages are platforms built by pragmatists. There is some really cool theoretical background in all of them, but ultimately, they had to do the job.

The same is true for everything built on top of languages. Libraries. Frameworks. Heck, even architectures. Design patterns. Nothing is right or wrong. Everything is a tool that was designed for some context. Think of the tool in its context. Never think of the tool as a standalone raison d’être. We’re not doing art for art’s sake.

So, say no to unquestioned:

  • XML
  • JSON
  • Functional programming
  • Object oriented programming
  • Design patterns
  • Microservices
  • Three tier architectures
  • DDD
  • TDD
  • In fact: *DD

All of these are nice tools given a certain context. They don’t always hold true. By staying curious and thinking out of the box, you’ll be a better programmer and know when to use which one of these tools.

7. Do it

True. There are luminaries out there who outperform everyone.

But most programmers are simply “good”. Or they have the potential to be “good”. How to be a good programmer? By doing it. Great software wasn’t written in a day, and popular people aren’t the only heroes of our time. I’ve met many great programmers no one knows about, because they lead private lives, solve private problems of small companies.

But great programmers all have one thing in common: They just do it. They practice. They work every day to get better and better.

Want to get better at SQL? Do it! Try to write a SQL statement with some new features in them, every day. Use window functions. Grouping sets. Recursion. Partitioned outer join. The MODEL and/or MATCH_RECOGNIZE clauses. It doesn’t have to ship to production every time, but the practice will be worth it.

8. Focus on one subject (in the long run)

There might be only very few “good” full stack developers out there. Most full stack developers will rather be mediocre at everything. Sure, a small team might need only a few of them and they can cover a lot of business logic to quickly bootstrap a piece of new software. But everything will be quite sloppy and “kinda works”. Perhaps that’s good enough for the minimum viable product phase, but in the long run, there will be more complex problems that a full stack developer will not have the time to properly analyse (or foresee!).

Focus on one subject mostly, and get really good at that. A specialist will always be in demand as long as that specialist’s niche exists, and many niches will outlive us all (hello COBOL or SQL). So, do your career a favour and do something really weel, rather than many things “just ok”.

9. Play

While you should mostly focus on one subject, you shouldn’t forget the other subjects completely. You may never be really really really good at all of SQL, scaling up, scaling out, low level performance, CSS (who’s good at that anyway!?), object orientation, requirements engineering, architecture, etc. all at once (see tip #8). It’s just not possible.

But you should at least understand the essence of each of these. You should understand when SQL is the right choice (and when it isn’t). When low level performance tuning is important (and when it isn’t). How CSS works in principle. The advantage of object orientation, FP, etc.

You should spend some time playing with these (and many more) concepts, technologies to better understand why they’re important. To know when to apply them, and then perhaps to find an expert to actually execute the work.

By playing around with new paradigms and technologies, you’ll open up your mind to entirely different ways of thinking, and chances are, you’ll be able to use that in your everyday work in one way or another.

10. Keep it simple, stupid

Albert Einstein said:

Everything should be made as simple as possible, but no simpler

No one is able to handle huge complexity. Not in software, not in any other aspect of life. Complexity is the killer of good software and thus simplicity is the enabler. Easy to understand. Hard to implement. Simplicity is something that takes a lot of time and practice to recognise and produce. There are many rules you can follow, of course.

One of the simplest rules is to have methods/functions with only a few parameters. Let’s look at that. Surely, the aforementioned String.contains() method qualifies. We can write "abcde".contains("bcd") and without reading any documentation, everyone will immediately understand what this does and why. The method does one thing and one thing only. There are no complicated context/settings/other arguments that can be passed to the method. There are no “special cases”, nor any caveats.

Again, producing simplicity in a library is much easier than doing that in business logic. Can we achieve it? Perhaps. By practicing. By refactoring. But like great software, simplicity is not built in a day.

(Protip: Conway’s Law applies. It is utterly impossible to write good, simple software in an environment where the business is super complex. Either, you like complexity and ugly legacy, or you better get out of that business).