The 5 Golden Rules of Giving Awesome Customer Support


This is usually a tech blog, but every now and then, we make an exception when there’s something important to say.

Today, I’m going to criticise a lot of our industry’s understanding of support.

Who is this article for?

It’s for every software engineer giving support to users and/or customers, and every manager who works with software engineers. In fact, it isn’t restricted to software, but since this is a Java blog, let’s pretend it is only about software.

#1: All users are customers

While we’re at it, let’s use the terms “users” and “customers” as synonyms, because every user should be treated like a paying customer. In fact, every user should be treated like the only paying customer you have, on which your whole business depends. This is why I will put “customer” in quotes in this article.

This is something you should constantly remind yourself of. It doesn’t matter if your “customer” paid you millions for your software, or if they just use your quick and dirty Open Source library that you implemented in your free time.

Why is that? Because it’s your fault and problem as a vendor if you chose some business model (or if you don’t even have a business model) that generates (excessive?) support work. Change the model. Make more non-paying “customers” pay for support. Help them understand the value they’re getting from being your paying customers. Or keep doing stuff for free, it doesn’t matter as long as you will remember that it’s your problem, not theirs.

Always take Apple as a shining example. Apple customers love to pay excessive amounts for average hardware and software with nice-looking design. This has nothing to do with support; it just shows that Apple has gotten one thing right: making customers pay. Because Apple users want to be paying customers.

#2: Your customer took the time

Remember this:

Your “customer” took the time to report an issue.

Period. That’s all there is to say. When a “customer” gets in touch with you, they took the time to get in touch. So you should be thankful they took the time. Not all “customers” take the time. Most of them just rant and leave you silently if something goes wrong. Or worse, they rant in public and then leave you.

If a “customer” does take the time, this means they stand for thousands of others who don’t take the time but like your product and want to continue liking it. Don’t spoil that. Keep those customers!

#3: Your customer is abrasive and demanding

The food chain goes this way round. Accept that. Moreover, remember: You’re not some divine entity that grants an audience to an unworthy soul who should kneel before thy wisdom. You’re the supplier. They’re the customer. They’re the king. You’re the servant.

Of course they are abrasive and demanding. They are the “customer”. They had a bad day. They have an idiot boss. Their job sucks. They work on legacy code. They blame you (and everyone else). It’s not their fault. It never was.

So what? That happens from time to time. More often than not, however, “customers” are nice people, and again: They took the time to get in touch.

So, your job is to thank them for that. To make sure you do everything needed to help them. To solve their troubles and issues, which often aren’t what they say they are. They just want you to listen. A workaround might just be enough. They’ll understand that you can’t fix things immediately. But do help them.

#4: Your customer doesn’t understand your product

You’ve spent all that time with your product. You know exactly how it works. You have accepted all the caveats. The bugs. The workarounds.

Your customer doesn’t. They have more important things to do than reverse engineer your funky product. If they don’t understand something, always assume it’s your fault. You could improve the product. Or the documentation. Or your SEO, to make sure they will find the documentation. Or your support channels (see below). Whatever. Always assume that your “customer” doesn’t have the time to really understand your product. Why should they?

Learn from your “customers”. Their misunderstanding of your ways is your chance to do it better. Every single time.

#5: Your customer doesn’t care about your workflows

And here’s the one thing you must never, NEVER do:

NEVER tell your “customer” that they’ve used the wrong support channel

NEVER do the above. They’ll learn by themselves after a while. Or they won’t. Does it matter? It doesn’t. They’ve taken the time to reach out, and that ought to be enough.

Perhaps, you’ve chosen a confusing setup of support channels. Or complicated products (looking at you, Bugzilla). Improve it, then. Make it easier for your “customers” to choose the right support channel if they have gotten it “wrong”.

What does “wrong support channel” even mean? Wrong support channels create administrative work for you. You have to copy a bug report from Google Groups to JIRA. From your e-mail program to GitHub. Etc.

So, a “customer” has reported a bug on the discussion group. You prefer they create an issue in an issue tracker. How do you react?

  • Bad: This is a bug not a discussion, we have a bug tracker for this. Can you please create an issue??!?
  • Good: Thank you very much for bringing this to our attention. I have registered an issue #717863 for this problem. We’re looking into this as quickly as possible and will let you know about our progress. If there’s anything … blah blah.

As simple as that.

Conclusion

There are many more things that help you get support right. And it isn’t always easy. Sometimes, it’s just too hard to get it right. We all have tons of things to do. And priorities.

But remember one thing. This is (almost) never the “customer’s” fault. And even if it is, you should never make them feel as if it were. There are enough enterprise-y companies out there whose processes are too complex and heavy to ever give awesome support.

Don’t be like those companies. Be awesome for your “customers”.

Using Oracle AQ via Java 8 Streams


One of the most awesome features of the Oracle database is Oracle AQ: Oracle Database Advanced Queuing. The AQ API implements a full-fledged, transactional messaging system directly in the database.

In a classic architecture where the database is at the center of your system, with multiple applications (some of which are written in Java, others in Perl or PL/SQL, etc.) accessing the same database, using AQ for inter-process communication is just great. If you’re more on the Java EE side, you might purchase a Java-based MQ solution, and put that message bus / middleware at the center of your system architecture. But why not use the database instead?

How to use the PL/SQL AQ API with jOOQ

The PL/SQL API for AQ message enqueuing and dequeuing is rather simple, and it can be accessed very easily from Java using jOOQ’s OracleDSL.DBMS_AQ API.

The queue configuration used here would look something like:

CREATE OR REPLACE TYPE message_t AS OBJECT (
  ID         NUMBER(7),
  title      VARCHAR2(100 CHAR)
)
/

BEGIN
  DBMS_AQADM.CREATE_QUEUE_TABLE(
    queue_table => 'message_aq_t',
    queue_payload_type => 'message_t'
  );

  DBMS_AQADM.CREATE_QUEUE(
    queue_name => 'message_q',
    queue_table => 'message_aq_t'
  );

  DBMS_AQADM.START_QUEUE(
    queue_name => 'message_q'
  );
  COMMIT;
END;
/

And the jOOQ code generator would generate the useful classes with all the type information directly associated with them (simplified example):

class Queues {
    static final Queue<MessageTRecord> MESSAGE_Q = 
        new QueueImpl<>("MESSAGE_Q", MESSAGE_T);
}

class MessageTRecord {
    void setId(Integer id) { ... }
    Integer getId() { ... }
    void setTitle(String title) { ... }
    String getTitle() { ... }
    MessageTRecord(
        Integer id, String title
    ) { ... }
}

These classes can then be used to enqueue and dequeue messages type-safely, directly on the generated queue references:

// The jOOQ configuration
Configuration c = ...

// Enqueue a message
DBMS_AQ.enqueue(c, MESSAGE_Q, 
    new MessageTRecord(1, "test"));

// Dequeue it again
MessageTRecord message = DBMS_AQ.dequeue(c, MESSAGE_Q);

Easy, isn’t it?

Now, let’s leverage Java 8 features

A message queue is nothing other than an infinite (blocking) stream of messages. Since Java 8, we have a formidable API for such message streams, the Stream API.
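
To illustrate the analogy with plain JDK types (no jOOQ involved; the queue and its element type here are made up for illustration), an infinite, blocking Stream can be modelled over a java.util.concurrent.BlockingQueue like this:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Stream;

// A queue fed by some other thread (made up for illustration)
BlockingQueue<String> queue = new LinkedBlockingQueue<>();

// An infinite Stream that blocks on queue.take() until the
// next message arrives
Stream<String> messages = Stream.generate(() -> {
    try {
        return queue.take();
    }
    catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
    }
});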

This is why we have added (for the upcoming jOOQ 3.8) a new API that combines the existing jOOQ AQ API with Java 8 Streams:

// The jOOQ configuration
Configuration c = ...

DBMS_AQ.dequeueStream(c, MESSAGE_Q)
       .filter(m -> "test".equals(m.getTitle()))
       .forEach(System.out::println);

The above stream pipeline will listen on the MESSAGE_Q queue, consume all messages, discard those whose title is not "test", and print the remaining ones.

Blocking streams

The interesting thing is the fact that this is a blocking, infinite stream. As long as there is no new message in the queue, the stream pipeline processing will simply block on the queue, waiting for new messages. This is not an issue for sequential streams, but when calling Stream.parallel(), what happens then?

jOOQ will consume each message in a transaction. A jOOQ 3.8 transaction runs in a ForkJoinPool.ManagedBlocker:

static <T> Supplier<T> blocking(Supplier<T> supplier) {
    return new Supplier<T>() {
        volatile T result;

        @Override
        public T get() {
            try {
                ForkJoinPool.managedBlock(new ManagedBlocker() {
                    @Override
                    public boolean block() {
                        result = supplier.get();
                        return true;
                    }

                    @Override
                    public boolean isReleasable() {
                        return result != null;
                    }
                });
            }
            catch (InterruptedException e) {
                throw new RuntimeException(e);
            }

            return result;
        }
    };
}

This isn’t a lot of magic. A ManagedBlocker runs some special code when it is run by a ForkJoinWorkerThread, making sure the thread’s ForkJoinPool won’t suffer from thread exhaustion and thus from deadlocks. For more info, read this interesting article here:
http://zeroturnaround.com/rebellabs/java-parallel-streams-are-bad-for-your-health

Or this Stack Overflow answer:
http://stackoverflow.com/a/35272153/521799

So, if you want a super-fast parallel AQ dequeuing process, just run:

// The jOOQ configuration. Make sure its referenced
// ConnectionPool has enough connections
Configuration c = ...

DBMS_AQ.dequeueStream(c, MESSAGE_Q)
       .parallel()
       .filter(m -> "test".equals(m.getTitle()))
       .forEach(System.out::println);

And you’ll have several threads that will dequeue messages in parallel.

Don’t want to wait for jOOQ 3.8?

No problem. Use the current version and wrap the dequeue operation in your own Stream:

Stream<MessageTRecord> stream = Stream.generate(() ->
    DSL.using(config).transactionResult(c ->
        dequeue(c, MESSAGE_Q)
    )
);

Done.
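
One caveat: like the built-in dequeueStream(), this hand-rolled stream is infinite, so remember to bound it before applying a terminal operation that would otherwise never complete:

// Consume (at most) the next 10 messages, then stop
stream.limit(10)
      .forEach(System.out::println);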

Bonus: Asynchronous dequeuing

While we were at it: another very nice feature of queuing systems is their asynchronicity. With Java 8, a very useful type to model (and compose) asynchronous algorithms is the CompletionStage, and its default implementation, the CompletableFuture, which executes tasks in the ForkJoinPool again.

Using jOOQ 3.8, you can again simply call:

// The jOOQ configuration. Make sure its referenced
// ConnectionPool has enough connections
Configuration c = ...

CompletionStage<MessageTRecord> stage =
DBMS_AQ.dequeueAsync(c, MESSAGE_Q)
       .thenCompose(m -> ...)
       ...;
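
To make this more concrete, here is a hedged sketch of one such composition (the downstream steps are made up for illustration; dequeueAsync() is assumed to return a CompletionStage<MessageTRecord>, as shown above):

// Dequeue the next message asynchronously, extract its title,
// and print it once the message has arrived
DBMS_AQ.dequeueAsync(c, MESSAGE_Q)
       .thenApply(MessageTRecord::getTitle)
       .thenAccept(System.out::println);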

Stay tuned for another article on the jOOQ blog soon, where we look into more sophisticated use-cases for asynchronous, blocking SQL statements with jOOQ 3.8 and Java 8.

How to Detect Slow Queries with jOOQ


Just now, we implemented a nice little feature in jOOQ’s code generator:
https://github.com/jOOQ/jOOQ/issues/4974

It detects whenever the jOOQ code generator runs a slow query to reverse engineer schema meta information. Why?

In our development and integration test environment, we don’t have huge schemas with all the different performance edge cases put in place. For instance, we don’t have 5000 Oracle synonyms. Or 10000 procedures with 500 parameters each. We do cover some common edge cases, but not in all databases.

Users, on the other hand, tend to accept the status quo after a while. The code generator is slow-ish? Sure, because we have a huge schema. This lazy acceptance is an impediment to our product quality. We’d rather our users reported all sorts of issues they encounter, so we want to encourage them to do so.

And we did

In the upcoming jOOQ version 3.8 (and patch releases for 3.5.5, 3.6.5, and 3.7.3), we’ve added a nice little ExecuteListener to jOOQ-meta, which looks roughly like this:

class PerformanceListener 
    extends DefaultExecuteListener {

    StopWatch watch;
    class SQLPerformanceWarning 
        extends Exception {}

    @Override
    public void executeStart(ExecuteContext ctx) {
        super.executeStart(ctx);
        watch = new StopWatch();
    }

    @Override
    public void executeEnd(ExecuteContext ctx) {
        super.executeEnd(ctx);
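        // 5_000_000_000 nanoseconds = 5 seconds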
        if (watch.split() > 5_000_000_000L)
            log.warn(
                "Slow SQL",
                "jOOQ Meta executed a slow query"
              + "\n\n"
              + "Please report this bug here: "
              + "https://github.com/jOOQ/jOOQ/issues/new\n\n"
              + formatted(ctx.query()),
                new SQLPerformanceWarning());
    }
}

It’s very simple. Every time we start executing a query, a “stop watch” is started. Every time execution ends, we check whether more than 5 seconds have elapsed. If so, we log a warning, a link to our issue tracker, a formatted version of the SQL query, and a stack trace to help find the exact spot where the slow statement was executed.

Let’s run this

The reason why we did this is that we’ve seen ourselves that the PostgreSQL code generator runs a slow query to fetch all the stored procedures (and to generate overload indexes). The produced error message is:

[WARNING] Slow SQL                 : jOOQ Meta executed a slow query (slower than 5 seconds)

Please report this bug here: https://github.com/jOOQ/jOOQ/issues/new

select
  "r1"."routine_schema",
  "r1"."routine_name",
  "r1"."specific_name",
  case when exists (
        select 1 as "one"
        from "information_schema"."parameters"
        where (
          "information_schema"."parameters"."specific_schema" = "r1"."specific_schema"
          and "information_schema"."parameters"."specific_name" = "r1"."specific_name"
          and upper("information_schema"."parameters"."parameter_mode") <> 'IN'
        )
      ) then 'void'
       else "r1"."data_type"
  end as "data_type",
  "r1"."character_maximum_length",
  "r1"."numeric_precision",
  "r1"."numeric_scale",
  "r1"."type_udt_schema",
  "r1"."type_udt_name",
  case when exists (
        select 1 as "one"
        from "information_schema"."routines" as "r2"
        where (
          "r2"."routine_schema" in (
            'public', 'multi_schema', 'pg_catalog'
          )
          and "r2"."routine_schema" = "r1"."routine_schema"
          and "r2"."routine_name" = "r1"."routine_name"
          and "r2"."specific_name"  "r1"."specific_name"
        )
      ) then (
        select count(*)
        from "information_schema"."routines" as "r2"
        where (
          "r2"."routine_schema" in (
            'public', 'multi_schema', 'pg_catalog'
          )
          and "r2"."routine_schema" = "r1"."routine_schema"
          and "r2"."routine_name" = "r1"."routine_name"
          and "r2"."specific_name" <= "r1"."specific_name"
        )
      ) end as "overload",
  "pg_catalog"."pg_proc"."proisagg"
from "information_schema"."routines" as "r1"
  join "pg_catalog"."pg_namespace"
  on "pg_catalog"."pg_namespace"."nspname" = "r1"."specific_schema"
  join "pg_catalog"."pg_proc"
  on (
    "pg_catalog"."pg_proc"."pronamespace" = "pg_catalog"."pg_namespace".oid
    and (("pg_catalog"."pg_proc"."proname" || '_') || cast("pg_catalog"."pg_proc".oid as varchar)) = "r1"."specific_name"
  )
where (
  "r1"."routine_schema" in (
    'public', 'multi_schema', 'pg_catalog'
  )
  and not("pg_catalog"."pg_proc"."proretset")
)
order by
  "r1"."routine_schema" asc,
  "r1"."routine_name" asc,
  "overload" asc
org.jooq.util.AbstractDatabase$1$SQLPerformanceWarning
        at org.jooq.util.AbstractDatabase$1.executeEnd(AbstractDatabase.java:230)
        at org.jooq.impl.ExecuteListeners.executeEnd(ExecuteListeners.java:163)
        at org.jooq.impl.AbstractResultQuery.execute(AbstractResultQuery.java:269)
        at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:346)
        at org.jooq.impl.AbstractResultQuery.fetch(AbstractResultQuery.java:308)
        at org.jooq.impl.SelectImpl.fetch(SelectImpl.java:2703)
        at org.jooq.util.postgres.PostgresDatabase.getRoutines0(PostgresDatabase.java:707)
        at org.jooq.util.AbstractDatabase.getRoutines(AbstractDatabase.java:1131)
        at org.jooq.util.JavaGenerator.generate(JavaGenerator.java:417)
        at org.jooq.util.JavaGenerator.generate(JavaGenerator.java:314)
        at org.jooq.util.JavaGenerator.generate(JavaGenerator.java:279)
        at org.jooq.util.GenerationTool.run(GenerationTool.java:490)
        at org.jooq.util.GenerationTool.generate(GenerationTool.java:193)
        at org.jooq.util.maven.Plugin.execute(Plugin.java:131)
        ...

We can now proceed with fixing the query, easily.

You can do the same!

The implementation of the ExecuteListener was straightforward. You can do the same, very easily: just hook a simple execute listener into your jOOQ Configuration, measure execution speeds, and log a warning beyond a threshold. Done.
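
For illustration, here is a minimal sketch of such a hook-up, assuming the PerformanceListener shown earlier, jOOQ 3.x, and a made-up PostgreSQL connection:

import java.sql.Connection;
import java.sql.DriverManager;

import org.jooq.Configuration;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;
import org.jooq.impl.DefaultConfiguration;
import org.jooq.impl.DefaultExecuteListenerProvider;

try (Connection connection = DriverManager.getConnection(
        "jdbc:postgresql://localhost/test", "user", "password")) {

    // Register the listener with the jOOQ Configuration
    Configuration configuration = new DefaultConfiguration()
        .set(connection)
        .set(SQLDialect.POSTGRES)
        .set(new DefaultExecuteListenerProvider(
            new PerformanceListener()));

    // Every statement executed through this Configuration is
    // now measured; this one trips the 5-second threshold
    DSL.using(configuration)
       .fetch("select pg_sleep(6)");
}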

Happy debugging!

Further reading

Coincidentally, a very similar approach has been documented by the engineering team at Square – The Query Sniper:

https://corner.squareup.com/2016/01/query-sniper.html

How to Pattern-Match Files and Display Adjacent Lines in Java


Recently, we’ve published our article about the awesome window function support in jOOλ 0.9.9, which I believe is one of the best additions to the library that we’ve ever made.

Today, we’ll look into an awesome application of window functions in a use-case that is inspired by this Stack Overflow question by Sean Nguyen:

How to get lines before and after matching from java 8 stream like grep?

I have a text file that has a lot of lines in it. If I want to find the lines before and after a match in grep, I do it like this:

grep -A 10 -B 10 "ABC" myfile.txt

How can I implement the equivalent in java 8 using streams?

So the question is:

How can I implement the equivalent in Java 8 using streams?

Well, the unix shell and its various “pipable” commands are about the only things that are even more awesome (and mysterious) than window functions. Being able to grep for a certain string in a file, and then display a “window” of a couple of lines, is quite useful.

With jOOλ 0.9.9, however, we can do that very easily in Java 8 as well. Consider this little snippet:

Seq.seq(Files.readAllLines(Paths.get(
        new File("/path/to/Example.java").toURI())))
   .window()
   .filter(w -> w.value().contains("ABC"))
   .forEach(w -> {
       System.out.println();
       System.out.println("-1:" + w.lag().orElse(""));
       System.out.println(" 0:" + w.value());
       System.out.println("+1:" + w.lead().orElse(""));
       // ABC: Just checking
   });

This program will output:

-1: .window()
 0: .filter(w -> w.value().contains("ABC"))
+1: .forEach(w -> {

-1:     System.out.println("+1:" + w.lead().orElse(""));
 0:     // ABC: Just checking
+1: });

So, I’ve run the program on itself and I’ve found all the lines that match “ABC”, plus the previous lines (“lagging” / lag()) and the following lines (leading / lead()). These lead() and lag() functions work just like their SQL equivalents.
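
For a quick feel of the API, here is a tiny, self-contained sketch (using the jOOλ 0.9.9 API as described above):

// Each window sees its neighbours via lag() and lead()
Seq.of("a", "b", "c")
   .window()
   .forEach(w -> System.out.println(
       w.lag().orElse("-") + " < " + w.value() + " > " + w.lead().orElse("-")));

// Prints:
// - < a > b
// a < b > c
// b < c > -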

But unlike SQL, composing functions in Java (or other general-purpose languages) is a bit simpler, as there is less syntax clutter involved. We can easily do aggregations over a window frame to collect an arbitrary number of lines “lagging” and “leading” a match. Consider the following alternative:

int lower = -5;
int upper =  5;
        
Seq.seq(Files.readAllLines(Paths.get(
        new File("/path/to/Example.java").toURI())))
   .window(lower, upper)
   .filter(w -> w.value().contains("ABC"))
   .map(w -> w.window()
              .zipWithIndex()
              .map(t -> tuple(t.v1, t.v2 + lower))
              .map(t -> (t.v2 > 0 
                       ? "+" 
                       : t.v2 == 0 
                       ? " " : "") 
                       + t.v2 + ":" + t.v1)
              .toString("\n"))
   .forEach(System.out::println);

And the output that we’re getting is this:

-5:int upper =  5;
-4:        
-3:Seq.seq(Files.readAllLines(Paths.get(
-2:        new File("/path/to/Example.java").toURI())))
-1:   .window(lower, upper)
 0:   .filter(w -> w.value().contains("ABC"))
+1:   .map(w -> w.window()
+2:              .zipWithIndex()
+3:              .map(t -> tuple(t.v1, t.v2 + lower))
+4:              .map(t -> (t.v2 > 0 
+5:                       ? "+" 

Could it get any more concise? I don’t think so. Most of the logic above was just generating the index next to the line.

Conclusion

Window functions are extremely powerful. The recent discussion on reddit about our previous article on jOOλ’s window function support has shown that other languages also support primitives to build similar functionality. But usually, these building blocks aren’t as concise as the ones exposed in jOOλ, which are inspired by SQL.

With jOOλ mimicking SQL’s window functions, there is little cognitive friction when composing powerful operations on in-memory data streams.


Use JUnit’s expected exceptions sparingly


Sometimes, when we get pull requests for jOOQ or our other libraries, people change the code in our unit tests to be more “idiomatic JUnit”. In particular, this means that they tend to change this (admittedly not so pretty code):

@Test
public void testValueOfIntInvalid() {
    try {
        ubyte((UByte.MIN_VALUE) - 1);
        fail();
    }
    catch (NumberFormatException e) {}
    try {
        ubyte((UByte.MAX_VALUE) + 1);
        fail();
    }
    catch (NumberFormatException e) {}
}

… into this “better” and “cleaner” version:

@Test(expected = NumberFormatException.class)
public void testValueOfShortInvalidCase1() {
    ubyte((short) ((UByte.MIN_VALUE) - 1));
}

@Test(expected = NumberFormatException.class)
public void testValueOfShortInvalidCase2() {
    ubyte((short) ((UByte.MAX_VALUE) + 1));
}

What have we gained?

Nothing!

Sure, we already have to use the @Test annotation, so we might as well use its expected attribute, right? I claim that this is completely wrong. For two reasons. And when I say “two”, I mean “four”:

1. We’re not really gaining anything in terms of number of lines of code

Compare the semantically interesting bits:

// This:
try {
    ubyte((UByte.MIN_VALUE) - 1);
    fail("Reason for failing");
}
catch (NumberFormatException e) {}

// Vs this:
@Test(expected = NumberFormatException.class)
public void reasonForFailing() {
    ubyte((short) ((UByte.MAX_VALUE) + 1));
}

Give or take whitespace formatting, there is exactly the same number of essential semantic pieces of information:

  1. The method call on ubyte(), which is under test. This doesn’t change
  2. The message we want to pass to the failure report (in a string vs. in a method name)
  3. The exception type and the fact that it is expected

So, even from a stylistic point of view, this isn’t really a meaningful change.

2. We’ll have to refactor it back anyway

In the annotation-driven approach, all I can do is test for the exception type. I cannot make any assumptions about the exception message, for instance, in case I want to add further assertions later on. Consider this:

// This:
try {
    ubyte((UByte.MIN_VALUE) - 1);
    fail("Reason for failing");
}
catch (NumberFormatException e) {
    assertEquals("some message", e.getMessage());
    assertNull(e.getCause());
    ...
}

3. The single method call is not the unit

The unit test was called testValueOfIntInvalid(). So, the semantic “unit” being tested is that of the UByte type’s valueOf() behaviour in the event of invalid input in general. Not for a single value, such as UByte.MIN_VALUE - 1.

It shouldn’t be split into smaller units, just because that’s the only way we can shoehorn the test into the @Test annotation’s limited scope.

Hear this, TDD folks. I NEVER want to shoehorn my API design or my logic into some weird restrictions imposed by your “backwards” test framework (nothing personal, JUnit). NEVER! “My” API is 100x more important than “your” tests. This includes me not wanting to:

  • Make everything public
  • Make everything non-final
  • Make everything injectable
  • Make everything non-static
  • Use annotations. I hate annotations.

Nope. You’re wrong. Java is already a not-so-sophisticated language, but let me at least use the few features it offers in any way I want.

Don’t impose your design or semantic disfigurement on my code because of testing.

OK. I’m overreacting. I always am, in the presence of annotations. Because…

4. Annotations are always a bad choice for control flow structuring

Time and again, I’m surprised by the amount of abuse of annotations in the Java ecosystem. Annotations are good for three things:

  1. Processable documentation (e.g. @Deprecated)
  2. Custom “modifiers” on methods, members, types, etc. (e.g. @Override)
  3. Aspect oriented programming (e.g. @Transactional)

And beware that @Transactional is one of the very few generally useful aspects that ever made it to the mainstream (logging hooks being another, or dependency injection if you absolutely must). In most cases, AOP is a niche technique for solving problems, and you generally don’t want that in ordinary programs.

It is decidedly NOT a good idea to model control flow structures, let alone test behaviour, with annotations

Yes. Java has come a long (slow) way to embrace more sophisticated programming idioms. But if you really get upset with the verbosity of the occasional try { .. } catch { .. } statement in your unit tests, then there’s a solution for you. It’s Java 8.

How to do it better with Java 8

JUnit lambda is in the works:
http://junit.org/junit-lambda.html

And they have added a new functional API to the new Assertions class:
https://github.com/junit-team/junit-lambda/blob/master/junit5-api/src/main/java/org/junit/gen5/api/Assertions.java

Everything is based around the Executable functional interface:

@FunctionalInterface
public interface Executable {
    void execute() throws Exception;
}

This executable can now be used to implement code that is asserted to throw (or not to throw) an exception. See the following methods in Assertions:

public static void assertThrows(Class<? extends Throwable> expected, Executable executable) {
    expectThrows(expected, executable);
}

public static <T extends Throwable> T expectThrows(Class<T> expectedType, Executable executable) {
    try {
        executable.execute();
    }
    catch (Throwable actualException) {
        if (expectedType.isInstance(actualException)) {
            return (T) actualException;
        }
        else {
            String message = Assertions.format(expectedType.getName(), actualException.getClass().getName(),
                "unexpected exception type thrown;");
            throw new AssertionFailedError(message, actualException);
        }
    }
    throw new AssertionFailedError(
        String.format("Expected %s to be thrown, but nothing was thrown.", expectedType.getName()));
}

That’s it! Now, those of you who object to the verbosity of try { .. } catch { .. } blocks can rewrite this:

try {
    ubyte((UByte.MIN_VALUE) - 1);
    fail("Reason for failing");
}
catch (NumberFormatException e) {}

… into this:

expectThrows(NumberFormatException.class, () -> 
    ubyte((UByte.MIN_VALUE) - 1));

And if I want to do further checks on my exception, I can do so:

Exception e = expectThrows(NumberFormatException.class, () -> 
    ubyte((UByte.MIN_VALUE) - 1));
assertEquals("abc", e.getMessage());
...

Great work, JUnit lambda team!

Functional programming beats annotations every time

Annotations were abused for a lot of logic, mostly in the Java EE and Spring environments, which were all too eager to move XML configuration back into Java code. This has gone down the wrong path, and the example provided here clearly shows that there is almost always a better way to write out control flow logic explicitly, using either object orientation or functional programming, than by using annotations.

In the case of @Test(expected = ...), I conclude:

Rest in peace, expected

(it is no longer part of the JUnit 5 @Test annotation, anyway)

Impress Your Coworkers by Using SQL UNPIVOT!


I’ve recently encountered a very interesting question on Stack Overflow by an unnamed user. The question was about generating a table of the following form in Oracle, using a table-valued function:

Description   COUNT
-------------------
TEST1         10 
TEST2         15
TEST3         25
TEST4         50

The logic that should be implemented for the COUNT column is the following:

  • TEST1: count of employees whose sal < 10000
  • TEST2: count of employees whose dept > 10
  • TEST3: count of employees whose hiredate > (SYSDATE-60)
  • TEST4: count of employees whose grade = 1

Challenge accepted!

For this exercise, let’s assume the following table:

CREATE TABLE employees (
  id NUMBER(18)     NOT NULL PRIMARY KEY,
  sal NUMBER(18, 2) NOT NULL,
  dept NUMBER(18)   NOT NULL,
  hiredate DATE     NOT NULL,
  grade NUMBER(18)  NOT NULL
);

INSERT INTO employees 
      VALUES (1, 10000,  1, SYSDATE     , 1);
INSERT INTO employees 
      VALUES (2,  9000,  5, SYSDATE - 10, 1);
INSERT INTO employees 
      VALUES (3, 11000, 13, SYSDATE - 30, 2);
INSERT INTO employees 
      VALUES (4, 10000, 12, SYSDATE - 80, 2);
INSERT INTO employees 
      VALUES (5,  8000,  7, SYSDATE - 90, 1);

How to calculate the COUNT values

In a first step, we’re going to look into how best to calculate the COUNT values. The simplest way is to calculate the values in individual columns, not rows. SQL newbies will probably resort to a canonical solution using nested SELECTs, which is very bad for performance reasons:

SELECT
  (SELECT COUNT(*) FROM employees 
      WHERE sal < 10000) AS test1,
  (SELECT COUNT(*) FROM employees 
      WHERE dept > 10) AS test2,
  (SELECT COUNT(*) FROM employees 
      WHERE hiredate > (SYSDATE - 60)) AS test3,
  (SELECT COUNT(*) FROM employees 
      WHERE grade = 1) AS test4
FROM dual;

Why is the query not optimal? There are four table accesses to find all the data:

[Execution plan: unpivot-nested-select]

If you add an index to each individual column being filtered, chances are the individual subqueries will at least be optimised. But for these kinds of reports, the occasional full table scan is perfectly fine, especially if you aggregate a lot of data.

Even if not optimal in speed, the above yields the correct result:

TEST1   TEST2   TEST3   TEST4
-----------------------------
    2       2       3       3

How to improve the query, then?

Few people are aware of the fact that aggregate functions only aggregate non-NULL values. This has no effect when you write COUNT(*), but when you pass an expression to the COUNT(expr) function, this becomes much more interesting!

The idea here is that you use a CASE expression that transforms each predicate’s TRUE evaluation into a non-NULL value, and the FALSE (or NULL) evaluation into NULL. The following query illustrates this approach:

SELECT
  COUNT(CASE WHEN sal < 10000 THEN 1 END)
      AS test1,
  COUNT(CASE WHEN dept > 10 THEN 1 END)
      AS test2,
  COUNT(CASE WHEN hiredate > (SYSDATE-60) THEN 1 END)
      AS test3,
  COUNT(CASE WHEN grade = 1 THEN 1 END)
      AS test4
FROM employees;

… and yields again the correct result:

TEST1   TEST2   TEST3   TEST4
-----------------------------
    2       2       3       3

Using FILTER() instead of CASE

The SQL standard and the awesome PostgreSQL database offer an even more convenient syntax for the above functionality: the little-known FILTER() clause on aggregate functions.

In PostgreSQL, you’d write instead:

SELECT
  COUNT(*) FILTER (WHERE sal < 10000)
      AS test1,
  COUNT(*) FILTER (WHERE dept > 10)
      AS test2,
  COUNT(*) FILTER (WHERE hiredate > (current_date - 60))
      AS test3,
  COUNT(*) FILTER (WHERE grade = 1)
      AS test4
FROM employees;

This is useful when you want to cleanly separate the FILTER() criteria from any other expression that you want to use for aggregating, e.g. when calculating a SUM().
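
As an aside for those following along in Java: jOOQ also exposes this clause via filterWhere() on aggregate functions. A minimal sketch, assuming jOOQ-generated EMPLOYEES classes, a DSLContext named ctx, and that GRADE maps to a Long column:

import static org.jooq.impl.DSL.sum;

// Total payroll next to the payroll of grade-1 employees only,
// computed in a single pass over the table
ctx.select(
       sum(EMPLOYEES.SAL),
       sum(EMPLOYEES.SAL).filterWhere(EMPLOYEES.GRADE.eq(1L)))
   .from(EMPLOYEES)
   .fetch();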

In any case, the query now has to hit the table only once. The aggregation can then be performed entirely in memory.

[Execution plan: unpivot-case-expression]

This is always better than the previous approach, unless you have an index for every aggregation!

OK. Now how to get the results in rows?

The question on Stack Overflow wanted a result with TESTn values being put in individual rows, not columns.

Description   COUNT
-------------------
TEST1         2
TEST2         2
TEST3         3
TEST4         3

Again, there’s a canonical, not-so-performant approach to do this with UNION ALL:

SELECT 
  'TEST1' AS Description, 
  COUNT(*) AS COUNT 
FROM employees WHERE sal < 10000
UNION ALL
SELECT 
  'TEST2', 
  COUNT(*)
FROM employees WHERE dept > 10
UNION ALL
SELECT 
  'TEST3', 
  COUNT(*) 
FROM employees WHERE hiredate > (SYSDATE - 60)
UNION ALL
SELECT 
  'TEST4', 
  COUNT(*) 
FROM employees WHERE grade = 1

This approach is more or less equivalent to the nested selects approach, except for the column / row transposition (“unpivoting”). And the plan is also very similar:

[Execution plan: unpivot-union-all]

Transposition = (un)pivoting

Notice how I used the term “transpose”. That’s what we did, and it has a name: (un)pivoting. Not only does it have a name, but this feature is also supported out of the box in Oracle and SQL Server via the PIVOT and UNPIVOT keywords that can be placed after table references.

  • PIVOT transposes rows into columns
  • UNPIVOT transposes columns back into rows

So, we’ll take the original, optimal solution, and transpose that with UNPIVOT:

SELECT *
FROM (
  SELECT
    COUNT(CASE WHEN sal < 10000 THEN 1 END)
      AS test1,
    COUNT(CASE WHEN dept > 10 THEN 1 END)
      AS test2,
    COUNT(CASE WHEN hiredate > (SYSDATE-60) THEN 1 END)
      AS test3,
    COUNT(CASE WHEN grade = 1 THEN 1 END)
      AS test4
  FROM employees
) t
UNPIVOT (
  count FOR description IN (
    "TEST1", "TEST2", "TEST3", "TEST4"
  )
)

All we need to do is wrap the original query in a derived table t (i.e. an inline SELECT in the FROM clause), and then “UNPIVOT” that table t, generating the count and description columns. The result is, again:

Description   COUNT
-------------------
TEST1         2
TEST2         2
TEST3         3
TEST4         3

The execution plan is still optimal. All the action is happening in memory.

[Execution plan: unpivot-unpivot]

Conclusion

PIVOT and UNPIVOT are very useful tools for reporting and reorganising data. There are many use-cases like the above, where you want to re-organise some aggregations. Other use-cases include settings or properties tables that implement an entity-attribute-value model, where you want to transform attributes from rows to columns (PIVOT), or from columns to rows (UNPIVOT).


Reactive Database Access – Part 3 – Using jOOQ with Scala, Futures and Actors


We’re very happy to continue our guest post series on the jOOQ blog by Manuel Bernhardt. In this blog series, Manuel explains the motivation behind so-called reactive technologies and, after introducing the concepts of Futures and Actors, uses them in order to access a relational database in combination with jOOQ.

Manuel Bernhardt is an independent software consultant with a passion for building web-based systems, both back-end and front-end. He is the author of “Reactive Web Applications” (Manning) and he started working with Scala, Akka and the Play Framework in 2010 after spending a long time with Java. He lives in Vienna, where he is co-organiser of the local Scala User Group. He is enthusiastic about the Scala-based technologies and the vibrant community and is looking for ways to spread its usage in the industry. He has also been scuba-diving since age 6, and can’t quite get used to the lack of sea in Austria.

This series is split into three parts, which we have published over the past months.

Introduction

In the previous two posts of this series, we introduced the benefits of reactive programming as well as two tools for putting it into practice: Futures and Actors. Now it is time to get your hands dirty, dear reader, and to create a simple application featuring reactive database access. Fear not, I will be there along the whole way to guide you through the process.

Also, the source code of this example is available on GitHub.

Getting the tools

In order to build the application, you will need a few tools. If you haven’t worked with Scala yet, the simplest for you may be to go and grab the Typesafe Activator which is a standalone project that brings in the necessary tools to build a Scala project from scratch.

Since this is about reactive database access, you will also need a database. For the purpose of this simple example, we’re going to use Oracle Database 12c Enterprise Edition. Nah, just kidding – it might be a bit cumbersome to get this one to run on your machine. Instead we will use the excellent PostgreSQL. Make sure to install it so that you can run the psql utility from your console.

Ready? Great! Let’s have a look at what we’re going to build.

The application

The goal of our application is to fetch mentions from Twitter and store them locally so as to be able to visualize them and perform analytics on them.

[Image: http://manuel.bernhardt.io/wp-content/Mentions.png]

The core of this mechanism will be a MentionsFetcher actor, which will periodically fetch mentions from Twitter and save them in the database. Once they are there, we can display useful information in a view.

Creating the database

The first step we’re going to take is to create the database. Create a mentions.sql file somewhere with the following content:

CREATE USER "play" NOSUPERUSER INHERIT CREATEROLE;

CREATE DATABASE mentions WITH OWNER = "play" ENCODING 'UTF8';

GRANT ALL PRIVILEGES ON DATABASE mentions to "play";

\connect mentions play

CREATE TABLE twitter_user (
  id bigserial primary key,
  created_on timestamp with time zone NOT NULL,
  twitter_user_name varchar NOT NULL
);

CREATE TABLE mentions (
  id bigserial primary key,
  tweet_id varchar NOT NULL,
  user_id bigint NOT NULL,
  created_on timestamp with time zone NOT NULL,
  text varchar NOT NULL
);

This script will create a play user, a mentions database as well as two tables, twitter_user and mentions.

In order to execute it, execute the following command in a terminal:

psql -f mentions.sql

(note: you might need to explicitly declare which user runs this command, depending on how you have configured PostgreSQL to run)

Bootstrapping the project

Let’s create the reactive-mentions project, shall we? Assuming that you have installed the activator, run the following command in a terminal:

~/workspace » activator new reactive-mentions

This will prompt a list of templates; we are going to use the play-scala project template:

Fetching the latest list of templates...

Browse the list of templates: http://typesafe.com/activator/templates
Choose from these featured templates or enter a template name:
  1) minimal-akka-java-seed
  2) minimal-akka-scala-seed
  3) minimal-java
  4) minimal-scala
  5) play-java
  6) play-scala
(hit tab to see a list of all templates)
> 6
OK, application "reactive-mentions" is being created using the "play-scala" template.
[...]

At this point, a simple Play Framework project has been created in the reactive-mentions directory. If you want to, you can run this project by navigating to it and running the command activator run.

In order to work on the project, you can use one of the many IDEs that have Scala support. My personal favourite is, to this day, IntelliJ IDEA, which does a pretty good job at this and also has built-in support for the Play Framework itself.

Setting up jOOQ

I wrote about database access in Scala about two years ago. There are, to this day, still quite a few alternatives for relational database access in Scala, but at least personally, I have now reached the conclusion that for the type of projects I work on, jOOQ beats them all when it comes to writing type-safe SQL. So without further ado, let’s integrate it with our project.

There is an SBT plugin available for this if you would like, however for this application we will settle for a minimal, hand-crafted solution.

Bring up the build.sbt file in an editor and adjust the libraryDependencies to look like so:

libraryDependencies ++= Seq(
  jdbc,
  cache,
  ws,
  "org.postgresql" % "postgresql" % "9.4-1201-jdbc41",
  "org.jooq" % "jooq" % "3.7.0",
  "org.jooq" % "jooq-codegen-maven" % "3.7.0",
  "org.jooq" % "jooq-meta" % "3.7.0",
  specs2 % Test
)

If you are running the project’s console (which you can do by executing the activator command in the project’s directory) you will need to call the reload command in order for the new dependencies to be pulled in. This is true of any change you make to the build.sbt file. Don’t forget about it in the remainder of this article!

(note: make sure to use the version of the PostgreSQL driver that fits your version of PostgreSQL!)

Next, we need to set up jOOQ itself. For this purpose, create the file conf/mentions.xml, where conf is the directory used in the Play Framework for storing configuration-related files:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<configuration xmlns="http://www.jooq.org/xsd/jooq-codegen-3.7.0.xsd">
  <jdbc>
    <driver>org.postgresql.Driver</driver>
    <url>jdbc:postgresql://localhost/mentions</url>
    <user>play</user>
    <password></password>
  </jdbc>
  <generator>
    <name>org.jooq.util.ScalaGenerator</name>
    <database>
      <name>org.jooq.util.postgres.PostgresDatabase</name>
      <inputSchema>public</inputSchema>
      <includes>.*</includes>
      <excludes></excludes>
    </database>
    <target>
      <packageName>generated</packageName>
      <directory>target/scala-2.11/src_managed/main</directory>
    </target>
  </generator>
</configuration>

This configuration will allow us to run jOOQ’s ScalaGenerator, which will read the database schema and generate Scala-specific classes for it, storing them in a directory accessible in the classpath and meant for generated sources.

All that is left to do is to have a way to run jOOQ’s code generation. A simple solution that we are going to use is to create a custom SBT task in our project build. Go back to build.sbt and add the following at the end:

val generateJOOQ = taskKey[Seq[File]]("Generate JooQ classes")

val generateJOOQTask = (sourceManaged, fullClasspath in Compile, runner in Compile, streams) map { (src, cp, r, s) =>
  toError(r.run("org.jooq.util.GenerationTool", cp.files, Array("conf/mentions.xml"), s.log))
  ((src / "main/generated") ** "*.scala").get
}

generateJOOQ <<= generateJOOQTask


unmanagedSourceDirectories in Compile += sourceManaged.value / "main/generated"

The generateJOOQ task will run the GenerationTool using the mentions.xml file we have set up earlier on. Let’s run it!

Start the SBT console by running the activator command in your terminal window, in the reactive-mentions directory, and then run the generateJOOQ command:

[reactive-mentions] $ generateJOOQ
[info] Running org.jooq.util.GenerationTool conf/mentions.xml
[success] Total time: 1 s, completed Dec 11, 2015 2:55:08 PM

That’s it! If you want a bit more verbosity, add the following logger configuration to conf/logback.xml:

 <logger name="org.jooq" level="INFO" />

Alright, we are now ready to get to the core of our endeavour: creating the actor that will pull mentions from Twitter!

Creating the MentionsFetcher actor

For the purpose of fetching mentions at regular intervals from Twitter, we will be using a simple Akka actor. Actors are meant to do a lot more powerful things than this, but for the sake of introducing the concept, this example will do (or so I hope).

Go ahead and add Akka as well as its logging facility as library dependencies in build.sbt:

libraryDependencies ++= Seq(
  ...
  "com.typesafe.akka" %% "akka-actor" % "2.4.1",
  "com.typesafe.akka" %% "akka-slf4j" % "2.4.1"
)

Next, create the file app/actors/MentionsFetcher.scala with the following content:

package actors

import actors.MentionsFetcher._
import akka.actor.{ActorLogging, Actor}
import org.joda.time.DateTime
import scala.concurrent.duration._

class MentionsFetcher extends Actor with ActorLogging {

  val scheduler = context.system.scheduler.schedule(
    initialDelay = 5.seconds,
    interval = 10.minutes,
    receiver = self,
    message = CheckMentions
  )

  override def postStop(): Unit = {
    scheduler.cancel()
  }

  def receive = {
    case CheckMentions => checkMentions
    case MentionsReceived(mentions) => storeMentions(mentions)
  }

  def checkMentions = ???

  def storeMentions(mentions: Seq[Mention]) = ???

}

object MentionsFetcher {

  case object CheckMentions
  case class Mention(id: String, created_at: DateTime, text: String, from: String, users: Seq[User])
  case class User(handle: String, id: String)
  case class MentionsReceived(mentions: Seq[Mention])

}

The first thing you may notice in this code is the unimplemented methods checkMentions and storeMentions with the triple question mark ???. That’s actually valid Scala syntax: ??? is a method available by default which throws a scala.NotImplementedError.

The second thing I want you to notice is the companion object to the MentionsFetcher class, which holds the protocol of our actor. Actors communicate using messages, and even though our actor will only communicate with itself in this example, it is a good idea to place the protocol in a companion object and to import its members (via the wildcard import actors.MentionsFetcher._) so as to keep things organized as the project grows.

Other than this, what we are doing for the moment is quite simple: we are setting up a scheduler that wakes up every 10 minutes in order to send the actor itself the CheckMentions message. Upon receiving this message in the main receive method, we proceed to fetch the mentions from Twitter. Finally, when a MentionsReceived message is received, we simply invoke the storeMentions method.

Simple enough, isn’t it? Don’t worry, things are about to get a little bit more complicated.

Fetching the mentions from Twitter

Twitter does not have an API that lets us directly fetch recent mentions. However it has an API that lets us search for Tweets and that will have to do.

Before you can go any further, if you intend to run this project, you will need to get yourself a set of keys and access tokens at apps.twitter.com. If you don’t, you will have to trust me that the following works.

Once you have them, add them in the file conf/application.conf like so:

# Twitter
twitter.apiKey="..."
twitter.apiSecret="..."
twitter.accessToken="..."
twitter.accessTokenSecret="..."

Then, create the credentials method in MentionsFetcher:

// ...
import play.api.Play
import play.api.Play.current
import play.api.libs.oauth.{RequestToken, ConsumerKey}

class MentionsFetcher extends Actor with ActorLogging {

  // ...

  def credentials = for {
    apiKey <- Play.configuration.getString("twitter.apiKey")
    apiSecret <- Play.configuration.getString("twitter.apiSecret")
    token <- Play.configuration.getString("twitter.accessToken")
    tokenSecret <- Play.configuration.getString("twitter.accessTokenSecret")
  } yield (ConsumerKey(apiKey, apiSecret), RequestToken(token, tokenSecret))

}

This will allow us to place a call to Twitter’s API using the correct OAuth credentials.

Next, let’s get ready to fetch those mentions:

// ...
import akka.pattern.pipe
import org.joda.time.DateTime
import scala.util.control.NonFatal

class MentionsFetcher extends Actor with ActorLogging {

  // ...

  var lastSeenMentionTime: Option[DateTime] = Some(DateTime.now)

  def checkMentions = {
      val maybeMentions = for {
        (consumerKey, requestToken) <- credentials
        time <- lastSeenMentionTime
      } yield fetchMentions(consumerKey, requestToken, "<yourTwitterHandleHere>", time)

      maybeMentions.foreach { mentions =>
        mentions.map { m =>
          MentionsReceived(m)
        } recover { case NonFatal(t) =>
          log.error(t, "Could not fetch mentions")
          MentionsReceived(Seq.empty)
        } pipeTo self
      }
  }

  def fetchMentions(consumerKey: ConsumerKey, requestToken: RequestToken, user: String, time: DateTime): Future[Seq[Mention]] = ???

}

Do you remember the pipe pattern we talked about in the previous post about Actors? Well, here it is again!

The call we are going to make against Twitter’s API is going to be asynchronous. In other words, we will not simply get a Seq[Mention] but a Future[Seq[Mention]] to work with, and the best way to deal with that one is to send ourselves a message once the Future has completed with the contents of the result.

Since things can go wrong though we also need to think about error recovery which we do here by heroically logging out the fact that we could not fetch the mentions.

You may also notice that we have introduced a lastSeenMentionTime variable. This is the means by which we are going to keep in memory the timestamp of the last mention we have seen.

In order to go ahead, one thing we need to do is to use a more recent version of the async-http-client library, since there is a bug in Play 2.4.x. Add the following dependency to build.sbt:

libraryDependencies ++= Seq(
  ...
  "com.ning" % "async-http-client" % "1.9.29"
)

Alright, now that we are all set, let’s finally fetch those mentions!

// ...
import java.util.Locale

import scala.util.control.NonFatal
import org.joda.time.format.DateTimeFormat
import play.api.libs.json.JsArray
import play.api.libs.oauth.OAuthCalculator
import play.api.libs.ws.WS
import scala.concurrent.Future


class MentionsFetcher extends Actor with ActorLogging {

  // ...

  def fetchMentions(consumerKey: ConsumerKey, requestToken: RequestToken, user: String, time: DateTime): Future[Seq[Mention]] = {
    val df = DateTimeFormat.forPattern("EEE MMM dd HH:mm:ss Z yyyy").withLocale(Locale.ENGLISH)

    WS.url("https://api.twitter.com/1.1/search/tweets.json")
      .sign(OAuthCalculator(consumerKey, requestToken))
      .withQueryString("q" -> s"@$user")
      .get()
      .map { response =>
        val mentions = (response.json \ "statuses").as[JsArray].value.map { status =>
          val id = (status \ "id_str").as[String]
          val text = (status \ "text").as[String]
          val from = (status \ "user" \ "screen_name").as[String]
          val created_at = df.parseDateTime((status \ "created_at").as[String])
          val userMentions = (status \ "entities" \ "user_mentions").as[JsArray].value.map { user =>
            User((user \ "screen_name").as[String], ((user \ "id_str").as[String]))
          }

          Mention(id, created_at, text, from, userMentions)

        }
        mentions.filter(_.created_at.isAfter(time))
    }
  }

}

Fetching the mentions is rather straightforward thanks to Play’s WebService library. We create a signed OAuth request using our credentials and run an HTTP GET request against the search API, passing as query string the @userName, which will (hopefully) give us a list of Tweets mentioning a user. Lastly, we only keep those mentions that are after our last check time. Since we check every 10 minutes and since the API only returns recent tweets, this should be doing fine (unless you are very popular on Twitter and get an insane amount of replies – but this really is your own fault, then).

Setting the ExecutionContext

If you try to compile the project now (using the compile command) you will be greeted with a few compilation errors complaining about a missing ExecutionContext. Futures are a way to abstract tasks and they need something to run them. The ExecutionContext is the missing bit which will schedule the tasks to be executed.

Since we are inside of an actor we can borrow the actor’s own dispatcher:

class MentionsFetcher extends Actor with ActorLogging {

  implicit val executionContext = context.dispatcher

  // ...
}

We’ll talk more about Execution Contexts later on when it comes to fine-tuning the connection with the database. For the moment let us focus on actually getting to talk with the database at all.

Setting up a reactive database connection

Configuring the database connection

In order to connect to the database, we will first need to configure the connection information in conf/application.conf like so:

# ...

db.default.driver="org.postgresql.Driver"
db.default.url="jdbc:postgresql://localhost/mentions?user=play"

Creating a helper class to access the database

Play’s Database API lets us access the configured database. We now need to do two things:

  • use jOOQ (rather than plain JDBC) to talk with the database
  • make sure we are not going to jeopardize our application by blocking while waiting for the database interaction to happen (JDBC is blocking)

For this purpose we will wrap the database operations in a Future that will run on its own ExecutionContext rather than sharing the one used by the actor or by the Play application itself.

Go ahead and create the file app/database/DB.scala:

package database

import javax.inject.Inject

import akka.actor.ActorSystem
import org.jooq.{SQLDialect, DSLContext}
import org.jooq.impl.DSL
import play.api.db.Database

import scala.concurrent.{ExecutionContext, Future}

class DB @Inject() (db: Database, system: ActorSystem) {

  val databaseContext: ExecutionContext = system.dispatchers.lookup("contexts.database")

  def query[A](block: DSLContext => A): Future[A] = Future {
    db.withConnection { connection =>
      val sql = DSL.using(connection, SQLDialect.POSTGRES_9_4)
      block(sql)
    }
  }(databaseContext)

  def withTransaction[A](block: DSLContext => A): Future[A] = Future {
    db.withTransaction { connection =>
      val sql = DSL.using(connection, SQLDialect.POSTGRES_9_4)
      block(sql)
    }
  }(databaseContext)

}

We define two methods, query and withTransaction that:

  • use a Future block in order to wrap the underlying code as a Future, thus running it asynchronously
  • use a custom databaseContext ExecutionContext in order to execute this Future
  • initialize jOOQ’s DSLContext and give access to it in the body of the passed functions

The databaseContext ExecutionContext is created using Akka’s configuration capabilities. We need to add the configuration of the database dispatcher in conf/application.conf:

contexts {
    database {
        fork-join-executor {
          parallelism-max = 9
        }
    }
}

The magic number 9 doesn’t come out of nowhere. Check the excellent explanation provided by the HikariCP connection pool about connection pool sizing for more details: its rule of thumb is connections = (core_count * 2) + effective_spindle_count, which yields 9 on a four-core machine with a single disk. Those considerations are also discussed at length in Chapters 5 and 7 of Reactive Web Applications.

Wiring everything using dependency injection

Next, let’s use Play’s built-in dependency injection mechanism in order to provide our MentionsFetcher actor with a DB class. Adjust the constructor of our MentionsFetcher actor in app/actors/MentionsFetcher.scala to look like so:

// ...
import javax.inject.Inject
import database.DB


class MentionsFetcher @Inject() (db: DB) extends Actor with ActorLogging { ... }

We just need one more thing in order to bootstrap our MentionsFetcher actor: let Play know that we want to use it.

For this purpose we will declare a module and leverage the plumbing that Play provides when it comes to interacting with Akka actors. At the end of MentionsFetcher.scala (or in a new file, if you like), declare the following MentionsFetcherModule:

import com.google.inject.AbstractModule
import play.api.libs.concurrent.AkkaGuiceSupport

class MentionsFetcherModule extends AbstractModule with AkkaGuiceSupport {
  def configure(): Unit =
    bindActor[MentionsFetcher]("fetcher")
}

Last but not least, we need to tell Play that we would like to use this module. In conf/application.conf, add the following line to do so:

play.modules.enabled += "actors.MentionsFetcherModule"

That’s it! When Play starts up it will initialize the enabled modules which in turn will lead to the actor being initialized.

We now can go ahead and use the database in order to store the fetched mentions.

Storing the mentions in the database

Thanks to jOOQ, writing the statements for storing the mentions is rather easy. Since we do not want to risk storing users or mentions twice, we will upsert them using the WHERE NOT EXISTS SQL clause. To record as much data as possible, we will also store all mentioned users of a Tweet.

// ...
import generated.Tables._
import org.jooq.impl.DSL._

class MentionsFetcher @Inject() (db: DB) extends Actor with ActorLogging {

  // ...

  def storeMentions(mentions: Seq[Mention]) = db.withTransaction { sql =>
    log.info("Inserting potentially {} mentions into the database", mentions.size)
    val now = new Timestamp(DateTime.now.getMillis)

    def upsertUser(handle: String) = {
      sql.insertInto(TWITTER_USER, TWITTER_USER.CREATED_ON, TWITTER_USER.TWITTER_USER_NAME)
        .select(
          select(value(now), value(handle))
            .whereNotExists(
              selectOne()
                .from(TWITTER_USER)
                .where(TWITTER_USER.TWITTER_USER_NAME.equal(handle))
            )
        )
        .execute()
    }

    mentions.foreach { mention =>
      // upsert the mentioning users
      upsertUser(mention.from)

      // upsert the mentioned users
      mention.users.foreach { user =>
        upsertUser(user.handle)
      }

      // upsert the mention
      sql.insertInto(MENTIONS, MENTIONS.CREATED_ON, MENTIONS.TEXT, MENTIONS.TWEET_ID, MENTIONS.USER_ID)
        .select(
          select(
            value(now),
            value(mention.text),
            value(mention.id),
            TWITTER_USER.ID
          )
            .from(TWITTER_USER)
            .where(TWITTER_USER.TWITTER_USER_NAME.equal(mention.from))
            .andNotExists(
              selectOne()
                .from(MENTIONS)
                .where(MENTIONS.TWEET_ID.equal(mention.id))
            )
        )
        .execute()
    }
  }

}

Et voilà! If you execute this code (and generate some mentions, or use an earlier timestamp for filtering) you will get some data into your database!

Let’s now query and display a few statistics in the browser.

Displaying the mentions

In order to show our mentions, we will adjust the default view shown when launching the application, as well as the Application controller. Start by adjusting the template app/views/index.scala.html to look as follows:

@(mentionsCount: Int)

@main("Reactive mentions") {

    <p>You have been mentioned @mentionsCount times in the past day</p>

}

Next, edit the Application controller located in app/controllers/Application.scala:

package controllers

import java.sql.Timestamp
import javax.inject.Inject

import database.DB
import org.joda.time.DateTime
import play.api._
import play.api.mvc._

class Application @Inject() (db: DB) extends Controller {

  def index = Action.async { implicit request =>

    import generated.Tables._
    import org.jooq.impl.DSL._

    db.query { sql =>
      val mentionsCount = sql.select(
        count()
      ).from(MENTIONS)
       .where(
         MENTIONS.CREATED_ON.gt(value(new Timestamp(DateTime.now.minusDays(1).getMillis)))
       ).fetchOne().value1() // fetch the single COUNT(*) value

      Ok(views.html.index(mentionsCount))
    }

  }

}

This time, we are using the query method that we built in our DB helper. Since the result of this operation is a Future, we need to use Action.async, which expects a function of type Request => Future[Result]. The query is executed on the custom ExecutionContext that we set up for database operations and does not interfere with the default ExecutionContext of the Play framework itself.

In case anything were to go wrong and the database operations were to hang forever on the threads offered by that context, the rest of the application would not be affected. This principle is called “bulkheading” and is described in more detail in Chapter 5 of Reactive Web Applications.
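Expressed in plain JDK terms (a rough sketch, not Play- or Akka-specific; the pool size and the Callable body are made up for illustration), bulkheading amounts to confining risky, blocking work to its own bounded thread pool:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Bulkhead {
    public static void main(String[] args) throws Exception {
        // A dedicated, bounded pool: even if all of its threads hang on
        // blocking calls, threads serving the rest of the app are unaffected.
        ExecutorService databasePool = Executors.newFixedThreadPool(9);

        Future<Integer> result = databasePool.submit(() -> {
            // a blocking call (e.g. JDBC) would go here
            return 42;
        });

        System.out.println(result.get());
        databasePool.shutdown();
    }
}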

Conclusion

In this series we have explored the “Why?” of reactive applications and of asynchronous programming. In particular, we have talked about Futures and Actors, two powerful abstractions that make asynchronous programming easier to think about.

Most relational databases do not have asynchronous drivers available yet. A few projects are aiming to provide them, but I think it will take some time before there is a standard that many vendors implement. In the meantime, we have seen that we can use a custom ExecutionContext to isolate otherwise blocking database operations.


If you liked this series and are interested in learning more on the topic, consider checking out my book, which provides an introduction to building reactive web applications on the JVM. Futures are covered in Chapter 5, Actors in Chapter 6, and Database Access in Chapter 7.


If Java Were Designed Today: The Synchronizable Interface


Java has come a long way. A very long way. And it carries with it all the “junk” from early day design decisions.

One thing that has been regretted time and again is the fact that every object (potentially) contains a monitor. This is hardly ever necessary, and the flaw was finally addressed in Java 5, when new concurrency APIs were introduced, such as java.util.concurrent.locks.Lock and its subtypes. Since then, writing synchronized, concurrent code has become a lot easier than in the days when we only had the synchronized keyword and the hard-to-understand wait() and notify() mechanism.
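For illustration, here is a minimal sketch of that Lock-based style (the Counter class is a made-up example, not from any particular API):

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class Counter {

    // The lock is an explicit, private object: no intrinsic monitor needed
    private final Lock lock = new ReentrantLock();
    private int count;

    int increment() {
        lock.lock();
        try {
            return ++count;
        }
        finally {
            // Always unlock in finally, or a thrown exception leaks the lock
            lock.unlock();
        }
    }
}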

The synchronized modifier is hardly used anymore

The original language design specified for these “convenience” modifiers on methods:

// These are the same, semantically:
public synchronized void method() {
    ...
}

public void method() {
    synchronized (this) {
        ...
    }
}

// So are these:
public static synchronized void method() {
    ...
}

public static void method() {
    synchronized (ClassOfMethod.class) {
        ...
    }
}

(note: while the byte code produced above is not the same, the high-level semantics certainly are)

You hardly ever want to synchronize on the complete method scope, since you want to keep synchronization time to a minimum, and factoring out a method every time you need synchronization is tedious.

Furthermore, the monitor breaks encapsulation. Anyone can synchronize on your monitor if you synchronize on this or on the entire class. You probably don’t want that, which is why most people who still work with the synchronized keyword simply create an explicit, private lock object, such as:

class SomeClass {
    private final Object LOCK = new Object();

    public void method() {
        ...

        synchronized (LOCK) {
            ...
        }

        ...
    }
}

If that’s the standard use-case for classic synchronized blocks, do we then still need a monitor on every object?

Synchronized in a more modern Java version

If Java were designed with today’s knowledge about the Java language, we wouldn’t allow using synchronized on any random object (including strings or arrays):

// Wouldn't work
synchronized ("abc") {
    ...
}

We would introduce a special Synchronizable marker interface, which guarantees that implementors will have a monitor. And the synchronized block would only accept Synchronizable arguments:

Synchronizable lock = ...

synchronized (lock) {
    ...
}

This would work exactly the same way as foreach or try-with-resources:

Iterable<Object> iterable = ...

// The type to the right of ":" must be Iterable
for (Object o : iterable) {
    ...
}

// The assignment type must be AutoCloseable
try (AutoCloseable closeable = ...) {
    ...
}

// The assignment type must be a functional interface
Runnable runnable = () -> {};

So, in order for a given language feature to work, the Java language imposes constraints on the types that are used in that context. In the case of foreach or try-with-resources, a concrete JDK type is required. In the case of lambda expressions, a matching structural type is required (which is rather esoteric but clever, for Java).

Unfortunately, for backwards-compatibility reasons, there will not be any new restriction added for synchronized blocks. Or will there? It would be great if an optional warning could be issued when the type is not Synchronizable. That might allow, over the course of a couple of future major releases, removing monitors from objects that do not really need to be synchronizable.

Which is essentially what the C language has been doing with mutexes all along. They’re a special thing. Not the common thing.

2016 Will be the Year Remembered as When Java Finally Had Window Functions!


You heard right. Up until now, the awesome window functions were a feature uniquely reserved to SQL. Even sophisticated functional programming languages still seem to lack this beautiful functionality (correct me if I’m wrong, Haskell folks).

We’ve written tons of blog posts about window functions, evangelising them to our audience.

One of my favourite example use-cases for window functions is the running total. I.e. to get from the following bank account transaction table:

| ID   | VALUE_DATE | AMOUNT |
|------|------------|--------|
| 9997 | 2014-03-18 |  99.17 |
| 9981 | 2014-03-16 |  71.44 |
| 9979 | 2014-03-16 | -94.60 |
| 9977 | 2014-03-16 |  -6.96 |
| 9971 | 2014-03-15 | -65.95 |

… to this one, with a calculated balance:

| ID   | VALUE_DATE | AMOUNT |  BALANCE |
|------|------------|--------|----------|
| 9997 | 2014-03-18 |  99.17 | 19985.81 |
| 9981 | 2014-03-16 |  71.44 | 19886.64 |
| 9979 | 2014-03-16 | -94.60 | 19815.20 |
| 9977 | 2014-03-16 |  -6.96 | 19909.80 |
| 9971 | 2014-03-15 | -65.95 | 19916.76 |

With SQL, this is a piece of cake. Observe the usage of SUM(t.amount) OVER(...):

SELECT
  t.*,
  t.current_balance - NVL(
    SUM(t.amount) OVER (
      PARTITION BY t.account_id
      ORDER BY     t.value_date DESC,
                   t.id         DESC
      ROWS BETWEEN UNBOUNDED PRECEDING
           AND     1         PRECEDING
    ),
  0) AS balance
FROM     v_transactions t
WHERE    t.account_id = 1
ORDER BY t.value_date DESC,
         t.id         DESC

How do window functions work?

(don’t forget to book our SQL Masterclass to learn about window functions, and much more!)

Despite their sometimes scary syntax, window functions are really very easy to understand. Windows are “views” of the data produced in your FROM / WHERE / GROUP BY / HAVING clauses. They allow you to access all the other rows relative to the current row while you calculate something in your SELECT clause (or, rarely, in your ORDER BY clause). What the above statement really does is this:

| ID   | VALUE_DATE |  AMOUNT |  BALANCE |
|------|------------|---------|----------|
| 9997 | 2014-03-18 | -(99.17)|+19985.81 |
| 9981 | 2014-03-16 | -(71.44)| 19886.64 |
| 9979 | 2014-03-16 |-(-94.60)| 19815.20 |
| 9977 | 2014-03-16 |   -6.96 |=19909.80 |
| 9971 | 2014-03-15 |  -65.95 | 19916.76 |

I.e. for any given balance, subtract from the current balance the SUM() of all AMOUNT values OVER() the window of all the rows that are in the same partition as the current row (same bank account), and that are strictly “above” the current row.

Or, in detail:

  • PARTITION BY specifies which rows the OVER() window spans
  • ORDER BY specifies how the window is ordered
  • ROWS specifies what ordered row indexes should be considered

Can we do this with Java collections?

Yes we can! If you’re using jOOλ (“The Missing Parts in Java 8”): a completely free, Open Source, Apache 2.0 licensed library that we designed because we thought that the JDK 8 Stream and Collector APIs just don’t do it.

When Java 8 was designed, a lot of focus went into supporting parallel streams. That’s nice, but certainly not the only useful area where functional programming can be applied. We’ve created jOOλ to fill this gap, without implementing an all-new, alternative collections API the way Javaslang or Functional Java have.

jOOλ already provides:

  1. Tuple types
  2. More useful stuff for ordered, sequential-only streams

With the recently released jOOλ 0.9.9, we’ve added two main new features:

  1. Tons of new Collectors
  2. Window functions

The many missing collectors in the JDK

The JDK ships with a couple of collectors, but they do seem awkward and verbose, and no one really appreciates writing collectors like the ones exposed in this Stack Overflow question (and many others).

But the use case exposed in the linked question is a very valid one. You want to aggregate several things from a list of persons:

public class Person {
    private String firstName;
    private String lastName;
    private int age;
    private double height;
    private double weight;
    // getters / setters
}

Assuming you have this list:

List<Person> personsList = new ArrayList<Person>();

personsList.add(new Person("John", "Doe", 25, 1.80, 80));
personsList.add(new Person("Jane", "Doe", 30, 1.69, 60));
personsList.add(new Person("John", "Smith", 35, 174, 70));

You now want to get the following aggregations:

  • Number of people
  • Max age
  • Min height
  • Avg weight

This is a ridiculously simple problem for anyone used to writing SQL:

SELECT count(*), max(age), min(height), avg(weight)
FROM person

Done. How hard can it be in Java? It turns out that a lot of glue code needs to be written with the vanilla JDK 8 API. Consider the sophisticated answers given to the Stack Overflow question linked above.
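To see why, here is a hedged sketch of one straightforward vanilla-JDK approach, which needs a separate pass over the list per aggregation (it assumes the getters of the Person class above; OptionalInt and OptionalDouble live in java.util):

// One stream pass per aggregate:
long count = personsList.stream().count();

OptionalInt maxAge =
    personsList.stream().mapToInt(Person::getAge).max();

OptionalDouble minHeight =
    personsList.stream().mapToDouble(Person::getHeight).min();

OptionalDouble avgWeight =
    personsList.stream().mapToDouble(Person::getWeight).average();

You can avoid the multiple passes by writing a custom Collector, which is precisely the kind of glue code the linked question complains about.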

With jOOλ 0.9.9, solving this problem becomes ridiculously trivial again, and it reads almost like SQL:

Tuple result =
Seq.seq(personsList)
   .collect(
       count(),
       max(Person::getAge),
       min(Person::getHeight),
       avg(Person::getWeight)
   );

System.out.println(result);

And the result yields:

(3, Optional[35], Optional[1.69], Optional[70.0])

Note that this isn’t running a query against a SQL database (that’s what jOOQ is for). We’re running this “query” against an in-memory Java collection.

OK ok, that’s already awesome. Now what about window functions?

Right, the title of this article didn’t promise trivial aggregation stuff. It promised the awesome window functions.

Yet, window functions are nothing else than aggregations (or rankings) on a subset of your data stream. Instead of aggregating all of the stream (or table) into a single record, you want to maintain the original records, and provide the aggregation on each individual record directly.

A nice introductory example for window functions is the one provided in this article that explains the difference between ROW_NUMBER(), RANK(), and DENSE_RANK(). Consider the following PostgreSQL query:

SELECT
  v, 
  ROW_NUMBER() OVER(w),
  RANK()       OVER(w),
  DENSE_RANK() OVER(w)
FROM (
  VALUES('a'),('a'),('a'),('b'),
        ('c'),('c'),('d'),('e')
) t(v)
WINDOW w AS (ORDER BY v);

It yields:

| V | ROW_NUMBER | RANK | DENSE_RANK |
|---|------------|------|------------|
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |

The same can be done in Java 8, using jOOλ 0.9.9:

System.out.println(
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")
       .window(naturalOrder())
       .map(w -> tuple(
            w.value(),
            w.rowNumber(),
            w.rank(),
            w.denseRank()
       ))
       .format()
);

Yielding…

+----+----+----+----+
| v0 | v1 | v2 | v3 |
+----+----+----+----+
| a  |  0 |  0 |  0 |
| a  |  1 |  0 |  0 |
| a  |  2 |  0 |  0 |
| b  |  3 |  3 |  1 |
| c  |  4 |  4 |  2 |
| c  |  5 |  4 |  2 |
| d  |  6 |  6 |  3 |
| e  |  7 |  7 |  4 |
+----+----+----+----+

Again, do note that we’re not running any queries against a database. Everything is done in memory.

Notice two things:

  • jOOλ’s window functions return 0-based ranks, as is expected for Java APIs, as opposed to SQL, which is all 1-based.
  • In Java, it is not possible to construct ad-hoc records with named columns. That’s unfortunate, and I do hope a future Java will provide support for such language features.

Let’s review what happens exactly in the code:

System.out.println(

    // This is just enumerating our values
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")

    // Here, we specify a single window to be
    // ordered by the value T in the stream, in
    // natural order
       .window(naturalOrder())

    // The above window clause produces a Window<T>
    // object (the w here), which exposes...
       .map(w -> tuple(

    // ... the current value itself, of type String...
            w.value(),

    // ... or various rankings or aggregations on
    // the above window.
            w.rowNumber(),
            w.rank(),
            w.denseRank()
       ))

    // Just some nice formatting to produce the table
       .format()
);

That’s it! Easy, isn’t it?

We can do more! Check this out:

System.out.println(
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")
       .window(naturalOrder())
       .map(w -> tuple(
            w.value(),   // v0 
            w.count(),   // v1
            w.median(),  // v2
            w.lead(),    // v3
            w.lag(),     // v4
            w.toString() // v5
       ))
       .format()
);

What does the above yield?

+----+----+----+---------+---------+----------+
| v0 | v1 | v2 | v3      | v4      | v5       |
+----+----+----+---------+---------+----------+
| a  |  1 | a  | a       | {empty} | a        |
| a  |  2 | a  | a       | a       | aa       |
| a  |  3 | a  | b       | a       | aaa      |
| b  |  4 | a  | c       | a       | aaab     |
| c  |  5 | a  | c       | b       | aaabc    |
| c  |  6 | a  | d       | c       | aaabcc   |
| d  |  7 | b  | e       | c       | aaabccd  |
| e  |  8 | b  | {empty} | d       | aaabccde |
+----+----+----+---------+---------+----------+

Your analytics heart should be jumping now.


Wait a second. Can we do frames, too, as in SQL? Yes, we can. Just as in SQL, when we omit the frame clause on a window definition (but we do specify an ORDER BY clause), then the following is applied by default:

RANGE BETWEEN UNBOUNDED PRECEDING
  AND CURRENT ROW

We’ve done this in the previous examples. It can be seen in column v5, where we aggregate the string from the very first value up until the current value. So, let’s specify the frame then:

System.out.println(
    Seq.of("a", "a", "a", "b", "c", "c", "d", "e")
       .window(naturalOrder(), -1, 1) // frame here
       .map(w -> tuple(
            w.value(),   // v0
            w.count(),   // v1
            w.median(),  // v2
            w.lead(),    // v3
            w.lag(),     // v4
            w.toString() // v5
       ))
       .format()
);

And the result is, trivially:

+----+----+----+---------+---------+-----+
| v0 | v1 | v2 | v3      | v4      | v5  |
+----+----+----+---------+---------+-----+
| a  |  2 | a  | a       | {empty} | aa  |
| a  |  3 | a  | a       | a       | aaa |
| a  |  3 | a  | b       | a       | aab |
| b  |  3 | b  | c       | a       | abc |
| c  |  3 | c  | c       | b       | bcc |
| c  |  3 | c  | d       | c       | ccd |
| d  |  3 | d  | e       | c       | cde |
| e  |  2 | d  | {empty} | d       | de  |
+----+----+----+---------+---------+-----+

As expected, lead() and lag() are unaffected by the frame, as opposed to count(), median(), and toString().

Awesome! Now, let’s review the running total.

Often, you don’t calculate window functions on the value of the stream itself, as that value is usually not a scalar but a tuple (or a POJO, in Java-speak). Instead, you extract values from the tuple (or POJO) and perform the aggregation on those. So, again, when calculating the BALANCE, we need to extract the AMOUNT first.

| ID   | VALUE_DATE |  AMOUNT |  BALANCE |
|------|------------|---------|----------|
| 9997 | 2014-03-18 | -(99.17)|+19985.81 |
| 9981 | 2014-03-16 | -(71.44)| 19886.64 |
| 9979 | 2014-03-16 |-(-94.60)| 19815.20 |
| 9977 | 2014-03-16 |   -6.96 |=19909.80 |
| 9971 | 2014-03-15 |  -65.95 | 19916.76 |

Here’s how you would write the running total with Java 8 and jOOλ 0.9.9:

BigDecimal currentBalance = new BigDecimal("19985.81");

System.out.println(
    Seq.of(
        tuple(9997, "2014-03-18", new BigDecimal("99.17")),
        tuple(9981, "2014-03-16", new BigDecimal("71.44")),
        tuple(9979, "2014-03-16", new BigDecimal("-94.60")),
        tuple(9977, "2014-03-16", new BigDecimal("-6.96")),
        tuple(9971, "2014-03-15", new BigDecimal("-65.95")))
    .window(Comparator
        .comparing((Tuple3<Integer, String, BigDecimal> t)
            -> t.v1, reverseOrder())
        .thenComparing(t -> t.v2), Long.MIN_VALUE, -1)
    .map(w -> w.value().concat(
         currentBalance.subtract(w.sum(t -> t.v3)
                                  .orElse(BigDecimal.ZERO))
    ))
    // format the resulting Seq of tuples as the table below
    .format()
);

Yielding

+------+------------+--------+----------+
|   v0 | v1         |     v2 |       v3 |
+------+------------+--------+----------+
| 9997 | 2014-03-18 |  99.17 | 19985.81 |
| 9981 | 2014-03-16 |  71.44 | 19886.64 |
| 9979 | 2014-03-16 | -94.60 | 19815.20 |
| 9977 | 2014-03-16 |  -6.96 | 19909.80 |
| 9971 | 2014-03-15 | -65.95 | 19916.76 |
+------+------------+--------+----------+

A couple of things have changed here:

  • The comparator now takes two comparisons into account. Unfortunately, JEP 101 wasn’t entirely implemented, which is why we need to help the compiler with type inference here.
  • The Window.value() is now a tuple, not a single value, so we need to extract the interesting column from it, the AMOUNT (via t -> t.v3). On the other hand, we can simply concat() that additional value to the tuple.

But that’s already it. Apart from the verbosity of the comparator (which we’ll certainly address in a future jOOλ version), writing a window function is a piece of cake.

What else can we do?

This article is not a complete description of all we can do with the new API. We’ll soon write a follow-up blog post with additional examples. For instance:

  • The PARTITION BY clause wasn’t described, but it is available too
  • You can specify many more windows than the single window exposed here, each with individual PARTITION BY, ORDER BY and frame specifications

Also, the current implementation is rather canonical, i.e. it doesn’t (yet) cache aggregations:

  • for unordered / unframed windows (where the aggregation yields the same value for the whole partition)
  • for strictly ascendingly framed windows (where the aggregation could build on the previous value, for associative collectors like SUM() or toString())

That’s it from our part. Download jOOλ from https://github.com/jOOQ/jOOL, play around with it, and enjoy the fact that the most awesome SQL feature is now available for all of you Java 8 developers!

A Curious Incidence of a jOOQ API Design Flaw


jOOQ is an internal domain-specific language (DSL), modelling the SQL language (external DSL) in Java (the host language). The main mechanism of the jOOQ API is described in this popular article:

The Java Fluent API Designer Crash Course.

Anyone can implement an internal DSL in Java (or in most other host languages) according to the rules from that article.

An example SQL language feature: BOOLEANs

One of the nice things about the SQL language, however, is the BOOLEAN type, which was introduced late into the language, with SQL:1999. Sure, without booleans, you can model TRUE and FALSE values via 1 and 0, and transform predicates into such values using CASE:

CASE WHEN A = B THEN 1 ELSE 0 END

But with true BOOLEAN support, you can do awesome queries like the following PostgreSQL query that is run against the Sakila database:

SELECT
  f.title, 
  string_agg(a.first_name, ', ') AS actors
FROM film AS f
JOIN film_actor AS fa USING (film_id)
JOIN actor AS a USING (actor_id)
GROUP BY film_id
HAVING every(a.first_name LIKE '%A%')

The above yields:

TITLE                    ACTORS
-----------------------------------------------------
AMISTAD MIDSUMMER        CARY, DARYL, SCARLETT, SALMA
ANNIE IDENTITY           CATE, ADAM, GRETA
ANTHEM LUKE              MILLA, OPRAH
ARSENIC INDEPENDENCE     RITA, CUBA, OPRAH
BIRD INDEPENDENCE        FAY, JAYNE
...

In other words, we’re looking for all the films where all the actors who played in the film contain the letter “A” in their first names. This is done via an aggregation on the boolean expression / predicate first_name LIKE '%A%':

HAVING every(a.first_name LIKE '%A%')

Now, in the terms of the jOOQ API, this means we’ll have to provide overloads of the having() method that take different argument types, such as:

// These accept "classic" predicates
having(Condition... conditions);
having(Collection<? extends Condition> conditions);

// These accept a BOOLEAN type
having(Field<Boolean> condition);

Of course, these overloads are available for any API method that accepts predicates / boolean values, not just for the HAVING clause.

As mentioned before, since SQL:1999, jOOQ’s Condition and Field<Boolean> are really the same thing. jOOQ allows for converting between the two via explicit API:

Condition condition1 = FIRST_NAME.like("%A%");
Field<Boolean> field = field(condition1);
Condition condition2 = condition(field);

… and the overloads above make this conversion conveniently implicit.
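For instance (a hedged sketch, reusing the FIRST_NAME field from above), both of the following express the same HAVING predicate:

// Condition overload:
having(FIRST_NAME.like("%A%"));

// Field<Boolean> overload, converting via DSL.field(Condition):
having(field(FIRST_NAME.like("%A%")));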

So, what’s the problem?

The problem is that we thought it might be a good idea to add yet another convenience overload, the having(Boolean) method, with which constant, nullable BOOLEAN values could be introduced into the query. This can be useful when building dynamic SQL, or when commenting out some predicates:

DSL.using(configuration)
   .select()
   .from(TABLE)
   .where(true)
// .and(predicate1)
   .and(predicate2)
// .and(predicate3)
   .fetch();

The idea is that the WHERE keyword will never be commented out, regardless of which predicate you want to temporarily remove.
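Incidentally, a type-safe way to achieve the same effect without any Boolean overload is DSL.trueCondition(), as in this sketch:

DSL.using(configuration)
   .select()
   .from(TABLE)
   .where(DSL.trueCondition()) // an always-true placeholder Condition
// .and(predicate1)
   .and(predicate2)
// .and(predicate3)
   .fetch();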

Unfortunately, adding this overload introduced a nuisance to developers using IDE auto-completion. Consider the following two method calls:

// Using jOOQ API
Condition condition1 = FIRST_NAME.eq   ("ADAM");
Condition condition2 = FIRST_NAME.equal("ADAM");

// Using Object.equals (accident)
boolean b = FIRST_NAME.equals("ADAM");

By (accidentally) adding a letter “s” to the equal() method – mostly because of IDE autocompletion – the whole predicate expression changes semantics drastically, from a jOOQ expression tree element that can be used to generate SQL to an “ordinary” boolean value (which always yields false, obviously).

Prior to having added the last overload, this wasn’t a problem. The equals() method usage wouldn’t compile, as there was no applicable overload taking a Java boolean type.

// These accept "classic" predicates
having(Condition condition);
having(Condition... conditions);
having(Collection<? extends Condition> conditions);

// These accept a BOOLEAN type
having(Field<Boolean> condition);

// This method didn't exist prior to jOOQ 3.7
// having(Boolean condition);

After jOOQ 3.7, this accident went unnoticed in user code, as the compiler no longer complained, leading to wrong SQL.

Conclusion: Be careful when designing an internal DSL. You inherit the host language’s “flaws”

Java is “flawed” in that every type is guaranteed to inherit from java.lang.Object and, with it, its methods: getClass(), clone(), finalize(), equals(), hashCode(), toString(), notify(), notifyAll(), and wait().

In most APIs, this isn’t really that much of a problem. You don’t really need to re-use any of the above method names (please, don’t).

But when designing an internal DSL, these Object method names (just like the language keywords) limit you in your design space. This is particularly obvious in the case of equal(s).

We’ve learned our lesson: we’ve deprecated the having(Boolean) overload and all the similar overloads, and we will remove them again.
