Simulating Latency with SQL / JDBC

I’ve run across a fun little trick to simulate latency in your development environments when testing some SQL queries. Possible use-cases including to validate that backend latency won’t bring down your frontend, or that your UX is still bearable, etc.

The solution is PostgreSQL and Hibernate specific, though to doesn’t have to be. Besides, it uses a stored function to work around the limitations of a VOID function in PostgreSQL, but that can be worked around differently as well, without storing anything auxiliary to the catalog.

To remove the Hibernate dependency, you can just use the pg_sleep function directly using a NULL predicate, but don’t try it like this!

select 1
from t_book
-- Don't do this!
where pg_sleep(1) is not null;

This will sleep 1 second per row (!). As can be seen in the explain plan. Let’s limit to 3 rows to see:

explain analyze
select 1
from t_book
where pg_sleep(1) is not null
limit 3;

And the result is:

Limit  (cost=0.00..1.54 rows=3 width=4) (actual time=1002.142..3005.374 rows=3 loops=1)
   ->  Seq Scan on t_book  (cost=0.00..2.05 rows=4 width=4) (actual time=1002.140..3005.366 rows=3 loops=1)
         Filter: (pg_sleep('1'::double precision) IS NOT NULL)
 Planning Time: 2.036 ms
 Execution Time: 3005.401 ms

As you can see, the whole query took about 3 seconds for 3 rows. In fact, this is also what happens in Gunnar’s example from the tweet, except that he was filtering by ID, which “helps” hide this effect.

We can use what Oracle calls scalar subquery caching, the fact that a scalar subquery can be reasonably expected to be side-effect free (despite the obvious side-effect of pg_sleep), meaning that some RDBMS cache its result per query execution.

explain analyze
select 1
from t_book
where (select pg_sleep(1)) is not null
limit 3;

The result is now:

Limit  (cost=0.01..1.54 rows=3 width=4) (actual time=1001.177..1001.178 rows=3 loops=1)
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=1001.148..1001.148 rows=1 loops=1)
   ->  Result  (cost=0.00..2.04 rows=4 width=4) (actual time=1001.175..1001.176 rows=3 loops=1)
         One-Time Filter: ($0 IS NOT NULL)
         ->  Seq Scan on t_book  (cost=0.00..2.04 rows=4 width=0) (actual time=0.020..0.021 rows=3 loops=1)
 Planning Time: 0.094 ms
 Execution Time: 1001.223 ms

We’re now getting the desired one-time filter. However, I don’t really like this hack, because it depends on an optimisation, which is optional, not a formal guarantee. This may be good enough for a quick simulation of latency, but don’t depend on this kind of optimisation in production lightheartedly.

Another approach that seems to guarantee this behaviour would be to use a MATERIALIZED CTE:

with s (x) as materialized (select pg_sleep(1))
select *
from t_book
where (select x from s) is not null;

I’m now again using a scalar subquery, because I somehow need to access the CTE, and I don’t want to place it in the FROM clause, where it would impact my projection.

The plan being:

Result  (cost=0.03..2.07 rows=4 width=943) (actual time=1001.289..1001.292 rows=4 loops=1)
   One-Time Filter: ($1 IS NOT NULL)
   CTE s
     ->  Result  (...) (actual time=1001.262..1001.263 rows=1 loops=1)
   InitPlan 2 (returns $1)
     ->  CTE Scan on s  (cost=0.00..0.02 rows=1 width=4) (actual time=1001.267..1001.268 rows=1 loops=1)
   ->  Seq Scan on t_book  (cost=0.03..2.07 rows=4 width=943) (actual time=0.015..0.016 rows=4 loops=1)
 Planning Time: 0.049 ms
 Execution Time: 1001.308 ms

Again, containing a one-time filter, which is what we want here.

Using a JDBC based approach

If your application is JDBC based, you don’t have to simulate the latency by tweaking the query. You can simply proxy JDBC in one way or another. Let’s look at this little program:

try (Connection c1 = db.getConnection()) {

    // A Connection proxy that intercepts preparedStatement() calls
    Connection c2 = new DefaultConnection(c1) {
        public PreparedStatement prepareStatement(String sql) 
        throws SQLException {
            return super.prepareStatement(sql);

    long time = System.nanoTime();
    String sql = "SELECT id FROM book";

    // This call now has a 1 second "latency"
    try (PreparedStatement s = c2.prepareStatement(sql);
        ResultSet rs = s.executeQuery()) {
        while (

    System.out.println("Time taken: " + 
       (System.nanoTime() - time) / 1_000_000L + "ms");


public static void sleep(long time) {
    try {
    catch (InterruptedException e) {

For simplicity reasons, this uses jOOQ’s DefaultConnection which acts as a proxy, conveniently delegating all the methods to some delegate connection, allowing for overriding only specific methods. The output of the program is:

Time taken: 1021ms

This simulates the latency on the prepareStatement() event. Obviously, you’d be extracting the proxying into some utility in order not to clutter your code. You could even proxy all your queries in development and enable the sleep call only based on a system property.

Alternatively, we could also simulate it on the executeQuery() event:

try (Connection c = db.getConnection()) {
    long time = System.nanoTime();

    // A PreparedStatement proxy intercepting executeQuery() calls
    try (PreparedStatement s = new DefaultPreparedStatement(
        c.prepareStatement("SELECT id FROM t_book")
    ) {
        public ResultSet executeQuery() throws SQLException {
            return super.executeQuery();

        // This call now has a 1 second "latency"
        ResultSet rs = s.executeQuery()) {
        while (

    System.out.println("Time taken: " +
        (System.nanoTime() - time) / 1_000_000L + "ms");

This is now using the jOOQ convenience class DefaultPreparedStatement. If you need these, just add the jOOQ Open Source Edition dependency (there’s nothing RDBMS specific in these classes), with any JDBC based application, including Hibernate:


Alternatively, just copy the sources of the classes DefaultConnection or DefaultPreparedStatement if you don’t need the entire dependency, or you just proxy the JDBC API yourself.

A jOOQ based solution

If you’re already using jOOQ (and you should be!), you can do this even more easily, by implementing an ExecuteListener. Our program would now look like this:

try (Connection c = db.getConnection()) {
    DSLContext ctx = DSL.using(new DefaultConfiguration()
        .set(new CallbackExecuteListener()
            .onExecuteStart(x -> sleep(1000L))

    long time = System.nanoTime();
    System.out.println(ctx.fetch("SELECT id FROM t_book"));
    System.out.println("Time taken: " +
        (System.nanoTime() - time) / 1_000_000L + "ms");

Still the same result:

|id  |
|1   |
|2   |
|3   |
|4   |
Time taken: 1025ms

The difference is that with a single intercepting callback, we can now add this sleep to all types of statements, including prepared statements, static statements, statements returning result sets, or update counts, or both.

Reactive Database Access – Part 1 – Why “Async”

Notice that the examples in this article may be outdated, as Typesafe’s Activator works differently now. The blog post will not be maintained to provide up-to-date Activator examples.
We’re very happy to announce a guest post series on the jOOQ blog by Manuel Bernhardt. In this blog series, Manuel will explain the motivation behind so-called reactive technologies and after introducing the concepts of Futures and Actors use them in order to access a relational database in combination with jOOQ. manuel-bernhardtManuel Bernhardt is an independent software consultant with a passion for building web-based systems, both back-end and front-end. He is the author of “Reactive Web Applications” (Manning) and he started working with Scala, Akka and the Play Framework in 2010 after spending a long time with Java. He lives in Vienna, where he is co-organiser of the local Scala User Group. He is enthusiastic about the Scala-based technologies and the vibrant community and is looking for ways to spread its usage in the industry. He’s also scuba-diving since age 6, and can’t quite get used to the lack of sea in Austria. This series is split in three parts, which we’ll publish over the next month:


The concept of Reactive Applications is getting increasingly popular these days and chances are that you have already heard of it someplace on the Internet. If not, you could read the Reactive Manifesto or we could perhaps agree to the following simple summary thereof: in a nutshell, Reactive Applications are applications that:
  • make optimal use of computational resources (in terms of CPU and memory usage) by leveraging asynchronous programming techniques
  • know how to deal with failure, degrading gracefully instead of, well, just crashing and becoming unavailable to their users
  • can adapt to intensive workloads, scaling out on several machines / nodes as load increases (and scaling back in)
Reactive Applications do not exist merely in a wild, pristine green field. At some point they will need to store and access data in order to do something meaningful and chances are that the data happens to live in a relational database.

Latency and database access

When an application talks to a database more often than not the database server is not going to be running on the same server as the application. If you’re unlucky it might even be that the server (or set of servers) hosting the application live in a different data centre than the database server. Here is what this means in terms of latency: Latency numbers every programmer should know Say that you have an application that runs a simple SELECT query on its front page (let’s not debate whether or not this is a good idea here). If your application and database servers live in the same data centre, you are looking at a latency of the order of 500 µs (depending on how much data comes back). Now compare this to all that your CPU could do during that time (all those green and black squares on the figure above) and keep this in mind – we’ll come back to it in just a minute.

The cost of threads

Let’s suppose that you run your welcome page query in a synchronous manner (which is what JDBC does) and wait for the result to come back from the database. During all of this time you will be monopolizing a thread that waits for the result to come back. A Java thread that just exists (without doing anything at all) can take up to 1 MB of stack memory, so if you use a threaded server that will allocate one thread per user (I’m looking at you, Tomcat) then it is in your best interest to have quite a bit of memory available for your application in order for it to still work when featured on Hacker News (1 MB / concurrent user). Reactive applications such as the ones built with the Play Framework make use of a server that follows the evented server model: instead of following the “one user, one thread” mantra it will treat requests as a set of events (accessing the database would be one of these events) and run it through an event loop: Evented server and its event loop Such a server will not use many threads. For example, default configuration of the Play Framework is to create one thread per CPU core with a maximum of 24 threads in the pool. And yet, this type of server model can deal with many more concurrent requests than the threaded model given the same hardware. The trick, as it turns out, is to hand over the thread to other events when a task needs to do some waiting – or in other words: to program in an asynchronous fashion.

Painful asynchronous programming

Asynchronous programming is not really new and programming paradigms for dealing with it have been around since the 70s and have quietly evolved since then. And yet, asynchronous programming is not necessarily something that brings back happy memories to most developers. Let’s look at a few of the typical tools and their drawbacks.


Some languages (I’m looking at you, Javascript) have been stuck in the 70s with callbacks as their only tool for asynchronous programming up until recently (ECMAScript 6 introduced Promises). This is also knows as christmas tree programming: Christmas tree of hell Ho ho ho.


As a Java developer, the word asynchronous may not necessarily have a very positive connotation and is often associated to the infamous synchronized keyword:
Working with threads in Java is hard, especially when using mutable state – it is so much more convenient to let an underlying application server abstract all of the asynchronous stuff away and not worry about it, right? Unfortunately as we have just seen this comes at quite a hefty cost in terms of performance. And I mean, just look at this stack trace: Abstraction is the mother of all trees In one way, threaded servers are to asynchronous programming what Hibernate is to SQL – a leaky abstraction that will cost you dearly on the long run. And once you realize it, it is often too late and you are trapped in your abstraction, fighting by all means against it in order to increase performance. Whilst for database access it is relatively easy to let go of the abstraction (just use plain SQL, or even better, jOOQ), for asynchronous programming the better tooling is only starting to gain in popularity. Let’s turn to a programming model that finds its roots in functional programming: Futures.

Futures: the SQL of asynchronous programming

Futures as they can be found in Scala leverage functional programming techniques that have been around for decades in order to make asynchronous programming enjoyable again.

Future fundamentals

A scala.concurrent.Future[T] can be seen as a box that will eventually contain a value of type T if it succeeds. If it fails, theThrowable at the origin of the failure will be kept. A Future is said to have succeeded once the computation it is waiting for has yielded a result, or failed if there was an error during the computation. In either case, once the Future is done computing, it is said to be completed. Welcome to the Futures As soon as a Future is declared, it will start running, which means that the computation it tries to achieve will be executed asynchronously. For example, we can use the WS library of the Play Framework in order to execute a GET request against the Play Framework website:

val response: Future[WSResponse] = 

This call will return immediately and lets us continue to do other things. At some point in the future, the call may have been executed, in which case we could access the result to do something with it. Unlike Java’s java.util.concurrent.Future<V>which lets one check whether a Future is done or block while retrieving it with the get() method, Scala’s Future makes it possible to specify what to do with the result of an execution.

Transforming Futures

Manipulating what’s inside of the box is easy as well and we do not need to wait for the result to be available in order to do so:

val response: Future[WSResponse] = 

val siteOnline: Future[Boolean] = { r =>
    r.status == 200

siteOnline.foreach { isOnline =>
  if(isOnline) {
    println("The Play site is up")
  } else {
    println("The Play site is down")

In this example, we turn our Future[WSResponse] into a Future[Boolean] by checking for the status of the response. It is important to understand that this code will not block at any point: only when the response will be available will a thread be made available for the processing of the response and execute the code inside of the map function.

Recovering failed Futures

Failure recovery is quite convenient as well:

val response: Future[WSResponse] =

val siteAvailable: Future[Option[Boolean]] = { r =>
    Some(r.status == 200)
  } recover {
    case ce: => None

At the very end of the Future we call the recover method which will deal with a certain type of exception and limit the dammage. In this example we are only handling the unfortunate case of a by returning a Nonevalue.

Composing Futures

The killer feature of Futures is their composeability. A very typical use-case when building asynchronous programming workflows is to combine the results of several concurrent operations. Futures (and Scala) make this rather easy:

def siteAvailable(url: String): Future[Boolean] =
  WS.url(url).get().map { r =>
    r.status == 200

val playSiteAvailable =

val playGithubAvailable =

val allSitesAvailable: Future[Boolean] = for {
  siteAvailable <- playSiteAvailable
  githubAvailable <- playGithubAvailable
} yield (siteAvailable && githubAvailable)

The allSitesAvailable Future is built using a for comprehension which will wait until both Futures have completed. The two Futures playSiteAvailable and playGithubAvailable will start running as soon as they are being declared and the for comprehension will compose them together. And if one of those Futures were to fail, the resulting Future[Boolean] would fail directly as a result (without waiting for the other Future to complete). This is it for the first part of this series. In the next post we will look at another tool for reactive programming and then finally at how to use those tools in combination in order to access a relational database in a reactive fashion.

Read on

Stay tuned as we’ll publish Parts 2 and 3 shortly as a part of this series: