Querying Your Database from Millions of Fibers (Rather than Thousands of Threads)


jOOQ is a great way to do SQL in Java, and Quasar fibers bring much-improved concurrency.

We’re excited to announce another very interesting guest post on the jOOQ Blog by Fabio Tudone from Parallel Universe.

Parallel Universe develops an open-source stack that allows developers to easily code extremely concurrent applications on the JVM. With the Parallel Universe stack you build software that works in harmony with modern hardware rather than fighting it at every turn, while keeping your programming language and your simple, familiar programming style.

Fabio Tudone develops and maintains Quasar integration modules as part of the Comsat project. He was part of, and then led, the development of a cloud-based enterprise content governance platform for several years before joining the Parallel Universe team, and he has been writing mostly JVM software throughout his professional career. His interests include Dev and DevOps practices, scalability, concurrent and functional programming, as well as runtime platforms. Naturally curious and leaning towards exploration, he enjoys gathering knowledge and understanding from people, places and cultures. He’s also interested in awareness practices and likes writing all sorts of stuff.

Quasar features an integration for JDBC and jOOQ as part of the Comsat project, so let’s have a look inside the box.

JDBC, jOOQ and Quasar

comsat-jdbc provides a fiber-blocking wrapper of the JDBC API, so that you can use your connection inside fibers rather than regular Java threads.

Why would you do that? Because fibers are lightweight threads and you can have many more fibers than threads in your running JVM. “Many more” means we’re talking millions versus a handful of thousands.

This means that you have a lot more concurrency capacity in your system to do other things in parallel while you wait for JDBC execution, be it concurrent or parallel calculations (like exchanging actor messages in your highly reliable Quasar Erlang-like actor system) or fiber-blocking I/O (e.g. serving web requests, invoking micro-services, reading files through fiber NIO or accessing other fiber-enabled data sources like MongoDB).

If your DB can stand it and a few more regular threads won’t blow up your system (yet), you can even increase the size of your fiber-JDBC thread pool (see “Where’s the waiting line?” later) and send more concurrent jOOQ commands.

Since jOOQ uses JDBC connections to access the database, having jOOQ run on fibers is as easy as bringing in the comsat-jooq dependency and handing your fiber-enabled JDBC connection to the jOOQ context:

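import java.sql.Connection;
import org.jooq.DSLContext;
import co.paralleluniverse.fibers.jdbc.FiberDataSource; // from comsat-jdbc
import static org.jooq.impl.DSL.*;

// ...

// Wrap the regular DataSource so that JDBC calls block the fiber, not the thread
Connection conn = FiberDataSource.wrap(dataSource)
                                 .getConnection();
// Hand the fiber-enabled connection to jOOQ
DSLContext create = using(conn);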

// ...

Of course you can also configure a ConnectionProvider to fetch connections from your FiberDataSource.

From this moment on you can use regular jOOQ and everything will happen in fiber-blocking mode rather than thread-blocking. That’s it.

No, really, there’s absolutely nothing more to it: you keep using the excellent jOOQ, only with much more efficient fibers rather than threads. Quasar is a good citizen and won’t force you into a new API (which is nice especially when the original one is great already).

Since the JVM at present supports neither native green threads nor continuations, which can be used to implement lightweight threads, Quasar implements continuations (and fibers on top of them) via bytecode instrumentation. This can be done at compile time, but often it’s just more convenient to use Quasar’s agent (especially when instrumenting third-party libraries), so here’s an example Gradle project based on Dropwizard that also includes the Quasar agent setup. (Don’t forget about Capsule, a really great Java deployment tool for every need, which, needless to say, makes using Quasar and agents in general a breeze.)

The example doesn’t use all of the jOOQ features; rather, it sticks to the SQL-building use case (both for querying and for CRUD), but you’re encouraged to change it to suit your needs. The without-comsat branch contains a thread-blocking version so you can compare and see the (minimal) differences with the Comsat version.

Where’s the waiting line?

You might be wondering now: OK, but JDBC is a thread-blocking API, so how can Quasar turn it into a fiber-blocking one? Because JDBC doesn’t have an asynchronous mode, Quasar uses a thread pool behind the scenes: fibers dispatch JDBC operations to it, and when a JDBC operation completes, the pool unfreezes the waiting fiber and schedules it for resumption (have a look at Quasar’s integration patterns for more info).
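The dispatch-and-resume idea can be sketched with nothing but the JDK. This is not Quasar's implementation: a CompletableFuture stands in for the parked fiber, and the names BlockingBridge and slowQuery are hypothetical, introduced only for illustration. The point is simply that the blocking call runs on a dedicated pool thread while the caller stays free until the result is ready.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch (not Quasar code): hands thread-blocking calls to a
// dedicated pool so that the caller itself is never blocked on them.
final class BlockingBridge implements AutoCloseable {
    private final ExecutorService pool;

    BlockingBridge(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Runs the blocking call on a pool thread. The returned future completes
    // when the call finishes, which is the moment a fiber would be resumed.
    <T> CompletableFuture<T> dispatch(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, pool);
    }

    @Override
    public void close() {
        pool.shutdown();
    }

    // Stand-in for a JDBC round trip: it just sleeps and returns a string.
    static String slowQuery(String sql) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result of " + sql;
    }

    public static void main(String[] args) {
        try (BlockingBridge bridge = new BlockingBridge(2)) {
            CompletableFuture<String> pending =
                    bridge.dispatch(() -> slowQuery("SELECT 1"));
            // The caller could do other work here while the pool thread waits.
            System.out.println(pending.join());
        }
    }
}
```

With Quasar the caller would not even see a future: the fiber is suspended inside the regular JDBC call and resumed transparently, which is why the blocking API stays unchanged.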

Yes, here’s the nasty waiting line: JDBC commands waiting to be executed by the thread pool. Although you’re not improving DB parallelism beyond your JDBC thread-pool size, you’re not hurting your fibers either, even though you’re still using a simple and familiar blocking API. You can still have millions of fibers.
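To make the waiting line concrete, here is a small JDK-only sketch under stated assumptions: a short sleep plays the role of a JDBC command, and WaitingLineDemo and peakInFlight are hypothetical names. However many callers submit work, the number of commands running at the same time never exceeds the pool size; the rest simply queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: measures how many simulated commands run at once.
final class WaitingLineDemo {

    // Submits `callers` simulated commands to a pool of `poolSize` threads
    // and returns the highest number observed running simultaneously.
    static int peakInFlight(int callers, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < callers; i++) {
                pending.add(CompletableFuture.runAsync(() -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(2); // stand-in for the database round trip
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }, pool));
            }
            // Wait until every queued command has been executed.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 1,000 callers, 4 pool threads: the peak cannot exceed 4.
        System.out.println("peak in-flight commands: " + peakInFlight(1_000, 4));
    }
}
```

Raising the pool size moves the limit, but only up to what the database and the extra threads can sustain.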

Is it possible to improve the overall situation? Without a standard asynchronous Java RDBMS API there isn’t much we can do. However, this may not matter at all if the database is your bottleneck. There are several nice posts and discussions about this topic and the argument amounts to deciding where you want to move the waiting line.

Bonus: how does this neat jOOQ integration work under the cover?

At present Quasar needs the developer (or integrator) to tell it what to instrument, although fully automatic instrumentation is in the works (this feature depends on some minor JRE changes that won’t be released before Java 9). If you can conveniently alter the source code (or the compiled classes), then it’s enough to annotate methods with @Suspendable or to declare that they throw SuspendExecution, but this is usually not the case with libraries. Methods with fixed, well-known names can instead be listed in META-INF/suspendables (for concrete methods) and META-INF/suspendable-supers (for abstract or interface methods that can have suspendable implementations).

If there are a lot of them (or there’s code generation involved), you can write a SuspendableClassifier to ship with your integration and register it with Quasar’s SPI to provide additional instrumentation logic (see jOOQ’s). A SuspendableClassifier’s job is to examine signature information about each and every method in your runtime classpath during the instrumentation phase and tell whether it’s suspendable, whether it can have suspendable implementations, whether neither is the case for sure, or whether it doesn’t know (some other classifier could perhaps say “suspendable” or “suspendable-super” later on).
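The four possible answers can be modeled in a few lines of plain Java. The sketch below is a deliberately simplified, hypothetical model and does not use Quasar's real SPI types or signatures: SuspendableModel, Verdict, MethodClassifier, Chain and byPackagePrefix are all made-up names, null stands for "don't know", and letting the first definite answer win is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of method classification (NOT Quasar's API).
final class SuspendableModel {

    // The three definite answers; "don't know" is represented by null.
    enum Verdict { SUSPENDABLE, SUSPENDABLE_SUPER, NON_SUSPENDABLE }

    interface MethodClassifier {
        // Returns a verdict, or null when this classifier has no opinion.
        Verdict classify(String className, String methodName, boolean isAbstract);
    }

    // Asks each classifier in turn; the first non-null verdict wins (assumption).
    static final class Chain {
        private final List<MethodClassifier> classifiers;

        Chain(MethodClassifier... classifiers) {
            this.classifiers = Arrays.asList(classifiers);
        }

        Verdict classify(String className, String methodName, boolean isAbstract) {
            for (MethodClassifier c : classifiers) {
                Verdict v = c.classify(className, methodName, isAbstract);
                if (v != null) {
                    return v;
                }
            }
            return null; // nobody knew
        }
    }

    // Example rule: every method of classes under `prefix` is treated as
    // suspendable; abstract ones only as "can have suspendable implementations".
    static MethodClassifier byPackagePrefix(String prefix) {
        return (className, methodName, isAbstract) -> {
            if (!className.startsWith(prefix)) {
                return null; // outside our library: leave it to other classifiers
            }
            return isAbstract ? Verdict.SUSPENDABLE_SUPER : Verdict.SUSPENDABLE;
        };
    }

    public static void main(String[] args) {
        Chain chain = new Chain(byPackagePrefix("org.example.db."));
        System.out.println(chain.classify("org.example.db.Query", "execute", false));
        System.out.println(chain.classify("java.lang.String", "length", false));
    }
}
```

A rule-based classifier like this is what makes sense when a library has too many methods, or generated ones, to list by hand in the META-INF files.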

Summing it all up

Well… Just enjoy the excellent jOOQ on efficient fibers!
