(Ab)using Java 8 FunctionalInterfaces as Local Methods

If you’re programming in more advanced languages like Scala or Ceylon, or even JavaScript, “nested functions” or “local functions” are a very common idiom to you. For instance, you’ll write things like fibonacci functions as such:

def f() = {
  def g() = "a string!"
  g() + "– says g"
}

(Question from Stack Overflow by Aaron Yodaiken)

The f() function contains a nested g() function, which is local to the scope of the outer f() function.

In Java, there is no way to create a local function like this, but you can assign a lambda expression to a local variable and use that instead.

The above example can be translated to the following Java code:

String f() {
    Supplier<String> g = () -> "a string!";
    return g.get() + "- says g";
}

While this example is rather trivial, a much more useful use-case is testing. For instance, consider the following jOOλ unit test, which checks whether the Stream.close() semantics is properly implemented across all sorts of jOOλ Seq methods, that combine two streams into one:

@Test
public void testCloseCombineTwoSeqs() {
    Consumer<BiFunction<Stream<Integer>, Stream<Integer>, Seq<?>>> test = f -> {
        AtomicBoolean closed1 = new AtomicBoolean();
        AtomicBoolean closed2 = new AtomicBoolean();
        
        Stream s1 = Stream.of(1, 2).onClose(() -> closed1.set(true));
        Stream s2 = Stream.of(3).onClose(() -> closed2.set(true));
        
        try (Seq s3 = f.apply(s1, s2)) {
            s3.collect(Collectors.toList());
        }

        assertTrue(closed1.get());
        assertTrue(closed2.get());
    };
    
    test.accept((s1, s2) -> seq(s1).concat(s2));
    test.accept((s1, s2) -> seq(s1).crossJoin(s2));
    test.accept((s1, s2) -> seq(s1).innerJoin(s2, (a, b) -> true));
    test.accept((s1, s2) -> seq(s1).leftOuterJoin(s2, (a, b) -> true));
    test.accept((s1, s2) -> seq(s1).rightOuterJoin(s2, (a, b) -> true));
}

The local function is test, and it takes two Stream<Integer> arguments, producing a Seq<?> result.

Why not just write a private method?

Of course, this could have been solved with a private method as well, classic Java style. But sometimes, using a local scope is much more convenient, as the test Consumer (local function) does not escape the scope of this single unit test. It should be used only within this single method.

An alternative, more classic Java way would have been to define a local class, instead, and put the function inside of that. But this solution is much more lean.

One disadvantage, however, is that recursion is much harder to implement this way, in Java. See also:
http://stackoverflow.com/q/19429667/521799

Reactive Database Access – Part 3 – Using jOOQ with Scala, Futures and Actors

We’re very happy to continue our a guest post series on the jOOQ blog by Manuel Bernhardt. In this blog series, Manuel will explain the motivation behind so-called reactive technologies and after introducing the concepts of Futures and Actors use them in order to access a relational database in combination with jOOQ.

manuel-bernhardtManuel Bernhardt is an independent software consultant with a passion for building web-based systems, both back-end and front-end. He is the author of “Reactive Web Applications” (Manning) and he started working with Scala, Akka and the Play Framework in 2010 after spending a long time with Java. He lives in Vienna, where he is co-organiser of the local Scala User Group. He is enthusiastic about the Scala-based technologies and the vibrant community and is looking for ways to spread its usage in the industry. He’s also scuba-diving since age 6, and can’t quite get used to the lack of sea in Austria.

This series is split in three parts, which we have published over the past months:

Introduction

In the previous two posts of this series we have introduced the benefits of reactive programming as well as two tools available for manipulating them, Futures and Actors. Now it is time to get your hands dirty, dear reader, and to create a simple application featuring reactive database access. Fear not, I will be there along the whole way to guide you through the process.

Also, the source code of this example is available on Github

Getting the tools

In order to build the application, you will need a few tools. If you haven’t worked with Scala yet, the simplest for you may be to go and grab the Typesafe Activator which is a standalone project that brings in the necessary tools to build a Scala project from scratch.

Since this is about reactive database access, you will also need a database. For the purpose of this simple example, we’re going to use Oracle Database 12c Enterprise Edition. Nah, just kidding – it might be a bit cumbersome to get this one to run on your machine. Instead we will use the excellent PostgreSQL. Make sure to install it so that you can run the psql utility from your console.

Ready? Great! Let’s have a look at what we’re going to build.

The application

The goal of our application is to fetch mentions from Twitter and store them locally so as to be able to visualize them and perform analytics on them.

687474703a2f2f6d616e75656c2e6265726e68617264742e696f2f77702d636f6e74656e742f4d656e74696f6e732e706e67

The core of this mechanism will be a MentionsFetcher actor which will periodically fetch mentions from Twitter and save them in the database. Once there we can display useful information on a view.

Creating the database

The first step we’re going to take is to create the database. Create a mentions.sql file somewhere with the following content:

CREATE USER "play" NOSUPERUSER INHERIT CREATEROLE;

CREATE DATABASE mentions WITH OWNER = "play" ENCODING 'UTF8';

GRANT ALL PRIVILEGES ON DATABASE mentions to "play";

\connect mentions play

CREATE TABLE twitter_user (
  id bigserial primary key,
  created_on timestamp with time zone NOT NULL,
  twitter_user_name varchar NOT NULL
);

CREATE TABLE mentions (
  id bigserial primary key,
  tweet_id varchar NOT NULL,
  user_id bigint NOT NULL,
  created_on timestamp with time zone NOT NULL,
  text varchar NOT NULL
);

This script will create a play user, a mentions database as well as two tables, twitter_user and mentions.

In order to execute it, execute the following command in a terminal:

psql -f mentions.sql

(note: you might need to explictly declare which user runs this command, depending on how you have configured PostgreSQL to run)

Bootstrapping the project

Let’s create the reactive-mentions project, shall we? Assuming that you have installed the activator, run the following command in a terminal:

~/workspace » activator new reactive-mentions

This will prompt a list of templates, we are going to use the play-scala project template:

Fetching the latest list of templates...

Browse the list of templates: http://typesafe.com/activator/templates
Choose from these featured templates or enter a template name:
  1) minimal-akka-java-seed
  2) minimal-akka-scala-seed
  3) minimal-java
  4) minimal-scala
  5) play-java
  6) play-scala
(hit tab to see a list of all templates)
> 6
OK, application "reactive-mentions" is being created using the "play-scala" template.
[...]

At this point, a simple Play Framework project has been created in the reactive-mentions directory. If you want to, you can run this project by navigating to it and running the command activator run.

In order to work on the project, you can use one of the many IDEs that have Scala support. My personal favourite is to this day IntelliJ IDEA which does a pretty good job at this and also has built-in support for the Play Framework itself.

Setting up jOOQ

I wrote about database access in Scala about 2 years ago. There are to this day still quite a few alternatives to relational database access in Scala but at least personally I have now reached the conclusion that for the type of projects I work on, jOOQ beats them all when it comes to writing type-safe SQL. So without further ado let’s integrate it with our project.

There is an SBT plugin available for this if you would like, however for this application we will settle for a minimal, hand-crafter solution.

Bring up the build.sbt file in an editor and add adjust the libraryDependencies to look like so:

libraryDependencies ++= Seq(
  jdbc,
  cache,
  ws,
  "org.postgresql" % "postgresql" % "9.4-1201-jdbc41",
  "org.jooq" % "jooq" % "3.7.0",
  "org.jooq" % "jooq-codegen-maven" % "3.7.0",
  "org.jooq" % "jooq-meta" % "3.7.0",
  specs2 % Test
)

If you are running the project’s console (which you can do by executing the activator command in the project’s directory) you will need to call the reload command in order for the new dependencies to be pulled in. This is true of any change you are doing to the build.sbt file. Don’t forget about it in the remainder of this article!

(note: make sure to use the version of the PostgreSQL driver that fits your version of PostgreSQL!)

Next, we need to set up jOOQ itself. For this purpose, create the file conf/mentions.xml, where conf is the directory used in the Play Framework for storing configuration-related files:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<configuration xmlns="http://www.jooq.org/xsd/jooq-codegen-3.7.0.xsd">
  <jdbc>
    <driver>org.postgresql.Driver</driver>
    <url>jdbc:postgresql://localhost/mentions</url>
    <user>play</user>
    <password></password>
  </jdbc>
  <generator>
    <name>org.jooq.util.ScalaGenerator</name>
    <database>
      <name>org.jooq.util.postgres.PostgresDatabase</name>
      <inputSchema>public</inputSchema>
      <includes>.*</includes>
      <excludes></excludes>
    </database>
    <target>
      <packageName>generated</packageName>
      <directory>target/scala-2.11/src_managed/main</directory>
    </target>
  </generator>
</configuration>

This configuration will allow to run jOOQ’s ScalaGenerator which will read the database schema and generate Scala specific classes for it, storing them in a directory accessible in the classpath and meant for generated sources.

All that is left to do is to create have a way to run jOOQ’s code generation. A simple solution that we are going to use is to create a custom SBT task in our project build. Go back to build.sbt and add the following at the end:

val generateJOOQ = taskKey[Seq[File]]("Generate JooQ classes")

val generateJOOQTask = (sourceManaged, fullClasspath in Compile, runner in Compile, streams) map { (src, cp, r, s) =>
  toError(r.run("org.jooq.util.GenerationTool", cp.files, Array("conf/mentions.xml"), s.log))
  ((src / "main/generated") ** "*.scala").get
}

generateJOOQ <<= generateJOOQTask


unmanagedSourceDirectories in Compile += sourceManaged.value / "main/generated"

The generateJOOQ task will run the GenerationTool using the mentions.xml file we have set-up earlier on. Let’s run it!

Start the SBT console by running the activator command in your terminal window, in the reactive-streams directory, and then run the generateJOOQ command:

[reactive-mentions] $ generateJOOQ
[info] Running org.jooq.util.GenerationTool conf/mentions.xml
[success] Total time: 1 s, completed Dec 11, 2015 2:55:08 PM

That’s it! If you want a bit more verbosity, add the following logger configuration to conf/logback.xml:

 <logger name="org.jooq" level="INFO" />

Alright, we are now ready to get to the core of our endaveour: create the actor that will pull mentions from Twitter!

Creating the MentionsFetcher actor

For the purpose of fetching mentions at regular intervals from Twitter, we will be using a simple Akka actor. Actors are meant to do a lot more powerful things than this but for the sake of introducing the concept this example will do (or so I hope).

Go ahead and add Akka as well as its logging facility as library dependencies in build.sbt:

libraryDependencies ++= Seq(
  ...
  "com.typesafe.akka" %% "akka-actor" % "2.4.1",
  "com.typesafe.akka" %% "akka-slf4j" % "2.4.1"
)

Next, create the file app/actors/MentionsFetcher.scala with the following content:

package actors

import actors.MentionsFetcher._
import akka.actor.{ActorLogging, Actor}
import org.joda.time.DateTime
import scala.concurrent.duration._

class MentionsFetcher extends Actor with ActorLogging {

  val scheduler = context.system.scheduler.schedule(
    initialDelay = 5.seconds,
    interval = 10.minutes,
    receiver = self,
    message = CheckMentions
  )

  override def postStop(): Unit = {
    scheduler.cancel()
  }

  def receive = {
    case CheckMentions => checkMentions
    case MentionsReceived(mentions) => storeMentions(mentions)
  }

  def checkMentions = ???

  def storeMentions(mentions: Seq[Mention]) = ???

}

object MentionsFetcher {

  case object CheckMentions
  case class Mention(id: String, created_at: DateTime, text: String, from: String, users: Seq[User])
  case class User(handle: String, id: String)
  case class MentionsReceived(mentions: Seq[Mention])

}

The first thing you may notice from this code is the unimplemented methods fetchMentions and storeMentions with the triple question mark ???. That’s actually valid Scala syntax: it is a method available by default which throws ascala.NotImplementedError.

The second thing I want you to notice is the companion object to the MentionsFetcher class which holds the protocol of our actor. Actors communicate using messages and even though our actor will only communicate with itself in this example it is a good idea to place it in a companion object and to import its members (via the wildcard import import actors.MentionsFetcher._) so as to keep things organized as the project grows.

Other than this, what we are doing for the moment is quite simple: we are setting up a scheduler that wakes up every 10 minutes in order to send the actor it-self the FetchMentions message. Upon receiving this message in the main receivemethod we are going to proceed to fetching the mentions from Twitter. Finally when a MentionsReceived message is received, we simply invoke the storeMentions method.

Simple enough, isn’t it? Don’t worry, things are about to get a little bit more complicated.

Fetching the mentions from Twitter

Twitter does not have an API that lets us directly fetch recent mentions. However it has an API that lets us search for Tweets and that will have to do.

Before you can go any further, if you intend to run this project, you will need to get yourself a set of keys and access tokens at apps.twitter.com. If you don’t you will have to trust me that the following works.

Once you have them, add them in the file conf/application.conf like so:

# Twitter
twitter.apiKey="..."
twitter.apiSecret="..."
twitter.accessToken="..."
twitter.accessTokenSecret="..."

Then, create the credentials method in MentionsFetcher:

// ...
import play.api.Play
import play.api.Play.current
import play.api.libs.oauth.{RequestToken, ConsumerKey}

class MentionsFetcher extends Actor with ActorLogging {

  // ...

  def credentials = for {
    apiKey <- Play.configuration.getString("twitter.apiKey")
    apiSecret <- Play.configuration.getString("twitter.apiSecret")
    token <- Play.configuration.getString("twitter.accessToken")
    tokenSecret <- Play.configuration.getString("twitter.accessTokenSecret")
  } yield (ConsumerKey(apiKey, apiSecret), RequestToken(token, tokenSecret))

}

This will allow us to place a call to Twitter’s API using the correct OAuth credentials.

Next, let’s get ready to fetch those mentions:

// ...
import akka.pattern.pipe
import org.joda.time.DateTime
import scala.util.control.NonFatal

class MentionsFetcher extends Actor with ActorLogging {

  // ...

  var lastSeenMentionTime: Option[DateTime] = Some(DateTime.now)

  def checkMentions = {
      val maybeMentions = for {
        (consumerKey, requestToken) <- credentials
        time <- lastSeenMentionTime
      } yield fetchMentions(consumerKey, requestToken, "<yourTwitterHandleHere>", time)

      maybeMentions.foreach { mentions =>
        mentions.map { m =>
          MentionsReceived(m)
        } recover { case NonFatal(t) =>
          log.error(t, "Could not fetch mentions")
          MentionsReceived(Seq.empty)
        } pipeTo self
      }
  }

  def fetchMentions(consumerKey: ConsumerKey, requestToken: RequestToken, user: String, time: DateTime): Future[Seq[Mention]] = ???

Do you remember the pipe pattern we talked about in the previous post about Actors? Well, here it is again!

The call we are going to make against Twitter’s API is going to be asynchronous. In other words we will not simply get aSeq[Mention] but a Future[Seq[Mention]] to work with, and the best way to deal with that one is to send ourselves a message once the Future has completed with the contents of the result.

Since things can go wrong though we also need to think about error recovery which we do here by heroically logging out the fact that we could not fetch the mentions.

You may also notice that we have introduced a lastSeenMentionTime variable. This is the means by which we are going to keep in memory the timestamp of the last mention we have seen.

In order to go ahead, one thing we need to do is to use a more recent version of the async-http-library client since there is a bug in Play 2.4.x. Add the following dependency to build.sbt:

libraryDependencies ++= Seq(
  ...
  "com.ning" % "async-http-client" % "1.9.29"
)

Alright, now that we are all set, let’s finally fetch those mentions!

// ...
import scala.util.control.NonFatal
import org.joda.time.format.DateTimeFormat
import play.api.libs.json.JsArray
import play.api.libs.ws.WS
import scala.concurrent.Future


class MentionsFetcher extends Actor with ActorLogging {

  // ...

  def fetchMentions(consumerKey: ConsumerKey, requestToken: RequestToken, user: String, time: DateTime): Future[Seq[Mention]] = {
    val df = DateTimeFormat.forPattern("EEE MMM dd HH:mm:ss Z yyyy").withLocale(Locale.ENGLISH)

    WS.url("https://api.twitter.com/1.1/search/tweets.json")
      .sign(OAuthCalculator(consumerKey, requestToken))
      .withQueryString("q" -> s"@$user")
      .get()
      .map { response =>
        val mentions = (response.json \ "statuses").as[JsArray].value.map { status =>
          val id = (status \ "id_str").as[String]
          val text = (status \ "text").as[String]
          val from = (status \ "user" \ "screen_name").as[String]
          val created_at = df.parseDateTime((status \ "created_at").as[String])
          val userMentions = (status \ "entities" \ "user_mentions").as[JsArray].value.map { user =>
            User((user \ "screen_name").as[String], ((user \ "id_str").as[String]))
          }

          Mention(id, created_at, text, from, userMentions)

        }
        mentions.filter(_.created_at.isAfter(time))
    }
  }

}

Fetching the mentions is rather straightforward thanks to Play’s WebService library. We create a signed OAuth request using our credentials and run a HTTP GET request against the search API passing as query string the @userName which will (hopefully) give us a list of Tweets mentioning a user. Lastly we do only keep those mentions that are after our last check time. Since we check every 10 minutes and since the API only returns recent tweets, this should be doing fine (unless you are very popular on Twitter and get an insane amount of replies – but this really is your own fault, then).

Setting the ExecutionContext

If you try to compile the project now (using the compile command) you will be greeted with a few compilation errors complaining about a missing ExecutionContext. Futures are a way to abstract tasks and they need something to run them. The ExecutionContext is the missing bit which will schedule the tasks to be executed.

Since we are inside of an actor we can borrow the actor’s own dispatcher:

class MentionsFetcher extends Actor with ActorLogging {

  implicit val executionContext = context.dispatcher

  // ...
}

We’ll talk more about Execution Contexts later on when it comes to fine-tuning the connection with the database. For the moment let us focus on actually getting to talk with the database at all.

Setting up a reactive database connection

Configuring the database connection

In order to connect to the database, we will first need to configure the connection information in conf/application.conf like so:

// ...

db.default.driver="org.postgresql.Driver"
db.default.url="jdbc:postgresql://localhost/mentions?user=play"

Creating a helper class to access the database

Play’s Database API is letting us access the configured database. We now need to do two things:

  • use jOOQ (rather than plain JDBC) to talk with the database
  • make sure we are not going to jeopardize our application by blocking while waiting for the database interaction to happen (JDBC is blocking)

For this purpose we will wrap the database operations in a Future that will run on its own ExecutionContext rather than sharing the one used by the actor or by the Play application itself.

Go ahead and create the file app/database/DB.scala:

package database

import javax.inject.Inject

import akka.actor.ActorSystem
import org.jooq.{SQLDialect, DSLContext}
import org.jooq.impl.DSL
import play.api.db.Database

import scala.concurrent.{ExecutionContext, Future}

class DB @Inject() (db: Database, system: ActorSystem) {

  val databaseContext: ExecutionContext = system.dispatchers.lookup("contexts.database")

  def query[A](block: DSLContext => A): Future[A] = Future {
    db.withConnection { connection =>
      val sql = DSL.using(connection, SQLDialect.POSTGRES_9_4)
      block(sql)
    }
  }(databaseContext)

  def withTransaction[A](block: DSLContext => A): Future[A] = Future {
    db.withTransaction { connection =>
      val sql = DSL.using(connection, SQLDialect.POSTGRES_9_4)
      block(sql)
    }
  }(databaseContext)

}

We define two methods, query and withTransaction that:

  • use a Future block in order to wrap the underlying code as a Future, thus running it asynchronously
  • use a custom databaseContext ExecutionContext in order to execute this Future
  • initialze jOOQ’s DSLContext and give access to it in the body of the expected functions

The databaseContext ExectionContext is created using Akka’s configuration capabilities. We need to add the configuration of the database dispatcher in conf/application.conf:

contexts {
    database {
        fork-join-executor {
          parallelism-max = 9
        }
    }
}

The magic number 9 doesn’t come out of nowhere. Check the excellent explanation provided by the HikariCP connection pool about connection pool sizing for more details. Those considerations are also discussed in length in Chapters 5 and 7 of Reactive Web Applications.

Wiring everything using dependency injection

Next, let’s use Play’s built-in dependency injection mechanism in order to provide our MentionsFetcher actor with a DB class. Adjust the constructor of our MentionsFetcher actor in app/actors/MentionsFetcher.scala to look like so:

// ...
import javax.inject.Inject
import play.api.db.Database


class MentionsFetcher @Inject() (database: Database) extends Actor with ActorLogging { ... }

We just need one more thing in order to bootstrap our MentionsFetcher actor: let Play know that we want to use it.

For this purpose we will declare a module and leverage the plumbing that Play provides when it comes to interacting with Akka actors. At the end of MentionsFetcher.scala (or in a new file, if you like), declare the following MentionsFetcherModule:

import com.google.inject.AbstractModule
import play.api.libs.concurrent.AkkaGuiceSupport

class MentionsFetcherModule extends AbstractModule with AkkaGuiceSupport {
  def configure(): Unit =
    bindActor[MentionsFetcher]("fetcher")
}

Last but not least we need to tell Play that we would like to use this module. In conf/appliction.conf add the following line to do so:

play.modules.enabled += "actors.MentionsFetcherModule"

That’s it! When Play starts up it will initialize the enabled modules which in turn will lead to the actor being initialized.

We now can go ahead and use the database in order to store the fetched mentions.

Storing the mentions in the database

Thanks to jOOQ writing the statements for storing the mentions is rather easy. Since we do not want to risk storing users or mentions twice we will upsert them using the WHERE NOT EXISTS SQL clause. For the sake of recording as much data as possible we will also store all mentioned users of a Tweet.

// ...
import generated.Tables._
import org.jooq.impl.DSL._

class MentionsFetcher @Inject() (db: DB) extends Actor with ActorLogging {

  // ...

  def storeMentions(mentions: Seq[Mention]) = db.withTransaction { sql =>
    log.info("Inserting potentially {} mentions into the database", mentions.size)
    val now = new Timestamp(DateTime.now.getMillis)

    def upsertUser(handle: String) = {
      sql.insertInto(TWITTER_USER, TWITTER_USER.CREATED_ON, TWITTER_USER.TWITTER_USER_NAME)
        .select(
          select(value(now), value(handle))
            .whereNotExists(
              selectOne()
                .from(TWITTER_USER)
                .where(TWITTER_USER.TWITTER_USER_NAME.equal(handle))
            )
        )
        .execute()
    }

    mentions.foreach { mention =>
      // upsert the mentioning users
      upsertUser(mention.from)

      // upsert the mentioned users
      mention.users.foreach { user =>
        upsertUser(user.handle)
      }

      // upsert the mention
      sql.insertInto(MENTIONS, MENTIONS.CREATED_ON, MENTIONS.TEXT, MENTIONS.TWEET_ID, MENTIONS.USER_ID)
        .select(
          select(
            value(now),
            value(mention.text),
            value(mention.id),
            TWITTER_USER.ID
          )
            .from(TWITTER_USER)
            .where(TWITTER_USER.TWITTER_USER_NAME.equal(mention.from))
            .andNotExists(
              selectOne()
                .from(MENTIONS)
                .where(MENTIONS.TWEET_ID.equal(mention.id))
            )
        )
        .execute()
    }
  }

}

Et voilà! If you execute this code (and generate some mentions, or use an earlier timestamp for filtering) you will get some data into your database!

Let’s now query and display a few statistics in the browser.

Displaying the mentions

In order to show our mentions we will adjust the default view shown when launching the application as well as theApplication controller. Start by adjusting the template app/views/index.scala.html to look as follows:

@(mentionsCount: Int)

@main("Reactive mentions") {

    <p>You have been mentioned @mentionsCount times in the past days</p>

}

Next, edit the Application controller located in app/controllers/Application.scala:

package controllers

import java.sql.Timestamp
import javax.inject.Inject

import database.DB
import org.joda.time.DateTime
import play.api._
import play.api.mvc._

class Application @Inject() (db: DB) extends Controller {

  def index = Action.async { implicit request =>

    import generated.Tables._
    import org.jooq.impl.DSL._

    db.query { sql =>
      val mentionsCount = sql.select(
        count()
      ).from(MENTIONS)
       .where(
         MENTIONS.CREATED_ON.gt(value(new Timestamp(DateTime.now.minusDays(1).getMillis)))
       ).fetchOne(int.class)

      Ok(views.html.index(mentionsCount))
    }

  }

}

This time, we are using the query method that we have built in our DB helper. Since the result of this operation is a Future, we need to use the Action.async method of the Action which has a signature of the kind Request => Future[Response]. The execution of this query is performed by the custom ExecutionContext that we have set up for database operations and does not impede on the default ExecutionContext of the Play framework itself.

In case anything were to go wrong and the database operations were to hang on forever on the threads offered by that context, the rest of the application would not be affected (this principle is called “bulkheading” and is described a bit more in detail in Chapter 5 of Reactive Web Applications).

Conclusion

In this series we have explored the “Why?” of reactive applications and of asynchronous programming. In particular, we have talked about Futures and Actors, two powerful abstractions that make asynchronous programming easier to think about.

Most relational databases do not have asynchronous drivers available yet and even if there are some projects aiming at it I think it will still take some time before we’ll have a standard that will hopefully be implemented by many vendors. In the meanwhile we have seen that we can use a custom ExecutionContext in order to isolate otherwise blocking database operations.

687474703a2f2f6d616e75656c2e6265726e68617264742e696f2f77702d636f6e74656e742f4265726e68617264742d526561637469766557412d4d4541502d48492e6a706567

If you liked this series and are interested in learning more on the topic, consider checking out my book which provides an introductio to building reactive web applications on the JVM. Futures are covered in Chapter 5, Actors in Chapter 6 and Database Access in Chapter 7.

Read on

Read the previous chapters of this series:

Reactive Database Access – Part 2 – Actors

We’re very happy to continue our a guest post series on the jOOQ blog by Manuel Bernhardt. In this blog series, Manuel will explain the motivation behind so-called reactive technologies and after introducing the concepts of Futures and Actors use them in order to access a relational database in combination with jOOQ.

manuel-bernhardtManuel Bernhardt is an independent software consultant with a passion for building web-based systems, both back-end and front-end. He is the author of “Reactive Web Applications” (Manning) and he started working with Scala, Akka and the Play Framework in 2010 after spending a long time with Java. He lives in Vienna, where he is co-organiser of the local Scala User Group. He is enthusiastic about the Scala-based technologies and the vibrant community and is looking for ways to spread its usage in the industry. He’s also scuba-diving since age 6, and can’t quite get used to the lack of sea in Austria.

This series is split in three parts, which we’ll publish over the next month:

Introduction

In our last post we introduced the concept of reactive applications, explained the merits of asynchronous programming and introduced Futures, a tool for expressing and manipulating asynchronous values.

In this post we will look into another tool for building asynchronous programs based on the concept of message-driven communication: actors.

The Actor-based concurrency model was popularized by the Erlang programming language and its most popular implementation on the JVM is the Akka concurrency toolkit.

In one way, the Actor model is object-orientation done “right”: the state of an actor can be mutable, but it is never exposed directly to the outside world. Instead, actors communicate with each other on the basis of asynchronous message-passing in which the messages themselves are immutable. An actor can only do one of three things:

  • send and receive any number of messages
  • change its behaviour or state in response to a message arriving
  • start new child actors

It is always in the hands of an actor to decide what state it is ready to share, and when to mutate it. This model therefore makes it much easier for us humans to write concurrent programs that are not riddled with race-conditions or deadlocks that we may have introduced by accidentally reading or writing outdated state or using locks as a means to avoid the latter.

In what follows we are going to see how Actors work and how to combine them with Futures.

Actor fundamentals

Actors are lightweight objects that communicate with eachother by sending and receiving messages. Each actor has amailbox in which incoming messages are queued before they get processed.

Two actors talking to each other

Actors have different states: they can be started, resumed, stopped and restarted. Resuming or restarting an actor is useful when an actor crashes as we will see later on.

Actors also have an actor reference which is a means for one actor to reach another. Like a phone number, the actor reference is a pointer to an actor, and if the actor were to be restarted and replaced by a new incarnation in case of crash it would make no difference to other actors attempting to send messages to it since the only thing they know about the actor is its reference, not the identity of one particular incarnation.

Sending and receiving messages

Let’s start by creating a simple actor:

import akka.actor._

class Luke extends Actor {
  def receive = {
    case _ => // do nothing
  }
}

This is really all it takes to create an actor. But that’s not very interesting. Let’s spice things up a little and define a reaction to a given message:

import akka.actor._

case object RevelationOfFathership

class Luke extends Actor {
  def receive = {
    case RevelationOfFathership =>
      System.err.println("Noooooooooo")
  }
}   

Here we go! RevelationOfFathership is a case object, i.e. an immutable message. This last detail is rather important: your messages should always be self-contained and not referencing the internal state of any actor since this would effectively leak this state to the outside, hence breaking the guarantee that only an actor can change its internal state. This last bit is paramount for actors to offer a better, more human-friendly concurrency model and for not getting any surprises.

Now that Luke knows how to appropriately respond to the inconvenient truth that Dark Vader is his father, all we need is the dark lord himself.

import akka.actor._

class Vader extends Actor {

  override def preStart(): Unit =
    context.actorSelection("akka://application/user/luke") ! RevelationOfFathership

  def receive = {
    case _ => // ...
  }
}

The Vader actor uses the preStart lifecycle method in order to trigger sending the message to his son when he gets started up. We’re using the actor’s context in order to send a message to Luke.

The entire sequence for running this example would look as follows:

import akka.actor._

val system = ActorSystem("application")
val luke = system.actorOf(Props[Luke], name = "luke")
val vader = system.actorOf(Props[Vader], name = "vader")

The Props are a means to describe how to obtain an instance of an actor. Since they are immutable they can be freely shared, for example accross different JVMs running on different machines (this is useful for example when operating an Akka cluster).

Actor supervision

Actors do not merely exist in the wild, but instead are part of an actor hierarchy and each actor has a parent. Actors that we create are supervised by the User Guardian of the application’s ActorSystem which is a special actor provided by Akka and responsible for supervising all actors in user space. The role of a supervising actor is to decide how to deal with the failure of a child actor and to act accordingly.

The User Guardian itself is supervised by the Root Guardian (which also supervises another special actor internal to Akka), and is itself supervised by a special actor reference. Legend says that this reference was there before all other actor references came into existence and is called “the one who walks the bubbles of space-time” (if you don’t believe me, check the official Akka documentation).

Organizing actors in hierarchies offers the advantage of encoding error handling right into the hierarchy. Each parent is responsible for the actions of their children. Should something go wrong and a child crash, the parent would have the opportunity to restart it.

Vader, for example, has a few storm troopers:

import akka.actor._
import akka.routing._

class Vader extends Actor {

  val troopers: ActorRef = context.actorOf(
    RoundRobinPool(8).props(Props[StromTrooper])
  )
}

The RoundRobinPool is a means of expressing the fact that messages sent to troopers will be sent to each trooper child one after the other. Routers encode strategies for sending messages to several actors at once, Akka provides many predefined routers.

Crashed Stormtroopers

Ultimately, actors can crash, and it is then the job of the supervisor to decide what to do. The decision-making mechanism is represented by a so-called supervision strategy. For example, Vader could decide to retry restarting a storm trooper 3 times before giving up and stopping it:

import akka.actor._

class Vader extends Actor {

  val troopers: ActorRef = context.actorOf(
    RoundRobinPool(8).props(Props[StromTrooper])
  )

  override def supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 3) {
      case t: Throwable =>
        log.error("StormTrooper down!", t)
        SupervisorStrategy.Restart
    }
}

This supervision strategy is rather crude since it deals with all types of Throwable in the same fashion. We will see in our next post that supervision strategies are an effective means of reacting to different types of failures in different ways.

Combining Futures and Actors

There is one golden rule of working with actors: you should not perform any blocking operation such as for example a blocking network call. The reason is simple: if the actor blocks, it can not process incoming messages which may lead to a full mailbox (or rather, since the default mailbox used by actors is unbounded, to an OutOfMemoryException.

This is why it may be useful to be able to use Futures within actors. The pipe pattern is designed to do just that: it send the result of a Future to an actor:

import akka.actor._
import akka.pattern.pipe

class Luke extends Actor {
  def receive = {
    case RevelationOfFathership =>
      sendTweet("Nooooooo") pipeTo self
    case tsr: TweetSendingResult =>
      // ...
  }

  def sendTweet(msg: String): Future[TweetSendingResult] = ...
}  

In this example we call the sendTweet Future upon reception of the RevelationOfFathership and use the pipeTo method to indicate that we would like the result of the Future to be sent to ourselves.

There is just one problem with the code above: if the Future were to fail, we would receive the failed throwable in a rather inconvenient format, wrapped in a message of type akka.actor.Status.Failure, without any useful context. This is why it may be more appropriate to recover failures before piping the result:

import akka.actor._
import akka.pattern.pipe
import scala.control.NonFatal

class Luke extends Actor {
  def receive = {
    case RevelationOfFathership =>
      val message = "Nooooooo"
      sendTweet(message) recover {
        case NonFatal(t) => TweetSendFailure(message, t)
      } pipeTo self
    case tsr: TweetSendingResult =>
      // ...
    case tsf: TweetSendingFailure =>
      // ...
  }

  def sendTweet(msg: String): Future[TweetSendingResult] = ...
}   

With this failure handling we now know which message failed to be sent on Twitter and can take an appropriate action (e.g. re-try sending it).

That’s it for this short introduction to Actors. In the next and last post of this series we will see how to use Futures and Actors in combination for reactive database access.

Read on

Stay tuned as we’ll publish Part 3 shortly as a part of this series:

Reactive Database Access – Part 1 – Why “Async”

We’re very happy to announce a guest post series on the jOOQ blog by Manuel Bernhardt. In this blog series, Manuel will explain the motivation behind so-called reactive technologies and after introducing the concepts of Futures and Actors use them in order to access a relational database in combination with jOOQ.

manuel-bernhardtManuel Bernhardt is an independent software consultant with a passion for building web-based systems, both back-end and front-end. He is the author of “Reactive Web Applications” (Manning) and he started working with Scala, Akka and the Play Framework in 2010 after spending a long time with Java. He lives in Vienna, where he is co-organiser of the local Scala User Group. He is enthusiastic about the Scala-based technologies and the vibrant community and is looking for ways to spread its usage in the industry. He’s also scuba-diving since age 6, and can’t quite get used to the lack of sea in Austria.

This series is split in three parts, which we’ll publish over the next month:

Reactive?

The concept of Reactive Applications is getting increasingly popular these days and chances are that you have already heard of it someplace on the Internet. If not, you could read the Reactive Manifesto or we could perhaps agree to the following simple summary thereof: in a nutshell, Reactive Applications are applications that:

  • make optimal use of computational resources (in terms of CPU and memory usage) by leveraging asynchronous programming techniques
  • know how to deal with failure, degrading gracefully instead of, well, just crashing and becoming unavailable to their users
  • can adapt to intensive workloads, scaling out on several machines / nodes as load increases (and scaling back in)

Reactive Applications do not exist merely in a wild, pristine green field. At some point they will need to store and access data in order to do something meaningful and chances are that the data happens to live in a relational database.

Latency and database access

When an application talks to a database more often than not the database server is not going to be running on the same server as the application. If you’re unlucky it might even be that the server (or set of servers) hosting the application live in a different data centre than the database server. Here is what this means in terms of latency:

Latency numbers every programmer should know

Say that you have an application that runs a simple SELECT query on its front page (let’s not debate whether or not this is a good idea here). If your application and database servers live in the same data centre, you are looking at a latency of the order of 500 µs (depending on how much data comes back). Now compare this to all that your CPU could do during that time (all those green and black squares on the figure above) and keep this in mind – we’ll come back to it in just a minute.

The cost of threads

Let’s suppose that you run your welcome page query in a synchronous manner (which is what JDBC does) and wait for the result to come back from the database. During all of this time you will be monopolizing a thread that waits for the result to come back. A Java thread that just exists (without doing anything at all) can take up to 1 MB of stack memory, so if you use a threaded server that will allocate one thread per user (I’m looking at you, Tomcat) then it is in your best interest to have quite a bit of memory available for your application in order for it to still work when featured on Hacker News (1 MB / concurrent user).

Reactive applications such as the ones built with the Play Framework make use of a server that follows the evented server model: instead of following the “one user, one thread” mantra it will treat requests as a set of events (accessing the database would be one of these events) and run it through an event loop:

Evented server and its event loop

Such a server will not use many threads. For example, default configuration of the Play Framework is to create one thread per CPU core with a maximum of 24 threads in the pool. And yet, this type of server model can deal with many more concurrent requests than the threaded model given the same hardware. The trick, as it turns out, is to hand over the thread to other events when a task needs to do some waiting – or in other words: to program in an asynchronous fashion.

Painful asynchronous programming

Asynchronous programming is not really new and programming paradigms for dealing with it have been around since the 70s and have quietly evolved since then. And yet, asynchronous programming is not necessarily something that brings back happy memories to most developers. Let’s look at a few of the typical tools and their drawbacks.

Callbacks

Some languages (I’m looking at you, Javascript) have been stuck in the 70s with callbacks as their only tool for asynchronous programming up until recently (ECMAScript 6 introduced Promises). This is also knows as christmas tree programming:

Christmas tree of hell

Ho ho ho.

Threads

As a Java developer, the word asynchronous may not necessarily have a very positive connotation and is often associated to the infamous synchronized keyword:

Working with threads in Java is hard, especially when using mutable state – it is so much more convenient to let an underlying application server abstract all of the asynchronous stuff away and not worry about it, right? Unfortunately as we have just seen this comes at quite a hefty cost in terms of performance.

And I mean, just look at this stack trace:

Abstraction is the mother of all trees

In one way, threaded servers are to asynchronous programming what Hibernate is to SQL – a leaky abstraction that will cost you dearly on the long run. And once you realize it, it is often too late and you are trapped in your abstraction, fighting by all means against it in order to increase performance. Whilst for database access it is relatively easy to let go of the abstraction (just use plain SQL, or even better, jOOQ), for asynchronous programming the better tooling is only starting to gain in popularity.

Let’s turn to a programming model that finds its roots in functional programming: Futures.

Futures: the SQL of asynchronous programming

Futures as they can be found in Scala leverage functional programming techniques that have been around for decades in order to make asynchronous programming enjoyable again.

Future fundamentals

A scala.concurrent.Future[T] can be seen as a box that will eventually contain a value of type T if it succeeds. If it fails, theThrowable at the origin of the failure will be kept. A Future is said to have succeeded once the computation it is waiting for has yielded a result, or failed if there was an error during the computation. In either case, once the Future is done computing, it is said to be completed.

Welcome to the Futures

As soon as a Future is declared, it will start running, which means that the computation it tries to achieve will be executed asynchronously. For example, we can use the WS library of the Play Framework in order to execute a GET request against the Play Framework website:

val response: Future[WSResponse] = 
  WS.url("http://www.playframework.com").get()

This call will return immediately and lets us continue to do other things. At some point in the future, the call may have been executed, in which case we could access the result to do something with it. Unlike Java’s java.util.concurrent.Future<V>which lets one check whether a Future is done or block while retrieving it with the get() method, Scala’s Future makes it possible to specify what to do with the result of an execution.

Transforming Futures

Manipulating what’s inside of the box is easy as well and we do not need to wait for the result to be available in order to do so:

val response: Future[WSResponse] = 
  WS.url("http://www.playframework.com").get()

val siteOnline: Future[Boolean] = 
  response.map { r =>
    r.status == 200
  }

siteOnline.foreach { isOnline =>
  if(isOnline) {
    println("The Play site is up")
  } else {
    println("The Play site is down")
  }
}

In this example, we turn our Future[WSResponse] into a Future[Boolean] by checking for the status of the response. It is important to understand that this code will not block at any point: only when the response will be available will a thread be made available for the processing of the response and execute the code inside of the map function.

Recovering failed Futures

Failure recovery is quite convenient as well:

val response: Future[WSResponse] =
  WS.url("http://www.playframework.com").get()

val siteAvailable: Future[Option[Boolean]] = 
  response.map { r =>
    Some(r.status == 200)
  } recover {
    case ce: java.net.ConnectException => None
  }

At the very end of the Future we call the recover method which will deal with a certain type of exception and limit the dammage. In this example we are only handling the unfortunate case of a java.net.ConnectException by returning a Nonevalue.

Composing Futures

The killer feature of Futures is their composeability. A very typical use-case when building asynchronous programming workflows is to combine the results of several concurrent operations. Futures (and Scala) make this rather easy:

def siteAvailable(url: String): Future[Boolean] =
  WS.url(url).get().map { r =>
    r.status == 200
}

val playSiteAvailable =
  siteAvailable("http://www.playframework.com")

val playGithubAvailable =
  siteAvailable("https://github.com/playframework")

val allSitesAvailable: Future[Boolean] = for {
  siteAvailable <- playSiteAvailable
  githubAvailable <- playGithubAvailable
} yield (siteAvailable && githubAvailable)

The allSitesAvailable Future is built using a for comprehension which will wait until both Futures have completed. The two Futures playSiteAvailable and playGithubAvailable will start running as soon as they are being declared and the for comprehension will compose them together. And if one of those Futures were to fail, the resulting Future[Boolean] would fail directly as a result (without waiting for the other Future to complete).

This is it for the first part of this series. In the next post we will look at another tool for reactive programming and then finally at how to use those tools in combination in order to access a relational database in a reactive fashion.

Read on

Stay tuned as we’ll publish Parts 2 and 3 shortly as a part of this series:

jOOQ vs. Slick – Pros and Cons of Each Approach

Every framework introduces a new compromise. A compromise that is introduced because the framework makes some assumptions about how you’d like to interact with your software infrastructure.

An example of where this compromise has struck users recently is the discussion “Are Slick queries generally isomorphic to the SQL queries?“. And, of course, the answer is: No. What appears to be a simple Slick query:

val salesJoin = sales 
      join purchasers 
      join products 
      join suppliers on {
  case (((sale, purchaser), product), supplier) =>
    sale.productId === product.id &&
    sale.purchaserId === purchaser.id &&
    product.supplierId === supplier.id
}

… turns into a rather large monster with tons of derived tables that are totally unnecessary, given the original query (formatting is mine):

select x2.x3, x4.x5, x2.x6, x2.x7 
from (
    select x8.x9 as x10, 
           x8.x11 as x12, 
           x8.x13 as x14, 
           x8.x15 as x7, 
           x8.x16 as x17, 
           x8.x18 as x3, 
           x8.x19 as x20, 
           x21.x22 as x23, 
           x21.x24 as x25, 
           x21.x26 as x6 
    from (
        select x27.x28 as x9,
               x27.x29 as x11, 
               x27.x30 as x13, 
               x27.x31 as x15, 
               x32.x33 as x16, 
               x32.x34 as x18, 
               x32.x35 as x19 
        from (
            select x36."id" as x28, 
                   x36."purchaser_id" as x29, 
                   x36."product_id" as x30, 
                   x36."total" as x31 
            from "sale" x36
        ) x27 
        inner join (
            select x37."id" as x33, 
                   x37."name" as x34, 
                   x37."address" as x35 
	    from "purchaser" x37
        ) x32 
        on 1=1
    ) x8 
    inner join (
        select x38."id" as x22, 
               x38."supplier_id" as x24, 
               x38."name" as x26 
        from "product" x38
    ) x21
    on 1=1
) x2 
inner join (
    select x39."id" as x40, 
           x39."name" as x5, 
           x39."address" as x41 
    from "supplier" x39
) x4 
on ((x2.x14 = x2.x23) 
and (x2.x12 = x2.x17)) 
and (x2.x25 = x4.x40) 
where x2.x7 >= ?

Christopher Vogt, a former Slick maintainer and still actively involved member of the Slick community, explains the above in the following words:

This means that Slick relies on your database’s query optimizer to be able to execute the sql query that Slick produced efficiently. Currently that is not always the case in MySQL

One of the main ideas behind Slick, according to Christopher, is:

Slick is not a DSL that allows you to build exactly specified SQL strings. Slick’s Scala query translation allows for re-use and composition and using Scala as the language to write your queries. It does not allow you to predict the exact sql query, only the semantics and the rough structure.

Slick vs. jOOQ

Since Christopher later on also compared Slick with jOOQ, I allowed myself to chime in and to add my two cents:

From a high level (without actual Slick experience) I’d say that Slick and jOOQ embrace compositionality equally well. I’ve seen crazy queries of several 100s of lines of [jOOQ] SQL in customer code, composed over several methods. You can do that with both APIs.

On the other hand, as Chris said: Slick has a focus on Scala collections, jOOQ on SQL tables.

  • From a conceptual perspective (= in theory), this focus shouldn’t matter.
  • From a type safety perspective, Scala collections are easier to type-check than SQL tables and queries because SQL as a language itself is rather hard to type-check given that the semantics of various of the advanced SQL clauses alter type configurations rather implicitly (e.g. outer joins, grouping sets, pivot clauses, unions, group by, etc.).
  • From a practical perspective, SQL itself is only an approximation of the original relational theories and has attained a life of its own. This may or may not matter to you.

I guess in the end it really boils down to whether you want to reason about Scala collections (queries are better integrated / more idiomatic with your client code) or about SQL tables (queries are better integrated / more idiomatic with your database).

At this point, I’d like to add another two cents to the discussion. Customers don’t buy the product that you’re selling. They never do. In the case of Hibernate, customers and users were hoping to be able to forget SQL forever. The opposite is true. As Gavin King himself (the creator of Hibernate) had told me:

gavin-king

Because customers and users had never listened to Gavin (and to other ORM creators), we now have what many call the object-relational impedance mismatch. A lot of unjustified criticism has been expressed against Hibernate and JPA, APIs which are simply too popular for the limited scope they really cover.

With Slick (or C#’s LINQ, for that matter), a similar mismatch is impeding integrations, if users abuse these tools for what they believe to be a replacement for SQL. Slick does a great job at modelling the relational model directly in the Scala language. This is wonderful if you want to reason about relations just like you reason about collections. But it is not a SQL API. To illustrate how difficult it is to overcome these limitations, you can browse the issue tracker or user group to learn about:

We’ll simply call this:

The Functional-Relational Impedance Mismatch

SQL is much more

Markus Winand (the author of the popular SQL Performance Explained) has recently published a very good presentation about “modern SQL”, an idea that we fully embrace at jOOQ:

We believe that APIs that have been trying to hide the SQL language from general purpose languages like Java, Scala, C# are missing out on a lot of the very nice features that can add tremendous value to your application. jOOQ is an API that fully embraces the SQL language, with all its awesome features (and with all its quirks). You obviously may or may not agree with that.

We’ll leave this article open ended, hoping you’ll chime in to discuss the benefits and caveats of each approach. Of staying close to Scala vs. staying close to SQL.

As a small teaser, however, I’d like to announce a follow-up article showing that there is no such thing as an object-relational impedance mismatch. You (and your ORM) are just not using SQL correctly. Stay tuned!

The 10 Most Annoying Things Coming Back to Java After Some Days of Scala

So, I’m experimenting with Scala because I want to write a parser, and the Scala Parsers API seems like a really good fit. After all, I can implement the parser in Scala and wrap it behind a Java interface, so apart from an additional runtime dependency, there shouldn’t be any interoperability issues.

After a few days of getting really really used to the awesomeness of Scala syntax, here are the top 10 things I’m missing the most when going back to writing Java:

1. Multiline strings

That is my personal favourite, and a really awesome feature that should be in any language. Even PHP has it: Multiline strings. As easy as writing:

println ("""Dear reader,

If we had this feature in Java,
wouldn't that be great?

Yours Sincerely,
Lukas""")

Where is this useful? With SQL, of course! Here’s how you can run a plain SQL statement with jOOQ and Scala:

println(
  DSL.using(configuration)
     .fetch("""
            SELECT a.first_name, a.last_name, b.title
            FROM author a
            JOIN book b ON a.id = b.author_id
            ORDER BY a.id, b.id
            """)
)

And this isn’t only good for static strings. With string interpolation, you can easily inject variables into such strings:

val predicate =
  if (someCondition)
    "AND a.id = 1"
  else
    ""

println(
  DSL.using(configuration)
      // Observe this little "s"
     .fetch(s"""
            SELECT a.first_name, a.last_name, b.title
            FROM author a
            JOIN book b ON a.id = b.author_id
            -- This predicate is the referencing the
            -- above Scala local variable. Neat!
            WHERE 1 = 1 $predicate
            ORDER BY a.id, b.id
            """)
)

That’s pretty awesome, isn’t it? For SQL, there is a lot of potential in Scala.

jOOQ: The Best Way to Write SQL in Scala

2. Semicolons

I sincerely haven’t missed them one bit. The way I structure code (and probably the way most people structure code), Scala seems not to need semicolons at all. In JavaScript, I wouldn’t say the same thing. The interpreted and non-typesafe nature of JavaScript seems to indicate that leaving away optional syntax elements is a guarantee to shoot yourself in the foot. But not with Scala.

val a = thisIs.soMuchBetter()
val b = no.semiColons()
val c = at.theEndOfALine()

This is probably due to Scala’s type safety, which would make the compiler complain in one of those rare ambiguous situations, but that’s just an educated guess.

3. Parentheses

This is a minefield and leaving away parentheses seems dangerous in many cases. In fact, you can also leave away the dots when calling a method:

myObject method myArgument

Because of the amount of ambiguities this can generate, especially when chaining more method calls, I think that this technique should be best avoided. But in some situations, it’s just convenient to “forget” about the parens. E.g.

val s = myObject.toString

4. Type inference

This one is really annoying in Java, and it seems that many other languages have gotten it right, in the meantime. Java only has limited type inference capabilities, and things aren’t as bright as they could be.

In Scala, I could simply write

val s = myObject.toString

… and not care about the fact that s is of type String. Sometimes, but only sometimes I like to explicitly specify the type of my reference. In that case, I can still do it:

val s : String = myObject.toString

5. Case classes

I think I’d fancy writing another POJO with 40 attributes, constructors, getters, setters, equals, hashCode, and toString

— Said no one. Ever

Scala has case classes. Simple immutable pojos written in one-liners. Take the Person case class for instance:

case class Person(firstName: String, lastName: String)

I do have to write down the attributes once, agreed. But everything else should be automatic.

And how do you create an instance of such a case class? Easily, you don’t even need the new operator (in fact, it completely escapes my imagination why new is really needed in the first place):

Person("George", "Orwell")

That’s it. What else do you want to write to be Enterprise-compliant?

Side-note

OK, some people will now argue to use project lombok. Annotation-based code generation is nonsense and should be best avoided. In fact, many annotations in the Java ecosystem are simple proof of the fact that the Java language is – and will forever be – very limited in its evolution capabilities. Take @Override for instance. This should be a keyword, not an annotation. You may think it’s a cosmetic difference, but I say that Scala has proven that annotations are pretty much always the wrong tool. Or have you seen heavily annotated Scala code, recently?

6. Methods (functions!) everywhere

This one is really one of the most useful features in any language, in my opinion. Why do we always have to link a method to a specific class? Why can’t we simply have methods in any scope level? Because we can, with Scala:

// "Top-level", i.e. associated with the package
def m1(i : Int) = i + 1

object Test {

    // "Static" method in the Test instance
    def m2(i : Int) = i + 2
    
    def main(args: Array[String]): Unit = {

        // Local method in the main method
        def m3(i : Int) = i + 3
        
        println(m1(1))
        println(m2(1))
        println(m3(1))
    }
}

Right? Why shouldn’t I be able to define a local method in another method? I can do that with classes in Java:

public void method() {
    class LocalClass {}

    System.out.println(new LocalClass());
}

A local class is an inner class that is local to a method. This is hardly ever useful, but what would be really useful is are local methods.

These are also supported in JavaScript or Ceylon, by the way.

7. The REPL

Because of various language features (such as 6. Methods everywhere), Scala is a language that can easily run in a REPL. This is awesome for testing out a small algorithm or concept outside of the scope of your application.

In Java, we usually tend to do this:

public class SomeRandomClass {

    // [...]
  
    public static void main(String[] args) {
        System.out.println(SomeOtherClass.testMethod());
    }

    // [...]
}

In Scala, I would’ve just written this in the REPL:

println(SomeOtherClass.testMethod)

Notice also the always available println method. Pure gold in terms of efficient debugging.

8. Arrays are NOT (that much of) a special case

In Java, apart from primitive types, there are also those weird things we call arrays. Arrays originate from an entirely separate universe, where we have to remember quirky rules originating from the ages of Capt Kirk (or so)

array

Yes, rules like:

// Compiles but fails at runtime
Object[] arrrrr = new String[1];
arrrrr[0] = new Object();

// This works
Object[] arrrr2 = new Integer[1];
arrrr2[0] = 1; // Autoboxing

// This doesn't work
Object[] arrrr3 = new int[];

// This works
Object[] arr4[] = new Object[1][];

// So does this (initialisation):
Object[][] arr5 = { { } };

// Or this (puzzle: Why does it work?):
Object[][] arr6 = { { new int[1] } };

// But this doesn't work (assignment)
arr5 = { { } };

Yes, the list could go on. With Scala, arrays are less of a special case, syntactically speaking:

val a = new Array[String](3);
a(0) = "A"
a(1) = "B"
a(2) = "C"
a.map(v => v + ":")

// output Array(A:, B:, C:)

As you can see, arrays behave much like other collections including all the useful methods that can be used on them.

9. Symbolic method names

Now, this topic is one that is more controversial, as it reminds us of the perils of operator overloading. But every once in a while, we’d wish to have something similar. Something that allows us to write

val x = BigDecimal(3);
val y = BigDecimal(4);
val z = x * y

Very intuitively, the value of z should be BigDecimal(12). That cannot be too hard, can it? I don’t care if the implementation of * is really a method called multiply() or whatever. When writing down the method, I’d like to use what looks like a very common operator for multiplication.

By the way, I’d also like to do that with SQL. Here’s an example:

select ( 
  AUTHOR.FIRST_NAME || " " || AUTHOR.LAST_NAME,
  AUTHOR.AGE - 10
)
from AUTHOR
where AUTHOR.ID > 10
fetch

Doesn’t that make sense? We know that || means concat (in some databases). We know what the meaning of - (minus) and > (greater than) is. Why not just write it?

The above is a compiling example of jOOQ in Scala, btw.

jOOQ: The Best Way to Write SQL in Scala

Attention: Caveat

There’s always a flip side to allowing something like operator overloading or symbolic method names. It can (and will be) abused. By libraries as much as by the Scala language itself.

10. Tuples

Being a SQL person, this is again one of the features I miss most in other languages. In SQL, everything is either a TABLE or a ROW. few people actually know that, and few databases actually support this way of thinking.

Scala doesn’t have ROW types (which are really records), but at least, there are anonymous tuple types. Think of rows as tuples with named attributes, whereas case classes would be named rows:

  • Tuple: Anonymous type with typed and indexed elements
  • Row: Anonymous type with typed, named, and indexed elements
  • case class: Named type with typed and named elements

In Scala, I can just write:

// A tuple with two values
val t1 = (1, "A")

// A nested tuple
val t2 = (1, "A", (2, "B"))

In Java, a similar thing can be done, but you’ll have to write the library yourself, and you have no language support:

class Tuple2<T1, T2> {
    // Lots of bloat, see missing case classes
}

class Tuple3<T1, T2, T3> {
    // Bloat bloat bloat
}

And then:

// Yikes, no type inference...
Tuple2<Integer, String> t1 = new Tuple2<>(1, "A");

// OK, this will certainly not look nice
Tuple3<Integer, String, Tuple2<Integer, String>> t2 =
    new Tuple3<>(1, "A", new Tuple2<>(2, "B"));

jOOQ makes extensive use of the above technique to bring you SQL’s row value expressions to Java, and surprisingly, in most cases, you can do without the missing type inference as jOOQ is a fluent API where you never really assign values to local variables… An example:

DSL.using(configuration)
   .select(T1.SOME_VALUE)
   .from(T1)
   .where(
      // This ROW constructor is completely type safe
      row(T1.COL1, T1.COL2)
      .in(select(T2.A, T2.B).from(T2))
   )
   .fetch();

Conclusion

This was certainly a pro-Scala and slightly contra-Java article. Don’t get me wrong. By no means, I’d like to migrate entirely to Scala. I think that the Scala language goes way beyond what is reasonable in any useful software. There are lots of little features and gimmicks that seem nice to have, but will inevitably blow up in your face, such as:

  • implicit conversion. This is not only very hard to manage, it also slows down compilation horribly. Besides, it’s probably utterly impossible to implement semantic versioning reasonably using implicit, as it is probably not possible to foresee all possible client code breakage through accidental backwards-incompatibility.
  • local imports seem great at first, but their power quickly makes code unintelligible when people start partially importing or renaming types for a local scope.
  • symbolic method names are most often abused. Take the parser API for instance, which features method names like ^^, ^^^, ^?, or ~!

Nonetheless, I think that the advantages of Scala over Java listed in this article could all be implemented in Java as well:

  • with little risk of breaking backwards-compatibility
  • with (probably) not too big of an effort, JLS-wise
  • with a huge impact on developer productivity
  • with a huge impact on Java’s competitiveness

In any case, Java 9 will be another promising release, with hot topics like value types, declaration-site variance, specialisation (very interesting!) or ClassDynamic

With these huge changes, let’s hope there’s also some room for any of the above little improvements, that would add more immediate value to every day work.

Scala Code Trollin’

So, we’ve all heard of “flatmap that sh**”, right? The new, functional, hipster way of saying

You’re doing it wrong

Since we don’t really care about dogma and functional religion that much, let’s just start trolling a little (it’s so much easier anyway), with Scala code. Check out the following piece of useful Scala code, which does compile:

def C = Seq(1, 2, 3)
def Windows = 0
def WAT = (C:\Windows)                                                           /* OK, we also need this */ (_)

Awesome, right? I leave it up to you to figure out what the above piece of code does.

Don’t believe it? Or don’t care about the puzzle? Here’s the solution on ScalaFiddle.

This discussion was inspired by CTMMC