The Great Benevolent Dictator Getting Some Beating

I know how it feels to be a benevolent dictator. I am one myself on my jOOQ User Group. Supporting Free Open Source Software isn’t always easy. Sometimes you get demanding users, sometimes you get grumpy bugfixers. And the occasional troll. You try to be nice and sometimes, you make mistakes, too. But hey, even if it’s free and Open Source, that’s not an excuse for being rude towards your users (= customers!). Except, of course, for The One Great Benevolent Dictator. Everyone should know by now who I’m talking about. But not everyone accepts his rudeness. As seen on the Linux Kernel List:

https://lkml.org/lkml/2013/7/15/427

Enjoy!

Internal DSLs on the Fast Lane

I’ve read this interesting article about internal DSLs in Java, a short summary of Martin Fowler’s book on DSLs in general. I’ve been blogging about external and internal DSLs quite a lot myself, naturally, as jOOQ is the largest and most advanced free and Open Source implementation of an internal DSL in the Java ecosystem. Unlike some other DSLs that are currently being developed, jOOQ uses a BNF as a basis for its API. This guarantees that not only simple method chaining, but also grammar-like contexts can be formalised in an API.

How to construct such an API for your own DSL and grammar manually was explained in this popular blog post here:
https://blog.jooq.org/2012/01/05/the-java-fluent-api-designer-crash-course
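
To make the idea of a grammar-driven API more tangible, here is a minimal sketch for a toy grammar. It is not jOOQ's actual API: the grammar, the interface names (SelectStep, FromStep, WhereStep, EndStep) and the ToyQuery class are all hypothetical and chosen only for illustration. Each grammar state becomes an interface, and each keyword becomes a method returning the interface of the next state:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy grammar (an assumption, not jOOQ's real grammar):
//   query ::= SELECT column FROM table [ WHERE condition ]

interface SelectStep { FromStep select(String column); }
interface FromStep   { WhereStep from(String table); }
// WHERE is optional, so this state can also terminate the query.
interface WhereStep extends EndStep { EndStep where(String condition); }
interface EndStep    { String render(); }

final class ToyQuery implements SelectStep, FromStep, WhereStep {
    private final List<String> tokens = new ArrayList<>();

    // Entry point: only SELECT is callable at the start of a query.
    static SelectStep start() {
        return new ToyQuery();
    }

    @Override
    public FromStep select(String column) {
        tokens.add("SELECT " + column);
        return this;
    }

    @Override
    public WhereStep from(String table) {
        tokens.add("FROM " + table);
        return this;
    }

    @Override
    public EndStep where(String condition) {
        tokens.add("WHERE " + condition);
        return this;
    }

    @Override
    public String render() {
        return String.join(" ", tokens);
    }
}
```

With this sketch, ToyQuery.start().select("first_name").from("t_author").where("id = 1").render() compiles, whereas calling where() before from() does not, because the compiler only offers the methods of the current grammar state.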

SQL Templating with jOOQ or MyBatis

Many people compare jOOQ with MyBatis. Both are seen as popular alternatives to Java’s persistence standard JPA, as both are much more SQL-focused than JPA itself. When comparing the two tools, the first obvious difference is this:

  • jOOQ is an internal domain-specific language modelling SQL through a Java fluent API.
  • MyBatis is an XML-based SQL templating and mapping engine where dynamic SQL can be created through an XML-DSL.

MyBatis’ current success is mostly based on it having provided a viable alternative to JPA at a time when JPA was still a controversial standard, and when JPA had to prove that it was better than JDO, which solves very similar problems. This alternative offers something that many SQL-oriented users like to have:

  • The separation of Java and SQL code, extracting SQL code into external files. This allows DBAs to patch SQL strings in production environments, adding hints and other tuning.
  • The automatic mapping of tabular result set data to objects. This too is achieved in the same XML DSL as the dynamic SQL specification.

Implementing SQL templating with jOOQ

These things can be achieved with jOOQ as well. But unlike MyBatis, jOOQ’s SQL templating (as will be introduced in jOOQ 3.2) will not make use of a proprietary templating language. You should be able to make your own choice of language, providing jOOQ with a very simple adapter to it. This will allow for using the template engine you prefer, such as Velocity.

Let’s have a look at a Velocity template example. This example adds a dynamic list of ID parameters to the WHERE clause:

SELECT
  a.first_name,
  a.last_name,
  count(*)
FROM
  t_author a
LEFT OUTER JOIN
  t_book b ON a.id = b.author_id
WHERE
  1 = 0
#foreach ($param in $p)
  OR a.id = ?
#end
GROUP BY
  a.id,
  a.first_name,
  a.last_name
ORDER BY
  a.id ASC

The above template can be passed to the following jOOQ Template implementation, which uses arbitrary input objects to produce a concrete jOOQ QueryPart. A QueryPart is an object that can render SQL and bind variables:

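import static org.apache.velocity.runtime.RuntimeConstants.FILE_RESOURCE_LOADER_CACHE;
import static org.apache.velocity.runtime.RuntimeConstants.FILE_RESOURCE_LOADER_PATH;
import static org.apache.velocity.runtime.RuntimeConstants.RESOURCE_LOADER;

import java.io.File;
import java.io.StringWriter;
import java.net.URL;

import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.VelocityEngine;
import org.jooq.QueryPart;
import org.jooq.impl.DSL;

class VelocityTemplate
implements org.jooq.Template {

  // Name of the template file, e.g. "authors-and-books.vm"
  private final String file;

  public VelocityTemplate(String file) {
    this.file = file;
  }

  @Override
  public QueryPart transform(Object... input) {

    // Velocity code
    // -----------------------------------------
    // Locate the directory that contains the templates
    URL url = this.getClass().getResource(
      "/org/jooq/test/_/templates/");
    File dir = new File(url.getFile());

    VelocityEngine ve = new VelocityEngine();
    ve.setProperty(RESOURCE_LOADER, "file");
    ve.setProperty(FILE_RESOURCE_LOADER_PATH,
      dir.getAbsolutePath());
    ve.setProperty(FILE_RESOURCE_LOADER_CACHE,
      "true");
    ve.init();

    // Expose the input values to the template as $p
    VelocityContext context = new VelocityContext();
    context.put("p", input);

    // Load the template named by this.file and render it
    StringWriter writer = new StringWriter();
    ve.getTemplate(this.file, "UTF-8")
      .merge(context, writer);

    // jOOQ Code
    // -----------------------------------------
    return DSL.queryPart(writer.toString(), input);
  }
}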

This is very simple glue code. As you’re in full control of your template engine adapter, you can also add caching, object pools, etc. to it.

The above template can then be used easily throughout the jOOQ API, wherever jOOQ allows for plain SQL. For example as a top-level query:

Template tpl = new VelocityTemplate(
    "authors-and-books.vm");

DSL.using(configuration)
   .resultQuery(tpl, 1, 2, 3)
   .fetch();

Or as a nested select, embedded in jOOQ’s typesafe DSL:

DSL.using(configuration)
   .select()
   .from(new TableSourceTemplate("my-table.vm"))
   .fetch();

Of course, you can also take advantage of jOOQ’s record mapping features, which allow you to implement your own custom table-to-object mapping algorithms. This may often be a better choice than relying on any hard-wired XML configuration, such as the one from MyBatis:

List<MyType> result =
DSL.using(configuration)
   .select()
   .from(new TableSourceTemplate("my-table.vm"))
   .fetch(new RecordMapper<Record, MyType>() {
      @Override
      public MyType map(Record record) {
        // Custom mapping logic here
        return new MyType().init(record);
      }
   });

Or with Java 8:

List<MyType> result =
DSL.using(configuration)
   .select()
   .from(new TableSourceTemplate("my-table.vm"))
   .fetch(record -> new MyType().init(record));
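
To illustrate what the Velocity template shown earlier renders for the inputs 1, 2, 3, here is a plain-Java sketch of the #foreach expansion. It uses neither Velocity nor jOOQ, and the class and method names (ForeachExpansion, whereClause) are hypothetical. It only mimics the loop, which emits one OR predicate with a bind placeholder per input value:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Plain-Java imitation of the template's #foreach loop.
// Hypothetical helper for illustration only; not part of Velocity or jOOQ.
final class ForeachExpansion {

    // Builds the WHERE clause: a constant false predicate followed by
    // one "OR a.id = ?" line per input value.
    static String whereClause(Object... input) {
        String base = "WHERE\n  1 = 0";
        if (input.length == 0) {
            return base;
        }
        String predicates = Arrays.stream(input)
            .map(p -> "  OR a.id = ?")
            .collect(Collectors.joining("\n"));
        return base + "\n" + predicates;
    }

    public static void main(String[] args) {
        System.out.println(whereClause(1, 2, 3));
    }
}
```

The number of placeholders always matches the number of input values, which is why the adapter can hand the same input array to jOOQ as bind values.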

The possibilities are great

SQL templating is a powerful tool when you prefer simple, string-based SQL that can be tweaked every now and then with a little loop or if statement, to inject some dynamic SQL clause. There are a couple of SQL engines that try to solve this problem in one way or another:

Among the above, all tools ship with a simple, proprietary template language, but jOOQ is the only one encouraging you to use your template engine of choice and thus offering arbitrary templating extensibility in the future.

Usability vs. Reusability

I’ve found this blog post from 2009, which has a nice way of weighing ease of use against the ability to reuse. It claims that usability and reusability are always a tradeoff between building:

  • a heavyweight, coarse-grained software component with few dependencies (very usable)
  • a lightweight, fine-grained software component with complex dependencies (very reusable)

The following picture nicely depicts this relationship:

Image taken from: http://techdistrict.kirkk.com/2009/07/08/reuse-is-the-dream-dead

Indeed, coarse-grained components are hard to reuse because they try to solve too many problems from a coarse, overall point of view. But I’m not sure they’re necessarily easier to “use”. Being coarse, and thus complex, they may have solved the problem in the wrong way. And since they’re so complex, they cannot easily be changed to fit a slightly different problem domain. Since time can change the original problem setup (and it will in any project), heavyweight, coarse-grained components often can be neither used nor reused for their “original” purpose. I.e. a coarse solution that is developed in a project over several years cannot be finished, because it cannot be changed after those years.

And fine-grained components aren’t necessarily hard to use. It is possible to create components with very few dependencies, such that they do not introduce a lot of complexity. jOOQ is one example of such a component, which has no dependencies itself, apart from the JDK. But jOOQ is a library and not a business module, i.e. it implements horizontal reusability, not vertical reusability.

So, let’s hope the original post was not entirely correct and there is a good, middle way! See for yourself:
http://techdistrict.kirkk.com/2009/07/08/reuse-is-the-dream-dead

Static, Non-Static. Mockable, Non-Mockable… Instead, Let’s Focus on Real Added Value…

The never ending topic of testability… Dogmatic discussions about stuff being static, non-static. Mockable, non-mockable. Testable, non-testable. Here’s an article that was recently syndicated on DZone, about the evilness in making things static:

http://java.dzone.com/articles/why-static-bad-and-how-avoid

While the article itself is still somewhat focused on simple means of making something mockable through dependency injection, the sheer volume of comments and rants is just amazing. If you look closely at the comments, you’ll even find yourself reading gibberish about whether genderless “she” or singular “they” should be preferred. Off-topic troll alarm!

No one doubts the general usefulness of code being testable. If it’s feasible to add automated tests with a reasonable amount of effort, no one sane will question those tests. But where does this anti-static dogma come from? Every project manager will love engineers who follow the 80/20 rule. At the end of the day, good software is defined by the added value to all stakeholders. There’s no right or wrong. Instead, there are “50 shades of mockable”. And with a bit of humour we’ll get something in between project day 1 and day 238:

Image found on reddit: http://www.reddit.com/r/webdev/comments/1i0vwh/my_reaction_when_someone_offers_to_contribute/

Just face it. Static is a tool like any other tool. It has its merits. And its drawbacks. Choose the tool where it fits and review your overly strict rule set where needed. Being dogmatic will eventually lead to greater chaos than being pragmatic. Try to be efficient rather than to fight “evil”. Mocks have their place, as do integration tests.
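
As a small illustration of static being just a tool, here is a sketch with hypothetical names (Pricing, grossCents, isExpired). The first method is a pure static function, which is trivial to test without any mocking. The second one stays static as well, but takes its only external dependency, the clock, as an argument, so a test can pass a fixed java.time.Clock:

```java
import java.time.Clock;
import java.time.LocalDate;

// Hypothetical example class; names and business rules are assumptions.
final class Pricing {

    // Pure static function: no hidden state, so a plain assertion is enough.
    static long grossCents(long netCents, int vatPercent) {
        return netCents + netCents * vatPercent / 100;
    }

    // Still static, but the time source is passed in explicitly,
    // so tests can supply a fixed Clock instead of mocking anything.
    static boolean isExpired(LocalDate validUntil, Clock clock) {
        return LocalDate.now(clock).isAfter(validUntil);
    }
}
```

Whether this is preferable to an injected service depends on the project. The point is only that neither variant needs a mocking framework to be tested.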

For those looking for more rants and trolling comments, they can be seen in this article here, where more mocking is advertised in a database context:
http://architects.dzone.com/articles/easy-mocking-your-database-0

And after that, let’s get back to work and produce something that focuses on adding value!

High Complexity and Low Throughput. Reasons for Using an ORM.

I’ve recently stumbled upon an interesting blog post about when to use an ORM. I found it to be well-written and quite objective, specifically with respect to its model complexity and throughput diagram:

Original image taken from this blog post: http://mikehadlow.blogspot.ca/2012/06/when-should-i-use-orm.html

The ORM or not ORM topic will probably never stop showing up on blogs. Some of them are more black and white, such as Jeff Atwood’s “Object-Relational Mapping is the Vietnam of Computer Science”; others are more “50 shades of data access”, such as Martin Fowler’s “ORM Hate”.

I’m personally impressed by the work ORMs have done for us in times when repetitive SQL started to get boring and CRUD was not yet established. But ORMs do have their caveats as they are indeed leaky abstractions.

The aforementioned article shows in what situations ORMs can pull their weight, and in what situations you had better keep operating on a SQL level, using tools like jOOQ, MyBatis, Apache DbUtils, or just plain JDBC.

Read the original blog post here:
http://mikehadlow.blogspot.ca/2012/06/when-should-i-use-orm.html

LINQ and Java

LINQ has been quite a successful, but also controversial addition to the .NET ecosystem. Many people are looking for a comparable solution in the Java world. To better understand what a comparable solution could be, let’s have a look at the main problem that LINQ solves:

Query languages are often declarative programming languages with many keywords. They offer few control-flow elements, yet they are highly descriptive. The most popular query language is SQL, the ISO/IEC standardised Structured Query Language, mostly used for relational databases.

Declarative programming means that programmers do not explicitly spell out their algorithms. Instead, they describe the result they would like to obtain, leaving the algorithmic work to the implementing systems. Some databases have become very good at interpreting large SQL statements, applying SQL language transformation rules based on language syntax and metadata. An interesting read is Tom Kyte’s “metadata matters”, hinting at the incredible effort that has been put into Oracle’s Cost-Based Optimiser. Similar papers can be found for SQL Server, DB2 and other leading RDBMS.

LINQ-to-SQL is not SQL

LINQ is an entirely different query language that allows embedding declarative programming aspects into .NET languages, such as C# or VB.NET. The nice part of LINQ is the fact that a C# compiler can compile something that looks like SQL in the middle of C# statements. In a way, LINQ is to .NET what SQL is to PL/SQL or PL/pgSQL, or what jOOQ is to Java (see my previous article about PL/Java). But unlike PL/SQL, which embeds the actual SQL language, LINQ-to-SQL does not aim to model SQL itself within .NET. It is a higher-level abstraction that keeps the door open for unifying queries against various heterogeneous data stores in a single language. This unification will create an impedance mismatch similar to the one ORMs created before, maybe an even bigger one. While similar languages can be transformed into each other to a certain extent, it can become quite difficult for an advanced SQL developer to predict what actual SQL code will be generated from even very simple LINQ statements.

LINQ Examples

This becomes clearer when looking at some examples given by the LINQ-to-SQL documentation. For example, the Count() aggregate function:

System.Int32 notDiscontinuedCount =
    (from prod in db.Products
    where !prod.Discontinued
    select prod)
    .Count();

Console.WriteLine(notDiscontinuedCount);

In the above example, it is not immediately clear whether the .Count() function is transformed into a SQL count(*) aggregate function within the parenthesised query (then why not put it into the projection?), or whether it will be applied only after executing the query, in application memory. The latter would be prohibitive if a large number of records needed to be transferred from the database to memory. Depending on the transaction model, they would even need to be read-locked!

Another example is given here where grouping is explained:

var prodCountQuery =
    from prod in db.Products
    group prod by prod.CategoryID into grouping
    where grouping.Count() >= 10
    select new
    {
        grouping.Key,
        ProductCount = grouping.Count()
    };

In this case, LINQ models its language aspects entirely differently from SQL. The above LINQ where clause is obviously a SQL HAVING clause. The into grouping clause declares an alias for what will be a grouped tuple, which is quite a nice idea. This does not directly map to SQL, though, and must be used by LINQ internally to produce typed output. What’s awesome, of course, are the statically typed projections that can be reused afterwards, directly in C#!

Let’s look at another grouping example:

var priceQuery =
    from prod in db.Products
    group prod by prod.CategoryID into grouping
    select new
    {
        grouping.Key,
        TotalPrice = grouping.Sum(p => p.UnitPrice)
    };

In this example, C#’s functional aspects are embedded into LINQ’s Sum(p => p.UnitPrice) aggregate expression. TotalPrice = ... is just simple column aliasing. The above leaves me with lots of open questions. How can I control which parts are really going to be translated to SQL, and which parts will execute in my application, after a SQL query returns a partial result set? How can I predict whether a lambda expression is suitable for a LINQ aggregate function, and when it will cause a huge amount of data to be loaded into memory for in-memory aggregation? And also: Will the compiler warn me that it couldn’t figure out how to generate a C#/SQL algorithm mix? Or will this simply fail at runtime?

To LINQ or not to LINQ

Don’t get me wrong. Whenever I look inside the LINQ manuals for some inspiration, I have a deep urge to try it in a project. It looks awesome, and well-designed. There are also lots of interesting LINQ questions on Stack Overflow. I wouldn’t mind having LINQ in Java, but I want to remind readers that LINQ is NOT SQL. If you want to stay in control of your SQL, LINQ or LINQesque APIs may be a bad choice for two reasons:

  1. Some SQL mechanisms cannot be expressed in LINQ. Just as with JPA, you may need to resort to plain SQL.
  2. Some LINQ mechanisms cannot be expressed in SQL. Just as with JPA, you may suffer from severe performance issues, and will thus resort again to plain SQL.

Beware of the above when choosing LINQ, or a “Java implementation” thereof! You may be better off using SQL (i.e. JDBC, jOOQ, or MyBatis) for data fetching and Java APIs (e.g. Java 8’s Stream API) for in-memory post-processing.
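
As a rough sketch of the in-memory post-processing half of that advice, the two LINQ grouping examples above could look like this with Java 8’s Stream API, once the rows have been fetched by SQL. The Product class and the method names are hypothetical stand-ins for whatever your data fetching layer returns:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical row type, assumed to have been fetched by plain SQL beforehand.
final class Product {
    final int categoryId;
    final double unitPrice;

    Product(int categoryId, double unitPrice) {
        this.categoryId = categoryId;
        this.unitPrice = unitPrice;
    }
}

final class InMemoryGrouping {

    // Counts products per category and keeps only categories with at least
    // minCount products (the in-memory counterpart of GROUP BY ... HAVING).
    static Map<Integer, Long> countByCategory(List<Product> products, long minCount) {
        return products.stream()
            .collect(Collectors.groupingBy(p -> p.categoryId, Collectors.counting()))
            .entrySet().stream()
            .filter(e -> e.getValue() >= minCount)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Sums unit prices per category (the in-memory counterpart of SUM ... GROUP BY).
    static Map<Integer, Double> totalPriceByCategory(List<Product> products) {
        return products.stream()
            .collect(Collectors.groupingBy(
                p -> p.categoryId,
                Collectors.summingDouble(p -> p.unitPrice)));
    }
}
```

Here it is explicit that everything runs in application memory, so the decision of how much data to fetch stays with the SQL statement that produced the list.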

LINQ-like libraries modelling SQL in Java, Scala

LINQ-like libraries abstracting SQL syntax and data stores in Java, Scala