jOOQ Newsletter September 17, 2013

Subscribe to this newsletter here.

SQL for calculations

SQL can be used for heavy calculations. This doesn’t mean that it has to, of course. Many Java-oriented software architects are reluctant to allow for business logic entering their database. DBA tend to disagree and promote complex logic in database views or stored procedures.

The pros and cons of both sides can be seen in this popular discussion on reddit. Our take on this discussion is a rather pragmatic one, saying: Neither approach is “better”. But not knowing about awesome, often vendor-specific SQL features is a pity. For example, here are two approaches to calculating a running total with Oracle SQL:

What is your experience with SQL for calculations? Reply to this e-mail and tell us your story.

SQL Performance Explained

SQL Performance ExplainedWhen running calculations in the database using SQL, it is always good to know your way around indexing and performance tuning. As jOOQ is a very SQL-centric environment, we would like to promote a very good book about SQL Performance.

SQL databases have become incredibly fast in parsing and executing even very complex SQL statements. But to many SQL developers, writing performing SQL is still a mystery. SQL Performance Explained by Markus Winandcovers 90% of what any SQL developer should know in very simple terms. A must-read for all SQL beginners and pros! Available in English, German, and French.

PostgreSQL 9.3

PostgreSQL 9.3 has been released! This is great news as a wonderful database has just gotten better. Apart from support for materialised views and updatable views, PostgreSQL now also supports the SQL standard LATERAL JOIN, which will soon be supported in jOOQ as well.

More about what’s new in PostgreSQL 9.3 can be seen here.

Google and MariaDB

After RedHat’s alleged switching from MySQL to MariaDB in RHEL, Google now also announces that they will start to migrate away from Oracle’s popular Open Source database. These migrations will further strengthen MariaDB’s position and communities, as large players in the data market will stop contributing to MySQL and contribute to MariaDB instead.

Read more about this migration here.

Why PostgreSQL is so Awesome

Just recently, I’ve blogged about PostgreSQL 9.3 having been released, which is awesome enough as PostgreSQL finally supports materialised views and updatable views. I have then blogged about PostgreSQL’s syntax being a mystery only exceeded by its power, as it allows for treating INSERT and UPDATE statements as table references, when used with the RETURNING clause. This is quite fancy, even if not very useful in everyday SQL.

But what I’m writing about today is colossal:

PostgreSQL predicates are just ordinary expressions

Let this sink in. To PostgreSQL, predicates are just ordinary expressions evaluating to the boolean type. This is documented in the SELECT syntax reference, here:
http://www.postgresql.org/docs/9.3/static/sql-select.html

Citing:

The optional WHERE clause has the general form

WHERE condition

where condition is any expression that evaluates to a result of type boolean […]

It struck me like lightning when I read this Stack Overflow question. You can put a predicate everywhere! Although I knew this before, I had never thought about how awesome this is! You can use predicates in the SELECT clause:

SELECT a, b, c = d, e IN (SELECT x FROM y)
FROM t

You can use predicates in the GROUP BY clause:

SELECT count(*)
FROM t
GROUP BY c = d, e IN (SELECT x FROM y)

You can use predicates in the ORDER BY clause:

SELECT *
FROM t
ORDER BY c = d, e IN (SELECT x FROM y)

You can aggregate predicates using the EVERY aggregate function.

Don’t believe it? See for yourself in this SQLFiddle!

“Ordinary” SQL

In “ordinary” SQL (i.e. standards-compliant), predicates have to be transformed into ordinary value expressions using the CASE clause. Repeating the above examples:

SELECT clause:

SELECT a, b,
       CASE WHEN c = d THEN true ELSE false END,
       CASE WHEN e IN (SELECT x FROM y)
            THEN true ELSE false END
FROM t

GROUP BY clause:

SELECT count(*)
FROM t
GROUP BY CASE WHEN c = d THEN true ELSE false END,
         CASE WHEN e IN (SELECT x FROM y)
              THEN true ELSE false END

ORDER BY clause:

SELECT *
FROM t
ORDER BY CASE WHEN c = d THEN true ELSE false END,
         CASE WHEN e IN (SELECT x FROM y)
              THEN true ELSE false END

Impact for jOOQ

The answer given in the above Stack Overflow question shows how jOOQ can standardise this behaviour by rendering predicates / conditions directly as column expressions where this is supported, while emulating this behaviour using an equivalent CASE expression, otherwise.

Take predicates in the SELECT clause, for instance:

DSL.using(configuration)
   .select(
       T.A, T.B,
       // Transform a jOOQ Condition into a Field:
       field(T.C.eq(T.D)),
       field(T.E.in(select(Y.X).from(Y)))
   )
   .from(T);

Further thoughts

From jOOQ integration test experience, I can say that Derby, H2, HSQLDB, MariaDB, MySQL, PostgreSQL, and SQLite will work this way. While this is awesome from a syntax perspective, do keep an eye on your execution plans to verify that this isn’t going to produce lots of very expensive nested loops…

PostgreSQL Syntax is a Mystery Only Exceeded by its Power

I just ran across this rather powerful PostgreSQL example statement from the manual. It reads

Increment the sales count of the salesperson who manages the account for Acme Corporation, and record the whole updated row along with current time in a log table:

WITH upd AS (
  UPDATE employees 
  SET sales_count = sales_count + 1 
  WHERE id = (
    SELECT sales_person 
    FROM accounts 
    WHERE name = 'Acme Corporation'
  )
  RETURNING *
)
INSERT INTO employees_log 
SELECT *, current_timestamp 
FROM upd;

This database keeps amazing me. A single statement allows for updating one table, taking the updated record(s) including all generated values as input for an insert into another table. But at the same time, PostgreSQL does not yet implement the SQL:2003 MERGE statement.

Crazy database…

People Managing to Correctly Spell “Moron” in a Blog Comment

The notorious ORM pro / con discussion heavily amuses me. I always find it very funny when people have passionate discussions about which solution is better, rather than discussing about which solution is better suited for the problem at hand. In the case of ORMs vs. plain SQL, obviously, no solution is simply better as both techniques have their merits. When comparing ORMs with jOOQ, I think that this page summarises it pretty well:
http://www.hibernate-alternative.com

Now, this article and most specifically, one answer is hilarious:
http://java.dzone.com/articles/defense-hand-coded-sql

While the article’s author is already asking for trouble, check out this one particular answer. I love it when people manage to correctly spell “moron”:

People who handwrite SQL are invariably morons.

Here’s what you miss out when using a good ORM with generated mappings:

– Automatic first and second level caching

– Guaranteed consistency between code and database structure. Change the database? Regenerate pojo’s -> compile errors until code adheres to database structure.

– True vendor independence. Yes, I’m switching between six different db’s in our products with zero issues.

– I work with objects, not relation sets. That kinda makes sense in an oop language.

– Build-in query languages in decent ORMs are much more productive and, again, vendor independent.

– Any decent ORM understands and injects vendor specific query hints better than you.

Also, get a clue.

Here’s my adequate reply to the above:

OK, now this was amusing :-)

– Automatic first and second level caching

This, obviously, is utterly impossible outside the world of ORMs.

– Guaranteed consistency between code and database structure. Change the database? Regenerate pojo’s -> compile errors until code adheres to database structure.

True. No one has ever written a code generator before it was added to Hibernate.

– I work with objects, not relation sets. That kinda makes sense in an oop language

… which your DBA will probably always agree with. Remember to remind your manager why he bought that 1M$ Oracle license, when you run N+1 selects for fetching your OOP objects.

– Build-in query languages in decent ORMs are much more productive and, again, vendor independent.

Of course, there is always a black / white answer to “productivity”- questions. Like, how productively you can express a SQL:2003 MERGE statement with HQL. Or, how productively you can calculate a running total involving window functions, or maybe, recursive SQL with HQL.

– Any decent ORM understands and injects vendor specific query hints better than you.

That is indeed an amazing theory, which I was utterly unaware of.

The eternal debate between ORM lovers and haters. Mankind has always been this stupid.  Like the AC vs. DC discussion between Nikola Tesla and Thomas Edison

And, Eclipse will totally win over IntelliJ! ;-)

Annotatiomania™ 2.1

It’s been a while since I have last ranted about annotations and the increasing Annotatiomania™ in the JEE ecosystem. I haven’t been exposed to much JEE either, until yesterday, when I was confronted with an awesome talk by Adam Bien at ch-open, where I’ve also held a talk about jOOQ. JEE 7 ships with lots of goodies.

But then, I saw this!

Let me treat you to a introductory example of how to use a stored procedure through JPA 2.1, which our friends from EclipseLink have had the courtesy to share with us:

    @NamedStoredProcedureQuery(
       name="ReadUsingMultipleResultSetMappings",
       procedureName="Read_Multiple_Result_Sets",
       resultSetMappings={
           "EmployeeResultSetMapping", 
           "AddressResultSetMapping", 
           "ProjectResultSetMapping", 
           "EmployeeConstructorResultSetMapping"
       }
   )
   
   @SqlResultSetMappings({
       @SqlResultSetMapping(
           name = "EmployeeResultSetMapping",
           entities = {
           @EntityResult(entityClass=Employee.class)
           }
       ),
       @SqlResultSetMapping(
           name="EmployeeConstructorResultSetMapping",
           classes = { 
               @ConstructorResult(
                   targetClass = EmployeeDetails.class,
                   columns = {
                       @ColumnResult(
                           name="EMP_ID", 
                           type=Integer.class
                       ),
                       @ColumnResult(
                           name="F_NAME", 
                           type=String.class
                       ),
                       @ColumnResult(
                           name="L_NAME", 
                           type=String.class
                       ),
                       @ColumnResult(
                           name="R_COUNT", 
                           type=Integer.class
                       )
                   }
               )
           }
       )
   })
   public Employee(){
       ....
   }

Obviously, Eclipse Copernicus (or what’s Kepler’s successor?) will ship with an Enterprise-licensed source code formatter built by Nobel Prize mathematicians to actually display the above. With that license, you also get a 50% discount coupon on the latest 67″ Samsung flat screen for an Enterprise coding experience. Awesome!

Also, whenever I hear “ReadUsingMultipleResultSetMappings”, I immediately think J2eeBasedPreAuthenticatedWebAuthenticationDetailsSource, too. Some powerful permutation mathematics and random natural language processing is thus involved.

Not just did JPAnnotatiomania™ 2.1 treat us with stored procedure support, named fetch graphs are now also part of the game. Do note that with only a few lines of code, we will finally be safe from writing tedious SQL! Behold:

   @NamedEntityGraph(
       name="ExecutiveProjects"
       attributeNodes={
           @NamedAttributeNode("address"),
           @NamedAttributeNode(
               value="projects", 
               subgraph="projects"
           )
       },
       subgraphs={
           @NamedSubgraph(
               name="projects",
               attributeNodes={
                   @NamedAttributeNode("properties")
               }
           ),
           @NamedSubgraph(
               name="projects",
               type=LargeProject.class,
               attributeNodes={
                   @NamedAttributeNode("executive")
               }
           )
       }
   )

Another 5000$ spent on formatting licenses to format the above.

Conclusion

No longer shall effective developers be paid by the lines of code, but they should be paid by the lines of @-signs they write. And I’m surely looking forward to Adam Bien’s next talk, about JEE 7 Best Practices!

Crazy Translations of Simple SQL Expressions to Various SQL Dialects

SQL is standardised by ISO / IEC. We have wonderful standards, such as SQL-92, SQL:1999, SQL:2003, SQL:2008. Right? In theory. In practice, many things do not have an equivalence between the actual SQL implementations. This is very true for common function support – most of which are not part of any standard. Here’s an interesting insight into equivalent expressions for very simple functions.

For this article, I’m using jOOQ, a library that models SQL as an internal domain-specific language in Java. jOOQ builds an AST model of your SQL statement in Java memory, before rendering it to the chosen SQL dialect. This helps standardising SQL, such that your programming model stays simple, while being able to access useful functionality.

The following little Java program is used:

package org.jooq.compatibility;

import static org.jooq.SQLDialect.SQL99;
import static org.jooq.impl.DSL.*;

import java.sql.Date;
import java.util.EnumSet;

import org.jooq.QueryPart;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;

public class Compatibility {

    // The main method prints a couple of simple
    // and common SQL functions
    public static void main(String[] args) {
        print(sin(3));
        print(power(2, 4));
        print(sinh(3));
        print(lpad(inline("abc"), 3));
        print(dateDiff(currentDate(),
                       Date.valueOf("2000-01-01")));
        print(bitCount((byte) 5));
    }

    private static void print(QueryPart part) {
        System.out.println("Printing " + part);
        System.out.println("---------------------");

        // Get only SQLDialect families, don't
        // distinguish between
        // SQL Server 2008 or SQL Server 20012
        EnumSet<SQLDialect> dialects =
            EnumSet.noneOf(SQLDialect.class);
        for (SQLDialect dialect:SQLDialect.values())
            if (dialect != SQL99)
                dialects.add(dialect.family());

        // Render the passed in SQL clause to
        // all supported SQL dialects
        for (SQLDialect dialect: dialects)
            System.out.println(
                String.format("%1$9s : %2$s",
                dialect, DSL.using(dialect)
                            .renderInlined(part)
            ));

        System.out.println();
        System.out.println();
    }
}

Now check out this output!

Printing sin(3)

This is the most straightforward among the tested functions. It renders the same in every dialect.

      ASE : sin(3)
   CUBRID : sin(3)
      DB2 : sin(3)
    DERBY : sin(3)
 FIREBIRD : sin(3)
       H2 : sin(3)
   HSQLDB : sin(3)
   INGRES : sin(3)
  MARIADB : sin(3)
    MYSQL : sin(3)
   ORACLE : sin(3)
 POSTGRES : sin(3)
   SQLITE : sin(3)
SQLSERVER : sin(3)
   SYBASE : sin(3)

Note that SQLite doesn’t really support this function, though. In subsequent listings, only those dialects are listed that can really emulate the function in question.

In the above list, ASE is Sybase ASE whereas SYBASE is Sybase SQL Anywhere.

Printing power(2, 4)

This gets a little trickier, as the power function isn’t supported in all databases. But it’s easy to emulate it using logarithms and exp.

      ASE : power(2, 4)
   CUBRID : power(2, 4)
      DB2 : power(2, 4)
    DERBY : exp((ln(2) * 4))
 FIREBIRD : power(2, 4)
       H2 : power(2, 4)
   HSQLDB : power(2, 4)
   INGRES : power(2, 4)
  MARIADB : power(2, 4)
    MYSQL : power(2, 4)
   ORACLE : power(2, 4)
 POSTGRES : power(2, 4)
SQLSERVER : power(2, 4)
   SYBASE : power(2, 4)

Printing sinh(3)

OK. Hyperbolic Sine isn’t exactly used every day. But when you need it, you better remember the formula. Do you? Here’s how this function renders:

      ASE : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   CUBRID : ((exp((3 * 2)) - 1) / (exp(3) * 2))
      DB2 : sinh(3)
    DERBY : sinh(3)
 FIREBIRD : sinh(3)
       H2 : sinh(3)
   HSQLDB : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   INGRES : ((exp((3 * 2)) - 1) / (exp(3) * 2))
  MARIADB : ((exp((3 * 2)) - 1) / (exp(3) * 2))
    MYSQL : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   ORACLE : sinh(3)
 POSTGRES : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   SQLITE : sinh(3)
SQLSERVER : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   SYBASE : ((exp((3 * 2)) - 1) / (exp(3) * 2))

Printing lpad(‘abc’, 3)

Off to string functions, they’re useful every now and then. But even simple functions as LPAD aren’t supported in all databases. Here’s a couple of creative solutions using repeat.

      ASE : (replicate(' ', (3 - char_length('abc'))) || 'abc')
   CUBRID : lpad('abc', 3, ' ')
      DB2 : lpad('abc', 3, ' ')
 FIREBIRD : cast(lpad('abc', 3, ' ') as varchar(4000))
       H2 : lpad('abc', 3, ' ')
   HSQLDB : lpad('abc', 3, ' ')
   INGRES : lpad('abc', 3, ' ')
  MARIADB : lpad('abc', 3, ' ')
    MYSQL : lpad('abc', 3, ' ')
   ORACLE : lpad('abc', 3, ' ')
 POSTGRES : lpad('abc', 3, ' ')
   SQLITE : substr(replace(replace(substr(quote(zeroblob(((3 - length('abc') - 1 + length(' ')) / length(' ') + 1) / 2)), 3), '''', ''), '0', ' '), 1, (3 - length('abc'))) || 'abc'
SQLSERVER : (replicate(' ', (3 - len('abc'))) + 'abc')
   SYBASE : (repeat(' ', (3 - length('abc'))) || 'abc')

Eek! Did you see that emulation for SQLite? Awful. Some background can be seen here.

Printing datediff()

Now, datetime arithmetics is probably the least cross-vendor compatible area in SQL. You can only lose if you do not use a well integration-tested tool like jOOQ to abstract these functions. Here’s what the simple difference between two dates amounts to:

      ASE : datediff(day, '2000-01-01', current_date())
   CUBRID : current_date() - date '2000-01-01'
      DB2 : (days(current_date) - days(date '2000-01-01'))
    DERBY : {fn timestampdiff(sql_tsi_day, date('2000-01-01'), current_date) }
 FIREBIRD : datediff(day, date '2000-01-01', current_date)
       H2 : datediff('day', date '2000-01-01', current_date())
   HSQLDB : datediff('day', date '2000-01-01', current_date)
   INGRES : cast((current_date - date '2000-01-01') as integer)
  MARIADB : datediff(current_date(), date '2000-01-01')
    MYSQL : datediff(current_date(), date '2000-01-01')
   ORACLE : sysdate - date '2000-01-01'
 POSTGRES : current_date - date '2000-01-01'
   SQLITE : (strftime('%s', current_date) - strftime('%s', '2000-01-01')) / 86400
SQLSERVER : datediff(day, '2000-01-01', convert(date, current_timestamp))
   SYBASE : datediff(day, '2000-01-01', current date)

The good news is, almost every database has some means of actually implementing this. Except SQLite, where dividing by 86400 seconds is not really daylight savings save (!).

Printing bit_count(5)

Last but not least, let’s have a look at this monster. The above function counts the number of bits set to 1 in the argument value 5. It would result to 2. This example only deals with byte values (TINYINT). You can imagine what larger integer support would look like.

   CUBRID : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
      DB2 : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as integer)
 FIREBIRD : cast((bin_and(5, 1) + bin_shr(bin_and(5, 2), 1) + bin_shr(bin_and(5, 4), 2) + bin_shr(bin_and(5, 8), 3) + bin_shr(bin_and(5, 16), 4) + bin_shr(bin_and(5, 32), 5) + bin_shr(bin_and(5, 64), 6) + bin_shr(bin_and(5, -128), 7)) as integer)
       H2 : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as int)
   HSQLDB : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as int)
  MARIADB : bit_count(5)
    MYSQL : bit_count(5)
   ORACLE : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as number(10))
 POSTGRES : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
   SQLITE : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
   SYBASE : cast(((5 & 1) + ((5 & 2) / power(2, 1)) + ((5 & 4) / power(2, 2)) + ((5 & 8) / power(2, 3)) + ((5 & 16) / power(2, 4)) + ((5 & 32) / power(2, 5)) + ((5 & 64) / power(2, 6)) + ((5 & -128) / power(2, 7))) as int)

The awful thing about the above is that not only the bit_count() function is hardly supported, but there’s a lot of lack of support for bitwise operations as well.

Note, there is also a very interesting discussion on the jOOQ User Group about alternative, shorter ways to emulate the bit_count() function.

Scared?

Don’t be. There’s bigger minefields than function cross-vendor compatibility in SQL. If you want to know for yourself, download jOOQ and run your own function emulations with the above program.

Stay tuned for future posts on these topics!

PostgreSQL 9.3 Released!

A great database has just gotten better. PostgreSQL 9.3 has been released today. While improved reliability and availability is certainly quite a thrilling addition, from the jOOQ perspective, the most interesting features are new SQL syntax elements. These include:

  • Better support for JSON
  • LATERAL JOIN (or as SQL:1999 calls it: lateral derived table)
  • Materialised views
  • Updatable views

See all the improvements for yourself:
https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.3