Crazy Translations of Simple SQL Expressions to Various SQL Dialects

SQL is standardised by ISO / IEC. We have wonderful standards, such as SQL-92, SQL:1999, SQL:2003, SQL:2008. Right? In theory. In practice, many things do not have an equivalence between the actual SQL implementations. This is very true for common function support – most of which are not part of any standard. Here’s an interesting insight into equivalent expressions for very simple functions.

For this article, I’m using jOOQ, a library that models SQL as an internal domain-specific language in Java. jOOQ builds an AST model of your SQL statement in Java memory, before rendering it to the chosen SQL dialect. This helps standardising SQL, such that your programming model stays simple, while being able to access useful functionality.

The following little Java program is used:

package org.jooq.compatibility;

import static org.jooq.SQLDialect.SQL99;
import static org.jooq.impl.DSL.*;

import java.sql.Date;
import java.util.EnumSet;

import org.jooq.QueryPart;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;

public class Compatibility {

    // The main method prints a couple of simple
    // and common SQL functions
    public static void main(String[] args) {
        print(sin(3));
        print(power(2, 4));
        print(sinh(3));
        print(lpad(inline("abc"), 3));
        print(dateDiff(currentDate(),
                       Date.valueOf("2000-01-01")));
        print(bitCount((byte) 5));
    }

    private static void print(QueryPart part) {
        System.out.println("Printing " + part);
        System.out.println("---------------------");

        // Get only SQLDialect families, don't
        // distinguish between
        // SQL Server 2008 or SQL Server 20012
        EnumSet<SQLDialect> dialects =
            EnumSet.noneOf(SQLDialect.class);
        for (SQLDialect dialect:SQLDialect.values())
            if (dialect != SQL99)
                dialects.add(dialect.family());

        // Render the passed in SQL clause to
        // all supported SQL dialects
        for (SQLDialect dialect: dialects)
            System.out.println(
                String.format("%1$9s : %2$s",
                dialect, DSL.using(dialect)
                            .renderInlined(part)
            ));

        System.out.println();
        System.out.println();
    }
}

Now check out this output!

Printing sin(3)

This is the most straightforward among the tested functions. It renders the same in every dialect.

      ASE : sin(3)
   CUBRID : sin(3)
      DB2 : sin(3)
    DERBY : sin(3)
 FIREBIRD : sin(3)
       H2 : sin(3)
   HSQLDB : sin(3)
   INGRES : sin(3)
  MARIADB : sin(3)
    MYSQL : sin(3)
   ORACLE : sin(3)
 POSTGRES : sin(3)
   SQLITE : sin(3)
SQLSERVER : sin(3)
   SYBASE : sin(3)

Note that SQLite doesn’t really support this function, though. In subsequent listings, only those dialects are listed that can really emulate the function in question.

In the above list, ASE is Sybase ASE whereas SYBASE is Sybase SQL Anywhere.

Printing power(2, 4)

This gets a little trickier, as the power function isn’t supported in all databases. But it’s easy to emulate it using logarithms and exp.

      ASE : power(2, 4)
   CUBRID : power(2, 4)
      DB2 : power(2, 4)
    DERBY : exp((ln(2) * 4))
 FIREBIRD : power(2, 4)
       H2 : power(2, 4)
   HSQLDB : power(2, 4)
   INGRES : power(2, 4)
  MARIADB : power(2, 4)
    MYSQL : power(2, 4)
   ORACLE : power(2, 4)
 POSTGRES : power(2, 4)
SQLSERVER : power(2, 4)
   SYBASE : power(2, 4)

Printing sinh(3)

OK. Hyperbolic Sine isn’t exactly used every day. But when you need it, you better remember the formula. Do you? Here’s how this function renders:

      ASE : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   CUBRID : ((exp((3 * 2)) - 1) / (exp(3) * 2))
      DB2 : sinh(3)
    DERBY : sinh(3)
 FIREBIRD : sinh(3)
       H2 : sinh(3)
   HSQLDB : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   INGRES : ((exp((3 * 2)) - 1) / (exp(3) * 2))
  MARIADB : ((exp((3 * 2)) - 1) / (exp(3) * 2))
    MYSQL : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   ORACLE : sinh(3)
 POSTGRES : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   SQLITE : sinh(3)
SQLSERVER : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   SYBASE : ((exp((3 * 2)) - 1) / (exp(3) * 2))

Printing lpad(‘abc’, 3)

Off to string functions, they’re useful every now and then. But even simple functions as LPAD aren’t supported in all databases. Here’s a couple of creative solutions using repeat.

      ASE : (replicate(' ', (3 - char_length('abc'))) || 'abc')
   CUBRID : lpad('abc', 3, ' ')
      DB2 : lpad('abc', 3, ' ')
 FIREBIRD : cast(lpad('abc', 3, ' ') as varchar(4000))
       H2 : lpad('abc', 3, ' ')
   HSQLDB : lpad('abc', 3, ' ')
   INGRES : lpad('abc', 3, ' ')
  MARIADB : lpad('abc', 3, ' ')
    MYSQL : lpad('abc', 3, ' ')
   ORACLE : lpad('abc', 3, ' ')
 POSTGRES : lpad('abc', 3, ' ')
   SQLITE : substr(replace(replace(substr(quote(zeroblob(((3 - length('abc') - 1 + length(' ')) / length(' ') + 1) / 2)), 3), '''', ''), '0', ' '), 1, (3 - length('abc'))) || 'abc'
SQLSERVER : (replicate(' ', (3 - len('abc'))) + 'abc')
   SYBASE : (repeat(' ', (3 - length('abc'))) || 'abc')

Eek! Did you see that emulation for SQLite? Awful. Some background can be seen here.

Printing datediff()

Now, datetime arithmetics is probably the least cross-vendor compatible area in SQL. You can only lose if you do not use a well integration-tested tool like jOOQ to abstract these functions. Here’s what the simple difference between two dates amounts to:

      ASE : datediff(day, '2000-01-01', current_date())
   CUBRID : current_date() - date '2000-01-01'
      DB2 : (days(current_date) - days(date '2000-01-01'))
    DERBY : {fn timestampdiff(sql_tsi_day, date('2000-01-01'), current_date) }
 FIREBIRD : datediff(day, date '2000-01-01', current_date)
       H2 : datediff('day', date '2000-01-01', current_date())
   HSQLDB : datediff('day', date '2000-01-01', current_date)
   INGRES : cast((current_date - date '2000-01-01') as integer)
  MARIADB : datediff(current_date(), date '2000-01-01')
    MYSQL : datediff(current_date(), date '2000-01-01')
   ORACLE : sysdate - date '2000-01-01'
 POSTGRES : current_date - date '2000-01-01'
   SQLITE : (strftime('%s', current_date) - strftime('%s', '2000-01-01')) / 86400
SQLSERVER : datediff(day, '2000-01-01', convert(date, current_timestamp))
   SYBASE : datediff(day, '2000-01-01', current date)

The good news is, almost every database has some means of actually implementing this. Except SQLite, where dividing by 86400 seconds is not really daylight savings save (!).

Printing bit_count(5)

Last but not least, let’s have a look at this monster. The above function counts the number of bits set to 1 in the argument value 5. It would result to 2. This example only deals with byte values (TINYINT). You can imagine what larger integer support would look like.

   CUBRID : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
      DB2 : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as integer)
 FIREBIRD : cast((bin_and(5, 1) + bin_shr(bin_and(5, 2), 1) + bin_shr(bin_and(5, 4), 2) + bin_shr(bin_and(5, 8), 3) + bin_shr(bin_and(5, 16), 4) + bin_shr(bin_and(5, 32), 5) + bin_shr(bin_and(5, 64), 6) + bin_shr(bin_and(5, -128), 7)) as integer)
       H2 : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as int)
   HSQLDB : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as int)
  MARIADB : bit_count(5)
    MYSQL : bit_count(5)
   ORACLE : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as number(10))
 POSTGRES : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
   SQLITE : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
   SYBASE : cast(((5 & 1) + ((5 & 2) / power(2, 1)) + ((5 & 4) / power(2, 2)) + ((5 & 8) / power(2, 3)) + ((5 & 16) / power(2, 4)) + ((5 & 32) / power(2, 5)) + ((5 & 64) / power(2, 6)) + ((5 & -128) / power(2, 7))) as int)

The awful thing about the above is that not only the bit_count() function is hardly supported, but there’s a lot of lack of support for bitwise operations as well.

Note, there is also a very interesting discussion on the jOOQ User Group about alternative, shorter ways to emulate the bit_count() function.

Scared?

Don’t be. There’s bigger minefields than function cross-vendor compatibility in SQL. If you want to know for yourself, download jOOQ and run your own function emulations with the above program.

Stay tuned for future posts on these topics!

Funky String Function Simulation in SQLite

SQLite is so light, it doesn’t have any useful string functions. It doesn’t have ASCII(), LPAD(), RPAD(), REPEAT(), POSITION(), you name it. It does, however, have a wonderful RANDOMBLOB() function. So if you really need a good random number generator, use a SQLite database and generate a 1GB blob. That should give you a couple of random numbers for the next years.

For a full (or rather, empty) list see the SQLite function reference here:
http://www.sqlite.org/lang_corefunc.html

Function Simulation: REPEAT()

Not having any functions doesn’t mean that you can’t simulate them. You can. Take REPEAT(), for instance. Apart from the RANDOMBLOB(), you can also generate a ZEROBLOB(). It’s a blob with lots of zeros in it. But you can’t just go and do this:

-- Simulate REPEAT('abc', 3)
replace(zeroblob(3), 0, 'abc')

That would be too easy. The problem with the zeroblob is, that when cast to a string, it is actually a zero-terminated string. Quite usual when programming in C. But hey, the first character is a zero, so the resulting string is terminated right at the beginning. How useful is that??

But here’s a trick, QUOTE() the ZEROBLOB(). That would escape all characters in hex format. In other words:

quote(zeroblob(3)) yields X'000000'

Nice. Now we’ve got three extra letters around twice as many zeroes as we wanted. So we’ll simply do this

-- Simulate REPEAT('abc', 3)
replace(substr(quote(zeroblob(2)), 3, 3), '0', 'abc')

-- Or more generally: X = 'abc', Y = 3
replace(substr(quote(zeroblob((Y + 1) / 2)), 3, Y), '0', X)

Doesn’t that start to make fun? Note, I have documented this simulation also here:
http://stackoverflow.com/questions/11568496/how-to-simulate-repeat-in-sqlite

Function Simulation: LPAD() and RPAD()

REPEAT() was easy. But REPEAT() was inspired by LPAD() and RPAD(), which is similar to REPEAT(), except that a character is padded to the left or right of another string, until a given length of the resulting string is reached. ZEROBLOB() will help us again! Let’s consider RPAD():

-- Simulate RPAD('abc', 7, '-')
'abc' || replace(replace(substr(quote(zeroblob(4)), 3, 4), '''', ''), '0', '-')

-- Or more generally:
-- RPAD() Using X = 7, Y = '-', Z = 'abc'
Z || replace(
       replace(
         substr(
           quote(zeroblob((X + 1) / 2)), 
           3, (X - length(Z))
         ), '''', ''
       ), '0', Y
     )

-- LPAD() Using X = 7, Y = '-', Z = 'abc'
replace(
  replace(
    substr(
      quote(zeroblob((X + 1) / 2)), 
      3, (X - length(Z))
    ), '''', ''
  ), '0', Y
) || Z

Now if this isn’t funky! This was actually something, I didn’t come up with myself. This was an answer I was given on Stack Overflow, where great minds spend lots of spare time on weird problems like this:
http://stackoverflow.com/questions/6576343/how-to-simulate-lpad-rpad-with-sqlite

Of course, these simulations will be part of the next version of jOOQ, so you don’t have to worry any longer about how to do LPAD(), RPAD(), and REPEAT().

What are procedures and functions after all?

Many RDBMS support the concept of “routines”, usually calling them procedures and/or functions. These concepts have been around in programming languages for a while, also outside of databases. Famous languages distinguishing procedures from functions are:

  • Ada
  • BASIC
  • Pascal
  • etc…

The general distinction between (stored) procedures and (stored) functions can be summarised like this:

Procedures:

  • Are called using JDBC CallableStatement
  • Have no return value
  • Usually support OUT parameters

Functions:

  • Can be used in SQL statements
  • Have a return value
  • Usually don’t support OUT parameters

But there are exceptions to these rules:

  • DB2, H2, and HSQLDB don’t allow for JDBC escape syntax when calling functions. Functions must be used in a SELECT statement
  • H2 only knows functions (without OUT parameters)
  • Oracle functions may have OUT parameters
  • Oracle knows functions that mustn’t be used in SQL statements for transactional reasons
  • Postgres only knows functions (with all features combined). OUT parameters can also be interpreted as return values, which is quite elegant/freaky, depending on your taste
  • The Sybase jconn3 JDBC driver doesn’t handle null values correctly when using the JDBC escape syntax on functions

In general, it can be said that the field of routines (procedures / functions) is far from being standardised in modern RDBMS. Every database has its ways and JDBC only provides little abstraction over the great variety of procedures / functions implementations, especially when advanced data types such as cursors / UDT’s / arrays are involved.