Derby casting madness – the sequel

I have recently blogged about the general bind variable casting madness in SQL:

https://lukaseder.wordpress.com/2011/08/31/rdbms-bind-variable-casting-madness/

So this is the sequel of the above story, purely dedicated to Derby with its “conversion table from hell“. One of jOOQ’s goals is to make SQL as compatible as possible across various databases, in a way that you can re-use the same SQL on various environments. For instance:

  • Use Derby to develop your database
  • Use DB2 for production

While I personally discourage such set-ups, I know that many developers prefer that, especially when it comes to run fast-running integration tests. And the above coupling of Derby and DB2 is an especially good one, as Derby is quite similar to DB2. See also this Stack Overflow Question:

http://stackoverflow.com/questions/4419684/portable-schema-between-derby-and-db2

But back to casting. In order to make casting as compatible as possible, jOOQ generates casting SQL according to the following rules:

Casting NUMERIC to VARCHAR

Interestingly, this is not supported, but casting to CHAR is. So jOOQ generates:

-- When 123 is inlined:
trim(cast(cast(123 as char(38)) as varchar(32672)))

-- When 123 is bound as a variable
trim(cast(cast(cast(? as int) as char(38)) as varchar(32672)))

Casting CHAR/VARCHAR to DOUBLE/FLOAT/REAL

Again, this isn’t supported for some reason. So jOOQ generates:

-- When 123.0 is inlined:
cast(cast('123.0' as decimal) as float)

-- When 123.0 is bound as a variable
cast(cast(cast(? as varchar(32672)) as decimal) as float)

Casting NUMERIC to BOOLEAN

This can’t be expressed simply with a CAST clause. A CASE .. WHEN clause is rendered instead, by jOOQ (note that Derby doesn’t support the simple CASE clause either…):

case when cast(? as int) = 0 then false
     when cast(? as int) is null then null
     else true
end

Casting CHAR/VARCHAR to BOOLEAN

The Derby documentation claims that this should work, but I’ve experienced quite some trouble. Derby seems to accept only SQL standard boolean literals and rejects values such as ‘0’, ‘1’, etc… Most databases accept ‘0’, ‘1’ as boolean string values as well. So jOOQ simulates the following

case when       cast(? as varchar(32672))  = '0' then false
     when lower(cast(? as varchar(32672))) = 'false' then false
     when lower(cast(? as varchar(32672))) = 'f' then false
     when cast(? as varchar(32672)) is null then null
     else true
end

Casting other pairs

Luckily, all other common types of casts seem to work as expected, also in the Derby database.

RDBMS bind variable casting madness

Spoiled with every day work on the Magnificent Oracle Database TM, I had to become quite inventive at the first time, when I wanted to start supporting RDBMS such as Derby, HSQLDB, DB2, etc in jOOQ. Oracle really does a great job in keeping the hard work away from you. Bind variable types are assessed at various instants in query execution, in order to find the optimal query execution plan and performance. These instants can be distinguished:

  1. SQL parse time: Much information is available early, when a SQL statement is parsed for the first time. This will lead to most optimal execution plans. Most RDBMS do this to some extent.
  2. Bind time: When you bind variables, well, the JDBC driver doesn’t have to be too smart to assess that a java.lang.Integer is best represented by NUMBER(10), or a java.lang.String by VARCHAR2. If bind-variable peeking is activated, then the execution plan can still be quite good, even if the bind variable type isn’t necessarily easy to infer anymore.
  3. Execution time: If all else fails, then while iterating over the cursor, you can still run checks on encountered data. It’s too late to affect the execution plan, but not to keep data integrity.

So, I could hardly imagine that any of these prepared statements would need bind variable casting:

-- Easy to infer bind types or optimal default cast, at SQL parse time
INSERT INTO table (field1, field2) VALUES (?, ?)
INSERT INTO table (field1, field2) SELECT ?, ? FROM DUAL
UPDATE table SET field1 = ?, field2 = ?
SELECT * FROM table WHERE field1 = ? AND field2 = ?
SELECT field1 * ? / ? FROM table
SELECT field1 || ? || field2 FROM table

-- Bind types can still be inferred, at bind time
SELECT ?, ? FROM DUAL

-- etc...

But some of you will know the depths of DB2 and Derby well enough to be aware of the ultra-strong type systems, those RDBMS chose to implement. In DB2, for instance, these two values are very different:

cast(null as char(3))
cast(null as integer)

In many RDBMS, for instance, this won’t work

-- Won't work
INSERT INTO table (field1, field2) SELECT ?, ? FROM DUAL
-- Has to be written like this
INSERT INTO table (field1, field2) SELECT cast(? as char(3)), cast(? as integer) FROM DUAL

For a generic database abstraction framework, this can be quite a challenge to hide from the client code, the way jOOQ aims to do it. So here’s an overview over this casting madness, and how which RDBMS can be categorised:

Friendly RDBMS

These RDBMS are my friends. They are powerful, provide lots of functionality, and infer all sorts of data-types, so I don’t have to cast any bind variables. Give a hand to these guys:

  • MySQL (*)
  • Postgres
  • Oracle
  • SQL Server
  • SQLite (**)

(*) MySQL is a bit special. For some obscure reason, the type used in the CAST() function differs from the type defined in the table’s DDL statement. See some documentation on this.
(**) SQLite is very special. Well, they just don’t care. You wanna put a VARCHAR into a NUMERIC column? No problem. A BOOLEAN into a DATE column? Even better! They call that “type affinity“.

Friendly RDBMS, with the odd JDBC bug

Very unfortunately, these JDBC drivers seem to suffer from 1-2 bugs concerning uncast variable binding in rather complex SQL (e.g. nested SELECTs in JOIN clauses). So to be on the safe side, casting is better.

  • Ingres
  • Sybase SQL Anywhere

The difficult ones

They do a lot of inferring. But also a lot of intentional omitting inferring. You can never be sure whether you can omit the cast for your next query or not with these guys:

  • H2
  • HSQLDB

I have recently reported several bugs to H2 and HSQLDB regarding casting. Some 50% of them were rejected, as “works-as-designed”. So I guess, they will remain “difficult” for the next couple of years. Also interesting: H2 is developed by a former developer of HSQLDB (read about the H2 history). They are very much alike, still, even after heavy re-designs. But the places where casting is needed are entirely different. Let’s turn a blind eye on their claim of “intentionally” omitting type inference ;-)

The lost ones

These RDBMS hardly infer any type. In fact, a bind variable without an associated type is utterly meaningless in many situations. Up until DB2 v9.7, even the literal null had no meaning without an associated type. That’s very hard to handle in a database abstraction layer, as null really isn’t a value from the Java perspective. So the strict ones are:

  • Derby
  • DB2

In many cases, even when using jOOQ, you’ll have to use the jOOQ cast API extensively to get your types right… If you dare. Because with Derby, you cannot execute simple casts, such as from INTEGER to VARCHAR without casting to CHAR first. See the conversion table from hell.