RDBMS bind variable casting madness


Spoiled with every day work on the Magnificent Oracle Database TM, I had to become quite inventive at the first time, when I wanted to start supporting RDBMS such as Derby, HSQLDB, DB2, etc in jOOQ. Oracle really does a great job in keeping the hard work away from you. Bind variable types are assessed at various instants in query execution, in order to find the optimal query execution plan and performance. These instants can be distinguished:

  1. SQL parse time: Much information is available early, when a SQL statement is parsed for the first time. This will lead to most optimal execution plans. Most RDBMS do this to some extent.
  2. Bind time: When you bind variables, well, the JDBC driver doesn’t have to be too smart to assess that a java.lang.Integer is best represented by NUMBER(10), or a java.lang.String by VARCHAR2. If bind-variable peeking is activated, then the execution plan can still be quite good, even if the bind variable type isn’t necessarily easy to infer anymore.
  3. Execution time: If all else fails, then while iterating over the cursor, you can still run checks on encountered data. It’s too late to affect the execution plan, but not to keep data integrity.

So, I could hardly imagine that any of these prepared statements would need bind variable casting:

-- Easy to infer bind types or optimal default cast, at SQL parse time
INSERT INTO table (field1, field2) VALUES (?, ?)
INSERT INTO table (field1, field2) SELECT ?, ? FROM DUAL
UPDATE table SET field1 = ?, field2 = ?
SELECT * FROM table WHERE field1 = ? AND field2 = ?
SELECT field1 * ? / ? FROM table
SELECT field1 || ? || field2 FROM table

-- Bind types can still be inferred, at bind time
SELECT ?, ? FROM DUAL

-- etc...

But some of you will know the depths of DB2 and Derby well enough to be aware of the ultra-strong type systems, those RDBMS chose to implement. In DB2, for instance, these two values are very different:

cast(null as char(3))
cast(null as integer)

In many RDBMS, for instance, this won’t work

-- Won't work
INSERT INTO table (field1, field2) SELECT ?, ? FROM DUAL
-- Has to be written like this
INSERT INTO table (field1, field2) SELECT cast(? as char(3)), cast(? as integer) FROM DUAL

For a generic database abstraction framework, this can be quite a challenge to hide from the client code, the way jOOQ aims to do it. So here’s an overview over this casting madness, and how which RDBMS can be categorised:

Friendly RDBMS

These RDBMS are my friends. They are powerful, provide lots of functionality, and infer all sorts of data-types, so I don’t have to cast any bind variables. Give a hand to these guys:

  • MySQL (*)
  • Postgres
  • Oracle
  • SQL Server
  • SQLite (**)

(*) MySQL is a bit special. For some obscure reason, the type used in the CAST() function differs from the type defined in the table’s DDL statement. See some documentation on this.
(**) SQLite is very special. Well, they just don’t care. You wanna put a VARCHAR into a NUMERIC column? No problem. A BOOLEAN into a DATE column? Even better! They call that “type affinity“.

Friendly RDBMS, with the odd JDBC bug

Very unfortunately, these JDBC drivers seem to suffer from 1-2 bugs concerning uncast variable binding in rather complex SQL (e.g. nested SELECTs in JOIN clauses). So to be on the safe side, casting is better.

  • Ingres
  • Sybase SQL Anywhere

The difficult ones

They do a lot of inferring. But also a lot of intentional omitting inferring. You can never be sure whether you can omit the cast for your next query or not with these guys:

  • H2
  • HSQLDB

I have recently reported several bugs to H2 and HSQLDB regarding casting. Some 50% of them were rejected, as “works-as-designed”. So I guess, they will remain “difficult” for the next couple of years. Also interesting: H2 is developed by a former developer of HSQLDB (read about the H2 history). They are very much alike, still, even after heavy re-designs. But the places where casting is needed are entirely different. Let’s turn a blind eye on their claim of “intentionally” omitting type inference ;-)

The lost ones

These RDBMS hardly infer any type. In fact, a bind variable without an associated type is utterly meaningless in many situations. Up until DB2 v9.7, even the literal null had no meaning without an associated type. That’s very hard to handle in a database abstraction layer, as null really isn’t a value from the Java perspective. So the strict ones are:

  • Derby
  • DB2

In many cases, even when using jOOQ, you’ll have to use the jOOQ cast API extensively to get your types right… If you dare. Because with Derby, you cannot execute simple casts, such as from INTEGER to VARCHAR without casting to CHAR first. See the conversion table from hell.

Tags: , , , , , ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,083 other followers

%d bloggers like this: