RDBMS bind variable casting madness

Spoiled with every day work on the Magnificent Oracle Database TM, I had to become quite inventive at the first time, when I wanted to start supporting RDBMS such as Derby, HSQLDB, DB2, etc in jOOQ. Oracle really does a great job in keeping the hard work away from you. Bind variable types are assessed at various instants in query execution, in order to find the optimal query execution plan and performance. These instants can be distinguished:

  1. SQL parse time: Much information is available early, when a SQL statement is parsed for the first time. This will lead to most optimal execution plans. Most RDBMS do this to some extent.
  2. Bind time: When you bind variables, well, the JDBC driver doesn’t have to be too smart to assess that a java.lang.Integer is best represented by NUMBER(10), or a java.lang.String by VARCHAR2. If bind-variable peeking is activated, then the execution plan can still be quite good, even if the bind variable type isn’t necessarily easy to infer anymore.
  3. Execution time: If all else fails, then while iterating over the cursor, you can still run checks on encountered data. It’s too late to affect the execution plan, but not to keep data integrity.

So, I could hardly imagine that any of these prepared statements would need bind variable casting:

-- Easy to infer bind types or optimal default cast, at SQL parse time
INSERT INTO table (field1, field2) VALUES (?, ?)
INSERT INTO table (field1, field2) SELECT ?, ? FROM DUAL
UPDATE table SET field1 = ?, field2 = ?
SELECT * FROM table WHERE field1 = ? AND field2 = ?
SELECT field1 * ? / ? FROM table
SELECT field1 || ? || field2 FROM table

-- Bind types can still be inferred, at bind time
SELECT ?, ? FROM DUAL

-- etc...

But some of you will know the depths of DB2 and Derby well enough to be aware of the ultra-strong type systems, those RDBMS chose to implement. In DB2, for instance, these two values are very different:

cast(null as char(3))
cast(null as integer)

In many RDBMS, for instance, this won’t work

-- Won't work
INSERT INTO table (field1, field2) SELECT ?, ? FROM DUAL
-- Has to be written like this
INSERT INTO table (field1, field2) SELECT cast(? as char(3)), cast(? as integer) FROM DUAL

For a generic database abstraction framework, this can be quite a challenge to hide from the client code, the way jOOQ aims to do it. So here’s an overview over this casting madness, and how which RDBMS can be categorised:

Friendly RDBMS

These RDBMS are my friends. They are powerful, provide lots of functionality, and infer all sorts of data-types, so I don’t have to cast any bind variables. Give a hand to these guys:

  • MySQL (*)
  • Postgres
  • Oracle
  • SQL Server
  • SQLite (**)

(*) MySQL is a bit special. For some obscure reason, the type used in the CAST() function differs from the type defined in the table’s DDL statement. See some documentation on this.
(**) SQLite is very special. Well, they just don’t care. You wanna put a VARCHAR into a NUMERIC column? No problem. A BOOLEAN into a DATE column? Even better! They call that “type affinity“.

Friendly RDBMS, with the odd JDBC bug

Very unfortunately, these JDBC drivers seem to suffer from 1-2 bugs concerning uncast variable binding in rather complex SQL (e.g. nested SELECTs in JOIN clauses). So to be on the safe side, casting is better.

  • Ingres
  • Sybase SQL Anywhere

The difficult ones

They do a lot of inferring. But also a lot of intentional omitting inferring. You can never be sure whether you can omit the cast for your next query or not with these guys:

  • H2
  • HSQLDB

I have recently reported several bugs to H2 and HSQLDB regarding casting. Some 50% of them were rejected, as “works-as-designed”. So I guess, they will remain “difficult” for the next couple of years. Also interesting: H2 is developed by a former developer of HSQLDB (read about the H2 history). They are very much alike, still, even after heavy re-designs. But the places where casting is needed are entirely different. Let’s turn a blind eye on their claim of “intentionally” omitting type inference 😉

The lost ones

These RDBMS hardly infer any type. In fact, a bind variable without an associated type is utterly meaningless in many situations. Up until DB2 v9.7, even the literal null had no meaning without an associated type. That’s very hard to handle in a database abstraction layer, as null really isn’t a value from the Java perspective. So the strict ones are:

  • Derby
  • DB2

In many cases, even when using jOOQ, you’ll have to use the jOOQ cast API extensively to get your types right… If you dare. Because with Derby, you cannot execute simple casts, such as from INTEGER to VARCHAR without casting to CHAR first. See the conversion table from hell.

Postgres INSERT .. RETURNING clause and how this can be simulated in other RDBMS

One of jOOQ’s major features is to take the most useful SQL constructs and clauses from any RDBMS and make them available to other SQL dialects, as well. This had been done previously with MySQL’s INSERT .. ON DUPLICATE KEY UPDATE construct, which can be easily simulated with the more powerful SQL standard MERGE statement, or using procedural SQL blocks where this is supported.

A recent request made me think about Postgres’ INSERT .. RETURNING clause, which is probably the most intuitive and concise way of returning generated keys from an insert statement. The importance of doing that becomes clear in the context of a jOOQ UpdatableRecord, which, when inserted, should refresh its IDENTITY, or Primary Key value. With jOOQ 1.6.4, this was done in a bit “risky” way, by just fetching the MAX(PK) value from the table, immediately after insert. Obviously, this can go terribly wrong in a highly concurrent system, when other transactions are committed earlier.

But how to map INSERT .. RETURNING to other RDBMS? Here’s a quick overview over what’s supported in which database / JDBC driver:

Vendor-specific SQL syntax support

On Postgres and DB2, you can also execute this for INSERT statements:

ResultSet rs = statement.executeQuery();

The SQL syntax to fetch a java.sql.ResultSet from an INSERT statement works like this:

-- Postgres
INSERT INTO .. RETURNING *

-- DB2
SELECT * FROM FINAL TABLE (INSERT INTO ..)

Oracle also knows of a similar clause

INSERT INTO .. RETURNING INTO ?, ?, ...

This PL/SQL extension cannot be executed easily with standard JDBC, though. Either, it has to be wrapped in a PL/SQL block and executed as a java.sql.CallableStatement, or by casting the prepared statement into an OraclePreparedStatement and registering return values, as described here:

http://stackoverflow.com/questions/682539/return-rowid-parameter-from-insert-statement-using-jdbc-connection-to-oracle

Optimal JDBC support

Some RDBMS have “optimal” support for returning values after an INSERT statement. By “optimal”, I mean that all table columns can be returned, regardless of whether they are actual keys or not. These RDBMS are: HSQLDB, Oracle, DB2.

If this is supported by the JDBC driver, then the simulation of the Postgres INSERT .. RETURNING clause is very simple, as the requested fields can be passed at prepared statement initialisation time:

// Watch out for case-sensitivity!
String[] columnNames = // [...] transform RETURNING clause
PreparedStatement stmt = connection.prepareStatement(sql, columnNames);

Limited JDBC support

Other RDBMS have “limited” support for returning values. This means that only generated IDENTITY (AUTO_INCREMENT) values will be returned. This applies to: Derby, H2, MySQL, SQL Server.

PreparedStatement stmt = connection.prepareStatement(sql,
  Statement.RETURN_GENERATED_KEYS);

If more than the IDENTITY column value is requested in the simulated INSERT .. RETURNING clause, then an additional SELECT statement has to be issued right after. If transactions are properly handled by client code (i.e. the SELECT will run in the same transaction as the INSERT), then no race conditions can occur and the behaviour of this is correct.

No JDBC support

Unfortunately, there are also JDBC drivers that do not support returning values from INSERT statements. The affected RDBMS are: Sybase, SQLite. In a future version of jOOQ, the INSERT .. RETURNING clause can be simulated in three steps:

  1. INSERT
  2. Fetch Sybase @@identity / SQLite last_insert_rowid()
  3. Fetch other requested columns

Loading CSV data with jOOQ

After the recent efforts made in jOOX, developments of jOOQ have been continued. The main new feature of the upcoming release 1.6.5 is the support for loading of CSV data. The jOOQ Factory will now provide access to a dedicated fluent API for loading CSV files into generated tables, specifying a field mapping and various other parameters related to the batch-processing of bulk loads. Some sample code of what the API might look like:

// The typical jOOQ factory
Factory create = new Factory(connection, SQLDialect.ORACLE);

// Configure and execute a Loader object
Loader<TAuthor> loader =
create.loadInto(AUTHOR)
      .onDuplicateKeyError()
      .onErrorAbort()
      .commitAll()
      .loadCSV("1;'Kafka'\n" +
               "2;Frisch")
      .fields(Author.ID, Author.LAST_NAME)
      .quote('\'')
      .separator(';')
      .ignoreRows(0)
      .execute();

// The resulting Loader object then holds various
// information about the loading process:

// The number of processed rows
int processed = loader.processed();

// The number of stored rows (INSERT or UPDATE)
int stored = loader.stored();

// The number of ignored rows (due to errors, or duplicate rule)
int ignored = loader.ignored();

// The errors that may have occurred during loading
List<LoaderError> errors = loader.errors();
LoaderError error = errors.get(0);

// The exception that caused the error
SQLException exception = error.exception();

// The row that caused the error
int rowIndex = error.rowIndex();
String[] row = error.row();

// The query that caused the error
Query query = error.query();

Along with the previously implemented export API, it is easy to export results from org.jooq.Result into CSV, let users modify them in Excel or any other office software, and upload the CSV again. Other ideas for future versions of jOOQ will also include loading data from XML and JSON data sources, “merging” data (i.e. including DELETE operations), etc.

Feedback is very welcome.

There’s still potential for new SQL dialects

A recent feature request reminded me, that there is still a lot of potential for new SQL dialects supported by jOOQ. User Philippe is considering jOOQ for various projects in his organisation, and one dialect he’s missing is that of Sybase‘s Adaptive Server Enterprise. Sybase is one of the older databases, that is still widely used, especially in the health industry sector. Its history started in the mid-80’s where it was at some point forked into Microsoft’s SQL Server. This is well reflected in jOOQ, where you can see that many dialect-specific behaviour patterns are the same for Sybase SQL Anywhere and SQL Server. Some common heritage from the “Ingres dinosaur” can also be seen.

jOOQ’s Sybase support is currently limited to SQL Anywhere, but there’s no reason not to support Sybase ASE as well! I’m looking forward to contributions from Philippe. Note, there are also other dialects that are worth considering for support by jOOQ:

See a nice chart about RDBMS history here:
http://en.wikipedia.org/wiki/User:Intgr/RDBMS_timeline

See the discussion with Philippe here:
http://groups.google.com/group/jooq-user/browse_thread/thread/130b44cb59b14a82

Oracle’s object-oriented PL/SQL extensions

I have recently re-discovered an interesting feature of Oracle‘s PL/SQL language. Not only can you define your own types very easily, you can also associate “methods” to them, as in other object-oriented languages. Oracle calls those “methods” member functions and member procedures. This is documented here, for example:

http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28371/adobjbas.htm#i477669

So you can define your own type like this

create type circle as object (
   radius number,
   member procedure draw,
   member function  circumference return number
);

create type body circle as
   member procedure draw is
   begin
     null; -- draw the circle
   end draw;

   member function circumference return number is
   begin
     return 2 * 3.141 * radius;
   end circumference;
end;

In PL/SQL, you can instanciate that type and call its member procedures and functions easily:

declare
   c circle;
begin
   c := circle(5);
   dbms_output.put_line(to_char(c.circumference));
end;

The same function can be called in JDBC with this statement

CallableStatement call = c.prepareCall(
  "{ ? = call circle(5).circumference() }");

It would only feel natural for jOOQ-generated UDT records to provide access to the underlying member procedures and functions. The jOOQ UDTRecord is an attachable object. That means, that if it is fetched from the database, it holds a reference to a jOOQ Configuration, and thus to a JDBC Connection. So if you create a procedure that returns a circle, you can call procedures and functions directly on that circle. Your Java code might look like this:

// Call the stored function that returns a new circle type
CircleRecord circle = Functions.getNewCircle(configuration);

// Use the attached CircleRecord to calculate the circumference
BigDecimal circumference = circle.circumference();

// And draw the circle in the database
circle.draw();

Look out in future versions of jOOQ for this exciting new feature! This awesome feature has been part of jOOQ for a long time now!

FluentDOM, another mimick of jQuery DOM manipulation, in PHP

The triumph of jQuery over any other XML API seems prominent, in many languages. Here is another example of a nice jQuery-port to PHP: FluentDOM.

http://fluentdom.github.com/

Similar to jOOX, FluentDOM aims to combine a jQuery-like fluent API with XPath and general DOM XML manipulation. Here are some simple examples taken from the FluentDOM documentation:

// read a file and set the message tag's content
echo FluentDOM($xmlFile)
  ->find('/message')
  ->text('Hello World!');

// Find the <root> first then the second element in it
var_dump($fd->find('/root')->find('*[2]')->item(0)->textContent);

// Append elements to an object
$menu
  ->append('<li/>')
  ->append('<a/>')
  ->attr('href', '/sample.php')
  ->text('Sample');

I’m in contact with the developers of FluentDOM. As always with OSS, there is great potential for synergy, which in the end will make both products better. For jOOX, this means that loading of files/streams is going to be a nice plus. XPath is already implemented in the upcoming release 0.9.2. On the other hand, maybe FluentDOM can get inspiration from jOOQ’s document creation syntax (which isn’t part of jQuery):

$("root",
  $("element",
    $("child", "text"),
    $("child", "more text")));

… which will create

<root>
  <element>
    <child>text</child>
    <child>more text</child>
  </element>
</root>

Excited as always, let’s get back to hacking! 🙂

See the latest progress here: http://code.google.com/p/joox/

Another Fluent API: jOOX. Porting jQuery to Java

Recently, in my every day programming madness, I really felt the urge to kill someone involved with the formal specification of DOM. The beloved Document Object Model. While everyone understands that this API is complete in functionality and scope and it’s a standard, and it’s almost the same in every language…. well it’s incredibly verbose. Manipulating XML is about as fun and exciting as cleaning the dishes of a 2000-people Indian wedding.

And then, suddenly, I remembered that this is how I felt with Java’s support for advanced SQL and how JPA/CriteriaQuery made me feel like that poor dishwasher, before. And I wondered whether someone had felt like me before. So I asked this question on Stack Overflow:

http://stackoverflow.com/questions/6996013/a-nice-java-xml-dom-utility

And I got the expected answers about JDOM and dom4j. Two dinosaur projects that are neither sexier nor more efficient than the standard itself (e.g. Xerces). See this answer about performance:

http://stackoverflow.com/questions/6996013/a-nice-java-xml-dom-utility#6998870

I had also found one project, that has a somewhat fluent approach:

http://code.google.com/p/xmltool/

It looks quite nice, actually, although it is a bit biased towards DOM creation, not navigation. And then, it struck me like lightning: “Why hasn’t anyone ported jQuery” to Java, yet?? jQuery is exactly how an XML API should be: Awesome. Fluent. Fun, and efficient to use. So I tried to hack something together that looks like jQuery and that’s the beginning of another product in the “jOO-Star suite”: jOOX with X for XML! I wanted this to be fluent, and fun and efficient to use. Like jOOQ. So jOOX will be an attempt for doing precisely that. Here’s an example of what jOOX code looks like:

// Find the order at index for and add an element "paid"
joox(document).find("orders")
              .children()
              .eq(4)
              .append("<paid>true</paid>");

// Find those orders that are paid and flag them as "settled"
joox(document).find("orders")
              .children()
              .find("paid")
              .after("<settled>true</settled>");

This rapid prototype of a jQuery port looks very promising to me, even if the most important features aren’t there yet (e.g. navigation with expression languages, selectors, etc). With Java’s static typing and without all the browser-related issues and JavaScript event handling and CSS and all that, pure DOM navigation and manipulation is actually not that hard to wrap. In any case, I have now even more respect for the jQuery guys, as I’m just touching the tip of the iceberg.

So in the future, I will also post one or two entries about jOOX on this blog. Looking forward to feedback!

Download jOOX from Google Code:

http://code.google.com/p/joox/