Funky String Function Simulation in SQLite

SQLite is so light, it doesn’t have any useful string functions. It doesn’t have ASCII(), LPAD(), RPAD(), REPEAT(), POSITION(), you name it. It does, however, have a wonderful RANDOMBLOB() function. So if you really need a good random number generator, use a SQLite database and generate a 1GB blob. That should give you a couple of random numbers for the next years.

For a full (or rather, empty) list see the SQLite function reference here:
http://www.sqlite.org/lang_corefunc.html

Function Simulation: REPEAT()

Not having any functions doesn’t mean that you can’t simulate them. You can. Take REPEAT(), for instance. Apart from the RANDOMBLOB(), you can also generate a ZEROBLOB(). It’s a blob with lots of zeros in it. But you can’t just go and do this:

-- Simulate REPEAT('abc', 3)
replace(zeroblob(3), 0, 'abc')

That would be too easy. The problem with the zeroblob is, that when cast to a string, it is actually a zero-terminated string. Quite usual when programming in C. But hey, the first character is a zero, so the resulting string is terminated right at the beginning. How useful is that??

But here’s a trick, QUOTE() the ZEROBLOB(). That would escape all characters in hex format. In other words:

quote(zeroblob(3)) yields X'000000'

Nice. Now we’ve got three extra letters around twice as many zeroes as we wanted. So we’ll simply do this

-- Simulate REPEAT('abc', 3)
replace(substr(quote(zeroblob(2)), 3, 3), '0', 'abc')

-- Or more generally: X = 'abc', Y = 3
replace(substr(quote(zeroblob((Y + 1) / 2)), 3, Y), '0', X)

Doesn’t that start to make fun? Note, I have documented this simulation also here:
http://stackoverflow.com/questions/11568496/how-to-simulate-repeat-in-sqlite

Function Simulation: LPAD() and RPAD()

REPEAT() was easy. But REPEAT() was inspired by LPAD() and RPAD(), which is similar to REPEAT(), except that a character is padded to the left or right of another string, until a given length of the resulting string is reached. ZEROBLOB() will help us again! Let’s consider RPAD():

-- Simulate RPAD('abc', 7, '-')
'abc' || replace(replace(substr(quote(zeroblob(4)), 3, 4), '''', ''), '0', '-')

-- Or more generally:
-- RPAD() Using X = 7, Y = '-', Z = 'abc'
Z || replace(
       replace(
         substr(
           quote(zeroblob((X + 1) / 2)), 
           3, (X - length(Z))
         ), '''', ''
       ), '0', Y
     )

-- LPAD() Using X = 7, Y = '-', Z = 'abc'
replace(
  replace(
    substr(
      quote(zeroblob((X + 1) / 2)), 
      3, (X - length(Z))
    ), '''', ''
  ), '0', Y
) || Z

Now if this isn’t funky! This was actually something, I didn’t come up with myself. This was an answer I was given on Stack Overflow, where great minds spend lots of spare time on weird problems like this:
http://stackoverflow.com/questions/6576343/how-to-simulate-lpad-rpad-with-sqlite

Of course, these simulations will be part of the next version of jOOQ, so you don’t have to worry any longer about how to do LPAD(), RPAD(), and REPEAT().

Simulation of TRUNC() in Derby

Derby is missing out a lot of functions from the set of functions that other databases usually provide. One example is the TRUNC(value, decimals) function. According to the Wikipedia, truncation can be achieved as such:

-- trunc(x, n) 
CASE WHEN x > 0 
THEN
  floor(power(10, n) * x) / power(10, n) 
ELSE 
  ceil(power(10, n) * x) / power(10, n) 
END

Unfortunately, there is no POWER(base, exponent) function in Derby either. But no problem, we can simulate that as well. Let’s consider the Wikipedia again and we’ll find:

power(b, x) = exp(x * ln(b))

If we substitute that into the original simulation, we get for Derby:

-- trunc(x, n) 
CASE WHEN x > 0 
THEN 
  floor(exp(n * ln(10)) * x) / exp(n * ln(10))
ELSE 
  ceil(exp(n * ln(10)) * x) / exp(n * ln(10)) 
END

Verbose, probably quite inefficient, but effective! Let’s run a short test, too:

create table test (x numeric(10, 5), n int);

insert into test values (11.111, 0);
insert into test values (11.111, 1);
insert into test values (11.111, 2);
insert into test values (11.111, -1);

select
  x, n, 
  case when x >= 0
  then
    floor(exp(n * ln(10)) * x) / exp(n * ln(10))
  else
    ceil(exp(n * ln(10)) * x) / exp(n * ln(10))
  end "trunc(x, n)"
from test;

The above yields

X N TRUNC(X, N)
11.111 0 11
11.111 1 11.1
11.111 2 11.11
11.111 -1 10

How to simulate MySQL’s INSERT statement extensions

I have previously posted about the SQL MERGE statement, and how powerful it is here:
https://blog.jooq.org/2011/11/29/arcane-magic-with-the-sql2003-merge-statement/

Unfortunately, not all databases support this statement. Also, very often it is quite a bit of overkill, when what you want to do is to simply INSERT or UPDATE a single record, depending on whether it already exists.

MySQL’s nice INSERT statement extensions

MySQL has a couple of very nice extensions to the INSERT statement. Some of them are also supported by the CUBRID database. Let’s have a look at two features of the (simplified) syntax definition:

INSERT [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    VALUES (expr,...),(...),...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

 
Essentially, we’re looking at two keywords / clauses:

  1. IGNORE: This will cause the INSERT statement to silently fail with a warning, if we’re about to insert a duplicate key
  2. ON DUPLICATE KEY UPDATE: This will cause the INSERT statement to perform an update on all affected records, if we’re about to insert a duplicate key

The full documentation can be seen here:
http://dev.mysql.com/doc/refman/5.5/en/insert.html

How jOOQ models these statements in other dialects

jOOQ’s API conveniently allows for using those MySQL features directly in insert statements. An example

// Insert an author who has recently married and thus
// changed their last name. If we already have that author, then
// update their last name:

create.insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
      .values(3, "Smith")
      .onDuplicateKeyUpdate()
      .set(AUTHOR.LAST_NAME, "Smith")
      .execute();

// Alternatively, we may not care if the author already
// exists, we just want to insert them without failure:

create.insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
      .values(3, "Smith")
      .onDuplicateKeyIgnore()
      .execute();

Note how jOOQ doesn’t model the IGNORE keyword as a keyword between INSERT [IGNORE] INTO. This keeps the public API a bit more concise, while at the same time being more consistent with pre-existing functionality, both in the INSERT syntax, as well as in the Loader API:

https://blog.jooq.org/2011/08/23/loading-csv-data-with-jooq-2/

Now, databases that do not support these syntax elements can simulate them using MERGE. The above statements will render something like this on an Oracle database:

-- ON DUPLICATE KEY UPDATE statement
merge into "AUTHOR" 
using (select 1 from dual) 
on ("AUTHOR"."ID" = 3) 
when matched then update set "LAST_NAME" = 'Smith' 
when not matched then insert ("ID", "LAST_NAME") values (3, 'Smith')

-- ON DUPLICATE KEY IGNORE statement
merge into "AUTHOR" 
using (select 1 from dual) 
on ("AUTHOR"."ID" = 3) 
when not matched then insert ("ID", "LAST_NAME") values (3, 'Smith')

Conclusion

These practices allow you to write short and concise code using the jOOQ API while still being able to target several backend databases optimally using efficient SQL statements. This will help you increase both your development as well as your runtime performance!