MySQL Bad Idea #573

This is MySQL’s Bad Idea #573 (after #384, which I’ve blogged about before) I’ve just had a terrible experience with a bug report from the jOOQ User Group, related to escaping of backslashes in string literals in MySQL. First, I thought to myself, whatever. SQL doesn’t escape backslashes. The only escape character within a string literal according to the early SQL standards is the quote as in quote quote. Citing from SQL-1992 (slightly simplified):

<character string literal> ::=
    <quote> [ <character representation>... ] <quote>

<character representation> ::=
    <nonquote character>
  | <quote symbol><nonquote character> ::= !! See the Syntax Rules.
<quote symbol> ::= <quote><quote>

Alright? Crystal clear. There’s no escaping other than <quote><quote>

The only time when you will need to be able to escape something is with the LIKE predicate, in case you want to escape % and _ symbols. You can then use the ESCAPE clause:

<like predicate> ::=
    <match value> [ NOT ] LIKE <pattern>
    [ ESCAPE <escape character> ]

I have now learned, that MySQL (and of course MariaDB) unlike any other database also support quoting with backslashes, similar to Java and other languages. That’s not a problem per se, although from a cross-vendor compatibility perspective, it’s quite nasty. But then, I’ve discovered there is a flag called NO_BACKSLASH_ESCAPES:

This just reminds me of PHP’s horrible magic quotes. In fact, combine arbitrary configurations of PHP and MySQL on your server and good luck to you of ever getting string literals right. Sigh.

MySQL Bad Idea #384

MySQL is a database of compromise. Compromise between running a production-ready relational database and being popular with all sorts of hackers – mostly the ones that don’t really like SQL. And because they don’t really like SQL, they choose MySQL, as MySQL is very forgiving. It is just as forgiving as their favourite language PHP, which forgives their mistakes involving escaping and quoting through funny things like “magic quotes”. Not only is MySQL forgiving, it allows you to write “wrong” SQL and still does something with it. Here’s what I mean by “wrong” SQL:

In MySQL, you can legally execute the following statement:

SELECT o.custid, c.name, MAX(o.payment)
  FROM orders AS o, customers AS c
  WHERE o.custid = c.custid
  GROUP BY o.custid;

The statement was taken from here:
http://dev.mysql.com/doc/refman/5.6/en/group-by-hidden-columns.html

So what does this statement even mean? What will be returned in the c.name projection? MAX(c.name)? ANY(c.name)? FIRST(c.name)? NULL? 42? According to the documentation, ANY(c.name) would best describe what’s going on. This peculiar syntax is probably quite clever for those few that really know when this is useful. When they know exactly, that o.custid and c.name have a 1:1 correlation, and they can speed things up a little by avoiding writing things like MAX(c.name), or by adding c.name to the GROUP BY clause (“yes, saved yet another 8 characters”).

But the bulk of newbie MySQL users will be confused by this.

  • First, they will be confused because they don’t get the c.name they’d expect.
  • Secondly, they will eventually switch over to another database that gets these things right, and be frustrated all over again, over the funny syntax errors, such as ORA-00979 not a GROUP BY expression

So please,

  • MySQL users: stop using this non-feature. It will only cause pain and suffering, even if you know how/why it works. SQL’s GROUP BY was not meant to work that way.
  • MySQL: Deprecate this non-feature.