Use MySQL’s Strict Mode on all new Projects!

Posted on November 20, 2014November 1, 2014 by lukaseder

MySQL is a database that has been bending the SQL standard in ways that make it hard to move off MySQL. What may appear to be a clever technique for vendor lockin (or maybe just oversight of the standard) can be quite annoying in understanding the real meaning of the SQL language. One such example is MySQL’s interpretation of how GROUP BY works. In MySQL, unlike any other database, you can put arbitrary expressions into your SELECT clause, even if they do not have a formal dependency on the GROUP BY expression. For instance:


SELECT employer, first_name, last_name
FROM employees
GROUP BY employer

This will work in MySQL, but what does it mean? If we only have one resulting record per employer, which one of the employees will be returned? The semantics of the above query is really this one:


SELECT employer, ARBITRARY(first_name), ARBITRARY(last_name)
FROM employees
GROUP BY employer

If we assume that there is such an aggregation function as ARBITRARY(). Some may claim that this can be used for some clever performance “optimisation”. I say: Don’t. This is so weakly specified, it is not even clear if the two references of this pseudo-ARBITRARY() aggregate function will produce values from the same record. Just look at the number of Stack Overflow questions that evolve around the “not a GROUP BY expression” error:

I’m sure that parts of this damage that has been caused to a generation of SQL developers is due to the fact that this works in some databases.

ONLY_FULL_GROUP_BY

But there is a flag in MySQL called ONLY_FULL_GROUP_BY, and Morgan Tocker, the MySQL community manager suggests eventually turning it on by default. MySQL community members tend to agree that this is a good decision in the long run. While it is certainly very hard to turn this flag on for a legacy application, all new applications built on top of MySQL should make sure to turn on this flag. In fact, new applications should even consider turning on “strict SQL mode” entirely, to make sure they get a better, more modern SQL experience. For more information about MySQL server modes, please consider the manual: https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html

Published by lukaseder

I made jOOQ View all posts by lukaseder

9 thoughts on “Use MySQL’s Strict Mode on all new Projects!”

Maaartinus says:

November 20, 2014 at 23:55

Sure, MySql without the strict mode is a mess. However, this ARBITRARY would be a useful function:

1. Sometimes you really don’t care which first_name you get. Maybe you’re just inspecting the data, rather than writing production code.

2. It saves you from repeating all the columns if first_name and last_name are functions of employer (i.e., employer is a unique key or the table is denormalized).

3. Sometimes, even if none the above holds, you may be satisfied with an arbitrary result. Once, I had a table, where there were various (mis)spellings of “first_name” for one employer. I could not fix it as the table had to exactly represent imported data and in the result I needed a unique “employer”. So I resorted to MIN, whereby ARBITRARY would do.

If ARBITRARY guaranteed to return corresponding first_name and last_name, it would provide a new functionality.

Note that there are other cases when the SELECT is non-deterministic (e.g., incomplete ORDER BY).

Loading...

Reply
1. lukaseder says:
  
  November 21, 2014 at 07:43
  Yes, I agree. The SQL standard refers to the SOME or ANY aggregate functions, named consistently with the quantified comparison predicate keywords:
```
<aggregate function> ::=
    COUNT <left paren> <asterisk> <right paren> [ <filter clause> ]
  | <general set function> [ <filter clause> ]
  | <binary set function> [ <filter clause> ]
  | <ordered set function> [ <filter clause> ]
  | <array aggregate function> [ <filter clause> ]

<general set function> ::=
    <set function type> <left paren> [ <set quantifier> ]
        <value expression> <right paren>

<set function type> ::=
    <computational operation>

<computational operation> ::=
    AVG
  | MAX
  | MIN
  | SUM
  | EVERY
  | ANY
  | SOME
  | COUNT
  | STDDEV_POP
  | STDDEV_SAMP
  | VAR_SAMP
  | VAR_POP
  | COLLECT
  | FUSION
  | INTERSECTION
```
  SOME and ANY appeared in SQL:1999. I haven’t seen this in the wild yet, though…
  
  Loading...
  Reply
2. morgo (@morgo) says:
  
  November 24, 2014 at 18:51
  
  When I’m just casually browsing data, I often use GROUP_CONCAT() to full see the set of values that match.
  
  As Lukas mentions, there is an any value function:
  http://dev.mysql.com/doc/refman/5.7/en/miscellaneous-functions.html#function_any-value
  
  ANY_VALUE() is also helpful to retrofit to queries where the illegal column grouping is not known to cause application problems, and there is a desire to change the sql mode to be strict.
  
  Loading...
  
  Reply
  1. lukaseder says:
    
    November 24, 2014 at 21:37
    
    Interesting! I wasn’t aware of that function – thanks for sharing. Too bad it’s not simply called ANY, it would make MySQL the only database I’m aware of that implements this SQL standard aggregate function (see my comment) :)
    
    Loading...
    
    Reply
    1. Maaartinus says:
      
      November 25, 2014 at 05:07
      
      > Too bad it’s not simply called ANY
      
      Too bad, you can’t simply write
      
      val ANY = ANY_VALUE;
      
      to get what you want. ;)
      
      Loading...
      
      Reply
      1. lukaseder says:
        
        November 25, 2014 at 06:53
        
        Are you trolling, now? ;-)
        
        Loading...
        
        Reply
      2. Maaartinus says:
        
        November 25, 2014 at 08:40
        
        Actually, no. I really wonder, if there could be an language offering what SQL does without its rigidity. Something based on “standard” programming constructs with all its flexibility rather than pseudo-English sentences and tons of keywords.
        
        Loading...
        
        Reply
        
        lukaseder says:
        
        November 25, 2014 at 09:12
        
        It would probably be incredibly hard to implement declarative programming (i.e. cost-based optimisers and execution planners), once you’d make SQL a “standard” programming language… Or to put it in Winston Churchill’s words:
        
        SQL is the worst form of database querying. Except for all the other forms
        
        Loading...
        
        Reply
  2. lukaseder says:
    
    November 25, 2014 at 06:57
    
    ANY_WAY()… (lame pun), I think we should implement support for that in jOOQ via https://github.com/jOOQ/jOOQ/issues/3798.
    
    For MySQL, this would be natively supported. In other databases, we can still resort to MAX()
    
    Loading...
    
    Reply

Use MySQL’s Strict Mode on all new Projects!

ONLY_FULL_GROUP_BY

Like this:

Published by lukaseder

9 thoughts on “Use MySQL’s Strict Mode on all new Projects!”

Leave a ReplyCancel reply

ONLY_FULL_GROUP_BY

Like this:

Published by lukaseder

9 thoughts on “Use MySQL’s Strict Mode on all new Projects!”

Leave a ReplyCancel reply

Discover more from Java, SQL and jOOQ.