It’s the Little Things: The PL/SQL NULL Statement, and why Every Language Should have One

Syntax is one of those topics. One of those emotional topics that lead to very very very important discussions. I personally like PL/SQL. It is extremely verbose, and precise. It forces you to adhere to a very strong and rigid type system, slowing you down, which is likely to help you avoid mistakes.

There is one thing in PL/SQL that I like in particular. There is no such thing as an empty block.

While in Java, you could write:

// Just an empty block:
{}

// An empty block with a label:
label1: {}

// Or, in fact, the empty statement:
;
label2: ;

The problem with the above from a mere syntactic perspective is that an empty block may have been left unintentionally empty. An empty statement may not even be visible at all. Consider this in the context of an if statement:

if (something) {

}

if (somethingElse) ;

In PL/SQL, this isn’t possible. There is no such thing as an empty block. The following doesn’t compile:

BEGIN
END;

Nope:

ORA-06550: line 2, column 1:
PLS-00103: Encountered the symbol "END" when expecting ...

Or, take the IF statement. No emptiness allowed here, either:

IF 1 = 1 THEN
END IF;

Doesn’t work.

If you DO want to create empty blocks (e.g. as placeholders for code you’ll write later on), you’ll have to explicitly put a dummy statement there. PL/SQL has such a statement. The NULL statement:

BEGIN
  NULL;
END;

IF 1 = 1 THEN
  NULL;
END IF;

That makes sense. You immediately see: OK, nothing going on here.

Conclusion

Verbosity helps decrease the number of bugs in your code. The less people can be concerned with syntax (see again, the discussion about very very very important topics), the more they can focus on what they really intend to write. Let’s swallow our pride. When we get used to a language, we’ll accept ANY language. They’re all flawed and quirky. It doesn’t matter. But at least, the language should keep us from arguing about different ways to style it.

Now, let me go reformat all that moron’s lower-case PL/SQL code. Who the hell would write lower-case begin and end!??

Should I Put That Table Alias or Not?

Infrequent SQL developers often get confused about when to put parentheses and/or aliases on derived tables. There has been this recent Reddit discussion about the subject, where user Elmhurstlol was wondering why they needed to provide an alias to the derived table (the subselect with the UNION) in the following query:

SELECT AVG(price) AS AVG_PRICE
FROM (
  SELECT price from product a JOIN pc b
  ON a.model=b.model AND maker='A'
  UNION ALL
  SELECT price from product a JOIN laptop c
  ON a.model=c.model and maker='A'
) hello

The question really was about why the "hello" table alias was necessary, because often it seems not to be required.

Here’s what the SQL standard states

If in doubt, it is often useful to consider the SQL standard about the rationale behind some syntax elements. In this case, let’s consider the freely available SQL 1992 standard text (for simplicity), and see how it specifies table references:

<table reference> ::=
    <table name> [ [ AS ] <correlation name>
        [ <left paren> <derived column list> 
          <right paren> ] ]
  | <derived table> [ AS ] <correlation name>
        [ <left paren> <derived column list>
          <right paren> ]
  | <joined table>

<derived table> ::= <table subquery>

<table subquery> ::= <subquery>

<subquery> ::= <left paren> <query expression>
               <right paren>

(for more information about the awesome and completely underused derived column list feature, read this article here)

The essence of the above syntax specification is this:

  • A derived table MUST always be aliased
  • The AS keyword is optional, for improved readability
  • The parentheses MUST always be put around subqueries

Following these rules, you’ll be pretty safe in most SQL dialects. Here are some deviations to the above, though:

  • Some dialects allow for unaliased derived tables. However, this is still a bad idea, because you will not be able to unambiguously qualify a column from such a derived table

Takeaway

Always provide a meaningful alias to your derived tables. As simple as that.

SQL Server Trick: Circumvent Missing ORDER BY Clause

SQL Server is known to have a very strict interpretation of the SQL standard. For instance, the following expressions or statements are not possible in SQL Server:

-- Get arbitrarily numbered row_numbers
SELECT ROW_NUMBER() OVER ()

-- Skip arbitrary rows
SELECT a
FROM (VALUES (1), (2), (3), (4)) t(a)
OFFSET 3 ROWS

Strictly speaking, that limitation makes sense because the above ROW_NUMBER() or OFFSET expressions are non-deterministic. Two subsequent executions of the same query might produce different results. But then again, any ORDER BY clause is non-deterministic, if you do not order by a strictly UNIQUE expression, such as a primary key.

So, that’s a bit of a pain, because other databases aren’t that strict and after all, you might just not care about explicit ordering for a quick, ad-hoc query, so a “reasonable”, lenient default would be useful.

Constant ORDER BY clauses don’t work

You cannot add a constant ORDER BY clause to window functions either. I.e.:

-- This doesn't work:
SELECT ROW_NUMBER() OVER (ORDER BY 'a')

-- But this does!
SELECT a
FROM (VALUES (1), (2), (3), (4)) t(a)
ORDER BY 'a'
OFFSET 3 ROWS

Note that ORDER BY 'a' uses a constant VARCHAR expression, not a numeric one, as that would be generating column-reference-by-index expressions, which would be non-constant in the second example.

Random column references don’t work

So you’re thinking that you can just add a random column reference? Sometimes you can, but often you cannot:

-- This doesn't work:
SELECT ROW_NUMBER() OVER (
  ORDER BY [no-column-available-here]
)

-- But this does!
SELECT a
FROM (VALUES (1), (2), (3), (4)) t(a)
ORDER BY a
OFFSET 3 ROWS

The above examples show that you do not always have a column reference available in any given SQL expression. There is no useful column that you could refer to from the ROW_NUMBER() function. At the same time, you can write ORDER BY a in the second example, but only if a is a “comparable” value, i.e. not a LOB, such as text or image.

Besides, as we don’t really care about the actual ordering, is it worth ordering the result set by anything at all? Do you happen to have an index on a?

Quasi-constant ORDER BY expressions do work

So, to stay on the safe side, if ever you need a dummy ORDER BY expression in SQL Server, use a quasi-constant expression, like @@version (or @@language, or any of these). The following will always work:

-- This always works:
SELECT ROW_NUMBER() OVER (ORDER BY @@version)

-- So does this:
SELECT a
FROM (VALUES (1), (2), (3), (4)) t(a)
ORDER BY @@version
OFFSET 3 ROWS

From the upcoming jOOQ 3.4, we’ll also generate such synthetic ORDER BY clauses that will help you simplify writing vendor-agnostic SQL in these edge-cases, as we believe that you simply shouldn’t think of these things all the time.

Feature Request for the JLS: Auto-Rethrow

Java 7 has eased some pain in the area of exception handling when the new try-with-resources and multi-catch syntaxes were introduced. The latter is very interesting from a syntax perspective because it is the only place in Java where formal union types are allowed, similar to what Ceylon offers. Remember, a union type A | B is a type has all the properties that both types share. For example:

try {
    // ...
}
catch (SQLDataException | SQLWarning e) {

    // Properties that both types share:
    System.out.println(e.getSQLState());
}

The above can be quite useful in cases when we simply want to re-throw a variety of similar but disjoint exceptions in the same way. But often, we want do this:

// Some code
try {
}

// Rethrow
catch (OurException e) {
    throw e;
}

// Wrap other types of exceptions in OurException
catch (Exception e) {
    throw new OurException(e);
}

That’s quite a few lines of code for something that we do all the time and that could be called “rethrow-or-wrap” pattern. How about if we could rewrite the above as such:

// Some code
try {
}

// Re-throw exceptions of type OurException
// ... and wrap all other types in OurException
throws OurException;

Without a catch clause, this would be exactly the same as our previous example. Rethrow or wrap. Java is a language that heavily recycles keywords, so a dedicated keyword is out of question. throws would be a good keyword here as it is somewhat similar to the throws clause of a method signature, i.e. the throws clause of the try block.

If we only want to rethrow some exceptions and write tailor-made catch blocks for other types of exceptions, we could still do the same:

// Some code
try {
}

// Re-throw exceptions of type OurException
throws OurException

// But deal with all other types here
catch (Exception e) {
    throw new OurException("Custom message", e);
}

Java should have a “rethrow Exception” statement. tweet this

What are your thoughts on this? Would such a syntax pull its own weight?

Why PostgreSQL is so Awesome

Just recently, I’ve blogged about PostgreSQL 9.3 having been released, which is awesome enough as PostgreSQL finally supports materialised views and updatable views. I have then blogged about PostgreSQL’s syntax being a mystery only exceeded by its power, as it allows for treating INSERT and UPDATE statements as table references, when used with the RETURNING clause. This is quite fancy, even if not very useful in everyday SQL.

But what I’m writing about today is colossal:

PostgreSQL predicates are just ordinary expressions

Let this sink in. To PostgreSQL, predicates are just ordinary expressions evaluating to the boolean type. This is documented in the SELECT syntax reference, here:
http://www.postgresql.org/docs/9.3/static/sql-select.html

Citing:

The optional WHERE clause has the general form

WHERE condition

where condition is any expression that evaluates to a result of type boolean […]

It struck me like lightning when I read this Stack Overflow question. You can put a predicate everywhere! Although I knew this before, I had never thought about how awesome this is! You can use predicates in the SELECT clause:

SELECT a, b, c = d, e IN (SELECT x FROM y)
FROM t

You can use predicates in the GROUP BY clause:

SELECT count(*)
FROM t
GROUP BY c = d, e IN (SELECT x FROM y)

You can use predicates in the ORDER BY clause:

SELECT *
FROM t
ORDER BY c = d, e IN (SELECT x FROM y)

You can aggregate predicates using the EVERY aggregate function.

Don’t believe it? See for yourself in this SQLFiddle!

“Ordinary” SQL

In “ordinary” SQL (i.e. standards-compliant), predicates have to be transformed into ordinary value expressions using the CASE clause. Repeating the above examples:

SELECT clause:

SELECT a, b,
       CASE WHEN c = d THEN true ELSE false END,
       CASE WHEN e IN (SELECT x FROM y)
            THEN true ELSE false END
FROM t

GROUP BY clause:

SELECT count(*)
FROM t
GROUP BY CASE WHEN c = d THEN true ELSE false END,
         CASE WHEN e IN (SELECT x FROM y)
              THEN true ELSE false END

ORDER BY clause:

SELECT *
FROM t
ORDER BY CASE WHEN c = d THEN true ELSE false END,
         CASE WHEN e IN (SELECT x FROM y)
              THEN true ELSE false END

Impact for jOOQ

The answer given in the above Stack Overflow question shows how jOOQ can standardise this behaviour by rendering predicates / conditions directly as column expressions where this is supported, while emulating this behaviour using an equivalent CASE expression, otherwise.

Take predicates in the SELECT clause, for instance:

DSL.using(configuration)
   .select(
       T.A, T.B,
       // Transform a jOOQ Condition into a Field:
       field(T.C.eq(T.D)),
       field(T.E.in(select(Y.X).from(Y)))
   )
   .from(T);

Further thoughts

From jOOQ integration test experience, I can say that Derby, H2, HSQLDB, MariaDB, MySQL, PostgreSQL, and SQLite will work this way. While this is awesome from a syntax perspective, do keep an eye on your execution plans to verify that this isn’t going to produce lots of very expensive nested loops…

Java, if this were a better world

Just a little dreaming about a better world, where some old blunders in the Java platform would’ve been corrected and some awesome missing features would’ve been implemented. Don’t get me wrong. I think Java is awesome. But it still has some issues, like any other platform.

Without any particular order, without claiming to be anything near exhaustive, and most importantly, without claiming to be well thought-through and entirely correct, I wish for these things:

Serialisability

Within an object, serialisability is the default. If you don’t want a member to be serialisable, you mark it “transient”. Why on earth do we have to add this silly marker interface “Serializable” to all of our classes?

All objects should be Serializable by default. Non-serialisability should be the “feature” that is marked explicitly

Of course, serialisability itself has a lot of weird details that I won’t go into, here

Cloning

Since all objects should be Serializable by default,

All objects should also be Cloneable by default. Non-cloneability should be the “feature” that is marked explicitly

Besides, shallow cloning is hardly ever useful. Hence

All objects should deep-clone themselves by default. Shallow cloning can be implemented explicitly

Note, the clone method should be some native method in java.lang.System or some other utility. It shouldn’t be on java.lang.Object, allowing client code to implement their proper interpretation of cloning, without any accidental name clashes.

Alternatively, similar private callback methods could be implemented, the same way that this is done for serialisation, if cloning should be customised.

Unsigned numbers

Why isn’t this part of Java?

There should be an unsigned version of all integer primitives, as well as java.lang.Number wrappers

Primitives

Primitives are a pain to support in APIs. int and Integer should be the same from a syntax perspective. int[] and Integer[] should be, too

Primitives and their wrappers should be better integrated in the language and in the JVM

This one is probably not really resolveable without giving up on the performance advantage that true primitives offer. See Scala…

Properties

Getters and setters aren’t really state-of-the-art.

Properties should be supported more formally

See also a recent article and its comments on this blog:

https://blog.jooq.org/2013/01/12/bloated-javabeans-part-ii-or-dont-add-getters-to-your-api/

Collections

The collection API should be better integrated with the language. Like in many other languages, it should be possible to dereference collection contents using square brackets and curly braces. The JSON syntax would be an obvious choice. It should be possible to write:

// Translates to new ArrayList<>(...);
List<Integer> list = [ 1, 2, 3 ];

// Translates to list.get(0);
Integer value = list[0];

// Translates to list.set(0, 3);
list[0] = 3;

// Translates to list.add(4);
list[] = 4;

// Translates to new LinkedHashMap<>(...);
Map<String, Integer> map = { "A": 1, "B": 2 }; 

// Translates to map.get(0);
Integer value = map["A"]

// Translates to map.put("C", 3);
map["C"] = 3;

ThreadLocal

ThreadLocal can be a nice thing in some contexts. Probably, the concept of ThreadLocal isn’t 100% sound, as it can cause memory leaks. But assuming that there were no problems,

threadlocal should be a keyword, like volatile and transient

If transient deserves to be a keyword, then threadlocal should be, too. This would work as follows:

class Foo {
  threadlocal Integer bar;

  void baz() {
    bar = 1;           // Corresponds to ThreadLocal.set()
    Integer baz = bar; // Corresponds to ThreadLocal.get()
    bar = null;        // Corresponds to ThreadLocal.remove()
  }
}

Of course, such a keyword could be applied to primitives as well

References

References are something weird in Java. They’re implemented as Java objects in the java.lang.ref package, but treated very specially by the JVM and the GC.

Just like for threadlocal, there should be keywords to denote a reference

Of course, with the introduction of generics, there is only little gain in adding such a keyword. But it still feels smelly that some classes are “very special” within the JVM, but not language syntax features.

Reflection

Please! Why on earth does it have to be so verbose?? Why can’t Java (Java-the-language) be much more dynamic? I’m not asking for a Smalltalk-kind of dynamic, but couldn’t reflection be built into the language somehow, as syntactic sugar?

The Java language should allow for a special syntax for reflection

Some pain-easing can be achieved on a library-level, of course. jOOR is one example. There are many others.

Interfaces

Interfaces in Java always feel very weird. Specifically, with Java 8’s extension methods, they start losing their right to exist, as they move closer to abstract classes. Of course, even with Java 8, the main difference is that classes do not allow multiple inheritance. Interfaces do – at least, they allow for multiple inheritance of specification (abstract methods) and behaviour (default methods), not for state.

But they still feel weird, mainly because their syntax diverges from classes, while their features converge. Why did the lambda expert group decide to introduce a default keyword?? If interfaces allow for abstract methods (as today) and concrete methods (defender methods, extension methods), why can’t interfaces have the same syntax as classes? I’ve asked the expert group with no luck:

http://mail.openjdk.java.net/pipermail/lambda-dev/2012-August/005393.html

Still, I’d wish that…

Interface syntax should be exactly the same as class syntax, wherever appropriate

This includes static methods, final methods, private methods, package-private methods, protected methods, etc.

Default visibility

Default visibility shouldn’t be specified by the absence of a private/protected/public keyword. First of all, this absence isn’t dealt with the same way in classes and interfaces. Then, it is not very readable.

Default visibility should be specified by a “package” or “local” or similar keyword

Literals

This would be an awesome addition in everyday work.

There should be list, map, regex, tuple, record, string (improved), range literals

I’ve blogged about this before:

https://blog.jooq.org/2012/06/01/array-list-set-map-tuple-record-literals-in-java/

Some ideas mentioned by Brian Goetz on the lambda-dev mailing list were found here:

http://mail.openjdk.java.net/pipermail/lambda-dev/2012-May/004979.html

#[ 1, 2, 3 ]                          // Array, list, set
#{ "foo" : "bar", "blah" : "wooga" }  // Map literals
#/(\d+)$/                             // Regex
#(a, b)                               // Tuple
#(a: 3, b: 4)                         // Record
#"There are {foo.size()} foos"        // String literal

I’ll add

#(1..10)                              // Range (producing a List)

Final

Methods, attributes, parameters, local variables, they can all be declared as “final”. Immutability is a good thing in many ways, and should be encouraged (I’ll blog about this, soon). Other languages, such as Scala, distinguish the “val” and “var” keywords. In addition to those other languages’ impressive type inference capabilities, in most cases, val is preferred to var. If one wants to express a modifiable variable, they can still use “var”

Final should be the default behaviour for members, parameters and local variables

Override

It is dangerous to accidentally override a method. Other languages have solved this by causing compilation errors on overrides

An override keyword should be introduced to explicitly override a method

Some Java compilers (e.g. the Eclipse compiler) can be configured to emit a warning / error on the absence of the java.lang.Override annotation. However, this really should be a keyword, not an annotation.

Modules

Dependency management is a nightmare in Java. There is one other language, that builds compilation units in terms of modules: Fantom. Stephen Colebourne (the JodaTime guy) is a big fan of Fantom, and has held a speech at Devoxx. He also blogs about Fantom, from time to time:
http://blog.joda.org/search/label/fantom

A compilation unit should be expressed in the form of a “module” / jar file

This would of course make Maven obsolete, as the Java compiler could already handle dependencies much better.

Varargs and generics

Come on. @SafeVarargs?? Of course, this can never be resolved entirely correctly, due to generic type erasure. But still

There should be no generic type erasure

Tuples and Records

I really think that this is something missing from Java

There should be language support for tuples and records

Scala has integrated tuples up to a degree of 22, .NET supports tuples up to a degree of 8. This would be a nice feature in the Java language as well. Specifically, records (or structs) would be a nice thing to have. As mentioned before, there should be literals for tuples and records, too. Something along these lines:

#(a, b)                               // Tuple
#(a: 3, b: 4)                         // Record

Compiler

A compiler API that goes far beyond adding some annotation processing would be nice. I’d love to be able to extend Java-the-language itself. I’d like to embed SQL statements directly into Java code, similar to SQL being embeddable in PL/SQL. Of course, such SQL code would be backed by a library like jOOQ.

The compiler API should allow for arbitrary language extension

Of course, this improved compiler API should be done in a way that auto-completion, syntax highlighting and other features work automatically in IDEs like Eclipse, as the compiler extensions would be able to expose necessary artefacts to IDEs.

OK, I agree, this improvement is a lot of dreaming :-)

Type inference

If unambiguous, couldn’t type inference just be as powerful as Scala’s? I don’t want to write down the complete type of every local variable.

Scala’s local type inference should be supported

Operator overloading

OK, this is a highly religious topic. Many of you won’t agree. But I just like it

Java should support operator overloading

Some library operations are just better expressed using operators, rather than methods. Think of BigInteger and BigDecimal’s horribly verbose API.

Any other ideas? Add comments!

Of course, lambdas and extension methods are missing and generics are erased. While the latter will never be fixed, the first will be in Java 8. So lets forgive Sun and Oracle for making us wait so long for lambdas

Syntax for calling “super” in Java 8 Defender methods

This is a very interesting discussion. How to reference default methods from implemented interfaces throughout the class / interface hierarchy?

Situation:

interface K {
  int m() default { return 88; }
}

interface J extends K {
  int m() default { return K.super.m(); }
                        // ^^^^^^^^^^^^ How to express this?
}

Solution ideas:

  • K.super.m()
  • super.K.m()
  • ((K) super).m()
  • K::m()
  • K.default.m()
  • super<K>.m()
  • super(K).m()
  • super(K.class).m()
  • super[K].m()

Any other crazy ideas? See the discussion here:

http://mail.openjdk.java.net/pipermail/lambda-dev/2012-August/005616.html