Hilarious Rant about SQL Injection


My recent article about SQL injection has stirred some serious emotions on JCG. I don’t want to keep it from you! An extract:

[…] The idea that if I use an ORM, my SQL injection woes will magically go away is f***ing harmful, shortsighted, and anybody who thinks that should be kicked squarely in a sensitive region. […]

And if you’ve survived that kick…

[…] Since there is no SQL statement, things like “‘a’; TRUNCATE your_mom” get stored/selected from as exactly that […]

And if your mom has survived truncation, too:

[…] ORM still won’t save you. Validate your god-d*** f***ing inputs, jack-a**.

So please, sanitise your code, or the angry man will come and get you!! 🙂
http://www.javacodegeeks.com/2012/07/database-abstraction-and-sql-injection.html#comment-603438572

When will we have LINQ in Java?


LINQ is one of Microsoft’s .NET Framework’s most distinct language features. When it was first introduced to languages such as C#, it required heavy changes to the language specification. Yet, this addition was extremely powerful and probably unequalled by other languages / platforms, such as Java, Scala, etc. Granted, Scala has integrated XML in a similar fashion into its language from the beginning, but that is hardly the same accomplishment. Nowadays, Typesafe developers are developing SLICK – Scala Language Integrated Connection Kit, which has similar ambitions, although the effort spent on it is hardly comparable: one “official” Scala developer against a big Microsoft team. Let alone the potential of getting into patent wars with Microsoft, should SLICK ever become popular.

What does Java have to offer?

There are many attempts of bringing LINQ-like API’s to the Java world, as the following Stack Overflow question shows:

http://stackoverflow.com/questions/1217228/what-is-the-java-equivalent-for-linq

Here’s another newcomer project by Julian Hyde, that I’ve recently discovered:

https://github.com/julianhyde/linq4j

He tried his luck on the lambda-dev mailing list, without any response so far. We’re all eagerly awaiting Java 8 and project lambda with its lambda expressions and extension methods. But when will we be able to catch up with Microsoft’s LINQ? After all, jOOQ, linq4j are “internal domain specific languages”, which are all limited by the expressivity of their host language (see my previous blog post about building domain specific languages in Java).

Java 9 maybe? We can only hope!

Database Abstraction and SQL Injection


I have subscribed to various user groups of jOOQ’s competing database abstraction tools. One of which is ActiveJDBC, a Java implementation of Active Record design pattern. Its maintainer Igor Polevoy recently claimed that:

SQL injection is a web application problem, and not directly related to an ORM. ActiveJDBC will process any SQL that is passed to it.

(See the discussion here: https://groups.google.com/d/topic/activejdbc-group/5D2jhWuW4Sg/discussion)

Is that really true? Should the database abstraction layer delegate SQL injection prevention to the client application?

SQL Injection Background

SQL injection is a problem that most of us developers have had to deal with one time or another in their professional lives. Wikipedia explains the problem nicely. Given the following piece of Java code (or any other language):

statement = "SELECT * FROM users WHERE name = '" + userName + "';"

Imagine that “userName” is a variable taken from an HTTP request. Blindly pasting an HTTP request parameter gives way to simple attacks as these:

-- attacker sends this code in the userName field:
userName = "a';DROP TABLE users; SELECT * FROM userinfo WHERE 't' = 't"

-- resulting in the following statement:
statement = "SELECT * FROM users WHERE name = 'a';"
          + "DROP TABLE users;" +
          + "SELECT * FROM userinfo WHERE 't' = 't';"

This doesn’t happen to you?

Maybe not. But the problem is seen quite often on Stack Overflow. More than 2000 results when searching for “SQL injection”: http://stackoverflow.com/search?q=sql+injection. So even if you know how to prevent it, someone on your team may not. OK, but …

it’s not that bad if one out of 500 statements is badly written by some programmer that was oblivious of this threat?

Think again. Have you ever heard of a tool called sqlmap? This tool will find any problematic page within your application within a couple of seconds/minutes/hours, depending on the severity of your injection problem. Not only that, once it has found problematic pages, it will be able to extract ALL kinds of data from your database. I mean ALL kinds of data. A selection of sqlmap features:

  • Support to enumerate users, password hashes, privileges, roles, databases, tables and columns.
  • Support to search for specific database names, specific tables across all databases or specific columns across all databases’ tables. This is useful, for instance, to identify tables containing custom application credentials where relevant columns’ names contain string like name and pass.
  • Support to download and upload any file from the database server underlying file system when the database software is MySQL, PostgreSQL or Microsoft SQL Server.
  • Support to execute arbitrary commands and retrieve their standard output on the database server underlying operating system when the database software is MySQL, PostgreSQL or Microsoft SQL Server.

Yes! If you suffer from SQL-injection-unsafe code, an attacker can seize your server under some circumstances!! In our company, we’ve tried sqlmap in a sandbox environment against some open source blog software with known vulnerabilities. We’ve managed to seize the server in no time, without writing a single line of SQL

Database Abstraction and SQL Injection

OK, now that I have your attention, let’s think again about what Igor Polevoy said:

SQL injection is a web application problem, and not directly related to an ORM. ActiveJDBC will process any SQL that is passed to it.

Yes, he may be right. Given that ActiveJDBC is a slim wrapper for JDBC, allowing to do nice CRUD simplifications, such as these (taken from their website):

List<Employee> people =
Employee.where("department = ? and hire_date > ? ", "IT", hireDate)
        .offset(21)
        .limit(10)
        .orderBy("hire_date asc");

Did you spot the risk for SQL injection? Right. Even if it uses bind values for underlying PreparedStatements, this tool is as unsafe as JDBC. You can avoid SQL injection, if you’re careful. Or you can start concatenating strings all over. But you have to be aware of that! How does jOOQ handle situations like these? The jOOQ manual explains how bind values are handled explicitly or implicitly. Here are some examples:

// Implicitly creating a bind value for "Poe"
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal("Poe"));

// Explicitly creating a (named) bind value for "Poe"
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal(param("lastName", "Poe")));

// Explicitly inlining "Poe" in the generated SQL string
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal(inline("Poe")));

The above examples will yield

SELECT * FROM T_AUTHOR WHERE LAST_NAME = ?
SELECT * FROM T_AUTHOR WHERE LAST_NAME = ?
SELECT * FROM T_AUTHOR WHERE LAST_NAME = 'Poe'

In the case where “Poe” is inlined, escaping is handled by jOOQ, to prevent syntax errors and SQL injection. But jOOQ also supports injecting SQL strings directly in generated SQL. For instance:

// Inject plain SQL into jOOQ
create.select()
      .from(T_AUTHOR)
      .where("LAST_NAME = 'Poe'");

In this case, SQL injection can occur just as with JDBC

Conclusion

In essence, Igor is right. It is the (client) application developer’s responsibility to be aware of SQL injection issues created by their code. But if a database abstraction framework built on top of JDBC can avoid SQL injection as much as possible in its API, all the better. From a SQL injection perspective, database abstraction frameworks can be divided into three categories:

  • Simple utilities. These include Spring’s JdbcTemplate, or Apache’s DbUtils. They really just enhance the JDBC API with convenience (less exception handling, less verbosity, simpler variable binding, simpler data fetching). Of course, these tools won’t prevent SQL injection
  • Full SQL abstraction. These include jOOQ, JaQu, JPA’s CriteriaQuery and others. Their normal operation mode will always render bind values in generated SQL. This prevents SQL injection in most cases.
  • The others. Many other frameworks (including ActiveJDBC and Hibernate) are mainly based on (SQL or HQL) string operations. While they abstract many SQL-related things, they do not prevent SQL injection at all.

So, when you choose any SQL abstraction tool in your Java application, beware of the severity of SQL injection. And beware of the fact, whether your tool helps you prevent it or not!

Igor’s response

Note that Igor has posted this interesting response to this post here:

http://igorpolevoy.blogspot.ch/2012/07/defend-against-sql-injection-using.html

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL


When you’re spoiled with Oracle’s fabulous query transformation capabilities and its really well-done cost-based optimiser, then you might forget how difficult SQL query tuning used to be in the “old days” or with those less sophisticated databases. Here’s a really nice explanation of the various means of implementing an ANTI-JOIN in MySQL:

http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/

Open source user rants


So far, I have escaped jOOQ user rants and insults. Maybe it’s because jOOQ is still quite a niche product. Maybe it’s because jOOQ has almost no bugs 😉 The only real rant I’ve seen so far is this one by a contributor to JDO, JPA, EJB 3.0:

http://erix-data-services.blogspot.ch/2010/10/jooq.html

An extract:

How should we react when reading posts like this in 2010?
It’s like being back 15 years earlier.
How is that possible to write so many irrelevant statements in a single post?

The so-called “pragmatic” approach has given birth to so many useless frameworks like this one.

The good news is there is still a need for a comprehensive data manipulation solution.

Apparently, jOOQ stirs emotions in some people 😉

Anyway, there’s critique, there are rants and then, there are plain insults. I’ve recently stumbled across this hilarious thread here:

https://bugs.php.net/bug.php?id=50696

Quite nice 🙂 What’s your favourite rant by an open source user?

“NoSQL” should be called “SQL with alternative storage models”


Time and again, you’ll find blog posts like this one here telling you the same “truths” about SQL vs. NoSQL:

http://onewebsql.com/blog/no-sql-do-i-really-need-it
(OneWebSQL being a competitor of jOOQ, see a previous article for a comparison)

Usually, those blogs aim for the same arguments being:

  • Performance (“SQL” can “never” scale as much as “NoSQL”)
  • ACID (you don’t always need it)
  • Schemalessness (just store any data)

For some funny reason, all of these ideas have led to the misleading term “NoSQL”, which is interpreted by some as being “no SQL”, by others as being “not only SQL”. But SQL really just means “Structured Query Language”, and it is extremely powerful in terms of expressing relational context. It is well-designed for ad-hoc creation of tuples, records, tables, sets and for mapping them to other projections, reducing them to custom aggregations, etc. Note the terms “map/reduce”, which are often employed by NoSQL evangelists.

For good reasons, the Facebook Query Language (FQL), one of the leading NoSQL query languages, closely resembles SQL although it operates on a completely different data model. Oracle too, has jumped on the “NoSQL” train and sells its own product. It won’t be very long until the two types of data storage will merge and can be queried by an ISO/IEEE standardised SQL:2015 (or so). Because the true spirit of “NoSQL” does not consist in the way data is queried. It consists in the way data is stored. NoSQL is all about data storage. So, sooner or later, you will just create “traditional” tables along with “graph tables” and “hashmap tables” in the same database and join them in single SQL queries without thinking much about today’s hype.

“NoSQL” should be called “SQL with alternative storage models” and queried with pure SQL!

Funky String Function Simulation in SQLite


SQLite is so light, it doesn’t have any useful string functions. It doesn’t have ASCII(), LPAD(), RPAD(), REPEAT(), POSITION(), you name it. It does, however, have a wonderful RANDOMBLOB() function. So if you really need a good random number generator, use a SQLite database and generate a 1GB blob. That should give you a couple of random numbers for the next years.

For a full (or rather, empty) list see the SQLite function reference here:
http://www.sqlite.org/lang_corefunc.html

Function Simulation: REPEAT()

Not having any functions doesn’t mean that you can’t simulate them. You can. Take REPEAT(), for instance. Apart from the RANDOMBLOB(), you can also generate a ZEROBLOB(). It’s a blob with lots of zeros in it. But you can’t just go and do this:

-- Simulate REPEAT('abc', 3)
replace(zeroblob(3), 0, 'abc')

That would be too easy. The problem with the zeroblob is, that when cast to a string, it is actually a zero-terminated string. Quite usual when programming in C. But hey, the first character is a zero, so the resulting string is terminated right at the beginning. How useful is that??

But here’s a trick, QUOTE() the ZEROBLOB(). That would escape all characters in hex format. In other words:

quote(zeroblob(3)) yields X'000000'

Nice. Now we’ve got three extra letters around twice as many zeroes as we wanted. So we’ll simply do this

-- Simulate REPEAT('abc', 3)
replace(substr(quote(zeroblob(2)), 3, 3), '0', 'abc')

-- Or more generally: X = 'abc', Y = 3
replace(substr(quote(zeroblob((Y + 1) / 2)), 3, Y), '0', X)

Doesn’t that start to make fun? Note, I have documented this simulation also here:
http://stackoverflow.com/questions/11568496/how-to-simulate-repeat-in-sqlite

Function Simulation: LPAD() and RPAD()

REPEAT() was easy. But REPEAT() was inspired by LPAD() and RPAD(), which is similar to REPEAT(), except that a character is padded to the left or right of another string, until a given length of the resulting string is reached. ZEROBLOB() will help us again! Let’s consider RPAD():

-- Simulate RPAD('abc', 7, '-')
'abc' || replace(replace(substr(quote(zeroblob(4)), 3, 4), '''', ''), '0', '-')

-- Or more generally:
-- RPAD() Using X = 7, Y = '-', Z = 'abc'
Z || replace(
       replace(
         substr(
           quote(zeroblob((X + 1) / 2)), 
           3, (X - length(Z))
         ), '''', ''
       ), '0', Y
     )

-- LPAD() Using X = 7, Y = '-', Z = 'abc'
replace(
  replace(
    substr(
      quote(zeroblob((X + 1) / 2)), 
      3, (X - length(Z))
    ), '''', ''
  ), '0', Y
) || Z

Now if this isn’t funky! This was actually something, I didn’t come up with myself. This was an answer I was given on Stack Overflow, where great minds spend lots of spare time on weird problems like this:
http://stackoverflow.com/questions/6576343/how-to-simulate-lpad-rpad-with-sqlite

Of course, these simulations will be part of the next version of jOOQ, so you don’t have to worry any longer about how to do LPAD(), RPAD(), and REPEAT().