Prevent SQL Injection with SQL Builders Like jOOQ

As long as we allow ourselves to write string-based dynamic SQL embedded in other programming languages like Java, we will have a certain risk of being vulnerable to SQL injection. That’s a fact. Don’t believe it? Check out this website exposing all vulnerabilities on Stack Overflow for PHP questions:

https://laurent22.github.io/so-injections

In a previous blog post, I’ve shown how fatal such a single vulnerability can be, if discovered. A lot of blog posts out there warn about the potential of attackers injecting a DROP DATABASE statement.

18mpenleoksq8jpg

Also, everyone knows this famous xkcd:
https://xkcd.com/327

But in my opinion, a much more important threat is not immediate damage to your system, but data leakage. Using tools like sqlmap, every script kiddie can download your credit card information and other sensitive data from your database.

The best remedy: Avoid string-based embedded SQL

The best remedy is to always avoid string-based embedded SQL, whenever you can. I.e. try not to do things like this too often:

try (PreparedStatement s = c.prepareStatement(
    "SELECT first_name, last_name "
  + "FROM users "
  + "WHERE user_id = ? ")) {

    s.setInt(1, userId);
    try (ResultSet rs = s.executeQuery()) {
        ...
    }
}

Sure, there is currently nothing wrong with the above Java code. There is no vulnerability as we’re using a bind variable. But the risk of some developer not paying attention and accidentally adding a vulnerability when adding another predicate is too high.

Fine, so we’re not using string-based embedded SQL. But what are the alternatives?

Use a query DSL like jOOQ to build your dynamic SQL

APIs like jOOQ help you build SQL statements in a type safe, composable way as if the Java language actually understood SQL. The previous JDBC prepared statement now translates to the following:

Result<?> result =
ctx.select(USERS.FIRST_NAME, USERS.LAST_NAME)
   .from(USERS)
   // Bind variable embedded in statement
   .where(USERS.USER_ID.eq(userId))
   .fetch();

In jOOQ, the underlying JDBC PreparedStatement is created transparently and the userId bind variable is placed right in the middle of the statement so you don’t have to worry about the boring details. There’s no way you can have any accidental SQL injection vulnerability in such jOOQ API calls, because every SQL clause and expression is part of an expression tree that is managed by jOOQ.

And what’s best: The Java compiler can now type check your SQL statement to a certain degree. This is a huge benefit in terms of productivity and code quality.

Of course, SQL builders aren’t a perfect shield for SQL injection, as they usually expose some API to insert custom SQL strings. For instance, in jOOQ, you can write:

Result<?> result =
ctx.select(USERS.FIRST_NAME, USERS.LAST_NAME)
   .from(USERS)
   // Bind variable embedded in "plain SQL" string
   .where("user_id = ?", userId)
   .fetch();

This alternative API usage is just as convenient as the previous one. The bind variable is located right where it is needed (not applied later on a PreparedStatement), but of course this is again string-based embedded SQL, which is potentially vulnerable if you get it wrong.

The important thing here is that the string-based method is the exception, which is used only very rarely when you need a very advanced SQL feature that is not supported by jOOQ. By default, you’re using the “save” approach without strings. And since jOOQ 3.9, you can also add a type checker that generates a compilation warning or error as soon as you’re using jOOQ’s “plain SQL” API. Read more about that here:
https://blog.jooq.org/2016/05/09/jsr-308-and-the-checker-framework-add-even-more-typesafety-to-jooq-3-9

Use views and stored procedures

The safest and least intrusive way to prevent such problems, of course, is not to use embedded SQL at all, but to use views and/or stored procedures instead. By moving and storing the SQL statements into the database, you get a few advantages in addition to the lack of SQLi vulnerability:

  • The database can type-check your statement in its entirety
  • You can easily reuse a statement in different applications, e.g. not all written in Java
  • You could even revoke grants to the underlying tables and grant access only to these views and stored procedures, which adds another layer of security to your system

If you’re writing PL/SQL in Oracle, the previous statement would be transformed to something like this:

FOR rec IN (
  SELECT first_name, last_name
  FROM users
  -- Bind variable again embedded in SQL statement
  WHERE user_id = l_user_id
) LOOP
  ...
END LOOP;

Just like with jOOQ, you can easily embed the SQL statement in the “host language”, except that this time, it isn’t Java but PL/SQL. There’s absolutely no way the above statement will ever be vulnerable to SQL injection, as the statement isn’t even a dynamic SQL statement anymore.

Again, you can write dynamic SQL also in PL/SQL (or other database’s stored procedure languages), e.g. by using EXECUTE IMMEDIATE, but the important thing is again that dynamic, string-based, embedded SQL is the exception not the default. That’s the important thing here!

Now, if you do decide to use stored procedures, then jOOQ is again here to help you, as you can easily call that procedure from jOOQ, again in a type safe way. More about that in this blog post:
Painless Access from Java to PL/SQL Procedures with jOOQ

Beware! SQL isn’t the only language vulnerable to injection

Vlad Mihalcea displayed an equally important threat to JPA based applications. Scroll down in his blog post to find him mentioning JPQL injection:
A beginner’s guide to SQL injection and how you should prevent it

Yes, if you’re doing something silly as:

public List<Post> getPostsByTitle(String title) {
    return doInJPA(entityManager -> {
        return entityManager.createQuery(
            "select p " +
            "from Post p " +
            "where" +
            "   p.title = '" + title + "'", Post.class)
        .getResultList();
    });
}

… then an attacker can inject any sort of JPQL code into the title variable. The possibilities are a bit more limited than with SQL injection, but it is still perfectly possible to read and dump the entire database (including credit card information, remember?) from such a vulnerability.

Again, if you’re doing this quite often, you should consider switching to a more type safe way to write JPQL (e.g. using the the infamous Criteria API), or again switch to SQL and jOOQ.

Read Vlad’s post for more details:
A beginner’s guide to SQL injection and how you should prevent it

Conclusion

There are many external DSLs, like SQL, JPQL, XPath, regular expressions, and what not. Some of them are extremely powerful and they’re used to operate on sensitive data. Which means that if you leak control of the language outside of your application, you’re very vulnerable.

Vulnerabiliy mostly happens when you embed those external DSLs into Java in a string-based form. The best remedies are:

  • Use an internal DSL that models the external DSL instead
  • Keep the external DSL external, e.g. by using views and stored procedures

Both of these approaches work because they both avoid strings.

JSR-308 and the Checker Framework Add Even More Typesafety to jOOQ 3.9

Java 8 introduced JSR-308, which added new annotation capabilities to the Java language. Most importantly: Type annotations. It is now possible to design monsters like the below:

The code displayed in that tweet really compiles. Every type can be annotated now, in order to enhance the type system in any custom way. Why, you may ask? One of the main driving use-cases for this language enhancement is the Checker Framework, an Open Source library that allows you to easily implement arbitrary compiler plugins for sophisticated type checking. The most boring and trivial example would be nullability. Consider the following code:

import org.checkerframework.checker.nullness.qual.Nullable;

class YourClassNameHere {
    void foo(Object nn, @Nullable Object nbl) {
        nn.toString(); // OK
        nbl.toString(); // Fail
        if (nbl != null)
            nbl.toString(); // OK again
    }
}

The above example can be run directly in the Checker Framework live demo console. Compiling the above code with the following annotation processor:

javac -processor org.checkerframework.checker.nullness.NullnessChecker afile.java

Yields:

Error: [dereference.of.nullable] dereference of possibly-null reference nbl:5:9

That’s pretty awesome! It works in quite a similar way as the flow sensitive typing that is implemented in Ceylon or Kotlin, for instance, except that it is much more verbose. But it is also much much more powerful, because the rules that implement your enhanced and annotated Java type system can be implemented directly in Java using annotation processors! Which makes annotations turing-complete, in a way 😉

award

How does this help jOOQ?

jOOQ has shipped with two types of API documentation annotations for quite a while. Those annotations are:

  • @PlainSQL – To indicate that a DSL method accepts a “plain SQL” string which may introduce SQL injection risks
  • @Support – To indicate that a DSL method works either natively with, or can be emulated for a given set of SQLDialect

An example of such a method is the CONNECT BY clause, which is supported by Cubrid, Informix, and Oracle, and it is overloaded to accept also a “plain SQL” predicate, for convenience:

@Support({ CUBRID, INFORMIX, ORACLE })
@PlainSQL
SelectConnectByConditionStep<R> connectBy(String sql);

Thus far, these annotations were there only for documentation purposes. With jOOQ 3.9, not anymore. We’re now introducing two new annotations to the jOOQ API:

  • org.jooq.Allow – to allow for a set of dialects (or for the @PlainSQL annotation) to be used within a given scope
  • org.jooq.Require – to require for a set of dialects to be supported via the @Support annotation within a given scope

This is best explained by example. Let’s look at @PlainSQL first

Restricting access to @PlainSQL

One of the biggest advantages of using the jOOQ API is that SQL injection is pretty much a thing of the past. With jOOQ being an internal domain-specific language, users really define the SQL expression tree directly in their Java code, rather than a stringified version of the statement as with JDBC. The expression tree being compiled in Java, there’s no possibility of injecting any unwanted or unforeseen expressions via user input.

There is one exception though. jOOQ doesn’t support every SQL feature in every database. This is why jOOQ ships with a rich “plain SQL” API where custom SQL strings can be embedded anywhere in the SQL expression tree. For instance, the above CONNECT BY clause:

DSL.using(configuration)
   .select(level())
   .connectBy("level < ?", bindValue)
   .fetch();

The above jOOQ query translates to the following SQL query:

SELECT level
FROM dual
CONNECT BY level < ?

As you can see, it is perfectly possible to “do it wrong” and create a SQL injection risk, just like in JDBC:

DSL.using(configuration)
   .select(level())
   .connectBy("level < " + bindValue)
   .fetch();

The difference is very subtle. With jOOQ 3.9 and the Checker Framework, it is now possible to specify the following Maven compiler configuration:

<plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.3</version>
    <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <fork>true</fork>
        <annotationProcessors>
            <annotationProcessor>org.jooq.checker.PlainSQLChecker</annotationProcessor>
        </annotationProcessors>
        <compilerArgs>
            <arg>-Xbootclasspath/p:1.8</arg>
        </compilerArgs>
    </configuration>
</plugin>

The org.jooq.checker.PlainSQLChecker will ensure that no client code using API annotated with @PlainSQL will compile. The error message we’re getting is something like:

C:\Users\lukas\workspace\jOOQ\jOOQ-examples\jOOQ-checker-framework-example\src\main\java\org\jooq\example\checker\PlainSQLCheckerTests.java:[17,17] error: [Plain SQL usage not allowed at current scope. Use @Allow.PlainSQL.]

If you know-what-you’re-doing™ and you absolutely must use jOOQ’s @PlainSQL API at a very specific location (scope), you can annotate that location (scope) with @Allow.PlainSQL and the code compiles just fine again:

// Scope: Single method.
@Allow.PlainSQL
public List<Integer> iKnowWhatImDoing() {
    return DSL.using(configuration)
              .select(level())
              .connectBy("level < ?", bindValue)
              .fetch(0, int.class);
}

Or even:

// Scope: Entire class.
@Allow.PlainSQL
public class IKnowWhatImDoing {
    public List<Integer> iKnowWhatImDoing() {
        return DSL.using(configuration)
                  .select(level())
                  .connectBy("level < ?", bindValue)
                  .fetch(0, int.class);
    }
}

Or even (but then you might just turn off the checker):

// Scope: entire package (put in package-info.java)
@Allow.PlainSQL
package org.jooq.example.checker;

The benefits are clear, though. If security is very important to you (and it should be), then just enable the org.jooq.checker.PlainSQLChecker on each developer build, or at least in CI builds, and get compilation errors whenever “accidental” @PlainSQL API usage is encountered.

Restricting access to SQLDialect

Now, much more interesting for most users is the ability to check whether jOOQ API that is used in client code really supports your database. For instance, the above CONNECT BY clause is supported only in Oracle (if we ignore the not so popular Cubrid and Informix databases). Let’s assume you do work with Oracle only. You want to make sure that all jOOQ API that you’re using is Oracle-compatible. You can now put the following annotation to all packages that use the jOOQ API:

// Scope: entire package (put in package-info.java)
@Allow(ORACLE)
package org.jooq.example.checker;

Now, simply activate the org.jooq.checker.SQLDialectChecker to type check your code for @Allow compliance and you’re done:

<plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.3</version>
    <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <fork>true</fork>
        <annotationProcessors>
            <annotationProcessor>org.jooq.checker.SQLDialectChecker</annotationProcessor>
        </annotationProcessors>
        <compilerArgs>
            <arg>-Xbootclasspath/p:1.8</arg>
        </compilerArgs>
    </configuration>
</plugin>

From now on, whenever you use any jOOQ API, the above checker will verify that any of the following three yields true:

  • The jOOQ API being used is not annotated with @Support
  • The jOOQ API being used is annotated with @Support, but without any explicit SQLDialect (i.e. “works on all databases”), such as DSLContext.select()
  • The jOOQ API being used is annotated with @Support and with at least one of the SQLDialects referenced from @Allow

Thus, within a package annotated as such…

// Scope: entire package (put in package-info.java)
@Allow(ORACLE)
package org.jooq.example.checker;

… using a method annotated as such is fine:

@Support({ CUBRID, INFORMIX, ORACLE })
@PlainSQL
SelectConnectByConditionStep<R> connectBy(String sql);

… but using a method annotated as such is not:

@Support({ MARIADB, MYSQL, POSTGRES })
SelectOptionStep<R> forShare();

In order to allow for this method to be used, client code could, for instance, allow the MYSQL dialect in addition to the ORACLE dialect:

// Scope: entire package (put in package-info.java)
@Allow({ MYSQL, ORACLE })
package org.jooq.example.checker;

From now on, all code in this package may refer to methods supporting either MySQL and/or Oracle.

The @Allow annotation helps giving access to API on a global level. Multiple @Allow annotations (of potentially different scope) create a disjunction of allowed dialects as illustrated here:

// Scope: class
@Allow(MYSQL)
class MySQLAllowed {

    @Allow(ORACLE)
	void mySQLAndOracleAllowed() {
	    DSL.using(configuration)
		   .select()
		   
		   // Works, because Oracle is allowed
		   .connectBy("...")
		   
		   // Works, because MySQL is allowed
		   .forShare();
	}
}

As can be seen above, allowing for two dialects disjunctively won’t ensure that a given statement will work on either of the databases. So…

What if I want both databases to be supported?

In this case, we’ll resort to using the new @Require annotation. Multiple @Require annotations (of potentially different scope) create a conjunction of required dialects as illustrated here:

// Scope: class
@Allow
@Require({ MYSQL, ORACLE })
class MySQLAndOracleRequired {

    @Require(ORACLE)
	void onlyOracleRequired() {
	    DSL.using(configuration)
		   .select()
		   
		   // Works, because only Oracle is required
		   .connectBy("...")
		   
		   // Doesn't work because Oracle is required
		   .forShare();
	}
}

How to put this in use

Let’s assume your application only requires to work with Oracle. You can now put the following annotation on your package, and you will be prevented from using any MySQL-only API, for instance, because MySQL is not allowed as a dialect in your code:

@Allow(ORACLE)
package org.jooq.example.checker;

Now, as requirements change, you want to start supporting MySQL as well from your application. Just change the package specification to the following and start fixing all compilation errors in your jOOQ usage.

// Both dialects are allowed, no others are
@Allow({ MYSQL, ORACLE })

// Both dialects are also required on each clause
@Require({ MYSQL, ORACLE })
package org.jooq.example.checker;

Defaults

By default, for any scope, the following annotations are assumed by the org.jooq.checker.SQLDialectChecker:

  • Nothing is allowed. Each @Allow annotation adds to the set of allowed dialects.
  • Everything is required. Each @Require annotation removes from the set of required dialects.

See it in action

These features will be an integral part of jOOQ 3.9. They’re available simply by adding the following dependency:

<dependency>
    <!-- Use org.jooq            for the Open Source edition
             org.jooq.pro        for commercial editions, 
             org.jooq.pro-java-6 for commercial editions with Java 6 support,
             org.jooq.trial      for the free trial edition -->
	
    <groupId>org.jooq</groupId>
    <artifactId>jooq-checker</artifactId>
    <version>${org.jooq.version}</version>
</dependency>

… and then choosing the appropriate annotation processors to your compiler plugin.

Cannot wait until jOOQ 3.9? You don’t have to. Just check out the 3.9.0-SNAPSHOT version from GitHub and follow the example project given here:

https://github.com/jOOQ/jOOQ/tree/master/jOOQ-examples/jOOQ-checker-framework-example

Done! From now on, when using jOOQ, you can be sure that whatever code you write will work on all the databases that you plan to support!

I think that this year’s Annotatiomaniac Champion title should go to the makers of the Checker Framework:

award

Further reading about the Checker Framework:

“NoSQL Injection” – What 40000 Unsecured MongoDB Databases Mean for our Industry

The news is all over reddit

Major security alert as 40,000 MongoDB databases left unsecured on the internet

Security is a feature that is often neglected until it’s too late. And when it’s too late, it is often hard to bake it into a well-established architecture without major refactoring efforts.

Every system and thus also every database is always vulnerable. Most databases, however, do offer a significant amount of features to implement a security layer – and MongoDB is no different from any other DBMS here. So, how could this massive security hole happen?

Security is a cultural thing. Either, a company has security in their DNA, or it doesn’t. The same is true for scalability, or user experience, or any other aspect of software engineering. I’ve worked for companies that are at completely opposite ends of security awareness. Some (in the E-Banking field) were ultra-paranoiac, implementing thorough security checks in around 7 layers of the application. Others were rather lenient with management focusing much more on marketing than anything else. Without any empirical evidence, however, there was a certain correlation between security-awareness in a company and the backend-orientedness of the same company, E-Banking being a very backend-oriented business.

Backend developers are more security aware

This is an over-generalisation and probably doesn’t do justice to many excellent frontend developers out there, but security is where the data is. Where the algorithms are. Where people reason about constraints, workflows, batch jobs, accounting, money, … algorithms. These folks focus on all the users. On the system. And they want to protect it. On the flip side, they might neglect usability.

There is only little security-awareness where the user experience is. Where people reason about layout, formatting, usability, style, … user interfaces. These folks focus on single users. On their experience. And they want to make things easy for the user.

(and again, the same is true for scalability)

It is no coincidence that backend technology evolves extremely slowly. Java: 20 years and we’ve just finally gotten lambdas. SQL: 30 years and we still don’t have easy ways to reuse code.

At the same time, frontend technology evolves at the “speed of reddit”. The next hype is just 100 karma away, and we’ll throw all the previous tech out of the window, just to be part of the game.

Clearly, security is something that has to be reasoned about way too thoroughly for it to survive in the fast-paced frontend world.

What does MongoDB have to do with it?

The current event isn’t actually directly related to MongoDB (you could probably find just as many unprotected MySQL instances out there). But it strongly correlates with MongoDB’s sales and marketing strategies. MongoDB has done very aggressive and successful marketing in the past, claiming that the reign of the RDBMS is over – just as much as the reign of the RDBMS had been over before, when the astonishing object databases surfaced this planet. Well, we all know where object or XML databases went:

d8938bef47ea2f62ed0543dd9e35a483

This time, the anti-RDBMS marketing resonated mostly with frontend developers, obviously, because JSON is their favourite data representation format, and MongoDB promised to be able to store data directly from the DOM into the DB. Not only did this mean “the end of the DBA” for some software vendors, but many vendors also hoped that they could omit operations, and perhaps even backend development. What obviously worked well for prototyping and simple applications doesn’t scale well to applications with sensitive data.

The Solution

The solution is obvious. Homogeneity kills your business. You should hire a variety of different types of personnel. You should have skilled frontend developers, backend developers, operations people, DBA, and security experts on your team. You should make them work all together, hear each of their opinions, review each others’ code, learn from each other. Because each one of them has a strong focus and interest on an entirely different, yet equally important aspect of your application.

Do not neglect any of these aspects. Because if you do, and if it’s security, and if you lose sensitive customer data – well, you’re not going to stay in business, you’ll be sued in court.

Got hooked on the security topic?

Continue reading about …

Using SQL Injection Vulnerabilities to Dump Your Database

The threat caused by SQL injection is heavily underestimated even by many senior developers and software architects. Most people are unaware of the fact that an entire server can be at risk by a single vulnerability even in the remotest piece of logic. This article will give a frightening insight into the potential severity of SQL injection vulnerabilities.

What is SQL injection?

The Wikipedia article on SQL injection reads:

SQL injection is a code injection technique, used to attack data driven applications, in which malicious SQL statements are inserted into an entry field for execution.

In other words, if a website or some other software entity has a vulnerability, it is possible for an attacker to “inject” arbitrary pieces of SQL code for execution on the server. A straight-forward and popular example of such a vulnerability is when a dynamic SQL statement directly embeds user input such as the following Java code does:

01-sqli

The title variable could be user input, e.g. an HTML field. Nothing prevents this user input from being a SQL code snippet that would make perfect sense syntactically. For example:

02-sqli

It is easy to see that when substituted into the previous SQL statement, this “title” will result in always selecting all films. Another, famous example of how this can go wrong is xkcd’s famous Little Bobby Tables strip.

Automating the search for vulnerabilities

The above introduction shows that SQL injection is possible and quite simple to exploit, manually, if the server side source code is well known. Finding such a vulnerability in a huge application with thousands of SQL statements, however, is a lot of work. Besides, there are much more subtle vulnerabilities than the obvious pattern matching search. sqlmap is an Open Source, GPLv2 licensed tool for automating such searches.

Let’s assume we have a simple RESTful application written in Java using Jetty and JAX-RS querying the MySQL Sakila example database, which exposes the above vulnerability at this URL:

http://localhost:8080/sql-injection-examples/vulnerable-services/films/alien

alien is a @PathParam value passed to this method, which returns all films titled %alien%:

03-sqli

Now, let’s download sqlmap and let it run against the above URL:

python sqlmap.py 
  -u localhost:8080/sql-injection-examples/vulnerable-services/films/

Note that sqlmap is implemented in Python 2.x. When we let this run, my server-side SQL logs show me that there are a couple of interesting SQL queries being executed:

04-sqli

sqlmap tries to inject all sorts of snippets that would help it discover if the vulnerable query is deterministic, whether the URL is stable, what database server type this is, if the vulnerability is inside a subquery, whether UNION clauses can be appended, etc. This is also nicely displayed in the sqlmap stdout log output:

05-sqli

With a total of 59 HTTP requests (among which 41 resulted in HTTP 500 errors), sqlmap was capable of detecting the nature of the vulnerability of my SQL statement, and it also figured out the database server and version.

Dumping data

This is not so impressive yet. With a lot of SQL knowledge and creativity, I might have figured this out myself. But sqlmap offers a lot of other interesting operation modes as can be seen in the user guide. Let us use the database navigation tools. For this, we’ll add the –dbs parameter on the same URL:

python sqlmap.py 
  -u localhost:8080/sql-injection-examples/vulnerable-services/films/ 
  --dbs

This will immediately dump the following databases:

06-sqli

Note that sqlmap logs everything it learns in a local SQLite database, such that it can keep the number of HTTP requests as small as possible. This is important for an attacker, as frequent HTTP 404 or 500 statuses will eventually trigger attention by maintenance personnel at the server side.

But how did it find the actual database names through my vulnerable SQL statement? In three steps:

  1. It sent a first statement to check if the vulnerability still exists
  2. It sent a second statement to see how many databases there are (8)
  3. It sent a third statement to learn the name of each database

Let’s look at step 2. The query executed is this one:

07-sqli

Because sqlmap previously discovered that my naïve server implementation just dumps stack traces of HTTP 500 errors, it knew that it could generate the following MySQL error message and convey the information of interest hidden inside that message:

com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: 
  Duplicate entry 'qnfwq8qdmiq1' for key 'group_key'

The SQL concat function wraps the number 8 with two uniquely identifiable strings. With a similar query and the MySQL LIMIT clause, database names can then be extracted one by one, e.g. the Sakila database:

com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: 
  Duplicate entry 'qnfwqsakilaqdmiq1' for key 'group_key'

Again, the information of interest is wrapped between delimiters.

Dumping data without stack traces

Obviously, dumping stack traces in a productive system is really a bad idea, as you should never give an intruder any hints about how your system works. Let’s try to change our application such that it will display a simple 500 Internal Server Error message:

08-sqli

No problem for sqlmap. When I rerun my previous database dumping command, sqlmap generates all database names letter by letter. Now, that it has no way of producing error message output in HTTP responses, it can only do a binary search on each letter of a schema and see if any given search still produces the regular list of Sakila films or an empty list:

09-sqli

The above query checks if the first schema (LIMIT 0, 1) has a letter higher than ASCII code 120 (ORD) at position 13 (MID). This is a lot slower and more visible in server logs than before, but it can still produce the result.

Dumping data without displaying UI output

What if the vulnerability is deep down in our application, never producing UI output? Let’s change our service again, executing the vulnerable query but without returning any data:

10-sqli

My log file shows me that there are many more SQL statements executed by sqlmap and even the confirmation if there is a vulnerability at the given URL takes longer, as sqlmap now resorts to binary searches using delays. Here’s how sqlmap confirms that there is a vulnerability:

11-sqli

As can be seen, the above queries run for N or N + 5 seconds, where N is the amount of time the actual query takes. If N is sufficiently stable, delay-based binary searches can be performed to discover database names letter by letter.

Dumping sensitive data

The previous sections have shown that a vulnerability is almost always exploitable by automatic means. Intruders have time. They can run their script over a week to have sqlmap dump your complete database schema. But the threat doesn’t end there. Once table and column names are accessible, all of those tables can be dumped, too. Let’s dump the film table:

python sqlmap.py 
  -u localhost:8080/sql-injection-examples/vulnerable-services/films/ 
  --dump –T film

This takes very long, as each row and each column is fetched through at least one single query. But eventually, I get a nice CSV file of the complete contents. Needless to say that this could be your application’s credit card information, or other sensitive or classified data.

Arbitrary querying

You would think you’d notice if someone dumps your entire schema? That’s not necessary for the intruder. Remember, they have already been able to extract schema, table and column names. They can now selectively execute arbitrary queries as such:

python sqlmap.py 
  -u localhost:8080/sql-injection-examples/vulnerable-services/films/ 
  --sql-query=”SELECT a.first_name, a.last_name, count(*) 
               FROM film f 
               JOIN film_actor fa USING (film_id) 
               JOIN actor a USING (actor_id) 
               GROUP BY actor_id, a.first_name, a.last_name 
               ORDER BY count(*) DESC LIMIT 1”

The above query will find the actor who has played in most films:

12-sqli

Not all queries are syntactically injectable into all vulnerabilities on all RDBMS. But the Little Bobby Tables example is not what most intruders are after anyway: Causing damage to your system. Most intruders do not want to be seen in order to be able to silently steal your data.

Other sqlmap features

sqlmap doesn’t stop there. sqlmap allows you to switch database users, figuring out root or DBA passwords by brute force. Once you gain access to a user or role with higher grants and depending on the actual RDBMS, sqlmap will also allow you to:

  • Upload and execute a SQL script
  • Inject and execute custom user-defined functions
  • Read or write files on the database server file system
  • Open up a shell on the database server operating system
  • Manipulate the database server’s Windows registry

On a less serious note, sqlmap can be an excellent database server administration tool, should you have forgotten the credentials of your own local database development environment!

Do not leave SQL injection to chance.

Counteractions

In principle, it is almost impossible to completely prevent vulnerabilities involving SQL. SQL is a dynamically interpreted language and as such, it is much more vulnerable to code injection than static languages like Java. In particular, it is extremely vulnerable as it is a common practice to perform dynamic SQL based on user input criteria, such as search criteria. Obviously, other server-side interpreted languages are equally vulnerable, but SQL also happens to be one of the most popular ones.

However, taking the right measures will make it very hard for an intruder to actually find a vulnerability and exploit it without you noticing their activities first. Remember, the less information you display the longer sqlmap takes and the more traces it leaves in your server logs to even verify if a vulnerability is present.

Here are a couple of measures that you should follow and enforce in your team:

Never trust user input

First and most importantly: Never ever trust user input. Even if your application is only used by 10 users from the same company via intranet, your database might contain salaries or other sensitive data, and a malicious employee can dump such data.

Note that user input should not be trusted in other scenarios either, e.g. when accessing the file system.

Use as little user input as possible

Not only should you distrust user input, you should also use as little user input as possible directly in your SQL. For example, if users can make a choice in an HTML dropdown. Instead of simply embedding the HTTP POST parameter value into your SQL statement, parse it first and encapsulate it in an appropriate pre-defined type. In Java, a good match for HTML options could be an enum type.

You know your data best, and thus, you should validate all user input on the server, immediately after receiving it.

Use bind variables

Try to express your SQL statements as statically as possible. SQL string concatenation makes it very easy for junior developers to make mistakes. If you are an engineering team leader, it is your responsibility to help your team members avoid such mistakes.

The simplest way to prevent SQL injection is by using bind variables. JDBC drivers (if you’re operating with Java) and databases have very few bugs in that area, such that streaming bind variables to the database will not generate any easily exploitable vulnerability.

Use static code analysis tools

If you are using Java and JDBC directly, you may be able to detect some vulnerabilities using static code analysis tools, such as FindBugs™ or Alvor. Obviously, such tools can only detect SQL injection when SQL statements are “relatively static”. If you distribute SQL criteria string generation over several code modules, it is not possible for a static code analysis tool, to detect vulnerabilities.

Use sqlmap

sqlmap is not necessarily a tool for malicious activities. It is free and Open Source, available under the GPLv2 license. You can use it in your continuous integration test suites to detect SQL injection regressions. As you know your own software very well, you can configure sqlmap with a variety of parameters that will give it a “head start” as it will not need to figure out whether you are operating on MySQL, PostgreSQL, Oracle, etc.

Apply SQL abstraction

In addition to a very low-level API (JDBC in Java, native functions in PHP, etc.), most platforms also have a variety of higher-level APIs that abstract the SQL language in a non-string-based way. However, don’t be tricked into thinking that SQL language abstraction itself keeps you safe from code injection. Popular frameworks like Hibernate or JPA use string-based query languages like HQL or JPQL. These languages are less vulnerable than SQL because they’re less expressive. But they are still vulnerable!

In .NET an example of a non-string-based SQL abstraction is LINQ-to-SQL. In Java, examples are SQLJ, JPA CriteriaQuery or jOOQ. Our previous example query would translate to these ones:

Static SQL with SQLJ

Static SQL with SQLJ

Statically typed LINQ-to-SQL with LINQ

Statically typed LINQ-to-SQL with LINQ

Statically typed JPQL with JPA Criteria Query

Statically typed JPQL with JPA Criteria Query

Statically typed SQL with jOOQ

Statically typed SQL with jOOQ

In addition to being much safer from SQL injection through enforcing the use of bind variables, statically typed internal domain-specific languages also help prevent syntax mistakes. In the case of jOOQ, this is further supported by jOOQ’s source code generator, which will generate Java literals for tables and columns.

Carefully craft database GRANTs and security policies

Don’t just blindly use the MySQL root user with no sensible password for everything (or the postgres user in PostgreSQL, or the dbo user in SQL Server, etc.). Create a well-designed security strategy, where users are granted access only to those elements that they really need to access.

Often, a good idea is also to add an additional layer of security indirection inside your database through database views. For instance, web users are granted access to a few read-only views, never to tables directly. This will allow you to dynamically restrict who can manipulate what data.

This measure will not prevent SQL injection, but it will minimize the possible damage an intruder can cause, once they have penetrated your system.

Use firewalls

If you’re developing a web application, you can choose from a variety of firewalls that have some SQL injection protection features. Apart from simple regular-expression based pattern matching (which is not really reliable or useful), such an entry server should also support two very powerful features to help you prevent mistakes when accepting user input through HTTP GET and POST parameters:

  • URL encryption: All application URLs are encrypted and never disclosed to the client. It is thus impossible to tamper with GET parameters. It is even impossible to recognise GET parameters. The disadvantage of this feature is the fact that it can be quite difficult to implement a modern JavaScript-based rich internet application because… you cannot modify GET parameters.
  • Form Protection: All acceptable values that can be chosen in a form are known to the entry server and validated using an encrypted hash. This makes it impossible to tamper with HTML <select>, <input type=”hidden”>, <input type=”checkbox”>, <input type=”radio”> values. Obviously, “threatening” user input can still originate from regular <input type=”text”> or <textarea> values.

An example of an entry server implementing the above is Airlock by a Swiss company called Ergon Informatik AG.

Other measures

A lot of research has been made on the topic of countermeasures against SQL injection. Interesting publications to read are:

Conclusion

Even with recent alternative data storage and access models that have come to be known as NoSQL, SQL as a querying and database interfacing language is still hardly challenged. Major SQL vendors implement better and better features into the SQL standard. SQL is here to stay.

Yet, many development teams are unaware of the magnitude of this threat. In millions of lines of application code, it can be sufficient if one SQL vulnerability is detected by a malicious entity operating with automated tools, such as sqlmap. Such an entity may be able to extract all data from your database, and / or execute malicious code on your servers. This can go up to seizing the server.

It is a software architect’s or technical lead’s responsibility to minimise the risk of creating SQL injection vulnerabilities, as even skilled developers may accidentally create such a vulnerability. The simplest and most effective measure is to use bind variables and static SQL wherever possible. More elaborate countermeasures can be achieved by lots of training, screening, testing, and using entry servers.

Hilarious Rant about SQL Injection

My recent article about SQL injection has stirred some serious emotions on JCG. I don’t want to keep it from you! An extract:

[…] The idea that if I use an ORM, my SQL injection woes will magically go away is f***ing harmful, shortsighted, and anybody who thinks that should be kicked squarely in a sensitive region. […]

And if you’ve survived that kick…

[…] Since there is no SQL statement, things like “‘a’; TRUNCATE your_mom” get stored/selected from as exactly that […]

And if your mom has survived truncation, too:

[…] ORM still won’t save you. Validate your god-d*** f***ing inputs, jack-a**.

So please, sanitise your code, or the angry man will come and get you!! 🙂
http://www.javacodegeeks.com/2012/07/database-abstraction-and-sql-injection.html#comment-603438572

Database Abstraction and SQL Injection

I have subscribed to various user groups of jOOQ’s competing database abstraction tools. One of which is ActiveJDBC, a Java implementation of Active Record design pattern. Its maintainer Igor Polevoy recently claimed that:

SQL injection is a web application problem, and not directly related to an ORM. ActiveJDBC will process any SQL that is passed to it.

(See the discussion here: https://groups.google.com/d/topic/activejdbc-group/5D2jhWuW4Sg/discussion)

Is that really true? Should the database abstraction layer delegate SQL injection prevention to the client application?

SQL Injection Background

SQL injection is a problem that most of us developers have had to deal with one time or another in their professional lives. Wikipedia explains the problem nicely. Given the following piece of Java code (or any other language):

statement = "SELECT * FROM users WHERE name = '" + userName + "';"

Imagine that “userName” is a variable taken from an HTTP request. Blindly pasting an HTTP request parameter gives way to simple attacks as these:

-- attacker sends this code in the userName field:
userName = "a';DROP TABLE users; SELECT * FROM userinfo WHERE 't' = 't"

-- resulting in the following statement:
statement = "SELECT * FROM users WHERE name = 'a';"
          + "DROP TABLE users;" +
          + "SELECT * FROM userinfo WHERE 't' = 't';"

This doesn’t happen to you?

Maybe not. But the problem is seen quite often on Stack Overflow. More than 2000 results when searching for “SQL injection”: http://stackoverflow.com/search?q=sql+injection. So even if you know how to prevent it, someone on your team may not. OK, but …

it’s not that bad if one out of 500 statements is badly written by some programmer that was oblivious of this threat?

Think again. Have you ever heard of a tool called sqlmap? This tool will find any problematic page within your application within a couple of seconds/minutes/hours, depending on the severity of your injection problem. Not only that, once it has found problematic pages, it will be able to extract ALL kinds of data from your database. I mean ALL kinds of data. A selection of sqlmap features:

  • Support to enumerate users, password hashes, privileges, roles, databases, tables and columns.
  • Support to search for specific database names, specific tables across all databases or specific columns across all databases’ tables. This is useful, for instance, to identify tables containing custom application credentials where relevant columns’ names contain string like name and pass.
  • Support to download and upload any file from the database server underlying file system when the database software is MySQL, PostgreSQL or Microsoft SQL Server.
  • Support to execute arbitrary commands and retrieve their standard output on the database server underlying operating system when the database software is MySQL, PostgreSQL or Microsoft SQL Server.

Yes! If you suffer from SQL-injection-unsafe code, an attacker can seize your server under some circumstances!! In our company, we’ve tried sqlmap in a sandbox environment against some open source blog software with known vulnerabilities. We’ve managed to seize the server in no time, without writing a single line of SQL

Database Abstraction and SQL Injection

OK, now that I have your attention, let’s think again about what Igor Polevoy said:

SQL injection is a web application problem, and not directly related to an ORM. ActiveJDBC will process any SQL that is passed to it.

Yes, he may be right. Given that ActiveJDBC is a slim wrapper for JDBC, allowing to do nice CRUD simplifications, such as these (taken from their website):

List<Employee> people =
Employee.where("department = ? and hire_date > ? ", "IT", hireDate)
        .offset(21)
        .limit(10)
        .orderBy("hire_date asc");

Did you spot the risk for SQL injection? Right. Even if it uses bind values for underlying PreparedStatements, this tool is as unsafe as JDBC. You can avoid SQL injection, if you’re careful. Or you can start concatenating strings all over. But you have to be aware of that! How does jOOQ handle situations like these? The jOOQ manual explains how bind values are handled explicitly or implicitly. Here are some examples:

// Implicitly creating a bind value for "Poe"
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal("Poe"));

// Explicitly creating a (named) bind value for "Poe"
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal(param("lastName", "Poe")));

// Explicitly inlining "Poe" in the generated SQL string
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal(inline("Poe")));

The above examples will yield

SELECT * FROM T_AUTHOR WHERE LAST_NAME = ?
SELECT * FROM T_AUTHOR WHERE LAST_NAME = ?
SELECT * FROM T_AUTHOR WHERE LAST_NAME = 'Poe'

In the case where “Poe” is inlined, escaping is handled by jOOQ, to prevent syntax errors and SQL injection. But jOOQ also supports injecting SQL strings directly in generated SQL. For instance:

// Inject plain SQL into jOOQ
create.select()
      .from(T_AUTHOR)
      .where("LAST_NAME = 'Poe'");

In this case, SQL injection can occur just as with JDBC

Conclusion

In essence, Igor is right. It is the (client) application developer’s responsibility to be aware of SQL injection issues created by their code. But if a database abstraction framework built on top of JDBC can avoid SQL injection as much as possible in its API, all the better. From a SQL injection perspective, database abstraction frameworks can be divided into three categories:

  • Simple utilities. These include Spring’s JdbcTemplate, or Apache’s DbUtils. They really just enhance the JDBC API with convenience (less exception handling, less verbosity, simpler variable binding, simpler data fetching). Of course, these tools won’t prevent SQL injection
  • Full SQL abstraction. These include jOOQ, JaQu, JPA’s CriteriaQuery and others. Their normal operation mode will always render bind values in generated SQL. This prevents SQL injection in most cases.
  • The others. Many other frameworks (including ActiveJDBC and Hibernate) are mainly based on (SQL or HQL) string operations. While they abstract many SQL-related things, they do not prevent SQL injection at all.

So, when you choose any SQL abstraction tool in your Java application, beware of the severity of SQL injection. And beware of the fact, whether your tool helps you prevent it or not!

Igor’s response

Note that Igor has posted this interesting response to this post here:

http://igorpolevoy.blogspot.ch/2012/07/defend-against-sql-injection-using.html