The threat caused by SQL injection is heavily underestimated even by many senior developers and software architects. Most people are unaware of the fact that an entire server can be at risk by a single vulnerability even in the remotest piece of logic. This article will give a frightening insight into the potential severity of SQL injection vulnerabilities.
What is SQL injection?
The Wikipedia article on SQL injection reads:
SQL injection is a code injection technique, used to attack data driven applications, in which malicious SQL statements are inserted into an entry field for execution.
In other words, if a website or some other software entity has a vulnerability, it is possible for an attacker to “inject” arbitrary pieces of SQL code for execution on the server. A straight-forward and popular example of such a vulnerability is when a dynamic SQL statement directly embeds user input such as the following Java code does:
The title variable could be user input, e.g. an HTML field. Nothing prevents this user input from being a SQL code snippet that would make perfect sense syntactically. For example:
It is easy to see that when substituted into the previous SQL statement, this “title” will result in always selecting all films. Another, famous example of how this can go wrong is xkcd’s famous Little Bobby Tables strip.
Automating the search for vulnerabilities
The above introduction shows that SQL injection is possible and quite simple to exploit, manually, if the server side source code is well known. Finding such a vulnerability in a huge application with thousands of SQL statements, however, is a lot of work. Besides, there are much more subtle vulnerabilities than the obvious pattern matching search. sqlmap is an Open Source, GPLv2 licensed tool for automating such searches.
Let’s assume we have a simple RESTful application written in Java using Jetty and JAX-RS querying the MySQL Sakila example database, which exposes the above vulnerability at this URL:
alien is a @PathParam value passed to this method, which returns all films titled %alien%:
Now, let’s download sqlmap and let it run against the above URL:
python sqlmap.py -u localhost:8080/sql-injection-examples/vulnerable-services/films/
Note that sqlmap is implemented in Python 2.x. When we let this run, my server-side SQL logs show me that there are a couple of interesting SQL queries being executed:
sqlmap tries to inject all sorts of snippets that would help it discover if the vulnerable query is deterministic, whether the URL is stable, what database server type this is, if the vulnerability is inside a subquery, whether UNION clauses can be appended, etc. This is also nicely displayed in the sqlmap stdout log output:
With a total of 59 HTTP requests (among which 41 resulted in HTTP 500 errors), sqlmap was capable of detecting the nature of the vulnerability of my SQL statement, and it also figured out the database server and version.
This is not so impressive yet. With a lot of SQL knowledge and creativity, I might have figured this out myself. But sqlmap offers a lot of other interesting operation modes as can be seen in the user guide. Let us use the database navigation tools. For this, we’ll add the –dbs parameter on the same URL:
python sqlmap.py -u localhost:8080/sql-injection-examples/vulnerable-services/films/ --dbs
This will immediately dump the following databases:
Note that sqlmap logs everything it learns in a local SQLite database, such that it can keep the number of HTTP requests as small as possible. This is important for an attacker, as frequent HTTP 404 or 500 statuses will eventually trigger attention by maintenance personnel at the server side.
But how did it find the actual database names through my vulnerable SQL statement? In three steps:
- It sent a first statement to check if the vulnerability still exists
- It sent a second statement to see how many databases there are (8)
- It sent a third statement to learn the name of each database
Let’s look at step 2. The query executed is this one:
Because sqlmap previously discovered that my naïve server implementation just dumps stack traces of HTTP 500 errors, it knew that it could generate the following MySQL error message and convey the information of interest hidden inside that message:
com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'qnfwq8qdmiq1' for key 'group_key'
The SQL concat function wraps the number 8 with two uniquely identifiable strings. With a similar query and the MySQL LIMIT clause, database names can then be extracted one by one, e.g. the Sakila database:
com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'qnfwqsakilaqdmiq1' for key 'group_key'
Again, the information of interest is wrapped between delimiters.
Dumping data without stack traces
Obviously, dumping stack traces in a productive system is really a bad idea, as you should never give an intruder any hints about how your system works. Let’s try to change our application such that it will display a simple 500 Internal Server Error message:
No problem for sqlmap. When I rerun my previous database dumping command, sqlmap generates all database names letter by letter. Now, that it has no way of producing error message output in HTTP responses, it can only do a binary search on each letter of a schema and see if any given search still produces the regular list of Sakila films or an empty list:
The above query checks if the first schema (LIMIT 0, 1) has a letter higher than ASCII code 120 (ORD) at position 13 (MID). This is a lot slower and more visible in server logs than before, but it can still produce the result.
Dumping data without displaying UI output
What if the vulnerability is deep down in our application, never producing UI output? Let’s change our service again, executing the vulnerable query but without returning any data:
My log file shows me that there are many more SQL statements executed by sqlmap and even the confirmation if there is a vulnerability at the given URL takes longer, as sqlmap now resorts to binary searches using delays. Here’s how sqlmap confirms that there is a vulnerability:
As can be seen, the above queries run for N or N + 5 seconds, where N is the amount of time the actual query takes. If N is sufficiently stable, delay-based binary searches can be performed to discover database names letter by letter.
Dumping sensitive data
The previous sections have shown that a vulnerability is almost always exploitable by automatic means. Intruders have time. They can run their script over a week to have sqlmap dump your complete database schema. But the threat doesn’t end there. Once table and column names are accessible, all of those tables can be dumped, too. Let’s dump the film table:
python sqlmap.py -u localhost:8080/sql-injection-examples/vulnerable-services/films/ --dump –T film
This takes very long, as each row and each column is fetched through at least one single query. But eventually, I get a nice CSV file of the complete contents. Needless to say that this could be your application’s credit card information, or other sensitive or classified data.
You would think you’d notice if someone dumps your entire schema? That’s not necessary for the intruder. Remember, they have already been able to extract schema, table and column names. They can now selectively execute arbitrary queries as such:
python sqlmap.py -u localhost:8080/sql-injection-examples/vulnerable-services/films/ --sql-query=”SELECT a.first_name, a.last_name, count(*) FROM film f JOIN film_actor fa USING (film_id) JOIN actor a USING (actor_id) GROUP BY actor_id, a.first_name, a.last_name ORDER BY count(*) DESC LIMIT 1”
The above query will find the actor who has played in most films:
Not all queries are syntactically injectable into all vulnerabilities on all RDBMS. But the Little Bobby Tables example is not what most intruders are after anyway: Causing damage to your system. Most intruders do not want to be seen in order to be able to silently steal your data.
Other sqlmap features
sqlmap doesn’t stop there. sqlmap allows you to switch database users, figuring out root or DBA passwords by brute force. Once you gain access to a user or role with higher grants and depending on the actual RDBMS, sqlmap will also allow you to:
- Upload and execute a SQL script
- Inject and execute custom user-defined functions
- Read or write files on the database server file system
- Open up a shell on the database server operating system
- Manipulate the database server’s Windows registry
On a less serious note, sqlmap can be an excellent database server administration tool, should you have forgotten the credentials of your own local database development environment!
In principle, it is almost impossible to completely prevent vulnerabilities involving SQL. SQL is a dynamically interpreted language and as such, it is much more vulnerable to code injection than static languages like Java. In particular, it is extremely vulnerable as it is a common practice to perform dynamic SQL based on user input criteria, such as search criteria. Obviously, other server-side interpreted languages are equally vulnerable, but SQL also happens to be one of the most popular ones.
However, taking the right measures will make it very hard for an intruder to actually find a vulnerability and exploit it without you noticing their activities first. Remember, the less information you display the longer sqlmap takes and the more traces it leaves in your server logs to even verify if a vulnerability is present.
Here are a couple of measures that you should follow and enforce in your team:
Never trust user input
First and most importantly: Never ever trust user input. Even if your application is only used by 10 users from the same company via intranet, your database might contain salaries or other sensitive data, and a malicious employee can dump such data.
Note that user input should not be trusted in other scenarios either, e.g. when accessing the file system.
Use as little user input as possible
Not only should you distrust user input, you should also use as little user input as possible directly in your SQL. For example, if users can make a choice in an HTML dropdown. Instead of simply embedding the HTTP POST parameter value into your SQL statement, parse it first and encapsulate it in an appropriate pre-defined type. In Java, a good match for HTML options could be an enum type.
You know your data best, and thus, you should validate all user input on the server, immediately after receiving it.
Use bind variables
Try to express your SQL statements as statically as possible. SQL string concatenation makes it very easy for junior developers to make mistakes. If you are an engineering team leader, it is your responsibility to help your team members avoid such mistakes.
The simplest way to prevent SQL injection is by using bind variables. JDBC drivers (if you’re operating with Java) and databases have very few bugs in that area, such that streaming bind variables to the database will not generate any easily exploitable vulnerability.
Use static code analysis tools
If you are using Java and JDBC directly, you may be able to detect some vulnerabilities using static code analysis tools, such as FindBugs™ or Alvor. Obviously, such tools can only detect SQL injection when SQL statements are “relatively static”. If you distribute SQL criteria string generation over several code modules, it is not possible for a static code analysis tool, to detect vulnerabilities.
sqlmap is not necessarily a tool for malicious activities. It is free and Open Source, available under the GPLv2 license. You can use it in your continuous integration test suites to detect SQL injection regressions. As you know your own software very well, you can configure sqlmap with a variety of parameters that will give it a “head start” as it will not need to figure out whether you are operating on MySQL, PostgreSQL, Oracle, etc.
Apply SQL abstraction
In addition to a very low-level API (JDBC in Java, native functions in PHP, etc.), most platforms also have a variety of higher-level APIs that abstract the SQL language in a non-string-based way. However, don’t be tricked into thinking that SQL language abstraction itself keeps you safe from code injection. Popular frameworks like Hibernate or JPA use string-based query languages like HQL or JPQL. These languages are less vulnerable than SQL because they’re less expressive. But they are still vulnerable!
In addition to being much safer from SQL injection through enforcing the use of bind variables, statically typed internal domain-specific languages also help prevent syntax mistakes. In the case of jOOQ, this is further supported by jOOQ’s source code generator, which will generate Java literals for tables and columns.
Carefully craft database GRANTs and security policies
Don’t just blindly use the MySQL root user with no sensible password for everything (or the postgres user in PostgreSQL, or the dbo user in SQL Server, etc.). Create a well-designed security strategy, where users are granted access only to those elements that they really need to access.
Often, a good idea is also to add an additional layer of security indirection inside your database through database views. For instance, web users are granted access to a few read-only views, never to tables directly. This will allow you to dynamically restrict who can manipulate what data.
This measure will not prevent SQL injection, but it will minimize the possible damage an intruder can cause, once they have penetrated your system.
If you’re developing a web application, you can choose from a variety of firewalls that have some SQL injection protection features. Apart from simple regular-expression based pattern matching (which is not really reliable or useful), such an entry server should also support two very powerful features to help you prevent mistakes when accepting user input through HTTP GET and POST parameters:
- Form Protection: All acceptable values that can be chosen in a form are known to the entry server and validated using an encrypted hash. This makes it impossible to tamper with HTML <select>, <input type=”hidden”>, <input type=”checkbox”>, <input type=”radio”> values. Obviously, “threatening” user input can still originate from regular <input type=”text”> or <textarea> values.
An example of an entry server implementing the above is Airlock by a Swiss company called Ergon Informatik AG.
A lot of research has been made on the topic of countermeasures against SQL injection. Interesting publications to read are:
- “A Classification of SQL Injection Attacks and Countermeasures” by William G.J. Halfond, Jeremy Viegas, and Alessandro Orso of the Georgia Institute of Technology
- “Improving Web Application Security: Threats and Countermeasures” by J.D. Meier, Alex Mackman, Michael Dunner, Srinath Vasireddy, Ray Escamilla and Anandha Murukan of the Microsoft Corporation
- “SQL Injection Prevention Cheat Sheet” of OWASP
- “SQL Injection” of The Web Application Security Consortium
- “SQL Injection” of Wikipedia (with lots of links)
Even with recent alternative data storage and access models that have come to be known as NoSQL, SQL as a querying and database interfacing language is still hardly challenged. Major SQL vendors implement better and better features into the SQL standard. SQL is here to stay.
Yet, many development teams are unaware of the magnitude of this threat. In millions of lines of application code, it can be sufficient if one SQL vulnerability is detected by a malicious entity operating with automated tools, such as sqlmap. Such an entity may be able to extract all data from your database, and / or execute malicious code on your servers. This can go up to seizing the server.
It is a software architect’s or technical lead’s responsibility to minimise the risk of creating SQL injection vulnerabilities, as even skilled developers may accidentally create such a vulnerability. The simplest and most effective measure is to use bind variables and static SQL wherever possible. More elaborate countermeasures can be achieved by lots of training, screening, testing, and using entry servers.