J2eeBasedPreAuthenticatedWebAuthenticationDetailsSource… WAT??

WAT? This hilarious talk about the incredible quirks of JavaScript had been going around on Twitter and other media. In case you haven’t already, take a look at it here:

https://www.destroyallsoftware.com/talks/wat

Speaking of WAT, let’s talk about Spring Security. Spring has undertaken great effort to replace the clumsy, verbose parts of J2EE to replace them with straightforward and intuitive APIs. In Spring Security, you will find great features like the eraseCredentialsAfterAuthentication property, which keeps you from accidentally letting credentials lie around. Accidentally. What, like printing them to the standard out, or to the HTML document?? If this happens to you accidentally, then you might have a lot of other problems, too. Frameworks helping people to avoid such accidents remind me of PHP and this interesting post about how fundamentally broken PHP is:

http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-design/

If you’ve ever heard about PHP’s abominable “magic quotes”, you’ll understand what I mean.

More examples

There’s more. Spring has incredibly specialised API types, such as:

  • J2eeBasedPreAuthenticatedWebAuthenticationDetailsSource
  • PreAuthenticatedGrantedAuthoritiesWebAuthenticationDetails
  • PreAuthenticatedGrantedAuthoritiesAuthenticationDetails
  • GrantedAuthorityFromAssertionAttributesUserDetailsService
  • MutableGrantedAuthoritiesContainer
  • MethodSecurityMetadataSourceBeanDefinitionParser
  • AbstractUserDetailsServiceBeanDefinitionParser

Wow. I didn’t know Java identifiers were even allowed to be that long. And there’s more. Consider the package names (line break introduced artificially for improved readability):

org.springframework.security.web
   .authentication.preauth.websphere
   .WebSpherePreAuthenticatedWebAuthenticationDetailsSource

Note the repetition of concepts in both package and class names:

  • 2x “web”
  • 4x “authentication” (out of which 2x “pre-authenticated”)
  • 2x “websphere

For all of you who weren’t aware that the class browsing frame in the regular Javadoc frameset has a horizontal scrollbar, I invite you to navigate through the incredible verbosity of Spring Security:

http://static.springsource.org/spring-security/site/docs/3.1.x/apidocs/index.html

</rant>

Learn Eclipse Keyboard Shortcuts Easily

Disclaimer: Some users experienced this plugin to break their Eclipse Juno (see comments). Use at own risk!

Using keyboard shortcuts can vastly increase your productivity. Instead of switching between keyboard and mouse all the times, trying to figure out where that mouse cursor is, left/right clicking your way through context menus nested several levels deep, you could’ve just hit AltShift-I, for instance, for inlining constants, variables, methods, etc… But how to remember the most important ones? Tricky…

Except if you’re using the mousefeed Eclipse plugin:
http://www.mousefeed.com/ (ignore the bad page layout)

This plugin will show a very annoying and obnoxious, big popup indicating the keyboard shortcut that you could’ve used just now instead of clicking through Eclipse menus with your mouse. For instance:

A shortcut reminder
A shortcut reminder

This will eventually annoy you so much that you’ll willingly use the suggested keyboard shortcut just to make the popup go away! And the best thing is, you’ll only learn the shortcuts for those actions that you really execute very often.

Be productive again!

Alvor: Static SQL analysis in Strings passed to JDBC

I have recently discovered this nice Eclipse plugin here:
http://code.google.com/p/alvor/

It evaluates String, StringBuilder, StringBuffer, CharSequence and many other types passed to JDBC method for subsequent execution. It doesn’t do a bad job at this, even if it is in beta mode. The rate of false positives that I have experienced is around 20% for regular SQL statements, and 100% for stored procedure calls (which seem not to be supported). Checks include:

  • Syntax correctness
  • Semantics correctness
  • Object availability

It does so by

  • Comparing SQL against its own internal SQL grammar
  • Checking SQL statements against an actual database (provided a JDBC driver, JDBC URL, user, password)

This is extremely powerful, as it can find common bugs resulting from bad SQL string concatenation, misspelled table / column names, type mismatches, etc. With findbugs’ capabilities of analysing control flows, this could be made even better to detect even remote corner-cases or SQL passed to methods for the concatenation of sub-clauses and sub-selects. An example screenshot is given on their website:

An Alvor Screenshot

 

Check out Alvor for yourself here:
http://code.google.com/p/alvor/

Note, I have also posted a request to the FindBugs mailing list here:
https://mailman.cs.umd.edu/pipermail/findbugs-discuss/2012-September/003634.html

Static SQL analysis of this sort would be a great addition to FindBugs. If you think so, too, please support my request on the FindBugs mailing list

Serious SQL: A “convex hull” of “correlated tables”

Now THIS is an interesting, and challenging question on the jOOQ user group:
https://groups.google.com/d/topic/jooq-user/6TBBLYt9eR8/discussion

Say you have a big database with lots of tables and foreign key references. Now you would like to know all tables that are somehow inter-connected by their respective foreign key relationship “paths”. You could call this a “convex hull” around all of your “correlated tables”. Here’s a pseudo-algorithm to achieve this:

// Initialise the hull with an "origin" table
Set tables = {"any table"};
int size = 0;

// Grow the "tables" result until no new tables are added
while (size < tables.size) {
  size = tables.size;

  for (table in tables) {
    tables.addAll(table.referencedTables);
    tables.addAll(table.referencingTables);
  }
}

At the end of this algorithm, you would have all tables in the “tables” set, that are somehow connected with the original “any table”.

Calculate this with jOOQ

With jOOQ’s generated classes, you can easily implement the above algorithm in Java. This would be an example implementation

public class Hull {
  public static Set<Table<?>> hull(Table<?>... tables) {
    Set<Table<?>> result =
        new HashSet<Table<?>>(Arrays.asList(tables));

    // Loop as long as there are no new result tables
    int size = 0;
    while (result.size() > size) {
      size = result.size();

      for (Table<?> table : new ArrayList<Table<?>>(result)) {

        // Follow all outbound foreign keys
        for (ForeignKey<?, ?> fk : table.getReferences()) {
          result.add(fk.getKey().getTable());
        }

        // Follow all inbound foreign keys from tables
        // within the same schema
        for (Table<?> other : table.getSchema().getTables()) {
          if (other.getReferencesTo(table).size() > 0) {
            result.add(other);
          }
        }
      }
    }

    return result;
  }

  public static void main(String[] args) {
    // Calculate the "convex hull" for the T_AUTHOR table
    System.out.println(hull(T_AUTHOR));
  }
}

Do it with SQL

Now this still looks straightforward. But we’re SQL pro’s and we love weird queries, so let’s give Oracle SQL a shot at resolving this problem in a single SQL statement. Here goes (warning, some serious SQL ahead)!

-- "graph" denotes an undirected foreign key reference graph
-- for schema "TEST"
with graph as (
  select c1.table_name t1, c2.table_name t2
  from all_constraints c1
    join all_constraints c2
      on c1.owner = c2.r_owner
      and c1.constraint_name = c2.r_constraint_name
  where c1.owner = 'TEST'
  union all
  select c2.table_name t1, c1.table_name t2
  from all_constraints c1
    join all_constraints c2
      on c1.owner = c2.r_owner
      and c1.constraint_name = c2.r_constraint_name
  where c1.owner = 'TEST'
),
-- "paths" are all directed paths within that schema
-- as a #-delimited string
paths as (
  select sys_connect_by_path(t1, '#') || '#' path
  from graph
  connect by nocycle prior t1 = t2
),
-- "subgraph" are all those directed paths that go trough
-- a given table T_AUTHOR
subgraph as (
  select distinct t.table_name,
    regexp_replace(p.path, '^#(.*)#$', '\1') path
  from paths p
  cross join all_tables t
  where t.owner = 'TEST'
  and p.path like '%#' || t.table_name || '#%'
),
-- This XML-trick splits paths and generates rows for every distinct
-- table name
split_paths as (
select distinct table_name origin,
  cast(t.column_value.extract('//text()') as varchar2(4000)) table_names
from
  subgraph,
  table(xmlsequence(xmltype(
      '<x><x>' || replace(path, '#', '</x><x>') ||
'</x></x>').extract('//x/*'))) t
),
-- "table_graphs" lists every table and its associated graph
table_graphs as (
  select
    origin,
    count(*) graph_size,
    listagg(table_names, ', ') within group (order by 1) table_names
  from split_paths
  group by origin
)
select
  origin,
  graph_size "SIZE",
  dense_rank() over (order by table_names) id,
  table_names
from table_graphs
order by origin

When run against the jOOQ integration test database, this beautiful query will return:

+----------------------+------+----+-----------------------------------------+
| ORIGIN               | SIZE | ID | TABLE_NAMES                             |
+----------------------+------+----+-----------------------------------------+
| T_658_11             |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_658_12             |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_658_21             |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_658_22             |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_658_31             |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_658_32             |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_658_REF            |    7 |  3 | T_658_11, T_658_12, T_658_21, T_658_22, |
|                      |      |    | T_658_31, T_658_32, T_658_REF           |
| T_AUTHOR             |    7 |  1 | T_AUTHOR, T_BOOK, T_BOOK_DETAILS,       |
|                      |      |    | T_BOOK_SALE, T_BOOK_STORE,              |
|                      |      |    | T_BOOK_TO_BOOK_STORE, T_LANGUAGE        |
| T_BOOK               |    7 |  1 | T_AUTHOR, T_BOOK, T_BOOK_DETAILS,       |
|                      |      |    | T_BOOK_SALE, T_BOOK_STORE,              |
|                      |      |    | T_BOOK_TO_BOOK_STORE, T_LANGUAGE        |
| T_BOOK_DETAILS       |    7 |  1 | T_AUTHOR, T_BOOK, T_BOOK_DETAILS,       |
|                      |      |    | T_BOOK_SALE, T_BOOK_STORE,              |
|                      |      |    | T_BOOK_TO_BOOK_STORE, T_LANGUAGE        |
| T_BOOK_STORE         |    7 |  1 | T_AUTHOR, T_BOOK, T_BOOK_DETAILS,       |
|                      |      |    | T_BOOK_SALE, T_BOOK_STORE,              |
|                      |      |    | T_BOOK_TO_BOOK_STORE, T_LANGUAGE        |
| T_BOOK_TO_BOOK_STORE |    7 |  1 | T_AUTHOR, T_BOOK, T_BOOK_DETAILS,       |
|                      |      |    | T_BOOK_SALE, T_BOOK_STORE,              |
|                      |      |    | T_BOOK_TO_BOOK_STORE, T_LANGUAGE        |
| T_DIRECTORY          |    1 |  2 | T_DIRECTORY                             |
| T_LANGUAGE           |    7 |  1 | T_AUTHOR, T_BOOK, T_BOOK_DETAILS,       |
|                      |      |    | T_BOOK_SALE, T_BOOK_STORE,              |
|                      |      |    | T_BOOK_TO_BOOK_STORE, T_LANGUAGE        |
| X_TEST_CASE_64_69    |    4 |  4 | X_TEST_CASE_64_69, X_TEST_CASE_71,      |
|                      |      |    | X_TEST_CASE_85, X_UNUSED                |
| X_TEST_CASE_71       |    4 |  4 | X_TEST_CASE_64_69, X_TEST_CASE_71,      |
|                      |      |    | X_TEST_CASE_85, X_UNUSED                |
| X_TEST_CASE_85       |    4 |  4 | X_TEST_CASE_64_69, X_TEST_CASE_71,      |
|                      |      |    | X_TEST_CASE_85, X_UNUSED                |
| X_UNUSED             |    4 |  4 | X_TEST_CASE_64_69, X_TEST_CASE_71,      |
|                      |      |    | X_TEST_CASE_85, X_UNUSED                |
+----------------------+------+----+-----------------------------------------+

Can you beat this? :-)

I challenge you to write a shorter query and to achieve the same result! Here’s the integration test database:
https://github.com/jOOQ/jOOQ/blob/master/jOOQ-test/src/org/jooq/test/oracle/create.sql

Note that the above query is horribly inefficient. There’s a lot of potential in beating that, too!

When will we have LINQ in Java?

LINQ is one of Microsoft’s .NET Framework’s most distinct language features. When it was first introduced to languages such as C#, it required heavy changes to the language specification. Yet, this addition was extremely powerful and probably unequalled by other languages / platforms, such as Java, Scala, etc. Granted, Scala has integrated XML in a similar fashion into its language from the beginning, but that is hardly the same accomplishment. Nowadays, Typesafe developers are developing SLICK – Scala Language Integrated Connection Kit, which has similar ambitions, although the effort spent on it is hardly comparable: one “official” Scala developer against a big Microsoft team. Let alone the potential of getting into patent wars with Microsoft, should SLICK ever become popular.

What does Java have to offer?

There are many attempts of bringing LINQ-like API’s to the Java world, as the following Stack Overflow question shows:

http://stackoverflow.com/questions/1217228/what-is-the-java-equivalent-for-linq

Here’s another newcomer project by Julian Hyde, that I’ve recently discovered:

https://github.com/julianhyde/linq4j

He tried his luck on the lambda-dev mailing list, without any response so far. We’re all eagerly awaiting Java 8 and project lambda with its lambda expressions and extension methods. But when will we be able to catch up with Microsoft’s LINQ? After all, jOOQ, linq4j are “internal domain specific languages”, which are all limited by the expressivity of their host language (see my previous blog post about building domain specific languages in Java).

Java 9 maybe? We can only hope!

Database Abstraction and SQL Injection

I have subscribed to various user groups of jOOQ’s competing database abstraction tools. One of which is ActiveJDBC, a Java implementation of Active Record design pattern. Its maintainer Igor Polevoy recently claimed that:

SQL injection is a web application problem, and not directly related to an ORM. ActiveJDBC will process any SQL that is passed to it.

(See the discussion here: https://groups.google.com/d/topic/activejdbc-group/5D2jhWuW4Sg/discussion)

Is that really true? Should the database abstraction layer delegate SQL injection prevention to the client application?

SQL Injection Background

SQL injection is a problem that most of us developers have had to deal with one time or another in their professional lives. Wikipedia explains the problem nicely. Given the following piece of Java code (or any other language):

statement = "SELECT * FROM users WHERE name = '" + userName + "';"

Imagine that “userName” is a variable taken from an HTTP request. Blindly pasting an HTTP request parameter gives way to simple attacks as these:

-- attacker sends this code in the userName field:
userName = "a';DROP TABLE users; SELECT * FROM userinfo WHERE 't' = 't"

-- resulting in the following statement:
statement = "SELECT * FROM users WHERE name = 'a';"
          + "DROP TABLE users;" +
          + "SELECT * FROM userinfo WHERE 't' = 't';"

This doesn’t happen to you?

Maybe not. But the problem is seen quite often on Stack Overflow. More than 2000 results when searching for “SQL injection”: http://stackoverflow.com/search?q=sql+injection. So even if you know how to prevent it, someone on your team may not. OK, but …

it’s not that bad if one out of 500 statements is badly written by some programmer that was oblivious of this threat?

Think again. Have you ever heard of a tool called sqlmap? This tool will find any problematic page within your application within a couple of seconds/minutes/hours, depending on the severity of your injection problem. Not only that, once it has found problematic pages, it will be able to extract ALL kinds of data from your database. I mean ALL kinds of data. A selection of sqlmap features:

  • Support to enumerate users, password hashes, privileges, roles, databases, tables and columns.
  • Support to search for specific database names, specific tables across all databases or specific columns across all databases’ tables. This is useful, for instance, to identify tables containing custom application credentials where relevant columns’ names contain string like name and pass.
  • Support to download and upload any file from the database server underlying file system when the database software is MySQL, PostgreSQL or Microsoft SQL Server.
  • Support to execute arbitrary commands and retrieve their standard output on the database server underlying operating system when the database software is MySQL, PostgreSQL or Microsoft SQL Server.

Yes! If you suffer from SQL-injection-unsafe code, an attacker can seize your server under some circumstances!! In our company, we’ve tried sqlmap in a sandbox environment against some open source blog software with known vulnerabilities. We’ve managed to seize the server in no time, without writing a single line of SQL

Database Abstraction and SQL Injection

OK, now that I have your attention, let’s think again about what Igor Polevoy said:

SQL injection is a web application problem, and not directly related to an ORM. ActiveJDBC will process any SQL that is passed to it.

Yes, he may be right. Given that ActiveJDBC is a slim wrapper for JDBC, allowing to do nice CRUD simplifications, such as these (taken from their website):

List<Employee> people =
Employee.where("department = ? and hire_date > ? ", "IT", hireDate)
        .offset(21)
        .limit(10)
        .orderBy("hire_date asc");

Did you spot the risk for SQL injection? Right. Even if it uses bind values for underlying PreparedStatements, this tool is as unsafe as JDBC. You can avoid SQL injection, if you’re careful. Or you can start concatenating strings all over. But you have to be aware of that! How does jOOQ handle situations like these? The jOOQ manual explains how bind values are handled explicitly or implicitly. Here are some examples:

// Implicitly creating a bind value for "Poe"
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal("Poe"));

// Explicitly creating a (named) bind value for "Poe"
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal(param("lastName", "Poe")));

// Explicitly inlining "Poe" in the generated SQL string
create.select()
      .from(T_AUTHOR)
      .where(LAST_NAME.equal(inline("Poe")));

The above examples will yield

SELECT * FROM T_AUTHOR WHERE LAST_NAME = ?
SELECT * FROM T_AUTHOR WHERE LAST_NAME = ?
SELECT * FROM T_AUTHOR WHERE LAST_NAME = 'Poe'

In the case where “Poe” is inlined, escaping is handled by jOOQ, to prevent syntax errors and SQL injection. But jOOQ also supports injecting SQL strings directly in generated SQL. For instance:

// Inject plain SQL into jOOQ
create.select()
      .from(T_AUTHOR)
      .where("LAST_NAME = 'Poe'");

In this case, SQL injection can occur just as with JDBC

Conclusion

In essence, Igor is right. It is the (client) application developer’s responsibility to be aware of SQL injection issues created by their code. But if a database abstraction framework built on top of JDBC can avoid SQL injection as much as possible in its API, all the better. From a SQL injection perspective, database abstraction frameworks can be divided into three categories:

  • Simple utilities. These include Spring’s JdbcTemplate, or Apache’s DbUtils. They really just enhance the JDBC API with convenience (less exception handling, less verbosity, simpler variable binding, simpler data fetching). Of course, these tools won’t prevent SQL injection
  • Full SQL abstraction. These include jOOQ, JaQu, JPA’s CriteriaQuery and others. Their normal operation mode will always render bind values in generated SQL. This prevents SQL injection in most cases.
  • The others. Many other frameworks (including ActiveJDBC and Hibernate) are mainly based on (SQL or HQL) string operations. While they abstract many SQL-related things, they do not prevent SQL injection at all.

So, when you choose any SQL abstraction tool in your Java application, beware of the severity of SQL injection. And beware of the fact, whether your tool helps you prevent it or not!

Igor’s response

Note that Igor has posted this interesting response to this post here:

http://igorpolevoy.blogspot.ch/2012/07/defend-against-sql-injection-using.html

“NoSQL” should be called “SQL with alternative storage models”

Time and again, you’ll find blog posts like this one here telling you the same “truths” about SQL vs. NoSQL:

http://onewebsql.com/blog/no-sql-do-i-really-need-it
(OneWebSQL being a competitor of jOOQ, see a previous article for a comparison)

Usually, those blogs aim for the same arguments being:

  • Performance (“SQL” can “never” scale as much as “NoSQL”)
  • ACID (you don’t always need it)
  • Schemalessness (just store any data)

For some funny reason, all of these ideas have led to the misleading term “NoSQL”, which is interpreted by some as being “no SQL”, by others as being “not only SQL”. But SQL really just means “Structured Query Language”, and it is extremely powerful in terms of expressing relational context. It is well-designed for ad-hoc creation of tuples, records, tables, sets and for mapping them to other projections, reducing them to custom aggregations, etc. Note the terms “map/reduce”, which are often employed by NoSQL evangelists.

For good reasons, the Facebook Query Language (FQL), one of the leading NoSQL query languages, closely resembles SQL although it operates on a completely different data model. Oracle too, has jumped on the “NoSQL” train and sells its own product. It won’t be very long until the two types of data storage will merge and can be queried by an ISO/IEEE standardised SQL:2015 (or so). Because the true spirit of “NoSQL” does not consist in the way data is queried. It consists in the way data is stored. NoSQL is all about data storage. So, sooner or later, you will just create “traditional” tables along with “graph tables” and “hashmap tables” in the same database and join them in single SQL queries without thinking much about today’s hype.

“NoSQL” should be called “SQL with alternative storage models” and queried with pure SQL!