Detect JDBC API Misusage with JDBCLint

I’ve recently seen an advertisement for JDBCLint on the H2 User Group. JDBCLint is an Apache licensed JDBC proxy implementation that does some plausibility checks on the lifecycles of your JDBC objects. For instance, it

  • Checks if a ResultSet is closed twice
  • Checks if a ResultSet is not closed at all (in the finalizer)
  • Checks if a ResultSet contains columns that were never read

All of these checks can be disabled by specifying relevant properties on the proxy. And the best thing is that this proxy is so easy to integrate:

import com.maginatics.jdbclint.ConnectionProxy;
...
Connection connection =
    DriverManager.getConnection(...);
connection = ConnectionProxy.newInstance(
    connection, new Properties());
connection.close();
// reports error and optionally throws exception
connection.close();
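
How could such a proxy detect a double close()? Below is a minimal, hypothetical sketch of the technique using a plain JDK dynamic proxy. It is not JDBCLint’s actual implementation, and all class and method names are made up:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;

public class DoubleCloseCheckingProxy implements InvocationHandler {

    private final Connection delegate;
    private boolean closed;

    private DoubleCloseCheckingProxy(Connection delegate) {
        this.delegate = delegate;
    }

    // Wraps a physical connection in a checking proxy
    public static Connection wrap(Connection delegate) {
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            new DoubleCloseCheckingProxy(delegate));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args)
    throws Throwable {

        // Intercept close() to detect a duplicate close
        if ("close".equals(method.getName())) {
            if (closed)
                System.err.println("Connection closed twice");

            closed = true;
        }

        // Everything else is simply delegated to the real connection
        return method.invoke(delegate, args);
    }
}

The same idea extends to Statement and ResultSet wrappers, and, as mentioned above, the check for objects that were never closed at all can be done in a finalizer.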

In addition to static code analysis tools like FindBugs or Alvor, this tool can help you find very subtle memory leaks in your large legacy application.

Certainly a tool to have in your tool chain!

Deep Stack Traces Can be a Sign for Good Code Quality

The term “leaky abstractions” has been around for a while. Coining it is most often attributed to Joel Spolsky, who wrote this often-cited article about it. I’ve now stumbled upon another interpretation of a leaky abstraction, measured by the depth of a stack trace:

Leaky Abstractions as understood by Geek and Poke (Licensed CC-BY)

So, long stack traces are bad according to Geek & Poke. I’ve seen this argument before on Igor Polevoy’s blog (he’s the creator of ActiveJDBC, a Java implementation of the popular Ruby ActiveRecord query interface). Much like Joel Spolsky’s argumentation was often used to criticise ORMs, Igor’s argument was also used to compare ActiveJDBC with Hibernate. To quote:

One might say: so what, why do I care about the size of dependencies, depth of stack trace, etc. I think a good developer should care about these things. The thicker the framework, the more complex it is, the more memory it allocates, the more things can go wrong.

I completely agree that a framework with a certain amount of complexity tends to have longer stack traces. So if we run these axioms through our mental Prolog processors:

  • if Hibernate is a leaky abstraction, and
  • if Hibernate is complex, and
  • if complexity leads to long stack traces, then
  • leaky abstractions and long stack traces correlate

I wouldn’t go as far as claiming there’s a formal, causal connection. But a correlation seems logical.

But these things aren’t necessarily bad. In fact, long stack traces can be a good sign in terms of software quality. They can mean that the internals of a piece of software show a high amount of cohesion and a high degree of DRY-ness, which in turn means that there is little risk for subtle bugs deep down in your framework. High cohesion and high DRY-ness lead to a large portion of the code being extremely relevant within the whole framework, which in turn means that any low-level bug will pretty much blow up the whole framework, as it will lead to everything going wrong. If you do test-driven development, you’ll be rewarded by noticing immediately that your silly mistake breaks 90% of your test cases.

A real-world example

Let’s use jOOQ as an example to illustrate this, as we’re already comparing Hibernate and ActiveJDBC. Some of the longest stack traces in a database access abstraction can be achieved by putting a breakpoint at the interface of that abstraction with JDBC. For instance, when fetching data from a JDBC ResultSet.

Utils.getFromResultSet(ExecuteContext, Class<T>, int) line: 1945
Utils.getFromResultSet(ExecuteContext, Field<U>, int) line: 1938
CursorImpl$CursorIterator$CursorRecordInitialiser.setValue(AbstractRecord, Field<T>, int) line: 1464
CursorImpl$CursorIterator$CursorRecordInitialiser.operate(AbstractRecord) line: 1447
CursorImpl$CursorIterator$CursorRecordInitialiser.operate(Record) line: 1
RecordDelegate<R>.operate(RecordOperation<R,E>) line: 119
CursorImpl$CursorIterator.fetchOne() line: 1413
CursorImpl$CursorIterator.next() line: 1389
CursorImpl$CursorIterator.next() line: 1
CursorImpl<R>.fetch(int) line: 202
CursorImpl<R>.fetch() line: 176
SelectQueryImpl<R>(AbstractResultQuery<R>).execute(ExecuteContext, ExecuteListener) line: 274
SelectQueryImpl<R>(AbstractQuery).execute() line: 322
T_2698Record(UpdatableRecordImpl<R>).refresh(Field<?>...) line: 438
T_2698Record(UpdatableRecordImpl<R>).refresh() line: 428
H2Test.testH2T2698InsertRecordWithDefault() line: 931

Compared to ActiveJDBC’s stack traces, that’s quite a bit more, but still less than Hibernate’s (which uses lots of reflection and instrumentation). And it involves rather cryptic inner classes with quite a bit of method overloading. How to interpret that? Let’s go through this, bottom-up (or top-down in the stack trace).

CursorRecordInitialiser

The CursorRecordInitialiser is an inner class that encapsulates the initialisation of a Record by a Cursor, and it ensures that the relevant parts of the ExecuteListener SPI are covered in a single place. It is the gateway to JDBC’s various ResultSet methods. It is a generic internal RecordOperation implementation that is called by…

RecordDelegate

… a RecordDelegate. While the class name is pretty meaningless, its purpose is to shield and wrap all direct record operations so that a central implementation of the RecordListener SPI can be achieved. This SPI can be implemented by client code to listen to active record lifecycle events. The price for keeping the implementation of this SPI DRY is a couple of extra elements on the stack trace, as such callbacks are the standard way to implement closures in the Java language. But keeping this logic DRY guarantees that no matter how a Record is initialised, the SPI will always be invoked. There are (almost) no forgotten corner cases.
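
To see why such a delegate adds stack frames, here is a hypothetical sketch of the pattern. The names are made up and this is not jOOQ’s actual API: an operation callback interface, and a delegate that funnels every operation through one central place where a lifecycle listener is invoked.

interface Operation<R> {
    R operate(R record);
}

interface LifecycleListener<R> {
    void start(R record);
    void end(R record);
}

class OperationDelegate<R> {

    private final LifecycleListener<R> listener;

    OperationDelegate(LifecycleListener<R> listener) {
        this.listener = listener;
    }

    // Every record operation funnels through this single method, so the
    // listener is guaranteed to be invoked no matter who performs the
    // operation, at the price of one extra frame on the stack trace.
    R operate(R record, Operation<R> operation) {
        listener.start(record);

        try {
            return operation.operate(record);
        }
        finally {
            listener.end(record);
        }
    }
}

Before Java 8, call sites pass anonymous classes implementing such an operation interface, which is exactly the kind of callback that shows up as CursorRecordInitialiser.operate() and RecordDelegate.operate() in the stack trace above.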

But we were initialising a Record in…

CursorImpl

… a CursorImpl, an implementation of a Cursor. This might appear odd, as jOOQ Cursors are used for “lazy fetching”, i.e. for fetching Records one-by-one from JDBC.

On the other hand, the SELECT query from this stack trace simply refreshes a single UpdatableRecord, jOOQ’s equivalent of an active record. Yet all the lazy fetching logic is executed just as if we were fetching a large, complex data set. This is, again, to keep things DRY when fetching data. Of course, around six levels of stack trace could have been saved by simply reading the single record, as we know there can be only one. But again, any subtle bug in the cursor will likely show up in some test case, even in a remote one like the test case for refreshing records.
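
For comparison, this is roughly what lazy fetching looks like from client code. A sketch, assuming a jOOQ 3.x style API (exact method names may differ between versions) and a made-up BOOK table:

import java.sql.Connection;
import java.sql.DriverManager;

import org.jooq.Cursor;
import org.jooq.DSLContext;
import org.jooq.Record;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;
...
Connection connection =
    DriverManager.getConnection("jdbc:h2:mem:test");
DSLContext ctx = DSL.using(connection, SQLDialect.H2);

Cursor<Record> cursor = ctx.fetchLazy("select * from book");
try {

    // Each iteration fetches one more record from the underlying
    // JDBC ResultSet, passing through the cursor's initialisation logic
    while (cursor.hasNext()) {
        Record record = cursor.fetchOne();
        System.out.println(record);
    }
}
finally {

    // Closes the underlying ResultSet and Statement
    cursor.close();
}

Internally, refresh() runs through the very same cursor machinery, which is why the stack trace above passes through CursorImpl.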

Some may claim that all of this wastes memory and CPU cycles. The opposite is more likely to be true. Modern JVM implementations are so good at managing and garbage-collecting short-lived objects and method calls that the slight additional complexity imposes almost no additional work on your runtime environment.

TL;DR: Long stack traces may indicate high cohesion and DRY-ness

The claim that a long stack trace is a bad thing is not necessarily correct. A long stack trace is what happens when complex frameworks are well implemented. Complexity will inevitably lead to “leaky abstractions”. But only well-designed complexity will lead to long stack traces.

Conversely, short stack traces can mean two things:

  • Lack of complexity: The framework is simple, with few features. This matches Igor’s claim for ActiveJDBC, as he is advertising ActiveJDBC as a “simple framework”.
  • Lack of cohesion and DRY-ness: The framework is poorly written, and probably has poor test coverage and lots of bugs.

Tree data structures

As a final note, it’s worth mentioning that another case where long stack traces are inevitable is when tree structures / composite pattern structures are traversed using visitors. Anyone who has ever debugged XPath or XSLT will know how deep these traces are.
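
A minimal, generic sketch (not taken from any particular XML library) of why that happens: with the composite pattern and a visitor, every nesting level of the tree adds another accept()/visit() pair to the stack.

import java.util.Arrays;
import java.util.List;

interface Visitor {
    void visit(Element element);
}

interface Node {
    void accept(Visitor visitor);
}

class Element implements Node {

    final String name;
    final List<Node> children;

    Element(String name, Node... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }

    @Override
    public void accept(Visitor visitor) {
        visitor.visit(this);

        // Recursion: one more accept() frame per nesting level
        for (Node child : children)
            child.accept(visitor);
    }
}

public class TreeTraversal {
    public static void main(String[] args) {

        // A small "document" tree, four levels deep
        Node tree = new Element("html",
            new Element("body",
                new Element("div",
                    new Element("p"))));

        // A breakpoint inside visit() shows one accept() frame
        // for every nesting level of the tree
        tree.accept(new Visitor() {
            @Override
            public void visit(Element element) {
                System.out.println(element.name);
            }
        });
    }
}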

On Degrading Merchandise Quality

I’ve just replaced 4 halogen bulbs in my apartment. When I threw the old ones into our recycling bin, I noticed that there were already around 8 bulbs in there, from the last 5-6 months only! This made me sad, as it shows how our consumer society works. Read on to learn how jOOQ strives to be different with respect to durability and long-lasting quality.

More stuff I’m made to buy because it falls apart

I’ve just bought a new mobile phone, because the old one didn’t really work anymore, after only 2 years!

I’ll be buying a new hard drive next week, because the one I’ve bought recently (to replace a broken, 3 year old one) heats up too quickly and doesn’t give me the throughput I want!

I buy 2 pairs of new Adidas a year, because they don’t last as long as my leather shoes!

I buy about 5 mini-umbrellas a year, because apparently, they’re made to last a mere 10 days of rainfall!

I buy about 2 new city bikes a year, because mending them is more expensive than buying new ones. It’s not that I’d just have to replace the tires; they’re actually quite broken after half a year!

I bought a Windows Surface RT tablet just to learn that almost no programs can run on it. I would have had to buy the Pro version and throw the old one away.

Should we really work this way?

I’m forced to buy new stuff. I buy new stuff because the previous stuff I had breaks apart so quickly. And in many cases, it is very clear that it has been designed to break apart in a short period of time. Replacing things with new things is an industry of its own. If stuff were made to last and to work for 10 years (Hah, 10 years. Our grandparents used the same stuff for 40 years!), corporations would make less money with new merchandise to replace their previous equivalent merchandise. Take my awesome Samsung flat screen TV, for instance. I bought it around 7 years ago, and it still works like a charm. Samsung never got any money from me again, even though I’m a very happy customer. Is that a bad business plan for Samsung?

There’s more to life than making tons of money and keeping a paying customer base for your deliberately mediocre product line. There is a strong urge to contribute to making this world a better place. By selling quality products that do not fall apart very quickly. Products that do not require a lot of support. Products that are easy to use and long-lasting, such that the return on investment for your customer is extremely high, at the cost of making “only” a living instead of tons of money and waste.

At Data Geekery, producing such products is our highest credo. This is why we charge a bit of money for licensing with support included, because we think that our quality is so high, you might not even need any support. We could make a lot more money by lowering our quality and by hiring a lot of expensive consultants to explain to you how our complicated product works. But jOOQ is not complicated, it is extremely easy to use.

Many “free” Open Source products don’t work this way. They lure you in by being free and LGPL-licensed, only to unload a lot of consulting and maintenance costs onto you later on. It is your choice. Do you want to invest in your future by keeping maintenance costs low? Or do you want to get a cheap product and pay later on? Think about dirt-cheap ink jet printers and how much you’ll pay for those quickly-emptying ink cartridges later on. Think about dirt-cheap coffee machines and how much you’ll pay for those capsules later on. Think about some computer products, and how often you have to pay for ridiculous updates, because the “old” minor release of the operating system is no longer supported.

Don’t think in short terms. Don’t fall for “free” stuff. There is no such thing as free lunch. You always pay. A little bit now, or much more later.

10 Reasons not to Choose a Particular Open Source software

We’re all Software Engineers of one type or another. Most of us have one thing in common, though: we’re lazy. And we know that someone else was less lazy and has already solved the tedious problem that we’re working on. And because we’re not only lazy but also stingy, we search for Free Open Source software.

But the problem with Open Source software is: There are millions of options for about every problem domain. Just look at web development with “modern” JavaScript. Which tool to choose? Which one will still be there tomorrow? Will it work? Will I get maintenance? New features? Plugins from the community?

While it is not so easy to find the right tool among the good ones (commons or guava? mockito or jmock? Hibernate or jOOQ or MyBatis?), it is certainly easier to rule out the bad ones.

Here are some things to look out for when evaluating an Open Source software (in no particular order):

1. NullPointerExceptions, ClassCastExceptions

This is one of my favourites. It is very easy to google. No one is completely safe from these annoyances. But when you find stack traces and bug reports, investigate them closely.

  • Do they appear often?
  • Do they appear in similar contexts?
  • Do they appear in places where they could have been avoided?

It’s a matter of good design to be able to avoid NullPointerExceptions and ClassCastExceptions. It happens to everyone. But no NullPointerException should be thrown from a place that can be statically discovered by the Java compiler or with FindBugs.
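
For illustration, here’s a contrived example (the method and names are made up) of the kind of NullPointerException that can be discovered statically: the value is dereferenced before the null check that shows null was clearly anticipated. Tools like FindBugs flag this kind of code.

public class NullChecks {

    static String normalise(String input) {
        String trimmed = input.trim();   // possible NullPointerException here...

        if (input == null)               // ...because null is clearly anticipated below
            return "";

        return trimmed.toLowerCase();
    }
}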

Needless to say, the list of no-go exceptions thrown from a database library, for instance, can be extended with SQLExceptions caused by syntax errors produced by that library.

2. Community Discussing Bugs instead of Features, Strategies, Visions

Every Open Source software has users, and with Google Groups and GitHub, it has become fairly easy to interact with an OSS community.

For larger projects, the community also extends to Stack Overflow, Reddit, Twitter, etc. These are signs of the popularity of an Open Source software, but not necessarily a sign that you should use it. Also, don’t be blinded by users saying “hey this is so cool”, “it just made my day”, “best software ever”. They say that to everyone who helps them out of their misery (or laziness, see the intro of this post).

What you should be looking out for is whether the community is discussing visions, strategies, features, truly awesome ideas that can be implemented next year, in the next major release. It’s a true sign that the software will not only stick around, but will probably also become much better.

The converse to this is a community that mainly discusses bugs (see NullPointerException, ClassCastException). Unlike a “visionary” community, a “buggy” community will just create work for the vendor, not inspiration. But which one’s the chicken and which one’s the egg?

Another converse to this is a community that is disappointed by the false promises of the vendor’s visions. I often have a feeling that Scala’s SLICK might qualify for that, as it introduces an insurmountable language-mapping impedance mismatch between its own, LINQ-inspired querying DSL and SQL.

3. Poor Manual, Poor Javadoc

That’s easy to discover. Do you really want that? The best and most authoritative information should come from the software vendor, not some weirdo forum on the web that you’ve googled.

A good example of getting this right: PostgreSQL’s manuals.

A rant about bad examples can be seen here:
http://www.cforcoding.com/2009/08/its-time-we-stopped-rewarding-projects.html

Don’t be deceived by the idea that it might get better eventually. Poorly documented software will be poor in many other aspects. And it’s such an easy thing to discover!

Of course, the “right” amount of documentation is an entirely other story…

4. No Semantic Versioning

Search for release notes and see if you’ll find something that roughly corresponds to semver.org. You will want patch releases when the Open Source software that you’re using in mission-critical systems fails. When you get a patch release, you don’t want 50 new features (with new NullPointerExceptions and ClassCastExceptions).

5. Unorganised Appearance

Again, we’re in the times of GitHub. The good old CVS times, when HTML was still used to share cooking recipes, are over. Check if your Open Source software uses those tools, and whether they show that they’re using them. It will help you ascertain that the software will still be good in a couple of years, if the vendor isn’t crushed by the mess they’ve gotten themselves into.

6. Vendor Side-Project evolving into an Offspring Product

Now, that is a sign not everyone may agree upon, I guess. But after the experience I’ve gained in previous jobs, I strongly believe that software which evolved out of necessity before being made into a product really suffers from its legacy. It wasn’t a product from the beginning, and it has strong ties to the vendor’s original requirements, which doesn’t bother the vendor, but it will bother you. And because the vendor still has very strong ties to their offspring, they won’t be ready to make fundamental changes in both code and vision!

Specifically, in the database field, there are a couple of such tools, e.g.

Note, I don’t know any of the above tools, so they may as well be awesome. But be warned. They weren’t designed as products. They were designed for a very narrow purpose originating from a pre-Apache context.

7. Generics are Poorly (or Overly) Adopted

Generics were introduced in 2004 with Java 5. Now that the heated debates about generic type erasure are over, generics are well adopted. Or are they? The latest stable release 3.2.1 of Apache Commons Collections is still not generified! That must’ve been the number one reason why people started shifting to Google Guava (or its predecessors) instead. Few things make for a lousier day than having raw types (or eels) slapped around your face.

The other thing to look out for, though, is over-generification. Generics can become really hard, even for top-notch Java architects. A common blunder is to strongly correlate subtype polymorphism with generic polymorphism without being aware of the effects. Having too many generics in an API is a good sign of an architecture astronaut (or a design astronaut, in this case). We’ll see further down how that may correlate with the person behind the design decisions.
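
Here’s a contrived illustration of both ends of the spectrum (the method names are made up and not taken from any of the libraries mentioned): a raw-typed API that pushes unchecked casts and potential ClassCastExceptions onto the caller, and an over-generified signature whose extra type parameters add little value.

import java.util.ArrayList;
import java.util.List;

public class GenericsAdoption {

    // Raw type, as in a non-generified collection API: the cast is
    // unchecked and can fail at runtime with a ClassCastException
    static String firstRaw(List list) {
        return (String) list.get(0);
    }

    // Generified equivalent: the compiler checks the element type
    static String first(List<String> list) {
        return list.get(0);
    }

    // Over-generification: the recursive bounds and the extra type
    // parameter L buy the caller very little, but make the API harder to read
    static <T extends Comparable<? super T>, L extends List<? extends T>>
    T overGenerifiedMax(L list) {
        T max = list.get(0);

        for (T t : list)
            if (t.compareTo(max) > 0)
                max = t;

        return max;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("raw");
        names.add("types");

        System.out.println(first(names));             // raw
        System.out.println(overGenerifiedMax(names)); // types
    }
}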

8. Vendor Cannot Handle Objective Criticism or Competition

Here’s how to find out who’s behind the Open Source software. While this isn’t important for a small, geeky tool, you should be very interested in the vendor as a person when looking for a strategic OSS addition, especially if you’re dealing with a benevolent dictator. The vendor should be:

  • Aware of the competition, i.e. doing marketing, learning from competitors, and improving in order to compete. This means that they are interested in being truly better, not just “convinced that they’re better”.
  • Open-minded with their competition, with you as a customer, and ready to discuss various points of view.
  • Interested in new ideas, possibly putting them on a roadmap right away (but without losing focus on their main strategic roadmap).

Even if this is Open Source, there’s no point in being arrogant or conceited. The vendor should treat you like a customer (as long as you’re not trolling). Open-mindedness will eventually lead to a better product in the long run.

9. Vendor has no Commercial or Marketing Interests at All

Now, (Free) Open Source is nice for many reasons. As a vendor, you get:

  • Feedback more quickly
  • Feedback more often
  • Community (with pull requests, feature additions, etc.)
  • The feeling that you’re doing something good

True? Yes. But that’s true for commercial software as well. So what’s the real reason for doing Open Source? It depends. Adobe, for instance, has started opening up a lot recently, since their acquisition of Day Software. All of JCR, JackRabbit, the upcoming JackRabbit Oak, Sling and Felix are still at Apache, with the original committers still on board. But one can certainly not say that Adobe has no commercial interests.

OSS vendors should think economically and build products. Eventually, they may start selling stuff around their core products, or separate community and commercial licenses. And unless they get too greedy (see Oracle and MySQL, vs. RedHat and MariaDB), that can make commercial Open Source a very interesting business, also for the customer, who will then get the good parts of Open Source (partially free, open, with a vibrant community) along with the good parts of commercial software (premium support, warranties, etc.).

In other words, don’t choose overly geeky stuff. You will probably have recognised such tools by the earlier warning signs anyway (poor documentation, no semantic versioning, poor tooling).

10. No Traction Anymore

To wrap this up, here’s an obvious last one. Many Open Source products no longer show any traction on the vendor’s side. That goes along well with the previous point, where the vendor has no commercial interest. Without commercial long-term interest, they’ll also lose all other interest. And you’re stuck maintaining a pile of third-party code yourself (fixing its many, many ClassCastExceptions and NullPointerExceptions).

TL;DR : Conclusion

You should choose Open Source just like commercial software: economically.

  • Open Source is not an excuse for bad quality.
  • Open Source is not an excuse for lack of support.
  • Open Source is not an excuse for non-professionalism.

If Open Source fails you on any of the above, the joke will be on you, the customer. You’ll get a bad product, and you’ll pay the price with exaggerated maintenance on your side, which you thought you’d avoid by choosing something free. Nothing is free. Not even Free Open Source. Ask the Grumpy Nerd.