Yak Shaving is a Good Way to Improve an API

Yak Shaving (uncountable):

  1. (idiomatic) Any apparently useless activity which, by allowing you to overcome intermediate difficulties, allows you to solve a larger problem.
  2. (idiomatic) A less useful activity done to consciously or unconsciously procrastinate about a larger but more useful task.

Both interpretations of the term Yak Shaving as explained by Wiktionary are absolutely accurate descriptions of most refactoring jobs. The Yak Shaving in refactoring itself can be described by this gif showing what happens when you want to change a light bulb:

light-bulb

However, when developing an API, it’s not such a bad idea to perform actual Yak Shaving (only the first interpretation, of course). Let’s look at an example of why, taken from the daily work of maintaining jOOQ.

The Task

For jOOQ 3.6, I wanted to implement a very simple feature. Feature #2639: Add stored procedure OUT values to DEBUG log output. This is not an important feature at all, but certainly very useful to a lot of jOOQ users. The idea is that every time you run a stored procedure with DEBUG logging activated, you’ll get the OUT parameters logged along with the procedure call. Here’s a visualisation:

debug-log-2

Now, the actual implementation would have been very easy. Just about 10 lines of code in the existing LoggerListener that already takes care of logging all the other things. But there were a couple of caveats, which reminded me of the above lightbulb changing gif:

The apparently useless activities

  1. There was no way to access the RETURN_VALUE meta information of a jOOQ Routine
  2. There was no easy way to access Routine IN and OUT values generically
  3. There was no lifecycle event that modelled the moment when OUT parameters are fetched in jOOQ
  4. There was no way to format Routine OUT parameters in a nice way

Does this feel familiar? There is a need for refactoring!

Now, this whole implementation is hidden in jOOQ’s internals. It wouldn’t matter too much to users if this had been hacked together in one way or another. For instance, the RETURN_VALUE meta information could obviously be accessed through internal refactorings, and the same is true for IN and OUT values. There are other lifecycle events that might have worked just as well, and formatting is easy to re-implement.

But this is a popular API that is used by many users who might profit from a cleaner solution. Thus, why don’t we simply refactor and implement:

  1. Add a public Routine.getReturnParameter() method
  2. Add public Routine.getValue() and setValue() methods
  3. Add ExecuteListener.outStart(ExecuteContext) and outEnd(ExecuteContext) to capture fetching of Routine OUT parameters
  4. Add Routine.outRecord() and Routine.inRecord() to view a Routine as a Record
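
To make the four additions above more concrete, here is a minimal, self-contained sketch. It does not use jOOQ’s actual classes: SimpleRoutine, RoutineListener, OutValueLogger and RoutineExecutor are hypothetical stand-ins, and only the method names outStart(), outEnd(), getValue(), setValue() and outRecord() are borrowed from the list. It merely illustrates that the logging feature needs nothing but public API once those hooks exist.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a routine; NOT jOOQ's real Routine type
class SimpleRoutine {
    private final String name;
    private final Map<String, Object> outValues = new LinkedHashMap<>();

    SimpleRoutine(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Generic access to parameter values (item 2 in the list above)
    void setValue(String parameter, Object value) {
        outValues.put(parameter, value);
    }

    Object getValue(String parameter) {
        return outValues.get(parameter);
    }

    // A read-only view of all OUT values (loosely mirrors item 4)
    Map<String, Object> outRecord() {
        return Collections.unmodifiableMap(outValues);
    }
}

// Hypothetical lifecycle hooks around the fetching of OUT parameters (item 3)
interface RoutineListener {
    void outStart(SimpleRoutine routine);
    void outEnd(SimpleRoutine routine);
}

// The small logging feature, written purely against the public hooks
class OutValueLogger implements RoutineListener {
    final List<String> lines = new ArrayList<>();

    @Override
    public void outStart(SimpleRoutine routine) {
        // Nothing to log before the OUT values are available
    }

    @Override
    public void outEnd(SimpleRoutine routine) {
        lines.add("DEBUG " + routine.getName() + " OUT values: " + routine.outRecord());
    }
}

class RoutineExecutor {
    // Simulates a procedure call: fires the hooks around the OUT value fetch
    static void execute(SimpleRoutine routine, RoutineListener listener) {
        listener.outStart(routine);
        routine.setValue("P_RESULT", 42); // pretend this came from the database
        listener.outEnd(routine);
    }
}
```

Any other listener, written by a user rather than by the maintainer, could plug into the same two hooks without touching internals.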

The thing is:

The API implementor is the first API consumer

It’s hard to foresee what API users really want. But if you’re implementing an API (or just a feature), and you discover that something is missing, always consider adding that missing thing to the public API. If it could be useful to yourself, internally, it could be even more useful to many others. This way, you turn one little nice feature into 5, amplifying the user love.

Don’t get me wrong. This doesn’t mean that every little piece of functionality needs to be exposed publicly, au contraire. But the fact that something is keeping you, as the maintainer, from writing clean code might indicate that others are implementing the same hacky workarounds as you. And they won’t ask you explicitly for it!

Don’t believe it? Here’s an entirely subjective analysis of user feedback:

  • 0.2% – Hey, this is a cool product, I want to help the owner make it better, I’ll provide a very descriptive, constructive feature request and engage for the next 5 weeks to help implement it.
  • 0.8% – Whatever dudes. Make this work. Please.
  • 1.3% – Whatever dudes. Make this work. ASAP!
  • 4.0% – WTF is wrong with you guys? Didn’t you at least think about this once??
  • 4.7% – OK, I’m going to write this completely uninformed rant about this product now, which I hate so much. It makes my life completely miserable
  • 9.0% – Oh well, this doesn’t work. Let’s go home, it’s 17:00 anyways
  • 80.0% – Oh well, this didn’t work yesterday already. Let’s go home. It’s Friday, 16:00 anyways

Now, most of this list wasn’t meant entirely seriously, but you get the point. There may be those 0.2% of users / customers that love you and that actively engage with you. Others may still love you or at least like you, but they won’t engage. You have to guesstimate what they need.

So. Bottom line:

If you need it, they probably need it. Start Yak Shaving!

When to Use a Framework

I’ve come across this interesting article titled “Don’t Reinvent the Wheel! Use a Framework!” They All Say. The essence of the article lies in this little fact:

[When should “they” use a framework?] When they understand the basics of the language and would be able to code what the framework/library does anyway

Frameworks and libraries are great because someone out there had spent a lot of time thinking about a very specific problem domain. Chances that they have gotten it right are very high BUT if you had enough time and money, you could build at least the useful parts of that framework yourself. Nonetheless, it’s cheaper to use/buy their code and have them maintain that part for you.

This is very true of JPA / Hibernate, for instance. If you know SQL and you know SQL well, then JPA goes a long way towards helping you get all that repetitive and often complex CRUD right, and you’ll even know how to tweak and tune JPA or the generated SQL where needed. Gavin King himself has said time and again:

Just because you’re using Hibernate, doesn’t mean you have to use it for everything. A point I’ve been making for about ten years now.

Hibernate helps you write some of your SQL, it doesn’t replace SQL. If you’re new to programming, you shouldn’t use Hibernate right away. You should first learn to write SQL and get a good understanding of your RDBMS. From my experience at conferences and JUG talks, this doesn’t only apply to junior programmers, though. It is very interesting to see how few seniors and architects know about window functions, for instance.

So, if you’re using an RDBMS and Hibernate/JPA, have your team be trained on all the layers of your technology. SQL, HQL/JPQL, and Java.

10 Reasons not to Choose a Particular Open Source Software

We’re all Software Engineers of one type or another. Most of us have one thing in common, though: We’re lazy. And we know that someone else was less lazy and has already solved that tedious problem that we’re on. And because we’re not only lazy but also stingy, we search for Free Open Source software.

But the problem with Open Source software is: There are millions of options for about every problem domain. Just look at web development with “modern” JavaScript. Which tool to choose? Which one will still be there tomorrow? Will it work? Will I get maintenance? New features? Plugins from the community?

While it is not so easy to find the right tool among the good ones (commons or guava? mockito or jmock? Hibernate or jOOQ or MyBatis?), it is certainly easier to rule out the bad ones.

Here are some things to look out for, when evaluating an Open Source software (in no particular order)

1. NullPointerExceptions, ClassCastExceptions

This is one of my favourites. It is very easy to google. No one is completely safe from these annoyances. But when you find stack traces, bug reports, investigate them closely.

  • Do they appear often?
  • Do they appear in similar contexts?
  • Do they appear in places where they could’ve been avoided?

It’s a matter of good design to be able to avoid NullPointerExceptions and ClassCastExceptions. It happens to everyone. But no NullPointerException should be thrown from a place that can be statically discovered by the Java compiler or with FindBugs.
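
As an illustration of what “statically discoverable” can mean, consider the following sketch. The class and method names are made up for this example. In lengthUnsafe(), the variable is still null on one code path when it is dereferenced, which is the kind of defect a static analysis tool can report without running the code. lengthSafe() avoids the problem by never letting the variable be null.

```java
// Hypothetical example class, not taken from any real library
class Labels {

    // On the path where the key is unknown, 'label' is still null when it is
    // dereferenced, so this method throws a NullPointerException.
    static int lengthUnsafe(String key) {
        String label = null;
        if ("title".equals(key)) {
            label = "Yak Shaving";
        }
        return label.length();
    }

    // Every path assigns a non-null value, so no NullPointerException
    // can originate here.
    static int lengthSafe(String key) {
        String label = "title".equals(key) ? "Yak Shaving" : "";
        return label.length();
    }
}
```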

Needless to say that the list of no-go exceptions thrown from a database library, for instance, can be extended with SQLExceptions due to syntax errors produced by that library.

2. Community Discussing Bugs instead of Features, Strategies, Visions

Every Open Source software has users and with Google Groups and GitHub, it has become fairly easy to interact with an OSS community.

For larger projects, the community also extends to Stack Overflow, Reddit, Twitter, etc. These next steps are a sign of popularity of an Open Source software, but not necessarily a sign that you should use it. Also, don’t be blinded by users saying “hey this is so cool”, “it just made my day”, “best software ever”. They say that to everyone who’s helping them out of their misery (or laziness, see the intro of this post).

What you should be looking out for is whether the community is discussing visions, strategies, features, truly awesome ideas that can be implemented next year, in the next major release. It’s a true sign that not only will the software probably stick around, it will also become much better.

The converse to this is a community that mainly discusses bugs (see NullPointerException, ClassCastException). Unlike a “visionary” community, a “buggy” community will just create work, not inspiration to the vendor. But which one’s the chicken, which one’s the egg?

Another converse to this is a community that is disappointed by the false promises given by the visions of the vendor. I often have a feeling that Scala’s SLICK might qualify for that as it introduces an insurmountable language-mapping impedance mismatch between its own, LINQ-inspired querying DSL and SQL.

3. Poor Manual, Poor Javadoc

That’s easy to discover. Do you really want that? The best and most authoritative information should come from the software vendor, not some weirdo forum on the web that you’ve googled.

A good example is the PostgreSQL manual.

A rant about bad examples can be seen here:
http://www.cforcoding.com/2009/08/its-time-we-stopped-rewarding-projects.html

Don’t be deceived by the idea that it might get better eventually. Poorly documented software will be poor in many other aspects. And it’s such an easy thing to discover!

Of course, the “right” amount of documentation is an entirely other story…

4. No Semantic Versioning

Search for release notes and see if you’ll find something that roughly corresponds to semver.org. You will want patch releases when the Open Source software that you’re using in a mission-critical system fails. When you get a patch release, you don’t want 50 new features (with new NullPointerExceptions, ClassCastExceptions).
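
The distinction between release types can be sketched as follows. This is a minimal illustration that assumes plain MAJOR.MINOR.PATCH version strings without pre-release or build suffixes; the class and method names are hypothetical.

```java
// Minimal sketch: classifies an upgrade between two MAJOR.MINOR.PATCH versions.
// Pre-release and build metadata (e.g. "1.0.0-beta+5") are NOT handled.
class VersionBump {

    enum Kind { MAJOR, MINOR, PATCH, NONE, DOWNGRADE }

    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + version);
        }
        int[] result = new int[3];
        for (int i = 0; i < 3; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    // Compares the components from most to least significant and reports
    // the first one that differs
    static Kind classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        Kind[] kinds = { Kind.MAJOR, Kind.MINOR, Kind.PATCH };
        for (int i = 0; i < 3; i++) {
            if (b[i] > a[i]) return kinds[i];
            if (b[i] < a[i]) return Kind.DOWNGRADE;
        }
        return Kind.NONE;
    }

    // Under semantic versioning, a patch release is meant to contain only
    // backwards-compatible bug fixes
    static boolean isBugFixOnly(String from, String to) {
        return classify(from, to) == Kind.PATCH;
    }
}
```

A check like isBugFixOnly("3.2.1", "3.2.2") only tells you what the vendor promises with the version number. Whether they keep that promise is what the release notes should reveal.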

5. Unorganised Appearance

Again, we’re in times of GitHub. The good old CVS times are over, where HTML was still used to share cooking recipes. Check whether your Open Source software uses those tools, and whether the vendor shows that they’re using them. It will help you ascertain that the software will still be good in a couple of years if the vendor isn’t crushed by the mess they’ve gotten themselves into.

6. Vendor Side-Project evolving into an Offspring Product

Now that is a sign not everyone may agree upon, I guess. But after the experience I’ve had in previous jobs, I strongly believe that software that evolved out of necessity before being made into a product really suffers from its legacy. It wasn’t a product from the beginning and it has strong ties to the vendor’s original requirements, which don’t bother the vendor, but they will bother you. And because the vendor still has very strong ties to their offspring, they won’t be ready to make fundamental changes in both code and vision!

Specifically, in the database field, there are a couple of such tools, e.g.

Note, I don’t know any of the above tools, so they may as well be awesome. But be warned. They weren’t designed as products. They were designed for a very narrow purpose originating from a pre-Apache context.

7. Generics are Poorly (or Overly) Adopted

Generics were introduced in 2004 with Java 5. Now that the heated debates about generic type erasure are over, generics are well adopted. Or aren’t they? The latest stable release 3.2.1 of Apache Commons Collections is still not generified! That must’ve been the number 1 reason why people started shifting to Google Guava (or its predecessors) instead. There’s not much that makes for a lousier day than having raw types (or eels) slapped around your face.

The other thing that you should look out for, though, is over-generification. Generics can become really hard, even for top-notch Java architects. A common blunder is to strongly correlate subtype polymorphism with generic polymorphism without being aware of the effects. Having too many generics in an API is a good sign of an architecture astronaut (or a design astronaut, in this case). We’ll see further down how that may correlate with the person behind the design decisions.
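
A minimal sketch of the raw type problem, using only java.util collections: with a raw List, a wrong element type only surfaces as a ClassCastException at runtime, whereas the generified version is rejected by the compiler. The class name is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class RawTypesDemo {

    // Raw type: the compiler cannot check what goes into the list
    @SuppressWarnings({ "rawtypes", "unchecked" })
    static String firstFromRawList() {
        List names = new ArrayList();
        names.add(42);                // compiles, although a String was intended
        return (String) names.get(0); // fails at runtime with ClassCastException
    }

    // Generified: the element type is checked at compile time
    static String firstFromTypedList() {
        List<String> names = new ArrayList<>();
        // names.add(42);             // would not compile
        names.add("jOOQ");
        return names.get(0);          // no cast needed
    }
}
```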

8. Vendor Cannot Handle Objective Criticism or Competition

Here’s how to find out who’s behind the Open Source software. While this isn’t important for a small, geeky tool, you should be very interested in the vendor as a person when looking for a strategic OSS addition, especially if you’re dealing with a benevolent dictator. The vendor should be:

  • Aware of competition, i.e. they’re doing marketing, learning from their competitors, and improving to compete. This means that they are interested in being truly better, not just “convinced that they’re better”.
  • Open minded with their competition, with you as a customer, and ready to discuss various points of view.
  • Interested in new ideas, possibly putting them on a roadmap right away (but without losing focus on their main strategic roadmap).

Even if this is Open Source, there’s no point in being arrogant or conceited. The vendor should treat you like a customer (as long as you’re not trolling). Open-mindedness will lead to a better product in the long run.

9. Vendor has no Commercial or Marketing Interests at All

Now, (Free) Open Source is nice for many reasons. As a vendor, you get:

  • Feedback more quickly
  • Feedback more often
  • Community (with pull requests, feature additions, etc.)
  • The feeling that you’re doing something good

True? Yes. But that’s true for commercial software as well. So what’s the real reason for doing Open Source? It depends. Adobe for instance has started opening up a lot, recently, since their acquisition of Day Software. All of JCR, JackRabbit, the upcoming JackRabbit Oak, Sling and Felix are still at Apache with the original committers still on board. But one can certainly not say that Adobe has no commercial interests.

OSS vendors should think economically and build products. Eventually, they may start selling stuff around their core products, or separate community and commercial licenses. And unless they get too greedy (see Oracle and MySQL, vs RedHat and MariaDB), that can make commercial Open Source a very interesting business, also for the customer who will then get the good parts of Open Source (partially free, open, with a vibrant community) along with the good parts of commercial software (premium support, warranties, etc.).

In other words, don’t choose overly geeky stuff. But you might have recognised those tools before (poor documentation, no semantic versioning, poor tooling).

10. No Traction Anymore

To wrap this up, here’s an obvious last one. Many Open Source products don’t show any traction by the vendor. That goes along well with the previous point, where the vendor has no commercial interest. Without commercial long-term interest, they’ll also lose all other interest. And you’re stuck with maintaining a pile of third-party code yourself (fixing its many many ClassCastExceptions, NullPointerExceptions).

TL;DR: Conclusion

You should choose Open Source just like commercial software. Economically.

  • Open Source is not an excuse for bad quality.
  • Open Source is not an excuse for lack of support.
  • Open Source is not an excuse for non-professionalism.

If Open Source fails you on any of the above, the joke will be on you, the customer. You’ll get a bad product, and you’ll pay the price with exaggerated maintenance on your side, which you thought you’d avoid by choosing something free. Nothing is free. Not even Free Open Source. Ask the Grumpy Nerd.

Development schema, production schema

Most of us separate development data from production data, physically or at least, logically (except maybe Chuck Norris (official website, no kidding!)). If you’re lucky and you can afford multiple Oracle / other-expensive-database licenses, you might clone the same schema / owner name for every application instance on different servers. But sometimes, you can’t do that, and you have to put all schemata in the same box and name them:

  • DB_DEV
  • DB_TEST
  • DB_PROD

Or worse… several productive instances in the same box

Another, similar use case is when you deploy several instances of the same application in the same environment. For instance, you have a blogging server with 10 users. Every user has their own independent blog with their own tables. You may either resolve this problem by creating multiple schemata / owners again:

  • DB_USER1
  • DB_USER2
  • DB_USER3

Or, by adding prefixes / suffixes to your tables within a single schema:

  • DB.USER1_POSTS
  • DB.USER1_COMMENTS
  • DB.USER2_POSTS
  • DB.USER2_COMMENTS

This means that in every executed SQL statement, you’d have to patch relevant environments (DEV, TEST, PROD) or users (USER1, USER2, USER3) into all of your database artefacts. There are a lot of things that can go wrong.

Let jOOQ do that for you, instead

With jOOQ, it’s simple. jOOQ always generates the schema / owner in the generated SQL statements. This is usually the name of your development schema from which you generated source code. So if you select from a DB_DEV.POSTS table, you’ll do this:

OracleFactory create = new OracleFactory(connection);
create.select(TEXT)
      .from(POSTS)
      .fetch();

// jOOQ generates:
// SELECT "DB_DEV"."POSTS"."TEXT" FROM "DB_DEV"."POSTS"

When you run this statement in production, you probably want this instead:

// Create a mapping indicating that instead of DB_DEV,
// you want jOOQ to render DB_PROD as the schema
SchemaMapping mapping = new SchemaMapping();
mapping.add(DB_DEV, "DB_PROD");
OracleFactory create = new OracleFactory(connection, mapping);
create.select(TEXT)
      .from(POSTS)
      .fetch();

// jOOQ now generates:
// SELECT "DB_PROD"."POSTS"."TEXT" FROM "DB_PROD"."POSTS"

The same applies to prefixed tables. Let’s say USER1 is logged in, and you want to prefix your tables with “USER1_”:

// Create a mapping renaming some tables
SchemaMapping mapping = new SchemaMapping();
mapping.add(POSTS, "USER1_POSTS");
mapping.add(COMMENTS, "USER1_COMMENTS");
OracleFactory create = new OracleFactory(connection, mapping);
create.select(TEXT, COMMENT)
      .from(POSTS)
      .join(COMMENTS)
      .using(POST_ID)
      .fetch();

// jOOQ now generates:
// SELECT "DB"."USER1_POSTS"."TEXT",
//        "DB"."USER1_COMMENTS"."COMMENT"
//   FROM "DB"."USER1_POSTS"
//   JOIN "DB"."USER1_COMMENTS"
// USING ("POST_ID")
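
Conceptually, such a mapping boils down to a lookup that is applied whenever a qualified name is rendered. The following self-contained sketch illustrates that idea only. NameMapping and its methods are hypothetical and make no claim about how jOOQ implements SchemaMapping internally. In particular, mapping tables by their unqualified name is a simplification made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of render-time schema / table mapping
class NameMapping {
    private final Map<String, String> schemas = new HashMap<>();
    private final Map<String, String> tables = new HashMap<>();

    NameMapping mapSchema(String from, String to) {
        schemas.put(from, to);
        return this;
    }

    NameMapping mapTable(String from, String to) {
        tables.put(from, to);
        return this;
    }

    // Renders "SCHEMA"."TABLE", replacing mapped names and keeping the rest
    String render(String schema, String table) {
        String s = schemas.getOrDefault(schema, schema);
        String t = tables.getOrDefault(table, table);
        return "\"" + s + "\".\"" + t + "\"";
    }
}
```

With new NameMapping().mapSchema("DB_DEV", "DB_PROD"), rendering the DB_DEV schema and the POSTS table yields "DB_PROD"."POSTS", while unmapped names are rendered unchanged. The SQL-building code itself never needs to know which environment it runs in.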

For more information, refer to the manual:

http://www.jooq.org/manual/ADVANCED/SchemaMapping