jOOQ Tuesdays: Oliver Gierke Talks About Spring Data

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

I’m very excited to feature today Oliver Gierke, the Spring Data Project Lead at Pivotal with strong opinions on DDD and REST.

Hi Oliver – Since 2011, you have mostly been working for Pivotal (previously SpringSource) on Spring Data as the project lead. What is it that fascinates you most about working with data?

To be completely honest it’s not data in itself that got me into Spring Data, or even the predecessor of the JPA module which I was working on even before I joined SpringSource, but the interest in managing complexity in software. When you talk about that, there’s no way you can avoid talking about Domain-Driven Design and its building blocks like value objects, entities, aggregates and the concept of a repository which then again gets you into the realms of data access.

So we as the Spring Data team always have to trade different driving forces against each other: first, the level of abstraction and the programming model that you use in your application code and how well and easy it actually allows you to implement domain logic that solves your business problem at hand. Second, the different tradeoffs that different data stores have already made and in how we actually allow our users to leverage them and at the same time expose some commonalities in the programming model so that developers can transfer knowledge between projects that might use different stores for certain reasons easily. Spring Data is trying to bridge that gap, provide a low entry barrier in modelling aggregates and repositories but at the same time give users the tools to fall back to very efficient means of data access that require a lot of developer control for cases where that’s the top priority.

Spring Data has an impressive number of officially and unofficially supported modules that reach far beyond relational data models. What are the biggest challenges in working with so many models and technologies in a single API?

Definitely the diversity in approaches and tradeoffs by the underlying persistence technologies. Actually that’s one of the reasons, Spring Data does not try to provide a singular unifying API. It’s not even trying to do that for stores of a certain category, for example document databases. We’ve rather taken a general Spring ecosystem philosophy and implemented it in the repositories and data access space: let’s have a consistent programming model with repeatable patterns so that it’s easy to understand the purpose of the abstraction, but then let the abstraction expose store specific specialties so that we’re not abstracting away those but rather allow developers to leverage them when needed.

The general concept of an interface based repository abstraction is not the most revolutionary thing on earth, I admit. But we now look back at almost a decade of experience in designing the parts of the programming model that are indeed API and I think we’ve learned a lot from mistakes we made. Java 8 will be the baseline for the upcoming second generation of Spring Data which enriches our options in terms of APIs. Reactive programming is a hot topic at them moment, too. So there are a lot of balls to juggle in that space.

What’s your favourite module, and why?

That’s of course hard to say as it’s been awesome to see what the individual store modules have turned into over time. However, I’ve grown a bit of a special relationship for the Commons module which is the foundation for all of the store ones as it basically contains the heart and soul of Spring Data: the object mapping facilities, the repository proxy implementation etc. And it’s great to see how we often times can add some functionality there and that functionality is immediately available for repositories of all stores.

On the other end of the spectrum there’s Spring Data REST, a module that exposes RESTful resources based on your aggregate and repository definitions. I like that very much as well, as it works across all the Spring Data modules exposing a repository API and is a great showcase of what you can achieve on top of such an abstraction. Also, it has really helped us to make developers aware of a couple of often overseen aspects of REST, but I guess we’re gonna get to that in a second.

Maintaining a big and widely used API is hard, balancing tons of user requests, integrating third party functionality, maintaining backwards compatibility. What are some maintainer battle stories you’d like to share?

It certainly is. Especially with so many concurrent — sometimes contradicting — forces in play. That starts with the question of versioning the modules: what do we actually version here? User facing API. But which part of the API is that? That totally depends on how much the user is customizing behavior. Is a developer building a Spring Data module for some new data store a user, too? Of course, but a very different kind of user. We usually try to be very conservative with changes that could affect application developers but a bit more demanding when it comes to the implementors of a store module.

That’s all stuff we sort of had and have to deal with on a day to day basis. Interestingly, we’ve been the first ones in the broader Spring engineering team that have picked up the notion of a release train — we group together releases of all modules and name them after famous computer scientists —, that had been popularized by the Eclipse team. That approach worked well for us and has now been adopted by Spring Cloud, Reactor and other teams as well.

Oliver, I have to ask, why does everyone misunderstand REST?

I’m kind of surprised this question comes up in this context, but I guess I have build up some reputation on the internet to complain about people being from unspecific to — in my opinion — outright wrong about this topic :D.

I guess the fundamental problem that REST has is that some parts of it are moderately easy to understand and implement. These days everyone agrees that URIs are a cool thing and that using the right HTTP verb for a given task is a good idea. But even with the latter you’ll easily find people that don’t understand why it’s a good idea to prefer a PUT request over a POST one. Which already brings us to the second part.

Then there are parts that are harder to grasp and a bit harder to implement. The hypermedia aspect comes to mind. Unfortunately theses aspects are the ones that heavily influence whether what you build delivers on the promises that REST makes: being an architectural style that gives you e.g. scalability and evolvability. So people basically start ignoring these aspects, sometimes even outright arguing they don’t need them but then turn around and criticize REST for not delivering on its promises.

In my opinion that’s a way to common pattern observable in the wild, but I guess the only way to improve the situation here is to work on making it easier to implement those aspects and good examples of the benefits you get when you follow those advanced constraints.