Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.
I’m very excited to feature today Richard North, creator of Testcontainers, a promising new database and UI testing tool for reproducible tests.
Hi Richard – you work at Skyscanner, which means you have tons of travel data to work with. What’s the most exciting thing about working with this data?
Skyscanner is a really data-led company, using data at all levels in decision-making. The volume of data we gather and process really helps us understand what travellers want and serve them better. For example, our destination recommender system helps with people discover new and interesting places to go, based upon a vast amount of data, algorithms and experiments. But it’s interesting how this varies from recommendations in internet media companies – there are far fewer possible destinations than books, songs and movies, yet users’ reasons for travelling and tastes can be more nuanced and varied.
There’s yet more data-oriented work under the surface, too – for example, the infrastructure needed to gather and analyse such large amounts of data, and the running of experiments to help us improve. There are a lot of smart people working hard to make this all happen, and it’s an exciting place to be!
You’ve created an increasingly popular testing framework, Testcontainers. What made you do it? What itch does it scratch?
Well, I think it’s something that scratches several itches at the same time – things that previously we only had isolated solutions to. The common element, though, is reproducibility of test environments. Of all my time developing and writing tests for JVM-based systems, it’s always been the non-JVM dependencies that caused the most complexity, unreliability and maintenance overhead.
I remember my first day as a developer, years ago: I was given a desktop machine and 2 days’ worth of step-by-step instructions that I needed to follow – just so that I’d be able to develop and run tests with all dependencies in place. A few months later I had to repeat the same task many times over when building new CI servers.
A lot has changed since then in terms of how we deploy and manage our production infrastructure, and thankfully Docker has done a lot to further bring prod-parity to developers’ machines.
Testcontainers started out as my effort to bring the full power of Docker to integrated testing on the JVM in two areas that I’ve experienced the most pain: testing against a clean, representative database, and making browser-based selenium testing more reproducible, both for developers and on CI.
Mostly being curious about testing databases, your documentation mentions Testcontainers as an alternative to using H2 as a test database. What are the disadvantages of emulating a database e.g. with H2? Did you make any personal experience with that?
Yes, definitely – it was one of the tipping point factors that triggered me to create Testcontainers. I do think H2 is a fantastic piece of work in what it manages to deliver, and it’s something I’ve used on a number of projects to good effect.
However, compatibility with real databases has often been a sticking point. Back in 2015, before I started Testcontainers, we were struggling with a few MySQL features that didn’t have equivalents in H2. We were facing the unpleasant prospect of having to constrain our implementation to what H2 would allow us to test against. It became fairly obvious that there was a gap in the market for an H2-like tool that was actually a facade to a Docker-based database container – and Testcontainers was born.
How do you think of mocking the database at any layer, including the DAO layer, service layer, etc.?
I’m all in favour of keeping tests small, light and layered, and using mocks to accomplish this. This might sound strange coming from somebody who has developed an integrated testing tool, but it’s true!
Still, I feel that we need to be pragmatic about how we approach automated tests and how we make sure we’re testing the right thing – especially when crossing boundaries. Are we testing how this code behaves against reality, or are we testing against our own (potentially false) understanding of how external components work?
My feeling is that it’s quite straightforward to mock layers of your system that you yourself wrote, or where you can easily jump into the source code, a spec or documentation. With an external component, you can still produce a mock that behaves how you expect, or how you witness the real thing behaving. But does that mock continue to represent the real thing, especially after accretion of other features, or the additional perils of state that a database entails – schema changes and actual data?
My ideal is to mock the data access layer for consumption by higher layers, but to be quite careful about what the data access layer itself talks to in my tests. It should probably be a real database. Hopefully Testcontainers is one tool that helps make this particular thing a little less painful – so that when you find yourself needing to do this, there’s a way to do it easily.
What’s the biggest challenge you’ve faced when testing databases, or other things?
It’s not databases, but I’d say that by far the hardest testing challenge I’ve faced as a developer is mobile apps, especially iOS. I’ve always enjoyed mobile development as a whole, but when switching from a Java server-side/web project to mobile, it really feels like you’re going back in time. Some of the challenges are harder – such as asynchronicity and platform APIs that make it harder to structure software in a testable way. But it also feels like the tooling is much further behind, and until quite recently received far less attention. I feel the net result has been that developers have been discouraged from investing in automated tests, which is sad given that we know how valuable they can be.
Things are getting better, but I do greatly prefer the testing aspects of working on server-side JVM projects. For all its difficulties, we are actually quite lucky!