Why Staying in Control of Your SQL is so Important
Lots of blog posts and research papers have been written about scaling up and scaling out. This interesting blog post, for instance, sheds some light on the two strategies with respect to physical maintenance costs, such as cooling and electricity consumption. These are certainly non-negligible aspects for very large systems.
But before solving problems at a very big scale, consider much simpler SQL tuning mechanisms. Very often, when you have a bottleneck in your application, it is at the database layer. Many NoSQL evangelists use this fact to promote their products, claiming that scaling out is much easier with NoSQL databases. This might be true, but ask yourself: Do you really need a system that holds up under heavy load? Or is your bottleneck the time that individual requests take?
In other words: Do you have a 5’000’000-concurrent-users problem? Or do you have a request-takes-more-than-3-seconds problem? Because if you suffer from the latter, you probably need to scale neither out nor up. Your “traditional” architecture is probably quite fine, but your database design and your SQL queries aren’t. The popular use-the-index-luke.com website features an interesting article about the two top performance problems caused by ORM tools. They are:
- The infamous N+1 selects problem
- The little-known Index-Only Scan
Both problems result from the fact that popular ORMs generate SQL code in a way that developers can hardly influence. People often claim that tools like Hibernate generate better SQL than the average developer. This is true only to the extent that the average developer might never care to actually learn to write better SQL, which in turn leads to the above problems.
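To make the N+1 selects problem concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever database and data access layer you actually run. The `author` and `book` tables, the `query` helper and both function names are made up for illustration. The first function loads the parent rows and then issues one extra query per parent, which is roughly what lazy loading in an ORM can end up doing. The second fetches the same data with a single join.

```python
import sqlite3

# In-memory database with a hypothetical author/book schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        title TEXT NOT NULL
    );
    INSERT INTO author (id, name) VALUES (1, 'Orwell'), (2, 'Coelho'), (3, 'Huxley');
    INSERT INTO book (author_id, title) VALUES
        (1, '1984'), (1, 'Animal Farm'), (2, 'O Alquimista'), (3, 'Brave New World');
""")

executed = []  # records every statement sent through query()


def query(sql, params=()):
    """Run a SELECT, remember that it was sent, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """One query for the authors, then one more query per author."""
    result = {}
    for author_id, name in query("SELECT id, name FROM author ORDER BY id"):
        rows = query(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join():
    """The same data, fetched in a single round trip."""
    result = {}
    rows = query("""
        SELECT a.name, b.title
        FROM author a
        LEFT JOIN book b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """)
    for name, title in rows:
        titles = result.setdefault(name, [])
        if title is not None:  # authors without books yield a NULL title
            titles.append(title)
    return result


executed.clear()
lazy = titles_n_plus_one()
lazy_count = len(executed)    # 1 query for authors + 3 queries for books

executed.clear()
joined = titles_single_join()
joined_count = len(executed)  # 1 query in total

print(lazy_count, joined_count, lazy == joined)
```

With three authors the difference is four statements versus one. With a few thousand parent rows, the per-row round trips tend to dominate the response time, even though every single query looks cheap on its own.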
Hibernate is very good at generating 70% of your application’s boring CRUD SQL. At the same time, Hibernate never claimed to be a replacement for SQL, as you will have to resort to native SQL for the remaining 30%. Should you then use Hibernate’s native SQL API? Or an alternative like jOOQ?
The important thing is to get back in control of your SQL when performance matters. Know your indexes. Know your database metadata. And use a tool that allows you to write precisely the SQL statement you want. Learning better SQL will help you save lots of money on operational costs, as you might not need to scale out or up at all.
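As one way of getting to know your indexes, the following sketch asks the database how it plans to run two queries. It again uses Python's built-in sqlite3 module as a stand-in. SQLite calls an index-only scan a "covering index", and other databases expose the same information through their own EXPLAIN output. The `orders` table, the index name and the `plan` helper are assumptions made up for this example.

```python
import sqlite3

# In-memory database with a hypothetical orders table and a two-column index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total REAL NOT NULL,
        note TEXT
    );
    CREATE INDEX idx_orders_customer_total ON orders (customer_id, total);
""")


def plan(sql):
    """Return SQLite's query plan for a statement as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Only indexed columns are selected, so the index alone can answer the query.
narrow_plan = plan("SELECT customer_id, total FROM orders WHERE customer_id = 42")

# Selecting every column forces an additional lookup in the table itself.
wide_plan = plan("SELECT * FROM orders WHERE customer_id = 42")

print(narrow_plan)
print(wide_plan)
```

The first plan is expected to mention a COVERING INDEX, while the second should only report a plain index search. A tool that always selects every mapped column cannot benefit from the first form, which is why being able to choose the selected columns yourself matters.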