In my recent SQL work for a large Swiss bank, I have maintained nested database view monsters whose unnested SQL code amounted up to 5k lines of code, joining the same table over and over again in separate subselects combined via UNION operations. This monster performed in way under 50ms, no matter how we queried it (see “10 more common mistakes” about the speed of queries). Of course, this performance was only achieved after lots of fine-tuning, load-testing and benchmarking. But it worked. Our Oracle database never let us down on these things.
Nonetheless, many SQL users think that JOIN operations are slow. Why? Perhaps because they are / used to be, in MySQL? I’m currently reading this interesting book by Markus Winand. The book is called SQL Performance Explained. He’s also the author of Use-The-Index-Luke.com where you can get free insight into his book. I still recommend reading the whole book, though. Even SQL old-timers and SQL nerds like me will find 1-2 novel, very interesting approaches, some of which will be incorporated into jOOQ very soon!
In particular, consider this page which explains very well how Hash JOIN operations work:
2 thoughts on “The Myth About Slow SQL JOIN Operations”
The thing is though, scaling web applications is much easier and cheaper for the majority of software developers. So having as few joins as possible will remain a common scaling strategy for a while. Most people would be better of having more application servers than having to manage lots of database servers.
Treating databases like dumb buckets has served many large scale web services very well.
But of course, proper usage of SQL is still a very beneficial thing to learn.
Sure. Hibernate’s immense popularity is mainly due to the fact that it has responded to this need. 70% of all SQL interaction is probably CRUD and that should be very easy to do…