One of the biggest and undead myths in SQL is that COUNT(*) is faster than COUNT(1). Or was it that COUNT(1) is faster than COUNT(*)? Impossible to remember, because there's really no reason at all why one should be faster than the other. But is the myth justified? Let's measure! How does COUNT(...) work? But … Continue reading What’s Faster? COUNT(*) or COUNT(1)?
Usually, this blog is 100% pro window functions and advocates using them on any occasion. But like any tool, window functions come at a price and we must carefully evaluate if that's a price we're willing to pay. That price can be a sort operation. And as we all know, sort operations are expensive. They … Continue reading How to Avoid Excessive Sorts in Window Functions
Cost Based Optimisation is the de-facto standard way to optimise SQL queries in most modern databases. It is the reason why it is really, really hard to implement a complex, hand-written algorithm in a 3GL (third generation programming language) such as Java that outperforms a dynamically calculated database execution plan that has been generated from … Continue reading 10 Cool SQL Optimisations That do not Depend on the Cost Model
I stumbled upon an interesting question on Stack Overflow recently. A user wanted to query a table for a given predicate. If that predicate returns no rows, they wanted to run another query using a different predicate. Preferably in a single query. Challenge accepted! Canonical Idea: Use a Common Table Expression We're querying the Sakila … Continue reading How to Execute a SQL Query Only if Another SQL Query has no Results
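The common table expression idea named in this teaser can be sketched as follows. This is only an illustration under stated assumptions: the excerpt is cut off before the post's own query, so the tiny `film` table, its rows, the `length` predicates and the `titles` helper are all hypothetical stand-ins, and SQLite (via Python's `sqlite3`) is used purely so the sketch is runnable.

```python
import sqlite3

# Hypothetical miniature stand-in for the Sakila "film" table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT, length INTEGER);
    INSERT INTO film VALUES (1, 'ALPHA', 50), (2, 'BETA', 120), (3, 'GAMMA', 180);
""")

# The CTE "r" holds the rows of the first predicate. The second branch of
# the UNION ALL only contributes rows when "r" is empty.
FALLBACK_QUERY = """
    WITH r AS (
        SELECT title FROM film WHERE length > :primary_min
    )
    SELECT title FROM r
    UNION ALL
    SELECT title FROM film
    WHERE length > :fallback_min
      AND NOT EXISTS (SELECT 1 FROM r)
    ORDER BY title
"""

def titles(primary_min, fallback_min):
    """Run the single query and return the matching titles as a list."""
    params = {"primary_min": primary_min, "fallback_min": fallback_min}
    return [row[0] for row in conn.execute(FALLBACK_QUERY, params)]

primary_hit = titles(150, 100)   # first predicate matches, fallback is skipped
fallback_hit = titles(200, 100)  # first predicate is empty, fallback kicks in
```

Whether a given database evaluates `r` only once, or skips the second branch early, depends on its optimiser, so this shape says nothing about performance by itself.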
A large-ish customer in banking (largest tables on that particular system: ~1 billion rows) once decided to separate the OLTP database from the "log database" in order to better use resources and prevent contention on some tables, as the append-only log database is used heavily for analytic querying of all sorts. That seems to make … Continue reading The Difficulty of Tuning Queries Over a Database Link – Or How I Learned to Stop Worrying and Love the DUAL@LINK Table
At a customer site, I've recently encountered a report where a programmer needed to count quite a bit of stuff from a single table. The counts all differed in the way they used specific predicates. The report looked roughly like this (as always, I'm using the Sakila database for illustration): -- Total number of films … Continue reading How to Calculate Multiple Aggregate Functions in a Single Query
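One common way to collect several predicate-specific counts in a single pass is to wrap each predicate in a `CASE` expression inside `COUNT`. The sketch below assumes this is the kind of rewrite the title hints at; the excerpt is truncated before the post's own queries, so the mini `film` table, its rows and the two predicates are hypothetical.

```python
import sqlite3

# Hypothetical miniature stand-in for the Sakila "film" table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, length INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 50, 'G'), (2, 120, 'PG'), (3, 180, 'G'), (4, 95, 'R');
""")

# COUNT(expr) ignores NULLs, and a CASE without ELSE yields NULL when the
# predicate is not met, so each column counts only the rows it cares about.
total, long_films, g_rated = conn.execute("""
    SELECT
        COUNT(*),
        COUNT(CASE WHEN length > 100 THEN 1 END),
        COUNT(CASE WHEN rating = 'G' THEN 1 END)
    FROM film
""").fetchone()
```

Databases that support the standard `FILTER (WHERE ...)` clause on aggregates can express the same thing more readably.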
Hah! Got nerd-sniped again: http://stackoverflow.com/questions/43099226/how-to-make-jooq-to-use-arrays-in-the-in-clause/43102102 A jOOQ user was wondering why jOOQ would generate an IN list for a predicate like this: Java: COLUMN.in(1, 2, 3, 4) SQL: COLUMN in (?, ?, ?, ?) ... when in fact the following predicate could have been generated instead: COLUMN = any(?::int[]) In the second case, … Continue reading SQL IN Predicate: With IN List or With Array? Which is Faster?
Tuning SQL isn't always easy, and it takes a lot of practice to recognise how any given query can be optimised. One of the most important slides of my SQL training is the one summarising "how to be fast": Some of these bullets were already covered on this blog. For instance avoiding needless, mandatory work, … Continue reading How to Benchmark Alternative SQL Queries to Find the Fastest Query
There are many, many opinions out there regarding the old surrogate key vs. natural key debate. Most of the time, surrogate keys (e.g. sequence generated IDs) win because they're much easier to design: They're easy to keep consistent across a schema (e.g. every table has an ID column, and that's always the primary key). They're … Continue reading Faster SQL Through Occasionally Choosing Natural Keys Over Surrogate Keys
In a recent blog post, I advocated against the use of COUNT(*) in SQL when a simple EXISTS() would suffice. This is important stuff. I keep tuning production queries where a customer runs a COUNT(*) query like so: SELECT count(*) INTO v_any_wahlbergs FROM actor a JOIN film_actor fa USING (actor_id) WHERE a.last_name = 'WAHLBERG' ... … Continue reading Don’t Even use COUNT(*) For Primary Key Existence Checks
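For contrast with the COUNT(*) query quoted above, an existence check can be sketched with EXISTS, which lets the database stop at the first matching row instead of counting them all. This is a minimal illustration, not the post's own code: the two mini tables, their rows and the `has_films` helper are hypothetical, and SQLite is used only to make it runnable.

```python
import sqlite3

# Hypothetical miniature stand-ins for the Sakila "actor" and "film_actor" tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    INSERT INTO actor VALUES (1, 'WAHLBERG'), (2, 'GUINESS');
    INSERT INTO film_actor VALUES (1, 10), (1, 11), (2, 12);
""")

def has_films(last_name):
    """Return True if at least one actor with this last name played in a film."""
    row = conn.execute("""
        SELECT EXISTS (
            SELECT 1
            FROM actor a
            JOIN film_actor fa USING (actor_id)
            WHERE a.last_name = ?
        )
    """, (last_name,)).fetchone()
    # SQLite returns EXISTS as the integer 0 or 1.
    return bool(row[0])
```

Calling `has_films('WAHLBERG')` answers the yes/no question directly, without the caller having to compare a count against zero.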