Sometimes when aggregating data with SQL, we’d love to add some additional filters. For instance, consider the following world bank data:
GDP per capita (current US$)
CREATE TABLE countries (
code CHAR(2) NOT NULL,
year INT NOT NULL,
gdp_per_capita DECIMAL(10, 2) NOT NULL
);
Now, let’s assume we’d like to find the number of countries with a GDP higher than 40,000 for each year.
With standard SQL:2003, and now also with the newly released PostgreSQL 9.4, we can now take advantage of the new FILTER clause, which allows us to write the following query:
SELECT
year,
count(*) FILTER (WHERE gdp_per_capita >= 40000)
FROM
countries
GROUP BY
year
6 thoughts on “The Awesome PostgreSQL 9.4 / SQL:2003 FILTER Clause for Aggregate Functions”
Why do you need the filter clause for example one, regular SQL handles this just fine. “SELECT year, count(*) FROM countries WHERE gdp_per_capita >= 40000 GROUP BY year” returns exactly the same result. I’m sure FILTER serves a purpose but your example is already covered by standard SQL.
SELECT year, count(*)
FROM countries
WHERE gdp_per_capita >= 40000
GROUP BY year
… is not the same as the article’s that is using FILTER. Perhaps the example was a bit too simplistic to explain the full scope of this “new” functionality, but in any case, in your example, if a year only has one record with gdp_per_capita = 30000, the year will not show up in your result, but it will show up in the FILTER‘ed result, with a COUNT(*) of 0.
For one, the FILTER clause *is* “regular” and “standard” SQL. It has been specified in SQL:2003. Besides, have you tried calculating two such aggregations with different filters in one go? There you have another use-case for FILTER. In fact, it is a more standard and more generally useful implementation of Oracle’s / SQL Server’s PIVOT clause.
Why do you need the filter clause for example one, regular SQL handles this just fine. “SELECT year, count(*) FROM countries WHERE gdp_per_capita >= 40000 GROUP BY year” returns exactly the same result. I’m sure FILTER serves a purpose but your example is already covered by standard SQL.
Hi Greg.
Your example:
… is not the same as the article’s that is using
FILTER
. Perhaps the example was a bit too simplistic to explain the full scope of this “new” functionality, but in any case, in your example, if a year only has one record withgdp_per_capita = 30000
, the year will not show up in your result, but it will show up in theFILTER
‘ed result, with aCOUNT(*)
of 0.For one, the FILTER clause *is* “regular” and “standard” SQL. It has been specified in SQL:2003. Besides, have you tried calculating two such aggregations with different filters in one go? There you have another use-case for
FILTER
. In fact, it is a more standard and more generally useful implementation of Oracle’s / SQL Server’sPIVOT
clause.Hope this helps,
Lukas
Seems like FILTER is a syntax sugar for COUNT + CASE:
Is that?
In most cases you’re right. But I suspect you cannot emulate
FILTER
withCASE
when usingDISTINCT
:Distinct relative to what? If you want distinct values of something, why don’t you express this explicitly in the inner view?
The
DISTINCT
keyword in an aggregate function will count only the DISTINCT values in each group, instead of counting ALL the values.What do you mean by “expressing this explicitly in the inner view”?