2009 2010 2011 2012 CA 40,764 47,465 51,791 52,409 DE 40,270 40,408 44,355 42,598 FR 40,488 39,448 42,578 39,759 GB 35,455 36,573 38,927 38,649 IT 35,724 34,673 36,988 33,814 JP 39,473 43,118 46,204 46,548 RU 8,616 10,710 13,324 14,091 US 46,999 48,358 49,855 51,755And the table structure:
Now, let’s assume we’d like to find the number of countries with a GDP higher than 40,000 for each year. With standard SQL:2003, and now also with the newly released PostgreSQL 9.4, we can now take advantage of the new
CREATE TABLE countries ( code CHAR(2) NOT NULL, year INT NOT NULL, gdp_per_capita DECIMAL(10, 2) NOT NULL );
FILTERclause, which allows us to write the following query:
The above query will now yield:
SELECT year, count(*) FILTER (WHERE gdp_per_capita >= 40000) FROM countries GROUP BY year
year count ------------ 2012 4 2011 5 2010 4 2009 4And that’s not it! As always, you can use any aggregate function also as a window function simply by adding an
OVER()clause to the end:
The result would then look something like this:
SELECT year, code, gdp_per_capita, count(*) FILTER (WHERE gdp_per_capita >= 40000) OVER (PARTITION BY year) FROM countries
year code gdp_per_capita count ------------------------------------ 2009 CA 40764.00 4 2009 DE 40270.00 4 2009 FR 40488.00 4 2009 GB 35455.00 4
jOOQ 3.6 will also support the new FILTER clause for aggregate functionsGood news for jOOQ users. You can write the same query with jOOQ intuitively as such:
DSL.using(configuration) .select( COUNTRIES.YEAR, count().filterWhere( COUNTRIES.GDP_PER_CAPITA.ge(40000) )) .from(COUNTRIES) .groupBy(COUNTRIES.YEAR) .fetch();
And the best thing is that jOOQ (as usual) emulates the above clause for you if you’re not using PostgreSQL. The equivalent query would be:
DSL.using(configuration) .select( COUNTRIES.YEAR, COUNTRIES.CODE, COUNTRIES.GDP_PER_CAPITA count().filterWhere( COUNTRIES.GDP_PER_CAPITA.ge(40000)) .over(partitionBy(COUNTRIES.YEAR))) .from(COUNTRIES) .fetch();
Read more about what’s new in PostgreSQL 9.4 here
SELECT year, count(CASE WHEN gdp_per_capita >= 40000 THEN 1 END) FROM countries GROUP BY year
6 thoughts on “The Awesome PostgreSQL 9.4 / SQL:2003 FILTER Clause for Aggregate Functions”
Why do you need the filter clause for example one, regular SQL handles this just fine. “SELECT year, count(*) FROM countries WHERE gdp_per_capita >= 40000 GROUP BY year” returns exactly the same result. I’m sure FILTER serves a purpose but your example is already covered by standard SQL.
… is not the same as the article’s that is using
FILTER. Perhaps the example was a bit too simplistic to explain the full scope of this “new” functionality, but in any case, in your example, if a year only has one record with
gdp_per_capita = 30000, the year will not show up in your result, but it will show up in the
FILTER‘ed result, with a
For one, the FILTER clause *is* “regular” and “standard” SQL. It has been specified in SQL:2003. Besides, have you tried calculating two such aggregations with different filters in one go? There you have another use-case for
FILTER. In fact, it is a more standard and more generally useful implementation of Oracle’s / SQL Server’s
Hope this helps,
Seems like FILTER is a syntax sugar for COUNT + CASE:
In most cases you’re right. But I suspect you cannot emulate
Distinct relative to what? If you want distinct values of something, why don’t you express this explicitly in the inner view?
DISTINCTkeyword in an aggregate function will count only the DISTINCT values in each group, instead of counting ALL the values.
What do you mean by “expressing this explicitly in the inner view”?