`MEDIAN()`

aggregate function. Remember that the `MEDIAN()`

is sligthly different from (and often more useful than) the `MEAN()`

or `AVG()`

(average).
While the average is calculated as the `SUM(exp) / COUNT(exp)`

, the `MEDIAN()`

tells you that 50% of all values in the sample are higher than the `MEDIAN()`

whereas the other 50% of the set are lower than the `MEDIAN()`

.
So, in other words, if you take the following query:
```
WITH t(value) AS (
SELECT 1 FROM DUAL UNION ALL
SELECT 2 FROM DUAL UNION ALL
SELECT 3 FROM DUAL
)
SELECT
avg(value),
median(value)
FROM
t;
```

avg median 2 2But if you heavily skew your data like this:

```
WITH t(value) AS (
SELECT 1 FROM DUAL UNION ALL
SELECT 2 FROM DUAL UNION ALL
SELECT 100 FROM DUAL
)
SELECT
avg(value),
median(value)
FROM
t;
```

avg median 34.333 2The above sample is of course statistically insignificant, but you can easily see that the effect can be dramatic and relevant, if you have more data: The skewing effect is very important in statistics and in order to make any interesting claim about anything, using percentiles is most often more useful than using averages. Take the average income vs. the median income in a country, for instance. While the average income in the U.S. (and in many other countries) has been steadily increasing, the median income has seen a decline over the past decade. This is due to wealth being heavily skewed towards the super-rich more and more. This blog is not about politics but about Java and SQL, so let’s get back into calculating the actual facts.

## Using precentiles in SQL

As we’ve seen before, the`MEDIAN()`

divides a sample into two equally-sized groups and takes the value “between” those two groups. This particular value is also called the 50th percentile because 50% of all values in the sample are lower than the `MEDIAN()`

. We can thus establish:
: The 0-percentile`MIN(exp)`

: The 50th-percentile`MEDIAN(exp)`

: The 100th-percentile`MAX(exp)`

`MIN()`

and `MAX()`

are supported in all SQL databases (and the SQL standard), `MEDIAN()`

is not in the SQL standard and only supported by the following jOOQ databases:
- CUBRID
- HSQLDB
- Oracle
- Sybase SQL Anywhere

`MEDIAN()`

in particular and any sort of percentile in general in the SQL standard, and since PostgreSQL 9.4 also in PostgreSQL using …
## Ordered-set aggregate functions

Interestingly, apart from window functions, you can also specify`ORDER BY`

clauses to certain aggregate functions that aggregate data based on ordered sets.
One such function is the SQL standard `percentile_cont`

function, which takes the percentile as an argument, and then accepts an additional `WITHIN GROUP`

clause that takes an `ORDER BY`

clause as an argument. These particular ordered-set functions are also called *inverse distribution functions*, because we want to find where a particular percentile is located in the distribution of all values in the sample (if you’re not scared by the math, check out the wikipedia article) So, in PostgreSQL 9.4+, the

`MEDIAN()`

function can be emulated like this:
```
WITH t(value) AS (
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 100
)
SELECT
avg(value),
percentile_cont(0.5) WITHIN GROUP (ORDER BY value)
FROM
t;
```

```
WITH t(value) AS (
SELECT 1 FROM DUAL UNION ALL
SELECT 2 FROM DUAL UNION ALL
SELECT 100 FROM DUAL
)
SELECT
listagg(value, ', ') WITHIN GROUP (ORDER BY value)
FROM
t;
```

listagg --------- 1, 2, 100On a side-note:

`LISTAGG()`

is, of course, completely useless, because it returns `VARCHAR2`

, which again has a max length of 4000 in Oracle. Useless…
## Emulation out-of-the-box with jOOQ

As always, jOOQ will emulate these kinds of things out of the box. You can either use the`DSL.median()`

function, or with the upcoming jOOQ 3.6, the new `DSL.percentileCont()`

function to produce the same value:
```
DSL.using(configuration)
.select(
median(T.VALUE),
percentileCont(0.5).withinGroupOrderBy(T.VALUE)
)
.from(T)
.fetch();
```