I have a rather complicated query performing some aggregations using GROUPING SETS, it looks roughly like the following:
SELECT
column1,
[... more columns here]
count(*)
FROM table_a
GROUP BY GROUPING SETS (
column1,
[... more columns here]
)
ORDER BY count DESC
This works very well in general, as long as the number of results for each group is reasonably small. But I have some columns in this query that can have a large number of distinct values, which results in a large amount of rows returned by this query.
I'm actually only interested in the top results for each group in the grouping set. But there doesn't seem to be an obvious way to limit the number of results per group in a query using grouping sets, LIMIT doesn't work in this case.
I'm using PostgreSQL 9.6, so I'm not restricted in which newer features I can use here.
So what my query does is something like this:
| column1 | column2 | count |
|---------|---------|-------|
| DE | | 32455 |
| US | | 3445 |
| FR | | 556 |
| GB | | 456 |
| RU | | 76 |
| | 12 | 10234 |
| | 64 | 9805 |
| | 2 | 6043 |
| | 98 | 2356 |
| | 65 | 1023 |
| | 34 | 501 |
What I actually want is something that only returns the top 3 results:
| column1 | column2 | count |
|---------|---------|-------|
| DE | | 32455 |
| US | | 3445 |
| FR | | 556 |
| | 12 | 10234 |
| | 64 | 9805 |
| | 2 | 6043 |
Use row_number
and grouping
select a, b, total
from (
select
a, b, total,
row_number() over(
partition by g
order by total desc
) as rn
from (
select a, b, count(*) as total, grouping ((a),(b)) as g
from t
group by grouping sets ((a),(b))
) s
) s
where rn <= 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With