Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Only return top n results for each group in GROUPING SETS query

Tags:

sql

postgresql

I have a rather complicated query performing some aggregations using GROUPING SETS, it looks roughly like the following:

SELECT 
    column1,
    [... more columns here]
    count(*)
FROM table_a 
GROUP BY GROUPING SETS (
    column1,
    [... more columns here]
)
ORDER BY count DESC

This works very well in general, as long as the number of results for each group is reasonably small. But I have some columns in this query that can have a large number of distinct values, which results in a large amount of rows returned by this query.

I'm actually only interested in the top results for each group in the grouping set. But there doesn't seem to be an obvious way to limit the number of results per group in a query using grouping sets, LIMIT doesn't work in this case.

I'm using PostgreSQL 9.6, so I'm not restricted in which newer features I can use here.

So what my query does is something like this:

| column1 | column2 | count |
|---------|---------|-------|
| DE      |         | 32455 |
| US      |         | 3445  |
| FR      |         | 556   |
| GB      |         | 456   |
| RU      |         | 76    |
|         | 12      | 10234 |
|         | 64      | 9805  |
|         | 2       | 6043  |
|         | 98      | 2356  |
|         | 65      | 1023  |
|         | 34      | 501   |

What I actually want is something that only returns the top 3 results:

| column1 | column2 | count |
|---------|---------|-------|
| DE      |         | 32455 |
| US      |         | 3445  |
| FR      |         | 556   |
|         | 12      | 10234 |
|         | 64      | 9805  |
|         | 2       | 6043  |
like image 634
Mad Scientist Avatar asked Oct 19 '25 12:10

Mad Scientist


1 Answers

Use row_number and grouping

select a, b, total
from (
    select 
        a, b, total, 
        row_number() over(
            partition by g 
            order by total desc
        ) as rn
    from (
        select a, b, count(*) as total, grouping ((a),(b)) as g
        from t
        group by grouping sets ((a),(b))
    ) s
) s
where rn <= 3
like image 117
Clodoaldo Neto Avatar answered Oct 21 '25 02:10

Clodoaldo Neto