Is there a way to combine an GROUP BY
and OBJECT_AGG
in snowflake with different levels of aggregation between the two?
In the following example, we wish to return, for each city, an object mapping cuisine
to the highest michelin stars of that cuisine in the city:
City | Restaurant | Cuisine | Michelin Stars |
---|---|---|---|
San Fransisco | Quince | Californian | 3 |
San Fransisco | Coi | Californian | 2 |
San Fransisco | Mister Jius | Chinese | 1 |
London | Le Gavroche | French | 2 |
London | La Dame de Pic | French | 2 |
London | Restaurant Gordon Ramsay | French | 3 |
We want to produce the following:
City | Cuisine to Top Rating |
---|---|
San Fransisco | {'Californian': 3, 'Chinese': 1} |
London | {'French': 3} |
My initial approach is:
SELECT
city,
OBJECT_AGG(cuisine, MAX(michelin_stars)::variant) over (partition by cuisine)
FROM
top_restaurants
GROUP BY city
This returns the error cuisine is not a valid group by expression
.
This question is similar to
From the above, I believe the cause is the GROUP BY
is calculated before the PARTITION BY
and breaks when the GROUP BY
drops cuisine
as it attempts to aggregate.
https://docs.snowflake.com/en/sql-reference/functions/object_agg.html
Thanks to the order of operation, you can still do it in one select
. You just have to aggregate by city
and cuisine
first. When it's time for window function
to shine, you partition by city
. Obviously this leads to duplicates because window function
simply applies calculations to the result set left by group by
without collapsing any rows. Add a distinct
and you're good to go
select distinct
city,
object_agg(cuisine, max(michelin_stars)::variant) over (partition by city)
from t
group by city, cuisine;
I believe the cause is the GROUP BY is calculated before the PARTITION BY and breaks when the GROUP BY drops cuisine as it attempts to aggregate.
This is exactly what happened.
Related: A Beginner’s Guide to the True Order of SQL Operations
Alternative approach:
WITH cte AS (
SELECT *
FROM top_restaurants
QUALIFY ROW_NUMBER() OVER(PARTITION BY city,cuisine
ORDER BY michelin_stars DESC) = 1
)
SELECT city,
OBJECT_AGG(cuisine, michelin_stars::variant)
FROM cte
GROUP BY city
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With