My data is as mentioned below
customer_id usage_month usage_by_product usage
1 June {"A":50, "B":50} 100
1 July {"A":50, "B":10, "C":20} 80
1 Aug {"A":50, "D":500} 550
1 Sep {"C" :30} 30
I want to write a query that aggregates the total usage for the complete year
customer_id usage_by_product usage
1 {"A": 150, "B":60 760
"C": 50, "D":500}
Is it possible to do this kind of aggregation on maps in Athena (Presto)?
You can split your map into individual key/value pairs with map_entries + UNNEST. Then, it's a matter of summing and aggregating values back into a map.
For example:
WITH input AS (
SELECT * FROM (VALUES
(1, map(array['a', 'c'], array[50, 42])),
(1, map(array['a', 'b'], array[50, 18]))
) t(customer_id, m)
),
sum_by_map_key AS (
SELECT customer_id, k, sum(v) AS s
FROM input
CROSS JOIN UNNEST(map_entries(m)) AS u(k, v)
GROUP BY customer_id, k
)
SELECT customer_id, map_agg(k, s)
FROM sum_by_map_key
GROUP BY customer_id;
Outputs:
customer_id | _col1
-------------+---------------------
1 | {a=100, b=18, c=42}
(1 row)
Note:
to just add two maps like this, you could use map_zip_with. However, to use it when aggregating over multiple rows you would probably need to aggregate all the maps values into single array and run array reduction on it. Aggregating all the maps into single array(map) may or may not work, depending on the size of these arrays.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With