Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

AWS athena (presto SQL): How to take the (set-like) union of arrays in a group by statement

I am trying to combine arrays of unique userids into one single array of unique userids. AWS athena does not have the set_union function, so I cannot use

, set_union(userids)

And reduce_agg seems to not allow for arrays

, reduce_agg(userids, ARRAY[], (a, b) -> array_union(a, b), (a, b) -> array_union(a, b))

Is there any other trick I can use to combine arrays into one array (distinct items)

like image 810
Roelant Avatar asked Feb 02 '26 15:02

Roelant


1 Answers

One option is to use array_agg and process the resulting array of arrays via flatten and array_distinct:

-- sample data
WITH dataset (id, userids) AS (
    VALUES (1, array [ 1, 2, 3 ]),
        (1, array [ 3, 4, 5 ])
) 

--query
SELECT array_distinct(flatten(array_agg(userids)))
FROM dataset
GROUP BY id

Output:

_col0
[1, 2, 3, 4, 5]
like image 188
Guru Stron Avatar answered Feb 04 '26 06:02

Guru Stron



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!