I have the following hive table
ID, class,value
1, A, 0.3
1, B, 0.4
1, C, 0.5
2, B, 0.1
2, C, 0.2
I want to get
ID, class:value
1, [A:0.3, B:0.4, C:0.5]
2, [B:0.1, C:0.2]
I know that there is a collect_set() UDAF that produces a list of class or list of value, is there anyway to get a list of key:value pairs?
NOTE:
I guess I can use two collect_set() one for class column and one for value column but I am not sure if the lists will be in the same order.
I've used the UnionUDAF from the Brickhouse library to do something similar. You create a map from each pair, and then union them all together during the aggregation.
Add JAR brickhouse.jar;
create temporary function BH_union as 'brickhouse.udf.collect.UnionUDAF';
SELECT S.ID, BH_union(S.v_map)
FROM (SELECT ID, map(class, value) as v_map from mytable) S
GROUP by S.ID
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With