Suppose I have the following dataset:
Year Temp
1974 48
1974 48
1991 56
1983 89
1993 91
1938 41
1938 56
1941 93
1983 87
I want my final answer to be 93 (which belongs to the year 1941). I can find the maximum temperature for each year (e.g. 1941: 93), but I cannot find the single overall maximum. Any suggestions are appreciated.
Thanks,
You can solve this problem in two ways.
Option 1: Using GROUP ALL + MAX
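-- Load the data (PigStorage() defaults to a tab delimiter)
A = LOAD 'input' USING PigStorage() AS (Year:int,Temp:int);
-- Put every row into a single group
B = GROUP A ALL;
-- Take the maximum Temp across that group
C = FOREACH B GENERATE MAX(A.Temp);
DUMP C;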
Output:
(93)
Option 2: Using ORDER and LIMIT
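A = LOAD 'input' USING PigStorage() AS (Year:int,Temp:int);
-- Sort by temperature, highest first
B = ORDER A BY Temp DESC;
-- Keep only the first (hottest) row
C = LIMIT B 1;
-- Project just the temperature
D = FOREACH C GENERATE Temp;
DUMP D;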
Output:
(93)
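Both options print only the temperature. If the year is needed alongside the maximum, as the question hints, one possible sketch is a nested FOREACH over the single ALL group. It assumes the same tab-delimited 'input' file with no header row. The aliases sorted and top1 are illustrative names, not part of the original answer.

```pig
-- Assumes 'input' is tab-delimited with no header row
A = LOAD 'input' USING PigStorage() AS (Year:int, Temp:int);
-- Collect every row into one group
B = GROUP A ALL;
-- Inside that group, sort by Temp and keep the hottest row
C = FOREACH B {
    sorted = ORDER A BY Temp DESC;
    top1 = LIMIT sorted 1;
    GENERATE FLATTEN(top1);
};
DUMP C;
```

On the sample data this should yield (1941,93). If several rows share the maximum temperature, LIMIT keeps just one of them.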