Sorry for the ambiguous title. Not sure how to name it.
user_id | transaction_amount | date
1 | 100 | 2019-12-15
1 | 250 | 2019-12-16
2 | 300 | 2019-12-11
2 | 415 | 2019-12-12
I need to select latest transaction_amount per each user:
user_id | transaction_amount | date
1 | 250 | 2019-12-16
2 | 415 | 2019-12-12
How do I do that?
Here is what I do now:
SELECT
user_id,
transaction_amount
FROM the_original_table VT1
WHERE date = (
SELECT
MAX(date)
FROM the_original_table VT2
WHERE
VT1.user_id = VT2.user_id
)
GROUP BY
user_id,
date,
transaction_amount
My approach looks incorrect, but it does the job. How do I do it better?
Use a cte with row_number instead of array_agg, easier to read an supported by most vendors:
with ctetbl as (
select user_id, transaction_amount, date
,row_number() over (partition by user_id order by date desc) as rn
from the_original_table
)
select user_id, transaction_amount, date
from ctetbl
where rn = 1
In BigQuery, I would just use aggregation:
select array_agg(t order by date desc limit 1)[ordinal(1)].*
from the_original_table t
group by user_id;
Since this question was asked, a better approach has been introduced using qualify:
select t.*
from the_original_table t
where 1=1
qualify row_number() over (partition by user_id order by date desc) = 1;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With