Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

count distinct of customers relative to a date

I am trying to find distinct count of transacting client ids prior to the time of a transaction from this table and associate the count to each transaction

transaction id client id transaction date
t1 C1 01/01/2024
t2 C2 01/02/2024
t3 C2 01/02/2024
t4 C3 01/03/2024
t5 C1 01/04/2024
t6 C4 01/05/2024

Desired output:

transaction id client id transaction date prior distinct client count
t1 C1 01/01/2024 0
t2 C2 01/02/2024 1
t3 C2 01/02/2024 1
t4 C3 01/03/2024 2
t5 C1 01/04/2024 3
t6 C4 01/05/2024 3

I tried using

select
transaction id,
client id,
transaction date,
count(distinct client id) over (order by transaction date asc rows between unbounded preceding and 1 preceding)
from 
<table name>

However, snowflake does not support an OVER clause that contains an ORDER BY clause for DISTINCT operation in the window function.

I am also trying to avoid self join queries as the transaction table is very large. Any help with alternatives is appreciated.

like image 544
SnowQuest Avatar asked Oct 28 '25 20:10

SnowQuest


1 Answers

create or replace table transaction_client as     
select 
  column1 as transaction_id, 
  column2 as client_id, 
  column3 as transaction_date 
from 
values 
  ('t1', 'C1', '01/01/2024'), 
  ('t2', 'C2', '01/02/2024'), 
  ('t3', 'C2', '01/02/2024'), 
  ('t4', 'C3', '01/03/2024'), 
  ('t5', 'C1', '01/04/2024'), 
  ('t6', 'C4', '01/05/2024');
  
  
select 
  a.transaction_id, 
  a.client_id, 
  a.transaction_date, 
  count(distinct b.client_id) as prior_distinct_client_count 
from 
  transaction_client a 
  left join transaction_client b 
 on b.transaction_date < a.transaction_date 
group by  a.transaction_id,a.client_id, a.transaction_date 
order by a.transaction_date;
like image 94
vaniuser22 Avatar answered Oct 31 '25 11:10

vaniuser22