Taking latest non null record from multiple columns and bringing down to one row in SQL without iterating/looping

Question

I have a dataset that contains email, city, state, zip, and date. I need one row per email, with city, state, and zip each filled with the latest non-null value available for that column.

Input:

[Image: input table, not available]

Output:

[Image: desired output table, not available]

I am using a query like the one below, but it takes hours to run. Is there a more efficient way to get the desired output in SQL?

select Email_Addr,zip,
row_number()over(partition by Email_Addr order by email_effective_from desc) as rn1
into #d1 from data where zip is not null and email_addr is not null;
select Email_Addr,city,
row_number()over(partition by Email_Addr order by email_effective_from desc) as rn2 into #d2
from data where city is not null and email_addr is not null;
select Email_Addr,[state],
row_number()over(partition by Email_Addr order by email_effective_from desc) as rn3 into #d3
from data  where state is not null and email_addr is not null;
select a.email_addr,a.zip,b.city,c.[state] into #dff from #d1 a
full outer join #d2 b on a.email_addr=b.email_addr
full outer join #d3 c on a.email_addr=c.email_addr

GMB · Accepted Answer

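If you are running SQL Server 2022, one option uses last_value and ignore nulls. With the default window frame, each call returns the most recent non-null value up to the current row, so the latest row per email (rn = 1) carries the answer for every column: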

select *
from (
    select email, date,
        last_value(city)  ignore nulls over(partition by email order by date) city,
        last_value(state) ignore nulls over(partition by email order by date) state,
        last_value(zip)   ignore nulls over(partition by email order by date) zip,
        row_number()                   over(partition by email order by date desc) rn
    from mytable t
) t
where rn = 1

email	date	city	state	zip	rn
abc	2023-01-04	B	JP	160007	1

fiddle
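To try the query above without the original data, a small setup script can help. The table name mytable and the column names come from the answer; the column types and the four sample rows are assumptions, picked so that the query returns the result row shown above.

```sql
-- Hypothetical sample data: types and rows are assumptions, not the asker's data.
create table mytable (
    email  varchar(100),
    [date] date,
    city   varchar(50),
    state  varchar(50),
    zip    int
);

insert into mytable (email, [date], city, state, zip) values
    ('abc', '2023-01-01', 'A',  'US', null),
    ('abc', '2023-01-02', null, 'JP', 160007),
    ('abc', '2023-01-03', 'B',  null, null),
    ('abc', '2023-01-04', null, null, null);
-- Latest non-null values for abc: city B (01-03), state JP (01-02), zip 160007 (01-02).
```

The newest row (2023-01-04) is entirely null on purpose, so the result has to be assembled from three different earlier rows.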

Or we can use top (1) with ties instead of the subquery and the rn filter:

select top (1) with ties email, date,
    last_value(city)  ignore nulls over(partition by email order by date) city,
    last_value(state) ignore nulls over(partition by email order by date) state,
    last_value(zip)   ignore nulls over(partition by email order by date) zip,
    row_number() over(partition by email order by date desc) rn
from mytable t
order by row_number() over(partition by email order by date desc)
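As a minimal sketch of why top (1) with ties keeps one row per email, the query below runs the same ordering trick on an inline set of rows. The two emails and their dates are made up for illustration, and the example leaves out ignore nulls so it should also run on versions before SQL Server 2022.

```sql
-- Hypothetical rows: two emails, two dates each.
select top (1) with ties email, [date]
from (values
    ('abc', cast('2023-01-01' as date)),
    ('abc', cast('2023-01-04' as date)),
    ('xyz', cast('2023-02-01' as date)),
    ('xyz', cast('2023-02-03' as date))
) as t(email, [date])
-- row_number() is 1 for the newest row of each email, so those rows all tie
-- for first place and "with ties" returns every one of them.
order by row_number() over(partition by email order by [date] desc);
-- Returns (abc, 2023-01-04) and (xyz, 2023-02-03); the order between them is not guaranteed.
```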

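In earlier versions, one alternative uses a gaps-and-islands technique. A running count of non-null values assigns each row to the group started by the most recent non-null value, and a window max over that group then recovers the value: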

select top (1) with ties email, date, 
    max(city)  over(partition by email, grp_city ) city,
    max(state) over(partition by email, grp_state) state,
    max(zip)   over(partition by email, grp_zip  ) zip
from (
    select t.*,
        count(city)  over(partition by email order by date) grp_city,
        count(state) over(partition by email order by date) grp_state,
        count(zip)   over(partition by email order by date) grp_zip
    from mytable t 
) t
order by row_number() over(partition by email order by date desc)

Demo on DB Fiddle
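To see what the inner query of the gaps-and-islands version produces, it can help to look at the running counts on their own. The sketch below uses an inline set of hypothetical rows (not the asker's data) in place of mytable.

```sql
-- Hypothetical rows for one email; the newest row is entirely null.
select email, [date], city, state, zip,
    count(city)  over(partition by email order by [date]) grp_city,
    count(state) over(partition by email order by [date]) grp_state,
    count(zip)   over(partition by email order by [date]) grp_zip
from (values
    ('abc', cast('2023-01-01' as date), 'A',  'US', null),
    ('abc', cast('2023-01-02' as date), null, 'JP', 160007),
    ('abc', cast('2023-01-03' as date), 'B',  null, null),
    ('abc', cast('2023-01-04' as date), null, null, null)
) as t(email, [date], city, state, zip)
order by email, [date];
-- date        city  state  zip     grp_city  grp_state  grp_zip
-- 2023-01-01  A     US     NULL    1         1          0
-- 2023-01-02  NULL  JP     160007  1         2          1
-- 2023-01-03  B     NULL   NULL    2         2          1
-- 2023-01-04  NULL  NULL   NULL    2         2          1
```

count() skips nulls, so the counter only moves when a column has a value. Each group therefore holds at most one non-null value, and max() over (email, grp) returns it: on the 2023-01-04 row that gives city B, state JP, and zip 160007.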

Tags: sql, sql-server, aggregate-functions, gaps-and-islands

Ambreen
