I am facing a conceptual problem that I am having a hard time overcoming. I am hoping the SO folks can help me overcome it with a nudge in the right direction.
I am in the process of doing some ETL work with the source data being very similar and very large. I am loading it into a table that is intended for replication and I only want the most basic of information in this target table.
My source table looks something like this:

I need my target table to reflect it as such:

As you can see I didn't duplicate the InTransit status where it was duplicated in the source table. The steps I am trying to figure out how to achieve are
My source table could easily consist of 100k+ rows but having the need to run this every 15 minutes requires me to make sure this is very performant thus why I am really trying to avoid cursors.
Right now the only way I can see to do this is using a CLR sproc but I think there may be better ways thus I am hoping you guys can nudge me in the right direction.
I am sure I am probably leaving something out that you may need so please let me know what info you may need and I'll happily provide.
Thank you in advance!
EDIT: Ok I wasn't explicit enough in my question. My source table is going to contain multiple tracking Ids. It may be up to 100k+ rows containing mulitple TrackingId's and multiple statuses for each trackingId. I have to update the target table as above for each individual tracking Id but my source will be an amalgam of trackingId's.
Here's a solution without self-joins:
WITH q AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY statusDate) AS rn,
ROW_NUMBER() OVER (PARTITION BY status ORDER BY statusDate) AS rns
FROM tracking
WHERE tackingId = @id
),
qs AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY rn - rns ORDER BY statusDate) AS rnn
FROM q
)
SELECT *
FROM qs
WHERE rnn = 1
ORDER BY
statusDate
Here's a script to check:
DECLARE @tracking TABLE
(
id INT NOT NULL PRIMARY KEY,
trackingId INT NOT NULL,
status INT,
statusDate DATETIME
)
INSERT
INTO @tracking
SELECT 1, 1, 1, DATEADD(d, 1, '2010-01-01')
UNION ALL
SELECT 2, 1, 2, DATEADD(d, 2, '2010-01-01')
UNION ALL
SELECT 3, 1, 2, DATEADD(d, 3, '2010-01-01')
UNION ALL
SELECT 4, 1, 2, DATEADD(d, 4, '2010-01-01')
UNION ALL
SELECT 5, 1, 3, DATEADD(d, 5, '2010-01-01')
UNION ALL
SELECT 6, 1, 3, DATEADD(d, 6, '2010-01-01')
UNION ALL
SELECT 7, 1, 4, DATEADD(d, 7, '2010-01-01')
UNION ALL
SELECT 8, 1, 2, DATEADD(d, 8, '2010-01-01')
UNION ALL
SELECT 9, 1, 2, DATEADD(d, 9, '2010-01-01')
UNION ALL
SELECT 10, 1, 1, DATEADD(d, 10, '2010-01-01')
;
WITH q AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY statusDate) AS rn,
ROW_NUMBER() OVER (PARTITION BY status ORDER BY statusDate) AS rns
FROM @tracking
),
qs AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY rn - rns ORDER BY statusDate) AS rnn
FROM q
)
SELECT *
FROM qs
WHERE rnn = 1
ORDER BY
statusDate
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With