Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find last date of previous week using data.table in R

Tags:

r

data.table

I have a data.frame as shown below:

structure(list(Start_Date = c("D1", "D2", "D3", "D4", "NA", "NA", 
"D6", "D7"), Week = c("W1", "W1", "W1", "W2", "W2", "W3", "W4", 
"W4"), last_date = c(NA, NA, NA, "D3", "D3", "D4", "D4", "D4"
)), class = "data.frame", row.names = c(NA, -8L))

output is column 'last_date'

What I require ? -I want to find last non-NA date of previous week. If the previous week has only NA date then it should look into previous to previous week and find non-NA date. For example - for all dates of W2, the last date would be D3 (last non-NA date of previous week). For W3, it should return D4. For W4, as only last date of W3 is NA, it should look for non-NA date in previous to previous week (i.e. W2) and return D4.

Summarizing, the last date will be most recent non-NA date (not coming from current week)

As my data set is too bulky, I am looking for a data.table solution.

like image 798
Champ Avatar asked Jan 27 '26 19:01

Champ


1 Answers

Here is one option (assumes data is ordered already):

# Load data.table and convert data.frame to data.table
library(data.table)
setDT(df)

# Clean data; "NA" is just a regular character at the moment
df[Start_Date == "NA", Start_Date := NA_character_]

# Step 1: Create a numeric week index
df[, week_nr := .GRP, keyby = Week]
# Step 2: Create a lookup table (lut) for last date for each week
lut <- df[!is.na(Start_Date), last(Start_Date), by = week_nr + 1L]
# Step 3: Use join syntax to "consult" the lut and add a new column
df[, last_dat2 := lut[.SD, on = "week_nr", V1, roll = Inf]] 

Step 2-3 could be compressed to one step:

df[, last_dat2 := df[!is.na(Start_Date)
                     ][.SD, 
                       on = .(week_nr < week_nr), 
                       last(Start_Date), 
                       by = .EACHI]$V1]

Or a bit more cleanly using mult=:

df[, last_dat2 := df[!is.na(Start_Date)
                     ][.SD, 
                       on = .(week_nr < week_nr), 
                       Start_Date, 
                       mult = "last"]]

Output:

   Start_Date Week last_date week_nr last_dat2
1:         D1   W1      <NA>       1      <NA>
2:         D2   W1      <NA>       1      <NA>
3:         D3   W1      <NA>       1      <NA>
4:         D4   W2        D3       2        D3
5:       <NA>   W2        D3       2        D3
6:       <NA>   W3        D4       3        D4
7:         D6   W4        D4       4        D4
8:         D7   W4        D4       4        D4
like image 100
sindri_baldur Avatar answered Jan 29 '26 10:01

sindri_baldur



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!