
Aggregate only rows based on values of another column

Tags:

r

I have this dataset:

  CASHPOINT_ID         DT     status   QT_REC
1   N053360330 2016-01-01 end_of_day      5
2   N053360330 2016-01-01 end_of_day      2
3   N053360330 2016-01-02 before          9
4   N053360330 2016-01-02 before         NA
5   N053360330 2016-01-03 end_of_day     16
6   N053360330 2016-01-03 end_of_day     NA

I want to aggregate only the rows whose status column is not "before" and leave the other rows untouched. The resulting dataset should look like this:

  CASHPOINT_ID         DT     status   QT_REC
1   N053360330 2016-01-01 end_of_day      7
3   N053360330 2016-01-02 before          9
4   N053360330 2016-01-02 before         NA
5   N053360330 2016-01-03 end_of_day     16

Thanks.

Marco Fumagalli asked Dec 31 '25 19:12


1 Answer

Using data.table

Assuming your original data is called dt and has already been converted with setDT(), you could sum the "end_of_day" rows per group and bind the remaining rows back on unchanged:

df <- rbind(
  dt[status == "end_of_day", .(QT_REC = sum(QT_REC, na.rm = TRUE)), 
     by = .(CASHPOINT_ID, DT, status)],
  dt[status != "end_of_day"]
)[order(DT)]

print(df)
   CASHPOINT_ID         DT     status QT_REC
1:   N053360330 2016-01-01 end_of_day      7
2:   N053360330 2016-01-02     before      9
3:   N053360330 2016-01-02     before     NA
4:   N053360330 2016-01-03 end_of_day     16
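
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same rbind approach. It filters on status != "before" to match the question's wording literally, which is an assumption that only matters if other status values exist. The name res is just illustrative.

```r
library(data.table)

# Rebuild the question's sample data (values copied from the post).
dt <- data.table(
  CASHPOINT_ID = "N053360330",
  DT = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02",
                 "2016-01-02", "2016-01-03", "2016-01-03")),
  status = c("end_of_day", "end_of_day", "before",
             "before", "end_of_day", "end_of_day"),
  QT_REC = c(5, 2, 9, NA, 16, NA)
)

# Sum every group whose status is not "before";
# pass the "before" rows through unchanged, then restore date order.
res <- rbind(
  dt[status != "before", .(QT_REC = sum(QT_REC, na.rm = TRUE)),
     by = .(CASHPOINT_ID, DT, status)],
  dt[status == "before"]
)[order(DT)]

print(res)
```

One thing to be aware of: with na.rm = TRUE, a group whose QT_REC values are all NA sums to 0 rather than NA, so handle that case separately if the distinction matters.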
sindri_baldur answered Jan 02 '26 10:01


