I have no experience with data.table, so I don't know if there is a solution to my question (30 minutes on Google gave no answer at least), but here it goes.
With data.frame I often use the following command to check the number of observations of a unique value:
df$Obs=with(df, ave(v1, ID-Date, FUN=function(x) length(unique(x))))
Is there any corresponding method when working with data.table?
Yes, there is. Happily, you've asked about one of the newest features of data.table, added in v1.8.2 :
:= by group is now implemented (FR#1491) and sub-assigning to a new column by reference now adds the column automatically (initialized with NA where the sub-assign doesn't touch) (FR#1997). := by group can be combined with all types of i, so := by group includes grouping by i as well as by by. Since := by group is by reference, it should be significantly faster than any method that (directly or indirectly) cbinds the grouped results to DT, since no copy of the (large) DT is made at all. It's a short and natural syntax that can be compounded with other queries.
DT[,newcol:=sum(colB),by=colA]
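As a minimal sketch of what that news item describes, here is the same call on a toy table. The data values and the column name partial are made up for illustration; colA, colB and newcol follow the line above.

```r
library(data.table)

# Toy data; the values are invented for illustration only
DT <- data.table(colA = c("a", "a", "b", "b", "b"),
                 colB = c(1, 2, 3, 4, 5))

# Add a column by reference holding the per-group sum of colB
DT[, newcol := sum(colB), by = colA]
# newcol is now 3, 3, 12, 12, 12

# Combined with i: only rows where colB > 1 are assigned;
# rows the sub-assign doesn't touch are left as NA
DT[colB > 1, partial := sum(colB), by = colA]
# partial is now NA, 2, 12, 12, 12
```

Neither call needs DT <- on the left-hand side, because the column is added to DT in place.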
In your example, iiuc, it should be something like :
DT[, Obs:=.N, by=ID-Date]
instead of :
df$Obs=with(df, ave(v1, ID-Date, FUN=function(x) length(unique(x))))
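One thing to watch: .N counts the rows in each group, while length(unique(x)) counts the distinct values, so the two only agree when values don't repeat within a group. A small sketch of the difference, assuming a single hypothetical grouping column ID in place of the grouping used in the question (the data and the Rows column are made up for illustration):

```r
library(data.table)

# Hypothetical data: ID stands in for the grouping variable, v1 is the value
df <- data.frame(ID = c(1, 1, 1, 2, 2),
                 v1 = c(10, 10, 20, 30, 30))

# Base R: number of distinct v1 values within each ID
df$Obs <- with(df, ave(v1, ID, FUN = function(x) length(unique(x))))
# df$Obs is 2, 2, 2, 1, 1

DT <- as.data.table(df[, c("ID", "v1")])

# .N counts rows per group
DT[, Rows := .N, by = ID]
# Rows is 3, 3, 3, 2, 2

# Counting distinct values per group, matching the ave() call
DT[, Obs := length(unique(v1)), by = ID]
# Obs is 2, 2, 2, 1, 1
```

If a plain row count per group is what you're after, .N is the shorter choice; if you need the distinct count, put length(unique(v1)) in j instead.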
Note that := by group scales well for large data sets (and smaller datasets with a lot of small groups).
See ?":=" and search the data.table tag for "reference".