Only in R 3.6.0 (pre-release) do I see a memory leak when using the data.table package. It happens with both the CRAN version and the GitHub version.
library(data.table)
n <- 2e6
df <- data.frame(a=rnorm(n),
                 b=factor(rbinom(n,5,prob=0.5),1:5,letters[1:5]),
                 c=factor(rbinom(n,5,prob=0.5),1:5,letters[1:5]))
dt <- setDT(df)
print(pryr::mem_used())
fff <- function(aref) {
  ff <- lapply(1:5, function(i) {
    dt2 <- dt[,list(sumA=sum(get(aref))),by=b][,c:=letters[i]]
    dt2
  })
  return(rbindlist(ff))
}
for(i in 1:10) {
  f <- fff("a")
  rm("f")
  gc()
  print(pryr::mem_used())
}
gc()
print(pryr::mem_used())
This prints (on R 3.6.0 only):
81.2 MB
81.2 MB
81.2 MB
184 MB
287 MB
390 MB
493 MB
596 MB
699 MB
802 MB
Any ideas?
Both the call to `get` and the `by` appear to be necessary. The `[,c:=letters[i]]` step is NOT necessary, but it makes the memory leak show up much faster.
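Since the `get` call inside a grouped `j` seems to be part of the trigger, one way to probe this further is to write the same aggregation without `get`. The following is only a minimal sketch: it uses a small stand-in table (the size and seed are arbitrary choices for illustration) and selects the column through `.SDcols` instead. It only checks that both forms give the same result. Whether the `.SDcols` form avoids the memory growth on R 3.6.0 has not been verified here.

```r
library(data.table)

# Small stand-in for the table in the question (hypothetical size and seed)
set.seed(1)
n <- 1e4
dt <- data.table(a = rnorm(n),
                 b = factor(sample(letters[1:5], n, replace = TRUE)))

aref <- "a"

# Form from the question: column looked up by name with get() in a grouped j
with_get <- dt[, list(sumA = sum(get(aref))), by = b]

# Same aggregation without get(): .SDcols restricts .SD to the named column
without_get <- dt[, list(sumA = sum(.SD[[1L]])), by = b, .SDcols = aref]

# Both forms should produce the same per-group sums
stopifnot(isTRUE(all.equal(with_get, without_get)))
```

Swapping the `.SDcols` line into `fff` and rerunning the loop would show whether the `mem_used()` figures still climb.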
My session info:
> sessionInfo()
R Under development (unstable) (2018-05-10 r74708)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] data.table_1.11.3
loaded via a namespace (and not attached):
[1] compiler_3.6.0   pryr_0.1.4       magrittr_1.5     tools_3.6.0     
[5] Rcpp_0.12.16     stringi_1.1.7    codetools_0.2-15 stringr_1.3.0   
Yay! A reproducible example. We've been struggling for a few weeks in this area. Your example looks extremely useful. Please join us on GitHub.
The current milestone (next release) is 1.11.4 and there are several related issues there. What made you think we didn't want you to raise an issue? Bullet point 3 of the issue template, I guess. I've now changed those points to be clearer, I hope. You're a package developer having issues at the moment with the as-yet-unreleased R 3.6.0 and the recently released data.table, so this should be on GitHub.
