When is it worth using `remove` in R functions?

Question

What factors should I consider when deciding whether or not to remove a variable that will not be used again in a function?

Here's a noddy example:

DivideByLower <- function (a, b) {
  if (a > b) {
    tmp <- a
    a <- b
    b <- tmp
    remove(tmp) # When should I include this line?
  }

  # Return:
  a / b
}

I understand that tmp will be removed when the function finishes executing, but should I ever be concerned about removing it earlier?

Moody_Mudskipper · Accepted Answer

From Hadley Wickham's Advanced R:

In some languages, you have to explicitly delete unused objects for their memory to be returned. R uses an alternative approach: garbage collection (or GC for short). GC automatically releases memory when an object is no longer used. It does this by tracking how many names point to each object, and when there are no names pointing to an object, it deletes that object.

In the case you're describing, garbage collection will release the memory once the function has returned and nothing refers to tmp any more.
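As a minimal sketch of what rm actually changes here (swap_demo and its remove_early argument are hypothetical names, not from the question), the function below reports which names are still bound locally just before it returns. rm only drops the binding earlier; the memory itself is reclaimed whenever the garbage collector next runs.

```r
# Hypothetical helper: swaps a and b, optionally removes tmp early,
# then returns the names still bound in the function's own environment.
swap_demo <- function(a, b, remove_early = FALSE) {
  tmp <- a
  a <- b
  b <- tmp
  if (remove_early) rm(tmp)
  ls()  # inside a function, ls() lists the local bindings
}

swap_demo(1, 2)
#> [1] "a"            "b"            "remove_early" "tmp"
swap_demo(1, 2, remove_early = TRUE)
#> [1] "a"            "b"            "remove_early"
```

Either way, none of these bindings survive the call, so for a small object like tmp the early rm makes no practical difference.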

The exception is when your function returns another function. Hadley calls these the function factory and the manufactured function, respectively. Variables created in the body of the function factory remain available in the enclosing environment of the manufactured function, so their memory won't be freed for as long as the manufactured function exists.

More info, still in Hadley's book, can be found in the chapter about function factories.

function_factory <- function(x){
  force(x)
  y <- "bar"
  fun <- function(z){
    sprintf("x, y, and z are all accessible and their values are '%s', '%s', and '%s'",
            x, y, z)
  }
  fun
}

manufactured_function <- function_factory("foo")
manufactured_function("baz")
#> [1] "x, y, and z are all accessible and their values are 'foo', 'bar', and 'baz'"

Created on 2019-07-08 by the reprex package (v0.3.0)
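To see what a manufactured function keeps alive, you can inspect its enclosing environment with environment() and ls(). This is a small sketch using a shortened version of the factory above (the body of fun is simplified for illustration):

```r
function_factory <- function(x) {
  force(x)
  y <- "bar"
  fun <- function(z) paste(x, y, z)
  fun
}

manufactured_function <- function_factory("foo")

# The enclosing environment is the factory's execution environment,
# which still holds everything created in the factory's body.
enclosing <- environment(manufactured_function)
ls(enclosing)
#> [1] "fun" "x"   "y"
get("y", envir = enclosing)
#> [1] "bar"
```

Everything listed there stays in memory for as long as manufactured_function is reachable.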

In this case, if you want to control which variables are available in the enclosing environment, or make sure you don't clutter memory, you may want to remove unnecessary objects, either by calling rm / remove directly as you did or, as I tend to prefer, by wrapping the rm call in on.exit.
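A minimal sketch of the on.exit approach, assuming a hypothetical factory make_centerer with a large intermediate object called scratch (both names are made up for illustration):

```r
make_centerer <- function(x) {
  scratch <- numeric(1e6)   # hypothetical large intermediate, not needed later
  on.exit(rm(scratch))      # runs when the factory exits, however it exits
  centre <- mean(x)
  function(z) z - centre
}

center_fn <- make_centerer(c(1, 2, 3))

# scratch is no longer bound in the enclosing environment
ls(environment(center_fn))
#> [1] "centre" "x"
center_fn(5)
#> [1] 3
```

The on.exit form keeps the cleanup next to the place where the object is created, and it still runs if the factory returns early or fails partway through.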

Another case in which I might use rm is when I want to access variables from a parent environment without the risk of them being overridden inside the function, but in that case it's often possible, and cleaner, to use eval.parent.

y <- 2
z <- 3
test0 <- function(x, var){
  y <- 1
  x + eval(substitute(var))
}

# oops, the value of y is the one defined in the body
test0(0, y)
#> [1] 1
test0(0, z)
#> [1] 3

# but it will work using eval.parent:
test1 <- function(x, var){
  y <- 1
  x + eval.parent(substitute(var))
}
test1(0, y)
#> [1] 2
test1(0, z)
#> [1] 3

# in some cases (better avoided), it can be easier, if quick and dirty, to do something like:
test2 <- function(x, var){
  y <- 1
  # whatever code using y
  rm(y)
  x + eval(substitute(var))
}
test2(0, y)
#> [1] 2
test2(0, z)
#> [1] 3


Tags: performance, function, r

Asked by Martin Smith · 1 answer, by Moody_Mudskipper