I have a piece of code that uses the nnet package, and I am interested in fitting a number of different neural network models and then saving all of them to disk (with save()).
The issue I am running into is that the "terms" element of the neural network has an attribute ".Environment" that ends up being hundreds of megabytes, whereas the rest of the model is only a few kilobytes (once the fitted values and residuals are deleted).
Further, deleting the ".Environment" attribute doesn't appear to cause a problem in terms of using the model with 'predict'.
Does anyone have any idea what either R or nnet is doing with this attribute? Has anyone seen anything like this?
tl;dr: this is OK, except for some very special cases
The .Environment attribute in R contains a reference to the context in which an R closure (usually a formula or a function) was defined. An R environment is a store holding the values of variables, much like a list. This allows the formula to refer to these variables, for example:
> f = function(g) return(y ~ g(x))
> form = f(exp)
> lm(form, list(y=1:10, x=log(1:10)))
...
Coefficients:
(Intercept) g(x)
3.37e-15 1.00e+00
In this example, the formula form is in effect defined as y~exp(x), by giving g the value of exp. In order to be able to find the value of g (which is an argument to function f), the formula needs to hold a reference to the environment constructed inside the call to function f.
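You can see the environment attached to a formula by using the attributes() or environment() functions. The output below is what a formula written at the top level reports; for a formula created inside a function call, as form was above, R prints the address of that call's frame instead of R_GlobalEnv: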
> attributes(form)
$class
[1] "formula"
$.Environment
<environment: R_GlobalEnv>
> environment(form)
<environment: R_GlobalEnv>
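To check what a formula is actually holding on to, the captured environment can be listed directly. This is a minimal sketch that redefines f and form from the earlier example so it runs on its own; the names env, captured_names and g_is_exp are only illustrative.

```r
# Same definitions as in the earlier example.
f <- function(g) return(y ~ g(x))
form <- f(exp)

# The environment attached to the formula is the frame of the call f(exp).
env <- environment(form)

# That frame holds the argument g, which is how the formula finds it later.
captured_names <- ls(env)
print(captured_names)

# The captured g is the exp function that was passed in.
g_is_exp <- identical(get("g", envir = env), exp)
print(g_is_exp)
```

Anything else that had been a local variable inside f would show up in the same listing, whether or not the formula mentions it.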
I believe you are using the nnet() function variant with a formula (rather than matrices), i.e.
> nnet(y ~ x1 + x2, ...)
Unfortunately, R keeps the entire environment allocated (including all the variables defined where your formula is defined), even if your formula does not refer to any of it. The language has no easy way to tell what you may or may not be using from the environment.
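The effect on saved size can be reproduced without nnet at all. The sketch below is an illustration under assumptions: make_formula and big are hypothetical names, and length(serialize(x, NULL)) is used as a rough stand-in for what save() would write before compression.

```r
# Hypothetical helper: builds a formula inside a function that also
# allocates a large local object the formula never uses.
make_formula <- function() {
  big <- numeric(1e6)  # about 8 MB of doubles, unrelated to the formula
  y ~ x1 + x2
}

form <- make_formula()

# Serializing the formula also serializes the function frame it points to,
# including 'big'. The byte count approximates the uncompressed saved size.
size_with_env <- length(serialize(form, NULL))

# Drop the reference to the function frame and measure again.
environment(form) <- NULL
size_without_env <- length(serialize(form, NULL))

print(c(with_env = size_with_env, without_env = size_without_env))
```

The first number should be dominated by the unused vector, and the second should be tiny by comparison. This mirrors the pattern in the question, where the "terms" element dwarfs the rest of the model.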
One solution is to explicitly retain only the required parts of the environment. In particular, if your formula does not refer to anything in the environment (which is the most common case), it is safe to remove it.
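For a formula that does need something from its environment, one possible approach is to copy just that variable into a fresh, small environment and attach it to the formula. This is a sketch, not a definitive recipe: it reuses f and g from the earlier example, the names slim_env and big are illustrative, and lm stands in for nnet to keep the snippet free of extra packages.

```r
f <- function(g) {
  big <- numeric(1e6)  # large local that the formula does not need
  y ~ g(x)
}
form <- f(exp)

# Build a small replacement environment holding only what the formula uses.
# Its parent is the global environment, so other names resolve as usual.
slim_env <- new.env(parent = globalenv())
assign("g", get("g", envir = environment(form)), envir = slim_env)
environment(form) <- slim_env

# The formula still works, and 'big' is no longer reachable from it.
fit <- lm(form, data.frame(y = 1:10, x = log(1:10)))
print(coef(fit))
print(ls(environment(form)))
```

Using the global environment as the parent keeps lookups working normally, and the global environment is stored only as a reference when an object is saved, so it should not inflate the file.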
I would suggest removing the environment from your formula before you call nnet, something like this:
form = y~x + z
environment(form) = NULL
...
result = nnet(form, ...)