
Understanding JavaScript closure variable capture in V8

I understand the semantics: a closure holds a reference to a variable and thereby lengthens its lifetime, so even primitive variables are no longer bound to the call stack, and variables captured by closures therefore have to be treated specially.

I also understand that, in present-day JavaScript engines, variables in the same scope can be treated differently depending on whether they are captured by a closure. For example,

function foo(){
    var a=2;
    var b=new Array(a_very_big_number).join('+');
    return function(){
        console.log(a);
    };
}
var b=foo();

Since nothing holds a reference to b in foo, there is no need to keep b in memory, so the memory it uses can be released as soon as foo returns (or, with further optimization, never be allocated at all).

My question is: why does V8 seem to pack all variables referenced by any closure together in each calling context? For example,

function foo(){
    var a=0,b=1,c=2;
    var zig=function(){
        console.log(a);
    };
    var zag=function(){
        console.log(b);
    };
    return [zig,zag];
}

Both zig and zag seem to hold a reference to a and b, even though zig clearly never uses b. This could be awful when b is very big and zig lives for a very long time.

But from an implementation point of view, I cannot understand why this is a must. As far as I know, as long as eval is not called, the scope chain can be determined before execution, and so can the reference relationships. The engine should be aware that when zig is no longer reachable, neither is a, so it can mark a as garbage.

Both Chrome and Firefox seem to follow this rule. Does the standard say that every implementation must do this? Or is this implementation simply more practical or more efficient? I'm quite puzzled.

Yuki N asked Oct 25 '25 08:10



2 Answers

The main obstacle is mutability. If two closures share the same var then they must do so in a way that mutating it from one closure is visible in the other. Hence it is not possible to copy the values of referenced variables into each closure environment, like functional languages would do (where bindings are immutable). You need to share a pointer to a common mutable heap location.
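A minimal sketch of why copying values into each closure would break the language semantics. The names makeCounter, increment and read are made up for illustration:

```javascript
// Two closures created in the same call share one binding of `count`.
function makeCounter() {
    var count = 0;
    return {
        increment: function () { count += 1; },
        read: function () { return count; }
    };
}

var counter = makeCounter();
counter.increment();
counter.increment();
console.log(counter.read()); // 2: `read` sees the writes made by `increment`
```

If each closure had received its own copy of count at creation time, read would still return 0 here.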

Now, you could allocate each captured variable as a separate cell on the heap, instead of one array holding all. However, that would often be more expensive in space and time because you'd need multiple allocations and two levels of indirection (each closure points to its own closure environment, which points to each shared mutable variable cell). With the current implementation it's just one allocation per scope and one indirection to access a variable (all closures within a single scope point to the same mutable variable array). The downside is that certain lifetimes are longer than you might expect. It's a trade-off.
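The two layouts can be modelled by hand with plain objects. This is only an illustration of the trade-off described above, not V8's actual internal data structures; makeShared, makePerVariable and the allocations field are hypothetical names, and ordinary objects stand in for engine-internal contexts and cells:

```javascript
// Layout A: one heap-allocated context per scope, shared by every closure in it.
function makeShared() {
    var context = { a: 0, b: 1 };                // single allocation for the scope
    return {
        zig: function () { return context.a; },  // one hop: context -> a
        zag: function () { return context.b; },  // zig keeps b alive via context
        allocations: 1
    };
}

// Layout B: one mutable cell per captured variable, plus a per-closure
// environment that points only at the cells that closure needs.
function makePerVariable() {
    var cellA = { value: 0 };                    // allocation 1
    var cellB = { value: 1 };                    // allocation 2
    var zigEnv = { a: cellA };                   // allocation 3
    var zagEnv = { b: cellB };                   // allocation 4
    return {
        zig: function () { return zigEnv.a.value; }, // two hops: env -> cell -> value
        zag: function () { return zagEnv.b.value; },
        allocations: 4
    };
}

console.log(makeShared().allocations, makePerVariable().allocations);
```

In layout B, zig no longer reaches cellB, so b could be collected while zig is alive, at the price of more allocations and an extra indirection on every access.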

Other considerations are implementation complexity and debuggability. With dubious features like eval and expectations that debuggers can inspect the scope chain, the scope-based implementation is more tractable.
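A small sketch of why eval complicates per-variable analysis. The function name peek is made up for illustration. The returned closure never mentions a or b in its source text, yet a direct eval inside it can reach both at run time, so the engine cannot decide statically which variables are unused:

```javascript
function foo() {
    var a = 0, b = 1;
    // The closure body does not name `a` or `b` anywhere...
    return function (code) {
        return eval(code); // ...but a direct eval runs in this scope chain
    };
}

var peek = foo();
console.log(peek('a + b')); // 1
```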

Andreas Rossberg answered Oct 27 '25 23:10



The standard doesn't say anything about garbage collection, but it gives some clues about what should happen. Reference: the ECMAScript standard.

"An outer Lexical Environment may, of course, have its own outer Lexical Environment. A Lexical Environment may serve as the outer environment for multiple inner Lexical Environments. For example, if a Function Declaration contains two nested Function Declarations then the Lexical Environments of each of the nested functions will have as their outer Lexical Environment the Lexical Environment of the current execution of the surrounding function."

Section 13 Function definition
  step 4: "Let closure be the result of creating a new Function object as specified in 13.2"

Section 13.2 "a Lexical Environment specified by Scope" (scope = closure)

Section 10.2 Lexical Environments:
"The outer reference of a (inner) Lexical Environment is a reference to the Lexical Environment that logically surrounds the inner Lexical Environment."

So a function has access to the environment of its parent, that is, of the function that encloses it.
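The phrase "the current execution of the surrounding function" in the quoted text can be observed directly. In this small sketch (outer, bump and peek are illustrative names), the two functions returned by one call share an environment, while a second call gets a fresh one:

```javascript
function outer() {
    var n = 0;
    return [
        function bump() { n += 1; return n; },
        function peek() { return n; }
    ];
}

var first = outer();   // environment of the first execution of outer
var second = outer();  // separate environment of the second execution

first[0]();                // bumps the `n` belonging to the first call only
console.log(first[1]());   // 1
console.log(second[1]());  // 0
```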

Hiteshdua1 answered Oct 28 '25 00:10
