Pieter van Ginkel edited this page May 28, 2015 · 8 revisions

The Scope system is the mechanism rjs uses to implement closures, with and eval.

Introduction

JavaScript supports closures. A closure is the set of variables that a function references from an enclosing function.

function f() {
  var a = 1;
  function g() {
    a += 1;
  }
  g();
  assert(a == 2);
}

In the example above, g references a from f. To implement this, f creates a closure that contains a and passes that closure to g.
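
To make the idea concrete, the example can be rewritten by hand with the closure made explicit. This is only an illustrative sketch, not what rjs generates: the scope array and the slot number are assumptions, and the real layout (described further down) reserves the first two indices.

```javascript
// Hand-desugared version of the example above: the closed-over variable `a`
// lives in an explicit scope array instead of being a plain local of f.
function f() {
  const scope = [1];        // slot 0 holds `a`, initialised to 1
  function g(closure) {
    closure[0] += 1;        // a += 1
  }
  g(scope);                 // f passes the closure to g
  return scope[0];          // value of `a` after the call
}

const result = f();
```

Because both functions work on the same array, the update made by g is visible in f once g returns.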

with and eval

Beyond normal closures, JavaScript supports with and eval.

with

with is a statement that allows the properties of an object to be used as local variables. To implement this functionality, rjs creates a scope that contains a reference to the object passed to the with statement. Resolving variables will then use that scope to find local variables.

function f() {
  var o = { a: 1 };
  with (o) {
    assert(a == 1);
  }
}

In the example above, local variables are resolved against the object passed to with. Because of this, a resolves to the a property of o.
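
The lookup can be sketched roughly as follows. The helper names makeWithScope and resolve are hypothetical and the scope is modelled as a small object rather than the real JsScope layout; the point is only that a name is found by asking each scope's object whether it has that property.

```javascript
// Hypothetical model: a scope created for `with (o)` simply references `o`.
function makeWithScope(parent, object) {
  return { parent, object };
}

// Walk the chain and return the object that owns `name`, or undefined.
function resolve(scope, name) {
  for (let s = scope; s !== null; s = s.parent) {
    if (name in s.object) return s.object;
  }
  return undefined;
}

const globals = { assert: true };
const o = { a: 1 };
const globalScope = makeWithScope(null, globals);
const withScope = makeWithScope(globalScope, o);

const owner = resolve(withScope, 'a');         // found on `o`
const missing = resolve(withScope, 'missing'); // found nowhere
```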

eval

eval is a JavaScript function that takes code as a string and executes it. eval takes part in variable resolution because the evaluated code has access to the lexical scope in which eval was called.

function f() {
  eval('var a = 1;');
  assert(a == 1);
}

In the example above, eval creates a new local variable a in the lexical scope of f. After the eval call completes, a is available to the rest of f.
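
The consequence for an implementation is that the set of local variables is not known before the code runs. A minimal model of this, assuming the lexical scope of f is backed by an object and using a hypothetical declareVar helper in place of the real eval machinery:

```javascript
// The lexical scope of f, modelled as an object without a prototype.
const scopeObject = Object.create(null);

// What `eval('var a = 1;')` amounts to in this model:
// a binding that did not exist at compile time appears at run time.
function declareVar(scope, name, value) {
  scope[name] = value;
}

const before = 'a' in scopeObject;
declareVar(scopeObject, 'a', 1);
const after = 'a' in scopeObject;
```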

Thick and thin scopes

Rjs supports two kinds of scopes: thick and thin scopes. Thick scopes are full scopes that use objects (JavaScript hash tables) to implement their functionality. These are used to implement with and eval. Thin scopes are optimized scopes that implement closures when with and eval are not in play.

The JsScope struct implements closures in rjs. The JsScope struct is implemented using a flat array of JsValue instances. Some indexes into this array have special meaning:

  • [0]: The parent JsScope if there is one; JsUndefined otherwise. This is used to link one scope to another;
  • [1]: The object for the current scope; JsUndefined otherwise. This is used to implement thick scopes and references either the object passed to the with statement or the object that implements the lexical scope;
  • The contents of the remainder of the array depend on the type of scope:
    • [2..] for thin scopes: Slots for local variables for normal closures;
    • [2] for thick scopes: The arguments object if there is one.

Thin scopes have the first index set to the parent scope, if there is one, leave the second index empty, and use the remainder as slots for lifted (closed over) variables. Thick scopes have the first index set to the parent scope, if there is one, and the second index set to the with object or the object that implements the lexical scope; apart from the optional arguments object they have only these two entries. A chain of thick scopes always ends in the global scope. This is a JsScope instance that has no parent and has the global object as its scope object.
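
The layout can be pictured with plain JavaScript arrays. This is a sketch of the description above, not the actual Rust struct; the constant and function names are made up for the illustration and undefined stands in for JsUndefined.

```javascript
// Hypothetical names for the special indices.
const PARENT = 0, OBJECT = 1, FIRST_SLOT = 2;

// Thin scope: parent link, empty object entry, then one slot per lifted variable.
function makeThinScope(parent, slotCount) {
  const scope = new Array(FIRST_SLOT + slotCount).fill(undefined);
  scope[PARENT] = parent;
  return scope;
}

// Thick scope: parent link, scope object, optionally the arguments object.
function makeThickScope(parent, object, args) {
  const scope = [parent, object];
  if (args !== undefined) scope.push(args);   // lands at index 2
  return scope;
}

const globalObject = { x: 1 };
const globalScope = makeThickScope(undefined, globalObject);
const thin = makeThinScope(globalScope, 2);
thin[FIRST_SLOT] = 'value of first lifted variable';
```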

Thin scopes are far better optimized than thick scopes. When a variable resolves to a thin scope, it resolves to an index into a simple array. This isn't even a full JavaScript array, but a raw array. Thick scopes, on the other hand, are implemented using hash tables and require a lookup to find a variable. In addition, when a variable is resolved against a thin scope with multiple levels, the level at which the lifted variable is located is known up front, so the lookup of the specific scope the variable is in can be cached. With thick scopes, the algorithm has to walk the parent field dynamically until it finds a matching scope. Because of this, thick scopes are used only when absolutely necessary, i.e. only when with or eval is in play.

Resolving variables

There are five modes in which a variable can be implemented:

  • Local: A simple local variable that is placed on the stack;
  • Argument: An argument to a function;
  • Lifted: A variable that resolves to a thin scope. This resolves to the depth at which the scope is located and an index into the scope;
  • Global: A property on the global object;
  • Scoped: A property on a thick scope.

The local resolver implements the algorithm that detects the mode of a variable. The local resolver has two tasks:

  • Detect the mode of a specific variable;
  • Detect the scope mode of a function.

There are three types of blocks that the local resolver works with:

  • Function blocks form a root where variables are allocated;
  • Catch blocks introduce a new lexical scope for the exception variable;
  • with blocks introduce a new lexical scope for the object passed to the with statement.

Note that ES6 introduces a fourth block for let variables. This currently is not implemented in rjs.

The algorithm below uses the concepts of blocks and scopes. Blocks are the three types of blocks described above. Scopes are functions, with the global scope as an exception. Catch and with blocks in the global scope behave as if they were part of a function, but the global scope itself has no function block. So, when the algorithm below refers to a scope, it means either a function or the catch and with blocks nested in the global scope, not the global scope itself.

The algorithm the local resolver implements works as follows:

  • Global variables always have the global mode; otherwise
  • If the current scope has an eval call or a with statement, it is marked as building a thick scope. All parent scopes are also marked as building a thick scope to ensure that the complete chain is built;
  • For every variable, iterate over all blocks and find the variable;
    • If the variable resolves to a block in the current scope, it's a local or argument variable;
    • If the variable resolved to a block in a higher scope:
      • If the scope is the global scope, the variable has the global mode;
      • If the scope is marked as building a thick scope, the variable is a scoped variable and the current scope is marked as taking a thick scope. All intermediate levels are marked as building and taking a thick scope;
      • Otherwise the variable is a lifted variable, the scope the variable is resolved against is marked as building a thin scope and the current scope is marked as taking a thin scope. All intermediate levels are marked as building and taking a thin scope. When the resolved variable is an argument, the function prolog will move the contents of the variable into the scope. If the argument is part of a full arguments instance, the arguments instance is copied into the scope instead;
  • At the end of a scope, a check is done to see whether the scope builds a thick scope. If so, a second pass is done over all first level nested scopes:
    • If the nested scope is marked as taking a thin scope, it is changed to taking a thick scope. If it is marked as building a thin scope, it is changed to building a thick scope and the algorithm recurses into that scope. (Note that it cannot already be marked as building a thick scope.) The example below shows when this happens:
function f() {
  var a = 1;
  function g() {
    // a is marked as lifted and f as building a thin scope.
    a += 1;
  }
  function h() {
    // a is marked as scoped and f as building a thick scope.
    with ({ b: 2 }) {
      a += 1;
    }
  }
}

If a scope changes from building a thin scope to building a thick scope, this invalidates all previously resolved variables. When this happens, the mode of all variables in that scope is changed to scoped.
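
The classification step of the resolver can be sketched as below. This is a simplification under several assumptions: the thick/thin flags are taken as already final, the current function itself does not build a thick scope, blocks inside a function are flattened into one list of names, depth 1 means the directly enclosing function, and unresolved names fall back to the global mode. The classify function and the record fields are hypothetical.

```javascript
// chain[0] is the current function, the last entry is the global scope.
// Each entry: { args: [...names], locals: [...names], buildsThick: boolean }.
function classify(name, chain) {
  // Variables referenced directly in the global scope are always global.
  if (chain.length === 1) return { mode: 'global' };

  const current = chain[0];
  if (current.args.includes(name)) return { mode: 'argument' };
  if (current.locals.includes(name)) return { mode: 'local' };

  for (let depth = 1; depth < chain.length; depth++) {
    const scope = chain[depth];
    const found = scope.args.includes(name) || scope.locals.includes(name);
    if (!found) continue;
    if (depth === chain.length - 1) return { mode: 'global' };
    if (scope.buildsThick) return { mode: 'scoped' };
    return { mode: 'lifted', depth };
  }
  return { mode: 'global' };   // assumption: unknown names are treated as globals
}

const globalScope = { args: [], locals: ['counter'], buildsThick: true };
const f = { args: [], locals: ['a'], buildsThick: false };
const g = { args: ['x'], locals: [], buildsThick: false };

const modeOfX = classify('x', [g, f, globalScope]);
const modeOfA = classify('a', [g, f, globalScope]);
const modeOfCounter = classify('counter', [g, f, globalScope]);

// The same lookup once f has been switched to building a thick scope.
const thickF = { args: [], locals: ['a'], buildsThick: true };
const modeOfAThick = classify('a', [g, thickF, globalScope]);
```

The last call mirrors the example above: as soon as f builds a thick scope, a is no longer lifted but scoped.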

Accessing variables

The implementation of the five different variable modes is as follows:

  • Global variables are implemented as a Load/StoreName on the global object;
  • Local variables are stored and loaded on the stack;
  • Argument variables are stored and loaded from the passed arguments;
  • Lifted variables are implemented as follows:
    • The current scope is loaded;
    • It is replaced with the parent scope for depth - 1 iterations;
    • The variable resolves to an index into the array of the found scope;
  • Scoped variables are implemented as follows:
    • The current scope is loaded;
    • The scope object is loaded;
    • A check is done whether that object has the variable name;
      • If so, that object is used as an argument to Load/StoreName;
      • Otherwise the search is restarted with the parent scope. Note that the global object is not checked explicitly because the scope chain always terminates in the global scope. If the variable is not found anywhere in the chain, a load throws a ReferenceError and a store writes to the global object.
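
The two non-trivial access paths can be sketched as follows, with scopes modelled as arrays in the layout described earlier. The function names are hypothetical, and the scoped variants assume a chain that consists of thick scopes only and ends in the global scope.

```javascript
const PARENT = 0, OBJECT = 1;

// Lifted: hop to the parent depth - 1 times, then index into the raw array.
function loadLifted(scope, depth, index) {
  for (let i = 1; i < depth; i++) scope = scope[PARENT];
  return scope[index];
}

// Scoped load: walk the chain until a scope object has the name.
function loadScoped(scope, name) {
  for (let s = scope; s !== undefined; s = s[PARENT]) {
    if (name in s[OBJECT]) return s[OBJECT][name];
  }
  throw new ReferenceError(name + ' is not defined');
}

// Scoped store: same walk, but an unknown name ends up on the global object,
// which is the scope object of the last scope in the chain.
function storeScoped(scope, name, value) {
  let s = scope;
  while (!(name in s[OBJECT]) && s[PARENT] !== undefined) s = s[PARENT];
  s[OBJECT][name] = value;
}

// Thin chain: two levels, one slot each (index 2).
const thinOuter = [undefined, undefined, 'outer'];
const thinInner = [thinOuter, undefined, 'inner'];

// Thick chain: function scope on top of the global scope.
const globalScope = [undefined, { g: 10 }];
const fnObject = Object.create(null);
fnObject.a = 1;
const fnScope = [globalScope, fnObject];

storeScoped(fnScope, 'a', 2);       // updates the function scope
storeScoped(fnScope, 'fresh', 3);   // not found: stored on the global object
```

The lifted path does a fixed number of pointer hops and one array access, while the scoped path does a hash lookup per level, which is the cost difference described in the section on thick and thin scopes.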

Scopes on functions

When a function that takes a scope is loaded, the current scope is passed on to that function. To implement this, there are two modes in which a function can be loaded:

  • LoadFunction loads a function that does not take a scope;
  • LoadScopedFunction loads a function that does take a scope.
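
The difference between the two instructions can be modelled like this. The function record and the helper names are assumptions for the sketch; the only point is that a scoped function captures the scope that is current at load time and receives it again when called.

```javascript
// Hypothetical function record: compiled code plus an optional captured scope.
function loadFunction(code) {
  return { code, scope: undefined };
}

function loadScopedFunction(code, currentScope) {
  return { code, scope: currentScope };
}

// Calling hands the captured scope to the code as its first parameter.
function call(fn, ...args) {
  return fn.code(fn.scope, ...args);
}

const scope = [undefined, undefined, 41];               // thin scope with one slot
const increment = loadScopedFunction((s) => ++s[2], scope);
const double = loadFunction((_scope, x) => x * 2);

const incremented = call(increment);
const doubled = call(double, 4);
```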

eval calls

Direct eval calls (i.e. the exact eval('...') expression) have access to the current lexical scope. To implement this, there are two modes in which a call can be done:

  • Call implements a call that does not take a scope;
  • ScopedCall implements a call that does take a scope. Note that this must still check whether the called function resolves to global.eval. Only then is the scope passed.
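
A rough model of the check that ScopedCall performs is shown below. globalEval is a stand-in for the engine's global eval function and does not evaluate anything; it only records which scope it was given. All names are hypothetical.

```javascript
// Stand-in for global.eval: a real engine would compile and run `source`
// against `scope`; here the arguments are just returned for inspection.
function globalEval(source, scope) {
  return { source, scope };
}

// ScopedCall: the scope is only passed when the callee really is global.eval.
function scopedCall(callee, args, currentScope) {
  if (callee === globalEval) return callee(args[0], currentScope);
  return callee(...args);
}

const scope = [undefined, { a: 1 }];

// `eval('a')` where eval still refers to the global eval function.
const direct = scopedCall(globalEval, ['a'], scope);

// `eval('a')` where eval has been shadowed by some other function.
const shadowed = scopedCall((source) => source.length, ['a'], scope);
```

The run-time check is needed because the name eval can be rebound, so the call site alone does not prove that the callee is the real eval.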

Creating scopes

How a scope is created depends on whether it's a thin or thick scope.

Thin scopes

When a function builds a thin scope, the prolog of the function creates the scope instance. This is not part of the instruction set but part of the interpreter/code generator for the function.
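
What such a prolog does can be sketched as follows, including the step described earlier where a closed-over argument is moved into the scope. The prolog function and the shape of liftedArgs are assumptions made for this illustration.

```javascript
// parentScope: the scope the function was loaded with (or undefined).
// slotCount:   number of lifted variables this function owns.
// liftedArgs:  which arguments are closed over and which slot they move to.
// args:        the arguments the function was called with.
function prolog(parentScope, slotCount, liftedArgs, args) {
  const scope = new Array(2 + slotCount).fill(undefined);
  scope[0] = parentScope;
  for (const { argIndex, slot } of liftedArgs) {
    scope[2 + slot] = args[argIndex];   // move the argument into its slot
  }
  return scope;
}

// function f(x, y) { var z; return function () { return y + z; }; }
// y (argument 1) and z are lifted; x stays a plain argument.
const scope = prolog(undefined, 2, [{ argIndex: 1, slot: 0 }], ['x-value', 'y-value']);
```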

Thick scopes

Function blocks and catch blocks that create a thick scope have the following instructions added:

  • EnterScope builds a new scope:
    • The parent scope is set to the current scope. An implicit global scope is always present. This means that the parent scope will never be null;
    • The scope object is initialized to a new object (hash table) that has prototype set to null;
    • The current scope is replaced with the new one;
    • To initialize the scope:
      • If this scope is at the function level, and the function has arguments, they are copied to the new scope;
      • If this scope is at the catch level, the exception variable is copied to the new scope;
  • LeaveScope replaces the current scope with the parent scope. Note that this instruction is in a finally block that starts after the EnterScope instruction.

with blocks have the following instructions added:

  • EnterWithScope builds a new scope:
    • The parent scope is set to the current scope, which is the function scope or a nested with or catch scope;
    • The scope object is initialized to the object passed to the with statement;
  • LeaveScope (see above).
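
The three instructions can be modelled like this. It is a sketch under the array layout used earlier; the function names mirror the instruction names, but the module-level currentScope variable and the runCatchBlock helper are hypothetical.

```javascript
const PARENT = 0, OBJECT = 1;

// The implicit global scope is always present.
let currentScope = [undefined, { globalVar: 1 }];
const globalScope = currentScope;

// EnterScope: fresh prototype-less object, seeded with arguments or the
// exception variable, linked to the current scope.
function enterScope(initial) {
  const object = Object.assign(Object.create(null), initial);
  currentScope = [currentScope, object];
}

// EnterWithScope: the scope object is the object passed to `with`.
function enterWithScope(object) {
  currentScope = [currentScope, object];
}

// LeaveScope: go back to the parent.
function leaveScope() {
  currentScope = currentScope[PARENT];
}

// A catch block that builds a thick scope; LeaveScope sits in a finally
// block so the scope is restored even if the body throws.
function runCatchBlock(exception) {
  enterScope({ e: exception });
  try {
    return currentScope[OBJECT].e;
  } finally {
    leaveScope();
  }
}

const caught = runCatchBlock('boom');
```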

The arguments variable

Normally arguments are accessed using the Load/StoreParam instruction. However, when the arguments variable is used, an arguments object must be created.

Normally the arguments variable is a normal local variable, with the following exceptions:

  • If an argument is closed over, the arguments variable becomes a slot on the scope. Arguments are then accessed through the arguments object in the usual way, except that the arguments object is loaded from the closure;
  • When the function builds a thick scope, the arguments variable becomes index two of the scope. The arguments variable is special in that it cannot be accessed directly in scopes other than the scope it belongs to. Because of this, it cannot be added as a variable to the scope object. This is solved by allocating a dedicated index for the arguments variable in the thick scope itself. Accessing the arguments variable is done using the LoadScopedArguments instruction. This instruction has a depth parameter that specifies at what depth the scope that has the arguments variable is located. The instruction iterates over the parents until it reaches that scope and reads index two from it.

The arguments object is created using the CreateArguments instruction. This instruction takes the arguments off the stack and initializes an arguments object with them. It has a parameter that specifies how the arguments object needs to be stored. For arguments variables that are stored in a local variable or lifted variable, the arguments object is pushed onto the stack and the next instruction stores the object in the same way normal values are stored. For thick scopes, the parameter specifies that the result needs to be stored in the thick scope. The instruction will then put the result at index two of the thick scope.
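
A simplified model of the two instructions is given below. The arguments object is represented by a plain array, the 'stack' and 'scope' target values are invented for the sketch, and depth is assumed to be counted the same way as for lifted variables (depth 1 is the current scope).

```javascript
const PARENT = 0, ARGUMENTS = 2;

// CreateArguments: pop argCount values and build the arguments object.
// target 'stack' pushes the object back for a following store instruction;
// target 'scope' writes it to index two of the given thick scope.
function createArguments(stack, argCount, target, scope) {
  const args = stack.splice(stack.length - argCount, argCount);
  if (target === 'scope') {
    scope[ARGUMENTS] = args;
  } else {
    stack.push(args);
  }
}

// LoadScopedArguments: hop to the scope that owns the arguments object
// and read index two.
function loadScopedArguments(scope, depth) {
  for (let i = 1; i < depth; i++) scope = scope[PARENT];
  return scope[ARGUMENTS];
}

// Thick case: function scope under the global scope, with a nested with scope.
const globalScope = [undefined, {}];
const fnScope = [globalScope, Object.create(null)];
const withScope = [fnScope, { b: 2 }];
const thickStack = ['unrelated', 10, 20];
createArguments(thickStack, 2, 'scope', fnScope);
const fromWith = loadScopedArguments(withScope, 2);

// Local or lifted case: the object is left on the stack.
const plainStack = ['x', 'y'];
createArguments(plainStack, 2, 'stack');
```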
