Add new proposal for lexical scoping
jamesls committed Mar 22, 2023
1 parent 16509d2 commit 34a4f50
Showing 1 changed file with 286 additions and 0 deletions.
286 changes: 286 additions & 0 deletions proposals/0000-lexical-scope.md
@@ -0,0 +1,286 @@
# Lexical Scoping

- JEP: (leave blank)
- Author: @jamesls
- Created: 2023-03-21

## Abstract
[abstract]: #abstract

This JEP proposes the introduction of lexical scoping through a new
`let` expression. You can now bind variables that are evaluated in the
context of a given lexical scope. This enables queries that can refer to
elements defined outside of their current scope, which is not currently
possible. This JEP supersedes JEP 11, which proposed similar functionality
through a `let()` function.

## Motivation
[motivation]: #motivation

A JMESPath expression is always evaluated in the context of a current
element, which can be explicitly referred to via the `@` token. The
current element changes as expressions are evaluated. For example,
suppose we had the expression `foo.bar[0]` that we want to evaluate against
an input document of:

```json
{"foo": {"bar": ["hello", "world"]}, "baz": "baz"}
```

The expression and its associated current element are evaluated as follows:

```
# Start
expression = foo.bar[0]
@ = {"foo": {"bar": ["hello", "world"]}, "baz": "baz"}
# Step 1
expression = foo
@ = {"foo": {"bar": ["hello", "world"]}, "baz": "baz"}
result = {"bar": ["hello", "world"]}
# Step 2
expression = bar
@ = {"bar": ["hello", "world"]}
result = ["hello", "world"]
# Step 3
expression = [0]
@ = ["hello", "world"]
result = "hello"
```

The end result of evaluating this expression is `"hello"`. Note that each
step changes the values that are accessible to the current expression being
evaluated. In "Step 2", it is not possible for the expression to reference
the value of `"baz"` in the current element of the previous step, "Step 1".
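
The trace above can be mimicked with a minimal Python sketch. The `steps` list and the `evaluate` helper are hypothetical names used only for illustration, not part of any JMESPath implementation. Each step receives nothing but the current element, which is why `"baz"` is unreachable from "Step 2".

```python
# Illustrative sketch only: each step sees nothing but the current element.
document = {"foo": {"bar": ["hello", "world"]}, "baz": "baz"}

# Hand-written stand-ins for the three sub-expressions of foo.bar[0].
steps = [
    ("foo", lambda current: current["foo"]),
    ("bar", lambda current: current["bar"]),
    ("[0]", lambda current: current[0]),
]


def evaluate(steps, document):
    """Apply each step to the result of the previous one, recording a trace."""
    current = document
    trace = []
    for label, step in steps:
        result = step(current)
        trace.append((label, current, result))
        current = result
    return current, trace


result, trace = evaluate(steps, document)
for label, current, step_result in trace:
    print(f"expression = {label}\n@ = {current}\nresult = {step_result}\n")
```

By the time the `bar` step runs, its current element is `{"bar": ["hello", "world"]}`, so the top-level `"baz"` key is no longer in view.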

The inability to reference values in a parent scope is a serious limitation
of JMESPath, and anecdotally the ability to do so is one of the most commonly
requested features of the language. Below are examples of input documents and
the desired output documents that aren't possible to create with the current
version of JMESPath:

```
Input:
[
  {"home_state": "WA",
   "states": [
     {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
     {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
     {"name": "NY", "cities": ["New York City", "Albany"]}
   ]
  },
  {"home_state": "NY",
   "states": [
     {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
     {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
     {"name": "NY", "cities": ["New York City", "Albany"]}
   ]
  }
]
(for each list in "states", select the list of cities associated
 with the state defined in the "home_state" key)
Output:
[
  ["Seattle", "Bellevue", "Olympia"],
  ["New York City", "Albany"]
]
```

```
Input:
{"imageDetails": [
  {
    "repositoryName": "org/first-repo",
    "imageTags": ["latest", "v1.0", "v1.2"],
    "imageDigest": "sha256:abcd"
  },
  {
    "repositoryName": "org/second-repo",
    "imageTags": ["v2.0", "v2.2"],
    "imageDigest": "sha256:efgh"
  }
]}
(create a list of pairs containing an image tag and its associated repo name)
Output:
[
  ["latest", "org/first-repo"],
  ["v1.0", "org/first-repo"],
  ["v1.2", "org/first-repo"],
  ["v2.0", "org/second-repo"],
  ["v2.2", "org/second-repo"]
]
```

In order to support these queries we need some way for an expression to
reference values that exist outside of its implicit current element.
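
For comparison, both transformations are straightforward in a general-purpose language, because an inner loop can read a variable bound by the enclosing loop. The Python sketch below is only meant to pin down the intended results; the function names `home_state_cities` and `tag_repo_pairs` are hypothetical.

```python
# Plain Python, used only to pin down the intended results of the two examples.
STATES = [
    {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
    {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
    {"name": "NY", "cities": ["New York City", "Albany"]},
]
records = [
    {"home_state": "WA", "states": STATES},
    {"home_state": "NY", "states": STATES},
]
images = {"imageDetails": [
    {"repositoryName": "org/first-repo",
     "imageTags": ["latest", "v1.0", "v1.2"],
     "imageDigest": "sha256:abcd"},
    {"repositoryName": "org/second-repo",
     "imageTags": ["v2.0", "v2.2"],
     "imageDigest": "sha256:efgh"},
]}


def home_state_cities(records):
    """Cities of the state whose name matches each record's home_state."""
    return [
        state["cities"]
        for record in records
        for state in record["states"]
        # The inner loop reads `record`, a value from the enclosing scope.
        if state["name"] == record["home_state"]
    ]


def tag_repo_pairs(document):
    """Pair every image tag with the repository name of its parent image."""
    return [
        [tag, image["repositoryName"]]
        for image in document["imageDetails"]
        for tag in image["imageTags"]
    ]


print(home_state_cities(records))
print(tag_repo_pairs(images))
```

In both functions the inner iteration depends on a value (`record` or `image`) that belongs to the outer iteration, which is exactly the kind of reference a JMESPath expression cannot currently make.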


## Specification
[specification]: #specification

A new "let expression" is added to the language. The expression has the
format: `let <bindings> in <expr>`. The updated grammar rules in ABNF are:

```
let-expression = "let" bindings "in" expression
bindings = variable-binding *( "," variable-binding )
variable-binding = variable-ref "=" expression
variable-ref = "$" unquoted-string
```

The `let-expression` and `variable-ref` rules are also added as new expression
types:

```
expression =/ let-expression / variable-ref
```

Examples of this new syntax:

* `let $foo = bar in {a: myvar, b: $foo}`
* `let $foo = baz[0] in bar[? baz == $foo ] | [0]`
* `let $a = b, $c = d in bar[*].[$a, $c, foo, bar]`
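
As a rough illustration of the `variable-ref` rule, the Python sketch below recognizes variable references with a regular expression. It assumes that `unquoted-string` keeps its existing JMESPath definition (a letter or underscore followed by letters, digits, or underscores). The names `VARIABLE_REF`, `is_variable_ref`, and `variable_refs` are hypothetical, and the scan is deliberately naive: it does not skip string literals, so it is not a substitute for a real lexer.

```python
import re

# Assumption: unquoted-string is [A-Za-z_][A-Za-z0-9_]*, as in the existing grammar.
VARIABLE_REF = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")


def is_variable_ref(token):
    """Return True if the whole token matches the variable-ref rule."""
    return VARIABLE_REF.fullmatch(token) is not None


def variable_refs(expression):
    """Naively list every variable-ref occurrence in an expression string."""
    return VARIABLE_REF.findall(expression)


print(is_variable_ref("$foo"))
print(variable_refs("let $a = b, $c = d in bar[*].[$a, $c, foo, bar]"))
```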

### New evaluation rules

Let expressions are evaluated as follows.

Given the rule `"let" bindings "in" expression`, the `bindings` rule is
processed first. Each `variable-binding` within the `bindings` rule defines
the name of a variable and an expression. Each expression is evaluated, and the
result of this evaluation is then bound to the associated variable name.

Once all the `variable-binding` rules have been processed, the associated
`expression` clause of the let expression is then evaluated. During the
evaluation of the expression, any references, via the `variable-ref` rule, to a
variable name will evaluate to the value bound to the variable. Once the
associated expression has been evaluated, the let expression itself evaluates
to the result of this expression. After the let expression has been evaluated,
the variable bindings associated with the let expression are no longer valid.
This is also referred to as the visibility of a binding; the bindings of a
let expression are only visible during the evaluation of the `expression`
clause of the let expression.
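
The evaluation order described above can be sketched with a tiny interpreter over a hand-built AST. This is a minimal illustration under stated assumptions, not a reference implementation: the tuple-based node shapes (`"literal"`, `"field"`, `"list"`, `"var"`, `"let"`) and the `evaluate` function are hypothetical, and only enough of the language is modeled to show how bindings are processed.

```python
def evaluate(node, current, scope):
    """Evaluate a tiny hand-built AST against `current` with variables in `scope`."""
    kind = node[0]
    if kind == "literal":
        return node[1]
    if kind == "field":
        return current.get(node[1]) if isinstance(current, dict) else None
    if kind == "list":
        return [evaluate(item, current, scope) for item in node[1]]
    if kind == "var":
        return scope[node[1]]
    if kind == "let":
        _, bindings, body = node
        # Every binding expression is evaluated in the enclosing scope, so the
        # new bindings are not yet visible while the bindings are processed.
        bound = {name: evaluate(expr, current, scope) for name, expr in bindings}
        # The new bindings are visible only while the body is evaluated; the
        # caller's `scope` dict is never mutated.
        return evaluate(body, current, {**scope, **bound})
    raise ValueError(f"unknown node kind: {kind}")


# let $foo = foo in [$foo, $foo]
basic = ("let", [("foo", ("field", "foo"))],
         ("list", [("var", "foo"), ("var", "foo")]))

# let $a = 'top-a' in let $a = 'in-a', $b = $a in $b
nested = ("let", [("a", ("literal", "top-a"))],
          ("let", [("a", ("literal", "in-a")), ("b", ("var", "a"))],
           ("var", "b")))

basic_result = evaluate(basic, {"foo": "bar"}, {})
nested_result = evaluate(nested, {}, {})
print(basic_result, nested_result)
```

Building a fresh dictionary for the body, rather than mutating the caller's scope, is one simple way to guarantee that bindings stop being visible once the let expression has been evaluated.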

When evaluating the `bindings` rule, a `variable-binding` for a variable name
that is already visible in the current scope will replace the existing binding
when evaluating the `expression` clause of the let expression. This means in
the context of nested let expressions (and consequently nested scopes), a
variable in an inner scope can shadow a variable defined in an outer scope.
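
One way an implementation might model nested scopes (an assumption for illustration, not something this JEP mandates) is a chain of dictionaries in which lookups consult the innermost scope first. Python's `collections.ChainMap` behaves this way, so it gives a compact picture of shadowing:

```python
from collections import ChainMap

# Outer let: $a = 'topval', $b = 'kept'
outer = ChainMap({"a": "topval", "b": "kept"})

# Inner let: $a = 'shadow'. The new child scope is searched first.
inner = outer.new_child({"a": "shadow"})

print(inner["a"])  # the inner binding shadows the outer one
print(inner["b"])  # names not rebound are still found in the outer scope
print(outer["a"])  # the outer binding itself is untouched
```

Once the inner let expression finishes, evaluation simply continues with `outer`, where `$a` still has its original value.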

If a `variable-ref` references a variable that has not been defined, the
evaluation of that `variable-ref` will trigger an `undefined-variable` error.
This error MUST occur when the expression is evaluated and not at compile
time. This is to enable implementations to define an implementation-specific
mechanism for defining an initial or "global" scope. Implementations are free
to offer a "strict" compilation mode that a user can opt into, but MUST support
triggering an `undefined-variable` error only when the `variable-ref` is
evaluated.
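
The sketch below illustrates why deferring the error matters. It is a hypothetical model, not an API of any JMESPath library: `UndefinedVariableError` stands in for the `undefined-variable` error, and `compile_variable_ref` stands in for compiling the expression `$name`. Compilation never consults a scope, so a caller can supply an initial scope at evaluation time.

```python
class UndefinedVariableError(Exception):
    """Hypothetical exception standing in for the undefined-variable error."""


def compile_variable_ref(name):
    """Compile a reference to $name; no scope is consulted at this point."""
    def run(scope):
        # The lookup, and therefore the error, happens only at evaluation time.
        if name not in scope:
            raise UndefinedVariableError(name)
        return scope[name]
    return run


compiled = compile_variable_ref("foo")  # succeeds even though $foo is unbound

# An implementation-defined initial scope can satisfy the reference.
value = compiled({"foo": "from-global-scope"})

# With no binding in scope, the error surfaces when the reference is evaluated.
try:
    compiled({})
    outcome = "no error"
except UndefinedVariableError as exc:
    outcome = f"undefined-variable: {exc}"

print(value, outcome)
```

A "strict" mode, as permitted above, could additionally reject unbound names while compiling, as long as the default behavior remains the deferred check.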

### Examples

Basic examples demonstrating core functionality.

```
search(let $foo = foo in $foo, {"foo": "bar"}) -> "bar"
search(let $foo = foo.bar in $foo, {"foo": {"bar": "baz"}}) -> "baz"
search(let $foo = foo in [$foo, $foo], {"foo": "bar"}) -> ["bar", "bar"]
```

Nested bindings.

```
search(
  let $a = a
  in
    b[*].[a, $a, let $a = 'shadow' in $a],
  {"a": "topval", "b": [{"a": "inner1"}, {"a": "inner2"}]}
) -> [["inner1", "topval", "shadow"], ["inner2", "topval", "shadow"]]
```

Error cases.

```
search($foo, {}) -> <error: undefined-variable>
search([let $foo = 'bar' in $foo, $foo], {}) -> <error: undefined-variable>
```


## Rationale
[rationale]: #rationale

The let expression proposed in this JEP is based on similar constructs
in existing programming languages:


* Haskell: http://learnyouahaskell.com/syntax-in-functions#let-it-be
* Clojure: https://clojuredocs.org/clojure.core/let
* OCaml: https://v2.ocaml.org/manual/expr.html#sss:expr-localdef

It's important to use syntax and semantics that are already familiar to
developers. We are introducing lexical scoping, which is not a novel
concept, into the language, so care was taken to be consistent with
the mental model that developers already have.


## Testcases
[testcases]: #testcases

Basic expressions

```yaml
# Basic expressions
- given:
    foo:
      bar: baz
  cases:
    - expression: "let $foo = foo in $foo"
      result:
        bar: baz
    - expression: "let $foo = foo.bar in $foo"
      result: "baz"
    - expression: "let $foo = foo.bar in [$foo, $foo]"
      result: ["baz", "baz"]
    - comment: "Multiple assignments"
      expression: "let $foo = 'foo', $bar = 'bar' in [$foo, $bar]"
      result: ["foo", "bar"]
# Nested expressions
- given:
    a: topval
    b:
      - a: inner1
      - a: inner2
  cases:
    - expression: "let $a = a in b[*].[a, $a, let $a = 'shadow' in $a]"
      result:
        - ["inner1", "topval", "shadow"]
        - ["inner2", "topval", "shadow"]
    - comment: Bindings only visible within expression clause
      expression: "let $a = 'top-a' in let $a = 'in-a', $b = $a in $b"
      result: "top-a"
# Examples from Motivation section
- given:
    - home_state: WA
      states:
        - name: WA
          cities: ["Seattle", "Bellevue", "Olympia"]
        - name: CA
          cities: ["Los Angeles", "San Francisco"]
        - name: NY
          cities: ["New York City", "Albany"]
    - home_state: NY
      states:
        - name: WA
          cities: ["Seattle", "Bellevue", "Olympia"]
        - name: CA
          cities: ["Los Angeles", "San Francisco"]
        - name: NY
          cities: ["New York City", "Albany"]
  cases:
    - expression: "[*].[let $home_state = home_state in states[? name == $home_state].cities[]][]"
      result:
        - ["Seattle", "Bellevue", "Olympia"]
        - ["New York City", "Albany"]
```
