From 8b5698ad8d4a1e1b774821f4370178384edc8afd Mon Sep 17 00:00:00 2001
From: Daniel Ehrenberg
Date: Wed, 12 Apr 2017 16:16:47 +0200
Subject: [PATCH 01/14] Normative: Cache templates per site, rather than by contents

The previous definition of template caching had a few issues:
- (from @syg) Template strings may live forever due to putting them in a WeakMap
- (from @ajklein) Because of this logic, it's rather difficult to implement any GC at all of template objects
- (from @erights) The template string facility cannot be extended to expose anything about the site, as it's site-independent

This patch makes template caching key off the Parse Node where the template occurs in source, rather than the List of Strings that the template evaluates into.

These semantics seem to match SpiderMonkey's implementation of templates. V8, ChakraCore and JSC, on the other hand, implement the prior semantics.

Resolves https://github.com/tc39/ecma262/issues/840
---
 spec.html | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/spec.html b/spec.html
index 173b3e022a..cc5e943465 100644
--- a/spec.html
+++ b/spec.html
@@ -6121,10 +6121,11 @@

Realms

[[TemplateMap]]
- A List of Record { [[Strings]]: List, [[Array]]: Object}.
+ A List of Record { [[Site]]: Parse Node, [[Array]]: Object}.
- Template objects are canonicalized separately for each realm using its Realm Record's [[TemplateMap]]. Each [[Strings]] value is a List containing, in source text order, the raw String values of a |TemplateLiteral| that has been evaluated. The associated [[Array]] value is the corresponding template object that is passed to a tag function.
+ Template objects are canonicalized separately for each realm using its Realm Record's [[TemplateMap]]. Each [[Site]] value is a Parse Node containing a |TemplateLiteral|. The associated [[Array]] value is the corresponding template object that is passed to a tag function.
+ Once a Parse Node becomes unreachable, the corresponding [[Array]] is also unreachable, and it would be unobservable if an implementation removed the pair from the [[TemplateMap]] list.
@@ -12079,7 +12080,7 @@

Runtime Semantics: GetTemplateObject ( _templateLiteral_ )

  1. Let _realm_ be the current Realm Record.
  1. Let _templateRegistry_ be _realm_.[[TemplateMap]].
  1. For each element _e_ of _templateRegistry_, do
-   1. If _e_.[[Strings]] and _rawStrings_ contain the same values in the same order, then
+   1. If _e_.[[Site]] is the same Parse Node as _templateLiteral_, then
      1. Return _e_.[[Array]].
  1. Let _cookedStrings_ be TemplateStrings of _templateLiteral_ with argument *false*.
  1. Let _count_ be the number of elements in the List _cookedStrings_.
@@ -12097,7 +12098,7 @@

Runtime Semantics: GetTemplateObject ( _templateLiteral_ )

  1. Perform SetIntegrityLevel(_rawObj_, `"frozen"`).
  1. Call _template_.[[DefineOwnProperty]](`"raw"`, PropertyDescriptor{[[Value]]: _rawObj_, [[Writable]]: *false*, [[Enumerable]]: *false*, [[Configurable]]: *false*}).
  1. Perform SetIntegrityLevel(_template_, `"frozen"`).
- 1. Append the Record{[[Strings]]: _rawStrings_, [[Array]]: _template_} to _templateRegistry_.
+ 1. Append the Record{[[Site]]: _templateLiteral_, [[Array]]: _template_} to _templateRegistry_.
  1. Return _template_.

From 30ce444ca458e5be21968949f87373799116daae Mon Sep 17 00:00:00 2001
From: Daniel Ehrenberg
Date: Thu, 13 Apr 2017 22:33:37 +0200
Subject: [PATCH 02/14] Editorial: Clarify what "the same Parse Node" means

---
 spec.html | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/spec.html b/spec.html
index cc5e943465..c764362ceb 100644
--- a/spec.html
+++ b/spec.html
@@ -490,6 +490,8 @@

The Syntactic Grammar

The syntactic grammar for ECMAScript is given in clauses 11, 12, 13, 14, and 15. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (). It defines a set of productions, starting from two alternative goal symbols |Script| and |Module|, that describe how sequences of tokens form syntactically correct independent components of ECMAScript programs.

When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

+

Each time a stream of code points is parsed, it produces Parse Nodes with a fresh identity. Two Parse Nodes are considered the same Parse Node if they have the same identity, that is, that they resulted from the same invocation of the grammar.

+ Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
eval(str); eval(str);
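
As an illustration of what Parse Node identity means for tagged templates (a sketch, not spec text): the `tag` and `site` helpers below are hypothetical names. A single call site evaluated twice yields the same template object, while two textually identical call sites are distinct Parse Nodes and, under the per-site semantics of this patch series, yield distinct template objects. An engine that still implements the prior by-contents semantics would report `true` for both comparisons.

```javascript
// Identity tag (hypothetical helper): returns the template object it receives.
function tag(strings) {
  return strings;
}

// One call site, evaluated twice: the same Parse Node both times.
function site() {
  return tag`hello ${0} world`;
}
const first = site();
const second = site();

// Two textually identical call sites: two distinct Parse Nodes.
const left = tag`hello ${0} world`;
const right = tag`hello ${0} world`;

// Expected under per-site caching: true, then false.
const sameSiteShared = first === second;
const distinctSitesShared = left === right;
console.log(sameSiteShared, distinctSitesShared);
```

The comparison uses object identity on purpose: the strings inside `left` and `right` are equal, so only the identity of the frozen template objects reveals which caching key the engine uses.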

Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ cannot be reparsed as an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the result of reparsing _P_ as an _N_". This will always be a Parse Node (an instance of _N_), since any parsing failure would have been detected by an early error rule.

From da16f4add59aa564a7ff8203fc354d1861149578 Mon Sep 17 00:00:00 2001
From: Daniel Ehrenberg
Date: Thu, 12 Oct 2017 20:58:24 +0200
Subject: [PATCH 03/14] Editorial changes from @bterlson and @jmdyck

---
 spec.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spec.html b/spec.html
index c764362ceb..5692aff283 100644
--- a/spec.html
+++ b/spec.html
@@ -490,7 +490,7 @@

The Syntactic Grammar

The syntactic grammar for ECMAScript is given in clauses 11, 12, 13, 14, and 15. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (). It defines a set of productions, starting from two alternative goal symbols |Script| and |Module|, that describe how sequences of tokens form syntactically correct independent components of ECMAScript programs.

When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

-

Each time a stream of code points is parsed, it produces Parse Nodes with a fresh identity. Two Parse Nodes are considered the same Parse Node if they have the same identity, that is, that they resulted from the same invocation of the grammar.

+

Each Parse Node is an instance of a symbol in the grammar; Parse Nodes represent spans of the source text that can be derived from that symbol. New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they are the same instance. Thus, two parse nodes are identical if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation. Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

eval(str); eval(str);

Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

From 0a4ed7eed7a8e7528a137340e014fbcd2ae7343a Mon Sep 17 00:00:00 2001
From: Daniel Ehrenberg
Date: Thu, 12 Oct 2017 23:17:58 +0200
Subject: [PATCH 04/14] Editorial: Fix two more editorial issues from @jmdyck

---
 spec.html | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/spec.html b/spec.html
index 5692aff283..1980c6c2c9 100644
--- a/spec.html
+++ b/spec.html
@@ -489,8 +489,8 @@

The Numeric String Grammar

The Syntactic Grammar

The syntactic grammar for ECMAScript is given in clauses 11, 12, 13, 14, and 15. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (). It defines a set of productions, starting from two alternative goal symbols |Script| and |Module|, that describe how sequences of tokens form syntactically correct independent components of ECMAScript programs.

When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

-

When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

-

Each Parse Node is an instance of a symbol in the grammar; Parse Nodes represent spans of the source text that can be derived from that symbol. New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they are the same instance. Thus, two parse nodes are identical if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.
+

When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; a Parse Node represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

+

New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they are the same instance. Thus, two parse nodes are identical if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation. Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

eval(str); eval(str);

Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

@@ -6126,7 +6126,7 @@

Realms

A List of Record { [[Site]]: Parse Node, [[Array]]: Object}.
- Template objects are canonicalized separately for each realm using its Realm Record's [[TemplateMap]]. Each [[Site]] value is a Parse Node containing a |TemplateLiteral|. The associated [[Array]] value is the corresponding template object that is passed to a tag function.
+ Template objects are canonicalized separately for each realm using its Realm Record's [[TemplateMap]]. Each [[Site]] value is a Parse Node that is a |TemplateLiteral|. The associated [[Array]] value is the corresponding template object that is passed to a tag function.
Once a Parse Node becomes unreachable, the corresponding [[Array]] is also unreachable, and it would be unobservable if an implementation removed the pair from the [[TemplateMap]] list.

From 320fb657cfe3887b9b43f0e9df05da66d05058a3 Mon Sep 17 00:00:00 2001
From: Daniel Ehrenberg
Date: Sat, 3 Feb 2018 02:17:59 +0100
Subject: [PATCH 05/14] Editorial: Switch to "covered" language per @allenwb

---
 spec.html | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/spec.html b/spec.html
index 1980c6c2c9..4998678057 100644
--- a/spec.html
+++ b/spec.html
@@ -494,7 +494,7 @@

The Syntactic Grammar

Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
eval(str); eval(str);

Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

-

In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ cannot be reparsed as an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the result of reparsing _P_ as an _N_". This will always be a Parse Node (an instance of _N_), since any parsing failure would have been detected by an early error rule.

+

In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering a _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_), since any parsing failure would have been detected by an early error rule. For a given Parse Node which is an instance of _P_, there is considered to be one unique Parse Node for _N_ which it covers.
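
To make the cover-grammar idea concrete, here is a small sketch (not spec text) that probes two generalized productions from script code. The `parses` helper is a hypothetical name; it relies only on `new Function` throwing a SyntaxError for unparsable source, and never calls the function it builds. The same token sequence is accepted or rejected depending on which supplemental nonterminal it has to cover.

```javascript
// Hypothetical helper: true if `source` parses as a function body,
// false if parsing raises a SyntaxError. The function is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// `{ a = 1 }` is parsed with the generalized |ObjectLiteral| production.
// It is acceptable only where it covers an |AssignmentPattern|.
const coversAssignmentPattern = parses("({ a = 1 } = {});");
const plainObjectLiteral = parses("({ a = 1 });");

// `(a, ...b)` is parsed as |CoverParenthesizedExpressionAndArrowParameterList|.
// It covers |ArrowFormalParameters| but not a |ParenthesizedExpression|.
const coversArrowParameters = parses("((a, ...b) => a);");
const plainParenthesized = parses("(a, ...b);");

console.log(coversAssignmentPattern, plainObjectLiteral);
console.log(coversArrowParameters, plainParenthesized);
```

In both pairs the rejected form fails through an early error rule of the kind this paragraph describes, rather than through the generalized production itself.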

@@ -11367,7 +11367,7 @@

Semantics

Static Semantics: CoveredParenthesizedExpression

CoverParenthesizedExpressionAndArrowParameterList : `(` Expression `)`
- 1. Return the result of reparsing |CoverParenthesizedExpressionAndArrowParameterList| as a |ParenthesizedExpression|.
+ 1. Return the |ParenthesizedExpression| that is covered by |CoverParenthesizedExpressionAndArrowParameterList|.
@@ -12211,7 +12211,7 @@

Static Semantics: Early Errors

PrimaryExpression : CoverParenthesizedExpressionAndArrowParameterList
- It is a Syntax Error if |CoverParenthesizedExpressionAndArrowParameterList| cannot be reparsed as a |ParenthesizedExpression|.
+ It is a Syntax Error if |CoverParenthesizedExpressionAndArrowParameterList| is not covering a |ParenthesizedExpression|.
• All Early Error rules for |ParenthesizedExpression| and its derived productions also apply to CoveredParenthesizedExpression of |CoverParenthesizedExpressionAndArrowParameterList|.
@@ -12329,7 +12329,7 @@

Static Semantics: CoveredCallExpression

CallExpression : CoverCallExpressionAndAsyncArrowHead
- 1. Return the result of reparsing |CoverCallExpressionAndAsyncArrowHead| as a |CallMemberExpression|.
+ 1. Return the |CallMemberExpression| that is covered by |CoverCallExpressionAndAsyncArrowHead|.
@@ -14215,7 +14215,7 @@

Static Semantics: Early Errors

AssignmentExpression : LeftHandSideExpression `=` AssignmentExpression
- It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and |LeftHandSideExpression| cannot be reparsed as an |AssignmentPattern|.
+ It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and |LeftHandSideExpression| is not covering an |AssignmentPattern|.
• It is an early Reference Error if |LeftHandSideExpression| is neither an |ObjectLiteral| nor an |ArrayLiteral| and IsValidSimpleAssignmentTarget of |LeftHandSideExpression| is *false*.
@@ -14363,7 +14363,7 @@

Static Semantics: Early Errors

DestructuringAssignmentTarget : LeftHandSideExpression
- It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| cannot be reparsed as an |AssignmentPattern|.
+ It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| is not covering an |AssignmentPattern|.
• It is a Syntax Error if |LeftHandSideExpression| is neither an |ObjectLiteral| nor an |ArrayLiteral| and IsValidSimpleAssignmentTarget(|LeftHandSideExpression|) is *false*.
@@ -16513,10 +16513,10 @@

Static Semantics: Early Errors

- It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| cannot be reparsed as an |AssignmentPattern|.
+ It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| is not covering an |AssignmentPattern|.
-

If |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| can be reparsed as an |AssignmentPattern| then the following rules are not applied. Instead, the Early Error rules for |AssignmentPattern| are used.

+

If |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if an |AssignmentPattern| is covered by |LeftHandSideExpression| then the following rules are not applied. Instead, the Early Error rules for |AssignmentPattern| are used.

• It is a Syntax Error if IsValidSimpleAssignmentTarget of |LeftHandSideExpression| is *false*.
@@ -18704,7 +18704,7 @@

Static Semantics: Early Errors

ArrowParameters : CoverParenthesizedExpressionAndArrowParameterList
- It is a Syntax Error if |CoverParenthesizedExpressionAndArrowParameterList| cannot be reparsed as an |ArrowFormalParameters|.
+ It is a Syntax Error if |CoverParenthesizedExpressionAndArrowParameterList| is not covering an |ArrowFormalParameters|.
• All early error rules for |ArrowFormalParameters| and its derived productions also apply to CoveredFormalsList of |CoverParenthesizedExpressionAndArrowParameterList|.
@@ -18815,7 +18815,7 @@

Static Semantics: CoveredFormalsList

`(` Expression `,` `...` BindingPattern `)`
- 1. Return the result of reparsing |CoverParenthesizedExpressionAndArrowParameterList| as an |ArrowFormalParameters|.
+ 1. Return the |ArrowFormalParameters| that is covered by |CoverParenthesizedExpressionAndArrowParameterList|.
@@ -20148,7 +20148,7 @@

Static Semantics: Early Errors

• It is a Syntax Error if |CoverCallExpressionAndAsyncArrowHead| Contains |YieldExpression| is *true*.
• It is a Syntax Error if |CoverCallExpressionAndAsyncArrowHead| Contains |AwaitExpression| is *true*.
- It is a Syntax Error if |CoverCallExpressionAndAsyncArrowHead| cannot be reparsed as an |AsyncArrowHead|.
+ It is a Syntax Error if |CoverCallExpressionAndAsyncArrowHead| is not covering an |AsyncArrowHead|.
• It is a Syntax Error if any element of the BoundNames of |CoverCallExpressionAndAsyncArrowHead| also occurs in the LexicallyDeclaredNames of |AsyncConciseBody|.
• It is a Syntax Error if ContainsUseStrict of |AsyncConciseBody| is *true* and IsSimpleParameterList of |CoverCallExpressionAndAsyncArrowHead| is *false*.
• All Early Error rules for |AsyncArrowHead| and its derived productions apply to CoveredAsyncArrowHead of |CoverCallExpressionAndAsyncArrowHead|.
@@ -20161,7 +20161,7 @@

Static Semantics: CoveredAsyncArrowHead

CoverCallExpressionAndAsyncArrowHead : MemberExpression Arguments
- 1. Return the result of reparsing |CoverCallExpressionAndAsyncArrowHead| as an |AsyncArrowHead|.
+ 1. Return the |AsyncArrowHead| that is covered by |CoverCallExpressionAndAsyncArrowHead|.

From 104da8bf84e3976fe50974b1d7990d7258ea03d1 Mon Sep 17 00:00:00 2001
From: Michael Dyck
Date: Sat, 3 Feb 2018 17:33:13 -0500
Subject: [PATCH 06/14] Editorial: change "a Parse Node" back to "it".

The sentence has two clauses separated by a semicolon. With the word "it", the second clause has two references ("it" and "that symbol") to the two nouns in the first clause ("Each Parse Node" and "a symbol in the grammar"). If you change "it" to "a Parse Node", it's no longer a reference to "Each Parse Node", so the second clause becomes confused about whether or not it's referring to things in the first clause.
---
 spec.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spec.html b/spec.html
index 4998678057..0a729b630e 100644
--- a/spec.html
+++ b/spec.html
@@ -489,7 +489,7 @@

              The Numeric String Grammar

              The Syntactic Grammar

              The syntactic grammar for ECMAScript is given in clauses 11, 12, 13, 14, and 15. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (). It defines a set of productions, starting from two alternative goal symbols |Script| and |Module|, that describe how sequences of tokens form syntactically correct independent components of ECMAScript programs.

              When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

              -

              When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; a Parse Node represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

              +

              When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

              New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they are the same instance. Thus, two parse nodes are identical if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation. Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

              eval(str); eval(str);

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

From 741c9fe7d06e1745530a2948f9909d1a0dc17b7f Mon Sep 17 00:00:00 2001
From: Michael Dyck
Date: Sat, 3 Feb 2018 17:41:25 -0500
Subject: [PATCH 07/14] Editorial: Pare down the defn of "same Parse Node"

... avoiding the distraction of "the same instance" and "identical".
---
 spec.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spec.html b/spec.html
index 0a729b630e..85994f63c7 100644
--- a/spec.html
+++ b/spec.html
@@ -490,7 +490,7 @@

              The Syntactic Grammar

              The syntactic grammar for ECMAScript is given in clauses 11, 12, 13, 14, and 15. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (). It defines a set of productions, starting from two alternative goal symbols |Script| and |Module|, that describe how sequences of tokens form syntactically correct independent components of ECMAScript programs.

              When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

              When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

              - New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they are the same instance. Thus, two parse nodes are identical if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.

              + New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.

              Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

              eval(str); eval(str);

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

              The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.
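Review note: the Parse Node identity rule in the hunk above is what the per-site template cache from PATCH 01 keys on. The sketch below is one way to observe it from script, assuming an engine that implements the per-site semantics (engines still on the earlier by-contents caching would report `true` for the eval case). The names `tag`, `sameSite`, and `str` are illustrative only and do not come from the patch.

```javascript
// Hypothetical tag function: returns the template object it receives.
function tag(strings) {
  return strings;
}

// A single call site that is evaluated on every call to sameSite().
function sameSite() {
  return tag`hello`;
}

// Same Parse Node both times, so the cached template object is reused.
const sameSiteShared = sameSite() === sameSite();

// Two direct evals of the same String are two parser invocations,
// so each TemplateLiteral is a distinct Parse Node (a distinct site).
const str = "tag`hello`";
const evalA = eval(str);
const evalB = eval(str);
const evalShared = evalA === evalB;

console.log({ sameSiteShared, evalShared });
```

Direct `eval` is used so that `tag` resolves in the surrounding scope; the comparison is on object identity only, since both template objects hold the same strings.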

              From 3afeecbde94a5fc3de202eccb4801706e2a49bc1 Mon Sep 17 00:00:00 2001 From: Michael Dyck Date: Sat, 3 Feb 2018 17:45:57 -0500 Subject: [PATCH 08/14] Editorial: "a _N_" -> "an _N_" ... for ease of reading. --- spec.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec.html b/spec.html index 85994f63c7..b93de7b84d 100644 --- a/spec.html +++ b/spec.html @@ -494,7 +494,7 @@

              The Syntactic Grammar

              Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
              eval(str); eval(str);

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

              The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

              -

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering a _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_), since any parsing failure would have been detected by an early error rule. For a given Parse Node which is an instance of _P_, there is considered to be one unique Parse Node for _N_ which it covers.

              +

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_), since any parsing failure would have been detected by an early error rule. For a given Parse Node which is an instance of _P_, there is considered to be one unique Parse Node for _N_ which it covers.

              From d0578822c4fef892acefaf9078ae63e447fa2a0b Mon Sep 17 00:00:00 2001 From: Michael Dyck Date: Sat, 3 Feb 2018 18:16:27 -0500 Subject: [PATCH 09/14] Editorial: rework the "unique Parse Node which it covers" sentence. You can't say "a Parse Node which is an instance of _P_": _P_ *is* a Parse Node. Also, as a separate sentence, you might have to say "at most one" rather than "one". In trying to rework the sentence, I decided it was easier to drop it, and instead insert "unique for a given _P_" into the previous sentence. --- spec.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec.html b/spec.html index b93de7b84d..dcdb26b95b 100644 --- a/spec.html +++ b/spec.html @@ -494,7 +494,7 @@

              The Syntactic Grammar

              Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
              eval(str); eval(str);

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

              The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

              -

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_), since any parsing failure would have been detected by an early error rule. For a given Parse Node which is an instance of _P_, there is considered to be one unique Parse Node for _N_ which it covers.

              +

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_, unique for a given _P_), since any parsing failure would have been detected by an early error rule.
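Review note: a rough way to see the cover grammar described above from script is to check which sources an engine accepts. This is a sketch only: `parses` is a hypothetical helper, `new Function` is used merely to trigger parsing (the code is never run, so `obj` need not exist), and the comments describe the general mechanism rather than naming the exact early error rule an engine applies.

```javascript
// Hypothetical helper: true if `source` parses as a function body,
// false if the engine rejects it with a SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const results = {
  // An ordinary object literal.
  objectLiteral: parses("({ a: 1, b: 2 });"),
  // The same generalized production, here covering an AssignmentPattern.
  pattern: parses("({ a, b } = obj);"),
  // `a = 1` is permitted by the generalized production and valid in a pattern...
  initializerInPattern: parses("({ a = 1 } = obj);"),
  // ...but rejected when the tokens remain a plain object literal.
  initializerInLiteral: parses("({ a = 1 });"),
  // A method definition cannot be reparsed as part of an AssignmentPattern.
  methodInPattern: parses("({ a() {} } = obj);"),
};

console.log(results);
```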

              From 9e424e42497dda1e6583641136bf315a37af25ef Mon Sep 17 00:00:00 2001 From: Michael Dyck Date: Sat, 3 Feb 2018 17:54:30 -0500 Subject: [PATCH 10/14] Editorial: change remaining occurrences of "reparsing" "the result of reparsing X as a Y" -> "the Y that is covered by X" --- spec.html | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/spec.html b/spec.html index dcdb26b95b..f71b310ffe 100644 --- a/spec.html +++ b/spec.html @@ -14284,7 +14284,7 @@

              Runtime Semantics: Evaluation

                1. If _hasNameProperty_ is *false*, perform SetFunctionName(_rval_, GetReferencedName(_lref_)).
                1. Perform ? PutValue(_lref_, _rval_).
                1. Return _rval_.
              - 1. Let _assignmentPattern_ be the result of reparsing |LeftHandSideExpression| as an |AssignmentPattern|.
              + 1. Let _assignmentPattern_ be the |AssignmentPattern| that is covered by |LeftHandSideExpression|.
                1. Let _rref_ be the result of evaluating |AssignmentExpression|.
                1. Let _rval_ be ? GetValue(_rref_).
                1. Perform ? DestructuringAssignmentEvaluation of _assignmentPattern_ using _rval_ as the argument.

              @@ -14524,7 +14524,7 @@

              Runtime Semantics: IteratorDestructuringAssignmentEvaluation

                1. Let _v_ be ? GetValue(_defaultValue_).
                1. Else, let _v_ be _value_.
                1. If |DestructuringAssignmentTarget| is an |ObjectLiteral| or an |ArrayLiteral|, then
              - 1. Let _nestedAssignmentPattern_ be the result of reparsing |DestructuringAssignmentTarget| as an |AssignmentPattern|.
              + 1. Let _nestedAssignmentPattern_ be the |AssignmentPattern| that is covered by |DestructuringAssignmentTarget|.
                1. Return the result of performing DestructuringAssignmentEvaluation of _nestedAssignmentPattern_ with _v_ as the argument.
                1. If |Initializer| is present and _value_ is *undefined* and IsAnonymousFunctionDefinition(|Initializer|) and IsIdentifierRef of |DestructuringAssignmentTarget| are both *true*, then
                1. Let _hasNameProperty_ be ? HasOwnProperty(_v_, `"name"`).

              @@ -14555,7 +14555,7 @@

              Runtime Semantics: IteratorDestructuringAssignmentEvaluation

                1. Increment _n_ by 1.
                1. If |DestructuringAssignmentTarget| is neither an |ObjectLiteral| nor an |ArrayLiteral|, then
                1. Return ? PutValue(_lref_, _A_).
              - 1. Let _nestedAssignmentPattern_ be the result of reparsing |DestructuringAssignmentTarget| as an |AssignmentPattern|.
              + 1. Let _nestedAssignmentPattern_ be the |AssignmentPattern| that is covered by |DestructuringAssignmentTarget|.
                1. Return the result of performing DestructuringAssignmentEvaluation of _nestedAssignmentPattern_ with _A_ as the argument.

              @@ -14575,7 +14575,7 @@

              Runtime Semantics: KeyedDestructuringAssignmentEvaluation

                1. Let _rhsValue_ be ? GetValue(_defaultValue_).
                1. Else, let _rhsValue_ be _v_.
                1. If |DestructuringAssignmentTarget| is an |ObjectLiteral| or an |ArrayLiteral|, then
              - 1. Let _assignmentPattern_ be the result of reparsing |DestructuringAssignmentTarget| as an |AssignmentPattern|.
              + 1. Let _assignmentPattern_ be the |AssignmentPattern| that is covered by |DestructuringAssignmentTarget|.
                1. Return the result of performing DestructuringAssignmentEvaluation of _assignmentPattern_ with _rhsValue_ as the argument.
                1. If |Initializer| is present and _v_ is *undefined* and IsAnonymousFunctionDefinition(|Initializer|) and IsIdentifierRef of |DestructuringAssignmentTarget| are both *true*, then
                1. Let _hasNameProperty_ be ? HasOwnProperty(_rhsValue_, `"name"`).

              @@ -16824,7 +16824,7 @@

              Runtime Semantics: ForIn/OfBodyEvaluation ( _lhs_, _stmt_, _iteratorRecord_,

                1. Let _destructuring_ be IsDestructuring of _lhs_.
                1. If _destructuring_ is *true* and if _lhsKind_ is ~assignment~, then
                1. Assert: _lhs_ is a |LeftHandSideExpression|.
              - 1. Let _assignmentPattern_ be the result of reparsing _lhs_ as an |AssignmentPattern|.
              + 1. Let _assignmentPattern_ be the |AssignmentPattern| that is covered by _lhs_.
                1. Repeat,
                1. Let _nextResult_ be ? IteratorStep(_iteratorRecord_).
                1. If _nextResult_ is *false*, return NormalCompletion(_V_).

              From 821bf1e0e84f242357eab08a03e34ae2e65e8f68 Mon Sep 17 00:00:00 2001 From: Michael Dyck Date: Sat, 3 Feb 2018 18:03:23 -0500 Subject: [PATCH 11/14] Editorial: passive voice to active voice Early Error rules say "X is [not] covering a Y". Runtime algs say "the Y that is covered by X". --- spec.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec.html b/spec.html index f71b310ffe..0bd52cd70d 100644 --- a/spec.html +++ b/spec.html @@ -16516,7 +16516,7 @@

              Static Semantics: Early Errors

              It is a Syntax Error if |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| is not covering an |AssignmentPattern|.

            -

            If |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if an |AssignmentPattern| is covered by |LeftHandSideExpression| then the following rules are not applied. Instead, the Early Error rules for |AssignmentPattern| are used.

            +

            If |LeftHandSideExpression| is either an |ObjectLiteral| or an |ArrayLiteral| and if |LeftHandSideExpression| is covering an |AssignmentPattern| then the following rules are not applied. Instead, the Early Error rules for |AssignmentPattern| are used.

            • It is a Syntax Error if IsValidSimpleAssignmentTarget of |LeftHandSideExpression| is *false*.

              From af4517734397fffe18300baaaa7593b32192da8d Mon Sep 17 00:00:00 2001 From: Daniel Ehrenberg Date: Sun, 4 Feb 2018 22:05:34 +0100 Subject: [PATCH 12/14] Editorial: Note that parse trees and parse nodes are abstract and unobservable This is an attempt to include the clarification requested by @allenwb --- spec.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec.html b/spec.html index 0bd52cd70d..a9886c091a 100644 --- a/spec.html +++ b/spec.html @@ -491,7 +491,7 @@

              The Syntactic Grammar

              When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

              When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

              New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.

              - Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

              eval(str); eval(str);
              + Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
              eval(str); eval(str);
              . Parse trees and Parse Nodes are abstract and unobservable; implementations might not actually construct them in memory.

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

              The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_, unique for a given _P_), since any parsing failure would have been detected by an early error rule.

              From 4846429cb722f4db5878857e0787de59e5325dee Mon Sep 17 00:00:00 2001 From: Daniel Ehrenberg Date: Sun, 4 Feb 2018 23:22:34 +0100 Subject: [PATCH 13/14] Editorial: Switch to @bterlson's wording for abstract parse trees --- spec.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec.html b/spec.html index a9886c091a..8b15a5c7f3 100644 --- a/spec.html +++ b/spec.html @@ -491,7 +491,7 @@

              The Syntactic Grammar

              When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

              When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

              New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.

              - Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

              eval(str); eval(str);
              . Parse trees and Parse Nodes are abstract and unobservable; implementations might not actually construct them in memory.

              + Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
              eval(str); eval(str);
              . Note that parse trees are specification types and implementations are not required to use an analogous data structure.

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

              The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_, unique for a given _P_), since any parsing failure would have been detected by an early error rule.

              From e2fd05e362c3b012a1a372756353d7bc32ca2402 Mon Sep 17 00:00:00 2001 From: Daniel Ehrenberg Date: Mon, 5 Feb 2018 00:06:42 +0100 Subject: [PATCH 14/14] Editorial: Tweak Parse Node note as suggested by @jmdyck --- spec.html | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/spec.html b/spec.html index 8b15a5c7f3..39db40867c 100644 --- a/spec.html +++ b/spec.html @@ -491,7 +491,8 @@

              The Syntactic Grammar

              When a stream of code points is to be parsed as an ECMAScript |Script| or |Module|, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (|Script| or |Module|), with no tokens left over.

              When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's goal symbol. When a Parse Node is an instance of a nonterminal, it is also an instance of some production that has that nonterminal as its left-hand side. Moreover, it has zero or more children, one for each symbol on the production's right-hand side: each child is a Parse Node that is an instance of the corresponding symbol.

              New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.

              - Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:

              eval(str); eval(str);
              . Note that parse trees are specification types and implementations are not required to use an analogous data structure.

              + Parsing the same String multiple times will lead to different Parse Nodes, e.g., as occurs in:
              eval(str); eval(str);
              .
              + Parse Nodes are specification artefacts, and implementations are not required to use an analogous data structure.

              Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

              The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript |Script| or |Module|. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.

              In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript |Script| or |Module|. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. Typically, an early error rule will then define an error condition if "_P_ is not covering an _N_", where _P_ is a Parse Node (an instance of the generalized production) and _N_ is a nonterminal from the supplemental grammar. Here, the sequence of tokens originally matched by _P_ is parsed again using _N_ as the goal symbol. (If _N_ takes grammatical parameters, then they are set to the same values used when _P_ was originally parsed.) An error occurs if the sequence of tokens cannot be parsed as a single instance of _N_, with no tokens left over. Subsequently, algorithms access the result of the parse using a phrase of the form "the _N_ that is covered by _P_". This will always be a Parse Node (an instance of _N_, unique for a given _P_), since any parsing failure would have been detected by an early error rule.