Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix attribute strings spec to match implementations #133

Closed
wants to merge 1 commit into from

Conversation

strager
Copy link

@strager strager commented Dec 18, 2021

According to my understanding of the JSX specification, the following
code should define one attribute whose name is attr and whose value is
hello/*"*/world:

function MyComponent() {
  return <div attr="hello/*"*/world" />;
}

However, I found no implementation which parses the code this way. All
implementations treat /* as part of the JSXSingleStringCharacters
rule rather than the beginning of a block comment.

Disallow parsing Comment and WhiteSpace rules between elements of
JSXSingleStringCharacters, etc. This makes the specification match
implementations.

Implementations tested, all of which agree with the modified grammar:

  • @babel/parser version 7.15.3
  • Flow version 0.144.0
  • TypeScript version 4.0.3
  • acorn version 8.3.0
  • espree version 6.2.1

According to my understanding of the JSX specification, the following
code should define one attribute whose name is `attr` and whose value is
`hello/*"*/world`:

    function MyComponent() {
      return <div attr="hello/*"*/world" />;
    }

However, I found no implementation which parses the code this way. All
implementations treat `/*` as part of the `JSXSingleStringCharacters`
rule rather than the beginning of a block comment.

Disallow parsing `Comment` and `WhiteSpace` rules between elements of
`JSXSingleStringCharacters`, etc. This makes the specification match
implementations.

Implementations tested, all of which agree with the modified grammar:

* @babel/parser version 7.15.3
* Flow version 0.144.0
* TypeScript version 4.0.3
* acorn version 8.3.0
* espree version 6.2.1
@facebook-github-bot
Copy link
Contributor

Hi @strager!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@wooorm
Copy link

wooorm commented Jan 16, 2022

The text contains:

Whitespace and Comments

JSX uses the same punctuators and braces as ECMAScript. WhiteSpace, LineTerminators and Comments are generally allowed between any punctuators.

There are several other cases where the text is theoretically ambiguous (e.g., comments or whitespace inside inside identifiers?).

But to me I don’t see any reason to assume that comments or whitespace would be allowed in a group of characters?


It seems a bit too over the top to add that big bold capitalized “NO WHITESPACE OR COMMENT” everywhere?

@nicolo-ribaudo
Copy link

If we want to make the spec "correct" as an extension to the ECMAScript spec, the solution is to use JSXDoubleStringCharacters :
: instead of JSXDoubleStringCharacters :
 (two :s), to make it part of the "lexical grammar" which disallows comments/spaces: https://tc39.es/ecma262/#sec-lexical-and-regexp-grammars

@strager
Copy link
Author

strager commented Jan 16, 2022

There are several other cases where the text is theoretically ambiguous (e.g., comments or whitespace inside inside identifiers?).

JSXIdentifier excludes whitespace and comments explicitly.

It seems a bit too over the top to add that big bold capitalized “NO WHITESPACE OR COMMENT” everywhere?

I was following the lead of JSXIdentifier.

If we want to make the spec "correct" as an extension to the ECMAScript spec, the solution is to use JSXDoubleStringCharacters :
: instead of JSXDoubleStringCharacters :
 (two :s), to make it part of the "lexical grammar" which disallows comments/spaces: https://tc39.es/ecma262/#sec-lexical-and-regexp-grammars

This sounds like a great solution!

@wooorm
Copy link

wooorm commented Jan 17, 2022

@strager Sorry, you are right that JSXIdentifier does! JSXText doesn’t.

Huxpro added a commit to Huxpro/jsx that referenced this pull request Feb 24, 2022
Let's be faithful to the de-facto and capture the HTML semantics
to the spec. I haven't seen any practices specifying transipilers
via ECMA-262 so this was a bit challenging may seems foreign.

I ended up extending `Static Semantics: SV` to make it concise and
(hopefully) accurate enough. I think this is a cool hack ;)

- Introduced `JSXStringCharacter` and `JSXString`
- Fix facebook#133 by using `::` to make problematic grammars lexcial

I intentionally making the set of supported HTML entities
implementation-defined to allow either HTML4 or HTML5 set but
this is open to discuss.
Huxpro added a commit to Huxpro/jsx that referenced this pull request Feb 24, 2022
Let's be faithful to the de-facto and capture the HTML semantics
to the spec. I haven't seen any practices specifying transipilers
via ECMA-262 so this was a bit challenging may seems foreign.

I ended up extending `Static Semantics: SV` to make it concise and
(hopefully) accurate enough. I think this is a cool hack ;)

- Introduced `JSXStringCharacter` and `JSXString`
- Fix facebook#133 by using `::` to make problematic grammars lexcial

I intentionally making the set of supported HTML entities
implementation-defined to allow either HTML4 or HTML5 set but
this is open to discuss.
Huxpro added a commit to Huxpro/jsx that referenced this pull request Feb 24, 2022
Let's be faithful to the de-facto and capture the HTML semantics
to the spec. I haven't seen any practices specifying transipilers
via ECMA-262 so this was a bit challenging may seems foreign.

I ended up extending `Static Semantics: SV` to make it concise and
(hopefully) accurate enough. I think this is a cool hack ;)

- Introduced `JSXStringCharacter` and `JSXString`
- Fix facebook#133 by using `::` to make problematic grammars lexcial

I intentionally making the set of supported HTML entities
implementation-defined to allow either HTML4 or HTML5 set but
this is open to discuss.
Huxpro added a commit to Huxpro/jsx that referenced this pull request Feb 24, 2022
Let's be faithful to the de-facto and capture the HTML semantics
to the spec. I haven't seen any practices specifying transipilers
via ECMA-262 so this was a bit challenging may seems foreign.

I ended up extending `Static Semantics: SV` to make it concise and
(hopefully) accurate enough. I think this is a cool hack ;)

- Introduced `JSXStringCharacter` and `JSXString`
- Fix facebook#133 by using `::` to make problematic grammars lexcial

I intentionally making the set of supported HTML entities
implementation-defined to allow either HTML4 or HTML5 set but
this is open to discuss.
Huxpro added a commit to Huxpro/jsx that referenced this pull request Feb 28, 2022
Let's be faithful to the de-facto and capture the HTML semantics
to the spec. I haven't seen any practices specifying transipilers
via ECMA-262 so this was a bit challenging may seems foreign.

I ended up extending `Static Semantics: SV` to make it concise and
(hopefully) accurate enough. I think this is a cool hack ;)

- Introduced `JSXStringCharacter` and `JSXString`
- Fix facebook#133 by using `::` to make problematic grammars lexcial

I intentionally making the set of supported HTML entities
implementation-defined to allow either HTML4 or HTML5 set but
this is open to discuss.
Huxpro added a commit to Huxpro/jsx that referenced this pull request Feb 28, 2022
Let's be faithful to the de-facto and capture the HTML semantics
to the spec. I haven't seen any practices specifying transipilers
via ECMA-262 so this was a bit challenging may seems foreign.

I ended up extending `Static Semantics: SV` to make it concise and
(hopefully) accurate enough. I think this is a cool hack ;)

- Introduced `JSXStringCharacter` and `JSXString`
- Fix facebook#133 by using `::` to make problematic grammars lexcial

I intentionally making the set of supported HTML entities
implementation-defined to allow either HTML4 or HTML5 set but
this is open to discuss.
@Huxpro Huxpro closed this in #136 Mar 1, 2022
Huxpro added a commit that referenced this pull request Mar 1, 2022
## Summary

Let's be faithful to the de-facto and document the HTML entity behaviors to the spec. Note that this is not about whether we should "drop this semantics or not", but about documenting the current behaviors that everyone has been living with for years.

### The Proposed Normative Change

I'm not aware of any practices specifying such transpiler/transform semantics in ECMA-262 so this is a really interesting attempt 🙂 So I ended up extending `Static Semantics: SV` which is the smartest way I can find to hack the semantics into the ECMA-262 spec. I think this should work and should be accurate enough. I'm curious on how implementors think about it though.

<del>I also intentionally left the set of supported HTML entities implementation-defined to allow either HTML4 or HTML5 set. This may be seen as a breaking change in some regard and **this is open to discuss here**. </del> We've reached consensus that only HTML4 entities are allowed.

This commit also close #133 by using `::` for characters which are supposed to be lexical grammars.

Close #126
Close #4

## Test Plan

open `index.html` and proof-read the spec ;)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants