-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normative: Add Intl.Segmenter #553
Conversation
gibson042
commented
Mar 10, 2021
- @leobalter
Should the "2 Conformance" https://tc39.es/ecma402/#conformance also be changed for Intl.Segmenter? |
05e9865
to
e767be7
Compare
I suggest you also add
to the end of normative-references.html |
Thank you @FrankYFTang. I have included UAX 29, but not CLDR or ICU because references to those are intentionally non-normative. |
@ryzokuken Could I get a review from you as well? |
@gibson042 absolutely! Sorry for the delay, but I thought this would be an even larger diff. On it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, with a few comments inline. Two main comments:
- Would you like to switch to the new structured arguments syntax?
- The AOs are interleaved all over the spec and I don't see a compelling reason for that. Would you mind putting the AO section at the top with all the AOs in it?
1. Let _internalSlotsList_ be « [[InitializedSegmenter]], [[Locale]], [[SegmenterGranularity]] ». | ||
1. Let _segmenter_ be ? OrdinaryCreateFromConstructor(NewTarget, *"%Segmenter.prototype%"*, _internalSlotsList_). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why this is a single-use variable and not inline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're referring internalSlotsList, it's for the value of naming what is otherwise a potentially confusing argument (especially when the list of slots exceeds three or four items) using a convention that is already fairly common throughout ECMA-262 and ECMA-402.
I would actually support a followup making it universal in ECMA-402, ideally using a table-based approach like OrdinaryFunctionCreate and BoundFunctionCreate and ModuleNamespaceCreate:
1. Let _internalSlotsList_ be « [[InitializedSegmenter]], [[Locale]], [[SegmenterGranularity]] ». | |
1. Let _segmenter_ be ? OrdinaryCreateFromConstructor(NewTarget, *"%Segmenter.prototype%"*, _internalSlotsList_). | |
1. Let _internalSlotsList_ be the internal slots listed in <emu-xref href="#table-segment-iterator-instance-slots"></emu-xref>. | |
1. Let _segmenter_ be ? OrdinaryCreateFromConstructor(NewTarget, *"%Segmenter.prototype%"*, _internalSlotsList_). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Captured at #649
</emu-clause> | ||
</emu-clause> | ||
|
||
<emu-clause id="sec-intl-segmenter-abstracts"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ECMA-402, the abstract operations for any object are at the top, before the constructor. Would you like to reorder these to be more consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with CreateSegmentsObject, I think positioning operations below their use makes for better (specifically more linear) consumption of the spec. I'm willing to change this, but would rather update other sections to follow its pattern. Thoughts?
</p> | ||
|
||
<p> | ||
The value of the [[RelevantExtensionKeys]] internal slot is « ». |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this would change at some point? If not, why not just hardcode the empty list in the constructor and get rid of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed that other constructors follow this pattern too, so you're definitely in the right here. Just generally curious if this is better than just hardcoding an empty list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better than hardcoding, because https://tc39.es/ecma402/#sec-internal-slots documents a [[RelevantExtensionKeys]] for each service constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
</ul> | ||
|
||
<emu-clause id="sec-%segmentsprototype%.containing"> | ||
<h1>%SegmentsPrototype%.containing ( _index_ )</h1> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this and not Segmenter.prototype.containing
like everywhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containing
is not a method of Segmenter instances (the values returned from new Intl.Segmenter(…)
, whose prototype is %Segmenter.prototype%, which can be accessed as the initial value of Intl.Segmenter.prototype), but rather of Segments instances (the values returned from segmenter.segment(str)
, whose prototype is the different object %SegmentsPrototype%, for which there is no simple access path available to ECMAScript code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I mistyped, I meant Segments.prototype.containing
, didn't know that this wasn't valid.
A Segments instance is an object that represents the segments of a specific string, subject to the locale and options of its constructing Intl.Segmenter instance. | ||
</p> | ||
|
||
<emu-clause id="sec-createsegmentsobject" aoid="CreateSegmentsObject"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit unusual that this AO it placed here, away from all other AOs, is there a strong reason to prefer this over the opposite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a lot of value in positioning Create<Type> AOs as the first subsection of their <Type> section and following them with the description of the prototype they use, supporting a linear reading of the specification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to switch to the new structured arguments syntax?
Yes, but I think I'd prefer that change after merge, to minimize divergence from the text of the proposal (which preceded structured headers).
The AOs are interleaved all over the spec and I don't see a compelling reason for that. Would you mind putting the AO section at the top with all the AOs in it?
How strongly do you feel about this? I have a fairly strong preference for this pattern (with the compelling reason being "better readability for linear spec consumption"), and would like to update the rest of ECMA-402 accordingly as soon as possible. But I'm willing to revert this to the existing pattern if you think that will be contentious or take too long.
1. Let _internalSlotsList_ be « [[InitializedSegmenter]], [[Locale]], [[SegmenterGranularity]] ». | ||
1. Let _segmenter_ be ? OrdinaryCreateFromConstructor(NewTarget, *"%Segmenter.prototype%"*, _internalSlotsList_). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're referring internalSlotsList, it's for the value of naming what is otherwise a potentially confusing argument (especially when the list of slots exceeds three or four items) using a convention that is already fairly common throughout ECMA-262 and ECMA-402.
I would actually support a followup making it universal in ECMA-402, ideally using a table-based approach like OrdinaryFunctionCreate and BoundFunctionCreate and ModuleNamespaceCreate:
1. Let _internalSlotsList_ be « [[InitializedSegmenter]], [[Locale]], [[SegmenterGranularity]] ». | |
1. Let _segmenter_ be ? OrdinaryCreateFromConstructor(NewTarget, *"%Segmenter.prototype%"*, _internalSlotsList_). | |
1. Let _internalSlotsList_ be the internal slots listed in <emu-xref href="#table-segment-iterator-instance-slots"></emu-xref>. | |
1. Let _segmenter_ be ? OrdinaryCreateFromConstructor(NewTarget, *"%Segmenter.prototype%"*, _internalSlotsList_). |
</p> | ||
|
||
<p> | ||
The value of the [[RelevantExtensionKeys]] internal slot is « ». |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better than hardcoding, because https://tc39.es/ecma402/#sec-internal-slots documents a [[RelevantExtensionKeys]] for each service constructor.
A Segments instance is an object that represents the segments of a specific string, subject to the locale and options of its constructing Intl.Segmenter instance. | ||
</p> | ||
|
||
<emu-clause id="sec-createsegmentsobject" aoid="CreateSegmentsObject"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a lot of value in positioning Create<Type> AOs as the first subsection of their <Type> section and following them with the description of the prototype they use, supporting a linear reading of the specification.
</ul> | ||
|
||
<emu-clause id="sec-%segmentsprototype%.containing"> | ||
<h1>%SegmentsPrototype%.containing ( _index_ )</h1> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containing
is not a method of Segmenter instances (the values returned from new Intl.Segmenter(…)
, whose prototype is %Segmenter.prototype%, which can be accessed as the initial value of Intl.Segmenter.prototype), but rather of Segments instances (the values returned from segmenter.segment(str)
, whose prototype is the different object %SegmentsPrototype%, for which there is no simple access path available to ECMAScript code).
</emu-clause> | ||
</emu-clause> | ||
|
||
<emu-clause id="sec-intl-segmenter-abstracts"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with CreateSegmentsObject, I think positioning operations below their use makes for better (specifically more linear) consumption of the spec. I'm willing to change this, but would rather update other sections to follow its pattern. Thoughts?
Honestly, I'm not sure anymore. Your rationale sounds great to me, but there's also the benefit of keeping things organized in a single location, maybe this is especially notable here because the proposal introduces three types simultaneously. Perhaps the better judge of this would be the implementers? @FrankYFTang @Constellation does this new pattern of organizing AOs seem helpful to you? Do you think it correctly corresponds to how you consume the spec? If so, I have no reservations against it. @gibson042 should we also solicit feedback from the 262 editors here? |
Possibly, but 262 itself is inconsistent (e.g., Map follows the pattern I'm proposing here but ArrayBuffer is more like the rest of 402, and WeakRef is in between them). |
@gibson042 can you please make a PR for the editorial changes to 402 so I can merge this? |
@ryzokuken done: #650 |
Proposal moved to stage 4, and was merged in Ecma402: tc39/proposals@24bdab7 tc39/ecma402#553
* Drop Intl.displayNames v2 proposal, merged in Ecma402 Proposal reached stage 4, was integrated in Ecma402 and now returns a 404: tc39/proposals@f4378e1 tc39/ecma402#622 * Drop Intl.Segmenter proposal, merged in Ecma402 Proposal moved to stage 4, and was merged in Ecma402: tc39/proposals@24bdab7 tc39/ecma402#553