-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for "hasIndices" RegExp flag #968
Conversation
if (global) | ||
result.push_back('g'); | ||
if (ignoreCase) | ||
result.push_back('i'); | ||
if (multiline) | ||
result.push_back('m'); | ||
if (dotAll) | ||
result.push_back('s'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did the order change? Do you think this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just to align the order with ES2022 specification here.
@@ -318,6 +318,8 @@ STR(multiline, "multiline") | |||
STR(unicode, "unicode") | |||
STR(sticky, "sticky") | |||
STR(dotAll, "dotAll") | |||
STR(indices, "indices") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there two definitions needed indicates
and hasIndicates
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/VM/JSLib/RegExp.cpp
Outdated
Handle<JSArray> indices) { | ||
// If there are no capture groups, then set groups to undefined. | ||
if (!mappingObj) { | ||
auto executionStatus = JSObject::defineNewOwnProperty( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should refrain from using defineNewOwnProperty
. You can only use this API when you know for certain that the property you're adding does not exist on the current object. To be safe, use defineOwnProperty
.
lib/VM/JSLib/RegExp.cpp
Outdated
Predefined::getSymbolID(Predefined::groups), | ||
PropertyFlags::defaultNewNamedPropertyFlags(), | ||
Runtime::getUndefinedValue()); | ||
assert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These assert
s should not be written this way. You should check the result of the defineProperty
and return in an error in a regular if
. Also remember to wrap these in the appropriate LLVM_(UN)LIKELY
For example,
if (LLVM_UNLIKELY(JSObject::defineNewOwnProperty(...) == ExecutionStatus::EXCEPTION)){
return ExecutionStatus::EXCEPTION;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do this, but since my createIndicesArray
is a void
method this approach wouldn't work, so I followed the same way was done to createGroupsObject
which is also a void
method and uses assert
. What's your advice? Change the method's signature to return something like CallResult<void>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You change the type to ExecutionStatus
.
lib/VM/JSLib/RegExp.cpp
Outdated
indices, | ||
runtime, | ||
Predefined::getSymbolID(Predefined::groups), | ||
PropertyFlags::defaultNewNamedPropertyFlags(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be getDefaultNewPropertyFlags
.
lib/VM/JSLib/RegExp.cpp
Outdated
matchObj, | ||
runtime, | ||
Predefined::getSymbolID(Predefined::indices), | ||
PropertyFlags::defaultNewNamedPropertyFlags(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto-getDefaultNewPropertyFlags
.
lib/VM/JSLib/RegExp.cpp
Outdated
@@ -595,6 +658,13 @@ CallResult<Handle<JSArray>> directRegExpExec( | |||
auto inputSHV = SmallHermesValue::encodeStringValue(*S, runtime); | |||
JSObject::setNamedSlotValueUnsafe(*A, runtime, inputDesc, inputSHV); | |||
|
|||
// Create indices array. | |||
auto indicesRes = JSArray::create(runtime, match.size(), match.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you create this unconditionally? This will only ever be used if hasIndices
is true, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -2051,7 +2051,6 @@ | |||
"import-assertions", | |||
"json-superset", | |||
"new.target", | |||
"regexp-match-indices", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good to test to spec, but it should not be the only source of testing. Add some examples of using the d
flag and making sure it's working in some basic tests cases in our lit tests. You can add these cases to test/hermes/regexp.js
. For instance, I haven't had time to debug this further, but I am getting an error when running this code on top of your changes:
let re = /(?<keyword>let|const|var)\s+(?<id>[a-zA-Z_$][0-9a-zA-Z_$]*)/d;
let match = re.exec("let myVar");
print((match.indices.groups));
Something funny is happening here since if I instead change the line to the following:
print(JSON.stringify((match.indices.groups)));
everything works fine.
edit- actually I figured out what's going on here. The root cause is outside of the scope of your changes, so you don't have to address them. But there should still be some cases added to our lit tests.
If you're curious, the problem is because ordinaryToPrimitive
in Operations.cpp
will try to lookup either toString
or valueOf
on an object to convert it to something printable. But since you defined the indices.groups
object with a null parent, it's impossible to find either of these functions on the prototype chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I was going to say that, this behavior also happens to match.groups
because it's a null-prototype object just like match.indices.groups
is.
This issue also happens in other engines if you do like this for example (although they'll work if you just console.log
the variable without string conversion):
let re = /(?<keyword>let|const|var)\s+(?<id>[a-zA-Z_$][0-9a-zA-Z_$]*)/d;
let match = re.exec("let myVar");
console.log('match.groups ' + match.groups); // Uncaught TypeError: Cannot convert object to primitive value
console.log('match.indices.groups ' + match.indices.groups); // Uncaught TypeError: Cannot convert object to primitive value
Some references:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec#return_value
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object#null-prototype_objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the testing, I'll implement some tests as requested 👍 . Should I implement tests for this "null-prototype object" case too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I don't think I would classify this as a bug. JSC and SpiderMonkey both exhibit the same behavior as Hermes in this case. You can test this by trying to print out the expression Object.create(null)
in the different engines.
You should not write tests for this case.
lib/VM/JSLib/RegExp.cpp
Outdated
Handle<JSObject> mappingObj, | ||
Handle<JSArray> indices) { | ||
// If there are no capture groups, then set groups to undefined. | ||
if (!mappingObj) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small nit- is there a point to flipping this conditional if both cases are written out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was just preference I guess, I can flip it if you like :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again it's a nit, but I personally feel like it would be cleaner without the negation.
Overall the code is looking good so far, thanks for taking this on :) |
This PR is still in a draft state. Please change the status of it so the rest of the team can review as well. |
Hi @fbmal7, I addressed all the changes requested and changed the PR status. |
Actually, I think this code could be cleaner if we consolidate the all the work related to populating the indices object into one method. This way we can also make the code read closer to the spec. We can also address the point about eagerly initializing And instead, move all that logic into a new method |
lib/VM/JSLib/RegExp.cpp
Outdated
return ExecutionStatus::EXCEPTION; | ||
} | ||
} else { | ||
// If there are no capture groups, then set groups to undefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, you can leave an inline comment that copy-pastes the spec. In this case it's coming from this definition.
// 7a. Else, Let groups be undefined.
if (LLVM_UNLIKELY(
JSObject::defineOwnProperty(
indices,
runtime,
Predefined::getSymbolID(Predefined::groups),
DefinePropertyFlags::getDefaultNewPropertyFlags(),
Runtime::getUndefinedValue()) == ExecutionStatus::EXCEPTION)) {
return ExecutionStatus::EXCEPTION;
}
This will make more sense when you refactor this code into a new method makeMatchIndicesIndexPairArray
. At the top of the function definition, you can insert an inline comment saying 22.2.7.8 MakeMatchIndicesIndexPairArray
.
Hi @fbmal7 thanks for your review and thoughts. Yes, I agree, let's keep the implementation closer to the specification 👍 |
a32c92b
to
122558c
Compare
Hi @fbmal7 , I addressed the requested changes, tried to keep the implementation as close as possible to the specification, please review it again when you have some time. |
Hi @fbmal7 , did you have a chance to take a look at the latest changes on this PR? Thanks! |
lib/VM/JSLib/RegExp.cpp
Outdated
} | ||
auto matchIndexPair = runtime.makeHandle<JSArray>(*matchIndexPairRes); | ||
|
||
JSArray::setElementAt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, you may use unsafeSetExistingElementAt
. You know that it's safe to write the element at that spot because you created it with the proper size above. You will also need to dereference the Handle in order to use that function.
lib/VM/JSLib/RegExp.cpp
Outdated
0, | ||
runtime.makeHandle( | ||
HermesValue::encodeUntrustedNumberValue(mg->location))); | ||
JSArray::setElementAt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto- unsafeSetExistingElementAt
lib/VM/JSLib/RegExp.cpp
Outdated
|
||
// Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(i)), | ||
// matchIndexPair). | ||
JSArray::setElementAt(A, runtime, idx, matchIndexPair); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto- unsafeSetExistingElementAt
lib/VM/JSLib/RegExp.cpp
Outdated
// Let matchIndexPair be undefined. | ||
// Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(i)), | ||
// matchIndexPair). | ||
JSArray::setElementAt(A, runtime, idx, Runtime::getUndefinedValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto- unsafeSetExistingElementAt
. When you switch this, you should also use HermesValue::encodeUndefinedValue()
instead of Runtime::getUndefinedValue()
.
lib/VM/JSLib/RegExp.cpp
Outdated
@@ -595,6 +701,10 @@ CallResult<Handle<JSArray>> directRegExpExec( | |||
auto inputSHV = SmallHermesValue::encodeStringValue(*S, runtime); | |||
JSObject::setNamedSlotValueUnsafe(*A, runtime, inputDesc, inputSHV); | |||
|
|||
// If R contains any GroupName, then let hasGroups be true. | |||
auto groupNames = regexp->getGroupNameMappings(runtime); | |||
bool hasGroups = !!groupNames; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the double exclamation mark, bool hasGroups = (bool)groupNames;
. Handle
overloads the bool operator. It overloads it with explicit
, which you can learn about the usage here.
lib/VM/JSLib/RegExp.cpp
Outdated
@@ -595,6 +701,10 @@ CallResult<Handle<JSArray>> directRegExpExec( | |||
auto inputSHV = SmallHermesValue::encodeStringValue(*S, runtime); | |||
JSObject::setNamedSlotValueUnsafe(*A, runtime, inputDesc, inputSHV); | |||
|
|||
// If R contains any GroupName, then let hasGroups be true. | |||
auto groupNames = regexp->getGroupNameMappings(runtime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can write out the type here, it's not obvious from the name.
Handle<JSObject> groupNames = regexp->getGroupNameMappings(runtime);
lib/VM/JSLib/RegExp.cpp
Outdated
auto clazzHandle = runtime.makeHandle(mappingObj->getClass(runtime)); | ||
auto groupsObjRes = JSObject::create( | ||
runtime, Runtime::makeNullHandle<JSObject>(), clazzHandle); | ||
auto groupsObj = runtime.makeHandle(groupsObjRes.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: groupsObj
-> groups
… it closer to specs
…PairArray() method
122558c
to
57cf9ba
Compare
Hi @fbmal7 , thanks for the review. I've addressed all the changes, but I also had to use
Please let me know if it makes sense, thank you! |
Using Given that this PR will require a bytecode bump, the team is currently discussing the best way to merge these changes in. |
Hey @fbmal7 - We are trying to plan around this PR, would this be included in a new version of Hermes ( |
@AndrewGable We expect this PR to land in time for 0.73, barring any surprises. We will keep this thread updated. |
@fbmal7 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1 similar comment
@fbmal7 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: Original Author: contact@fabiohenriques.dev Original Git: 2afc7b0 Original Reviewed By: neildhar Original Revision: D48396577 This change adds support for `hasIndices` RegExp flag, according to ES2022 specifications. References: https://262.ecma-international.org/13.0/#sec-regexp-regular-expression-objects https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/hasIndices This change is also related to this issue: #932 Pull Request resolved: #968 Pulled By: fbmal7 Reviewed By: tmikov, neildhar Differential Revision: D50992837 fbshipit-source-id: 6c29b930fcc17957b95fe049afc190cfc840ff05
Summary
This change adds support for
hasIndices
RegExp flag, according to ES2022 specifications.References:
https://262.ecma-international.org/13.0/#sec-regexp-regular-expression-objects
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/hasIndices
This change is also related to this issue: #932
Test Plan