Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm: fix some string-handling issues in regex/glob matching #2998

Conversation

srenatus
Copy link
Contributor

Before, the opa_string_t's *char was used as-is; disregarding its length.
With certain inputs in consequent evaluations, this would end up using old
characters from a previous evaluation.

For example, evaluation

regex.match("^world$", input.message)

with input.message going through the sequence

"no-world"
"world"

would end up running the RE2 PartialMatch against the values

"no-world"
"worldrld"

and the second evaluation would thus wrongly fail.

Since glob.match is using regex.match underneath, it would show a similar
behaviour, as noted in #2962

Fixes #2962.

In the same spirit, changed some string usages in regex.is_valid and regex.find_all_string_submatch_n.

⚠️ I'm by no means an expert in C++, please review carefully. Any advise is appreciated, of course! 😃

Also provides an opa_println function for the WASM SDK... While that function is not used usually, with this change, it can be used for debugging purposes during development.

koponen-styra
koponen-styra previously approved these changes Dec 10, 2020
Copy link
Contributor

@koponen-styra koponen-styra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

tsandall
tsandall previously approved these changes Dec 10, 2020
Copy link
Member

@tsandall tsandall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM though I didn't see the opa_println implementation which would be useful (as long as it doesn't become a hard requirement for the wasm modules.)

@srenatus srenatus dismissed stale reviews from tsandall and koponen-styra via 89cb4dc December 10, 2020 19:06
@srenatus
Copy link
Contributor Author

@tsandall 🤦 forgot to push the commit. It's there now. Not become a hard requirement.

…ngth string casts

Before, the opa_string_t's *char was used as-is; disregarding its length.
With certain inputs in consequent evaluations, this would end up using old
characters from a previous evaluation.

For example, evaluation

    regex.match("^world$", input.message)

with input.message going through the sequence

    "no-world"
    "world"

would end up running the RE2 PartialMatch against the values

    "no-world"
    "worldrld"

and the second evaluation would thus wrongly fail.

Since glob.match is using regex.match underneath, it would show a similar
behaviour, as noted in open-policy-agent#2962

Note: for regex.is_valid, there is no test case. I had been seeing garbage
data trailing the pattern string before fixing this:

    opa_print(): ^foobfdfd[ard�

Fixes open-policy-agent#2962.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
This is useful for debugging using print statements :D

We should not have opa_println statements in our generated WASM code,
but if we do, during development or debugging, the passed string will
now be printed to stdout.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
@srenatus srenatus force-pushed the sr/issue-2962/glob.match-wrong-results branch from 89cb4dc to eaf516c Compare December 11, 2020 08:27
@srenatus
Copy link
Contributor Author

Squashed the builtin related fixes; kept the opa_println bit separate.

@tsandall tsandall merged commit 832ca65 into open-policy-agent:master Dec 11, 2020
@srenatus srenatus deleted the sr/issue-2962/glob.match-wrong-results branch December 11, 2020 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WASM "glob" is caching results
3 participants