\Q...\E for unsafe regex interpolation? #47907

Open
MarcMush opened this issue Dec 15, 2022 · 4 comments
Labels
docs This change adds or pertains to documentation strings "Strings!"

Comments

@MarcMush
Contributor

julia> regex_name = Regex("[\"( ]\\Q$name\\E[\") ]") # interpolate value of name
r"[\"( ]\QJon\E[\") ]"

Note the use of the `\Q...\E` escape sequence. All characters between the `\Q` and the `\E`
are interpreted as literal characters (after string interpolation). This escape sequence can
be useful when interpolating, possibly malicious, user input.

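These lines from the manual suggest that `\Q` and `\E` make it safe to put untrusted, even malicious, content into a regex, but I believe this is easily defeated (name = "\\E.*\\Q"): the leading `\E` closes the quoted span early, so the `.*` that follows is interpreted as regex syntax.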

Should the docs be updated to clarify this?

I found two PRs that add regex escaping, but neither has been merged (#29643 and #31989).
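
The bypass described above can be reproduced with a short sketch. The test string `"(anything at all)"` is made up for illustration; the regex construction is the one quoted from the manual.

```julia
# Attacker-controlled value that closes the \Q...\E span early.
name = "\\E.*\\Q"

# Same construction as in the quoted manual excerpt.
regex_name = Regex("[\"( ]\\Q$name\\E[\") ]")

# The pattern is now ["( ]\Q\E.*\Q\E[") ], so `.*` is live regex syntax
# and the regex matches text that has nothing to do with `name`.
bypassed = occursin(regex_name, "(anything at all)")

# For comparison, a benign value only matches itself between the delimiters.
benign = occursin(Regex("[\"( ]\\QJon\\E[\") ]"), "(anything at all)")
```

Here `bypassed` should be `true` while `benign` is `false`, which is the behavior the current wording of the docs does not warn about.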

@StefanKarpinski
Member

Agreed that this is not a good way to protect against potentially malicious content being spliced into a regex. I think we do have some functions for safely splicing literal content into regexes, though...
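
As a rough sketch of what safe splicing could look like, the helper below wraps a string in `\Q...\E` and rewrites any embedded `\E` so it cannot end the quoted span. The name `quote_literal` is hypothetical and is not a function in Base; this is one possible approach, not necessarily what the unmerged PRs implement.

```julia
# Hypothetical helper (not part of Base): quote `s` so PCRE treats it literally.
# Each embedded `\E` becomes: end quoting, a literal backslash and `E`, resume quoting.
quote_literal(s::AbstractString) =
    "\\Q" * replace(s, "\\E" => "\\E\\\\E\\Q") * "\\E"

name = "\\E.*\\Q"  # the value that defeats plain \Q...\E
safe = Regex("[\"( ]" * quote_literal(name) * "[\") ]")

# The malicious value is now matched only as literal text.
matches_unrelated = occursin(safe, "(anything at all)")
matches_literal = occursin(safe, "(\\E.*\\Q)")
```

With this quoting, `matches_unrelated` should be `false` and `matches_literal` should be `true`.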

@Keno
Member

Keno commented Dec 16, 2022

Should regex interpolation try to do the safe thing by default, with a flag to turn it off? We could do something similar to commands: if you interpolate another regex it gets interpreted as a regex, but if you interpolate a string it gets whatever processing is necessary to make it literal.
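
For comparison, regex concatenation with `*` already behaves roughly this way, assuming Julia 1.3 or later: `Regex` operands are interpreted as patterns, while `String` operands are quoted so that they match literally. The sketch below is not the interpolation syntax proposed here, only an illustration of the same split between regex parts and literal parts.

```julia
name = "\\E.*\\Q"  # the value that defeats plain \Q...\E

# Regex pieces stay regex; the String piece is quoted by `*` itself.
re = r"[\"( ]" * name * r"[\") ]"

matches_unrelated = occursin(re, "(anything at all)")
matches_literal = occursin(re, "(\\E.*\\Q)")

# An ordinary name still works as expected.
matches_jon = occursin(r"[\"( ]" * "Jon" * r"[\") ]", "(Jon)")
```

Here `matches_unrelated` should be `false`, and both `matches_literal` and `matches_jon` should be `true`.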

@brenhinkeller brenhinkeller added the strings "Strings!" label Dec 18, 2022
@StefanKarpinski
Member

> Should regex interpolation try to do the safe thing by default with a flag to turn it off?

How would that work? There is no interpolation into regexes, just interpolated strings passed to the Regex constructor.

@Keno
Member

Keno commented Jan 6, 2023

> How would that work? There is no interpolation into regexes, just interpolated strings passed to the Regex constructor.

This was an unrelated comment. If that were available, the documentation could suggest using it instead of saying anything about `\Q` and `\E`.

@Keno Keno added the docs This change adds or pertains to documentation label Jan 6, 2023