Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

str method argument escaping is inconsistent #401

Closed
davidism opened this issue Sep 15, 2023 · 3 comments
Closed

str method argument escaping is inconsistent #401

davidism opened this issue Sep 15, 2023 · 3 comments
Milestone

Comments

@davidism
Copy link
Member

davidism commented Sep 15, 2023

Prior to #400, some str methods were overridden using a generic wrapper that called escape on all argument values that were str, then wrapped the result in Markup. This made typing difficult, and was inefficient. When removing the wrapper, I noticed it was only applied to str methods that returned str, not other method that took str arguments. I also noticed that some arguments shouldn't have escape called on them even though they are str.

There are some methods where the escaping makes sense. For example, str.center is similar to adding strings, which should escape the values.

But there are many other places where the escaping is inconsistent or not obvious. For example, __contains__ (and find, index, etc.) is not overridden, but strip is, so you can get the following behavior.

>>> from markupsafe import Markup
>>> s = Markup("<br>")
>>> "<" in s
True
>>> s.strip("<")
Markup("<br>")

strip calls escape on its argument, resulting in the actual call being str.strip(s, "&lt;"). That's definitely not correct, because strip removes any number of each character, in any order, but then intention was obviously to remove the <.

Python 3.9 added removeprefix and removesuffix, which remove the exact string of characters if it is present. But it's not clear if those should escape their argument, should removeprefix("<") remove the literal <, or &lt;?

All methods should at least be consistent. Regardless of which direction we choose, escaping or not, the user can always wrap the argument in Markup or escape to get the other meaning.

@davidism davidism added this to the 2.2.0 milestone Sep 15, 2023
@davidism
Copy link
Member Author

davidism commented Sep 15, 2023

Should escape argument (already does):

  • __add__(self, value, /)
  • __radd__(self, value, /)
  • __mod__(self, value, /)
  • __getitem__(self, key, /)
  • format(*args, **kwargs)
  • format_map(mapping)
  • center(self, width, fillchar=' ', /)
  • ljust(self, width, fillchar=' ', /)
  • rjust(self, width, fillchar=' ', /)
  • join(self, iterable, /)

Unclear, leaning towards should not escape, currently does not:

  • __contains__(self, key, /)
  • find(sub[, start[, end]])
  • rfind(sub[, start[, end]])
  • index(sub[, start[, end]])
  • rindex(sub[, start[, end]])
  • count(sub[, start[, end]])
  • split(self, /, sep=None, maxsplit=-1)
  • rsplit(self, /, sep=None, maxsplit=-1)
  • startswith(prefix[, start[, end]])
  • endswith(suffix[, start[, end]])

Unclear, leaning towards should not escape, currently does:

  • replace(self, old, new, count=-1, /); old should follow what we decide here, new should be escaped
  • removeprefix(self, prefix, /)
  • removesuffix(self, suffix, /)
  • strip(self, chars=None, /)
  • lstrip(self, chars=None, /)
  • rstrip(self, chars=None, /)
  • partition(self, sep, /)
  • rpartition(self, sep, /)

Only wrap result:

  • __mul__(self, value, /)
  • __rmul__(self, value, /)
  • capitalize(self, /)
  • casefold(self, /)
  • expandtabs(self, /, tabsize=8)
  • lower(self, /)
  • splitlines(self, /, keepends=False)
  • swapcase(self, /)
  • title(self, /)
  • translate(self, table, /)
  • upper(self, /)
  • zfill(self, width, /)

@davidism
Copy link
Member Author

Despite the long list, the actual change isn't very big, only 8 methods need to be changed to not escape their argument. They're all conceptually similar to the the search methods such as in, find, and index, which already don't escape their argument.

@davidism
Copy link
Member Author

Oops, ended up pushing to main instead of a PR branch by mistake. So this is in ef4a839, which also happens to be the 666th commit.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant