Feature proposal: string removeprefix & removesuffix #38477

Closed
akosuas opened this issue Nov 17, 2020 · 9 comments
Labels
feature Indicates new feature / enhancement requests strings "Strings!"

Comments

@akosuas

akosuas commented Nov 17, 2020

These were recently added to Python, and I think they'd also make sense in the Julia standard library. Here is a rationale expressed better than I probably could: https://www.python.org/dev/peps/pep-0616/

While incredibly simple, these functions come up often.

A Julia implementation would be as short as:

removeprefix(s::AbstractString, prefix::AbstractString) = startswith(s, prefix) ? s[length(prefix)+1:end] : s
removesuffix(s::AbstractString, suffix::AbstractString) = endswith(s, suffix) ? s[1:end-length(suffix)] : s

though perhaps some extra care would need to be taken around type stability (we could return s[1:end] in the other branch)?
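The one-liners above index by character count, which is only safe for ASCII strings, since Julia string indices are byte offsets. Here is a minimal sketch of a character-aware variant that also returns a `SubString` from both branches, which addresses the type-stability concern. The names `removeprefix_sketch` and `removesuffix_sketch` are hypothetical, not part of Base; the sketch relies on the existing `chop(s; head, tail)`, which counts characters.

```julia
# Hypothetical helpers for illustration; not part of Base.
function removeprefix_sketch(s::AbstractString, prefix::AbstractString)
    # chop counts characters rather than bytes, so multi-byte prefixes are fine.
    # Both branches return a SubString, so the return type does not depend on
    # whether the prefix matched.
    return startswith(s, prefix) ? chop(s; head = length(prefix), tail = 0) : SubString(s)
end

function removesuffix_sketch(s::AbstractString, suffix::AbstractString)
    return endswith(s, suffix) ? chop(s; head = 0, tail = length(suffix)) : SubString(s)
end

removeprefix_sketch("αβγ", "α")       # "βγ"; the naive s[2:end] would throw here
removesuffix_sketch("file.jl", ".jl") # "file"
removeprefix_sketch("abc", "x")       # "abc", unchanged because there is no match
```

Returning a view rather than a copy is a design choice of this sketch; wrap the result in `String(...)` if an owned string is needed.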

@StefanKarpinski
Member

Better still, spell them as / and \ (#13411). To reiterate the justification for this syntax, if s = a*b then we have s/b == a and a\s == b. This is exactly how it works for matrices:

julia> A = rand(3, 3)
3×3 Matrix{Float64}:
 0.905465  0.620071  0.431274
 0.148084  0.844358  0.388694
 0.282582  0.361448  0.475705

julia> B = rand(3, 3)
3×3 Matrix{Float64}:
 0.898229  0.0398066  0.464938
 0.672576  0.244668   0.845956
 0.92826   0.895045   0.0922891

julia> S = A*B
3×3 Matrix{Float64}:
 1.63069   0.573764  0.98534
 1.06172   0.56038   0.819012
 0.938503  0.525461  0.481055

julia> S/B ≈ A
true

julia> A\S ≈ B
true

Also, the fact that there are distinct left and right division operators can be seen as a justification of why it's better to use * for string concatenation than +: there are no left and right - operators because + is always commutative, so that wouldn't make sense, whereas * can be non-commutative, so having left and right division operations makes sense.

@akosuas
Author

akosuas commented Nov 18, 2020

While that's neat, I personally wouldn't use it - the mental hurdle of remembering the correct interpretation & which operand is which would be too high, so I'd just opt for my own version instead.

Also - how would division behave if the string didn't have the prefix? Would that be an error, or would it return the string? I'd find the division interpretation strange if it returned the string (division by something silently becomes division by "one").

@ararslan
Member

We could make these methods of chop. Currently to remove n characters from the front of a string, you'd use chop(s; head=n), and likewise for tail=n from the rear of the string. We could spell it as chop(s; head=prefix) and chop(s; tail=suffix), which allows you to compose both in a single call as chop(s; head=prefix, tail=suffix).
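For reference, this is how the existing integer form of `chop` behaves; the string-valued `head=prefix` and `tail=suffix` forms described above are only a proposal and are not shown here.

```julia
s = "March"

chop(s)                      # "Marc": by default only the last character is removed
chop(s; head = 1, tail = 2)  # "ar": one leading and two trailing characters removed
chop(s; head = 5, tail = 5)  # "": asking for more than the string holds is not an error
```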

@ararslan ararslan added feature Indicates new feature / enhancement requests strings "Strings!" labels Nov 18, 2020
@stevengj
Member

stevengj commented Nov 23, 2020

While I like the idea of / and \ in principle, my main qualm is that it would then seem that they should throw a "divisibility" error if the parent string doesn't have the requested suffix/prefix, as @akosuas points out. The Python behavior, as I understand it, is to leave the string unmodified in that case, which would seem to imply a different function name.

In contrast chop already does nothing if there is nothing to chop.
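To make the "divisibility" semantics concrete, here is a rough sketch of what throwing division could look like. The names `string_rdiv` and `string_ldiv` are hypothetical stand-ins for `/` and `\`; they are written as ordinary functions rather than methods of the Base operators, and the choice of `ArgumentError` is an assumption.

```julia
# Hypothetical stand-ins for s / b and a \ s on strings; not part of Base.
function string_rdiv(s::AbstractString, b::AbstractString)
    # "Right division": undo s = a * b by removing the trailing factor b.
    endswith(s, b) || throw(ArgumentError("$(repr(s)) does not end with $(repr(b))"))
    return chop(s; head = 0, tail = length(b))
end

function string_ldiv(a::AbstractString, s::AbstractString)
    # "Left division": undo s = a * b by removing the leading factor a.
    startswith(s, a) || throw(ArgumentError("$(repr(s)) does not start with $(repr(a))"))
    return chop(s; head = length(a), tail = 0)
end

a, b = "foo", "bar"
s = a * b
string_rdiv(s, b) == a   # true
string_ldiv(a, s) == b   # true
```

Under these semantics `string_rdiv("foobar", "baz")` throws instead of returning the input unchanged, which is the difference from the Python behavior.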

@PallHaraldsson
Contributor

PallHaraldsson commented Feb 26, 2021

A correct implementation seems to be:

function chop(s::AbstractString; prefix::AbstractString = "", suffix::AbstractString = "")
  temp = startswith(s, prefix) ? s[begin+length(prefix):end] : s
  endswith(temp, suffix) ? temp[begin:end-length(suffix)] : temp
end

There's one problem: what if both the prefix and the suffix match, but they overlap? Then the order of the chops matters:

julia> chop(chop(" Palli ", suffix="lli "), prefix = " Pal")
" Pa"

This is unlike chop as it is now, where head+tail may be >= length and the result is defined to be the empty string, which would be a third valid result here. Would people actually chop both a prefix and a suffix often enough, and with overlap, for this to be a problem? It's also a bit slower if only one of the two is needed. And would we want to go into combinations, e.g. head together with prefix, etc.?
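The order dependence can be checked with two small helpers. This is only a sketch: `strip_prefix` and `strip_suffix` are hypothetical names, not Base functions, and they use the existing character-counting `chop(s; head, tail)`.

```julia
# Hypothetical helpers for illustration; not part of Base.
strip_prefix(s, p) = startswith(s, p) ? chop(s; head = length(p), tail = 0) : SubString(s)
strip_suffix(s, q) = endswith(s, q) ? chop(s; head = 0, tail = length(q)) : SubString(s)

s, p, q = " Palli ", " Pal", "lli "   # prefix and suffix overlap on one "l"

prefix_first = strip_suffix(strip_prefix(s, p), q)  # "li ": the suffix no longer matches
suffix_first = strip_prefix(strip_suffix(s, q), p)  # " Pa": the prefix no longer matches
```

Neither order yields the empty string, so a combined call would have to pick one of these results or define a third.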

@kcajf
Contributor

kcajf commented May 28, 2021

I found myself needing these methods often, came across this issue and had a quick go at a PR (hopefully in time for 1.7).

Extending chop was a bit awkward because head and tail are keyword arguments, so they can't be used for dispatch, and it wasn't clear which combinations of integer/string head and tail would be valid. I also agree with @PallHaraldsson's point above that having head and tail in a single method could lead to confusing cases.

Instead, I've kept them as two separate methods, chopprefix and chopsuffix. I think this naming makes more sense given the existence of chop (instead of Python's removeprefix/removesuffix).

@StefanKarpinski
Member

Another thought on naming: stripprefix, stripsuffix? Matching strip, lstrip and rstrip.

@StefanKarpinski
Member

StefanKarpinski commented Jun 10, 2021

But I agree that "chop" may be better than "strip" here, although technically what this does is closer to chomp: whereas chop removes a trailing character unconditionally, chomp removes a trailing newline only if it is present, which is similar to what this does.
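The distinction between the two existing Base functions is easy to see on a small example:

```julia
chop("abc")      # "ab": the last character is removed unconditionally
chomp("abc\n")   # "abc": a trailing newline is removed
chomp("abc")     # "abc": nothing to remove, so the string is returned as is
```

The conditional behavior of `chomp` is the one that matches prefix/suffix removal, which leaves the string alone when there is no match.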

@stevengj
Member

Closed by #40995.
