Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first and last with nchar #23960

Merged
merged 5 commits into from
Oct 16, 2017
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -236,8 +236,12 @@ This section lists changes that do not have deprecation warnings.
Library improvements
--------------------

* Functions `first` and `last` now accept `nchar` argument for `AbstractString`.
If this argument is used they return a string consisting of first/last `nchar`
characters from the original string ([#23960]).

* The functions `nextind` and `prevind` now accept `nchar` argument that indicates
number of characters to move ([#23805]).
the number of characters to move ([#23805]).

* The functions `strip`, `lstrip` and `rstrip` now return `SubString` ([#22496]).

Expand Down
47 changes: 47 additions & 0 deletions base/strings/basic.jl
Original file line number Diff line number Diff line change
Expand Up @@ -610,3 +610,50 @@ function filter(f, s::AbstractString)
end
String(take!(out))
end

## string first and last ##

"""
first(str::AbstractString, nchar::Integer)

Get a string consisting of the first `nchar` characters of `str`.

```jldoctest
julia> first("∀ϵ≠0: ϵ²>0", 0)
""

julia> first("∀ϵ≠0: ϵ²>0", 1)
"∀"

julia> first("∀ϵ≠0: ϵ²>0", 3)
"∀ϵ≠"
```
"""
function first(str::AbstractString, nchar::Integer)
nchar == 0 && return ""
nchar == 1 && return str[1:1]
str[1:nextind(str, 1, nchar-1)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it will give a BoundsError for nchar > length(str). Shouldn't it be min(endof(str), nextind(str, 1, nchar-1))?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is BoundsError intended here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was intended. The idea is to have an invariant length(first(s, nchar)) == nchar. Similarly for last.
This might be changed the way you propose - then also docstring should be changed as it now promises nchar characters in the produced string.
I slightly prefer the current implementation (as it ensures the invariant) but I can see the rationale behind your proposal so I am OK change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we changed it to "at most" nchar elements of the string, then it would be similar to take.

My feeling is that using "at most" nchar elements would be a big more flexible, and it wouldn't hurt the uses where you want exactly nchar elements except that it wouldn't throw an error for a string of the wrong size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I will make a PR to have a clear decision point.

end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it return a SubString?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recommendation in #23765 is that not. Usually first and last would be used with a small nchar so there is not much gain from SubString (or actually a loss if original string is very large, we get only a small part of it and the original string can be discarded). Observe that it can be a SubString if the original string is a SubString.


"""
last(str::AbstractString, nchar::Integer)

Get a string consisting of the last `nchar` characters of `str`.

```jldoctest
julia> last("∀ϵ≠0: ϵ²>0", 0)
""

julia> last("∀ϵ≠0: ϵ²>0", 1)
"0"

julia> last("∀ϵ≠0: ϵ²>0", 3)
"²>0"
```
"""
function last(str::AbstractString, nchar::Integer)
nchar == 0 && return ""
e = endof(str)
nchar == 1 && return str[e:e]
str[prevind(str, e, nchar-1):e]
end
2 changes: 2 additions & 0 deletions doc/src/stdlib/strings.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,8 @@ Base.lstrip
Base.rstrip
Base.startswith
Base.endswith
Base.first(::AbstractString, ::Integer)
Base.last(::AbstractString, ::Integer)
Base.uppercase
Base.lowercase
Base.titlecase
Expand Down
20 changes: 20 additions & 0 deletions test/strings/basic.jl
Original file line number Diff line number Diff line change
Expand Up @@ -623,3 +623,23 @@ end
@test prevind(strs[2], -1, 1) == 0
end
end

@testset "first and last" begin
s = "∀ϵ≠0: ϵ²>0"
@test_throws ArgumentError first(s, -1)
@test first(s, 0) == ""
@test first(s, 1) == "∀"
@test first(s, 2) == "∀ϵ"
@test first(s, 3) == "∀ϵ≠"
@test first(s, 4) == "∀ϵ≠0"
@test first(s, length(s)) == s
@test_throws BoundsError first(s, length(s)+1)
@test_throws ArgumentError last(s, -1)
@test last(s, 0) == ""
@test last(s, 1) == "0"
@test last(s, 2) == ">0"
@test last(s, 3) == "²>0"
@test last(s, 4) == "ϵ²>0"
@test last(s, length(s)) == s
@test_throws BoundsError last(s, length(s)+1)
end