Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

findnext(f, ::AbstractString, ::Int) ignores specialised nextind #26202

Open
samoconnor opened this issue Feb 25, 2018 · 0 comments
Open

findnext(f, ::AbstractString, ::Int) ignores specialised nextind #26202

samoconnor opened this issue Feb 25, 2018 · 0 comments
Labels
search & find The find* family of functions strings "Strings!"

Comments

@samoconnor
Copy link
Contributor

findnext(f, ::MyString, i) creates a temporary SubString{MyString} of the range to be searched, then calls pairs to iterate over indexes and characters [1]. pairs creates a keys iterator for the SubString [2][3]. When next method for EachStringIndex calls nextind it ends up at the AbstractString fallback method [4].

The default nextind increments the index by one and calls isvalid in a loop. I'm experimenting with an AbstractString that uses sparse 64-bit indexes. So nextind sometimes takes a very long time.

I have a work around for this:

Base.nextind(s::SubString{SplicedString}, i::Int) = nextind(s.string, i)
Base.prevind(s::SubString{SplicedString}, i::Int) = prevind(s.string, i)

... but it seems like it might be better for Base to include something like this:

Base.nextind(s::SubString, i::Int) = nextind(s.string, i)
Base.prevind(s::SubString, i::Int) = prevind(s.string, i)

[1]

function findnext(testf::Function, s::AbstractString, i::Integer)
z = ncodeunits(s) + 1
1 i  z || throw(BoundsError(s, i))
@inbounds i == z || isvalid(s, i) || string_index_err(s, i)
for (j, d) in pairs(SubString(s, i))
if testf(d)
return i + j - 1
end
end
return nothing
end

[2]

pairs(collection) = Generator(=>, keys(collection), values(collection))

[3]

julia/base/strings/basic.jl

Lines 478 to 486 in 5929c56

keys(s::AbstractString) = EachStringIndex(s)
length(e::EachStringIndex) = length(e.s)
first(::EachStringIndex) = 1
last(e::EachStringIndex) = lastindex(e.s)
start(e::EachStringIndex) = start(e.s)
next(e::EachStringIndex, state) = (state, nextind(e.s, state))
done(e::EachStringIndex, state) = done(e.s, state)
eltype(::Type{<:EachStringIndex}) = Int

[4]

julia/base/strings/basic.jl

Lines 462 to 471 in 5929c56

function nextind(s::AbstractString, i::Int, n::Int)
n < 0 && throw(ArgumentError("n cannot be negative: $n"))
z = ncodeunits(s)
@boundscheck 0 i z || throw(BoundsError(s, i))
n == 0 && return thisind(s, i) == i ? i : string_index_err(s, i)
while n > 0 && i < z
@inbounds n -= isvalid(s, i += 1)
end
return i + n
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
search & find The find* family of functions strings "Strings!"
Projects
None yet
Development

No branches or pull requests

2 participants