-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make parse(T, "")
and parse(T, " ")
etc, more specific regarding whitespace and empty strings
#20588
Make parse(T, "")
and parse(T, " ")
etc, more specific regarding whitespace and empty strings
#20588
Conversation
…ntains only whitespace
parse(T, "")
and parse(T, " ")
etc, more specificparse(T, "")
and parse(T, " ")
etc, more specific regarding whitespace and empty strings
base/parse.jl
Outdated
@@ -58,6 +58,10 @@ end | |||
|
|||
function tryparse_internal{T<:Integer}(::Type{T}, s::AbstractString, startpos::Int, endpos::Int, base_::Integer, raise::Bool) | |||
_n = Nullable{T}() | |||
if isempty(strip(s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think calling strip
will be too expensive here --- this function is very performance-critical, and most inputs don't need strip
at all. I believe parseint_preamble
returns a base of 0
if there is a problem; we should check for all-whitespace strings only in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have spotted that. Will commit shortly.
04035ab
to
d0eb2e3
Compare
Sorry that took a while. Took me farting around for an hour, trying to make it better for |
base/parse.jl
Outdated
Nullable{Bool}() | ||
|
||
substr = SubString(sbuff, startpos, endpos) | ||
if count(isspace, sbuff) == len # all chars were whitespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too familiar with this code, but should this be checking substr
instead of sbuff
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably add a test for that. >)
Third time's a charm. |
@nanosoldier |
Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @jrevels |
base/parse.jl
Outdated
@@ -126,27 +130,49 @@ end | |||
|
|||
function tryparse_internal(::Type{Bool}, sbuff::Union{String,SubString}, | |||
startpos::Int, endpos::Int, base::Integer, raise::Bool) | |||
len = endpos-startpos+1 | |||
p = pointer(sbuff)+startpos-1 | |||
null = Nullable{Bool}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see the point of defining this variable. Why not write Nullable{Bool}()
where you need it?
base/parse.jl
Outdated
|
||
substr = SubString(sbuff, startpos, endpos) | ||
if count(isspace, substr) == len # all chars were whitespace | ||
raise && throw(ArgumentError("input string only contains whitespace")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit weird stylistically. Why not wrap both branches inside an if raise
block?
base/parse.jl
Outdated
Nullable{Bool}() | ||
|
||
substr = SubString(sbuff, startpos, endpos) | ||
if count(isspace, substr) == len # all chars were whitespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all(isspace, substr)
would be simpler (if I'm not missing something).
base/parse.jl
Outdated
if isnull(result) | ||
throw(ArgumentError("cannot parse $(repr(s)) as $T")) | ||
end | ||
return get(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use unsafe_get
since you have already checked isnull
.
test/parse.jl
Outdated
# Issue 20587 | ||
for T in vcat(subtypes(Signed), subtypes(Unsigned)) | ||
for s in ["", " ", " "] | ||
# Without a base (handles things like "0x00001111", etc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove one space here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - from where?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 5 spaces instead of 4.
test/parse.jl
Outdated
|
||
# Test `tryparse_internal` with part of a string | ||
let b = " " | ||
result = @test_throws ArgumentError get(Base.tryparse_internal(Bool, b, 7, 11, 0, true)) == true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
== true
is redundant (same below).
|
||
# With a base | ||
result = @test_throws ArgumentError parse(T, s, 16) | ||
exception_with_base = result.value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would find it clearer to always access result.value
instead of defining exception_with_base
(same below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defined them separately so that it reminds you they are exceptions, and so that in the test-failed message you see which one it was, since let-expressions obscure the line number.
So, the old parsing code for integers was forgiving of leading and trailing whitespace. My new integer parsing code is also forgiving of whitespace. Shall I make them both forgiving of it? |
either both should allow whitespace or neither should - maybe the latter if it's possible to get any performance benefits from that |
@tkelman |
Fixes #20587.
Before this PR:
After this PR:
This message will also appear for
parse(T, "")
andparse(T, " ")
, etc.