-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
correctly parse all integer types. fixes #9289 #9316
Conversation
end | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes. That can't be efficient :-\
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like it either. But couldn't find a better way. Using parse
was worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it is only when an abstract Number type is specified to readdlm
. Other cases perform as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is our fault for not having a good API for this. Given when you have to work with, this is the only way. But it's still terrible :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last try
...end
does not have a catch. Is there a reason to use try
...end
for that case instead of just val = parseint(BigInt, sval)
?
P.S. I vaguely recall something like this showing up in a C++ benchmark, which caused compiler implementers to come up with crazy optimizations for this kind of practice, even though the practice was nominally discouraged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ArchRobison that is to let it fall through and try for bool or just set a string, when T
is Any
or Number
.
@JeffBezanson Yes, I think we can just limit to Int. Also, won't the specialized methods be able to optimize away conditions that don't apply? Which will then be better than using Union
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method will be specialized anyway; it's just more efficient not to have 5 copies of basically the same source code. Using @eval
forces all 5 copies to exist; relying on specialization only generates new code for cases that are actually used. Also @eval
and $
are just ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unfortunate, since stack unwinding is so incredibly expensive on windows and we always create the stack trace on an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will probably be useful to have versions of parseint
and parsefloat
that indicate failure instead of throwing an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to failures as return variables rather than exceptions.
As part of the String redesign, I'd like to see functions like those in https://github.com/johnmyleswhite/CSVReaders.jl/blob/master/src/05_parsers.jl get moved into Base.
Used parseint instead of float64_isvalid.
Modified the PR:
Some performance comparisons with a Master:
Branch:
|
Shall merge this tomorrow if there are no objections and is still open. |
Nice speed bumps. Any idea why the second test is slower? Also, there was a recent try-end return value fix. Not sure if that is relevant here. |
i'm mildly concerned this may have a major speed regression on windows do to the dependence on try/catch. do you happen to have a machine you could test on for comparison? i would instead suggest: if sval == "true"
cells[row,col] = true
elseif sval == "false"
cells[row,col] = false
elseif (sval[1] == '-' && isdigit(SubString(sval,2))) || isdigit(sval) #xxx: need to handle 0x, 0b, 0o here also
cells[row,col] = parseint(sval)
elseif float64_isvalid(sval, tmp64)
cells[row,col] = tmp64[1]
else
return true
end
return false or, just rewrite |
@ViralBShah I can see the second test result vary between @vtjnash Will try the options suggested. I tried to avoid too many string comparisons, but probably that will be faster than the |
a six character ascii-string compare is going to be faster than just about anything else you could attempt (plus, it will almost always short circuit after the first character is not matched in |
also, it's worth pointing out that the time to capture a stack trace is going to be proportional to the size of the stack. benchmark code is typically going to have a much shorter call stack than a real function. (full disclosure: I'm currently testing stack overflow code on windows ed5a8fc via |
Ouch. Should definitely avoid the |
Returning a |
I like the idea of using |
Here's a draft implementation of The
|
doing the work in C eliminates the ability to perform the crucial optimizations that make this fast. instead, use ccall's ability to interpret output struct parameters and add a C representation of Julia's typedef struct {
uint8_t isnull;
double value;
} nullable_float64;
nullable_float64 maybefloat64(char *s) {
nullable_float64 ret = {1, strtod(s));
return ret;
} ccall(:maybefloat64, Nullable{Float64}, (Ptr{Uint8},), s) |
Thanks! With that change:
|
that's good. now my other remaining question would be: why isn't the allocation 0 bytes? |
This seems to fix the extra allocation tanmaykm@94ee10b, @vtjnash please verify. Now:
|
wow, yes, good find! that code largely predated immutable types. that missing condition seems to have been an oversight. any idea why that affected |
Oops sorry, that was for a single loop (I had commented out the loop for some testing in my repl). Here's the correct result for
|
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces following methods that parse a string as the indicated type and return a `Nullable` with the result instead of throwing exception: - `maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)` - `maybefloat32(s::AbstractString)` and `maybefloat64(s::AbstractString)` Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces the tryparse method: - tryparse{T<:Integer}(::Type{T<:Integer},s::AbstractString) - tryparse(::Type{Float..},s::AbstractString) - a few variants of the above And: - tryparse(Float.., ...) call the corresponding C functions jl_try_strtof, jl_try_substrtof, jl_try_strtod and jl_try_substrtod. - The parseint, parsefloat, float64_isvalid and float32_isvalid methods wrap the corresponding tryparse methods. - The jl_strtod, jl_strtof, ... functions are wrappers over the jl_try_str... functions. This should fix JuliaLang#10498 as well. Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces the tryparse method: - tryparse{T<:Integer}(::Type{T<:Integer},s::AbstractString) - tryparse(::Type{Float..},s::AbstractString) - a few variants of the above And: - tryparse(Float.., ...) call the corresponding C functions jl_try_strtof, jl_try_substrtof, jl_try_strtod and jl_try_substrtod. - The parseint, parsefloat, float64_isvalid and float32_isvalid methods wrap the corresponding tryparse methods. - The jl_strtod, jl_strtof, ... functions are wrappers over the jl_try_str... functions. This should fix JuliaLang#10498 as well. Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Introduces the tryparse method: - tryparse{T<:Integer}(::Type{T<:Integer},s::AbstractString) - tryparse(::Type{Float..},s::AbstractString) - a few variants of the above And: - tryparse(Float.., ...) call the corresponding C functions jl_try_strtof, jl_try_substrtof, jl_try_strtod and jl_try_substrtod. - The parseint, parsefloat, float64_isvalid and float32_isvalid methods wrap the corresponding tryparse methods. - The jl_strtod, jl_strtof, ... functions are wrappers over the jl_try_str... functions. This should fix JuliaLang#10498 as well. Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
Closing. Will resubmit from a different branch. |
Continued in #10560 |
Used
parseint
andparsefloat
methods instead offloat64_isvalid
.