-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a lazy logrange
function and LogRange
type
#39071
Conversation
I agree this is useful. If we do this we should go with |
There probably should be a julia> collect( logrange(1 => 1000, 4) )
4-element Array{Float64,1}:
1.0
9.999999999999998
99.99999999999996
999.9999999999998 May I suggest the following: function logrange(p::Pair{<:Real, <:Real}, n::Integer)
lo, hi = promote(p.first, p.second)
invert = abs(lo) > abs(hi)
ratio = invert ? lo/hi : hi/lo
if ratio % 10 == 0
exp = exp10
log = log10
elseif ratio % 2 == 0
exp = exp2
log = log2
else
exp = Base.exp
log = Base.log
end
if ratio > 0
if invert
log_ratio = -log(ratio)
(x > log_ratio/2 ? lo*exp(x) : hi*exp(x-log_ratio) for x in range(0, log_ratio, length=n))
else
log_ratio = log(ratio)
(x < log_ratio/2 ? lo*exp(x) : hi*exp(x-log_ratio) for x in range(0, log_ratio, length=n))
end
else
throw(DomainError(p, "logrange requires that the first and last elements are both positive, or both negative"))
end
end This increases numerical stability for julia> collect( logrange(1 => 1000, 4) )
4-element Array{Float64,1}:
1.0
10.0
100.0
1000.0
julia> collect( logrange(1 => 1024, 3) )
3-element Array{Float64,1}:
1.0
32.0
1024.0
julia> collect( logrange(1 => 1024, 6) )
6-element Array{Float64,1}:
1.0
4.0
16.0
64.0
256.0
1024.0 |
Yes, the implementation here is just a sketch so far. Ideally it would always hit both endpoints to full accuracy, although I'm not sure how hard that would be. |
The trick might be figuring out which endpoint we are closer to and then deciding whether we should count up or down. For start, the calculation should be The need for selecting the correct base and for making these kind of decisions is what ultimately justifies having this method, I think. |
I updated my suggested code above. I think a ternary operator takes care of the endpoints. log_ratio = log(ratio)
(x < log_ratio/2 ? lo*exp(x) : hi*exp(x-log_ratio) for x in range(0, log_ratio, length=n)) |
Hmm... there is still some asymmetry to this. julia> collect(logrange(1 => 1000, 4))
4-element Array{Float64,1}:
1.0
10.0
100.0
1000.0
julia> collect(logrange(1000 => 1, 4))
4-element Array{Float64,1}:
999.9999999999998
100.00000000000004
10.000000000000005
1.0000000000000002 |
I fixed it by always taking the larger ratio. invert = abs(lo) > abs(hi)
ratio = invert ? lo/hi : hi/lo So now everything seems to work well: julia> collect( logrange( 1000 => 1, 4) )
4-element Array{Float64,1}:
1000.0
100.0
10.0
1.0
julia> collect( logrange( 1 => 1000, 4) )
4-element Array{Float64,1}:
1.0
10.0
100.0
1000.0
julia> collect( logrange( -1 => -1000, 4) )
4-element Array{Float64,1}:
-1.0
-10.0
-100.0
-1000.0
julia> collect( logrange( -1000 => -1, 4) )
4-element Array{Float64,1}:
-1000.0
-100.0
-10.0
-1.0 |
The Numpy equivalent of this function is called |
Can I suggest that 0 be allowed as an endpoint? I've always found it frustrating when I write code sweeping a parameter and |
@darsnack what you are suggesting requires an explicit base, which logrange doesn't because it's base-independent (the notion of "logarithmically equispaced" does not depend on which logarithm you choose) |
Ah thanks for the explanation, makes sense! |
If this takes keywords like range does, then something like Including zero in the results |
Yeah I don't necessarily want to complicate the proposal with too many API options now that I understand the rationale. I guess it depends on which arguments are chosen for specification. To me, whenever I specify |
The latest changes look very nice. I see that the iteration is based on multiplying the ratio by the prior element. How confident are you that you can avoid accumulated errors through long iterations? The iteration for
Only for In this case, might there be situations where multiplication or exponentiation might result in better stability? |
All else equal it would be nicer to have random access, which is what multiplication buys you in linear ranges. But it's tricky to make that fast & accurate here, I think the answer may simply be to collect this iterator if you need that. Notice that it iterates in twice the precision of the output, so there are quite a few spare bits for the inevitable errors. I haven't been very systematic about stress-testing it, I don't promise it's This gist compares some different approaches: https://gist.github.com/mcabbott/a40c784b4fd27656b581ad0dac09aae5 . |
I think we should not use Pair for this; I don't believe any other similar functions work like that? |
Notation is now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fix the build, you might need to split the Complex
code off and rename some functions so you don't have to duplicate them.
I also suggest that you take a look StepRangeLen
which has a flexible reference value to increase accuracy in a particular part of the range.
Can't this just be a very simple wrapper to whatever That is, simply: julia> struct LogRange{T, R<:AbstractRange} <: AbstractVector{T}
r::R
end
julia> LogRange(r) = LogRange{typeof(exp(first(r))), typeof(r)}(r)
LogRange
julia> Base.size(r::LogRange) = size(r.r)
julia> Base.getindex(r::LogRange, i::Integer) = exp(r.r[i])
julia> logrange(start, stop, len) = LogRange(range(log(start), log(stop), length=len))
logrange (generic function with 1 method) I guess there's the floating point annoyance that we no longer hit endpoints exactly due to the subsequent floating point exponentiation... that's annoying. julia> logrange(0.0001, 1, 5)
5-element LogRange{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}:
0.00010000000000000009
0.0010000000000000002
0.010000000000000004
0.10000000000000002
1.0 |
Yes my first version was literally |
IMO for this to be considered — and for it to be called |
I will think some more about strategies to do that, when I get a chance. Another option to perhaps consider is for it to be |
Do we want |
Would this probably have similar accuracy to DoubleFloats? If so that may solve the problem, but perhaps I should experiment a bit more with (& without) that to see what works well.
|
Yeah |
Hitting the endpoints is easy if we just save those values in the struct. The most noticeable problem is when the roots are well known.
If @mbauman used used julia> struct LogRange{T, R<:AbstractRange} <: AbstractVector{T}
r::R
end
julia> LogRange(r) = LogRange{typeof(exp(first(r))), typeof(r)}(r)
LogRange
julia> Base.size(r::LogRange) = size(r.r)
julia> Base.getindex(r::LogRange, i::Integer) = exp(r.r[i])
julia> Base.getindex(r::LogRange, i::Integer) = exp10(r.r[i])
julia> logrange(start, stop, len) = LogRange(range(log10(start), log10(stop), length=len))
logrange (generic function with 1 method)
julia> logrange( 0.0001, 1, 5 )
5-element LogRange{Float64,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}:
0.0001
0.001
0.01
0.1
1.0
julia> logrange( 0.0001, 10000, 9 )
9-element LogRange{Float64,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}:
0.0001
0.001
0.01
0.1
1.0
10.0
100.0
1000.0
10000.0 Of course this fails for base 2. julia> logrange( 2^-5, 2^5, 11 )
11-element LogRange{Float64,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}:
0.031249999999999997
0.0625
0.12499999999999999
0.25
0.5
1.0
2.0
4.0
8.000000000000002
16.0
32.00000000000001 There's |
This is false: julia> @eval Base begin # on 623a4f8
function *(λ::Number, r::LogRange)
start = λ * r.start
stop = λ * r.stop
len = r.len
extra = log(λ)/(r.len-1) .+ r.extra # should be more careful about promotion, high precision, negative
$(Expr(:new, :(LogRange{typeof(start), typeof(extra[1])}), :start, :stop, :len, :extra))
end
show(io::IO, r::LogRange{T}) where {T<:Complex} = invoke(show, Tuple{IO,AbstractVector{T}}, io, r)
end;
julia> lo * logrange(1, hi/lo, 5)
5-element Base.LogRange{ComplexF32, ComplexF64}:
-1.0+0.01im, -1.00004+0.00500006im, -1.00005-1.43682f-9im, -1.00004-0.00500006im, -1.0-0.01im Edit, of course Obviously the entire point of using |
You do obviously need to choose a branch cut, and 4 possible conventions were calmly listed above. I still think that a natural one falls out of The last commit removes all complex support, aiming for the minimal thing. One unfortunate side effect of being less generic is that the struct can no longer store a Whether to support some operations which return another |
Thanks for pointing out the inaccuracy, I should've said "not possible using actual api, without resorting to julia & logrange internals". Sorry for being too lazy in stating that :) |
@oscardssmith, @mcabbott, is this ready to merge? I don't see any outstanding objections.
We still can, I just think it will be easier to do in a separate PR focused on expanding the scope of LogRange and don't want to delay merging what we have now until we have consensus for Complex handling. |
One small question is whether we are happy with this asymmetry: julia> logrange(2, 1e300, 5)
5-element Base.LogRange{Float64, Base.TwicePrecision{Float64}}:
2.0, 1.68179e75, 1.41421e150, 1.18921e225, 1.0e300
julia> logrange(2, Inf, 5) # matches the limit
5-element Base.LogRange{Float64, Base.TwicePrecision{Float64}}:
2.0, Inf, Inf, Inf, Inf
julia> logrange(1e-300, 20, 5)
5-element Base.LogRange{Float64, Base.TwicePrecision{Float64}}:
1.0e-300, 2.11474e-225, 4.47214e-150, 9.45742e-75, 20.0
julia> logrange(0, 20, 5) # limit would be [0,0,0,0,20]
5-element Base.LogRange{Float64, Base.TwicePrecision{Float64}}:
NaN, NaN, NaN, NaN, 20.0 My take is that neither of these is useful, but forgetting that 0 is infinitely far away is pretty common, and if you silently get something Supplying |
well the (current) docstring does say
so maybe however I also think it would be reasonable to just throw if
|
I don't like that asymmetry. I think the most mathematically sound definition, and my preference, would be to take the limit, follow
OTOH, I suspect we'll get bug reports of folks saying that Failing faster with
or even
is plausible, depending on how much we want to be proactively helpful. I tend to prefer returning sensible results whenever possible (i.e. not eagerly jumping to NaN if a limiting value exists) and counting on the user to provide 0/Inf/Nan inputs only if they want the appropriate outputs because that allows slightly more general usage, but could see going either way. Setting the halfway point Edit after reading @adienes's comment: if we are going to proactively assume the user made a mistake and return a result other than the limit, I think we should do it by throwing, not by returning NaN. |
My argument for not throwing an error is that Can I use Looking for precedents:
|
Last commit just throws errors on 0, NaN, Inf, matching what |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mergable to me. I've reviewed everything but the src of _log_twice64_unchecked
and the usage of math internals, which I'm counting on @oscardssmith's previous approval to cover.
LogRange
functionlogrange
function and LogRange
type
(cherry picked from commit 3ed2b49)
Backported PRs: - [x] #39071 <!-- Add a lazy `logrange` function and `LogRange` type --> - [x] #51802 <!-- Allow AnnotatedStrings in log messages --> - [x] #53369 <!-- Orthogonalize re-indexing for FastSubArrays --> - [x] #48050 <!-- improve `--heap-size-hint` arg handling --> - [x] #53482 <!-- add IR encoding for EnterNode --> - [x] #53499 <!-- Avoid compiler warning about redefining jl_globalref_t --> - [x] #53507 <!-- update staled `Core.Compiler.Effects` documentation --> - [x] #53408 <!-- task splitting: change additive accumulation to multiplicative --> - [x] #53523 <!-- add back an alias for `check_top_bit` --> - [x] #53377 <!-- add _readdirx for returning more object info gathered during dir scan --> - [x] #53525 <!-- fix InteractiveUtils call in Base.runtests on failure --> - [x] #53540 <!-- use more efficient `_readdirx` for tab completion --> - [x] #53545 <!-- use `_readdirx` for `walkdir` --> - [x] #53551 <!-- revert "Add @create_log_macro for making custom styled logging macros (#52196)" --> - [x] #53554 <!-- Always return a value in 1-d circshift! of abstractarray.jl --> - [x] #53424 <!-- yet more atomics & cache-line fixes on work-stealing queue --> - [x] #53571 <!-- Update Documenter to v1.3 for inventory writing --> - [x] #53403 <!-- Move parallel precompilation to Base --> - [x] #53589 <!-- add back `unsafe_convert` to pointer for arrays --> - [x] #53596 <!-- build: remove extra .a file --> - [x] #53606 <!-- fix error path in `precompilepkgs` --> - [x] #53004 <!-- Unexport with, at_with, and ScopedValue from Base --> - [x] #53629 <!-- typo fix in scoped values docs --> - [x] #53630 <!-- sroa: Fix incorrect scope counting --> - [x] #53598 <!-- Use Base parallel precompilation to build stdlibs --> - [x] #53649 <!-- precompilepkgs: package in boths deps and weakdeps are in fact only weak --> - [x] #53671 <!-- Fix bootstrap Base precompile in cross compile configuration --> - [x] #52125 <!-- Load Pkg if not already to reinstate missing package add prompt --> - [x] #53602 <!-- Handle zero on arrays of unions of number types and missings --> - [x] #53516 <!-- permit NamedTuple{<:Any, Union{}} to be created --> - [x] #53643 <!-- Bump CSL to 1.1.1 to fix libgomp bug --> - [x] #53679 <!-- move precompile workload back from Base --> - [x] #53663 <!-- add isassigned methods for reinterpretarray --> - [x] #53662 <!-- [REPL] fix incorrectly cleared line after completions accepted --> - [x] #53611 <!-- Linalg: matprod_dest for Diagonal and adjvec --> - [x] #53659 <!-- fix #52025, re-allow all implicit pointer casts in cconvert for Array --> - [x] #53631 <!-- LAPACK: validate input parameters to throw informative errors --> - [x] #53628 <!-- Make some improvements to the Scoped Values documentation. --> - [x] #53655 <!-- Change tbaa of ptr_phi to tbaa_value --> - [x] #53391 <!-- Default to the medium code model in x86 linux --> - [x] #53699 <!-- Move `isexecutable, isreadable, iswritable` to `filesystem.jl` --> - [x] #41232 <!-- Fix linear indexing for ReshapedArray if the parent has offset axes --> - [x] #53527 <!-- Enable analyzegc checks for try catch and fix found issues --> - [x] #52092 - [x] #53682 <!-- Increase build precompilation --> - [x] #53720 - [x] #53553 <!-- typeintersect: fix `UnionAll` unaliasing bug caused by innervars. --> Contains multiple commits, manual intervention needed: - [ ] #53305 <!-- Propagate inbounds in isassigned with CartesianIndex indices --> Non-merged PRs with backport label: - [ ] #53736 <!-- fix literal-pow to return the right type when the base is -1 --> - [ ] #53707 <!-- Make ScopedValue public --> - [ ] #53696 <!-- add invokelatest to on_done callback in bracketed paste --> - [ ] #53660 <!-- put Logging back in default sysimage --> - [ ] #53509 <!-- revert moving "creating packages" from Pkg.jl --> - [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} --> - [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` --> - [ ] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility --> - [ ] #51928 <!-- Styled markdown, with a few tweaks --> - [ ] #51816 <!-- User-themable stacktraces --> - [ ] #51811 <!-- Make banner size depend on terminal size --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia -->
I think it would be useful to have an easy way to make a range of points spaced logarithmically. This PR adds the following:
Before Julia 1.0, there was a
logspace
function which did something different, and less useful:This was removed in #25896 in favour of just writing
2 .^ (1:4)
. If you know what powers you want, that is fine. But if you know what start and endpoints you want, then it's more of a hassle to write something like2 .^ range(log2(2), log2(16), length=4)
. Well, for powers of 2, I think we can all do that in our heads, but if you want 20 plot points between about 3.5 and 1700, that's when this would be neat.After deciding whether or not this is a good idea at all:
I've given this the same notation as Add range(start => stop, length) #39069 but it could equally well belogrange(lo, hi; length)
instead, or in addition.logspace
behaves like the old one, so perhaps best to avoid that name. As @mkitti points out below, Python also has similarlogspace
, and ageomspace
which is like this. That's another possible name. Mathematica'sPowerRange[a,b,r]
takes start, stop, and ratio instead. (This one could be upgraded to accept keywordsratio=1.1
instead oflength=10
, likerange
, but I'm not sure that's often useful.)Ideally it would use something like TwicePrecision to hit the endpoints more precisely? If it is to be an iterator, then maybe it should iterate by multiplication, not by addition &exp
.Edit:
The current implementation is a
LogRange
struct. LikeLinRange
, this should hit the endpoints exactly, but not always rational intermediate points. I think this is a reasonable trade-off, and matches what was asked succinctly here.This thread got quite long, sorry. There are about a dozen implementations (here and in a linked gist), many trying harder to hit intermediate points, trying to figure out what was possible. Too many examples also show powers of 10, as that's an eay place to see how often you hit the exact number... these are a distraction, I think.
Perhaps I should also stress that the definition of this is mathematically very simple. Just as
LinRange(lo, hi, 3)
is the two ends plus their arithmetic mean, soLogRange(lo, hi, 3)
is the two ends plus the geometric mean. For any quantity which is commonly plotted on a log scale, or measured in dB, this is the natural way to divide up the space betweenlo
andhi
. No mention of a base is necessary. No magic inference about whether0.1
really means1//10
is needed either.