JET reports type-instability when sampling from `Beta` #1866

bvdmitri · 2024-05-31T10:02:22Z

Consider the following:

julia> @report_opt rand(Beta(1, 1), 10)
═════ 1 possible error found ═════
┌ rand(s::Beta{Float64}, dims::Int64) @ Distributions /Users/bvdmitri/.julia/packages/Distributions/fgrZq/src/genericrand.jl:22
│┌ rand(::Random.TaskLocalRNG, ::Beta{Float64}, ::Int64) @ Distributions /Users/bvdmitri/.julia/packages/Distributions/fgrZq/src/genericrand.jl:24
││┌ rand(rng::Random.TaskLocalRNG, s::Beta{Float64}, dims::Tuple{Int64}) @ Distributions /Users/bvdmitri/.julia/packages/Distributions/fgrZq/src/genericrand.jl:52
│││ runtime dispatch detected: Distributions.rand!(rng::Random.TaskLocalRNG, %3::Distributions.BetaSampler{Float64}, %2::Vector{Float64})::Vector{Float64}
││└────────────────────

The Normal works just fine

julia> @report_opt rand(Normal(1, 1), 10)
No errors detected

Also sampling a single point is fine

julia> @report_opt rand(Beta(1, 1))
No errors detected

This harms performance, when sampling "inplace" with Beta (with rand!)

The text was updated successfully, but these errors were encountered:

devmotion · 2024-05-31T13:40:37Z

The reason for this observation is that sampler(Beta(...)) is not type stable: Depending on the parameters, a different algorithm is used for sampling. #1350 "fixed" the performance and allocation issues arising from this type instability with a function barrier. Based on the benchmarks in that PR it seems it doesn't harm performance anymore.

bvdmitri · 2024-05-31T14:04:16Z

Ok, that explains a bit the situation. The real issue is that sampler(Beta(..., ...)) is type-unstable and depends on the parameters. But can it always return the same object, which would simply have an if inside depending on the parameters? E.g.

sampler(d::Beta) = BetaSampler(d, other_fields...) # type-stable, always returns the same

function rand!(rng, sampler::BetaSampler, container)
    if some_condition(sampler)
        one_algorithm!(rng, container)
    else
        another_algorithm!(rng, container)
    end
end

Or a specialized method for rand! for Beta would also solve this issue entirely, e.g.

function rand!(rng, dist::Beta, n)
    if some_condition(dist)
        rand!(rng, OnerBetaSampler(dist), n)
    else 
        rand!(rng, AnotherBetaSampler(dist), n)
    end
end

devmotion · 2024-05-31T14:17:43Z

But can it always return the same object

The motivation for sampler is to precompute algorithm-dependent quantities when drawing multiple variates. The number and type of these quantities are quite different for different algorithms, as you can see in https://github.com/JuliaStats/Distributions.jl/blob/master/src/samplers/gamma.jl. Always computing all of these would be a bit wasteful I think.

devmotion · 2024-05-31T14:19:03Z

Or a specialized method for rand! for Beta would also solve this issue entirely

I still wonder, is there any issue here? It seems the function barrier in #1350 fixed the performance issues.

bvdmitri · 2024-05-31T14:28:56Z

It shouldn't recompute algorithm-dependent quantities in my second proposal with a specialized rand! since it basically exactly the same code with an explicit static if statement (multiple dispatch is really just a sophisticated runtime if statement).

To answer your question about the performance, yes, it does affect performance in the following case:

julia> foo(x, dist) = foreach(1:100_000) do _
           rand!(dist, x)
       end

julia> x = zeros(10);

julia> dist = Beta(1, 1)
Beta{Float64}(α=1.0, β=1.0)

julia> sr = sampler(dist)

julia> @btime foo($x, $dist);
  30.232 ms (100000 allocations: 6.10 MiB)

julia> @btime foo($x, $sr);
  27.726 ms (0 allocations: 0 bytes)

It's not much in this particular example, but it does accumulate in our larger program and it also allocates extra (while the whole idea of rand! is to sample in place)

bvdmitri · 2024-05-31T14:33:50Z

Another thing, I mentioned that sampling a single point is fine and is in fact type-stable.

julia> @report_opt rand(Beta(1, 1))
No errors detected

That suggests that there is a potential for improvement for the in-place version

devmotion · 2024-05-31T14:42:56Z

To answer your question about the performance, yes, it does affect performance in the following case:

The example is a bit contrieved since ideally you would use the sampler if you want to sample from the same distribution in a loop (similar to how the Random docs show that you should construct and use a Random.Sampler if you want to call rand repeatedly - BTW there's an issue to unify sampler with Random.Sampler a bit more: #1316).

The single variate sampling does not use samplers, hence it behaves differently.

bvdmitri · 2024-05-31T16:05:14Z

In the real code the distribution is not the same unfortunately

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JET reports type-instability when sampling from `Beta` #1866

JET reports type-instability when sampling from `Beta` #1866

bvdmitri commented May 31, 2024

devmotion commented May 31, 2024

bvdmitri commented May 31, 2024

devmotion commented May 31, 2024

devmotion commented May 31, 2024

bvdmitri commented May 31, 2024

bvdmitri commented May 31, 2024 •

edited

Loading

devmotion commented May 31, 2024

bvdmitri commented May 31, 2024

JET reports type-instability when sampling from Beta #1866

JET reports type-instability when sampling from Beta #1866

Comments

bvdmitri commented May 31, 2024

devmotion commented May 31, 2024

bvdmitri commented May 31, 2024

devmotion commented May 31, 2024

devmotion commented May 31, 2024

bvdmitri commented May 31, 2024

bvdmitri commented May 31, 2024 • edited Loading

devmotion commented May 31, 2024

bvdmitri commented May 31, 2024

JET reports type-instability when sampling from `Beta` #1866

JET reports type-instability when sampling from `Beta` #1866

bvdmitri commented May 31, 2024 •

edited

Loading