
add a type assert to logging to prevent Base.require invalidation #48810

Merged: 1 commit merged into master on Feb 28, 2023

Conversation

@KristofferC (Member) commented Feb 27, 2023

Before:

```julia
In [2]: using ModelBaseEcon

In [3]: @time using Random # invalidations
# precompile(Tuple{typeof(Base.something), String, String, Vararg{String}})
# precompile(Tuple{typeof(Base.require), Module, Symbol})
# precompile(Tuple{typeof(Base._require_prelocked), Base.PkgId, String})
  0.475471 seconds (599.54 k allocations: 40.704 MiB, 8.27% gc time, 99.10% compilation time: 100% of which was recompilation)
```

After:

```julia
In [2]: using ModelBaseEcon

In [3]: @time using Random
  0.004375 seconds (5.75 k allocations: 372.331 KiB)
```
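The pattern behind the fix can be sketched as follows (a hypothetical illustration with made-up names, not the actual diff): when inference cannot prove the result type of a call, the compiled caller depends on the current method table and can be invalidated when a new package defines additional methods. A type assert pins the inferred type and cuts that invalidation edge.

```julia
# Hypothetical sketch: `lookup` returns Union{Nothing, Symbol} at best,
# but if its inferred return type widens to Any, callers become
# invalidation-prone.
lookup(d::Dict{Symbol,Any}, k::Symbol) = get(d, k, nothing)

function log_group_unstable(d)
    g = lookup(d, :group)          # wide inferred type: invalidation-prone
    g === nothing ? :default : g
end

function log_group_stable(d)
    g = lookup(d, :group)::Union{Nothing,Symbol}  # assert pins the type
    g === nothing ? :default : g
end
```

The assert costs a cheap runtime check but makes the caller's compiled code independent of later method definitions.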

@KristofferC added the compiler:latency (Compiler latency), backport 1.8 (Change should be backported to release-1.8), and backport 1.9 (Change should be backported to release-1.9) labels Feb 27, 2023
KristofferC pushed a commit to KristofferC/ModelBaseEcon.jl that referenced this pull request Feb 27, 2023
ForwardDiff quite aggressively specializes most of its functions on the
concrete input function type. This gives a slight performance
improvement but it also means that a significant chunk of code has to be
compiled for every call to `ForwardDiff` with a new function.

Previously, for every equation in a model we would call
`ForwardDiff.gradient` with the julia function corresponding to that
equation. This would then compile the ForwardDiff functions for all of
these julia functions.

Looking at the specializations generated by a model, we see:

```julia
GC = ForwardDiff.GradientConfig{FRBUS_VAR.MyTag, Float64, 4, Vector{ForwardDiff.Dual{FRBUS_VAR.MyTag, Float64, 4}}}
MethodInstance for ForwardDiff.vector_mode_dual_eval!(::FRBUS_VAR.EquationEvaluator{:resid_515}, ::GC, ::Vector{Float64})
MethodInstance for ForwardDiff.vector_mode_gradient!(::DiffResults.MutableDiffResult{1, Float64, Tuple{Vector{Float64}}}, ::FRBUS_VAR.EquationEvaluator{:resid_515}, ::Vector{Float64}, ::GC)
MethodInstance for ForwardDiff.vector_mode_dual_eval!(::FRBUS_VAR.EquationEvaluator{:resid_516}, ::GC, ::Vector{Float64})
MethodInstance for ForwardDiff.vector_mode_gradient!(::DiffResults.MutableDiffResult{1, Float64, Tuple{Vector{Float64}}}, ::FRBUS_VAR.EquationEvaluator{:resid_516}, ::Vector{Float64}, ::GC)
```

which are all identical methods compiled for different equations.

In this PR, we instead hide the concrete function for every equation
behind a common wrapper function. This means that only one
specialization of the ForwardDiff functions gets compiled.
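One way to implement such a wrapper can be sketched as follows (hypothetical names, assuming a callable-struct approach): storing the callee in an abstractly typed field means every equation shares one wrapper type, so ForwardDiff's internals are specialized and compiled only once instead of once per equation.

```julia
# Hypothetical sketch: the abstract field type `Function` hides the
# concrete type of each equation's function from ForwardDiff.
struct EquationWrapper
    f::Function
end
(w::EquationWrapper)(x) = w.f(x)

# ForwardDiff.gradient(EquationWrapper(eq1), x) and
# ForwardDiff.gradient(EquationWrapper(eq2), x) now reuse the same
# MethodInstances, at the cost of one dynamic dispatch per call.
```

The trade-off is exactly the one the benchmark shows: a small runtime cost from the extra dynamic dispatch in exchange for drastically less compilation.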

Using the following benchmark script:

```julia
unique!(push!(LOAD_PATH, realpath("./models")))
using ModelBaseEcon
using Random # See JuliaLang/julia#48810

@time using FRBUS_VAR

m = FRBUS_VAR.model
nrows = 1 + m.maxlag + m.maxlead
ncols = length(m.allvars)
pt = zeros(nrows, ncols);
@time @eval eval_RJ(pt, m);

using BenchmarkTools
@btime eval_RJ(pt, m);
```

This PR has the following changes:

- Package load time: 0.078s -> 0.05s
- First call `eval_RJ`: 11.47s -> 4.97s
- Runtime performance of `eval_RJ`: 550μs -> 590μs

So there seems to be a roughly 10% runtime performance regression in the `eval_RJ`
call, but the latency is drastically reduced.
KristofferC pushed a commit to KristofferC/ModelBaseEcon.jl that referenced this pull request Feb 27, 2023
…tion

Currently, every model gets its own ForwardDiff tag, which means that every model
also has a unique dual-number type. This causes every function called
with dual numbers to be recompiled for every model.

In this PR, we define a shared tag in ModelBaseEcon that all models use.
This means that we can push the precompile generation for many functions
from the models into ModelBaseEcon itself, which changes their cost
from O(n_models) to O(1).

This PR also corrects a mismatch between the `precompile` calls and the actual call to `ForwardDiff`.
The `precompile` calls used `MyTag` as the tag type, which means the `GradientConfig`
constructor should have been passed an instance, `MyTag()`, so that the type of the tag is `MyTag`.
Instead, `MyTag` itself was passed to `GradientConfig`, and since `typeof(MyTag)` is `DataType`,
the types in the `precompile` calls differed from the types actually
encountered at runtime.
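The mismatch can be sketched like this (a hypothetical illustration, assuming ForwardDiff's convention that the config's tag type parameter is the type of the value passed in):

```julia
# Hypothetical sketch of the tag-instance vs. tag-type confusion.
struct MyTag end

# Passing an *instance* gives the config a tag parameter of MyTag,
# matching precompile(... Dual{MyTag, ...} ...) statements:
#   cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk{4}(), MyTag())

# Passing the *type* itself gives a tag parameter of DataType, since
# typeof(MyTag) === DataType, so none of the precompiled methods match:
#   cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk{4}(), MyTag)
```

With the mismatch in place the precompile statements still ran, but compiled code for types that were never used, so the first-call latency was unaffected by them.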

Using the following benchmark script:

```julia
unique!(push!(LOAD_PATH, realpath("./models"))) # hide

@time using ModelBaseEcon
using Random # See JuliaLang/julia#48810

@time using FRBUS_VAR

m = FRBUS_VAR.model
nrows = 1 + m.maxlag + m.maxlead
ncols = length(m.allvars)
pt = zeros(nrows, ncols);
@time @eval eval_RJ(pt, m);

using BenchmarkTools
@btime eval_RJ(pt, m);
```

This PR has the following changes:

- Loading ModelBaseEcon: 0.641551s -> 0.645943s
- Loading model: 0.053s -> 0.032s
- First call `eval_RJ`: 5.50s -> 0.64s
- Benchmark `eval_RJ`: 597.966μs -> 573.923μs
@KristofferC merged commit 8570951 into master Feb 28, 2023
@KristofferC deleted the kc/assert_log branch February 28, 2023 11:26
KristofferC added a commit that referenced this pull request Mar 3, 2023
@KristofferC mentioned this pull request Mar 3, 2023
KristofferC added a commit that referenced this pull request Mar 3, 2023
@KristofferC removed the backport 1.9 (Change should be backported to release-1.9) label Mar 6, 2023
KristofferC pushed a commit to KristofferC/ModelBaseEcon.jl that referenced this pull request Mar 13, 2023
KristofferC pushed a commit to KristofferC/ModelBaseEcon.jl that referenced this pull request Mar 13, 2023
…tion