Locally-Scoped Named Functions have Surprising Behavior that Causes Poor Performance #47760
Comments
Isn't the problem that |
No. Declaring
julia> const fibby = let
function fib(n) n ≤ 1 ? n : fib(n-1)+fib(n-2) end
end
(::var"#fib#1") (generic function with 1 method)
julia> @btime fibby(10)
6.500 μs (0 allocations: 0 bytes)
55
|
In some sense this is how globals/variables work. I think the solution might in the end be to document:
|
Can you elaborate or offer a reference? Is there a law of nature that determines this behavior? My understanding is that the function's reference is being boxed because it cannot be proven that its identifier will not be reassigned. If it is simply imposed that its identifier cannot be reassigned due to the use of special syntax, then boxing its reference can be avoided, no? |
This is effectively equivalent to implementing #5148. The proposed solution is to:
This requires deciding on a "local const rule"; as discussed in #5148 this is non-obvious. If we have a rule that works for local functions then it would also work for any other kind of binding. It would be technically breaking to make function definition in local scope implicitly const when it has not been previously, but I suspect we can justify that as a "minor change", although we'd have to do a PkgEval run to be sure. |
It is a bit more subtle than that, which is that if this was actually detectable to be valid for Which is to say, |
Is it though? Using anonymous function syntax,
julia> const fib_happy = n -> n ≤ 1 ? n : fib_happy(n-1)+fib_happy(n-2)
#110 (generic function with 1 method)
julia> @btime fib_happy(10)
361.658 ns (0 allocations: 0 bytes)
55
So intuitively anyway, it seems like whatever notion this is just needs to be extended to local scopes.
This made my brain explode.
This proposal is to assert that when a function is declared with named function syntax, this will always hold true regardless of its attempted internal usage. Unless this behavior is intentional:
Named Functions Declared at Global Scope:
julia> begin
function fib(n) n ≤ 1 ? n : fib(n-1)+fib(n-2) end
function fib() fib = "I'm sure this behavior is desired, yes?" end
end
fib (generic function with 2 methods)
julia> fib(10)
55
julia> fib()
"I'm sure this behavior is desired, yes?"
julia> fib(10)
55
Named Functions Declared in Local Scope:
julia> fibby = let
function fib(n) n ≤ 1 ? n : fib(n-1)+fib(n-2) end
function fib() fib = "I'm sure this behavior is desired, yes?" end
end
(::var"#fib#1") (generic function with 2 methods)
julia> fibby(10)
55
julia> fibby()
"I'm sure this behavior is desired, yes?"
julia> fibby(10)
ERROR: MethodError: objects of type String are not callable
|
But |
This is intentional, and should probably move discussion to discourse of this, since this is just basic scope rule stuff related to declaring |
Sorry, I do not understand what you mean.
julia> fib_sad = n -> n ≤ 1 ? n : fib_sad(n-1)+fib_sad(n-2)
#1 (generic function with 1 method)
julia> const fib_happy = n -> n ≤ 1 ? n : fib_happy(n-1)+fib_happy(n-2)
#3 (generic function with 1 method)
julia> @btime $fib_sad(10)
6.320 μs (0 allocations: 0 bytes)
55
julia> @btime $fib_happy(10)
337.019 ns (0 allocations: 0 bytes)
55
Then we disagree about what ought to be intended: my assertion is that named function syntax should take special behaviors regarding the function's name, to have the local As it stands, within local scopes, named function syntax provides nothing of real use that anonymous function syntax doesn't already provide, and so if this situation is not improved, then it seems advisable that the use of named function syntax within local scopes should simply be deprecated, or at least discouraged, to avoid confusion and loss of performance (e.g., avoid multiple dispatch in favor of argument-type-dependent branches in order to avoid function reference boxing).
But let's return to happy thoughts, and entertain the idea of fixing the problem. I suppose the behavior I'm proposing could be summarized with this concept:
fibby = let
function fib(n) n ≤ 1 ? n : fib(n-1)+fib(n-2) end
function fib() fib = "I'm sure this behavior is desired, yes?" end
end
would be conceptually equivalent to (assuming
function var"#fib#1" end
function var"#fib#1"(n)
let fib = var"#fib#1"
n ≤ 1 ? n : fib(n-1)+fib(n-2)
end
end
function var"#fib#1"()
let fib = var"#fib#1"
fib = "I'm sure this behavior is desired, yes?"
end
end
fibby = let
local const fib = var"#fib#1"
end
(I'm being a little bit verbose for emphasis.) Namely, named function syntax would cause the function to treat its own identifier in a different way than how it treats other variables it has captured from its enclosing scope, so that it will behave as it would when it has been declared at global scope with no surprises. Is this too tall an ask? |
It looks like we get 95% of what we want here simply by inserting
julia> using REPL, BenchmarkTools
julia> function zoomzoom!(ex) # harmless to insert into global fcns
if ex isa Expr && ex.head ∈ (:function, :(=)) && ex.args[1] isa Expr && ex.args[1].head == :call
pushfirst!(ex.args[2].args, :(local $(ex.args[1].args[1]) = var"#self#"))
end
ex isa Expr && map(zoomzoom!, ex.args)
ex
end
zoomzoom! (generic function with 1 method)
julia> pushfirst!(Base.active_repl_backend.ast_transforms, zoomzoom!); # ;-)
julia> fibby = let
function fib(n) n ≤ 1 ? n : fib(n-1)+fib(n-2) end
function fib() fib = "I'm sure this behavior is desired, yes?" end
end
(::var"#fib#1") (generic function with 2 methods)
julia> @btime $fibby(10) # fast like globally defined named fcn
360.101 ns (0 allocations: 0 bytes)
55
julia> fibby()
"I'm sure this behavior is desired, yes?"
julia> fibby(10) # consistent w/ globally defined named fcn behavior
55
julia> @code_warntype fibby(10) # type-stable; notice local `fib`
MethodInstance for (::var"#fib#1")(::Int64)
from (::var"#fib#1")(n) in Main at REPL[4]:2
Arguments
#self#::Core.Const(var"#fib#1"())
n::Int64
Locals
fib::var"#fib#1"
Body::Int64
1 ─ (fib = #self#)
│ %2 = (n ≤ 1)::Bool
└── goto #3 if not %2
2 ─ return n
3 ─ %5 = (n - 1)::Int64
│ %6 = (fib)(%5)::Int64
│ %7 = (n - 2)::Int64
│ %8 = (fib)(%7)::Int64
│ %9 = (%6 + %8)::Int64
└── return %9
|
You do realize the irony in claiming that you need it to be const, while your example fix shows assigning it values twice, right? That measurement shows the benefit of inference, not the absence of boxing. |
Not really. I just want it to behave like it does when declared at global scope, in terms of performance and user interface, and this is how it behaves when it is declared at global scope. |
This issue applies to @MasonProtter's StaticModules.jl and @mauro3's Parameters.jl as well:
julia> using BenchmarkTools, StaticModules, Parameters
julia> module Foo
fib(n) = n ≤ 1 ? n : fib(n-1) + fib(n-2)
end
Main.Foo
julia> @btime Foo.fib(10)
354.450 ns (0 allocations: 0 bytes)
55
julia> @const_staticmodule Bar begin
fib(n) = n ≤ 1 ? n : fib(n-1) + fib(n-2)
end
StaticModule Bar containing
fib = fib
julia> @btime Bar.fib(10)
7.700 μs (0 allocations: 0 bytes)
55
julia> @with_kw struct BazType{F}
fib::F = (fib(n) = n ≤ 1 ? n : fib(n-1) + fib(n-2))
end
BazType
julia> const Baz = BazType()
BazType{var"#fib#8"}
fib: fib (function of type var"#fib#8")
julia> @btime Baz.fib(10)
8.200 μs (0 allocations: 0 bytes)
55
Whereas Testing shows the proposed partial fix solves these issues. |
Well, yeah of course it does. Both of those create a local scope, that's not new information and doesn't require new benchmarks. |
This issue was mentioned in 44029:
|
@vtjnash IIUC, this doesn't address the case of two mutually recursive |
Closing in favor of #53295 |
Given that #53295 was closed, maybe this issue should remain open to track the performance issue, which is different from the implementation of a specific strategy that addresses it. |
@bb010g #53295 was reopened, so I'll leave this closed. However, it might be good to open a new issue specific to your case of mutually-recursive local functions. There are two sub-cases:
Case 1: Mutually-recursive singleton local functions (no captures except for each other): we can change references to each other's local names to reference their global names instead so they don't have to capture each other. Example:
# this code:
let
is_even(x) = x==0 ? true : is_odd(x-1)
is_odd(x) = x==0 ? false : is_even(x-1)
end
# is currently approximately this:
struct var"#is_even#3"
is_odd :: Core.Box
end
struct var"#is_odd#4"{T}
is_even :: T
end
(var"#self#" :: var"#is_even#3")(x) = x==0 ? true : var"#self#".is_odd.contents(x-1)
(var"#self#" :: var"#is_odd#4")(x) = x==0 ? false : var"#self#".is_even(x-1)
let
is_even = var"#is_even#3"(Core.Box())
is_odd = var"#is_odd#4"(is_even)
is_even.is_odd.contents = is_odd
end
# would become approximately this:
struct var"#is_even#3" end
struct var"#is_odd#4" end
(::var"#is_even#3")(x) = x==0 ? true : var"#is_odd#4"()(x-1)
(::var"#is_odd#4")(x) = x==0 ? false : var"#is_even#3"()(x-1)
let
is_even = var"#is_even#3"()
is_odd = var"#is_odd#4"()
end
Case 2: Mutually-recursive closures: mutually-recursive local functions that capture other values too must be closures and must capture each other; we can rearrange struct definitions so that they can capture each other in a type-stable manner, so long as they're declared one-after-the-other with no new captures introduced in between.
# this code:
let a=0, b=0
is_even(x) = x==a ? true : is_odd(x-1)
is_odd(x) = x==b ? false : is_even(x-1)
end
# is currently approximately this:
struct var"#is_even#3"{A}
is_odd :: Core.Box
a :: A
end
struct var"#is_odd#4"{T, B}
is_even :: T
b :: B
end
(var"#self#" :: var"#is_even#3")(x) = x==var"#self#".a ? true : var"#self#".is_odd.contents(x-1)
(var"#self#" :: var"#is_odd#4")(x) = x==var"#self#".b ? false : var"#self#".is_even(x-1)
let a=0, b=0
is_even = var"#is_even#3"(Core.Box(), a)
is_odd = var"#is_odd#4"(is_even, b)
is_even.is_odd.contents = is_odd
end
# would become approximately this (notice struct def'ns in opposite order):
struct var"#is_odd#4"{T, B}
is_even :: T
b :: B
end
mutable struct var"#is_even#3"{B,A}
const a :: A
is_odd :: var"#is_odd#4"{var"#is_even#3"{B,A}, B}
var"#is_even#3"{B}(a::A) where {B,A} = new{B,A}(a)
end
(var"#self#" :: var"#is_even#3")(x) = x==var"#self#".a ? true : var"#self#".is_odd(x-1)
(var"#self#" :: var"#is_odd#4")(x) = x==var"#self#".b ? false : var"#self#".is_even(x-1)
let a=0, b=0
is_even = var"#is_even#3"{typeof(b)}(a)
is_odd = var"#is_odd#4"(is_even, b)
is_even.is_odd = is_odd
end
In this latter case where the closures must capture each other, an allocation must still occur. However, at least it is type-stable. If the mutually-recursive local functions can be singleton as in Case 1, it is more performant to make them so. (see discussion) |
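The Case 1 rewrite can be approximated by hand today with singleton callable structs. This is only a sketch of the idea, not what the lowering pass emits: `IsEven` and `IsOdd` are hypothetical stand-ins for the generated closure types, and the functions assume non-negative integer input.

```julia
# Hypothetical stand-ins for the compiler-generated closure types.
# Neither struct has fields, so nothing is captured and nothing is boxed.
struct IsEven end
struct IsOdd end

# Each method refers to the other type by its global name instead of
# capturing an instance of it.
(::IsEven)(x) = x == 0 ? true : IsOdd()(x - 1)
(::IsOdd)(x) = x == 0 ? false : IsEven()(x - 1)

is_even_demo = IsEven()
is_odd_demo = IsOdd()

println(is_even_demo(10))      # even check on 10
println(is_odd_demo(7))        # odd check on 7
println(fieldcount(IsEven))    # number of captured fields
```

Because both types are field-less, constructing `IsOdd()` inside the method body costs nothing at run time, which is why the singleton form is preferable whenever no other values need to be captured.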
Background:
We have four different syntaxes to declare functions: two named function syntaxes and two anonymous function syntaxes. The differentiating feature of the named function syntaxes is function identifier type declaration to stabilize the type. This is convenient (don’t need keyword const), great for performance, and quite useful for declaring recursive functions.
Example:
Named function syntax outperforms [non-const] anonymous functions:
Problem:
For whatever reason, in local scopes the named function syntax seems to behave for all intents and purposes like anonymous function syntax by allowing the function's identifier to be reassigned. That means that within a local scope, we have loss of performance and four redundant syntaxes for doing basically the same thing. It’s also a break from the behavior that we come to expect from interacting with these syntaxes at global scope. And because recursive functions require their identifier to be declared simultaneous to their function body, we can’t use the let block trick that works for other variables captured by closures, and we can’t use type-annotation. And because const is disallowed in local scopes, the problem is inescapable.
Examples:
Locally-declared named functions can be reassigned arbitrarily, causing surprising behavior:
Locally-declared recursive functions have very poor performance due to boxing their self-reference:
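One way to see the boxing is to inspect the fields of the closure type that a local recursive function produces. The sketch below uses hypothetical names (`fib_local`, `fibby_demo`). On the Julia versions discussed in this thread the self-reference is expected to show up as a `Core.Box` field, but the exact lowering is an implementation detail and may differ on other versions, so no output is shown.

```julia
# Hypothetical names chosen for this sketch.
fibby_demo = let
    function fib_local(n)
        n ≤ 1 ? n : fib_local(n - 1) + fib_local(n - 2)
    end
end

# A closure type has one field per captured variable. Print each field
# name together with its declared type to see how the self-reference
# `fib_local` is stored.
closure_type = typeof(fibby_demo)
for (name, T) in zip(fieldnames(closure_type), fieldtypes(closure_type))
    println(name, " :: ", T)
end

# The function still computes the right answer; only performance suffers.
println(fibby_demo(10))
```

If a field of type `Core.Box` is printed, every recursive call has to read an untyped value out of the box, so the compiler cannot infer the callee.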
Current Solutions:
It is suggested here to declare the functions as global. This however circumvents the issue and pollutes the global namespace with what should have been local function definitions.
For recursive functions, it is suggested here to use var"#self#". However this is not official API, and also circumvents the issue.
For recursive functions, it is suggested here to use Y-combinators. However this causes performance degradation, and also circumvents the issue.
For recursive functions, it is suggested here to rewrite recursive functions to take themselves as an argument, and wrap them in a calling closure, in a macro-automated fashion. However the extra closure causes increased compile time, and also circumvents the issue.
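The last workaround in the list above (pass the function to itself and wrap it in a calling closure) can be written out by hand as follows. This is a minimal sketch with hypothetical names (`fib_step`, `fibby_selfarg`), not the macro from the linked suggestion.

```julia
fibby_selfarg = let
    # `fib_step` never refers to its own name: the recursion goes through
    # the `self` argument, so there is no self-reference to capture.
    fib_step(self, n) = n ≤ 1 ? n : self(self, n - 1) + self(self, n - 2)

    # The wrapper closure captures `fib_step`, which is assigned exactly
    # once before the closure is created.
    n -> fib_step(fib_step, n)
end

println(fibby_selfarg(10))
```

The cost noted in the list still applies: there are now two closure types to compile instead of one, and callers see the wrapper rather than the named function.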
Proposed Solution:
It seems the right thing to do is to disallow functions declared with named function syntax in local scopes from having their identifier reassigned, consistent with the behavior of named function syntax at the global scope, so that references to them don’t need to be boxed and so that their identifiers' behavior is unsurprising.
Would Solve:
This problem (non-recursive).
This problem (recursive).
This problem (non-recursive)
This problem (recursive)
Probably more.