Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature request: local static variables #15056

Open
zgimbutas opened this issue Feb 12, 2016 · 29 comments
Open

feature request: local static variables #15056

zgimbutas opened this issue Feb 12, 2016 · 29 comments

Comments

@zgimbutas
Copy link
Contributor

I have recently run into a major performance bottleneck while using const keyword for arrays inside functions. The following are two short examples that demonstrate this problem. In short, there is a huge performance penalty to declare even short const arrays inside the functions relative to global declarations (about a factor of 10 in this test, 0.000120 s for the slow version versus 0.000014 s for the fast version per 200 function calls) since the constant arrays are reinitialized every time the function is called, Moving constants into the outside (global) scope gets full Fortran-like speed. (In this example, I needed a faster version of the modified Bessel I_0 function for the so called Bessel-Kaiser window since the default AMOS based besseli is somewhat sluggish, 0.000130 s for the same workload).

Basically, a true equivalent of Fortran data or C const keyword is currently missing. The trick of moving constants outside the functions is imperfect for many obvious reasons, and a possible workaround using closures suffers from current inefficient implementation (plus it makes the code less readable). A possible solution would be to treat constants as initialized only once during the compilation step (just like the current treatment of global constants).

The code and the timings follow.

Slow version:

function i0eva(x)
#
#  This subroutine evaluates the modified bessel
#  function I_0 of a real argument x
#
#               input parameters:
#
#  x - the argument of the bessel function to be evaluated
#  
#               output parameters:
#
#  y - the modified bessel function I_0 of the argument x
#
#           . . . evaluate the function I_0
#

const p0to3=[
  0.3674336090541583E+00, -.1483942216332326E+00,
  0.7538109249292410E-01, -.3429202889217209E-01,
  0.1335283653302592E-01, -.4461174022644684E-02,
  0.1294920713528336E-02, -.3309968933824789E-03,
  0.7542634832396139E-04, -.1548671523094681E-04,
  0.2891239479464111E-05, -.4946276244165349E-06,
  0.7806369735032947E-07, -.1143166990263007E-07,
  0.1561212097141843E-08, -.1997725692654135E-09,
  0.2403196202942698E-10, -.2710668355445937E-11,
  0.2916600802142503E-12, -.3325583782158792E-13,
  0.3199318882156914E-14]

const p3to6=[
  0.1941982776283822E+00, -.2323945533089018E-01,
  0.4244030631168826E-02, -.8759373293717004E-03,
  0.1909924203989986E-03, -.4222271728163107E-04,
  0.9160432785148500E-05, -.1904704016240210E-05,
  0.3739731123576933E-06, -.6879564187569590E-07,
  0.1182061524819559E-07, -.1896547862750232E-08,
  0.2845293860689127E-09, -.4001288511461210E-10,
  0.5284459250803188E-11, -.6526404831544231E-12,
  0.7680294564558936E-13, -.9627486914965938E-14,
  0.1006473132584426E-14]

const p6to9=[
  0.3989284896164599E+00, 0.5092106731858055E-01,
  -.6127221009097870E-02, 0.6131729304562125E+00,
  -.4628528030678180E+01, -.1205084525934594E+02,
  0.7340780560072762E+03, -.8729685960965613E+04,
  0.5831335775767627E+05, -.2430536532475642E+06,
  0.6287065304211588E+06, -.9292779115166327E+06,
  0.6032651250101362E+06]

const p9to12=[
  0.3989407763246097E+00, 0.5000863927874731E-01,
  0.2226512800214346E-01, 0.1652919141248679E+00,
  -.1957907568960066E+01, 0.1894748754972728E+02,
  -.1103884918016903E+03, 0.3682201782413656E+03,
  -.5202693680397354E+03, 0.7939108402881759E+01,
  0.6287065304211588E+06, -.9292779115166327E+06]

const p12to24=[
  0.3989422838292772E+00, 0.4986710003332280E-01,
  0.2811267232744196E-01, 0.2585760370970408E-01,
  0.1658198437867097E+00, -.2952322171370709E+01,
  0.5466563227185643E+02, -.6921662882875580E+03,
  0.6144958292895270E+04, -.3605643906301946E+05,
  0.1258751734607877E+06, -.1960251421440677E+06]

const p24to48=[
  0.3989422803956930E+00, 0.4986778658145504E-01,
  0.2805045279642165E-01, 0.2923081851609006E-01,
  0.4428963589957261E-01, 0.1017678215019001E+00,
  0.6423782560416375E-01, 0.1927258316093262E+01]

const p48to96=[
  0.3989422804014209E+00, 0.4986778505649491E-01,
  0.2805062761844382E-01, 0.2921959904479739E-01,
  0.4472650398122760E-01, 0.9140365337986182E-01,
  0.2035846167608939E+00, 0.1104329030954804E+01]

const p96to192=[
  0.3989422804013822E+00, 0.4986778509112728E-01,
  0.2805061543528473E-01, 0.2922179806303123E-01,
  0.4451046595352375E-01, 0.1022708740517523E+00]

    x=abs(x)

#
#        . . . for  0< x <3
#
    if x <= 3

        n=21
        shft=3.0
        shft=shft/2
        y = polev(p0to3,x-shft,n)
        y = y * exp(x)
        return y

    end
#
#        . . . for  3< x <6
#
    if x <= 6 

        n=19
        shft=9.0
        shft=shft/2

        y = polev(p3to6,x-shft,n)
        y = y*exp(x)
        return y

    end
#
#        . . . for  6< x <9
#
    if x <= 9

        n=13
        xinv=1.0/x
        y = polev(p6to9,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  9< x <12
#
    if x <= 12

        n=10
        xinv=1.0/x
        y = polev(p9to12,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  12< x <24
#
    if x <= 24

        n=12
        xinv=1.0/x
        y = polev(p12to24,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  24< x <48
#
    if x <= 48

        n=8
        xinv=1.0/x
        y = polev(p24to48,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 96

        n=8
        xinv=1.0/x
        y = polev(p48to96,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 192

        n=6
        xinv=1.0/x
        y = polev(p96to192,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        for x>192
#
    d=1.0+1.0/(8*x)+9*1.0/(2*(8*x)^2)+
        9*25/(6*(8*x)^3)+
        9*25*49/(24*(8*x)^4)+
        9*25*49*81/(120*(8*x)^5)+
        9*25*49*81*121/(720*(8*x)^6)

    pi=4.0*atan(1.0)
    y=exp(x)/sqrt(2*pi*x)*d

    return y

end

@vectorize_1arg Number i0eva


function polev(p,x,n)

    y=0.0
    d=1.0
@inbounds    for i=1:n
        y=y+d*p[i]
        d=d*x
    end
    return y

end



x = linspace(0,200,200)

println("=============")
println("pre-compilation times")
@time y1 = besseli(0,x)
@time y0 = i0eva(x)

println("=============")
println("timings for besseli(0,x)")
for i = 1:3
@time y1 = besseli(0,x)
end

println("=============")
println("timings for i0eva(x)")
for i = 1:3
@time y0 = i0eva(x)
end

println("error = ",vecnorm((y0-y1)./y0))

Slow version timings:

=============
pre-compilation times
  0.061073 seconds (113.77 k allocations: 4.770 MB)
  0.036436 seconds (49.96 k allocations: 2.578 MB)
=============
timings for besseli(0,x)
  0.000133 seconds (1 allocation: 1.766 KB)
  0.000121 seconds (1 allocation: 1.766 KB)
  0.000119 seconds (1 allocation: 1.766 KB)
=============
timings for i0eva(x)
  0.000106 seconds (1.60 k allocations: 261.141 KB)
  0.000127 seconds (1.60 k allocations: 261.141 KB)
  0.000120 seconds (1.60 k allocations: 261.141 KB)
error = 3.671230449315888e-15

Fast version:

const p0to3=[
  0.3674336090541583E+00, -.1483942216332326E+00,
  0.7538109249292410E-01, -.3429202889217209E-01,
  0.1335283653302592E-01, -.4461174022644684E-02,
  0.1294920713528336E-02, -.3309968933824789E-03,
  0.7542634832396139E-04, -.1548671523094681E-04,
  0.2891239479464111E-05, -.4946276244165349E-06,
  0.7806369735032947E-07, -.1143166990263007E-07,
  0.1561212097141843E-08, -.1997725692654135E-09,
  0.2403196202942698E-10, -.2710668355445937E-11,
  0.2916600802142503E-12, -.3325583782158792E-13,
  0.3199318882156914E-14]

const p3to6=[
  0.1941982776283822E+00, -.2323945533089018E-01,
  0.4244030631168826E-02, -.8759373293717004E-03,
  0.1909924203989986E-03, -.4222271728163107E-04,
  0.9160432785148500E-05, -.1904704016240210E-05,
  0.3739731123576933E-06, -.6879564187569590E-07,
  0.1182061524819559E-07, -.1896547862750232E-08,
  0.2845293860689127E-09, -.4001288511461210E-10,
  0.5284459250803188E-11, -.6526404831544231E-12,
  0.7680294564558936E-13, -.9627486914965938E-14,
  0.1006473132584426E-14]

const p6to9=[
  0.3989284896164599E+00, 0.5092106731858055E-01,
  -.6127221009097870E-02, 0.6131729304562125E+00,
  -.4628528030678180E+01, -.1205084525934594E+02,
  0.7340780560072762E+03, -.8729685960965613E+04,
  0.5831335775767627E+05, -.2430536532475642E+06,
  0.6287065304211588E+06, -.9292779115166327E+06,
  0.6032651250101362E+06]

const p9to12=[
  0.3989407763246097E+00, 0.5000863927874731E-01,
  0.2226512800214346E-01, 0.1652919141248679E+00,
  -.1957907568960066E+01, 0.1894748754972728E+02,
  -.1103884918016903E+03, 0.3682201782413656E+03,
  -.5202693680397354E+03, 0.7939108402881759E+01,
  0.6287065304211588E+06, -.9292779115166327E+06]

const p12to24=[
  0.3989422838292772E+00, 0.4986710003332280E-01,
  0.2811267232744196E-01, 0.2585760370970408E-01,
  0.1658198437867097E+00, -.2952322171370709E+01,
  0.5466563227185643E+02, -.6921662882875580E+03,
  0.6144958292895270E+04, -.3605643906301946E+05,
  0.1258751734607877E+06, -.1960251421440677E+06]

const p24to48=[
  0.3989422803956930E+00, 0.4986778658145504E-01,
  0.2805045279642165E-01, 0.2923081851609006E-01,
  0.4428963589957261E-01, 0.1017678215019001E+00,
  0.6423782560416375E-01, 0.1927258316093262E+01]

const p48to96=[
  0.3989422804014209E+00, 0.4986778505649491E-01,
  0.2805062761844382E-01, 0.2921959904479739E-01,
  0.4472650398122760E-01, 0.9140365337986182E-01,
  0.2035846167608939E+00, 0.1104329030954804E+01]

const p96to192=[
  0.3989422804013822E+00, 0.4986778509112728E-01,
  0.2805061543528473E-01, 0.2922179806303123E-01,
  0.4451046595352375E-01, 0.1022708740517523E+00]


function i0eva(x)
#
#  This subroutine evaluates the modified bessel
#  function I_0 of a real argument x
#
#               input parameters:
#
#  x - the argument of the bessel function to be evaluated
#  
#               output parameters:
#
#  y - the modified bessel function I_0 of the argument x
#
#           . . . evaluate the function I_0
#

    x=abs(x)

#
#        . . . for  0< x <3
#
    if x <= 3

        n=21
        shft=3.0
        shft=shft/2
        y = polev(p0to3,x-shft,n)
        y = y * exp(x)
        return y

    end
#
#        . . . for  3< x <6
#
    if x <= 6 

        n=19
        shft=9.0
        shft=shft/2

        y = polev(p3to6,x-shft,n)
        y = y*exp(x)
        return y

    end
#
#        . . . for  6< x <9
#
    if x <= 9

        n=13
        xinv=1.0/x
        y = polev(p6to9,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  9< x <12
#
    if x <= 12

        n=10
        xinv=1.0/x
        y = polev(p9to12,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  12< x <24
#
    if x <= 24

        n=12
        xinv=1.0/x
        y = polev(p12to24,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  24< x <48
#
    if x <= 48

        n=8
        xinv=1.0/x
        y = polev(p24to48,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 96

        n=8
        xinv=1.0/x
        y = polev(p48to96,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 192

        n=6
        xinv=1.0/x
        y = polev(p96to192,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        for x>192
#
    d=1.0+1.0/(8*x)+9*1.0/(2*(8*x)^2)+
        9*25/(6*(8*x)^3)+
        9*25*49/(24*(8*x)^4)+
        9*25*49*81/(120*(8*x)^5)+
        9*25*49*81*121/(720*(8*x)^6)

    pi=4.0*atan(1.0)
    y=exp(x)/sqrt(2*pi*x)*d

    return y

end

@vectorize_1arg Number i0eva


function polev(p,x,n)

    y=0.0
    d=1.0
@inbounds    for i=1:n
        y=y+d*p[i]
        d=d*x
    end
    return y

end



x = linspace(0,200,200)

println("=============")
println("pre-compilation times")
@time y1 = besseli(0,x)
@time y0 = i0eva(x)

println("=============")
println("timings for besseli(0,x)")
for i = 1:3
@time y1 = besseli(0,x)
end

println("=============")
println("timings for i0eva(x)")
for i = 1:3
@time y0 = i0eva(x)
end

println("error = ",vecnorm((y0-y1)./y0))

Fast version timings:

=============
pre-compilation times
  0.062518 seconds (113.77 k allocations: 4.770 MB)
  0.023772 seconds (35.42 k allocations: 1.665 MB)
=============
timings for besseli(0,x)
  0.000132 seconds (1 allocation: 1.766 KB)
  0.000121 seconds (1 allocation: 1.766 KB)
  0.000143 seconds (1 allocation: 1.766 KB)
=============
timings for i0eva(x)
  0.000016 seconds (1 allocation: 1.766 KB)
  0.000014 seconds (1 allocation: 1.766 KB)
  0.000014 seconds (1 allocation: 1.766 KB)
error = 3.671230449315888e-15
@StefanKarpinski
Copy link
Member

There's already an open issue about the fact that const has no effect in local scope: #5148. However, that's not the issue here since you're not reassigning the arrays. Rather, the issue is confusion about what const means – you seem to want it to behave like static in C and only initialize the value once, but that's not what const means – it means that a binding will not change. Semantically, this must provide a new array each time the function runs. It could be possible to recognize by static analysis that these arrays never escape and are never modified, thus allowing them to be allocated once and reused, but that's a pretty fancy optimization. The right thing to do is what you do in the second version: make them const and global.

@KristofferC
Copy link
Member

If you search for "static variables" in the github search you will find discussion about this. The conclusion was that variables like this should be stack allocated and there are open PRs that have started working towards this.

@zgimbutas
Copy link
Contributor Author

Thanks, I see now that this issue has been mentioned in issues #5148 and #12627, I missed them while looking up for a possible solution. But making these variables both const and global introduces unnecessary global names and conflicts. Now, if it were possible for the user to somehow specify exactly that but without introducing extra global names...

I understand that this would break current language semantic but since currently const has no effect in local scope, is it possible to to change the behaviour of const keyword within the functions so that it would be able to refer only to the variables/arrays that are true constants (e.g. literals, literal arrays, etc.) at the binding level within the scope? The local keyword for variables already does the job for the objects that can be modified, so there is some overlap in functionality.

Or, maybe the compiler should be able to recognize the simplest cases of const literals and literal arrays (maybe multiplied by simple functions with literal arguments) and act accordingly? Constant literal arrays are constants after all, and performance gains for automatically moving them to the outside the scope are so big.

I would agree that const should refer only to a binding, and the user should not expect the content of an array to be immutable in this context.

@JeffBezanson
Copy link
Member

You can use let to assign the variables outside the function without introducing global names.

@JeffBezanson
Copy link
Member

Closely related: #5024

@zgimbutas
Copy link
Contributor Author

In this situation, using let will completely hide the array - it will be
unavailable inside the defined function as well.

Probably I should simply wait for proper implementation of static local
variables in Julia language. I would also provide some mechanism for the
initialization of such variables at the compilation time, before the first
call. This will make the code both efficient and thread-safe if static
variables are not being written to during the subsequent calls.

On Fri, Feb 12, 2016 at 11:02 PM, Jeff Bezanson notifications@github.com
wrote:

You can use let to assign the variables outside the function without
introducing global names.


Reply to this email directly or view it on GitHub
#15056 (comment).

@KristofferC
Copy link
Member

Someone posted a macro somewhere which I modified:

macro static(init)
  var = gensym()
  eval(current_module(), :(const $var = $init))
  var = esc(var)
  quote
    global $var
    $var
  end
end

function foo()
    J = @static zeros(5,5)
end

julia> @allocated foo()
0

J gets assigned to a global const variable and it gets initialized at macro expansion time (which is before compilation time). The macro uses gensym to not fight with other variable names.

@JeffBezanson
Copy link
Member

You can put the function definition inside the let block. I don't see why this doesn't work.

@zgimbutas
Copy link
Contributor Author

Unfortunately, then the function is unavailable and I have an error.

pre-compilation times
  0.062164 seconds (113.77 k allocations: 4.768 MB)
ERROR: LoadError: UndefVarError: i0eva not defined
 in include at ./boot.jl:261
 in include_from_node1 at ./loading.jl:304
 in process_options at ./client.jl:280
 in _start at ./client.jl:378
while loading /home/gimbutas/nist/github-contrib/julia-scripts/nufft1d/i0eva_dr.jl, in expression starting on line 155

if the function is defined as follow

let

const p0to3=[
  0.3674336090541583E+00, -.1483942216332326E+00,
  0.7538109249292410E-01, -.3429202889217209E-01,
  0.1335283653302592E-01, -.4461174022644684E-02,
  0.1294920713528336E-02, -.3309968933824789E-03,
  0.7542634832396139E-04, -.1548671523094681E-04,
  0.2891239479464111E-05, -.4946276244165349E-06,
  0.7806369735032947E-07, -.1143166990263007E-07,
  0.1561212097141843E-08, -.1997725692654135E-09,
  0.2403196202942698E-10, -.2710668355445937E-11,
  0.2916600802142503E-12, -.3325583782158792E-13,
  0.3199318882156914E-14]

const p3to6=[
  0.1941982776283822E+00, -.2323945533089018E-01,
  0.4244030631168826E-02, -.8759373293717004E-03,
  0.1909924203989986E-03, -.4222271728163107E-04,
  0.9160432785148500E-05, -.1904704016240210E-05,
  0.3739731123576933E-06, -.6879564187569590E-07,
  0.1182061524819559E-07, -.1896547862750232E-08,
  0.2845293860689127E-09, -.4001288511461210E-10,
  0.5284459250803188E-11, -.6526404831544231E-12,
  0.7680294564558936E-13, -.9627486914965938E-14,
  0.1006473132584426E-14]

const p6to9=[
  0.3989284896164599E+00, 0.5092106731858055E-01,
  -.6127221009097870E-02, 0.6131729304562125E+00,
  -.4628528030678180E+01, -.1205084525934594E+02,
  0.7340780560072762E+03, -.8729685960965613E+04,
  0.5831335775767627E+05, -.2430536532475642E+06,
  0.6287065304211588E+06, -.9292779115166327E+06,
  0.6032651250101362E+06]

const p9to12=[
  0.3989407763246097E+00, 0.5000863927874731E-01,
  0.2226512800214346E-01, 0.1652919141248679E+00,
  -.1957907568960066E+01, 0.1894748754972728E+02,
  -.1103884918016903E+03, 0.3682201782413656E+03,
  -.5202693680397354E+03, 0.7939108402881759E+01,
  0.6287065304211588E+06, -.9292779115166327E+06]

const p12to24=[
  0.3989422838292772E+00, 0.4986710003332280E-01,
  0.2811267232744196E-01, 0.2585760370970408E-01,
  0.1658198437867097E+00, -.2952322171370709E+01,
  0.5466563227185643E+02, -.6921662882875580E+03,
  0.6144958292895270E+04, -.3605643906301946E+05,
  0.1258751734607877E+06, -.1960251421440677E+06]

const p24to48=[
  0.3989422803956930E+00, 0.4986778658145504E-01,
  0.2805045279642165E-01, 0.2923081851609006E-01,
  0.4428963589957261E-01, 0.1017678215019001E+00,
  0.6423782560416375E-01, 0.1927258316093262E+01]

const p48to96=[
  0.3989422804014209E+00, 0.4986778505649491E-01,
  0.2805062761844382E-01, 0.2921959904479739E-01,
  0.4472650398122760E-01, 0.9140365337986182E-01,
  0.2035846167608939E+00, 0.1104329030954804E+01]

const p96to192=[
  0.3989422804013822E+00, 0.4986778509112728E-01,
  0.2805061543528473E-01, 0.2922179806303123E-01,
  0.4451046595352375E-01, 0.1022708740517523E+00]

function i0eva(x)
#
#  This subroutine evaluates the modified bessel
#  function I_0 of a real argument x
#
#               input parameters:
#
#  x - the argument of the bessel function to be evaluated
#  
#               output parameters:
#
#  y - the modified bessel function I_0 of the argument x
#
#           . . . evaluate the function I_0
#

    x=abs(x)

#
#        . . . for  0< x <3
#
    if x <= 3

        n=21
        shft=3.0
        shft=shft/2
        y = polev(p0to3,x-shft,n)
        y = y * exp(x)
        return y

    end
#
#        . . . for  3< x <6
#
    if x <= 6 

        n=19
        shft=9.0
        shft=shft/2

        y = polev(p3to6,x-shft,n)
        y = y*exp(x)
        return y

    end
#
#        . . . for  6< x <9
#
    if x <= 9

        n=13
        xinv=1.0/x
        y = polev(p6to9,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  9< x <12
#
    if x <= 12

        n=10
        xinv=1.0/x
        y = polev(p9to12,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  12< x <24
#
    if x <= 24

        n=12
        xinv=1.0/x
        y = polev(p12to24,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  24< x <48
#
    if x <= 48

        n=8
        xinv=1.0/x
        y = polev(p24to48,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 96

        n=8
        xinv=1.0/x
        y = polev(p48to96,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 192

        n=6
        xinv=1.0/x
        y = polev(p96to192,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        for x>192
#
    d=1.0+1.0/(8*x)+9*1.0/(2*(8*x)^2)+
        9*25/(6*(8*x)^3)+
        9*25*49/(24*(8*x)^4)+
        9*25*49*81/(120*(8*x)^5)+
        9*25*49*81*121/(720*(8*x)^6)

    pi=4.0*atan(1.0)
    y=exp(x)/sqrt(2*pi*x)*d

    return y

end

@vectorize_1arg Number i0eva

end



function polev(p,x,n)

    y=0.0
    d=1.0
@inbounds    for i=1:n
        y=y+d*p[i]
        d=d*x
    end
    return y

end

@zgimbutas
Copy link
Contributor Author

On the other hand, Kristoffer's solution is working! Thanks so much. This is a very good macro, should definitely go into the mainline of Julia.

Timings:

$ julia -O i0eva_dr.jl 
-O i0eva_dr.jl
=============
pre-compilation times
  0.064265 seconds (113.77 k allocations: 4.770 MB)
  0.025130 seconds (36.44 k allocations: 1.713 MB)
=============
timings for besseli(0,x)
  0.000124 seconds (1 allocation: 1.766 KB)
  0.000144 seconds (1 allocation: 1.766 KB)
  0.000111 seconds (1 allocation: 1.766 KB)
=============
timings for i0eva(x)
  0.000022 seconds (1 allocation: 1.766 KB)
  0.000014 seconds (1 allocation: 1.766 KB)
  0.000013 seconds (1 allocation: 1.766 KB)
error = 3.671230449315888e-15

if the function defined as follows:

macro static(init)
  var = gensym()
  eval(current_module(), :(const $var = $init))
  var = esc(var)
  quote
    global $var
    $var
  end
end

function i0eva(x)
#
#  This subroutine evaluates the modified bessel
#  function I_0 of a real argument x
#
#               input parameters:
#
#  x - the argument of the bessel function to be evaluated
#  
#               output parameters:
#
#  y - the modified bessel function I_0 of the argument x
#
#           . . . evaluate the function I_0
#

p0to3 = @static [
  0.3674336090541583E+00, -.1483942216332326E+00,
  0.7538109249292410E-01, -.3429202889217209E-01,
  0.1335283653302592E-01, -.4461174022644684E-02,
  0.1294920713528336E-02, -.3309968933824789E-03,
  0.7542634832396139E-04, -.1548671523094681E-04,
  0.2891239479464111E-05, -.4946276244165349E-06,
  0.7806369735032947E-07, -.1143166990263007E-07,
  0.1561212097141843E-08, -.1997725692654135E-09,
  0.2403196202942698E-10, -.2710668355445937E-11,
  0.2916600802142503E-12, -.3325583782158792E-13,
  0.3199318882156914E-14]

p3to6 = @static [
  0.1941982776283822E+00, -.2323945533089018E-01,
  0.4244030631168826E-02, -.8759373293717004E-03,
  0.1909924203989986E-03, -.4222271728163107E-04,
  0.9160432785148500E-05, -.1904704016240210E-05,
  0.3739731123576933E-06, -.6879564187569590E-07,
  0.1182061524819559E-07, -.1896547862750232E-08,
  0.2845293860689127E-09, -.4001288511461210E-10,
  0.5284459250803188E-11, -.6526404831544231E-12,
  0.7680294564558936E-13, -.9627486914965938E-14,
  0.1006473132584426E-14]

p6to9 = @static [
  0.3989284896164599E+00, 0.5092106731858055E-01,
  -.6127221009097870E-02, 0.6131729304562125E+00,
  -.4628528030678180E+01, -.1205084525934594E+02,
  0.7340780560072762E+03, -.8729685960965613E+04,
  0.5831335775767627E+05, -.2430536532475642E+06,
  0.6287065304211588E+06, -.9292779115166327E+06,
  0.6032651250101362E+06]

p9to12 = @static [
  0.3989407763246097E+00, 0.5000863927874731E-01,
  0.2226512800214346E-01, 0.1652919141248679E+00,
  -.1957907568960066E+01, 0.1894748754972728E+02,
  -.1103884918016903E+03, 0.3682201782413656E+03,
  -.5202693680397354E+03, 0.7939108402881759E+01,
  0.6287065304211588E+06, -.9292779115166327E+06]

p12to24 = @static [
  0.3989422838292772E+00, 0.4986710003332280E-01,
  0.2811267232744196E-01, 0.2585760370970408E-01,
  0.1658198437867097E+00, -.2952322171370709E+01,
  0.5466563227185643E+02, -.6921662882875580E+03,
  0.6144958292895270E+04, -.3605643906301946E+05,
  0.1258751734607877E+06, -.1960251421440677E+06]

p24to48 = @static [
  0.3989422803956930E+00, 0.4986778658145504E-01,
  0.2805045279642165E-01, 0.2923081851609006E-01,
  0.4428963589957261E-01, 0.1017678215019001E+00,
  0.6423782560416375E-01, 0.1927258316093262E+01]

p48to96 = @static [
  0.3989422804014209E+00, 0.4986778505649491E-01,
  0.2805062761844382E-01, 0.2921959904479739E-01,
  0.4472650398122760E-01, 0.9140365337986182E-01,
  0.2035846167608939E+00, 0.1104329030954804E+01]

p96to192 = @static [
  0.3989422804013822E+00, 0.4986778509112728E-01,
  0.2805061543528473E-01, 0.2922179806303123E-01,
  0.4451046595352375E-01, 0.1022708740517523E+00]

    x=abs(x)

#
#        . . . for  0< x <3
#
    if x <= 3

        n=21
        shft=3.0
        shft=shft/2
        y = polev(p0to3,x-shft,n)
        y = y * exp(x)
        return y

    end
#
#        . . . for  3< x <6
#
    if x <= 6 

        n=19
        shft=9.0
        shft=shft/2

        y = polev(p3to6,x-shft,n)
        y = y*exp(x)
        return y

    end
#
#        . . . for  6< x <9
#
    if x <= 9

        n=13
        xinv=1.0/x
        y = polev(p6to9,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  9< x <12
#
    if x <= 12

        n=10
        xinv=1.0/x
        y = polev(p9to12,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  12< x <24
#
    if x <= 24

        n=12
        xinv=1.0/x
        y = polev(p12to24,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  24< x <48
#
    if x <= 48

        n=8
        xinv=1.0/x
        y = polev(p24to48,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 96

        n=8
        xinv=1.0/x
        y = polev(p48to96,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        . . . for  48< x <96
#
    if x <= 192

        n=6
        xinv=1.0/x
        y = polev(p96to192,xinv,n)
        y = y*exp(x)*sqrt(xinv)
        return y

    end
#
#        for x>192
#
    d=1.0+1.0/(8*x)+9*1.0/(2*(8*x)^2)+
        9*25/(6*(8*x)^3)+
        9*25*49/(24*(8*x)^4)+
        9*25*49*81/(120*(8*x)^5)+
        9*25*49*81*121/(720*(8*x)^6)

    pi=4.0*atan(1.0)
    y=exp(x)/sqrt(2*pi*x)*d

    return y

end

@vectorize_1arg Number i0eva


function polev(p,x,n)

    y=0.0
    d=1.0
@inbounds    for i=1:n
        y=y+d*p[i]
        d=d*x
    end
    return y

end

@yuyichao
Copy link
Contributor

Unfortunately, then the function is unavailable and I have an error.

You are missing the global.

On the other hand, Kristoffer's solution is working! Thanks so much. This is a very good macro, should definitely go into the mainline of Julia.

No, it is identical with declaring a global and has all the issues of that approach.

@zgimbutas
Copy link
Contributor Author

Well, if it is a global unique name that is not interfering with other names, I am perfectly happy with it as a workaround. One just has to be sure that gensym() is generating truly unique names. I hope it does.

Still, saying something like

static const a = [1 2 3]

would be much cleaner.

@zgimbutas
Copy link
Contributor Author

Correct, adding

global i0eva

immediately after the function declaration fixed the error and the timings
are good

$ julia -O i0eva_dr.jl
-O i0eva_dr.jl
=============
pre-compilation times
  0.064265 seconds (113.77 k allocations: 4.770 MB)
  0.025130 seconds (36.44 k allocations: 1.713 MB)
=============
timings for besseli(0,x)
  0.000124 seconds (1 allocation: 1.766 KB)
  0.000144 seconds (1 allocation: 1.766 KB)
  0.000111 seconds (1 allocation: 1.766 KB)
=============
timings for i0eva(x)
  0.000022 seconds (1 allocation: 1.766 KB)
  0.000014 seconds (1 allocation: 1.766 KB)
  0.000013 seconds (1 allocation: 1.766 KB)
error = 3.671230449315888e-15

Still, it is not as elegant solution, the code is somewhat harder to read
and each function using static data will effectively need a separate
wrapper.

On Sun, Feb 14, 2016 at 11:11 AM, Yichao Yu notifications@github.com
wrote:

Unfortunately, then the function is unavailable and I have an error.

You are missing the global.

On the other hand, Kristoffer's solution is working! Thanks so much. This
is a very good macro, should definitely go into the mainline of Julia.

No, it is identical with declaring a global and has all the issues of that
approach.


Reply to this email directly or view it on GitHub
#15056 (comment).

@zgimbutas
Copy link
Contributor Author

zgimbutas commented Feb 14, 2016

One more minor modification. Saying

const p96to192 = @static [
  0.3989422804013822E+00, 0.4986778509112728E-01,
  0.2805061543528473E-01, 0.2922179806303123E-01,
  0.4451046595352375E-01, 0.1022708740517523E+00]

inside the code really does the job. So, the question is only if gensym() always generates unique names for the global scope. Should they be in a different namespace instead?

@KristofferC
Copy link
Member

You will get unique variable names.

@JeffBezanson
Copy link
Member

I could imagine a static keyword that effectively lifts the definition out to a let-bound variable for you.

@JeffBezanson JeffBezanson changed the title Constants keep getting reinitialized inside functions feature request: local static variables Feb 16, 2016
@cossio
Copy link
Contributor

cossio commented Dec 14, 2016

Will the static keyboard make it into 0.6?

@JeffBezanson
Copy link
Member

I think that's unlikely; there are other ways to accomplish this so it doesn't seem so important.

@dpsanders
Copy link
Contributor

Could the/a "correct" solution for this be added to the FAQ?

@colinfang
Copy link
Contributor

I could imagine a static keyword that effectively lifts the definition out to a let-bound variable for you.

let is not good enough as const doesn't work in let which could cause performance issue

@yuyichao
Copy link
Contributor

yuyichao commented Apr 7, 2017

const shouldn't be needed at all in local scope.

@cossio
Copy link
Contributor

cossio commented Apr 7, 2017

@yuyichao const does not affect performance in local scope. But it is still useful semantically and to avoid bugs, preventing inadvertent changes of a variable's value.

@yuyichao
Copy link
Contributor

yuyichao commented Apr 7, 2017

Sure but

  1. It doesn't affect performance as claimed by feature request: local static variables #15056 (comment)
  2. It's an unrelated issue since a static keyword (which is what this issue is talking about) isn't about providing that protection either and I believe we already have an issue that track this.

@cossio
Copy link
Contributor

cossio commented Apr 7, 2017

@yuyichao My point is that static const is something you cannot get with let (as claimed by @colinfang above), thus making static potentially more useful (assuming it admits const, that is).

@yuyichao
Copy link
Contributor

yuyichao commented Apr 7, 2017

I'm not sure what you are trying to argue about my comment here. If you just want to highlight the combination of const and static that's fine but it's totally unrelated to what I said. (OTOH it seems like #15056 (comment) is what you want and @static may not need to know anything about const.)

The comment I was replying to is specifically about performance (and it does not mention the function of static const at all) and it's that wrong claim that I'm replying to. I have not mentioned a single word about the usefulness of @static or combining @static with const.

@martinholters
Copy link
Member

Doesn't

macro static(expr)
    QuoteNode(eval(__module__, expr))
end

do the job here? (Except that it clashes with the existing @static macro and would need another name.)

@StefanKarpinski
Copy link
Member

It would be possible to use the same macro name by branching on the kind of expression. I.e. @static x = expr produces a static local variable while @static if ... else ... end evaluates the condition statically.

@dpinol
Copy link

dpinol commented Oct 22, 2021

For the record,
the updated version (eval => Base.eval & current_module() => @MODULE ) of #15056 (comment) works for my use case.

macro staticVal(init)
    var = gensym()
    Base.eval(@__MODULE__, :(const $var = $init))
    var = esc(var)
    quote
      global $var
      $var
    end
  end

Unfortunately it only works when called from the same module where it's defined. Otherwise, I get
UndefVarError: ##272 not defined on the line which contains $var. @KristofferC any idea how to solve it?

@t-bltg
Copy link
Contributor

t-bltg commented Oct 17, 2022

@dpinol, use __module__:

macro static_var(init)
  var = gensym()
  Base.eval(__module__, :(const $var = $init))
  quote
    global $var
    $var
  end |> esc
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests