Performance regression in function calls #15257

Closed
KristofferC opened this issue Feb 26, 2016 · 2 comments
Labels
performance Must go faster

Comments

@KristofferC

Sorry for the general issue name, but I'm not sure yet what is actually slower.

I was looking into doing matrix multiplication with tuples and stumbled upon a performance regression that might be interesting.

Script:

typealias Mat{M, N, T} NTuple{M, NTuple{N, T}}

# Generates the expression for an unrolled matrix multiply.
# A is M×N and B is N×K, so the result has M rows and K columns.
@generated function unrolled_matmult_noinline{T, M, N, K}(A::Mat{M, N, T}, B::Mat{N, K, T})
    ex = Expr(:tuple, [Expr(:tuple, [:(loopdot_noinline(A, B, $i, $j)) for j=1:K]...) for i=1:M]...)
    return ex
end

# Computes one entry of the product: row Arow of A times column Bcol of B (the contraction over K)
@noinline function loopdot_noinline{M, N, K, T}(A::Mat{M,K,T}, B::Mat{K,N,T}, Arow, Bcol)
    s = zero(T)
    for k = 1:K
        s += A[Arow][k] * B[k][Bcol]
    end
    s
end

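# Creates a random N×N tuple matrix.
# Note: rand(T) runs when the function is generated, so the values are baked in as constants.
@generated function rand_tuple{N, T}(::Type{Mat{N,N,T}})
    return Expr(:tuple, [Expr(:tuple, [rand(T) for j=1:N]...) for i=1:N]...)
end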


function bench_noinline{N}(::Type{Val{N}})
    A = rand_tuple(Mat{N,N, Float64})
    @time for i in 1:10^5
        unrolled_matmult_noinline(A, A)
    end
end

for i in 1:9
    bench_noinline(Val{i})
end

0.4.3:

  0.000263 seconds
  0.002243 seconds
  0.006109 seconds
  0.017329 seconds
  0.039429 seconds
  0.077035 seconds
  0.137714 seconds
  0.231859 seconds
  0.355290 seconds

Master:

  0.000316 seconds
  0.002214 seconds
  0.009517 seconds
  0.028667 seconds
  0.061345 seconds
  0.142360 seconds
  0.392535 seconds
  0.719159 seconds
  1.347999 seconds
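To make the size of the regression easier to read, the two timing listings above can be turned into per-size slowdown factors. The sketch below is an illustration in Julia 1.x syntax, not part of the original script. The variable names are made up for this example, and the numbers are copied from the single `@time` runs above, so small sizes are noisy.

```julia
# Timings in seconds copied from the two listings above, for N = 1:9.
# Variable names are illustrative only.
t_043    = [0.000263, 0.002243, 0.006109, 0.017329, 0.039429,
            0.077035, 0.137714, 0.231859, 0.355290]
t_master = [0.000316, 0.002214, 0.009517, 0.028667, 0.061345,
            0.142360, 0.392535, 0.719159, 1.347999]

# Element-wise ratio: values above 1 mean master is slower than 0.4.3.
slowdown = t_master ./ t_043

for (n, r) in enumerate(slowdown)
    println("N = ", n, ": master / 0.4.3 = ", round(r, digits=2))
end
```

The ratio grows with N, from roughly 1.6x at N = 3 to roughly 3.8x at N = 9.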

Note that the inlined version is about the same on both versions.
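
The inlined variant is not included in the post. A minimal sketch of what it might look like, assuming it differs from the script above only in using `@inline` instead of `@noinline`. The names `loopdot_inline` and `unrolled_matmult_inline` are hypothetical, and the syntax is Julia 1.x (`const` alias, `where` clauses) rather than the 0.4-era syntax of the script:

```julia
# Assumed reconstruction, not the author's code.
const Mat{M, N, T} = NTuple{M, NTuple{N, T}}

# Same contraction over K as loopdot_noinline, but marked @inline (assumption).
@inline function loopdot_inline(A::Mat{M,K,T}, B::Mat{K,N,T}, Arow, Bcol) where {M, N, K, T}
    s = zero(T)
    for k = 1:K
        s += A[Arow][k] * B[k][Bcol]
    end
    s
end

# Builds the unrolled M×K result as a nested tuple of loopdot_inline calls.
@generated function unrolled_matmult_inline(A::Mat{M,N,T}, B::Mat{N,K,T}) where {T, M, N, K}
    Expr(:tuple, [Expr(:tuple, [:(loopdot_inline(A, B, $i, $j)) for j = 1:K]...) for i = 1:M]...)
end

# Usage: multiply a 2×2 tuple matrix by itself.
A = ((1.0, 2.0), (3.0, 4.0))
C = unrolled_matmult_inline(A, A)
```

With the helper inlined, each entry of the result is computed in place rather than through a separate call, which fits the observation that only the non-inlined version got slower.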

@KristofferC changed the title from "Performance regression in" to "Performance regression in function calls" on Feb 26, 2016
@KristofferC

The FixedSizeArrays.jl package also seems slower on 0.5, so maybe it is something with tuples in general?

using FixedSizeArrays  # Mat here is FixedSizeArrays.Mat, not the typealias from the first script

function bench_FSA{N}(::Type{Val{N}})
    A = rand(Mat{N,N, Float64})
    @time for i in 1:10^7
        A * A
    end
end

for i in 1:4
    bench_FSA(Val{i})
end

0.4.3:

  0.019101 seconds
  0.028643 seconds
  0.077164 seconds
  0.187649 seconds

Master:

  0.016788 seconds
  0.038135 seconds
  0.114322 seconds
  0.281677 seconds

@KristofferC

Apparently a duplicate of #15277.
