GEMM slower than GEMV slower than AXPY equivalent on Intel i5 CPU #528

@jiahao

Consider the following Julia code (using Julia 0.4-dev):

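let
    B = randn(1000, 1000)
    v = randn(1000)
    y = randn(1000)
    sB = [slice(B, :, j) for j = 1:size(B, 2)]  # column views of B

    # GEMM: y = B*v + y, with the vectors v and y passed as single-column operands
    @time for i = 1:1000; BLAS.gemm!('N', 'N', 1.0, B, v, 1.0, y); end

    # GEMV: the same update as a matrix-vector product
    @time for i = 1:1000; BLAS.gemv!('N', 1.0, B, v, 1.0, y); end

    # AXPY equivalent: accumulate v[j] * B[:, j] into y one column at a time
    @time for i = 1:1000
        for j = 1:size(B, 2)
            BLAS.axpy!(v[j], sB[j], y)
        end
    end
end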

On @andreasnoack's machine, a MacBook Pro with an i7-4870HQ CPU, GEMM is about 4 times slower than GEMV (timings are listed in the order GEMM, GEMV, AXPY loop):

elapsed time: 1.084909686 seconds (0 bytes allocated)
elapsed time: 0.2644927 seconds (0 bytes allocated)
elapsed time: 0.321705553 seconds (0 bytes allocated)

On my machine, a MacBook Pro with an i5-4258U, GEMM is also the slowest, and the AXPY equivalent is the fastest of the three computations:

elapsed time: 1.693223657 seconds (0 bytes allocated)
elapsed time: 0.818590556 seconds (0 bytes allocated)
elapsed time: 0.715702898 seconds (0 bytes allocated)

I find these relative timings surprising.
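
As a rough sketch of why the ordering is unexpected: all three timed loops apply the same update to y, and the AXPY version simply expands the matrix-vector product column by column. Here b_j denotes column j of B (a symbol introduced only for this illustration), and n = 1000 as in the code above:

```latex
y \leftarrow y + Bv \;=\; y + \sum_{j=1}^{n} v_j\, b_j, \qquad n = 1000
```

Each pass therefore costs roughly 2n^2 = 2×10^6 floating-point operations whichever routine is called, so the timing differences presumably come from how each kernel is executed rather than from how much arithmetic it does.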
