noncliff: improve performance of _allthreesumtozero #377

Merged (2 commits) on Oct 25, 2024

Fe-r-oz
Contributor

@Fe-r-oz Fe-r-oz commented Oct 4, 2024

This PR aims to improve the performance of _allthreesumtozero. Benchmarks are attached below.
As you suggested some time ago, I used evals=1 and setup to avoid misleading results.

Benchmarks:

julia> using BenchmarkTools
julia> function _allthreesumtozero(a, b, c)
           @inbounds @simd for i in 1:length(a)
               iseven(a[i] + b[i] + c[i]) || return false
           end
           true
       end

julia> N = 10^6;

julia> a = rand(1:100, N);
julia> b = rand(1:100, N);
julia> c = rand(1:100, N);

julia> @benchmark _allthreesumtozero($a, $b, $c) evals=1 setup=(a_copy=copy(a); b_copy=copy(b); c_copy=copy(c))
BenchmarkTools.Trial: 633 samples with 1 evaluation.
 Range (min … max):  330.000 ns …  19.116 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     943.000 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.086 μs ± 933.113 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   █▃▆█▇▄▄▁ ▄▃▃▃▅▇▂ ▂▁▃  ▁                                       
  █████████▄███████████▇▇█▆▆█▆▇▇▇▆▇▅▆▄▄▅▃▃▁▃▅▃▃▂▃▂▂▂▂▂▁▃▂▃▂▁▁▁▃ ▅
  330 ns           Histogram: frequency by time         2.89 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> # Optimization: test parity with a bitwise AND instead of iseven
julia> function _allthreesumtozero_optimized(a, b, c)
           n = length(a)
           @inbounds @simd for i in 1:n
               odd = (a[i] + b[i] + c[i]) & 1
               if odd != 0
                   return false
               end
           end
           true
       end

julia> @benchmark _allthreesumtozero_optimized($a, $b, $c) evals=1 setup=(a_copy=copy(a); b_copy=copy(b); c_copy=copy(c))
BenchmarkTools.Trial: 650 samples with 1 evaluation.
 Range (min … max):  234.000 ns … 17.180 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     590.500 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   785.857 ns ±  1.052 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄█▃▄▇ ▅   ▁   ▁                                               
  ███████▇███▆█▄█▆▄▆▄▃▄▄▄▄▄▃▂▄▂▃▃▁▃▂▁▁▂▂▁▁▁▁▂▂▂▁▁▁▁▁▂▂▁▁▁▁▁▁▁▃ ▃
  234 ns          Histogram: frequency by time         3.12 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
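
Since the change replaces `iseven(x)` with `(x & 1) == 0`, a quick equivalence check may be useful alongside the timings. The following is a minimal sketch, not part of the PR: the two functions are copied from the transcripts above, while `versions_agree` and its keyword arguments `trials` and `n` are hypothetical names introduced only for illustration.

```julia
# Original parity check, as posted above.
function _allthreesumtozero(a, b, c)
    @inbounds @simd for i in 1:length(a)
        iseven(a[i] + b[i] + c[i]) || return false
    end
    true
end

# Bitwise variant, as posted above.
function _allthreesumtozero_optimized(a, b, c)
    n = length(a)
    @inbounds @simd for i in 1:n
        odd = (a[i] + b[i] + c[i]) & 1
        if odd != 0
            return false
        end
    end
    true
end

# Hypothetical helper (an assumption, not in the PR): compare both versions
# on short random inputs, including negative integers, plus one input whose
# sums are all even so that the `true` branch is exercised as well.
function versions_agree(; trials=1000, n=16)
    for _ in 1:trials
        a = rand(-100:100, n)
        b = rand(-100:100, n)
        c = rand(-100:100, n)
        _allthreesumtozero(a, b, c) == _allthreesumtozero_optimized(a, b, c) || return false
    end
    z = fill(2, n)  # 2 + 2 + 2 = 6 is even at every position
    return _allthreesumtozero(z, z, z) && _allthreesumtozero_optimized(z, z, z)
end

versions_agree()
```

Random short vectors almost always contain an odd sum, so the explicit all-even input is what covers the case where the loop runs to the end.
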
  • The code is properly formatted and commented.
  • Substantial new functionality is documented within the docs.
  • All new functionality is tested.
  • All of the automated tests on github pass.

@Fe-r-oz
Contributor Author

Fe-r-oz commented Oct 4, 2024

I think the PR is ready for review. Thanks for your earlier tip about using evals=1 and setup; it kept me from producing misleading results. Thank you!
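
Both versions return at the first odd sum, so the measured time depends on where that first odd sum falls in the input. A minimal sketch of how one might inspect this, assuming only Base Julia: `first_odd_index` and the `*_even` arrays are hypothetical names introduced for illustration and are not part of the PR.

```julia
# Hypothetical helper (an assumption, not in the PR): index of the first
# element-wise sum that is odd, or `nothing` if every sum is even.
first_odd_index(a, b, c) = findfirst(isodd, a .+ b .+ c)

N = 10^6

# Random inputs as in the benchmarks above: roughly half of the sums are odd,
# so the first odd sum usually appears within the first few elements.
a = rand(1:100, N)
b = rand(1:100, N)
c = rand(1:100, N)
first_odd_index(a, b, c)

# Inputs built from even numbers only: every sum is even, so a parity loop
# with early return has to visit all N elements before returning true.
a_even = 2 .* rand(1:50, N)
b_even = 2 .* rand(1:50, N)
c_even = 2 .* rand(1:50, N)
first_odd_index(a_even, b_even, c_even) === nothing
```

Benchmarking on inputs like `a_even`, `b_even`, `c_even` would time the full traversal rather than the early exit.
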

src/nonclifford.jl (outdated review thread, resolved)
@Krastanov
Member

looks great, thanks!

@Krastanov Krastanov merged commit 6863bd0 into QuantumSavory:nonclif Oct 25, 2024
8 of 12 checks passed
@Fe-r-oz Fe-r-oz deleted the todo branch October 25, 2024 09:20