Call optimization #554

Closed
wants to merge 3 commits into from

chfast
Collaborator

@chfast chfast commented Sep 25, 2020

Results may be wrong, as some of the benchmarks don't use calls. Confirmed for GCC 10 with LTO.

Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/execute/blake2b/512_bytes_rounds_1_mean                     -0.1128         -0.1128            87            78            87            78
fizzy/execute/blake2b/512_bytes_rounds_16_mean                    -0.1149         -0.1148          1329          1176          1329          1176                                                                                 
fizzy/execute/ecpairing/onepoint_mean                             -0.0918         -0.0918        411770        373976        411774        373980                                                                                 
fizzy/execute/keccak256/512_bytes_rounds_1_mean                   -0.0856         -0.0856           105            96           105            96                                                                                 
fizzy/execute/keccak256/512_bytes_rounds_16_mean                  -0.0999         -0.0999          1548          1393          1548          1393                                                                                 
fizzy/execute/memset/256_bytes_mean                               -0.1481         -0.1481             7             6             7             6                                                                                 
fizzy/execute/memset/60000_bytes_mean                             -0.1510         -0.1510          1623          1378          1623          1378                                                                                 
fizzy/execute/mul256_opt0/input0_mean                             -0.1334         -0.1334            29            25            29            25                                                                                 
fizzy/execute/mul256_opt0/input1_mean                             -0.1338         -0.1338            29            25            29            25                                                                                 
fizzy/execute/ramanujan_pi/33_runs_mean                           -0.1255         -0.1255           138           120           138           120                                                                                 
fizzy/execute/sha1/512_bytes_rounds_1_mean                        -0.1141         -0.1141            94            84            94            84                                                                                 
fizzy/execute/sha1/512_bytes_rounds_16_mean                       -0.1163         -0.1163          1318          1164          1318          1164                                                                                 
fizzy/execute/sha256/512_bytes_rounds_1_mean                      -0.1193         -0.1193            96            84            96            84                                                                                 
fizzy/execute/sha256/512_bytes_rounds_16_mean                     -0.1228         -0.1228          1326          1163          1326          1163                                                                                 
fizzy/execute/taylor_pi/pi_1000000_runs_mean                      -0.0392         -0.0392         41668         40036         41669         40036                                                                                 
fizzy/execute/micro/eli_interpreter/halt_mean                     -0.1528         -0.1528             0             0             0             0                                                                                 
fizzy/execute/micro/eli_interpreter/exec105_mean                  -0.1269         -0.1269             5             4             5             4                                                                                 
fizzy/execute/micro/factorial/10_mean                             -0.1033         -0.1032             0             0             0             0                                                                                 
fizzy/execute/micro/factorial/20_mean                             -0.1066         -0.1066             1             0             1             0                                                                                 
fizzy/execute/micro/fibonacci/24_mean                             -0.0947         -0.0947          5230          4735          5230          4735                                                                                 
fizzy/execute/micro/host_adler32/1_mean                           -0.0603         -0.0603             0             0             0             0
fizzy/execute/micro/host_adler32/100_mean                         -0.0753         -0.0753             3             3             3             3
fizzy/execute/micro/host_adler32/1000_mean                        -0.0549         -0.0549            31            29            31            29
fizzy/execute/micro/spinner/1_mean                                -0.1555         -0.1555             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                             -0.1487         -0.1487            10             9            10             9
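
For readers unfamiliar with this kind of table, here is a minimal sketch of how the first numeric column relates to the raw timings. It assumes the table comes from Google Benchmark's comparison script, with the columns being relative time change, relative CPU change, old time, new time, old CPU, new CPU. The function name `relative_change` and the zero guard are illustrative choices, not taken from this PR.

```cpp
#include <cassert>
#include <cmath>

// Relative change between an old and a new measurement: (new - old) / |old|.
// A negative result means the new build took less time.
inline double relative_change(double old_time, double new_time)
{
    if (old_time == 0.0)  // Guard chosen for this sketch only.
        return 0.0;
    return (new_time - old_time) / std::abs(old_time);
}
```

For the `ecpairing/onepoint` row, `relative_change(411770, 373976)` gives roughly -0.0918, matching the table. Rows with small values, such as 87 and 78, will not reproduce the printed ratio exactly, presumably because the displayed timings are rounded.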

@axic
Member

axic commented Sep 30, 2020

Rebased this locally on #562; it is much easier to read when grouped with execute.

assert(stack.size() >= num_args);
span<const Value> call_args{stack.rend() - num_args, num_args};

const auto ret = execute(instance, func_idx, call_args.begin(), depth + 1);
Member

@axic axic Oct 5, 2020

Actually, why do we have both .data() and .begin() when they point to the same thing?

Collaborator Author

Not in std::span.
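
To illustrate the exchange above, here is a minimal sketch of passing the top `num_args` stack items to a callee by pointer, without copying them. Everything in it is a hypothetical stand-in: `Value` is a plain integer, the stack is a `std::vector`, and `ArgsView`, `top_args` and `sum_args` are names invented for this example, not Fizzy's real types. In this pointer-based view `begin()` and `data()` really are the same thing. In C++20's `std::span` the iterator type is implementation-defined, so `begin()` is not guaranteed to be a raw pointer, while `data()` always is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the interpreter's value type.
using Value = std::uint64_t;

// Minimal pointer-based view: begin() and data() return the same pointer.
struct ArgsView
{
    const Value* ptr;
    std::size_t count;

    const Value* data() const noexcept { return ptr; }
    const Value* begin() const noexcept { return ptr; }
    const Value* end() const noexcept { return ptr + count; }
    std::size_t size() const noexcept { return count; }
};

// View the top num_args items of the stack without copying them.
inline ArgsView top_args(const std::vector<Value>& stack, std::size_t num_args)
{
    assert(stack.size() >= num_args);
    return {stack.data() + (stack.size() - num_args), num_args};
}

// Hypothetical callee that receives its arguments as a raw pointer.
inline Value sum_args(const Value* args, std::size_t num_args)
{
    Value sum = 0;
    for (std::size_t i = 0; i < num_args; ++i)
        sum += args[i];
    return sum;
}
```

With a stack holding 1, 2, 3, 4, 5, calling `top_args(stack, 3)` yields a view over 3, 4, 5, and `sum_args(view.data(), view.size())` returns 12. If the view were a `std::span`, `data()` would be the portable way to obtain the pointer for such a callee.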

@codecov

codecov bot commented Oct 5, 2020

Codecov Report

Merging #554 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #554   +/-   ##
=======================================
  Coverage   98.25%   98.25%           
=======================================
  Files          63       63           
  Lines        9224     9231    +7     
=======================================
+ Hits         9063     9070    +7     
  Misses        161      161           

@chfast
Collaborator Author

chfast commented Oct 5, 2020

Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/execute/blake2b/512_bytes_rounds_1_mean                     +0.1395         +0.1395            77            88            77            88
fizzy/execute/blake2b/512_bytes_rounds_16_mean                    +0.1386         +0.1386          1164          1325          1164          1325
fizzy/execute/ecpairing/onepoint_mean                             +0.0746         +0.0746        384289        412951        384291        412955
fizzy/execute/keccak256/512_bytes_rounds_1_mean                   +0.1103         +0.1103            94           105            94           105
fizzy/execute/keccak256/512_bytes_rounds_16_mean                  +0.0938         +0.0938          1387          1517          1387          1517
fizzy/execute/memset/256_bytes_mean                               +0.1888         +0.1888             6             7             6             7
fizzy/execute/memset/60000_bytes_mean                             +0.1996         +0.1996          1374          1649          1374          1649
fizzy/execute/mul256_opt0/input1_mean                             +0.1312         +0.1312            25            29            25            29
fizzy/execute/ramanujan_pi/33_runs_mean                           +0.1020         +0.1020           118           131           118           131
fizzy/execute/sha1/512_bytes_rounds_1_mean                        +0.1375         +0.1375            84            95            84            95
fizzy/execute/sha1/512_bytes_rounds_16_mean                       +0.1394         +0.1394          1163          1325          1163          1325
fizzy/execute/sha256/512_bytes_rounds_1_mean                      +0.1316         +0.1316            84            96            84            96
fizzy/execute/sha256/512_bytes_rounds_16_mean                     +0.1378         +0.1378          1164          1325          1164          1325
fizzy/execute/taylor_pi/pi_1000000_runs_mean                      +0.0312         +0.0312         40031         41279         40032         41280
fizzy/execute/micro/eli_interpreter/exec105_mean                  +0.1739         +0.1739             4             5             4             5
fizzy/execute/micro/factorial/20_mean                             +0.0374         +0.0374             1             1             1             1
fizzy/execute/micro/fibonacci/24_mean                             +0.0843         +0.0843          4957          5375          4957          5375
fizzy/execute/micro/host_adler32/1_mean                           +0.1161         +0.1161             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                        +0.1519         +0.1519            29            34            29            34
fizzy/execute/micro/spinner/1_mean                                +0.0550         +0.0550             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                             +0.1180         +0.1180             9            10             9            10

I think we were just lucky with #552, because now whatever I change around invoke, I get a 10% regression. It is the same story in #574.

@axic axic added the optimization Performance optimization label Oct 9, 2020
@chfast
Collaborator Author

chfast commented Oct 20, 2020

Replaced by #602. The remaining code copy has no effect.

@chfast chfast closed this Oct 20, 2020
@axic axic deleted the call_optimization branch November 6, 2020 17:21