Stack optimization #247

Merged
merged 4 commits into from
May 26, 2020


chfast
Collaborator

@chfast chfast commented Mar 30, 2020

Execution comparison

Comparing master-exec to opt-exec                                                                                                                                       
Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New                       
-------------------------------------------------------------------------------------------------------------------------------------------------
fizzy/execute/blake2b/512_bytes_rounds_1_mean                     -0.1219         -0.1218            98            86            98            86
fizzy/execute/blake2b/512_bytes_rounds_16_mean                    -0.1459         -0.1459          1472          1257          1472          1257
fizzy/execute/ecpairing/onepoint_mean                             -0.2044         -0.2044        606879        482851        606880        482854
fizzy/execute/keccak256/512_bytes_rounds_1_mean                   -0.0619         -0.0619           104            97           104            97
fizzy/execute/keccak256/512_bytes_rounds_16_mean                  -0.0729         -0.0729          1495          1386          1495          1386
fizzy/execute/memset/256_bytes_mean                               -0.1190         -0.1189             9             8             9             8
fizzy/execute/memset/60000_bytes_mean                             -0.1218         -0.1218          1809          1589          1809          1589
fizzy/execute/mul256_opt0/input0_mean                             -0.0909         -0.0909            30            27            30            27
fizzy/execute/mul256_opt0/input1_mean                             -0.0909         -0.0909            30            27            30            27
fizzy/execute/sha1/512_bytes_rounds_1_mean                        -0.1538         -0.1538           105            89           105            89
fizzy/execute/sha1/512_bytes_rounds_16_mean                       -0.1598         -0.1598          1456          1224          1456          1224
fizzy/execute/sha256/512_bytes_rounds_1_mean                      -0.1101         -0.1101            96            85            96            85
fizzy/execute/sha256/512_bytes_rounds_16_mean                     -0.1178         -0.1178          1301          1148          1301          1148
fizzy/execute/micro/factorial/10_mean                             -0.0329         -0.0321             2             2             2             2
fizzy/execute/micro/factorial/20_mean                             -0.0284         -0.0276             3             3             3             3
fizzy/execute/micro/fibonacci/24_mean                             -0.0893         -0.0893         16020         14588         16020         14589
fizzy/execute/micro/host_adler32/1_mean                           -0.0360         -0.0359             1             1             1             1
fizzy/execute/micro/host_adler32/100_mean                         -0.0509         -0.0504             7             7             7             7
fizzy/execute/micro/host_adler32/1000_mean                        -0.0462         -0.0462            68            65            68            65
fizzy/execute/micro/spinner/1_mean                                -0.0280         -0.0296             1             0             1             0
fizzy/execute/micro/spinner/1000_mean                             -0.1301         -0.1302            13            11            13            11

Parsing comparison

Comparing master-parse to opt-parse                                                                                                                                     
Benchmark                                               Time             CPU      Time Old      Time New       CPU Old       CPU New                                    
------------------------------------------------------------------------------------------------------------------------------------
fizzy/parse/blake2b_mean                             +0.0439         +0.0439            12            13            12            13
fizzy/parse/ecpairing_mean                           +0.0320         +0.0320           678           699           678           699
fizzy/parse/keccak256_mean                           +0.0297         +0.0297            20            21            20            21
fizzy/parse/memset_mean                              +0.0112         +0.0112             3             3             3             3
fizzy/parse/mul256_opt0_mean                         +0.0276         +0.0276             4             4             4             4
fizzy/parse/sha1_mean                                +0.0240         +0.0240            19            20            19            20
fizzy/parse/sha256_mean                              +0.0090         +0.0090            33            34            33            34
fizzy/parse/micro/factorial_mean                     +0.0057         +0.0057             1             1             1             1
fizzy/parse/micro/fibonacci_mean                     +0.0065         +0.0065             1             1             1             1
fizzy/parse/micro/host_adler32_mean                  -0.0044         -0.0044             1             1             1             1
fizzy/parse/micro/spinner_mean                       -0.0209         -0.0209             1             1             1             1
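
For reading the two tables above: the Time and CPU columns are relative changes, so negative values mean the new build is faster. The sketch below assumes the usual formula of Google Benchmark's comparison tooling, (new - old) / |old|; the function name `relative_change` is made up for illustration. Because the Old/New columns are printed rounded, recomputing from them only approximates the reported value for small numbers (for example 98 -> 86), while large values such as the ecpairing row reproduce it closely (about -0.2044).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative change between an old and a new measurement.
// Assumes the formula (new - old) / |old|; negative means "new is faster".
inline double relative_change(double old_val, double new_val)
{
    return (new_val - old_val) / std::abs(old_val);
}

// Convenience check: is `value` within `tolerance` of `expected`?
inline bool approx_equal(double value, double expected, double tolerance)
{
    return std::abs(value - expected) < tolerance;
}
```

For instance, `relative_change(606879, 482851)` corresponds to the `fizzy/execute/ecpairing/onepoint_mean` row.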

@codecov-io

codecov-io commented Mar 31, 2020

Codecov Report

❗ No coverage uploaded for pull request head (stack_optimization@998fab6).
The diff coverage is n/a.

@axic axic mentioned this pull request Apr 7, 2020
@chfast chfast force-pushed the stack_optimization branch 3 times, most recently from 0e2b53b to d9f57d0 Compare April 9, 2020 08:59
@gumb0
Collaborator

gumb0 commented Apr 9, 2020

unary_op, binary_op, comparison_op could also assign to stack.top() instead of pop then push

@chfast chfast marked this pull request as ready for review April 10, 2020 11:45
@chfast chfast requested review from gumb0 and axic April 10, 2020 11:45
@chfast
Collaborator Author

chfast commented Apr 10, 2020

> unary_op, binary_op, comparison_op could also assign to stack.top() instead of pop then push

Not easy because of the casts present there.
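
To make the suggestion above concrete, here is a minimal sketch of the two forms. It assumes a stack of 64-bit slots with a `top()` that returns a reference; `OperandStack`, `unary_op_pop_push`, `unary_op_in_place` and `binary_op_in_place` are hypothetical names for illustration and are not fizzy's actual code in lib/fizzy/stack.hpp or lib/fizzy/execute.cpp. The casts mentioned in the reply show up as the `static_cast`s around the operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal operand stack holding 64-bit slots.
class OperandStack
{
    std::vector<uint64_t> m_items;

public:
    void push(uint64_t v) { m_items.push_back(v); }

    uint64_t pop()
    {
        const auto v = m_items.back();
        m_items.pop_back();
        return v;
    }

    // Reference to the top slot, which allows in-place updates.
    uint64_t& top() { return m_items.back(); }

    std::size_t size() const { return m_items.size(); }
};

// Pop-then-push form: the operand is removed and the result re-added.
template <typename T, typename Op>
void unary_op_pop_push(OperandStack& stack, Op op)
{
    const auto a = static_cast<T>(stack.pop());
    stack.push(static_cast<uint64_t>(op(a)));
}

// In-place form: the top slot is overwritten, the stack size never changes.
template <typename T, typename Op>
void unary_op_in_place(OperandStack& stack, Op op)
{
    auto& slot = stack.top();
    slot = static_cast<uint64_t>(op(static_cast<T>(slot)));
}

// Binary variant: pop only the second operand, overwrite the first in place.
template <typename T, typename Op>
void binary_op_in_place(OperandStack& stack, Op op)
{
    const auto b = static_cast<T>(stack.pop());
    auto& slot = stack.top();
    slot = static_cast<uint64_t>(op(static_cast<T>(slot), b));
}
```

Both unary forms leave the same value on the stack; the in-place one just skips the size bookkeeping of a pop followed by a push.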

@chfast chfast force-pushed the stack_optimization branch 2 times, most recently from d9fb91a to d2ac5a9 Compare May 22, 2020 07:25

// Update code's max_stack_height using frame.stack_height of the previous instruction.
// The frame.stack_height may have been updated by call instruction and it is fine
// to omit value for end instruction.
Collaborator

I'm not sure what you mean by "it is fine to omit value for end instruction". Is it about the final end instruction?

Collaborator Author

Each end that ends a frame. I will update the comment to:

The frame.stack_height may have been updated by call instruction. This also omits every end and else instruction as they pop/reset the frame object.

Collaborator

I'm still not sure I understand the second part. Do you mean that there's no need to update max_stack_height on end and else? But that's kinda not related to why we update max_stack_height according to the previous instruction...

I would say we don't update max_stack_height on end and else because frame.stack_height doesn't grow there anyway. Maybe I'd just remove the second sentence here.

But also, wouldn't it be more straightforward to update here after the frame.stack_height growth (for the current instruction), plus in one other place, update_caller_frame? No need for a complicated comment here then. But you would need to pass code there...

Collaborator Author

I'm trying to explain why the last update, frame.stack_height += metrics.stack_height_change, is ignored for code.max_stack_height.

I was considering changing update_caller_frame that way, but decided to try having the update in a single place (and this is the result of it).

So is removing the last sentence a good enough resolution? Or maybe move it to frame.stack_height += metrics.stack_height_change?

Collaborator

If we keep the update here, I'd change the comment to something like

// Update code's max_stack_height using frame.stack_height of the previous instruction.
// At this point `frame.stack_height` includes additional changes to the stack if the previous instruction was a call.
// This way the update is skipped for the end/else instructions (because their frame is already popped/reset), but it doesn't matter because they don't grow the stack anyway.

Collaborator Author

Comment updated.
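
The "update from the previous instruction" idea discussed in this thread can be illustrated with a heavily simplified sketch. It assumes a flat instruction list with no control frames and no calls; `InstrMetrics` and `compute_max_stack_height` are hypothetical stand-ins and not the parser's actual types or functions. The point it shows: because the maximum is refreshed at the top of the loop, the height change of the very last instruction is never observed, which is harmless only when that instruction (such as `end`) does not grow the stack.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-instruction metric: how much the instruction changes
// the operand stack height (positive = grows, negative = shrinks).
struct InstrMetrics
{
    int stack_height_change;
};

// Simplified sketch: track the maximum stack height over a flat sequence.
// The maximum is updated using the height left by the PREVIOUS instruction,
// so the change made by the final instruction is never taken into account.
inline int compute_max_stack_height(const std::vector<InstrMetrics>& instructions)
{
    int stack_height = 0;
    int max_stack_height = 0;
    for (const auto& metrics : instructions)
    {
        // Height here reflects all instructions before the current one.
        max_stack_height = std::max(max_stack_height, stack_height);
        stack_height += metrics.stack_height_change;
    }
    return max_stack_height;
}
```

With a sequence shaped like const, const, add, end (changes +1, +1, -1, 0) the result is 2, as expected. A sequence that ended with a growing instruction would be under-counted by this scheme, which is the case the reworded comment rules out for end/else.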

@chfast chfast force-pushed the stack_optimization branch 2 times, most recently from cd49419 to 2f3aa84 Compare May 25, 2020 18:28
@chfast chfast requested a review from gumb0 May 25, 2020 18:28
@axic axic force-pushed the stack_optimization branch 2 times, most recently from 330034c to c4d0ccb Compare May 25, 2020 21:07
Collaborator

@gumb0 gumb0 left a comment

looks good, just one minor typo

@chfast chfast merged commit 811ce55 into master May 26, 2020
@chfast chfast deleted the stack_optimization branch May 26, 2020 13:35
4 participants