Snowflake is slow! #1

Open
jeehoonkang opened this issue Nov 6, 2018 · 2 comments

jeehoonkang commented Nov 6, 2018

The snowflake branch is slower than master, even in benchmarks where HP plays no role at all:

 name                     control ns/iter  variable ns/iter  diff ns/iter   diff %  speedup 
 multi_alloc_defer_free   2,060,746        2,293,428              232,682   11.29%   x 0.90 
 multi_defer              1,329,516        1,631,245              301,729   22.69%   x 0.82 
 multi_flush              12,558,094       32,184,340          19,626,246  156.28%   x 0.39 
 multi_pin                4,283,460        4,460,518              177,058    4.13%   x 0.96 
 single_alloc_defer_free  34               40                           6   17.65%   x 0.85 
 single_defer             17               28                          11   64.71%   x 0.61 
 single_flush             110              724                        614  558.18%   x 0.15 
 single_pin               7                8                            1   14.29%   x 0.88

I fully expected flush() to become slower, but not pin() or defer(). We should investigate the performance degradation using flamegraphs.
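For reading the table, here is a minimal sketch of how the `diff %` and `speedup` columns appear to be derived from the two `ns/iter` columns. The formulas are inferred from the numbers above rather than taken from the benchmark tool's documentation, and the function names are hypothetical.

```rust
/// Relative change of the variable branch against control, in percent.
/// Positive values mean the variable branch is slower.
/// (Assumed formula, inferred from the table.)
fn diff_percent(control_ns: f64, variable_ns: f64) -> f64 {
    (variable_ns - control_ns) / control_ns * 100.0
}

/// Speedup factor of the variable branch relative to control.
/// Values below 1.0 mean the variable branch is slower.
/// (Assumed formula, inferred from the table.)
fn speedup(control_ns: f64, variable_ns: f64) -> f64 {
    control_ns / variable_ns
}
```

For the single_flush row, `diff_percent(110.0, 724.0)` is roughly 558.18 and `speedup(110.0, 724.0)` is roughly 0.15, matching the table.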


jeehoonkang commented Nov 7, 2018

After manually inspecting the generated x86 assembly, I found that introducing the Garbage enum hurts performance quite severely. In particular, I identified two causes of the degradation: (1) Garbage was bigger than Deferred (40 bytes vs. 32 bytes on x86-64); and (2) case discrimination on the enum was quite expensive.

I mitigated (1) by shrinking Deferred to 24 bytes, so that Garbage is now 32 bytes. This should be acceptable because Guard::defer() will be used quite rarely. I'm not sure how to solve (2), though. After changing the size of Deferred, the benchmark results for defer() and pin() are slightly better:

 name                     control ns/iter  variable ns/iter  diff ns/iter   diff %  speedup
 multi_alloc_defer_free   2,060,746        2,177,730              116,984    5.68%   x 0.95
 multi_defer              1,329,516        1,451,704              122,188    9.19%   x 0.92
 multi_flush              12,558,094       28,366,437          15,808,343  125.88%   x 0.44
 multi_pin                4,283,460        4,174,085             -109,375   -2.55%   x 1.03
 single_alloc_defer_free  34               36                           2    5.88%   x 0.94
 single_defer             17               24                           7   41.18%   x 0.71
 single_flush             110              608                        498  452.73%   x 0.18
 single_pin               7                8                            1   14.29%   x 0.88

I'm relatively happy with the performance of the current implementation...
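The size arithmetic in (1) can be checked with a small sketch. The types below are hypothetical stand-ins, not the actual Crossbeam or Snowflake definitions: the payload is modelled as plain `usize` words, and the `Other` variant is invented for illustration. Rust does not guarantee the layout of default-representation enums, so the sizes hold for current compilers rather than by specification.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the original Deferred: 4 words (32 bytes on x86-64).
#[allow(dead_code)]
struct Deferred4 {
    call: usize,
    data: [usize; 3],
}

// Hypothetical stand-in for the shrunk Deferred: 3 words (24 bytes on x86-64).
#[allow(dead_code)]
struct Deferred3 {
    call: usize,
    data: [usize; 2],
}

// Hypothetical two-variant enum wrapping the deferred payload.
// The payload has no spare bit patterns, so the discriminant needs
// its own storage, which alignment rounds up to one extra word.
#[allow(dead_code)]
enum Garbage<D> {
    Deferred(D),
    Other(usize),
}

/// Returns the sizes of [Deferred4, Garbage<Deferred4>, Deferred3, Garbage<Deferred3>].
fn sizes() -> [usize; 4] {
    [
        size_of::<Deferred4>(),
        size_of::<Garbage<Deferred4>>(),
        size_of::<Deferred3>(),
        size_of::<Garbage<Deferred3>>(),
    ]
}
```

On a 64-bit target this is expected to give 32 and 40 bytes for the larger payload and 24 and 32 bytes for the smaller one, which is the same one-word overhead described above.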


jeehoonkang commented Jun 2, 2019

The new benchmark result for a recent version of Crossbeam + Snowflake (9591f5e):

 name                     control ns/iter  variable ns/iter  diff ns/iter   diff %  speedup 
 multi_alloc_defer_free   4,142,880        4,459,418              316,538    7.64%   x 0.93 
 multi_defer              1,411,127        1,970,390              559,263   39.63%   x 0.72 
 multi_flush              12,473,802       14,810,773           2,336,971   18.74%   x 0.84 
 multi_pin                4,257,543        7,721,718            3,464,175   81.37%   x 0.55 
 single_alloc_defer_free  49               65                          16   32.65%   x 0.75 
 single_defer             17               36                          19  111.76%   x 0.47 
 single_flush             164              175                         11    6.71%   x 0.94 
 single_pin               7                16                           9  128.57%   x 0.44 
