Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve price comparison performance #1094

Closed
abitmore opened this issue Jun 25, 2018 · 8 comments
Closed

Improve price comparison performance #1094

abitmore opened this issue Jun 25, 2018 · 8 comments
Assignees
Labels
2a Discussion Needed Prompt for team to discuss at next stand up. 2d Developing Status indicating currently designing and developing a solution 6 Performance Impacts flag identifying system/user efficiency, performance, etc. performance

Comments

@abitmore
Copy link
Member

abitmore commented Jun 25, 2018

According to replay profiling data, price::operator<() and price::operator>() are in top 10.

---------------------- first 27 M blocks ----------------------------
764718ms th_a db_management.cpp:124 reindex ] Done reindexing, elapsed time: 5063.73962899999969522 sec

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
  9.62    215.94   215.94 448458421     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_object, ... >::find(graphene::db::object_id_type) const
  8.47    406.11   190.17 20731743387     0.00     0.00  graphene::chain::operator>(graphene::chain::price const&, graphene::chain::price const&)
  7.18    567.18   161.07 390785433     0.00     0.00  graphene::chain::database::adjust_balance(graphene::db::object_id<(unsigned char)1, (unsigned char)2, graphene::chain::account_object>, graphene::chain::asset)
  4.68    672.30   105.12 307448392     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_statistics_object, ... >::find(graphene::db::object_id_type) const
  4.64    776.50   104.20 22967573183     0.00     0.00  graphene::chain::operator<(graphene::chain::price const&, graphene::chain::price const&)
  4.39    874.95    98.45                             sha256_block_data_order_avx
  3.92    962.82    87.87 3183637020     0.00     0.00  graphene::chain::generic_index<graphene::chain::asset_bitasset_data_object, ... >::find(graphene::db::object_id_type) const
  3.62   1044.04    81.22 386682199     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_balance_object, ... >::modify(graphene::db::object const&, std::function<void (graphene::db::object&)> const&)
  3.32   1118.50    74.46 805850110     0.00     0.00  graphene::chain::generic_index<graphene::chain::asset_object, ... >::find(graphene::db::object_id_type) const
  2.85   1182.43    63.93 245765371     0.00     0.00  graphene::chain::generic_index<graphene::chain::account_statistics_object, ... >::modify(graphene::db::object const&, std::function<void (graphene::db::object&)> const&)
  2.76   1244.32    61.89 27000000     0.00     0.00  graphene::chain::database::update_expired_feeds()
  2.16   1292.90    48.58 72953692     0.00     0.00  graphene::chain::generic_index<graphene::chain::limit_order_object, ... >::create(std::function<void (graphene::db::object&)> const&)

When comparing 2 price objects, we use two 128-bit multiplications. If the amounts are smaller, will the multiplications be faster?

Default boost::rational number is Irreducible Fraction. But price is not. What if we automatically change price to Irreducible Fraction (normalize the rational number) as well?

At least, if the 2 amounts are always coprime, we no longer need to use multiplication for == nor != operator.

Thoughts?

This is a sub-task of #982.

@abitmore abitmore added performance question 2a Discussion Needed Prompt for team to discuss at next stand up. 6 Performance Impacts flag identifying system/user efficiency, performance, etc. labels Jun 25, 2018
@jmjatlanta
Copy link
Contributor

jmjatlanta commented Jun 25, 2018

I haven't looked at the internal representation of 128 bit integers, but here is an idea using assembler:
http://masm32.com/board/index.php?topic=2213.0

If we're just checking for equality, could we just split them into 2 sets of 64 bit registers, OR them together, and see if the same value comes out that went in? You would have to do it once for each group, although you could skip the 2nd group if the first is not equal. That would be a fun exercise to benchmark.

Could also try 4 byte chunks, which may be faster for ORing.

Another note: While the whole post is good, see the last comment here: https://stackoverflow.com/questions/741301/how-can-i-add-and-subtract-128-bit-integers-in-c-or-c-if-my-compiler-does-not

@abitmore
Copy link
Member Author

@jmjatlanta < and > are used much more than ==, so we're not only talking about equality.

@btsabc
Copy link

btsabc commented Jun 28, 2018

Maybe we can use __int128 to improve performance over GCC and Clang.

@abitmore
Copy link
Member Author

Just a note here: @btsabc mentioned in #998 (comment) that MSVC doesn't support __int128.

@abitmore abitmore self-assigned this Jul 6, 2018
@abitmore abitmore added 2d Developing Status indicating currently designing and developing a solution and removed question labels Jul 6, 2018
@abitmore
Copy link
Member Author

abitmore commented Jul 6, 2018

After done some testing, it turns out that normalization will slightly reduce performance. That means the approach mentioned in OP is a no go.

There are some minor optimizations can be done about price comparison itself. I'll create a PR for them.

Since 128 bit multiplication is expensive, in order to improve overall performance, we need to try to do fewer price comparison calls in the code.

@abitmore
Copy link
Member Author

Merged #1124. Closing.

@abitmore
Copy link
Member Author

@abitmore
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2a Discussion Needed Prompt for team to discuss at next stand up. 2d Developing Status indicating currently designing and developing a solution 6 Performance Impacts flag identifying system/user efficiency, performance, etc. performance
Projects
None yet
Development

No branches or pull requests

3 participants