Optimization proposal for Akka Receiver message processing #1188

Closed
eryeer opened this issue Oct 25, 2019 · 7 comments
Labels
  • Discussion: Initial issue state - proposed but not yet accepted
  • Enhancement: Type - Changes that may affect performance, usability or add new features to existing modules.
  • Ledger: Module - The ledger is our 'database', this is used to tag changes about how we store information
  • P2P: Module - peer-to-peer message exchange and network optimisations, at TCP or UDP level (not HTTP).

Comments

@eryeer
Contributor

eryeer commented Oct 25, 2019

Summary
When Akka's Receiver Actor receives a message, it calls the OnReceive method to process it. This processing is sequential: the next message does not start processing until the previous one has finished. OnReceive therefore has to be fast. Otherwise, as TPS pressure rises and more messages arrive for OnReceive, messages accumulate and block, which directly affects system performance and becomes a bottleneck.

Specific illustration

The official Akka documentation describes this property of the Actor's OnReceive method. A screenshot of the description follows.
[Screenshot of the Akka documentation, file "微信图片_20191025151302" (WeChat image)]

We have also confirmed this by testing: if each message takes 1 s to process in the Receiver's OnReceive method and a total of 10 messages are sent to the Receiver from any Senders, the total processing time in OnReceive is 10 s.

This also shows that an Akka Actor processes incoming messages the same way a consumer does in most producer-consumer frameworks: the speed at which OnReceive executes its logic determines the speed at which messages are consumed.
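The behaviour described above can be reproduced outside Akka with a plain queue and one worker thread. The following is only a minimal sketch of the idea, not the test code referred to in this issue: `run_single_actor`, `on_receive` and the 0.05 s delay are illustrative assumptions, and `time.sleep` stands in for a slow OnReceive body.

```python
import queue
import threading
import time


def run_single_actor(num_messages: int, handle_seconds: float) -> float:
    """Process messages one at a time from a mailbox; return elapsed seconds."""
    mailbox: queue.Queue = queue.Queue()

    def on_receive(message: int) -> None:
        # Stand-in for a slow OnReceive body (assumption: fixed cost per message).
        time.sleep(handle_seconds)

    def actor_loop() -> None:
        # A single thread drains the mailbox, so handlers never overlap.
        while True:
            message = mailbox.get()
            if message is None:  # sentinel: stop the actor
                break
            on_receive(message)

    worker = threading.Thread(target=actor_loop)
    start = time.perf_counter()
    worker.start()
    for i in range(num_messages):
        mailbox.put(i)  # senders enqueue without waiting for the handler
    mailbox.put(None)
    worker.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_single_actor(num_messages=10, handle_seconds=0.05)
    print(f"10 messages at 0.05 s each took {elapsed:.2f} s")
```

Sending is immediate, but the elapsed time is at least the number of messages multiplied by the per-message cost, which is the same relationship as the 10 messages × 1 s = 10 s observation.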

In our performance test, we removed the block transaction size limit and sent transactions to the node to find the maximum transaction package size of each block (that is, to measure the maximum TPS). The maximum transaction package size of each block settled at a stable upper limit and could not be increased, yet CPU, memory, and network communication were not at full load. The cause is probably the sequential message processing of these Receivers in the code: sequential execution acts like a lock on the method, so the CPU cannot run this logic concurrently on multiple threads, only on a single thread. Processing is slow and cannot make full use of the computer's hardware resources.

When reviewing the Actor part of the code, we found that the processing logic of some OnReceive methods is time consuming, for example accessing the database Storage, and that some Receivers' OnReceive methods handle multiple message types directly instead of distributing these messages to different sub-actors for processing. Under high TPS pressure, such processing logic will become a performance bottleneck.

Do you have any solution you want to propose?
So I think we need to check and optimize the OnReceive processing logic of each Actor:

  1. Check the OnReceive processing logic of every Actor, add processing time monitoring to each message handler, evaluate how much time each handler costs, and use pressure tests to find the most frequently called handlers. We also need to set a processing time requirement for these methods.

    For example, if the system is to achieve 3000 TPS, the OnGetDataMessageReceived() method of
    ProtocalHandler is expected to be called at least 18,000 times per second with seven nodes
    (if NodeA broadcasts to the other 6 nodes at a rate of 3000 transactions per second, then
    6*3000 messages will be sent to NodeA every second to get the transaction data), so the
    processing time of the method should be less than 1s/18000=5.56e-5s=0.0556ms

  2. For sub-methods in OnReceive that do not require thread safety guarantees (that is, sub-methods that have no concurrency issues), use asynchronous threads or a thread pool for processing. This makes full use of hardware resources such as the CPU and speeds up message processing.

  3. For sub-methods in OnReceive that require thread safety guarantees (that is, sub-methods that may have concurrency problems), if the processing time is too long, we can consider distributing the work, by message type or processing logic, to sub-actors with more narrowly defined responsibilities and handling it asynchronously. Because message passing between different Actors is asynchronous, this avoids messages piling up in one key Actor.
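Step 1 above can be sketched as follows. This is an illustration only: `per_message_budget_seconds` and `HandlerTimer` are hypothetical helpers, not part of Neo or Akka, and the budget formula assumes, as in the example, that one handler must absorb requests from every other node.

```python
import time
from collections import defaultdict


def per_message_budget_seconds(target_tps: int, node_count: int) -> float:
    """Time budget per message when one handler serves all other nodes."""
    calls_per_second = target_tps * (node_count - 1)
    return 1.0 / calls_per_second


class HandlerTimer:
    """Collects call counts and total time per handler name (hypothetical helper)."""

    def __init__(self) -> None:
        self.calls = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def timed(self, name, handler, message):
        """Run handler(message) and record how long it took under `name`."""
        start = time.perf_counter()
        try:
            return handler(message)
        finally:
            self.calls[name] += 1
            self.total_seconds[name] += time.perf_counter() - start

    def average_seconds(self, name: str) -> float:
        return self.total_seconds[name] / self.calls[name]

    def over_budget(self, budget_seconds: float) -> list:
        """Names of handlers whose average time exceeds the budget."""
        return sorted(
            name for name in self.calls
            if self.average_seconds(name) > budget_seconds
        )


if __name__ == "__main__":
    budget = per_message_budget_seconds(target_tps=3000, node_count=7)
    print(f"budget per message: {budget * 1000:.4f} ms")
```

With 3000 TPS and seven nodes the budget works out to 1/18000 s, the 0.0556 ms figure given in the example. Wrapping each handler call in `timed` then gives per-handler averages that can be compared against that budget during a pressure test.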

Where in the software does this update applies to?

  • Other: Akka Framework
@eryeer eryeer added the Discussion Initial issue state - proposed but not yet accepted label Oct 25, 2019
@shargon
Member

shargon commented Oct 25, 2019

However, this processing method is serialized

Is it not possible to prevent the serialization?

@eryeer
Contributor Author

eryeer commented Oct 26, 2019

No, this is how Akka processes received messages; the serialization cannot be prevented.

@shargon
Member

shargon commented Oct 26, 2019

This is a big problem. I think it shouldn't be necessary to serialize any internal message.

@erikzhang
Member

I don't think there is any serialization in Akka message processing.

@eryeer
Contributor Author

eryeer commented Oct 26, 2019

According to Akka's official documentation, Akka processes messages sequentially to preserve message order. The MailBox is essentially a pipe, and this pipe is why messages can only be processed one after another. Sequential processing is also an alternative to locks, but in essence it behaves much like one: the OnReceive method of an Actor instance can process only one message at a time, never several simultaneously. Most of our Actors are single instances, which makes them prone to becoming performance bottlenecks.

We have also tested the Akka framework, and the test confirms this. I can provide the test code later.

PS: Did my use of the word 'serialize' cause a misunderstanding? Here 'serialize' does not mean that Akka serializes the message and saves it, but that messages must be processed sequentially rather than concurrently. I have changed the word in my statement to avoid any misunderstanding.

@erikzhang
Member

It's not "serialization".

The Actor model uses this method to prevent the use of locks. If you need multithreading, you have to create multiple actors. I don't see any bottleneck in this place.
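The point about creating multiple actors can be illustrated with a small simulation. This is a sketch under stated assumptions, not Akka code: `Actor` and `run_pool` are hypothetical names, each simulated actor owns one mailbox and one thread, messages are routed round-robin, and `time.sleep` stands in for the handler's work.

```python
import queue
import threading
import time


class Actor:
    """One mailbox, one thread: messages within an actor are handled in order."""

    def __init__(self, handle_seconds: float) -> None:
        self.mailbox = queue.Queue()
        self.handled = []
        self._handle_seconds = handle_seconds
        self._thread = threading.Thread(target=self._loop)
        self._thread.start()

    def _loop(self) -> None:
        while True:
            message = self.mailbox.get()
            if message is None:  # sentinel: stop this actor
                break
            time.sleep(self._handle_seconds)  # stand-in for handler work
            self.handled.append(message)

    def tell(self, message) -> None:
        """Enqueue a message without waiting for it to be handled."""
        self.mailbox.put(message)

    def stop(self) -> None:
        self.mailbox.put(None)
        self._thread.join()


def run_pool(num_actors: int, num_messages: int, handle_seconds: float) -> float:
    """Spread messages over several actors; return elapsed seconds."""
    start = time.perf_counter()
    actors = [Actor(handle_seconds) for _ in range(num_actors)]
    for i in range(num_messages):
        actors[i % num_actors].tell(i)  # round-robin routing (assumption)
    for actor in actors:
        actor.stop()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"1 actor:  {run_pool(1, 10, 0.05):.2f} s")
    print(f"5 actors: {run_pool(5, 10, 0.05):.2f} s")
```

Each actor still handles its own messages strictly one at a time, so no locks are needed inside a handler; the parallelism comes only from having several actors. This only works when the messages routed to different actors do not need to be ordered relative to each other.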

@eryeer
Contributor Author

eryeer commented Oct 30, 2019

Now we have completed the performance tracking and testing of all Actor's
OnReceive methods. Detailed test results statistics can be seen in the table below.

[Table image: Akka Actor performance test (English version)]

After the high-TPS pressure test, we found that TaskManager's OnNewTransaction method and Blockchain's OnNewTransaction method are the main causes of Neo's TPS limit.

The NewTasks method has the most serious bottleneck. HashSet's ExceptWith method has a computational complexity of O(n), so its cost grows linearly with the number of transactions. In one block (15 s), the method processes about 1500 messages at an average cost of about 0.01 s per message. Since OnReceive processes messages sequentially, the total time to process these messages in one block is 1500 × 0.01 s = 15 s, which is the entire block time. And because of the O(n) complexity, message processing capacity falls as the number of transactions rises, which leads to a continuous decrease in TPS.

In addition, Blockchain's OnNewTransaction also needs to be optimized. Under high pressure, its total processing time is 0.0017*2466=4.19s, which means that if our TPS improves by about 4 times, this method will become a performance bottleneck.
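The two estimates above follow from the same arithmetic, which can be written out as a short sketch. The function names and the `BLOCK_SECONDS` constant are illustrative; the calculation assumes the per-message cost stays constant as load grows, which the O(n) case explicitly does not satisfy, so for NewTasks it is an optimistic bound.

```python
BLOCK_SECONDS = 15.0  # block interval quoted in this thread


def busy_seconds(messages_per_block: int, avg_seconds_per_message: float) -> float:
    """Total handler time per block for a sequential OnReceive."""
    return messages_per_block * avg_seconds_per_message


def headroom(messages_per_block: int, avg_seconds_per_message: float,
             block_seconds: float = BLOCK_SECONDS) -> float:
    """How many times the load could grow before the handler fills the block."""
    return block_seconds / busy_seconds(messages_per_block, avg_seconds_per_message)


if __name__ == "__main__":
    # Figures quoted above: NewTasks and Blockchain's OnNewTransaction.
    print(f"NewTasks busy time:         {busy_seconds(1500, 0.01):.2f} s")
    print(f"OnNewTransaction busy time: {busy_seconds(2466, 0.0017):.2f} s")
    print(f"OnNewTransaction headroom:  {headroom(2466, 0.0017):.2f}x")
```

For NewTasks the busy time equals the block interval, so there is no headroom left. For Blockchain's OnNewTransaction the headroom comes out at about 3.6, in line with the estimate that a roughly fourfold TPS increase would make it the next bottleneck.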

@Qiao-Jin has found these two performance hot spots and proposed fixes in PR #1174 and PR #1183. Thanks for his contribution to performance optimization.

After these two points are optimized, we will continue performance testing and keep tracking other optimization opportunities.

@lock9 lock9 added Ledger Module - The ledger is our 'database', this is used to tag changes about how we store information P2P Module - peer-to-peer message exchange and network optimisations, at TCP or UDP level (not HTTP). Enhancement Type - Changes that may affect performance, usability or add new features to existing modules. labels Nov 9, 2019