-
Notifications
You must be signed in to change notification settings - Fork 635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move ShardsManager chunk management logic to a separate Actix thread #7662
Comments
Consider re-using existing sync jobs actor: nearcore/chain/client/src/client_actor.rs Lines 149 to 156 in dc404fe
That is, I think we can re-name this actor as "stuff we do in background actor". I don't have enough context to understand whether that 's actually a good idea or not, but seems worth consideration |
Hmm, I also don't have much context on what sync jobs actor does, but chunks management is also a critical task (as it affects validator latency) and not less important than ClientActor, so I don't think it should share with a generic background queue. |
This is a big refactoring, mostly due to tests. Some things that are worth mentioning: - The ShardsManagerActor is a new actor that takes the ShardsManagerRequest message and returns nothing. - This does mean that integration tests that penetrate the ShardsManager abstraction would now struggle to assert some detailed results. Luckily there aren't really many legitimate ones. - The head and header_head are now supplied via the ShardsManagerRequest::UpdateChainHeads message, and no longer supplied via the request_chunks_* calls or the resend_chunk_requests call. The latter is no longer originated by a timer in the Client which necessitates maintaining the heads in the ShardsManager, and then the former is no longer required as a result. This shouldn't be a problem, because the head is only used to validate signatures or decide forwarding recipients in the case where the prev_hash isn't available (i.e. it's best-effort to begin with), and the header_head is used to decide who exactly to fetch a chunk from (whether from archival, etc.), which is inherently also a best-effort determination. Besides, the updating of heads is serialized with all other client-initiated calls, which means that the only case when we might have a behavior change is when the head or header_head isn't exactly the same in the resend_chunk_requests call (which is initiated by a timer in ShardsManagerActor now). - Because the header and header_head are needed, I've included them as parameters in the ShardsManager constructor. (The alternative would be to keep them optional until we first hear from the client, but that adds complexity to handle the None case which is obscure and hard to test). This means that the store must be initialized with at least one block (from genesis) before ShardsManager is constructed, which in today's code means we must construct the Client before ShardsManager (since Client constructs Chain). In a future refactoring, the Chain should be constructed before the Client and everything else, so then initialization order wouldn't be a problem. In the meantime, we ensure the construction order, by moving Client::new call out of ClientActor constructor so it is synchronous, (and for testing ensure this on a framework-by-framework basis). - Much of the refactoring is also due to replacing (Addr<ClientActor>, Addr<ViewClientActor>) with ActorHandlesForNetworkMessageMocking, so that we can add a third, Arc<dyn ShardsManagerAdapter>. This is pretty simple. - For synchronous tests like TestEnv and setup_client, instead of constructing a ShardsManager and then propagating ShardsManagerRequest messages to it, I've made a SynchronousShardsManagerAdapter which just calls ShardsManager directly when it is called, as opposed to posting a message to a queue (like MockPeerManagerAdapter). This makes us mostly not have to change anything in these test cases, because the behavior is identical to before the refactoring. #7662
This is a big refactoring, mostly due to tests. Some things that are worth mentioning: - The ShardsManagerActor is a new actor that takes the ShardsManagerRequest message and returns nothing. - This does mean that integration tests that penetrate the ShardsManager abstraction would now struggle to assert some detailed results. Luckily there aren't really many legitimate ones. - The head and header_head are now supplied via the ShardsManagerRequest::UpdateChainHeads message, and no longer supplied via the request_chunks_* calls or the resend_chunk_requests call. The latter is no longer originated by a timer in the Client which necessitates maintaining the heads in the ShardsManager, and then the former is no longer required as a result. This shouldn't be a problem, because the head is only used to validate signatures or decide forwarding recipients in the case where the prev_hash isn't available (i.e. it's best-effort to begin with), and the header_head is used to decide who exactly to fetch a chunk from (whether from archival, etc.), which is inherently also a best-effort determination. Besides, the updating of heads is serialized with all other client-initiated calls, which means that the only case when we might have a behavior change is when the head or header_head isn't exactly the same in the resend_chunk_requests call (which is initiated by a timer in ShardsManagerActor now). - Because the header and header_head are needed, I've included them as parameters in the ShardsManager constructor. (The alternative would be to keep them optional until we first hear from the client, but that adds complexity to handle the None case which is obscure and hard to test). This means that the store must be initialized with at least one block (from genesis) before ShardsManager is constructed, which in today's code means we must construct the Client before ShardsManager (since Client constructs Chain). In a future refactoring, the Chain should be constructed before the Client and everything else, so then initialization order wouldn't be a problem. In the meantime, we ensure the construction order, by moving Client::new call out of ClientActor constructor so it is synchronous, (and for testing ensure this on a framework-by-framework basis). - Much of the refactoring is also due to replacing (Addr<ClientActor>, Addr<ViewClientActor>) with ActorHandlesForNetworkMessageMocking, so that we can add a third, Arc<dyn ShardsManagerAdapter>. This is pretty simple. - For synchronous tests like TestEnv and setup_client, instead of constructing a ShardsManager and then propagating ShardsManagerRequest messages to it, I've made a SynchronousShardsManagerAdapter which just calls ShardsManager directly when it is called, as opposed to posting a message to a queue (like MockPeerManagerAdapter). This makes us mostly not have to change anything in these test cases, because the behavior is identical to before the refactoring. near#7662
Right now, all chunk management logic is handled by the ClientActor thread, including the handling of partial encoded chunk messages, sending partial encoded chunk requests, responding to partial encoded chunk requests, handling partial encoded chunk responses, decoding chunks, etc. This unnecessarily clogs the ClientActor thread. We should move these to a separate actor.
The text was updated successfully, but these errors were encountered: