Bound size of TransactionPool in memory #3284
Comments
This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
Not sure what to do if a pool reaches its maximum size but transactions keep coming in.
To get an idea of a reasonable tx pool size bound, I will add a metric.
We might need to stop accepting transactions at that point. cc @mm-near
This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
This fixes an issue where the transaction pool grows indefinitely on non-RPC nodes. Issue #3284
I'll pick up this issue in the context of the congestion work (https://github.com/near/nearcore/milestone/26, #8878). As a first step, I plan to investigate two naive approaches:
I highly suspect that both approaches will break some tests/assumptions that the clients make. My first goal will be to understand:
We do indeed have tests that rely on putting many transactions into the pool. Example failures are:
I'm adding explicit checks for the returned transaction status in #8976, which would make it much easier to catch tests broken by introducing transaction pool size limits. Retrying the failure in all those tests would likely be too burdensome, so we'll have to use some high-enough limit that covers the majority of tests and only introduce the retries in tests that actually exercise the congestion scenario.
I will proceed with implementing a hard capacity limit measured in bytes. The limit will be separate for each per-shard transaction pool. We will also start returning a new error type to the clients when we reach the transaction pool size limit, so that they can handle this error intelligently by retrying with a back-off.
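For illustration, a minimal sketch of what a byte-bounded insert with such an error type could look like. All names here (`SizeLimitedPool`, `InsertOutcome`, the fields) are hypothetical stand-ins, not the actual nearcore types:

```rust
use std::collections::BTreeMap;

/// Hypothetical per-shard pool that tracks its total size in bytes.
struct SizeLimitedPool {
    transactions: BTreeMap<Vec<u8>, Vec<u8>>, // key -> serialized transaction
    total_size: u64,                          // sum of stored transaction sizes
    size_limit: Option<u64>,                  // None means the limit is disabled
}

/// Result of trying to insert a transaction.
enum InsertOutcome {
    Inserted,
    /// The pool is at capacity; clients can retry later with a back-off.
    PoolFull,
}

impl SizeLimitedPool {
    fn insert(&mut self, key: Vec<u8>, tx: Vec<u8>) -> InsertOutcome {
        let tx_size = tx.len() as u64;
        // Reject the transaction if it would push the pool over its byte limit.
        if let Some(limit) = self.size_limit {
            if self.total_size + tx_size > limit {
                return InsertOutcome::PoolFull;
            }
        }
        // If the key was already present, account for the replaced transaction.
        if let Some(old) = self.transactions.insert(key, tx) {
            self.total_size -= old.len() as u64;
        }
        self.total_size += tx_size;
        InsertOutcome::Inserted
    }
}
```

Measuring the limit in bytes rather than in transaction count is what keeps large transactions from evading the bound.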
The PR introduces a limit on the size of the transaction pool for each shard and logic that rejects transactions that would go over this limit. The reasoning for this work is described in #3284. To start, the limit will be disabled (effectively set to infinity), so this PR shouldn't have any effect. Node operators can override it with a config option. In the future, we will come up with a safe value to set by default (probably between 10 MiB and 1 GiB). We also start with a simple option where the RPC client will not know that their transaction ran into this limit. We will need to rework this part in the future, but that will touch the transaction forwarding code in a non-trivial way, and I would prefer to work on it when we have a congestion test ready (#8920). Lastly, this PR adds some nuance to reintroducing transactions back into the pool after a reorg or after producing a chunk (the `reintroduce_transactions` method), by acknowledging that not all transactions might fit back in and logging the number of dropped transactions.
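Building on the hypothetical `SizeLimitedPool` sketch above, the dropped-transaction accounting during reintroduction could look roughly like this (nearcore's actual `reintroduce_transactions` differs in its types and logging):

```rust
/// Try to put transactions back into the pool after a reorg or after
/// producing a chunk; under a size limit, not all of them may fit.
fn reintroduce_transactions(pool: &mut SizeLimitedPool, txs: Vec<(Vec<u8>, Vec<u8>)>) {
    let mut dropped = 0usize;
    for (key, tx) in txs {
        if let InsertOutcome::PoolFull = pool.insert(key, tx) {
            dropped += 1;
        }
    }
    if dropped > 0 {
        // Acknowledge the loss instead of silently discarding transactions.
        eprintln!("dropped {dropped} transactions while reintroducing them into the pool");
    }
}
```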
I've also explored the option of garbage collecting transactions based on their age. While doable, the existing clients (e.g. jsonrpc) would not work well with this approach and would raise the error to a higher level, likely all the way to the user (e.g. the Wallet interface). It also introduces an attack vector: users could spam the system with many transactions, knowing that only a few of them will be included within the time limit, and effectively block transaction pool space at a lower cost. The remaining steps to finish this work:
This is one of the steps for near#3284
I've filed a bug for some follow-up work to simplify transaction pool code: #9060
This will allow us to understand how big the transaction pools get in practice and what a realistic limit to set for them would be. The logic within the pool iterator is a bit complex due to the need to return transactions back to the pool, and I'm working on a way to simplify it in a separate PR, but for now this accounting should do the job. This is one of the steps for #3284
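As a sketch of what such a metric could look like, here is a per-shard size gauge written against the `prometheus` crate directly; nearcore has its own metrics wrappers, and the metric name below is an assumption:

```rust
use once_cell::sync::Lazy;
use prometheus::{register_int_gauge_vec, IntGaugeVec};

// Gauge tracking the total size in bytes of each per-shard transaction pool.
static TRANSACTION_POOL_SIZE: Lazy<IntGaugeVec> = Lazy::new(|| {
    register_int_gauge_vec!(
        "near_transaction_pool_size", // illustrative metric name
        "Total size in bytes of the transactions in the pool, per shard",
        &["shard_id"]
    )
    .unwrap()
});

// Call whenever a pool's total size changes.
fn update_pool_size_metric(shard_id: u64, total_size: u64) {
    let shard_label = shard_id.to_string();
    TRANSACTION_POOL_SIZE
        .with_label_values(&[shard_label.as_str()])
        .set(total_size as i64);
}
```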
So that it will be read from the `config.json` that each node provides. This is a part of #3284
Right now the metric has noticeable blips due to rapid changes when transactions are drawn from the pool: https://nearinc.grafana.net/goto/YZwyDllVR?orgId=1. To avoid this, we only decrease the metric after the pool iterator is dropped. This is a part of #3284
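One way to express "decrease only after the pool iterator is dropped" is a `Drop` impl that settles the accounting at the end; this is a sketch continuing the hypothetical `SizeLimitedPool` example above, not nearcore's actual pool iterator:

```rust
/// Iterator that draws transactions out of the pool. The size accounting is
/// deliberately not updated while iteration is in progress, so the metric
/// does not blip up and down while a chunk is being produced.
struct PoolIterator<'a> {
    pool: &'a mut SizeLimitedPool,
    bytes_removed: u64, // total size of the transactions drawn so far
}

impl Drop for PoolIterator<'_> {
    fn drop(&mut self) {
        // Settle the accounting once, when the iterator is dropped...
        self.pool.total_size -= self.bytes_removed;
        // ...and only now publish the new value, e.g.
        // update_pool_size_metric(shard_id, self.pool.total_size).
    }
}
```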
This PR enables the limit discussed in #3284. I've [considered](https://near.zulipchat.com/#narrow/stream/297873-pagoda.2Fnode/topic/Adding.20a.20new.20field.20to.20config.2Ejson/near/358955785) another approach to rolling this out, changing the value in `config.json` distributed through S3, but that would require more work both on our side and on the validators' side without adding much benefit. Specifically, I've checked that we have a good safety margin here: over the last month on testnet, the max size of the transaction pool on the validators was < 40 KB: https://nearinc.grafana.net/goto/AhZFN__4R?orgId=1. @nikurt, what would be a good place to document this field for validators? I saw https://near-nodes.io/ but couldn't find the appropriate section there.
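For illustration, the config plumbing this implies might look like the following serde sketch; the field name and its exact placement are assumptions, so check the actual nearcore config schema before relying on them:

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct Config {
    /// Maximum total size in bytes of each per-shard transaction pool.
    /// Absent from config.json (the default) means the limit is disabled.
    #[serde(default)]
    transaction_pool_size_limit: Option<u64>,
}
```

With a shape like this, a node operator would opt in by setting e.g. `"transaction_pool_size_limit": 100000000` in `config.json` for a 100 MB cap, matching the rollout mentioned below.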
The limit of 100 MB on the per-shard transaction pool on each node will be active with the next release, which @marcelo-gonzalez will be shepherding.
Presently transactions are simply stored in a `BTreeMap` in memory; however, if we receive too many transactions too quickly, this could grow over time, especially if the transactions are large.