-
Notifications
You must be signed in to change notification settings - Fork 23.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse c->argv after command execution to reduce memory allocation overhead #13521
Conversation
Benchmark
unstable:
this PR:
|
Update the benchmark
unstable:
this PR (+3.4%):
unstable:
this PR (+0.9% no regression):
|
Maybe it is too complex for the |
@sundb we can see that this PR is very beneficial on unix sockets, long lived client use-cases.
Load the data:
benchmark:
Results: unstable
the PR:
|
CE Performance Automation : step 1 of 2 (build) STARTING...This comment was automatically generated given a benchmark was triggered.
|
CE Performance Automation : step 2 of 2 (benchmark) RUNNING...This comment was automatically generated given a benchmark was triggered. Started benchmark suite at 2024-11-12 11:17:53.134137 and took 20052.87815 seconds up until now. In total will run 136 benchmarks. |
please note that I no longer automatically release argv in cron because it may still be used elsewhere, such as by blocked clients. |
src/networking.c
Outdated
c->argv = zmalloc(sizeof(robj*)*c->argv_len); | ||
/* Create new argv if space is insufficient or the new arguments are too large. */ | ||
if (unlikely(argc > c->argv_len || | ||
(c->argv_len > ARGV_CACHE_THRESHOLD && c->argv_len > argc * 2))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we're re-using the allocation only if it's very small, and not wasting more than half of it.
considering that it's very small, do really care how much of it is wasted?
or alternatively, maybe for big ones, if they're only 10% wasted we may want to re-use?
or maybe the benefit for saving that one allocation on large argv arrays is negligible (considering the elements we put in the array too)?
actually, now that i think of it, over a certain size we do gradual allocations
c->argv_len = min(c->multibulklen, 1024);
so it probably doesn't make sense to re-use in these cases (we're wasting time on reallocs anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i change the threshold to 1024, and free the arguments in cron.
This also means that we don't have to worry about constantly rebuilding when the command size fluctuates greatly, which is more in line with the actual usage scenario.
Because I remember some users using ping
for keep alive, it could result in never being reused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when argc
almost is between 1 and 64, in pipeline mode, this PR brings 1%-5% improvements. But if argc
may changes frequently (more than 64) and we use unlikely
, I am not sure what will happen
benchmark with variable arguments size. start server:
prepare data:
32 (+1.7%)
64 (+1.7%)
128 (+0.1%)
256 (+2%)
512 (+3.5%)
|
Automated performance analysis summaryThis comment was automatically generated given there is performance data available. Using platform named: intel64-ubuntu22.04-redis-icx1 to do the comparison. In summary:
You can check a comparison in detail via the grafana link Comparison between unstable and sundb:derfer_free_argv.Time Period from 5 months ago. (environment used: oss-standalone) Unstable Table
Unstable test regexp names: memtier_benchmark-1Mkeys-generic-scan-pipeline-10|memtier_benchmark-1Mkeys-hash-hincrby|memtier_benchmark-1Mkeys-hash-transactions-multi-exec-pipeline-20|memtier_benchmark-1Mkeys-load-set-intset-with-100-elements-pipeline-10|memtier_benchmark-1Mkeys-load-zset-listpack-with-100-elements-double-score|memtier_benchmark-1Mkeys-load-zset-with-10-elements-double-score|memtier_benchmark-1Mkeys-string-setex-100B-pipeline-10|memtier_benchmark-1key-100M-bits-bitmap-bitcount|memtier_benchmark-1key-1Billion-bits-bitmap-bitcount|memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat|memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat-pipeline-10|memtier_benchmark-1key-hash-hscan-50-fields-10B-values|memtier_benchmark-1key-list-100-elements-lrange-all-elements-pipeline-10|memtier_benchmark-1key-list-10K-elements-linsert-lrem-integer|memtier_benchmark-1key-list-10K-elements-linsert-lrem-string|memtier_benchmark-1key-list-10K-elements-lpos-integer|memtier_benchmark-1key-list-10K-elements-lpos-string|memtier_benchmark-1key-list-1K-elements-lrange-all-elements|memtier_benchmark-1key-list-1K-elements-lrange-all-elements-pipeline-10|memtier_benchmark-1key-list-2K-elements-quicklist-lrange-all-elements-longs|memtier_benchmark-1key-pfadd-4KB-values-pipeline-10|memtier_benchmark-1key-set-100-elements-sscan|memtier_benchmark-1key-set-1K-elements-smembers|memtier_benchmark-1key-zset-10-elements-zrange-all-elements|memtier_benchmark-1key-zset-100-elements-zrange-all-elements|memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements|memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements-long-scores|memtier_benchmark-1key-zset-100-elements-zscan|memtier_benchmark-1key-zset-1K-elements-zrange-all-elements|memtier_benchmark-1key-zset-1M-elements-zscore-pipeline-10|memtier_benchmark-2keys-set-10-100-elements-sdiff|memtier_benchmark-2keys-set-10-100-elements-sunion|memtier_benchmark-2keys-stream-5-entries-xread-all-entries-pipeline-10|memtier_benchmark-2keys-zset-300-elements-skiplist-encoded-zunion|memtier_benchmark-2keys-zset-300-elements-skiplist-encoded-zunionstore|memtier_benchmark-3Mkeys-string-mixed-20-80-with-512B-values-pipeline-10-5200_conns Regressions Table
Regressions test regexp names: memtier_benchmark-1Mkeys-bitmap-getbit-pipeline-10|memtier_benchmark-1Mkeys-generic-exists-pipeline-10|memtier_benchmark-1Mkeys-generic-expireat-pipeline-10|memtier_benchmark-1Mkeys-generic-pexpire-pipeline-10|memtier_benchmark-1Mkeys-generic-touch-pipeline-10|memtier_benchmark-1Mkeys-hash-transactions-multi-exec-pipeline-20|memtier_benchmark-1Mkeys-load-set-intset-with-100-elements-pipeline-10|memtier_benchmark-1Mkeys-load-zset-with-10-elements-double-score|memtier_benchmark-1key-100M-bits-bitmap-bitcount|memtier_benchmark-1key-1Billion-bits-bitmap-bitcount|memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat-pipeline-10|memtier_benchmark-1key-list-10-elements-lrange-all-elements-pipeline-10|memtier_benchmark-1key-list-1K-elements-lrange-all-elements|memtier_benchmark-1key-list-1K-elements-lrange-all-elements-pipeline-10|memtier_benchmark-1key-list-2K-elements-quicklist-lrange-all-elements-longs|memtier_benchmark-1key-pfadd-4KB-values-pipeline-10|memtier_benchmark-1key-set-100-elements-smembers|memtier_benchmark-1key-set-1K-elements-smembers|memtier_benchmark-1key-zrank-1M-elements-pipeline-1|memtier_benchmark-1key-zrevrank-1M-elements-pipeline-1|memtier_benchmark-1key-zset-10-elements-zrange-all-elements|memtier_benchmark-1key-zset-10-elements-zrange-all-elements-long-scores|memtier_benchmark-1key-zset-100-elements-zrange-all-elements|memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements-long-scores|memtier_benchmark-1key-zset-1K-elements-zrange-all-elements|memtier_benchmark-2keys-set-10-100-elements-sdiff|memtier_benchmark-2keys-stream-5-entries-xread-all-entries Full Results table:WARNING: There were 6 benchmarks with NO datapoints for both baseline and comparison. NO DATAPOINTS test regexp names: m|m|r|r|r|r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Introduced by #13521 If the client argv was released due to a timeout before sending the complete command, `argv_len` will be reset to 0. When argv is parsed again and resized, requesting a length of 0 may result in argv being NULL, then leading to a crash. And fix a bug that `argv_len` is not updated correctly in `replaceClientCommandVector()`. --------- Co-authored-by: ShooterIT <wangyuancode@163.com> Co-authored-by: meiravgri <109056284+meiravgri@users.noreply.github.com>
…rhead (#13521) inspred by #12730 Before this PR, we allocate new memory to store the user command arguments, however, if the size of the current `c->argv` is larger than the current command, we can reuse the previously allocated argv to avoid allocating new memory for the current command. And we will free `c->argv` in client cron when the client is idle for 2 seconds. --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Introduced by #13521 If the client argv was released due to a timeout before sending the complete command, `argv_len` will be reset to 0. When argv is parsed again and resized, requesting a length of 0 may result in argv being NULL, then leading to a crash. And fix a bug that `argv_len` is not updated correctly in `replaceClientCommandVector()`. --------- Co-authored-by: ShooterIT <wangyuancode@163.com> Co-authored-by: meiravgri <109056284+meiravgri@users.noreply.github.com>
inspred by #12730
Before this PR, we allocate new memory to store the user command arguments, however, if the size of the current
c->argv
is larger than the current command, we can reuse the previously allocated argv to avoid allocating new memory for the current command.And we will free
c->argv
in client cron when the client is idle for 2 seconds.