Skip to content

Commit

Permalink
Support metrics integration without quantiles
Browse files Browse the repository at this point in the history
Quantiles computations increases performance overhead by near 10% when
used in statistics. One may want to use statistics with metrics without
quantiles. The patch allows one to do it.

Patch also bumps minimal required metrics rock version.
`metrics >= 0.9.0` is required to use summary quantiles with
age buckets. `metrics >= 0.5.0, < 0.9.0` is unsupported
due to quantile overflow bug [1]. `metrics == 0.9.0` has bug that do
not permits to create summary collector without quantiles [2].
In fact, user may use `metrics >= 0.5.0`, `metrics != 0.9.0`
if he wants to use metrics without quantiles, and `metrics >= 0.9.0`
if he wants to use metrics with quantiles. But this is confusing,
so let's use a single restriction for both cases.

1. tarantool/metrics#235
2. tarantool/metrics#262

Follows up #224
  • Loading branch information
DifferentialOrange committed Dec 24, 2021
1 parent 3dd1c6f commit 36dc9bb
Show file tree
Hide file tree
Showing 10 changed files with 226 additions and 103 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test_on_push.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ jobs:
- tarantool-version: "2.8"
metrics-version: "0.1.8"
- tarantool-version: "2.8"
metrics-version: "0.9.0"
metrics-version: "0.10.0"
- tarantool-version: "2.8"
coveralls: true
metrics-version: "0.12.0"
Expand Down
14 changes: 9 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -606,18 +606,21 @@ crud.disable_stats()
crud.reset_stats()
```

If [`metrics`](https://github.com/tarantool/metrics) `0.9.0` or greater
If [`metrics`](https://github.com/tarantool/metrics) `0.10.0` or greater
found, metrics collectors will be used by default to store statistics
instead of local collectors. You can manually choose driver if needed.
```lua
-- Use metrics collectors.
-- Use metrics collectors. (Default if metrics found).
crud.enable_stats({ driver = 'metrics' })

-- Use metrics collectors with 0.99 quantile.
crud.enable_stats({ driver = 'metrics', quantiles = true })

-- Use simple local collectors.
crud.enable_stats({ driver = 'local' })
```
Performance overhead is 3-5% in case of `local` driver and
10-20% in case of `metrics` driver.
Performance overhead is 3-7% in case of `local` driver and
5-10% in case of `metrics` driver, up to 20% for `metrics` with quantiles.

Format is as follows.
```
Expand Down Expand Up @@ -668,7 +671,8 @@ Each operation section contains of different collectors
for success calls and error (both error throw and `nil, err`)
returns. `count` is total requests count since instance start
or stats restart. `latency` is 0.99 quantile of request execution
time if `metrics` driver used, otherwise `latency` is total average.
time if `metrics` driver used and quantiles enabled,
otherwise `latency` is total average.
`time` is total time of requests execution.

In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
Expand Down
43 changes: 28 additions & 15 deletions crud/stats/local_registry.lua
Original file line number Diff line number Diff line change
@@ -1,23 +1,36 @@
local errors = require('errors')

local dev_checks = require('crud.common.dev_checks')
local op_module = require('crud.stats.operation')
local registry_common = require('crud.stats.registry_common')
local stash = require('crud.stats.stash')

local registry = {}
local internal_registry = stash.get('local_registry')
local internal = stash.get('local_registry')
local StatsLocalError = errors.new_class('StatsLocalError', {capture_stack = false})

--- Initialize local metrics registry
--
-- Registries are not meant to used explicitly
-- by users, init is not guaranteed to be idempotent.
--
-- @function init
-- @tparam table opts
--
-- @tfield boolean quantiles
-- Quantiles is not supported for local, only `false` is valid.
--
-- @treturn boolean Returns true.
--
function registry.init()
internal_registry.spaces = {}
internal_registry.space_not_found = 0
function registry.init(opts)
dev_checks({ quantiles = 'boolean' })

StatsLocalError:assert(opts.quantiles == false,
"Quantiles are not supported for 'local' statistics registry")

internal.registry = {}
internal.registry.spaces = {}
internal.registry.space_not_found = 0

return true
end
Expand All @@ -32,7 +45,7 @@ end
-- @treturn boolean Returns true.
--
function registry.destroy()
internal_registry = stash.reset('local_registry')
internal.registry = nil

return true
end
Expand All @@ -58,10 +71,10 @@ function registry.get(space_name)
dev_checks('?string')

if space_name ~= nil then
return table.deepcopy(internal_registry.spaces[space_name]) or {}
return table.deepcopy(internal.registry.spaces[space_name]) or {}
end

return table.deepcopy(internal_registry)
return table.deepcopy(internal.registry)
end

--- Check if space statistics are present in registry
Expand All @@ -76,7 +89,7 @@ end
function registry.is_unknown_space(space_name)
dev_checks('string')

return internal_registry.spaces[space_name] == nil
return internal.registry.spaces[space_name] == nil
end

--- Increase requests count and update latency info
Expand All @@ -101,8 +114,8 @@ end
function registry.observe(latency, space_name, op, status)
dev_checks('number', 'string', 'string', 'string')

registry_common.init_collectors_if_required(internal_registry.spaces, space_name, op)
local collectors = internal_registry.spaces[space_name][op][status]
registry_common.init_collectors_if_required(internal.registry.spaces, space_name, op)
local collectors = internal.registry.spaces[space_name][op][status]

collectors.count = collectors.count + 1
collectors.time = collectors.time + latency
Expand All @@ -118,7 +131,7 @@ end
-- @treturn boolean Returns true.
--
function registry.observe_space_not_found()
internal_registry.space_not_found = internal_registry.space_not_found + 1
internal.registry.space_not_found = internal.registry.space_not_found + 1

return true
end
Expand All @@ -142,8 +155,8 @@ function registry.observe_fetch(tuples_fetched, tuples_lookup, space_name)
dev_checks('number', 'number', 'string')

local op = op_module.SELECT
registry_common.init_collectors_if_required(internal_registry.spaces, space_name, op)
local collectors = internal_registry.spaces[space_name][op].details
registry_common.init_collectors_if_required(internal.registry.spaces, space_name, op)
local collectors = internal.registry.spaces[space_name][op].details

collectors.tuples_fetched = collectors.tuples_fetched + tuples_fetched
collectors.tuples_lookup = collectors.tuples_lookup + tuples_lookup
Expand All @@ -167,8 +180,8 @@ function registry.observe_map_reduces(count, space_name)
dev_checks('number', 'string')

local op = op_module.SELECT
registry_common.init_collectors_if_required(internal_registry.spaces, space_name, op)
local collectors = internal_registry.spaces[space_name][op].details
registry_common.init_collectors_if_required(internal.registry.spaces, space_name, op)
local collectors = internal.registry.spaces[space_name][op].details

collectors.map_reduces = collectors.map_reduces + count

Expand Down
Loading

0 comments on commit 36dc9bb

Please sign in to comment.