forked from prometheus/client_ruby

Commit e0e87e7 (parent 8f036af)
Daniel Magliola committed Sep 14, 2018
Showing 3 changed files with 304 additions and 0 deletions.
@@ -0,0 +1,218 @@
# Custom Data Stores

Stores are essentially an abstraction over a Hash, whose keys are in turn a Hash of labels
plus a metric name. The intention behind having different data stores is to satisfy the
requirements of different production scenarios, each with its own performance tradeoffs.
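
For illustration only (the metric and label names here are made up), the conceptual
mapping a store represents looks like this:

```ruby
# Conceptual content of a store: keys are label sets merged with the
# metric name; values are always Floats.
store = {
  { "__metric_name" => "http_requests_total", "code" => "200" } => 1.0,
  { "__metric_name" => "http_requests_total", "code" => "500" } => 3.0
}
store.fetch("__metric_name" => "http_requests_total", "code" => "200") # => 1.0
```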

The most common of these scenarios is pre-fork servers like Unicorn, which have multiple
separate processes gathering metrics. If each of these had its own store, the metrics
reported on each Prometheus scrape would differ depending on which process handles
the request. Solving this requires some sort of storage shared between the processes,
and there are many ways to achieve that, each with its own tradeoffs.

This abstraction allows us to easily plug in the most adequate store for each scenario.

## Interface

Stores must expose `set`, `increment`, `get`, `label_sets` and `values` methods, which
are explained in the code sample below.

All stored values are `Float`s.

Internally, stores can keep the data however they need to, based on their
requirements. For example, a store that needs to work in a multi-process
environment needs a shared section of memory, via files, an mmap,
or whatever the implementor chooses for their particular use case.

Stores MUST be thread safe.

Ideally, multiple keys should be modifiable simultaneously, but this is not a
hard requirement.
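
As a minimal sketch of the thread-safety requirement (this class is hypothetical, not
part of the interface): a single coarse `Mutex` around every operation is sufficient,
though it forgoes the "multiple keys modifiable simultaneously" ideal:

```ruby
# Hypothetical store fragment: thread safe via one coarse lock.
# Simple and correct, but all updates are serialized.
class CoarselyLockedStore
  def initialize
    @mutex = Mutex.new
    @internal = Hash.new(0.0) # missing keys default to 0.0
  end

  def increment(metric_name:, labels:, by: 1)
    @mutex.synchronize do
      @internal[labels.merge("__metric_name" => metric_name)] += by
    end
  end
end
```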

This is what the interface looks like, in practice:

```ruby
module Prometheus
  module DataStores
    class CustomStore
      # Store a value for a metric and a set of labels.
      # Internally, may add extra "labels" to disambiguate values between,
      # for example, different processes.
      def set(metric_name:, labels:, val:)
        raise NotImplementedError
      end

      def increment(metric_name:, labels:, by: 1)
        raise NotImplementedError
      end

      # Return a value for a metric and a set of labels.
      # Will return the same value stored by `set`, as opposed to `values`, which may
      # return multiple values that need to be aggregated.
      #
      # For example, in a multi-process scenario, `set` may add an extra internal
      # label tagging the value with the process ID. `get` will return the value for
      # "this" process ID. `values` will return the values for all process IDs, for
      # this label set.
      def get(metric_name:, labels:)
        raise NotImplementedError
      end

      # Return all the different sets of labels seen by `set` (not including internal
      # labels the Store itself may have added for its own tracking purposes).
      def label_sets(metric_name:)
        raise NotImplementedError
      end

      # Returns all the values seen by the Store for a given set of labels in a
      # metric.
      # May return anything: a single value, a hash, an array, etc.
      #
      # The values returned by this method cannot be used directly. They
      # need to be processed by an `Aggregator`, which will generally be Store
      # dependent, and which knows what to do with those keys, and how to combine all
      # the values into one coherent one.
      #
      # For example, in a store that works for multiple processes, this may return a
      # hash, keyed by `__pid`.
      #
      # Because of this, Stores are tied to the Aggregators that know how to deal
      # with whatever `values` will return for their specific store. (This is why
      # Aggregators are nested inside a Store class.)
      def values(metric_name:, labels:)
        raise NotImplementedError
      end

      # Aggregators take the potentially multiple values returned by `DataStore#values`
      # and aggregate them into one coherent value.
      #
      # This is necessary when storing multiple separate values for one particular
      # combination of `metric_name` and `labels`, such as in a multi-process
      # environment where different processes keep separate values.
      #
      # A store-specific aggregator is necessary because the rest of the Prometheus
      # client shouldn't know how the Store deals with its data internally. Also, more
      # than one aggregator may be necessary for different metrics. For example,
      # Counters would probably have an aggregator that simply sums the values, but
      # a Gauge will need an aggregator that can take the `max`, `min`, `sum`,
      # etc. of the multiple values set for the "same" gauge, and that needs to be
      # configurable per metric.
      class BaseAggregator
        def self.aggregate(values)
          raise NotImplementedError
        end
      end

      # When creating a new metric, users can define which Aggregator to use when
      # exporting that particular metric. This is useful for Gauges, where some will
      # want a `max`, some a `sum`, etc.
      #
      # For most metrics, however, users shouldn't need to care, and in the default
      # store where each set of labels only has one value, there is only one
      # aggregator that makes sense. As such, we need a default so users can create
      # metrics without having to explicitly define an aggregator.
      def self.default_aggregator
        BaseAggregator
      end
    end
  end
end
```
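
As a hedged sketch of how the rest of the client might consume this interface when
rendering a scrape (`MyStore` stands in for any concrete implementation honoring the
interface above):

```ruby
# Hypothetical scrape rendering, showing how `label_sets`, `values` and the
# aggregator fit together.
store = MyStore.new
aggregator = MyStore.default_aggregator

store.label_sets(metric_name: :http_requests_total).each do |labels|
  raw = store.values(metric_name: :http_requests_total, labels: labels)
  puts "http_requests_total#{labels.inspect} => #{aggregator.aggregate(raw)}"
end
```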

## Sample, imaginary multi-process Data Store

This is just an example of how one could implement a data store, and a clarification of
what `Aggregators` are.

Important: This is **VAPORWARE**, intended simply to show how this could work / how to
implement these interfaces.

```ruby
module Prometheus
  module DataStores
    class SampleMultiprocessStore
      def initialize
        @store = MagicHashSharedBetweenProcesses.new # PStore, for example
      end

      def set(metric_name:, labels:, val:)
        @store[store_key(metric_name, labels)] = val
      end

      def increment(metric_name:, labels:, by: 1)
        @store[store_key(metric_name, labels)] += by
      end

      def get(metric_name:, labels:)
        @store[store_key(metric_name, labels)]
      end

      def store_key(metric_name, labels)
        labels.merge(
          "__metric_name" => metric_name,
          "__pid" => Process.pid
        )
      end

      def label_sets(metric_name:)
        all_store_values.keys.select do |labels|
          labels["__metric_name"] == metric_name
        end.map do |labels|
          labels.reject { |k, _| ["__metric_name", "__pid"].include?(k) }
        end.uniq
      end

      # Returns a hash of { pid => value }
      def values(metric_name:, labels:)
        labels_plus_metric_name = labels.merge("__metric_name" => metric_name)
        labels_plus_metric_name_keys = labels_plus_metric_name.keys

        all_store_values.each_with_object({}) do |(k, v), acc|
          unless k.values_at(*labels_plus_metric_name_keys) == labels_plus_metric_name.values
            next
          end
          acc[k["__pid"]] = v
        end
      end

      def all_store_values
        # This assumes there's something common that all processes can write to, and
        # it's magically synchronized (which is not true of a PStore, for example, but
        # would be true of some sort of external data store like Redis, Memcached, SQLite).

        # This could also do something like:
        #   file_list = Dir.glob(File.join(path, '*.db')).sort
        # which reads all the PStore files / MMapped files, etc., and returns a hash
        # with all of them together, which `values` and `label_sets` can then use.
      end

      class SumAggregator
        # For the most part, we just need to sum the values from all processes
        def self.aggregate(values)
          values.values.sum
        end
      end

      class GaugeAggregator
        MODES = [SUM = :sum, MAX = :max, MIN = :min]

        def initialize(mode:)
          # TODO: Validate the `mode`
          @mode = mode
        end

        # Instance method so it can see the configured `@mode`.
        # This is a horrible way to do this; it's just explaining the idea.
        def aggregate(values)
          values.values.send(@mode)
        end
      end

      def self.default_aggregator
        SumAggregator
      end
    end
  end
end
```
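
To illustrate the aggregation step with made-up data: suppose two processes incremented
the same counter, so `values` returned a `{ pid => value }` hash:

```ruby
# Made-up per-process values, shaped like SampleMultiprocessStore#values output
per_process = { 1234 => 3.0, 1235 => 2.0 }

store = Prometheus::DataStores::SampleMultiprocessStore
store::SumAggregator.aggregate(per_process)                   # => 5.0 (counter)
store::GaugeAggregator.new(mode: :max).aggregate(per_process) # => 3.0 (max-mode gauge)
```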

lib/prometheus/client/data_stores/complex_hash_sample_crap.rb (73 additions & 0 deletions)
@@ -0,0 +1,73 @@

# module Prometheus
#   module Client
#     module DataStores
#       # Stores all the data in a simple, synchronized global Hash
#       #
#       # There are ways of making this faster (because of the naive Mutex usage).
#       class SimpleHash
#         def initialize
#           @store = Hash.new { |hash, key| hash[key] = 0.0 }
#           @mutex = Mutex.new
#         end
#
#         def for_metric(metric_name:, metric_type:, metric_settings:)
#           validate_metric_settings(metric_type: metric_type,
#                                    metric_settings: metric_settings)
#
#           MetricStore.new(store: self, metric_name: metric_name,
#                           metric_type: metric_type, metric_settings: metric_settings)
#         end
#
#         def synchronize
#           @mutex.synchronize { yield }
#         end
#
#         # Public so the nested MetricStore can reach the underlying hash
#         def store
#           @store
#         end
#
#         class MetricStore
#           def initialize(store:, metric_name:, metric_type:, metric_settings:)
#             @store = store
#             @metric_name = metric_name
#             @metric_type = metric_type
#             @metric_settings = metric_settings
#           end
#
#           def set(labels:, val:)
#             @store.synchronize do
#               @store.store[store_key(labels)] = val
#             end
#           end
#
#           def increment(labels:, by: 1)
#             @store.synchronize do
#               @store.store[store_key(labels)] += by
#             end
#           end
#
#           def get(labels:)
#             @store.synchronize do
#               @store.store[store_key(labels)]
#             end
#           end
#
#           def all_values
#             # group by the actual labels, ignoring the process ones that we add
#             # in the store, and call aggregate with those
#           end
#
#           def aggregate(values)
#             case @metric_settings[:mode]
#             # ...
#             end
#           end
#
#           private
#
#           def store_key(labels)
#             labels.merge("__metric_name" => @metric_name)
#           end
#         end
#       end
#     end
#   end
# end
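
Uncommenting and trimming the scratch above into something runnable (a sketch under the
simplifying assumption that metric types, settings validation, and aggregation modes are
dropped; only the synchronized-hash core is kept):

```ruby
module Prometheus
  module Client
    module DataStores
      # Minimal runnable distillation of the SimpleHash idea above:
      # one global Hash, one global Mutex.
      class SimpleHash
        attr_reader :internal # exposed for the nested MetricStore

        def initialize
          @internal = Hash.new { |hash, key| hash[key] = 0.0 }
          @mutex = Mutex.new
        end

        def for_metric(metric_name:)
          MetricStore.new(store: self, metric_name: metric_name)
        end

        def synchronize(&block)
          @mutex.synchronize(&block)
        end

        class MetricStore
          def initialize(store:, metric_name:)
            @store = store
            @metric_name = metric_name
          end

          def set(labels:, val:)
            @store.synchronize { @store.internal[key(labels)] = val }
          end

          def increment(labels:, by: 1)
            @store.synchronize { @store.internal[key(labels)] += by }
          end

          def get(labels:)
            @store.synchronize { @store.internal[key(labels)] }
          end

          private

          def key(labels)
            labels.merge("__metric_name" => @metric_name)
          end
        end
      end
    end
  end
end

# counter = Prometheus::Client::DataStores::SimpleHash.new
#             .for_metric(metric_name: :http_requests_total)
# counter.increment(labels: { "code" => "200" })
```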

@@ -0,0 +1,13 @@

- update readme and "complex example"
- add the "examples_for" spec as pointed at below (a sketch follows this list)
- The way we specify the interface for stores is a README.md in the stores directory,
  and an "examples_for" spec testing the basics of the interface
- commit

- make sure the example still works

- add `with_labels`, and improve the validation performance (there's a stash for that)
- I also have a stash for the error handling strategies
  - when popping that, add the "if label_set" to `Metric#get`, which I missed
- rename `LabelSetValidator.valid?` to `valid_symbols?` or something like that...
  and `validate` to `valid_set`, maybe?
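
The "examples_for" spec doesn't exist yet; a hedged sketch of what it could look like
(RSpec shared examples exercising the interface basics from the README; assumes unset
keys read back as 0.0):

```ruby
# Hypothetical shared examples; a store spec would `it_behaves_like "a data store"`.
shared_examples_for "a data store" do
  subject(:store) { described_class.new }

  it "gets back what was set" do
    store.set(metric_name: :test_metric, labels: { code: "200" }, val: 5.0)
    expect(store.get(metric_name: :test_metric, labels: { code: "200" })).to eq(5.0)
  end

  it "increments, defaulting `by` to 1" do
    store.increment(metric_name: :test_metric, labels: {}, by: 2)
    store.increment(metric_name: :test_metric, labels: {})
    expect(store.get(metric_name: :test_metric, labels: {})).to eq(3.0)
  end

  it "reports every label set seen, without internal labels" do
    store.set(metric_name: :test_metric, labels: { code: "200" }, val: 1.0)
    store.set(metric_name: :test_metric, labels: { code: "500" }, val: 1.0)
    expect(store.label_sets(metric_name: :test_metric))
      .to contain_exactly({ code: "200" }, { code: "500" })
  end
end
```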