forked from prometheus/client_ruby
Abstract Data Storage away from Metric objects, and introduce Swappable Data Stores #1
Closed
Conversation
dmagliola force-pushed the pluggable_data_stores branch 4 times, most recently from c787dab to e0e87e7 on September 14, 2018 16:45
dmagliola force-pushed the pluggable_data_stores branch 7 times, most recently from 4c9a9be to e2c1154 on October 16, 2018 14:54
dmagliola changed the title from "WIPWIPWIWPIPWIPWIWPIPW Pluggable data stores" to "Abstract Data Storage away from Metric objects, and introduce Swappable Data Stores" on Oct 16, 2018
dmagliola requested review from ernesto-jimenez, Sinjo, tscolari and lawrencejones on October 16, 2018 15:21
tscolari approved these changes on Oct 16, 2018
It LGTM!
I think it's looking a lot better
dmagliola force-pushed the pluggable_data_stores branch from 0c1c0d4 to 1943b2a on October 18, 2018 16:34
Ruby < 2.0 was EOL'd over 3 years ago, and we want to use kwargs, so we're dropping support for it.
This should make it cleaner and more obvious to call into these methods.
Instead of allowing the first observation / setting of a metric to have any labels, and then validating that any further observations match those, we now require declaring what labels will be specified at the time of instantiating the metric. The Validator now only needs to verify that the provided labels match the declaration, without needing to keep track of the first observation. For now, it's still caching the validations for performance, but that should disappear in a soon-to-come commit.

`base_labels` now works slightly differently. Instead of being a "base" value that is set when creating the metric and then combined with the "other" labels at the time of exporting, it's now called `preset_values`, and it gets merged into the labels stored with every observation. This means the storage that holds the values will always have all the labels, including the common ones. This allows having a method like the `labels()` one recommended by the best practices, to get a "view" of a metric with some labels pre-applied. That will be added in a future commit, once we have a central data store.

Finally, this needed a little adaptation of the Collector middleware that comes bundled in. This Collector automatically tracks metrics with some pre-defined labels, but allows the user to define which labels they want, on a per-request basis, by specifying a `proc` that will generate the labels given a Rack `env`. Given this design, we can't know what labels the custom `proc` will return at the time of initializing the metric, and we don't have an `env` to pass to it, so the interface of these procs has changed: they now need to be able to handle an empty `env`, and still return a `hash` with all the right keys.
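The declare-up-front approach can be sketched as follows. This is a hypothetical, simplified illustration (the class and method names mirror the client's general shape, but the bodies are not its actual implementation): the labels are fixed at instantiation, and each observation is checked against that declaration rather than against the first observation seen.

```ruby
# Simplified sketch: labels declared at instantiation, every observation
# validated against that declaration.
class Counter
  def initialize(name, labels: [])
    @name = name
    @labels = labels.sort
    @values = Hash.new(0.0)
  end

  def increment(labels: {}, by: 1)
    unless labels.keys.sort == @labels
      raise ArgumentError, "labelset must match declared labels: #{@labels}"
    end
    @values[labels] += by
  end

  def get(labels: {})
    @values[labels]
  end
end

counter = Counter.new(:http_requests_total, labels: [:code])
counter.increment(labels: { code: "200" })
counter.get(labels: { code: "200" }) # => 1.0
```

Observing with an undeclared label (e.g. `labels: { verb: "GET" }`) raises immediately, instead of silently establishing a new labelset.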
Expose only `sum` and `count` instead, with no quantiles / percentiles.

The main reason to do this is that this whole change is based on the idea of having a "Store" interface that can be shared between Ruby processes. The `quantile` gem doesn't play well with this, since we'd need to store instances of `Quantile::Estimator`, which is a complex data structure and tricky to share between Ruby processes. Moreover, individual Summaries on different processes cannot be aggregated, so all processes would actually have to share one instance of this class, which makes it extremely tricky, particularly to do performantly.

Even though this is a loss of functionality, it puts the Ruby client on par with other client libraries, like the Python one, which also offers only sum and count without quantiles. It is also actually more compliant with the Client Library best practices:

- Summary is ENCOURAGED to offer quantiles as exports
- MUST allow not having quantiles, since just count and sum is quite useful
- This MUST be the default

The original client didn't comply with the last 2 rules, where this one does, just like the Python client. And quantiles, while seemingly the point of summaries, are encouraged but not required.

I'm not discarding the idea of adding quantiles back. I have ideas on how this could be done, but it'd probably be pretty expensive, as a single Estimator instance would have to be marshalled / unmarshalled into bytes for every call to `observe`, and it'd have a hard requirement of having some sort of low-cardinality Process ID as a label (possible with Unicorn) to avoid aggregation.
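A quantile-free Summary reduces to two accumulators. This is a minimal sketch of the behaviour described above, with illustrative names rather than the client's actual internals:

```ruby
# Minimal sketch: a Summary that tracks only sum and count, no quantiles.
# Both values are trivially aggregatable across processes, unlike a
# Quantile::Estimator instance.
class Summary
  attr_reader :sum, :count

  def initialize(name)
    @name = name
    @sum = 0.0
    @count = 0
  end

  def observe(value)
    @sum += value
    @count += 1
  end
end

latency = Summary.new(:request_duration_seconds)
latency.observe(0.5)
latency.observe(1.5)
latency.sum   # => 2.0
latency.count # => 2
```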
The reason we need this gem is to be able to provide a good concurrency story in our "Data Stores". We want the data stores to be thread-safe to make it easy for metrics to use them, and to hide the complexity away, so that each store can do whatever is most efficient and metrics don't need to make assumptions about how it works.

However, we also need metrics to be able to update multiple values at once (most notably, histograms), so in some cases the store does need to provide a synchronization method. Since the normal `set` / `increment` methods already take a Mutex, wrapping them in another Mutex block means we end up calling `Mutex#synchronize` twice, which *should* work, but Ruby Mutexes are not reentrant. `concurrent-ruby` provides a reentrant lock. It's a read-write lock, not a Mutex, but by always grabbing write locks, it's functionally the same.
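The non-reentrancy that motivates the gem can be demonstrated with the standard library alone. In this sketch, a nested `synchronize` on the same `Mutex` raises `ThreadError` in MRI; a reentrant lock such as concurrent-ruby's `Concurrent::ReentrantReadWriteLock` (acquired via `with_write_lock`) allows the same thread to re-acquire it instead.

```ruby
# Demonstrates that Ruby's stdlib Mutex is not reentrant: a second
# acquisition by the thread that already holds it raises ThreadError.
mutex = Mutex.new
nested_lock_failed = false
begin
  mutex.synchronize do
    mutex.synchronize { } # same thread tries to lock again
  end
rescue ThreadError
  nested_lock_failed = true
end
nested_lock_failed # => true
```

This is exactly the "metric's multi-value update wraps the store's own locked `set`" situation described above: with a plain Mutex the inner call deadlocks (and MRI raises rather than hanging), whereas a reentrant write lock makes the nesting safe.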
Currently, each Metric object stores its data (labels and values) in a Hash internal to itself. This change introduces the concept of a Data Store that is external to the metrics themselves, which holds all the data.

The main reason to do this is that, by having a standardized and simple interface that metrics use to access the store, we abstract away the details of storing the data from the specific needs of each metric. This allows us to simply swap stores around based on the needs of different circumstances, with no changes to the rest of the client. The main use case for this is solving the "pre-fork" server scenario, where multiple processes need to share the metric storage to be able to report coherent numbers to Prometheus.

This change provides one store, a simple Hash, which has pretty much the same characteristics as the existing storage. Future commits may introduce new ones, and each consumer of the Prometheus Client can also make their own and simply swap it in for the built-in ones, if they have specific needs.

The interface and requirements of Stores are specified in detail in a README.md file in the `client/data_stores` directory. That is the documentation that must be used by anyone wishing to create their own store.
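The shape of such a store can be sketched roughly as below. This is a hypothetical illustration only: the method names (`for_metric`, `set`, `increment`, `get`, `all_values`) are plausible guesses at the kind of interface described, not the actual one specified in the data stores README.

```ruby
# Hypothetical sketch of a hash-backed data store external to the metrics.
# Metrics talk only to the MetricStore view; swapping the backing storage
# (e.g. for a file or shared-memory store) requires no metric changes.
class SimpleHashStore
  def initialize
    @internal = Hash.new { |h, k| h[k] = Hash.new(0.0) }
    @lock = Mutex.new
  end

  # Each metric gets a view scoped to its own slice of the storage.
  def for_metric(metric_name)
    MetricStore.new(@internal[metric_name], @lock)
  end

  class MetricStore
    def initialize(hash, lock)
      @hash = hash
      @lock = lock
    end

    def set(labels:, val:)
      @lock.synchronize { @hash[labels] = val }
    end

    def increment(labels:, by: 1)
      @lock.synchronize { @hash[labels] += by }
    end

    def get(labels:)
      @lock.synchronize { @hash[labels] }
    end

    def all_values
      @lock.synchronize { @hash.dup }
    end
  end
end

store = SimpleHashStore.new
requests = store.for_metric(:http_requests_total)
requests.increment(labels: { code: "200" })
requests.increment(labels: { code: "200" }, by: 2)
requests.get(labels: { code: "200" }) # => 3.0
```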
The existing spec on `Formats::Text` is based on mocking a registry with fake metrics and values inside, which means that if anything changes in the interface for metrics, the test will not catch it, and there's no test validating that interface. This test is more realistic, and it actually catches the kind of bugs we introduced in the process of this refactor.
`with_labels` returns a metric object, augmented with some pre-set labels. This is useful for reducing repetition when a certain part of the code passes the same label over and over again, but the label can have different values in different parts of the codebase (so passing in `preset_labels` when declaring the metric is not appropriate).
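A minimal sketch of the idea, under the assumption (not confirmed by this commit message alone) that the returned object shares the parent metric's storage and differs only in its preset labels; the class body here is illustrative, not the client's implementation:

```ruby
# Hypothetical sketch of with_labels: a shallow copy that shares storage
# with the parent metric but carries extra preset labels.
class Counter
  def initialize(name, labels: [], preset_labels: {})
    @name = name
    @labels = labels
    @preset_labels = preset_labels
    @values = Hash.new(0.0)
  end

  def increment(labels: {}, by: 1)
    @values[@preset_labels.merge(labels)] += by
  end

  def get(labels: {})
    @values[@preset_labels.merge(labels)]
  end

  def with_labels(extra)
    copy = dup # shallow copy: @values (the storage) stays shared
    copy.instance_variable_set(:@preset_labels, @preset_labels.merge(extra))
    copy
  end
end

requests = Counter.new(:http_requests_total, labels: [:service, :code])
api_requests = requests.with_labels(service: "api")
api_requests.increment(labels: { code: "200" })

# The observation is visible through the parent metric too:
requests.get(labels: { service: "api", code: "200" }) # => 1.0
```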
Every time a metric gets observed / incremented, we compose the labels hash by merging the labels passed in with the observation into the metric's `preset_labels`, and validate that the labelset is valid. If all labels are pre-set already, however (either as `preset_labels` when declaring the metric, or by use of `with_labels`), we can validate at the time the labels are pre-set, and then skip the validation on every observation, which can lead to some decent time savings if a metric is observed often.
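The optimization can be sketched like this (a simplified, hypothetical illustration; names and bodies are not the client's actual code): when the preset labels already cover every declared label, the only labelset the metric can ever store is validated once, at construction time.

```ruby
# Sketch: validate once at construction when all labels are preset,
# skip per-observation validation thereafter.
class Gauge
  def initialize(name, labels: [], preset_labels: {})
    @name = name
    @labels = labels.sort
    @preset_labels = preset_labels
    @values = Hash.new(0.0)
    # If every declared label is preset, this is the only labelset we will
    # ever store, so validate it now instead of on every call to `set`.
    @all_labels_preset = (preset_labels.keys.sort == @labels)
    validate!(preset_labels) if @all_labels_preset
  end

  def set(value, labels: {})
    labelset = @preset_labels.merge(labels)
    validate!(labelset) unless @all_labels_preset # hot path skips this
    @values[labelset] = value
  end

  def get(labels: {})
    @values[@preset_labels.merge(labels)]
  end

  private

  def validate!(labelset)
    return if labelset.keys.sort == @labels
    raise ArgumentError, "labelset must match declared labels: #{@labels}"
  end
end
```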
LabelSetValidator has methods `valid?` and `validate`. It's not very clear what these do or how they differ, and the name `valid?` would imply (by convention) that it returns a Boolean, rather than raising an exception on invalid input. Renamed these to `validate_symbols!` and `validate_labelset!`, to make it clearer what each of them does. The `!` also follows the usual Ruby convention of "do X, and if that doesn't work, raise".

This commit also starts drawing the distinction between an array of `labels` and a hash of label keys and values (which we call a `labelset`). The term `labels` is used interchangeably for both concepts throughout the code. This commit doesn't fix all instances, but it's a step in that direction.
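The labels / labelset distinction can be sketched as below. The validation rules inside each method are illustrative assumptions (the reserved `__` prefix check mirrors Prometheus naming conventions), not necessarily the validator's exact checks:

```ruby
# Sketch of the renamed validator methods:
# - validate_symbols!  checks label *names* (an array of symbols)
# - validate_labelset! checks a *labelset* (a hash of name => value)
# Both raise on invalid input, per the Ruby `!` convention.
class LabelSetValidator
  def initialize(expected_labels:)
    @expected = expected_labels.sort
  end

  def validate_symbols!(labels)
    labels.each do |label|
      raise ArgumentError, "#{label} must be a symbol" unless label.is_a?(Symbol)
      raise ArgumentError, "#{label} is reserved" if label.to_s.start_with?("__")
    end
    labels
  end

  def validate_labelset!(labelset)
    return labelset if labelset.keys.sort == @expected
    raise ArgumentError, "labelset must contain exactly: #{@expected}"
  end
end

validator = LabelSetValidator.new(expected_labels: [:code, :service])
validator.validate_labelset!(service: "api", code: "200") # returns the labelset
```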
dmagliola force-pushed the pluggable_data_stores branch from 1943b2a to 8480c94 on October 18, 2018 16:37
Look, we haz Travis now!
Closed to avoid confusion with the actual PR we've created in the actual prometheus client repo
This PR does a lot of things. It should really be 10 different PRs, one per commit, but that was highly impractical.
However, please don't click "Files Changed" at the top right; that diff is going to be impossible to digest. I really suggest going through the commits one by one.
Main things this PR does:
All of this is groundwork to allow us to tackle the real issue we want to tackle: Pre-fork servers
We will be working on that, and on other improvements to bring the Client closer in line with the Prometheus Best Practices for Client Libraries, in subsequent PRs.