Abstract Data Storage away from Metric objects, and introduce Swappable Data Stores #1

Closed
wants to merge 11 commits

Conversation


@dmagliola dmagliola commented Sep 13, 2018

This PR does a lot of things. It should really be 10 different PRs, one per commit, but that was highly impractical.

However, please don't click "Files Changed" at the top right; that diff will be impossible to digest. I really suggest going through these commits one by one.


Main things this PR does:

  • A few refactorings / improvements here and there
  • Add keyword arguments to methods, to be more idiomatic by modern Ruby standards
  • Create the concept of Data Stores, separate from the Metrics the user declares, to have a clear interface that allows us to swap different implementations.
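To make the keyword-argument point concrete, here is a sketch of the calling style this PR moves towards. This is a toy class with illustrative names, not the exact client API:

```ruby
# Toy counter-like metric, sketched only to show the kwargs calling style
# this PR introduces. Names and signatures are illustrative.
class Counter
  def initialize(name, docstring:, labels: [])
    @name = name
    @docstring = docstring
    @labels = labels
    @values = Hash.new(0.0)
  end

  def increment(by: 1, labels: {})
    @values[labels] += by
  end

  def get(labels: {})
    @values[labels]
  end
end

counter = Counter.new(:http_requests, docstring: "Requests served", labels: [:code])
counter.increment(by: 2, labels: { code: "200" })
counter.get(labels: { code: "200" }) # => 2.0
```

Compared to positional arguments, this makes call sites self-documenting and lets callers omit defaults such as `by: 1`.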

All of this is groundwork to allow us to tackle the real issue we want to tackle: Pre-fork servers

We will be working on that, and on other improvements to the Client to bring it closer in line with the Prometheus Best Practices for Client Libraries, in subsequent PRs.

@dmagliola dmagliola force-pushed the pluggable_data_stores branch 4 times, most recently from c787dab to e0e87e7 Compare September 14, 2018 16:45
@dmagliola dmagliola force-pushed the pluggable_data_stores branch 7 times, most recently from 4c9a9be to e2c1154 Compare October 16, 2018 14:54
@dmagliola dmagliola changed the title WIPWIPWIWPIPWIPWIWPIPW Pluggable data stores Abstract Data Storage away from Metric objects, and introduce Swappable Data Stores Oct 16, 2018

@tscolari tscolari left a comment


It LGTM!
I think it's looking a lot better.

@dmagliola dmagliola force-pushed the pluggable_data_stores branch from 0c1c0d4 to 1943b2a Compare October 18, 2018 16:34
Daniel Magliola added 10 commits October 18, 2018 17:37
Ruby < 2.0 was EOL'd over 3 years ago, and we want to use kwargs,
so we're dropping support for it.
This should make it cleaner and more obvious to call into these methods.
Instead of allowing the first observation / setting of a metric to have
any labels, and then validating that any further observations match those,
we now require that the labels a metric will use be declared when the
metric is instantiated.
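A sketch of what declaration-time label validation looks like. This is illustrative code, not the client's actual implementation:

```ruby
# Illustrative sketch: labels are declared at instantiation, and every
# observation must provide exactly that set of labels.
class Metric
  def initialize(name, labels: [])
    @name = name
    @labels = labels
    @values = Hash.new(0.0)
  end

  def observe(value, labels: {})
    unless labels.keys.sort == @labels.sort
      raise ArgumentError, "labels must match the declared set: #{@labels}"
    end
    @values[labels] += value
  end

  def get(labels: {})
    @values[labels]
  end
end

m = Metric.new(:queue_size, labels: [:queue])
m.observe(3, labels: { queue: "default" })  # OK: matches the declared labels
m.get(labels: { queue: "default" })         # => 3.0

begin
  m.observe(1, labels: { shard: "a" })      # rejected: undeclared label
rescue ArgumentError => e
  e.message # mentions the declared set
end
```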

The Validator now only needs to verify that the provided labels match the
expectation, without needing to keep track of the first observation. For
now, it's still caching the validations, for performance, but that should
disappear in a soon-to-come commit.

`base_labels` now works slightly differently. Instead of being a "base"
value that is set when creating the metric and then combined with the
"other" labels at export time, it's now called `preset_values`, and it
gets merged into the labels stored with every observation.

This means the storage that holds the values will always have all the labels,
including the common ones. This makes it possible to have a method like
the `labels()` one recommended by the best practices, which provides a
"view" of a metric with some labels pre-applied. This will be added in a
future commit, once we have a central data store.

Finally, this needed a little adaptation of the Collector middleware
that comes bundled in. This Collector automatically tracks metrics with
some pre-defined labels, but allows the user to define which labels
they want, on a per-request basis, by specifying a `proc` that will generate
the labels given a Rack `env`.

Given this design, we can't know what labels the custom `proc` will return
at the time of initializing the metric, and we don't have an `env` to
pass to them, so the interface of these procs has changed such that they
now need to be able to handle an empty `env`, and still return a `hash`
with all the right keys.
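An illustrative label-builder proc under the new contract. The env keys shown are standard Rack keys, but the overall shape is a sketch, not the Collector's exact code:

```ruby
# A label-builder proc for the bundled Collector middleware must now
# return a hash with the same keys even when called with an empty env,
# because the metric's labels are declared before any request arrives.
label_builder = proc do |env|
  {
    method: env["REQUEST_METHOD"].to_s.downcase,
    path:   env["PATH_INFO"].to_s,
  }
end

label_builder.call({}).keys
# => [:method, :path]  (empty values, but all the right keys)

label_builder.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/foo")
# => { method: "get", path: "/foo" }
```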
Expose only `sum` and `count` instead, with no quantiles / percentiles.

The main reason to do this is that this whole change is based on the idea
of having a "Store" interface that can be shared between Ruby processes.

The `quantile` gem doesn't play well with this, since we'd need to store
instances of `Quantile::Estimator`, which is a complex data structure, and
tricky to share between Ruby processes.

Moreover, individual Summaries on different processes cannot be aggregated,
so all processes would actually have to share one instance of this class,
which makes it extremely tricky, particularly to do performantly.

Even though this is a loss of functionality, this puts the Ruby client
on par with other client libraries, like the Python one, which also only
offers sum and count without quantiles.

Also, this is actually more compliant with the Client Library best practices:
- Summary is ENCOURAGED to offer quantiles as exports
- MUST allow not having quantiles, since just count and sum is quite useful
- This MUST be the default

The original client didn't comply with the last 2 rules, whereas this one
does, just like the Python client.

And quantiles, while seemingly the point of summaries, are encouraged but
not required.

I'm not discarding the idea of adding quantiles back. I have ideas on how
this could be done, but it'd probably be pretty expensive, as a single
Estimator instance would have to be marshalled / unmarshalled into bytes
for every call to `observe`, and it'd have a hard requirement of having
some sort of low-cardinality Process ID as a label (possible with Unicorn)
to avoid aggregation.
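A quantile-less Summary reduces to very little state, which is exactly what makes it easy to share between processes. A minimal sketch (illustrative, not the client's code):

```ruby
# Minimal sketch of a quantile-less Summary: only `sum` and `count` are
# tracked, which is all this commit exposes. Unlike a Quantile::Estimator,
# these two numbers are trivially storable and aggregatable across processes.
class Summary
  attr_reader :sum, :count

  def initialize
    @sum = 0.0
    @count = 0
  end

  def observe(value)
    @sum += value
    @count += 1
  end
end

s = Summary.new
[0.1, 0.3, 0.2].each { |v| s.observe(v) }
s.sum   # ~0.6 (within float precision)
s.count # => 3
```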
The reason we need this gem is to be able to provide a good concurrency
story in our "Data Stores".

We want the data stores to be thread-safe to make it easy for metrics to
use them, and to shield away the complexity so that the store can do what's
most efficient, and metrics don't need to make assumptions on how it
works.

However, we also need metrics to be able to update multiple values at once
so in some cases (most notably, histograms), the store does need to provide
a synchronization method.

Since the normal `set` / `increment` methods already synchronize on a
Mutex, wrapping them in another Mutex block means we end up calling
`Mutex#synchronize` twice, which *should* work, but Ruby Mutexes are not
reentrant.

`concurrent-ruby` provides a reentrant lock. It's a read-write lock, not
a Mutex, but by always grabbing write locks, it's functionally the same.
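The non-reentrancy is easy to demonstrate with the stdlib alone. This is the exact failure a nested `synchronize` would hit:

```ruby
# Stdlib demonstration: locking the same Mutex twice from the same
# thread raises, because Ruby Mutexes are not reentrant. This is what
# would happen if a histogram's synchronization block wrapped the
# already-synchronized `set` / `increment` calls.
mutex = Mutex.new

error = nil
begin
  mutex.synchronize do
    mutex.synchronize { } # second lock attempt by the same thread
  end
rescue ThreadError => e
  error = e
end

error.class # => ThreadError ("deadlock; recursive locking" on MRI)
```

`Concurrent::ReentrantReadWriteLock#with_write_lock`, by contrast, can be nested by the same thread, which is why the gem is being pulled in.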
Currently, each Metric object stores its data (labels and values) in a Hash
object internal to itself. This change introduces the concept of a Data Store
that is external to the metrics themselves, which holds all the data.

The main reason to do this is that by having a standardized and simple
interface that metrics use to access the store, we abstract away the details
of storing the data from the specific needs of each metric.

This allows us to then simply swap around the stores based on the needs
of different circumstances, with no changes to the rest of the client. The
main use case for this is solving the "pre-fork" server scenario, where
multiple processes need to share the metric storage, to be able to report
coherent numbers to Prometheus.

This change provides one store, a simple Hash, that has pretty much the same
characteristics as the existing storage. Future commits may introduce
new ones, and also each consumer of the Prometheus Client can make their own
and simply swap them for the built-in ones, if they have specific needs.

The interface and requirements of Stores are specified in detail in a README.md
file in the `client/data_stores` directory. This is the documentation that must
be used by anyone wishing to create their own store.
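A rough sketch of what such a store looks like. Method names here are hypothetical; the README.md mentioned above is the authoritative spec for the real interface:

```ruby
# Illustrative data store with roughly Hash-like characteristics: all
# metric values live here, keyed by labelset, behind a single interface
# that metrics use without knowing how storage works. Method names are
# hypothetical, not the documented store interface.
class SimpleHashStore
  def initialize
    @store = Hash.new(0.0)
    @lock = Mutex.new
  end

  def set(labels:, val:)
    @lock.synchronize { @store[labels] = val }
  end

  def increment(labels:, by: 1)
    @lock.synchronize { @store[labels] += by }
  end

  def get(labels:)
    @lock.synchronize { @store[labels] }
  end

  # Export needs a consistent snapshot of everything.
  def all_values
    @lock.synchronize { @store.dup }
  end
end

store = SimpleHashStore.new
store.increment(labels: { code: "500" }, by: 3)
store.set(labels: { code: "200" }, val: 5)
store.get(labels: { code: "500" }) # => 3.0
```

Because metrics only ever talk to this interface, a pre-fork deployment can swap in a store backed by shared memory or files without touching the metric classes.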
The existing spec on `Formats::Text` is based on mocking a registry with
fake metrics and values inside, which means if anything changes in the
interface for metrics, the test will not catch it.

And there's no test validating that interface.

This test is more realistic, and it actually catches the kind of bugs
we introduced in the process of this refactor.
`with_labels` returns a metric object, augmented with some pre-set labels.
This is useful to be able to reduce repetition if a certain part of the
code is passing the same label over and over again, but the label can
have different values in different parts of the codebase (so passing in
`preset_labels` when declaring the metric is not appropriate).
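A sketch of the idea (illustrative implementation, not the client's): the augmented object shares the underlying storage and merges its pre-set labels into every call.

```ruby
# Illustrative `with_labels`: returns a metric-like view with some labels
# pre-applied, sharing the same underlying values as the original.
class Counter
  def initialize(values = Hash.new(0), preset_labels = {})
    @values = values
    @preset_labels = preset_labels
  end

  def with_labels(labels)
    # Same store, more preset labels.
    Counter.new(@values, @preset_labels.merge(labels))
  end

  def increment(labels: {})
    @values[@preset_labels.merge(labels)] += 1
  end

  def get(labels: {})
    @values[@preset_labels.merge(labels)]
  end
end

counter = Counter.new
api = counter.with_labels(service: "api")
api.increment(labels: { code: "200" })

# The observation is visible through the original metric, fully labeled:
counter.get(labels: { service: "api", code: "200" }) # => 1
```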
Every time a metric gets observed / incremented, we are composing the
labels hash, merging the labels that were just passed in as part of the
observation with the metric's `preset_labels`, and validating that the
labelset is valid.

If all labels are pre-set already, however (either as `preset_labels` when
declaring the metric, or by use of `with_labels`), we can validate when
the labels are pre-set, and then skip the validation on every observation,
which can lead to some decent time savings if a metric is observed often.
LabelSetValidator has methods `valid?` and `validate`. It's not very clear
what these do, what's the difference between them, and the name `valid?`
would imply (by convention) that it returns a Boolean, rather than raising
an exception on invalid input.

Renamed these to `validate_symbols!` and `validate_labelset!`, to make it
clearer what each of them does. The `!` also follows the usual Ruby
convention of "do X, and if that doesn't work, raise".

This commit also starts drawing the distinction between an array of `labels`,
and a hash of label keys and values (which we call a `labelset`). The term
`labels` is used interchangeably for both concepts throughout the code.
This commit doesn't fix all instances, but it's a step in that direction.
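The renamed methods and the labels/labelset distinction, sketched out. The method names match the commit; the internals are illustrative:

```ruby
# Sketch of the renamed validator. Terminology: `labels` is an array of
# label names ([:code, :method]); a `labelset` is a hash of label keys
# to values ({ code: "200", method: "get" }).
class LabelSetValidator
  class InvalidLabelError < StandardError; end

  def initialize(expected_labels:)
    @expected_labels = expected_labels.sort
  end

  # "do X, and if that doesn't work, raise"
  def validate_symbols!(labelset)
    labelset.each_key do |key|
      raise InvalidLabelError, "label #{key} is not a symbol" unless key.is_a?(Symbol)
    end
    true
  end

  def validate_labelset!(labelset)
    validate_symbols!(labelset)
    unless labelset.keys.sort == @expected_labels
      raise InvalidLabelError, "labelset keys don't match expected labels"
    end
    labelset
  end
end

v = LabelSetValidator.new(expected_labels: [:code, :method])
v.validate_labelset!(code: "200", method: "get") # returns the labelset
```

Unlike the old `valid?`, neither name now suggests a Boolean return, so raising on bad input no longer breaks convention.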
@dmagliola dmagliola force-pushed the pluggable_data_stores branch from 1943b2a to 8480c94 Compare October 18, 2018 16:37
@dmagliola

Look, we haz Travis now!

@dmagliola dmagliola closed this Oct 23, 2018
@dmagliola

Closed to avoid confusion with the actual PR we've created in the prometheus client repo.
