Pull metrics out of Clickhouse, expose 'em through Nexus' API #1131
Just to expand on a few things here when it comes to the API and related:
Thanks for putting this up @smklein. I'll try to add the writing I've done on this in the past as I find it, to at least start collecting some thoughts and ideas. A few things come to mind right now.

@rmustacc As far as discovery goes, there's an existing endpoint in Nexus to list the timeseries schema. These are not the individual timeseries, such as the number of bytes sent out of this particular guest NIC, but the general schema that those timeseries conform to. Though it doesn't yet exist, an endpoint for listing the actual timeseries that exist, possibly restricted to a particular schema, could be both useful and relatively straightforward.

It's when we get to running general queries against the actual timeseries data that I'm less confident of the interface. I should say that I've already written a tool that uses a prototype interface for selecting and filtering data. That's implemented here; in general, it works as follows:
It spits out a couple of SQL queries that are run against the tables in ClickHouse. Out pops the data, which may correspond to zero or more timeseries. This all works "fine," in that you can get the correct data out of the database.

For the API in Nexus itself, I'm not sure how to structure this. We could transliterate the existing query-builder tooling, which would mean a pretty generic raw-query endpoint. That would probably work just fine, and is likely the easiest way to meet the criteria of getting raw data out that consumers can use. My concern is that it's not very useful for anything else. I don't know how we'd do aggregations in the database, how to correlate different timeseries (or even align them), or really anything beyond simple selection and filtering.

As I mentioned, I'll try to collect more of my thoughts and writings, and either include them here or start an RFD.
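(For concreteness, here is a minimal sketch of what such a generic raw-query endpoint's request and response shapes could look like. Every name here is made up for illustration, this is not the prototype interface, and it assumes the chrono and serde crates.)

```rust
use std::collections::BTreeMap;

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

/// Hypothetical body for a generic "give me raw samples" request: pick a
/// timeseries schema, filter on field values, and bound the time range.
#[derive(Debug, Serialize, Deserialize)]
struct RawTimeseriesQuery {
    /// Name of the timeseries schema to select from.
    timeseries_name: String,
    /// Equality filters on fields, e.g. {"instance_id": "<uuid>"}.
    filters: BTreeMap<String, String>,
    /// Only return samples at or after this time, if given.
    start_time: Option<DateTime<Utc>>,
    /// Only return samples strictly before this time, if given.
    end_time: Option<DateTime<Utc>>,
}

/// One raw sample handed back to the client. A real version would use an
/// enum over the supported measurement types rather than a bare f64.
#[derive(Debug, Serialize, Deserialize)]
struct RawSample {
    timestamp: DateTime<Utc>,
    value: f64,
}
```

Presumably the server would translate such a request into the couple of SQL queries described above (one to find the matching timeseries by their field values, one to pull the samples), much as the prototype tool does.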
I would be fine with the limited, simple API and doing a bit of processing on the client for now. Do you think we'd be able to specify a granularity or a maximum number of data points for a given range, or something like that? That would make things easier on us and limit the response size.
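(One way that could look, with hypothetical names only: the client passes a window plus a cap on points, and the server derives a downsampling bucket width from it. Assumes chrono with its serde feature; none of this is an existing Nexus parameter set.)

```rust
use chrono::{DateTime, Utc};
use serde::Deserialize;

/// Hypothetical query parameters bounding the response size for one request.
#[derive(Debug, Deserialize)]
struct MetricsWindowParams {
    start_time: DateTime<Utc>,
    end_time: DateTime<Utc>,
    /// Never return more than this many points for the window.
    max_points: Option<u32>,
}

/// Pick a bucket width (in seconds) such that returning one aggregated
/// point per bucket keeps the response under `max_points`.
fn bucket_width_secs(params: &MetricsWindowParams) -> i64 {
    let span = (params.end_time - params.start_time).num_seconds().max(1);
    let cap = i64::from(params.max_points.unwrap_or(1_000)).max(1);
    (span + cap - 1) / cap // ceiling division
}
```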
As demoed by @leftwo on the 6/23 hypervisor sync, I think we will very soon have metrics from Crucible volumes and Propolis instances too. Both will be sitting in ClickHouse for now.
I wanted to drop some thoughts that I've not yet had time to write up formally.

I was initially leaning towards a "query-first" API, where clients can basically select ranges of raw data from timeseries and process them however they want. That's flexible, and makes sense when it's not clear how most folks will actually use the data: they get to decide that. On the other hand, the API is harder to build and pushes work onto the clients we do have, such as graphing data in the console.

Talking with others and thinking more about it, a "resource-first" approach may be better. That is, we have endpoints for collecting metrics about a specific resource, such as a VM instance. Such an endpoint would just send back an object with the latest sample value for some set of metrics. Those metrics may or may not be the same as the metrics stored in the database itself (e.g., this could include a median response latency for Nexus's HTTP server, rather than the histogram we store in ClickHouse). That means each endpoint would basically boil down to one or more queries to ClickHouse to get the fields it needs and stuff them into the blob returned to the client. This has the drawback of a lot of endpoints, and thus a lot of types. But it's also nice that it decouples the database representation from the actual metrics we're exporting in the API.
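(A rough illustration of what one such per-resource blob might look like, assuming the uuid and chrono crates with serde support; the type and field names are invented, not proposed Nexus types.)

```rust
use chrono::{DateTime, Utc};
use serde::Serialize;
use uuid::Uuid;

/// Hypothetical "latest metrics" object for a VM instance, assembled
/// server-side from one or more ClickHouse queries. The fields are
/// derived values and need not mirror what's stored in the database.
#[derive(Debug, Serialize)]
struct InstanceMetricsLatest {
    instance_id: Uuid,
    time_collected: DateTime<Utc>,
    /// Derived scalar, even though the database stores a histogram.
    median_disk_latency_ms: Option<f64>,
    bytes_sent: Option<u64>,
    bytes_received: Option<u64>,
}
```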
I think I'm fine with either approach, but I want us also to consider the internal usage of ClickHouse data for things like failure detection and placement decisions. The internal API will clearly be a different endpoint, but I'm not sure whether it should use a similar mechanism. There may be computation that needs to be done across some large chunks of data, but it's also possible we can write specific queries for this kind of data and return Rust types here as well. A placement engine can only use certain data to make decisions, so having a single optimized query to get "placement input" could work. I haven't thought a whole lot about this yet, but I just wanted to make sure the use case was visible.
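(To make the "placement input" idea concrete, a sketch under the assumption that one purpose-built internal query returns typed rows; every name here is invented.)

```rust
use uuid::Uuid;

/// Hypothetical per-sled row produced by a single optimized internal
/// query and consumed directly by a placement engine. It deliberately
/// exposes only the handful of values the engine can act on, rather
/// than raw timeseries.
#[derive(Debug)]
struct SledPlacementInput {
    sled_id: Uuid,
    /// Recent average CPU utilization in 0.0..=1.0.
    cpu_utilization: f64,
    /// Physical memory not yet reserved for instances, in bytes.
    free_memory_bytes: u64,
    /// Unallocated space across this sled's storage pools, in bytes.
    free_storage_bytes: u64,
}
```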
I think that would be supported by the "resource-first" approach. That's maybe a poor term; all I meant was that individual endpoints export some data, which they derive from whatever database queries they want. That query could in theory be anything. The main point is that the client doesn't necessarily get raw data from the database, at least not in any obvious way. They make a GET request to some endpoint, and that returns some chunk of data. The relationship between that response and the raw data in the database is hidden, so that we (1) aren't necessarily required to build a full, generic query language, and (2) aren't foisting all the work of generating useful information onto the client.
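(A minimal sketch of that shape, assuming a dropshot-style endpoint since that's what Nexus uses for HTTP; the path, type names, and fields are all hypothetical, and the handler body is stubbed out.)

```rust
use dropshot::{endpoint, HttpError, HttpResponseOk, Path, RequestContext};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Deserialize, JsonSchema)]
struct InstancePath {
    /// Instance identifier (a UUID in practice; a String here to keep
    /// the sketch self-contained).
    instance_id: String,
}

/// Derived metrics blob; how it maps onto ClickHouse rows is entirely
/// the server's business and never visible to the client.
#[derive(Serialize, JsonSchema)]
struct InstanceMetrics {
    median_disk_latency_ms: Option<f64>,
    bytes_sent: Option<u64>,
}

/// Hypothetical GET endpoint returning derived metrics for one instance.
#[endpoint {
    method = GET,
    path = "/instances/{instance_id}/metrics",
}]
async fn instance_metrics_get(
    _rqctx: RequestContext<()>,
    _path: Path<InstancePath>,
) -> Result<HttpResponseOk<InstanceMetrics>, HttpError> {
    // A real handler would run whatever ClickHouse queries it needs and
    // assemble the derived values; stubbed for illustration.
    Ok(HttpResponseOk(InstanceMetrics {
        median_disk_latency_ms: None,
        bytes_sent: None,
    }))
}
```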
As a client dev, I don't really know what I would do with "raw" data anyway. For example, if I asked for a big date range, I would want to be able to ensure I wasn't getting a billion data points.
Here's the end-user flow we'd like:
What already exists: