Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Observability improvements #1896

Open
ThomWright opened this issue Jun 6, 2022 · 3 comments
Open

Observability improvements #1896

ThomWright opened this issue Jun 6, 2022 · 3 comments

Comments

@ThomWright
Copy link

ThomWright commented Jun 6, 2022

Context

Running an sqlx application in production, there are several outputs I would find really useful to make my application more observable.

These could be three separate issues, but I'm going to file them all together for now for convenience.

I haven't put any thought into implementation or APIs. I'd first like to find out whether this is something y'all would consider including in sqlx. If so, I'm happy to do some thinking about APIs and internals.

I also haven't looked at what similar libraries do. I've previously wrapped pg in Node.js to achieve these same results, but it might be worth taking a wider look at the ecosystem for inspiration.

RED metrics for queries and transactions

RED (Rate, Errors, Duration) metrics would tell us how many queries/transactions we're executing per second, how long they take, and how many of them error.

Being able to create histograms could be useful, but maybe unnecessarily complex. Instead, the following might suffice:

  • Rate – a count of queries/transactions executed
  • Errors – a count of errors returned from queries/transactions
  • Duration – a count of [milliseconds|microseconds] spent executing queries

It could be useful to break these down by query somehow, but I haven't put any thought into this.

Traces for queries and transactions

This has been mentioned in #942, specifically here.

For those using tracing, support for tracing both transactions and queries would be great.

An example illustrating what I have in mind:

Tracing example

EDIT: I've seen it suggested that e.g. connecting and acquiring connections from pools could be spans too.

USE metrics for connection pools

Database connection pools can be a common point of saturation under load, affecting latency and sometimes availability. This is especially true with PostgreSQL (as opposed to e.g. MySQL), where the number of connections is fairly constrained. This makes it fairly important to be able to get some observability around pool resources.

Specifically, it would be really useful to expose USE (Utilisation, Saturation, Errors) metrics for sqlx::Pool.

I'm thinking something along the lines of:

  • Utilisation – a count of [milliseconds|microseconds] connections have spent active and in use, but not necessarily executing queries
  • Saturation – a count of [milliseconds|microseconds] queries have spent waiting for an available connection
  • Errors – a count of connection errors returned (e.g. PoolTimedOut)

The utilisation metric could be sampled currently by using max_connections, Pool::size() and Pool::num_idle(). This would tell us utilisation for a given instant. But this is less powerful than being able to compute average utilisation over time.

As an example using PromQL, we could calculate average per-minute utilisation using something like:

rate(sqlx_connection_active_used_seconds_total[1m]) / sqlx_max_connections

Further reading

@ThomWright
Copy link
Author

I've just found #1860 and #1900, which is great!

For anyone interested, there's some good discussion in there about implementing a connection acquisition wait metric (for pool saturation).

@DenuxPlays
Copy link

DenuxPlays commented Mar 20, 2024

Are they any alternatives?
I search for a way to add some query metrics (similiar to the described RED metrics).
Is there a "workaround" until this is implemented in SQLx?
I also use Sea-ORM but I haven't found anything their either.

@alper
Copy link

alper commented Apr 19, 2024

I don't get why this doesn't work:

std::env::set_var("RUST_LOG", "tower_http=debug,sqlx::query=warn");

Shouldn't it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants