OpenTelemetry Cardinality Errors and ResourceExhaustedException #2377
olavloite added a commit that referenced this issue on Sep 27, 2024:

The OpenTelemetry Attributes for metrics included a unique identifier for each connection. This can potentially create a very large number of time series, as each connection becomes its own time series. Applications that continuously create and drop connections will therefore produce a very large number of time series, which in turn can result in RESOURCE_EXHAUSTED errors being returned by the monitoring backend. Fixes #2377
olavloite added a commit that referenced this issue on Sep 30, 2024, with the same commit message.
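For context, the anti-pattern described in the commit message looks roughly like the sketch below. This is illustrative only, not PGAdapter's actual instrumentation code: the instrument name and meter name are hypothetical, and only the `pgadapter_connection_id` attribute key is taken from the issue. Each distinct attribute combination becomes its own time series, so a unique-per-connection value makes the series count grow without bound.

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.metrics.DoubleHistogram;
import io.opentelemetry.api.metrics.Meter;
import java.util.UUID;

public class ConnectionMetricsSketch {
  public static void main(String[] args) {
    Meter meter = GlobalOpenTelemetry.getMeter("pgadapter-sketch");
    DoubleHistogram roundtripLatencies =
        meter.histogramBuilder("roundtrip_latencies").setUnit("ms").build();

    // Anti-pattern: attaching a unique identifier to every recorded value.
    // Each distinct attribute set is a separate time series, so an
    // application that continuously opens and closes connections produces
    // an unbounded number of series, which the monitoring backend may
    // eventually reject with RESOURCE_EXHAUSTED.
    for (int i = 0; i < 3; i++) {
      String connectionId = UUID.randomUUID().toString(); // new series every time
      roundtripLatencies.record(
          12.5,
          Attributes.builder().put("pgadapter_connection_id", connectionId).build());
    }
  }
}
```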
Current Behavior
Receiving multiple rounds of the error messages below:
- RESOURCE_EXHAUSTED exceptions start emitting after 1 minute of uptime and tend to repeat every minute thereafter on each pod.
- For RESOURCE_EXHAUSTED exceptions within a single PGAdapter instance, pgadapter_connection_id tends to change over time.

Context (Environment)
Other Information
I poked around Metrics Explorer to see if there was anything out of the ordinary. When looking at both workload.googleapis.com/spanner/pgadapter/roundtrip_latencies and workload.googleapis.com/spanner/pgadapter/client_lib_latencies over the last 3 hours, then changing the aggregation to count the time series, it produced a value of 162,745, which seems like a lot of time series. I inspected another distribution-type metric, spanner.googleapis.com/transaction_stat/total/transaction_latencies, and it produced a value of 1. I'm not sure if the difference here is a problem, but it was interesting enough to mention.
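For anyone who configures the OpenTelemetry SDK themselves rather than relying on PGAdapter's built-in setup, one possible stop-gap until the fixed version is deployed is to register a metrics view that drops the per-connection attribute before export, so values that only differ in connection id are merged into a single series. A minimal sketch, assuming the OpenTelemetry Java SDK and a hypothetical instrument name ("roundtrip_latencies"):

```java
import io.opentelemetry.sdk.metrics.InstrumentSelector;
import io.opentelemetry.sdk.metrics.SdkMeterProvider;
import io.opentelemetry.sdk.metrics.View;

public class DropConnectionIdAttribute {
  static SdkMeterProvider buildMeterProvider() {
    return SdkMeterProvider.builder()
        .registerView(
            // Assumed instrument name, for illustration only.
            InstrumentSelector.builder().setName("roundtrip_latencies").build(),
            View.builder()
                // Drop the high-cardinality key; all other attribute keys
                // pass through unchanged.
                .setAttributeFilter(key -> !key.equals("pgadapter_connection_id"))
                .build())
        .build();
  }
}
```

This only helps if the SdkMeterProvider you build here is the one PGAdapter's metrics are actually registered with; the proper fix is the library-side change referenced above.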