During startupServiceStatusCheck, servers may be under heavy load reloading segments or catching up on Kafka ingestion. It's expected that their CPU and memory usage will be high, but they are not responding to queries at this time.
The default thread accountant will actually measure, log, and fire metrics every 10-30ms during that time. This makes it tough to monitor this feature because you can't tell if it's firing due to a query or a server being restarted.
I believe we should definitely move the initializeThreadAccountant call down. The best 2 options I see are:
right before preServeQueries
right after preServeQueries
We don't use preServeQueries yet, so I don't know what users intend. I imagine we'd want to do it before, for these reasons:
preServeQueries should likely have the same profile as the real queries you'll see
so if you want this feature to work on real queries, it should work on preServeQueries, too
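To make the "right before preServeQueries" option concrete, here's a rough sketch of the ordering I have in mind. This is not the actual BaseServerStarter code: the class, the event log, and the accountantRunning flag are made up for illustration, and only the four phase names come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the server starter; none of this is Pinot's real code.
final class StartupOrderSketch {
    private final List<String> events = new ArrayList<>();
    private boolean accountantRunning = false;

    // Stand-in for the startup check (segment reload, Kafka catch-up).
    private void startupServiceStatusCheck() {
        record("startupServiceStatusCheck");
    }

    // Stand-in for starting the thread accountant (sampling, logs, metrics).
    private void initializeThreadAccountant() {
        accountantRunning = true;
        record("initializeThreadAccountant");
    }

    private void preServeQueries() {
        record("preServeQueries");
    }

    private void serveQueries() {
        record("serveQueries");
    }

    // Logs each phase together with whether the accountant is active in it.
    private void record(String phase) {
        events.add(phase + (accountantRunning ? " [accountant on]" : " [accountant off]"));
    }

    // Proposed order: the accountant starts only after the startup check,
    // but before preServeQueries so warm-up queries are covered too.
    static List<String> run() {
        StartupOrderSketch s = new StartupOrderSketch();
        s.startupServiceStatusCheck();
        s.initializeThreadAccountant();
        s.preServeQueries();
        s.serveQueries();
        return s.events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With this order, the accountant is off for the whole startup check and on for both preServeQueries and real queries, so anything it logs or emits should be attributable to query work.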
The default thread accountant will actually measure, log, and fire metrics every 10-30ms during that time. This makes it tough to monitor this feature because you can't tell if it's firing due to a query or a server being restarted.
As an alternative, couldn't we emit metrics with an extra tag to indicate the phase we are in?
As an alternative, couldn't we emit metrics with an extra tag to indicate the phase we are in?
We could, but I feel like there are some big downsides:
Pinot's metrics framework is fairly messy for tags. It's not "pass in a list of tags". It's more like "create a new function called emitServerMeterWithPhaseTag". So I don't like adding tags that don't already exist
This feature logs and emits metrics every 10-30ms by default. At scale, this is actually quite expensive, since you're paying for those logs and metrics either in storage costs or to some observability vendor
It wastes CPU time while the server is still catching up and not actually serving queries. I don't see any need for "query killing" to be enabled before the server is even ready to serve queries
It seems the initializeThreadAccountant call is made in https://github.com/apache/pinot/blob/master/pinot-server/src/main/java/org/apache/pinot/server/starter/helix/BaseServerStarter.java#L673-L674. Most importantly, this is done before startupServiceStatusCheck.