Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ThreadAccountant for query killing is initialized too early #15100

Open
jadami10 opened this issue Feb 20, 2025 · 3 comments
Open

ThreadAccountant for query killing is initialized too early #15100

jadami10 opened this issue Feb 20, 2025 · 3 comments
Labels

Comments

@jadami10
Copy link
Contributor

It seems the initializeThreadAccountant is made in https://github.com/apache/pinot/blob/master/pinot-server/src/main/java/org/apache/pinot/server/starter/helix/BaseServerStarter.java#L673-L674. Most importantly this is done before startupServiceStatusCheck.

During startupServiceStatusCheck, servers may be under heavy load reloading segments or catching up on kafka ingestion. It's expected their CPU and memory usage will be high, but they are not responding to queries that this time.

The default thread account will actually measure, log, and fire metrics every 10-30ms in that time. This make it tough to monitor this feature because you can't tell if it's firing due to a query or a server being restarted.

I believe we should definitely move the initializeThreadAccountant call down. The best 2 options I see are:

  • right before preServeQueries
  • right after preServeQueries

We don't use preServeQueries yet, so I don't know what users intend. I imagine we'd want to do it before. The reason being

  • preServeQueries should likely be the same profile as real queries you'll see
  • so if you want this feature to work on real queries, it should work on preServeQueries, too
@Jackie-Jiang
Copy link
Contributor

@jasperjiaguo Can you help take a look?

@gortiz
Copy link
Contributor

gortiz commented Feb 21, 2025

The default thread account will actually measure, log, and fire metrics every 10-30ms in that time. This make it tough to monitor this feature because you can't tell if it's firing due to a query or a server being restarted.

As an alternative, couldn't we emit metrics with an extra tag to indicate the phase we are in?

@jadami10
Copy link
Contributor Author

As an alternative, couldn't we emit metrics with an extra tag to indicate the phase we are in?

We could, but I feel like there's some big downsides

  • Pinot's metrics framework is fairly messy for tags. It's not "pass in a list of tags". It's more like, "create a new function called emitServerMeterWtihPhaseTag". So I don't like adding to tags that don't exist
  • this feature logs and emits metrics every 10-30ms by default. At scale, this is actually quite expensive since you're paying for those logs and metrics either in storage costs or to some observability vendor
  • it wastes CPU time when the server is not actually serving queries/catching up. I don't see any need for "query killing" to be enabled before the server is even ready to serve queries

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants