[Multistage] Improve Observability for Diagnosing Performance of Multistage Engine Queries #15057

Open
satwik-pachigolla opened this issue Feb 13, 2025 · 1 comment
Assignees
Labels
multi-stage Related to the multi-stage query engine observability

Comments

@satwik-pachigolla

Observability for diagnosing the performance of multistage engine queries is lacking. On separate occasions, we observed brief latency spikes and timeouts for multistage engine queries. There are not enough useful logs or metrics to pinpoint retrospectively which instance(s) were taking longer.

Some specific gaps:

  • Query failures are not accompanied by metrics.
  • No logs or stats in the response metadata can be used to identify slow instances in any stage of query execution, and there is no way to correlate broker request IDs with the logs or stats.
  • Timeouts are difficult to diagnose without an approach such as increasing the timeout, rerunning, and then profiling the query, and even that does not enable retrospective debugging. All that is available retrospectively is a high volume of logs across many instances, such as:
Caught exception while processing query
[2025-02-01 10:33:42.386841] java.util.concurrent.TimeoutException: Timed out while offering data to mailbox

These logs do not provide any further useful information.
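As a stopgap for the log volume described above, here is a minimal sketch of how the timeout lines could be bucketed per instance and per minute after the fact. It assumes the timestamp layout of the line quoted above and that the logs have already been collected per instance; the function name, the instance names, and the input shape are hypothetical and not part of Pinot. It only shows where the timeouts were logged, which is not necessarily the instance that was slow, so it does not close the gap itself.

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp layout and message text are copied from the log line quoted in
# this issue. Everything else about the log layout is an assumption.
TIMEOUT_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\] "
    r"java\.util\.concurrent\.TimeoutException: "
    r"Timed out while offering data to mailbox"
)


def count_mailbox_timeouts(logs_by_instance):
    """Count mailbox-offer timeouts per (instance, minute).

    logs_by_instance maps a (hypothetical) instance name to an iterable of
    raw log lines collected from that instance.
    """
    counts = Counter()
    for instance, lines in logs_by_instance.items():
        for line in lines:
            match = TIMEOUT_LINE.match(line)
            if match is None:
                continue  # not a mailbox-offer timeout line
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            counts[(instance, ts.strftime("%Y-%m-%d %H:%M"))] += 1
    return counts


# Hypothetical sample input, for illustration only.
SAMPLE_LOGS = {
    "server-1": [
        "Caught exception while processing query",
        "[2025-02-01 10:33:42.386841] java.util.concurrent.TimeoutException: "
        "Timed out while offering data to mailbox",
    ],
    "server-2": ["Caught exception while processing query"],
}

if __name__ == "__main__":
    for (instance, minute), n in sorted(count_mailbox_timeouts(SAMPLE_LOGS).items()):
        print(instance, minute, n)
```

Even with such a script, nothing in these lines ties a timeout back to a broker request ID or a stage, which is the information that is actually missing.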

These observations were made on Pinot version 17332de, which is very slightly ahead of the 1.2 release. All of them appear to be unchanged in the latest version as of this posting.

@Jackie-Jiang added the observability and multi-stage (Related to the multi-stage query engine) labels on Feb 13, 2025
@Jackie-Jiang
Contributor

@gortiz Can you help take a look at this? Have you already made some improvements in the 1.3 release?


3 participants