Add observability metrics for CommandPartitionedTopicMetadata requests #18243

Open
1 of 2 tasks
lhotari opened this issue Oct 28, 2022 · 8 comments

Comments

@lhotari
Member

lhotari commented Oct 28, 2022

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently, there's no way to track CommandPartitionedTopicMetadata requests. There are no metrics or logs that indicate that a broker is handling CommandPartitionedTopicMetadata requests.

Misconfigured clients might flood brokers with CommandPartitionedTopicMetadata requests and cause high CPU consumption.

One example is a misconfiguration in splunk-otel-collector's Pulsar exporter: the example config sets pulsar-client-go's PartitionsAutoDiscoveryInterval setting to 1 nanosecond. I have sent a PR to fix the example config: signalfx/splunk-otel-collector#2185. This example shows how easy it is to mix up units and misconfigure a Pulsar client.

Solution

Add observability metrics for CommandPartitionedTopicMetadata requests, similar to the lookup request metrics added by #8272.
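
To make the idea concrete, here is a minimal sketch of the kind of per-request accounting this would need: a request count, a failure count and a mean latency. Everything in it is an assumption for illustration. The class and method names are hypothetical and are not part of Pulsar, and a real implementation would presumably register with the broker's existing metrics machinery rather than use bare counters.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch only: these names do not exist in Pulsar.
final class PartitionedMetadataRequestStats {
    // LongAdder keeps contention low when many connection threads record at once.
    private final LongAdder requests = new LongAdder();
    private final LongAdder failures = new LongAdder();
    private final LongAdder totalLatencyNanos = new LongAdder();

    // Record one completed request with its latency and outcome.
    void record(long latencyNanos, boolean success) {
        requests.increment();
        if (!success) {
            failures.increment();
        }
        totalLatencyNanos.add(latencyNanos);
    }

    long requestCount() {
        return requests.sum();
    }

    long failureCount() {
        return failures.sum();
    }

    // Mean latency over all recorded requests, or 0.0 if nothing was recorded.
    double meanLatencyMillis() {
        long n = requests.sum();
        return n == 0 ? 0.0 : (totalLatencyNanos.sum() / (double) n) / 1_000_000.0;
    }
}
```

With counters like these exposed, a client flooding the broker would show up as a sharp rise in the request count.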

Alternatives

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@tjiuming
Contributor

@lhotari Currently, we have metadata store metrics. If they meet your needs, I'd like to handle this issue.

@lhotari
Member Author

lhotari commented Oct 28, 2022

How are metadata store metrics used currently? I think it could be a breaking change if CommandPartitionedTopicMetadata requests are tracked as part of some other metric. I think it should be a new metric that is unique for CommandPartitionedTopicMetadata requests. @codelipenghui do you have a suggestion?

@codelipenghui
Contributor

The metadata store metrics are at the metadata store level and can provide the metadata store operation latency. The REST API request metrics should be a separate part. The CommandPartitionedTopicMetadata request metrics are not 100% equivalent to the metadata store operation metrics. For example, the Jetty thread might be blocked somewhere.

Maybe Jetty already provides the ability to expose metrics with a request path label?

@tjiuming
Contributor

@codelipenghui @lhotari There are two ways to get PartitionedTopicMetadata: one is ServerCnx#handlePartitionMetadataRequest(CommandPartitionedTopicMetadata partitionMetadata), the other is PersistentTopics#getPartitionedMetadata(Args ...).
If we need to add metrics for them, please assign the issue to me.

@tjiuming
Contributor

tjiuming commented Oct 31, 2022

I've checked Jetty, and it doesn't seem to have that ability.
Adding it wouldn't be easy, because we would need to converge the request paths into templates, such as: /api/v2/persistent/myTenant/myNamespace/partitioned -> /api/v2/persistent/{tenant}/{namespace}/partitioned. It may take some time.
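
As a rough illustration of what converging the request path could look like, here is a minimal sketch that maps the concrete path from the example above onto its template, so that a metric label stays low-cardinality. The class name, the single regular expression and the "other" fallback label are all assumptions made for this sketch. A real version would need one pattern per REST route, which is where the effort goes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only: covers just the one example path from this thread.
final class RequestPathTemplates {
    // Tenant and namespace are matched as single path segments (no '/').
    private static final Pattern PARTITIONED =
            Pattern.compile("^/api/v2/persistent/[^/]+/[^/]+/partitioned$");

    // Returns the path template to use as a metric label,
    // or "other" for any path this sketch does not recognize.
    static String normalize(String path) {
        Matcher m = PARTITIONED.matcher(path);
        if (m.matches()) {
            return "/api/v2/persistent/{tenant}/{namespace}/partitioned";
        }
        return "other";
    }
}
```

Falling back to a single "other" label keeps unknown or malformed paths from creating an unbounded number of label values.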

@tjiuming
Contributor

tjiuming commented Nov 1, 2022

@lhotari @codelipenghui PTAL #18281

@tjiuming
Contributor

tjiuming commented Nov 3, 2022

@github-actions

github-actions bot commented Dec 5, 2022

The issue had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Dec 5, 2022

3 participants