Add observability metrics for CommandPartitionedTopicMetadata requests #18243

lhotari · 2022-10-28T09:43:44Z

Search before asking

I searched in the issues and found nothing similar.

Motivation

Currently, there's no way to track CommandPartitionedTopicMetadata requests. There are no metrics or logs that indicate that a broker is handling CommandPartitionedTopicMetadata requests.

Misconfigured clients might flood brokers with CommandPartitionedTopicMetadata requests and cause high CPU consumption.

One example of this is a misconfiguration of splunk-otel-collector's Pulsar exporter. The example config sets pulsar-client-go's PartitionsAutoDiscoveryInterval setting to 1 nanosecond. I have sent a PR to fix the example config: signalfx/splunk-otel-collector#2185. This example shows that it's easy to mix up the units and misconfigure a Pulsar client.

Solution

Add observability metrics for CommandPartitionedTopicMetadata requests, similar to the lookup request metrics added by #8272.

Alternatives

No response

Anything else?

No response

Are you willing to submit a PR?

I'm willing to submit a PR!

tjiuming · 2022-10-28T13:29:50Z

Currently, we have metadata store metrics. If they meet your needs, I'd like to handle the issue.
@lhotari

lhotari · 2022-10-28T15:33:49Z

> currently, we have metadata store metrics, if it could meet your needs, I'd like to handle the issue.
> @lhotari

How are metadata store metrics used currently? I think it could be a breaking change if CommandPartitionedTopicMetadata requests are tracked as part of some other metric. I think it should be a new metric that is unique for CommandPartitionedTopicMetadata requests. @codelipenghui do you have a suggestion?

codelipenghui · 2022-10-31T02:58:03Z

> How are metadata store metrics used currently? I think it could be a breaking change if CommandPartitionedTopicMetadata requests are tracked as part of some other metric. I think it should be a new metric that is unique for CommandPartitionedTopicMetadata requests. @codelipenghui do you have a suggestion?

The metadata store metrics are at the metadata store level and provide the metadata store operation latency. The REST API request metrics should be a separate part. The CommandPartitionedTopicMetadata request metrics are not 100% equivalent to the metadata store operation metrics; for example, a Jetty thread might be blocked somewhere.

Maybe Jetty already provides a way to expose metrics with a request path label?

tjiuming · 2022-10-31T10:02:53Z

@codelipenghui @lhotari There are two ways to get PartitionedTopicMetadata: one is ServerCnx#handlePartitionMetadataRequest(CommandPartitionedTopicMetadata partitionMetadata), the other is PersistentTopics#getPartitionedMetadata(Args ...).
If we need to add metrics for them, please assign the issue to me.

tjiuming · 2022-10-31T11:57:12Z

> > How are metadata store metrics used currently? I think it could be a breaking change if CommandPartitionedTopicMetadata requests are tracked as part of some other metric. I think it should be a new metric that is unique for CommandPartitionedTopicMetadata requests. @codelipenghui do you have a suggestion?
>
> The metadata store metrics are on the metadata store level which can provide the metastore operation latency. The REST API request metrics should be a separate part. The CommandPartitionedTopicMetadata requests metrics should not 100% equal to the metadata store operation. Maybe the jetty thread is blocked somewhere.
>
> I think maybe jetty already provides the ability to expose the metrics with the request path label?

I've checked Jetty, and it seems there is no such ability.
Adding it would not be easy, because we would need to converge the request paths, for example: /api/v2/persistent/myTenant/myNamespace/partitioned -> /api/v2/persistent/{tenant}/{namespace}/partitioned. It may take some time.
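To illustrate what converging a request path could look like, here is a rough sketch. It is written in Go only for brevity; the broker is Java, so this is an illustration of the idea rather than broker or Jetty code. The function `normalizePath` and the fallback label `other` are hypothetical, and only the single route mentioned above is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath is a hypothetical helper that maps a concrete request
// path to a low-cardinality template suitable for a metric label.
// It only recognizes /api/v2/persistent/{tenant}/{namespace}/partitioned.
func normalizePath(path string) string {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) == 6 &&
		parts[0] == "api" && parts[1] == "v2" &&
		parts[2] == "persistent" && parts[5] == "partitioned" {
		// Replace the variable segments with placeholders.
		parts[3] = "{tenant}"
		parts[4] = "{namespace}"
		return "/" + strings.Join(parts, "/")
	}
	// Anything unrecognized shares one label so cardinality stays bounded.
	return "other"
}

func main() {
	fmt.Println(normalizePath("/api/v2/persistent/myTenant/myNamespace/partitioned"))
	// prints "/api/v2/persistent/{tenant}/{namespace}/partitioned"
}
```

A real implementation would need a template for every REST route, which is why this is more work than it first appears.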

tjiuming · 2022-11-01T11:21:20Z

@lhotari @codelipenghui PTAL #18281

tjiuming · 2022-11-03T10:29:33Z

The PIP discussion thread: https://lists.apache.org/thread/sybl4nno4503w42hzt7b5lsyk6m2rbo6

github-actions · 2022-12-05T02:01:08Z

The issue had no activity for 30 days, mark with Stale label.

lhotari added area/broker area/metrics labels Oct 28, 2022

sijie mentioned this issue Oct 28, 2022

ISSUE-18243: Add observability metrics for CommandPartitionedTopicMetadata requests streamnative/pulsar-archived#5053

Open

2 tasks

codelipenghui assigned tjiuming Oct 31, 2022

tjiuming mentioned this issue Nov 1, 2022

PIP-222: [feat][monitor] Add PartitionMetadataRequest metrics #18281

Closed

4 tasks

tjiuming mentioned this issue Nov 3, 2022

PIP-222: Add CommandPartitionedTopicMetadata metrics #18319

Open

sijie mentioned this issue Nov 3, 2022

ISSUE-18319: PIP-222: Add CommandPartitionedTopicMetadata metrics streamnative/pulsar-archived#5082

Open

github-actions bot added the Stale label Dec 5, 2022
