Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support of PrometheusRawMetricsProvider for the Pulsar-Proxy #14681

Merged
merged 3 commits into from
Mar 16, 2022

Conversation

cbornet
Copy link
Contributor

@cbornet cbornet commented Mar 14, 2022

Motivation

PR #9021 added support for PrometheusRawMetricsProvider in the broker so that plugins can add their metrics to the Prometheus /metrics endpoint.
This change does the same for the Pulsar Proxy. Proxy plugins (additional servlets, extensions, ...) can then add their Prometheus providers with ProxyService::addPrometheusRawMetricsProvider.
This helps having common metrics code for servlets and handlers/extensions when they can be deployed both in the broker and the proxy.

Modifications

  • PrometheusRawMetricsProvider moved from pulsar-broker to pulsar-broker-common under the same package name so the change is non-breaking.
  • Make the PrometheusMetricsServlet non-specific to the broker and have another PulsarPrometheusMetricsServlet that extends it for the broker.
  • Extract PrometheusMetricsGenerator methods used by both the broker and the proxy to a PrometheusMetricsGeneratorUtils class.
  • Add a ProxyService::addPrometheusRawMetricsProvider method for use by plugins.
  • Make the proxy use PrometheusMetricsServlet instead of the classical MetricsServlet and add registered PrometheusRawMetricsProvider to it on WebServer startup. PrometheusMetricsServlet is aware of the Prometheus CollectorRegistry so it's backward compatible for existing plugins that are registering standard Prometheus metrics.
  • Add a test for the Proxy Prometheus /metrics endpoint.

Verifying this change

  • Make sure that the change passes the CI checks.

This change is already covered by existing tests, such as org.apache.pulsar.broker.stats.PrometheusMetricsTest.
This change added tests and can be verified by running org.apache.pulsar.proxy.server.ProxyPrometheusMetricsTest

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no) no
  • The public API: (yes / no) no
  • The schema: (yes / no / don't know) no
  • The default values of configurations: (yes / no) no
  • The wire protocol: (yes / no) no
  • The rest endpoints: (yes / no) no
  • The admin cli options: (yes / no) no
  • Anything that affects deployment: (yes / no / don't know) no

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    This should be first documented for the broker.

  • doc

    (If this PR contains doc changes)

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work !

IIUC this change is totally backward compatible

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good work @cbornet

Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good.

I think we need to address metrics providers that are added late to ensure they are added to the metricsServlet.

Comment on lines 421 to 423
public void addPrometheusRawMetricsProvider(PrometheusRawMetricsProvider metricsProvider) {
metricsProviders.add(metricsProvider);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this method could be called from other threads in protocol handlers, I think we should consider using a concurrent data structure to back the metricsProviders list. Technically, it appears that the current implementation in the broker does not use a concurrent data structure, so I could be wrong about this class's thread safety requirements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codelipenghui as you contributed the code for the broker, can you say if we should use concurrent structures for PulsarService's pendingMetricsProviders ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, it was only considered when loading plugins when starting the broker, which will not be modified after the broker started. To avoid any potential thread-safety issues(maybe some users want to modify after the broker started), using concurrent structures looks good to me.

@michaeljmarshall michaeljmarshall added this to the 2.11.0 milestone Mar 14, 2022
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Mar 14, 2022
Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @michaeljmarshall 's question, otherwise LGTM to me

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just left a minor comment.

Comment on lines +41 to +42
private static final int HTTP_STATUS_OK_200 = 200;
private static final int HTTP_STATUS_INTERNAL_SERVER_ERROR_500 = 500;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for introducing new http status code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codelipenghui
because we are not importing org.eclipse.jetty.http.HttpStatus anymore
500 is used below

if (metricsServlet == null) {
if (pendingMetricsProviders == null) {
pendingMetricsProviders = new LinkedList<>();
synchronized (this) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about moving 'synchronized' to the signature of the method ?

if (pendingMetricsProviders != null) {
pendingMetricsProviders.forEach(provider -> metricsServlet.addRawMetricsProvider(provider));
this.pendingMetricsProviders = null;
synchronized (this) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about creating a method like this:

private synchronised void initMetricsServlet() {
      this.metricsServlet = new PrometheusMetricsServlet(-1L, proxyConfig.getClusterName());
            if (pendingMetricsProviders != null) {
                pendingMetricsProviders.forEach(provider -> metricsServlet.addRawMetricsProvider(provider));
                this.pendingMetricsProviders = null;
            }
}

@@ -347,7 +349,9 @@ public void close() throws IOException {
proxyAdditionalServlets = null;
}

metricsServlet = null;
synchronized (this) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about using a synchronised method here instead of using "synchronized (this)" ?
probably it is a matter of taste, but I think that "synchronized (this)" is quite hard to read

if (metricsServlet == null) {
if (pendingMetricsProviders == null) {
pendingMetricsProviders = new LinkedList<>();
synchronized (this) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about moving 'synchronized' to the signature of the method ?

pendingMetricsProviders.forEach(provider -> metricsServlet.addRawMetricsProvider(provider));
this.pendingMetricsProviders = null;

synchronized (this) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about moving this code to a dedicated method ?

private synchronised void initMetricsServlet() {
..
}

@@ -356,6 +374,10 @@ public void close() throws IOException {
}
}

private synchronized void resetMetricsServlet() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getMetricsServlet should be synchronized as well (spotbugs should not pass)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

this.pendingMetricsProviders = null;
}

createMetricsServlet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

below we have a non synchronised access to metricsServlet.
I suggest to use a synchronised getter and use it while calling addWebServerHandlers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be only one thread here.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@michaeljmarshall please take another look

Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for addressing my comments @cbornet!

@codelipenghui codelipenghui merged commit ea95b28 into apache:master Mar 16, 2022
@cbornet cbornet deleted the proxy-metrics branch March 16, 2022 15:03
aparajita89 pushed a commit to aparajita89/pulsar that referenced this pull request Mar 21, 2022
Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants