Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to activate real-time metrics to OpenCensus #5624

Open
gertvdijk opened this issue Apr 23, 2019 · 15 comments
Open

Unable to activate real-time metrics to OpenCensus #5624

gertvdijk opened this issue Apr 23, 2019 · 15 comments
Assignees
Milestone

Comments

@gertvdijk
Copy link

gertvdijk commented Apr 23, 2019

#5099 implemented a way to record real-time metrics. Cool! Now I'd like to activate it, as it's disabled by default. I'm not successful in enabling this, however.

Tried with a NettyServerBuilder for example, but the setStatsRecordRealTimeMetrics method is protected.

I don't see any other way to set the private attribute recordRealTimeMetrics in the AbstractServerImplBuilder class from the sources.

The work-around I'm using now is ugly in bypassing the protected attribute (Kotlin, but you get my point).

val serverBuilder = NettyServerBuilder.forPort(1234)
val setStatsRecordRealTimeMetricsMethod = NettyServerBuilder::class.java
    .getDeclaredMethod("setStatsRecordRealTimeMetrics", Boolean::class.java)
setStatsRecordRealTimeMetricsMethod.setAccessible(true)
setStatsRecordRealTimeMetricsMethod.invoke(serverBuilder, true)

after which it starts to work, except from the method name not being exported (but that's #5593).

What version of gRPC are you using?

1.20.0

What did you expect to see?

A public function on the Builder to call setStatsRecordRealTimeMetrics(true).

@zhangkun83
Copy link
Contributor

We don't expose the public API to control Census because it'd be subject to change soon as we expect to move Census out of core (#5076). We will need to do that first.

@zhangkun83 zhangkun83 added this to the Next milestone Apr 25, 2019
@mattnworb
Copy link

with the issue linked above complete (there is now a grpc-census), is there any way to enable recordRealTimeMetrics in a ServerBuilder?

@gertvdijk
Copy link
Author

gertvdijk commented May 6, 2020

I had a quick look at the current master version 0057c4f of grpc-java, but all seem unchanged as of now and the point in my initial report is still valid.

I'd be happy for someone to tell me if there's a better/proper way. 😄

public final class NettyServerBuilder extends AbstractServerImplBuilder<NettyServerBuilder> {

@Override
protected void setStatsRecordRealTimeMetrics(boolean value) {
super.setStatsRecordRealTimeMetrics(value);
}

@zhangkun83
Copy link
Contributor

Since @voidzcy has moved census integration code into its own package, we should be able to expose such switch from there.

@voidzcy
Copy link
Contributor

voidzcy commented May 7, 2020

Yeah, this was the thing that we were trying to discuss when #6577 was done.

I am thinking we could expose these knobs in grpc-census through global configurations (e.g., CensusStats.recordServerRealTimeMetrics) and these would be public APIs in grpc-census. Then we remove existing hidden methods in channel/server builders.

Normal Census users (with grpc-census in runtime classpath) would not be impacted as we set default values to what we are currently using. Advanced Census users would include grpc-census in compile classpath and use APIs to configure Census knobs.

The drawback is we are not able to configure each channel/server individually as the configurations are global. Making existing methods in grpc-core public also looks OK, but the API usage depending on runtime dependencies is a concern.

/cc @ejona86

@ejona86
Copy link
Member

ejona86 commented May 7, 2020

I prefer the idea of having a global like CensusStats.recordServerRealTimeMetrics instead of exposing the current InternalNettyChannelBuilder methods as public. But I'm not wild about continuing down this global API configuration road, without a plan for moving away from it.

It also seems a bit weird to expose new APIs for OpenCensus when "OpenCensus is dead! Long live OpenTelemetry!"

Unfortunately it seems we'll need to keep the InternalNettyChannelBuilder method in-place, because their current usage is per-channel from a library, which is not compatible with "change a global." So we'd do an OR between the two options to decide whether to enable real time metrics.

Another option is to expose an Interceptor from grpc-census that can be configured to your hearts content, and when you use the interceptor it disables the built-in support. This has the problems 1) it is unclear if users will be able to easily add the interceptor to all the channels they need to and 2) how it will interacts with other features. For (2), retries and binary logging come to mind. Since retries+stats is undefined, it is a bit of a wildcard, but I think is may be okay independent of which solution we use. Tracing+binary logging is probably a bigger concern. Interestingly binary logging is in core in large part because Census was, so maybe we'll move it out to solve any problems.

For (1) there has been some discussion about "global interceptors." The idea here would probably look like a global where you set the ClientInterceptors, but can only be set once (presumably from within main()).

I don't mind doing both CensusStats.recordServerRealTimeMetrics and exposing an interceptor, but I am worried about moving forward without a plan given moving forward can make things even more entrenched and harder to develop a plan.

@mattnworb
Copy link

mattnworb commented May 8, 2020

This is just the perspective of an outsider (a user, not a contributor) but it seems like some of the complexity in the question of how to configure the CensusStatsModule has to do with the fact that it is added as an interceptor in AbstractServerImplBuilder and AbstractManagedChannelImplBuilder automatically if the class is found on the classpath, even though it sounds like there is a desire to move the configuration out of the core ServerBuilder and ChannelBuilder.

As a user of the code it would feel pretty natural to me to configure whether or not realtime stats were enabled via a method on CensusStatsModule or the relevant interceptor if I was adding that interceptor to the Server/Channel myself, as opposed to automatic registration. If there is a desire to move this more and more out of the core of grpc-java then it feels reasonable to me to have to add these interceptors explicitly, although that also seems tricky if a lot of people using these today expect it is done automatically.

If adding these interceptors continues to be automatic (and even somewhat hidden) in AbstractServerImplBuilder and AbstractManagedChannelImplBuilder then it would feel natural to configure the behavior through setters on those builders, since like pointed out above having a global configuration wouldn't work all that well if I have multiple Channels in the same process and want a different configuration for each.

@voidzcy
Copy link
Contributor

voidzcy commented May 8, 2020

We did considered letting users add interceptors by themselves (see #5510 (comment)). But eventually we chose to go with the dynamic loading approach to minimize breakage (it only requires adding another runtime dependency without code change). Requiring Census users to add interceptors explicitly is not that feasible internally in Google given how dependencies are managed.

Yes, as I mentioned in #5624 (comment), the drawback of using globals to configure Census is it can only be applied to all Channels/Servers. As in #5624 (comment), we have concerns for all approaches discussed here. We would need to flesh out a plan doing this. Please stay tuned and any suggestions are welcome.

@ejona86
Copy link
Member

ejona86 commented May 11, 2020

Requiring Census users to add interceptors explicitly is not that feasible internally in Google given how dependencies are managed.

We can probably manage fine in Google. At least for all the common cases I'm aware of, there are places we can add the interceptor. My concern is mostly for "all the other users."

@ejona86
Copy link
Member

ejona86 commented May 14, 2020

We discussed it further, and this is the approach we're entertaining most strongly:

  1. Add a CensusInterceptor in io.grpc.census. This can have whatever configuration is necessary when being created
  2. Add a disableImplicitCensus() somewhere in io.grpc.census. If you don't call this, then the two census interceptors would both be used. It must not be used from library code. We'd probably want a getter for code using CensusInterceptor directly to react
  3. (Maybe) Have a setting on the Channel and Server to disable the implicit census interceptor

@gertvdijk, @mattnworb, would that work for you?

@voidzcy
Copy link
Contributor

voidzcy commented May 14, 2020

Add a disableImplicitCensus() somewhere in io.grpc.census. If you don't call this, then the two census interceptors would both be used.

This might be problematic. I would rather throw an error in the CensusInterceptor if found the default one is not disabled.

@ejona86
Copy link
Member

ejona86 commented May 14, 2020

I could be fine with requiring the global one to be disabled when creating a CensusInterceptor. That would mean there would be no point in a Channel/Server option. The main concern I see is it means libraries wouldn't be able to use CensusInterceptor unless they shaded gRPC.

@voidzcy
Copy link
Contributor

voidzcy commented May 14, 2020

You are right. We are kind of in the dilemma between trying to keeping grpc-core away from census and having APIs to control the default census interceptors in grpc-core. After a more throughout consideration, io.grpc.census.Census.disableImplicitCensus() doesn't seem to work well. It controls the global behavior, which as you said may not be feasible for libraries. We probably may still want to go with the CallOption approach as that change seems more "internal" than having a setting on the Channel and Server.

@mattnworb
Copy link

We discussed it further, and this is the approach we're entertaining most strongly:

  1. Add a CensusInterceptor in io.grpc.census. This can have whatever configuration is necessary when being created
  2. Add a disableImplicitCensus() somewhere in io.grpc.census. If you don't call this, then the two census interceptors would both be used. It must not be used from library code. We'd probably want a getter for code using CensusInterceptor directly to react
  3. (Maybe) Have a setting on the Channel and Server to disable the implicit census interceptor

@gertvdijk, @mattnworb, would that work for you?

Yes it would work for me 👍

Could you elaborate though on "disableImplicitCensus()... It must not be used from library code"? Is this referring to library code (for example, something to set up a ChannelBuilder) should not disable the implicit interceptors since it will not know if the application adds or does not add the non-implicit ones?

@ejona86
Copy link
Member

ejona86 commented May 15, 2020

Is this referring to library code (for example, something to set up a ChannelBuilder) should not disable the implicit interceptors since it will not know if the application adds or does not add the non-implicit ones?

Essentially, yes.

Since it is a global, it is really only "safe" for an application (e.g., the thing implementing main()) to change, since only the application has a full view of the expected configuration. Let's say acme-super-fake-db is a library and used disableImplicitCensus(); a user that had a working Census setup would be quite peeved if adding a dependency on acme-super-fake-db disabled their Census stats.

@voidzcy voidzcy modified the milestones: 1.34, 1.35 Nov 18, 2020
@ericgribkoff ericgribkoff modified the milestones: 1.35, 1.36 Jan 7, 2021
@ejona86 ejona86 modified the milestones: 1.36, Next Feb 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants