Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NameResolverRegistry with multiple registrations no longer working as of gRPC 1.60+ #11055

Open
clairemcginty opened this issue Mar 28, 2024 · 10 comments
Milestone

Comments

@clairemcginty
Copy link

As of #10590, the behavior of ManagedChannelImplBuilder#nameResolverFactory has changed -- it now wraps the provided resolverFactory into an NameResolverFactoryToProviderFacade that returns the first registered NameResolverProvider's scheme as its defaultScheme(). As a result, lookups of any other registered schemes fail.

Here's a code sample of something that used to work pre-1.60.0 (apologies for the Scala code):

case class Resolver(scheme: String) extends NameResolverProvider {
  override def isAvailable: Boolean = true
  override def priority(): Int = 5
  override def newNameResolver(targetUri: URI, args: NameResolver.Args): NameResolver = new NameResolver {
    override def getServiceAuthority: String = "foo"
    override def shutdown(): Unit = {}
  }
  override def getDefaultScheme: String = scheme
}

val registry = new NameResolverRegistry()
registry.register(Resolver("s1"))
registry.register(Resolver("s2"))
registry.register(Resolver("s3"))

NettyChannelBuilder
  .forTarget("s2://test")
  .usePlaintext
  .nameResolverFactory(registry.asFactory())
  .build()

After upgrading to GRPC 1.60, .build() throws the following exception:

java.lang.IllegalArgumentException: Could not find a NameResolverProvider for s2://test
	at io.grpc.internal.ManagedChannelImpl.getNameResolver(ManagedChannelImpl.java:741)
	at io.grpc.internal.ManagedChannelImpl.getNameResolver(ManagedChannelImpl.java:771)
	at io.grpc.internal.ManagedChannelImpl.<init>(ManagedChannelImpl.java:635)
	at io.grpc.internal.ManagedChannelImplBuilder.build(ManagedChannelImplBuilder.java:662)
	at io.grpc.ForwardingChannelBuilder2.build(ForwardingChannelBuilder2.java:254)

Setting an IDE breakpoint in ManagedChannelImpl#getNameResolver shows:

Screenshot 2024-03-28 at 2 54 17 PM

So what's happening is that there's technically no longer a provider for s2, because the singular wrapper class NameResolverFactoryToProviderFacade has a registered scheme of s1.

Is there anything I can do to work around this? thanks!

@ejona86
Copy link
Member

ejona86 commented Mar 28, 2024

We were mainly trying to preserve compatibility with NameResolver.Factory while propagating NameResolverRegistry deeper through the code. registry.asFactory() existed for gRPC to convert, but admit I wasn't expecting others to use it like that. (That's very similar to what gRPC does, but because we were supporting the old way.)

Can you explain why you aren't registering in the global registry? Could you also explain why you need the multiple NameResolverProviders since this builder is for only a single target string? (I know of multiple reasons for these answers, but am interested in your situation.)

I don't think there's a workaround, other than registering in the global registry. To restore the behavior we'd either need to expose some goo to convert NameResolverRegistry's factory back to a registry, or we'd need to add a builder.nameResolverRegistry() method for you to use instead of providing a factory.

Ref: #7133

@clairemcginty
Copy link
Author

clairemcginty commented Mar 28, 2024

We were mainly trying to preserve compatibility with NameResolver.Factory while propagating NameResolverRegistry deeper through the code. registry.asFactory() existed for gRPC to convert, but admit I wasn't expecting others to use it like that. (That's very similar to what gRPC does, but because we were supporting the old way.)

Can you explain why you aren't registering in the global registry? Could you also explain why you need the multiple NameResolverProviders since this builder is for only a single target string? (I know of multiple reasons for these answers, but am interested in your situation.)

I don't think there's a workaround, other than registering in the global registry. To restore the behavior we'd either need to expose some goo to convert NameResolverRegistry's factory back to a registry, or we'd need to add a builder.nameResolverRegistry() method for you to use instead of providing a factory.

Ref: #7133

Thanks for the quick response @ejona86! Basically, the use case here is for a library function (thus the need to support multiple possible target URI schemes) exported for use in map-reduce jobs. The code here would be executed in relatively isolated container environments, which are difficult to inject startup/shutdown hooks into for the global registry route (unless I'm misunderstanding how that works - I couldn't find too many examples of custom NameResolvers.)

@ejona86
Copy link
Member

ejona86 commented Mar 29, 2024

Quick reply only partly responding:
If your resolver doesn't need injected configuration, then you just create a META-INF/services/ resource with the class name inside. For example, ours for DNS. gRPC looks for those when loading using ServiceLoader

@clairemcginty
Copy link
Author

Quick reply only partly responding: If your resolver doesn't need injected configuration, then you just create a META-INF/services/ resource with the class name inside. For example, ours for DNS. gRPC looks for those when loading using ServiceLoader

Thanks @ejona86 ! Unfortunately some of our resolvers do need extra config. As a workaround, the best thing I can think of for right now is a targetUri-aware NameResolverProvider, something like:

class TargetAwareNameResolverProvider extends NameResolverProvider {
   private final String targetScheme;
   private final NameResolver nameResolver;

   TargetAwareNameResolverProvider(URI targetUri, NameResolver.Args args) {
      this.targetScheme = targetUri.getScheme();
      switch (this.targetScheme) {
         case "s1":
            this.nameResolver = Resolver("s1")
            break;
         // etc
      }
   }

   @Override
   public NameResolver newNameResolver(URI targetUri, NameResolver.Args args) {
     // assert targetUri scheme matches expected...
     return this.nameResolver;
   }
 
   @Override
    public String getDefaultScheme() {
      return this.targetScheme;
    } 
}

Also - do you think it makes sense to add validation to .nameResolverFactory(...) to check that the supplied factory only provides a single resolver?

@ejona86
Copy link
Member

ejona86 commented Mar 29, 2024

How is the library function you are exposing used? Something like (?):

builder = ManagedChannelBuilder.forTarget("foo://");
yourLibrary.configure(builder);
builder.build();

How much of a problem would it be if we solved this by adding a new method builder.nameResolverRegistry()? Since that method wouldn't exist on older versions of gRPC.

Also - do you think it makes sense to add validation to .nameResolverFactory(...) to check that the supplied factory only provides a single resolver?

There's no way to check via the Factory API.

@clairemcginty
Copy link
Author

How is the library function you are exposing used?

It's pretty opinionated -- it accepts a targetUri and some other params (retry, serviceConfig etc) and constructs the Channel opaquely to the user.

How much of a problem would it be if we solved this by adding a new method builder.nameResolverRegistry()? Since that method wouldn't exist on older versions of gRPC.

So it would directly set nameResolverRegistry to the provided value rather than wrapping it in a NameResolverFactoryToProviderFacade? Yes, that should solve it! We're pretty strict on compatible GRPC versions, so it being an added method shouldn't be an issue.

@ejona86
Copy link
Member

ejona86 commented Mar 29, 2024

It's pretty opinionated -- it accepts a targetUri and some other params (retry, serviceConfig etc) and constructs the Channel opaquely to the user.

targetUri is passed in, so you could know which name resolver should be used. And you are in control of channel construction, which gives options.

Unfortunately some of our resolvers do need extra config.

Is that state stable for the life of the process? And is the config simply data; nothing that needs to be cleaned up/shut down? Where I'm going with this: Could you add the resolvers to the registry on first use of your library function?


I'm simultaneously trying to figure out the best workaround and long-term approach.

One thing I have been considering is passing NameResolverRegistry when creating the channel builder. The problem with the current nameResolverFactory is it changes the semantics of the target string in the middle of the building process. For things like OpenTelemetry, we think people may need configuration based on the target string but it is hard to implement if the target string can change in the middle of building.

@clairemcginty
Copy link
Author

It's pretty opinionated -- it accepts a targetUri and some other params (retry, serviceConfig etc) and constructs the Channel opaquely to the user.

targetUri is passed in, so you could know which name resolver should be used. And you are in control of channel construction, which gives options.

Yeah! so I could do something like the snippet above, which isn't too much trouble, it just adds an extra layer of URI/scheme parsing on top of what the channel lookup is already doing.

Is that state stable for the life of the process? And is the config simply data; nothing that needs to be cleaned up/shut down? Where I'm going with this: Could you add the resolvers to the registry on first use of your library function?

Yes, it's simply data! would you mind expanding a bit on how this would work? i.e. would it be done by the library user?

@ejona86
Copy link
Member

ejona86 commented Mar 29, 2024

would you mind expanding a bit on how this would work?

This only works if you are just injecting data (yay!) and if the configuration is stable for the life of the process (dunno). Basically, it would probably work if "you always use the same configuration in the registry" today.

class YourLibraryClient {
  private static final Object lock = new Object();
  private static boolean inited;

  public YourLibraryClient create(String target, String serviceConfig) {
    synchronized (lock) {
      if (!inited) {
        inited = true;
        NameResolverRegistry.getDefault().register(Resolver("s1"));
        NameResolverRegistry.getDefault().register(Resolver("s2"));
        NameResolverRegistry.getDefault().register(Resolver("s3"));
      }
    }
    // your existing code
  }

@temawi temawi added the Waiting on reporter there was a request for more information without a response or answer or advice has been provided label Apr 10, 2024
@sergiitk sergiitk added api-review and removed Waiting on reporter there was a request for more information without a response or answer or advice has been provided labels Apr 25, 2024
@larry-safran larry-safran added this to the Next milestone May 16, 2024
@ejona86
Copy link
Member

ejona86 commented May 16, 2024

API review meeting:

  • Introduce ability to pass NameResolverRegistry to the channel builder when being built (Grpc.newManagedChannel(String, ChannelCredentials, NameResolverRegistry))
  • That would also be a solution to many cases that are using channelBuilder.nameResolverFactory()
  • May want to delay this until after we split "default scheme" from provider selection. This hasn't really been discussed anywhere, but we should allow setting a default prefix (e.g., "dns://1.1.1.1/" or "xds://my.control.plane/"), and that seems it should be a separate decision than the priority order for a particular scheme. (The application author should be in control of the setting, not the name resolver library authors.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants