@easwars easwars commented Dec 4, 2025

How did the API look before this change?

  • We had a single interface, Filter, that was responsible for functionality like config parsing and optionally for building client and server interceptors (or filter instances).
  • We had a registry of Filters keyed by the supported type_urls.

How does the API look after this change?

  • We will have an interface, FilterProvider, that contains functionality for config parsing and other things like "what are the supported type_urls?", "Are filters produced by this provider supposed to be terminal?", "Are filters produced by this provider supposed to work on the client/server?" etc.
  • The FilterProvider will also contain functionality to create filter instances represented by a new interface, Filter.
  • The Filter interface contains functionality to build client and server interceptors. If this filter is not supported on either the client or the server, that functionality can be a no-op.
  • The Filter will also contain a new Close method allowing it to free up any resources allocated.
  • We will have a registry of FilterProviders keyed by the supported type_urls.
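
A minimal sketch of the shape described above, not the code in this PR. `FilterConfig`, `ClientInterceptor` and `ServerInterceptor` are placeholder types standing in for the real internal ones, `demoProvider`, `demoFilter` and the `type.example/...` URLs are hypothetical, and config parsing is left out to keep the sketch free of proto dependencies.

```go
package main

import "fmt"

// Placeholder types standing in for the real internal ones (assumptions).
type FilterConfig interface{}
type ClientInterceptor interface{}
type ServerInterceptor interface{}

// FilterProvider describes a filter type and creates Filter instances.
type FilterProvider interface {
	TypeURLs() []string
	IsTerminal() bool
	IsClient() bool
	IsServer() bool
	Build(name string) Filter
}

// Filter builds interceptors and owns any state shared between them.
type Filter interface {
	BuildClientInterceptor(config, override FilterConfig) (ClientInterceptor, error)
	BuildServerInterceptor(config, override FilterConfig) (ServerInterceptor, error)
	Close()
}

// registry maps each supported type_url to its provider.
var registry = map[string]FilterProvider{}

// Register adds the provider under every type_url it supports.
func Register(p FilterProvider) {
	for _, u := range p.TypeURLs() {
		registry[u] = p
	}
}

// Get returns the provider for typeURL, or nil if none is registered.
func Get(typeURL string) FilterProvider { return registry[typeURL] }

// demoProvider is a hypothetical client-only provider.
type demoProvider struct{}

func (demoProvider) TypeURLs() []string {
	return []string{"type.example/demo.v1", "type.example/demo.v2"}
}
func (demoProvider) IsTerminal() bool         { return false }
func (demoProvider) IsClient() bool           { return true }
func (demoProvider) IsServer() bool           { return false }
func (demoProvider) Build(name string) Filter { return &demoFilter{name: name} }

type demoFilter struct {
	name   string
	closed bool
}

func (f *demoFilter) BuildClientInterceptor(_, _ FilterConfig) (ClientInterceptor, error) {
	return struct{}{}, nil
}

// The server side is unsupported, so this is a no-op.
func (f *demoFilter) BuildServerInterceptor(_, _ FilterConfig) (ServerInterceptor, error) {
	return nil, nil
}

func (f *demoFilter) Close() { f.closed = true }

func main() {
	Register(demoProvider{})
	f := Get("type.example/demo.v2").Build("my-filter")
	ci, err := f.BuildClientInterceptor(nil, nil)
	fmt.Println(ci != nil, err == nil)
	f.Close()
}
```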

Why is this change required?

  • As part of gRFC A83 and several other upcoming gRFCs, we will start having HTTP filters that need to maintain state, and this state needs to be retained across resource updates. For example, the filter might contain a gRPC channel to an external service, and we don't want to recreate this channel for resource updates that don't change any properties associated with the service being connected to.
    • State retention itself is not part of this PR and will be added when the first filter that needs it is implemented.

How will this be used?

  • The xDS name resolver and the xDS-enabled gRPC server (the entities that create HTTP filters on the client and server side) will create new Filter instances only when the filter names in the xDS resources change. Otherwise, they will only create new interceptors from the existing Filter, using the updated filter configuration. This will allow Filter instances to share state across interceptors and across state updates.
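
A rough sketch of how that reuse could work, assuming the resolver or server keeps a map from filter name to instance. `filterSet`, `countingFilter` and the trimmed-down `Filter` interface are hypothetical and only illustrate the bookkeeping.

```go
package main

import "fmt"

// Filter is a trimmed-down, hypothetical stand-in for the Filter interface.
type Filter interface {
	Close()
}

type countingFilter struct {
	name   string
	closed bool
}

func (f *countingFilter) Close() { f.closed = true }

// filterSet tracks the Filter instances owned by one resolver or server.
type filterSet struct {
	build   func(name string) Filter // stands in for FilterProvider.Build
	filters map[string]Filter        // keyed by filter name
}

// update reconciles the set with the filter names in the latest resource.
// Existing names keep their instance, new names get a fresh one, and
// names that disappeared are closed and dropped.
func (s *filterSet) update(names []string) {
	next := make(map[string]Filter, len(names))
	for _, n := range names {
		if f, ok := s.filters[n]; ok {
			next[n] = f
			continue
		}
		next[n] = s.build(n)
	}
	for n, f := range s.filters {
		if _, ok := next[n]; !ok {
			f.Close()
		}
	}
	s.filters = next
}

func main() {
	s := &filterSet{build: func(n string) Filter { return &countingFilter{name: n} }}
	s.update([]string{"authn", "router"})
	authn := s.filters["authn"]
	s.update([]string{"authn", "fault"})
	fmt.Println(s.filters["authn"] == authn) // the same instance is reused
}
```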

The changes here are inspired by similar changes made for Java and described here: https://github.com/grpc/proposal/blob/master/A83-xds-gcp-authn-filter.md#java

RELEASE NOTES: None

@easwars easwars requested a review from dfawley December 4, 2025 23:55
@easwars easwars added the Type: Feature and Area: xDS labels Dec 4, 2025
@easwars easwars added this to the 1.78 Release milestone Dec 4, 2025
codecov bot commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 22.72727% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.29%. Comparing base (647162c) to head (f879143).
⚠️ Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
internal/xds/httpfilter/router/router.go 25.00% 4 Missing and 2 partials ⚠️
internal/xds/httpfilter/fault/fault.go 16.66% 3 Missing and 2 partials ⚠️
internal/xds/httpfilter/rbac/rbac.go 16.66% 3 Missing and 2 partials ⚠️
internal/xds/resolver/serviceconfig.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8745      +/-   ##
==========================================
- Coverage   83.38%   83.29%   -0.09%     
==========================================
  Files         418      419       +1     
  Lines       32367    32595     +228     
==========================================
+ Hits        26988    27149     +161     
- Misses       4014     4051      +37     
- Partials     1365     1395      +30     
Files with missing lines Coverage Δ
internal/xds/httpfilter/httpfilter.go 100.00% <ø> (ø)
internal/xds/xdsclient/xdsresource/filter_chain.go 93.92% <100.00%> (-0.56%) ⬇️
internal/xds/resolver/serviceconfig.go 88.05% <0.00%> (+0.08%) ⬆️
internal/xds/httpfilter/fault/fault.go 70.45% <16.66%> (-1.52%) ⬇️
internal/xds/httpfilter/rbac/rbac.go 57.74% <16.66%> (-2.82%) ⬇️
internal/xds/httpfilter/router/router.go 32.35% <25.00%> (-5.89%) ⬇️

... and 44 files with indirect coverage changes

Comment on lines 104 to 105
func (provider) IsClient() bool { return true }
func (provider) IsServer() bool { return false }
Member

As discussed, it would be nice if we could find a way to use the type system to avoid potential disagreement between these functions and the behavior of the filter that is produced, but this is OK for now.

Contributor Author

As I got rid of the 2-layer thing, I was able to bring back the optional interfaces like we had before.

type Filter interface {
// TypeURLs are the proto message types supported by this filter. A filter
// will be registered by each of its supported message types.
// A FilterProvider is responsible for parsing a HTTP filter's configuration
Member

Nit: an HTTP filter

Contributor Author

Done.

// capable of working on a server.
IsServer() bool
// Build creates a new Filter instance with the given name.
Build(name string) Filter
Member

What if we didn't do this whole 2-layer thing and just added "name" to the Build*Interceptor methods instead?

We already said the FilterProvider would need to track instances directly inside of itself.

The builder functions could return a close func().

It seems like we could still do everything we need to do. Existing filters would just ignore the new name parameter and return func(){}. The new ones would do all their internal accounting in one centralized place.
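
A minimal sketch of that flat variant, assuming placeholder `FilterConfig` and `ClientInterceptor` types. `statelessFilter` and `statefulFilter` are hypothetical, and the exact parameter order is a guess rather than something settled in this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// Placeholder types standing in for the real internal ones (assumptions).
type FilterConfig interface{}
type ClientInterceptor interface{}

// statelessFilter is a hypothetical existing filter: it ignores the name
// and returns a close func that does nothing.
type statelessFilter struct{}

func (statelessFilter) BuildClientInterceptor(name string, config, override FilterConfig) (ClientInterceptor, func(), error) {
	return struct{}{}, func() {}, nil
}

// statefulFilter is a hypothetical new filter that does its accounting in
// one centralized place.
type statefulFilter struct {
	mu   sync.Mutex
	live map[string]int // live interceptors per filter name
}

func (f *statefulFilter) BuildClientInterceptor(name string, config, override FilterConfig) (ClientInterceptor, func(), error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.live == nil {
		f.live = make(map[string]int)
	}
	f.live[name]++

	var once sync.Once
	closeFn := func() {
		// once makes the close func safe to call more than one time.
		once.Do(func() {
			f.mu.Lock()
			defer f.mu.Unlock()
			f.live[name]--
			if f.live[name] == 0 {
				delete(f.live, name)
			}
		})
	}
	return struct{}{}, closeFn, nil
}

func main() {
	_, noop, _ := statelessFilter{}.BuildClientInterceptor("ignored", nil, nil)
	noop()

	f := &statefulFilter{}
	_, close1, _ := f.BuildClientInterceptor("authn", nil, nil)
	_, close2, _ := f.BuildClientInterceptor("authn", nil, nil)
	close1()
	close1() // repeated call has no effect
	fmt.Println(f.live["authn"])
	close2()
	fmt.Println(len(f.live))
}
```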

Contributor Author

Done.

// this case, the RPC will not be intercepted by this filter.
BuildServerInterceptor(config, override FilterConfig) (iresolver.ServerInterceptor, error)

// Close closes the filter, allowing it to perform any required cleanup.
Member

As discussed, please TODO since close is never called.

Contributor Author

@easwars easwars Dec 18, 2025

Based on the other changes, we no longer have this method.

}
i, err := ib.BuildClientInterceptor(filter.Config, override)
filterInstance := filter.FilterProvider.Build(filter.Name)
i, err := filterInstance.BuildClientInterceptor(filter.Config, override)
Member

As discussed, add TODO.

Note that with a close function being returned, it would have been more obvious that it wasn't getting called.

Contributor Author

Done.

}
si, err := sb.BuildServerInterceptor(filter.Config, override)
filterInstance := filter.FilterProvider.Build(filter.Name)
si, err := filterInstance.BuildServerInterceptor(filter.Config, override)
Member

Similar: needs a TODO

Contributor Author

Done.

}

func init() {
httpfilter.Register(builder{})
Member

Sorry, but would you mind also reverting this renaming to make the diffs easier to consume?

Contributor Author

Done.

type Filter interface {
// TypeURLs are the proto message types supported by this filter. A filter
// will be registered by each of its supported message types.
type FilterProvider interface {
Member

Should we revert the old name, too? A "Filter" was already effectively a provider (of interceptors), and it retains that property with these changes.

Basically this whole change becomes:

  • Add a name and cancel func to the instance builders, to allow Filters to track instances.

? Does that actually satisfy all our goals? I really think it does, but I'm not 100% sure.

Contributor Author

Done.

Does that actually satisfy all our goals? I really think it does, but I'm not 100% sure

I think it does. For example, in the GCP Auth Filter (A83), the Filter would contain a cache of call creds, and a map from name to the interceptor. Whenever Build is called, it would consult the map and return an entry from the map if one exists, otherwise it would create a new one and pass it a ref to the cache.

Contributor Author

easwars commented Dec 19, 2025

But now that we've gotten rid of the two-level thing, I think there is a problem when there is more than one grpc channel in the process and filter names are the same across channels (which seems very possible), but the configs are different.

Let's say we have channel-A with a "GCP Auth filter" with a config that has a cache size of 10, and channel-B with a "GCP Auth filter" with a config that has a cache size of 20.

With the two-level approach:

  • xDS client will include the same GCP Auth FilterProvider in the LDS resource for both channels
  • xDS resolver for channel-A will build a Filter instance from the provider for name "GCP Auth filter"
  • xDS resolver for channel-B will build a Filter instance from the provider for name "GCP Auth filter"
  • The provider will return two different Filter instances to the two channels because the provider is not going to maintain any shared state. In fact, we could get rid of the name parameter from the Build method at this point, or just use it for logging purposes.
  • Whenever either of the channels receives a new configuration for the filter with the same name, its xDS resolver would build interceptors from its unique filter instance, which will contain the cache and will be able to retain the cache across state updates

With our flat approach though, this is not going to be possible:

  • Since the Filter (which is also the provider/builder) uses the name to distinguish between different instances of interceptors, it cannot support different configurations for channels with the same filter names.
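
One way the collision could show up, as a hedged sketch. `flatFilter` and `interceptor` are hypothetical, and the "existing entry wins" policy is an assumption; a real filter might instead overwrite the config, which would affect the other channel just the same.

```go
package main

import "fmt"

// interceptor is a hypothetical stand-in that records the config it was
// built with.
type interceptor struct{ cacheSize int }

// flatFilter follows the flat approach: one process-wide object that keys
// its instances by filter name alone.
type flatFilter struct {
	byName map[string]*interceptor
}

func (f *flatFilter) Build(name string, cacheSize int) *interceptor {
	if i, ok := f.byName[name]; ok {
		return i // existing entry wins, so the caller's config is ignored
	}
	i := &interceptor{cacheSize: cacheSize}
	f.byName[name] = i
	return i
}

func main() {
	f := &flatFilter{byName: map[string]*interceptor{}}
	a := f.Build("GCP Auth filter", 10) // channel-A
	b := f.Build("GCP Auth filter", 20) // channel-B
	// Both channels share one instance, and channel-B never gets size 20.
	fmt.Println(a == b, b.cacheSize) // prints "true 10"
}
```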

I think we should go back to the two-level approach.

Let me know if I'm missing something.

Member

dfawley commented Dec 19, 2025

The provider will return two different Filter instances to the two channels because the provider is not going to maintain any shared state. In fact, we could get rid of the name parameter from the Build method at this point, or just use it for logging purposes.

I see. I thought the FilterProvider was going to need to hold state so that if Build was called with the same name it would spit out the same instance again.

But then what is the lifecycle of a Filter instance? When is it created/destroyed?

Whenever either of the channels receives a new configuration for the filter with the same name, its xDS resolver would build interceptors from its unique filter instance, which will contain the cache and will be able to retain the cache across state updates

When does the cached state get reused / not across interceptors? I thought we throw them away when the config changes? Or is it only certain parts of the config that determine that?

Contributor Author

easwars commented Dec 19, 2025

I see. I thought the FilterProvider was going to need to hold state so that if Build was called with the same name it would spit out the same instance again.

We could have the xDS resolver and the xDS server do that instead. With the two-level approach, they would maintain a map from filter name to Filter instance.

But then what is the lifecycle of a Filter instance? When is it created/destroyed?

And as long as the filter name does not change, the resolver and the server will continue using the same Filter instance, but will build new interceptors when needed (when building a new config selector on the client, and when building a new filter chain on the server).

When does the cached state get reused / not across interceptors? I thought we throw them away when the config changes? Or is it only certain parts of the config that determine that?

Yeah, it depends on the config and the filter. For example, in the GCP Auth filter, the only configuration is the size of the cache of call credentials. If the size changes, we don't want to throw away the old cache, but retain as many entries from the old cache as possible with the new size. And in the ext_authz filter, let's say some allowed_headers or header_mutation_rules change, but the address of the ext_authz server does not, then we would want to continue to use the existing grpc channel to the ext_authz server instead of creating a new one.

With the two-level approach, the Filter instance will be able to keep track of the most recent configuration based on what is passed to the interceptor build functions.
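
A small sketch of the GCP Auth case, assuming a very simplified cache. `credCache`, `gcpAuthFilter` and `interceptor` are hypothetical, the eviction policy (drop oldest first) is an assumption, and the code is not safe for concurrent use.

```go
package main

import "fmt"

// credCache is a hypothetical, simplified cache of call credentials keyed
// by audience. Keys are kept in insertion order, oldest first.
type credCache struct {
	max   int
	keys  []string
	creds map[string]string
}

func (c *credCache) add(audience, cred string) {
	if _, ok := c.creds[audience]; !ok {
		c.keys = append(c.keys, audience)
	}
	c.creds[audience] = cred
	c.evict()
}

// resize applies a new size while keeping as many recent entries as fit.
func (c *credCache) resize(max int) {
	c.max = max
	c.evict()
}

// evict drops the oldest entries until the cache fits its size limit.
func (c *credCache) evict() {
	for len(c.keys) > c.max {
		delete(c.creds, c.keys[0])
		c.keys = c.keys[1:]
	}
}

// gcpAuthFilter is a hypothetical Filter instance that outlives the
// interceptors built from it.
type gcpAuthFilter struct{ cache *credCache }

type interceptor struct{ cache *credCache }

// BuildClientInterceptor applies the latest config to the shared cache
// instead of creating a new cache.
func (f *gcpAuthFilter) BuildClientInterceptor(cacheSize int) *interceptor {
	if f.cache == nil {
		f.cache = &credCache{creds: map[string]string{}}
	}
	f.cache.resize(cacheSize)
	return &interceptor{cache: f.cache}
}

func main() {
	f := &gcpAuthFilter{}
	i1 := f.BuildClientInterceptor(3)
	for _, a := range []string{"a", "b", "c"} {
		i1.cache.add(a, "token-"+a)
	}
	i2 := f.BuildClientInterceptor(2) // config update shrinks the cache
	fmt.Println(i1.cache == i2.cache, len(i2.cache.creds), i2.cache.keys)
}
```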

Member

dfawley commented Dec 22, 2025

And as long as the filter name does not change, the resolver and the server will continue using the same Filter instance

OK so within a channel or server the lifecycle is (filter name + type)? So even if the config changes we keep using the same instance? And there will be like a map in the xds resolver and server with (name+type) as the key so it can do re-use?

So yes it seems we need the two-level approach, then, to handle that extra dimension. What if we:

  • change ClientInterceptorBuilder to ClientFilterBuilder and rename its method to BuildClientFilter
  • change ServerInterceptorBuilder to ServerFilterBuilder and rename its method to BuildServerFilter
  • Add ClientFilter and ServerFilter interfaces that contain a BuildInterceptor method that returns the interceptor.

Then we keep the property of the type system telling us what a filter supports, instead of having redundant bool methods. Also we keep the names a little simpler, too. A Filter is an HTTP filter in the generic sense. A ClientFilter is an instance of that Filter for clients to use, and a ServerFilter is an instance for servers.

WDYT?

@dfawley dfawley assigned easwars and unassigned dfawley Dec 22, 2025
Contributor Author

easwars commented Jan 7, 2026

OK so within a channel or server the lifecycle is (filter name + type)? So even if the config changes we keep using the same instance? And there will be like a map in the xds resolver and server with (name+type) as the key so it can do re-use?

Yes, that's correct.

Contributor Author

easwars commented Jan 7, 2026

With your suggestions for renaming, the API would look something like this:

// Instances of these are stored in the registry, keyed by the type_urls
type Filter interface {
  TypeURLs() []string
  ParseFilterConfig(proto.Message) (FilterConfig, error)
  ParseFilterConfigOverride(proto.Message) (FilterConfig, error)
  IsTerminal() bool
}

// A Filter can optionally implement this interface
type ClientFilterBuilder interface {
  BuildClientFilter() ClientFilter
}

// A Filter can optionally implement this interface
type ServerFilterBuilder interface {
  BuildServerFilter() ServerFilter
}

type ClientFilter interface {
  // The returned cancel func is to be invoked when the interceptor is no longer needed.
  BuildClientInterceptor(configs) (iresolver.ClientInterceptor, func(), error)
}

type ServerFilter interface {
  // The returned cancel func is to be invoked when the interceptor is no longer needed.
  BuildServerInterceptor(configs) (iresolver.ServerInterceptor, func(), error)
}

I gave the methods in the ClientFilter and ServerFilter interfaces different names so that a single concrete type can implement both the interfaces if required.
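
A compilable sketch of how callers could discover support through the optional interfaces. The interceptor and config types are placeholders, `configs` is expanded to `config, override FilterConfig` as an assumption, the parse methods are left out to avoid a proto dependency, and `clientOnly` and `instance` are hypothetical.

```go
package main

import "fmt"

// Placeholder types standing in for the real internal ones (assumptions).
type FilterConfig interface{}
type ClientInterceptor interface{}
type ServerInterceptor interface{}

// Filter is what the registry stores (parse methods omitted here).
type Filter interface {
	TypeURLs() []string
	IsTerminal() bool
}

// Optional interfaces a Filter may implement.
type ClientFilterBuilder interface{ BuildClientFilter() ClientFilter }
type ServerFilterBuilder interface{ BuildServerFilter() ServerFilter }

type ClientFilter interface {
	BuildClientInterceptor(config, override FilterConfig) (ClientInterceptor, func(), error)
}

type ServerFilter interface {
	BuildServerInterceptor(config, override FilterConfig) (ServerInterceptor, func(), error)
}

// instance implements both ClientFilter and ServerFilter, which works
// because the two methods have different names.
type instance struct{}

func (*instance) BuildClientInterceptor(_, _ FilterConfig) (ClientInterceptor, func(), error) {
	return struct{}{}, func() {}, nil
}

func (*instance) BuildServerInterceptor(_, _ FilterConfig) (ServerInterceptor, func(), error) {
	return struct{}{}, func() {}, nil
}

// Compile-time checks that instance satisfies both interfaces.
var _ ClientFilter = (*instance)(nil)
var _ ServerFilter = (*instance)(nil)

// clientOnly is a hypothetical filter that supports only the client side,
// so it implements ClientFilterBuilder but not ServerFilterBuilder.
type clientOnly struct{}

func (clientOnly) TypeURLs() []string              { return []string{"type.example/client_only"} }
func (clientOnly) IsTerminal() bool                { return false }
func (clientOnly) BuildClientFilter() ClientFilter { return &instance{} }

func main() {
	var f Filter = clientOnly{}
	_, onClient := f.(ClientFilterBuilder)
	_, onServer := f.(ServerFilterBuilder)
	fmt.Println(onClient, onServer) // prints "true false"
}
```

With this shape there are no IsClient/IsServer methods that could disagree with what the filter actually builds; the type assertion is the only source of truth.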

EDIT: Also, I changed the BuildXxxFilter methods to not return a cancel func as we will not have a case where a filter exists without interceptors being built from it.

Contributor Author

easwars commented Jan 7, 2026

With the above approach, the ClientFilter and ServerFilter instances would be the ones that maintain state shared across updates and make it available to the interceptors built from them.

We also discussed adding some kind of generic reference counted FilterState type that will be passed from the filter instance to the interceptor instance to allow for shared state. The interceptors themselves would hold one ref each to the FilterState (this would be unref-ed when an old config_selector is thrown away) and every RPC running through the interceptor would hold one ref (this would be unref-ed when the RPC completes). This work though can happen as part of the first filter that needs shared state (either GCP auth filter or ext_authz filter).
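
A rough sketch of what such a reference-counted type could look like. The `FilterState` API here (`NewFilterState`, `Ref`, `Unref`, the `release` callback) is entirely hypothetical, and a real version would also need to guard against `Ref` being called after the count reaches zero.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// FilterState is a hypothetical reference-counted holder for state shared
// between a filter instance, its interceptors, and in-flight RPCs.
type FilterState struct {
	refs    atomic.Int32
	release func() // runs when the last reference is dropped
}

// NewFilterState returns a FilterState with one reference, held by the
// caller (for example, the interceptor being built).
func NewFilterState(release func()) *FilterState {
	s := &FilterState{release: release}
	s.refs.Store(1)
	return s
}

// Ref takes an additional reference, for example when an RPC starts.
func (s *FilterState) Ref() { s.refs.Add(1) }

// Unref drops a reference and releases the state when none remain.
func (s *FilterState) Unref() {
	if s.refs.Add(-1) == 0 {
		s.release()
	}
}

func main() {
	released := false
	s := NewFilterState(func() { released = true }) // the interceptor's ref

	s.Ref()               // an RPC starts
	s.Unref()             // the old config selector is thrown away
	fmt.Println(released) // false: the RPC still holds a ref

	s.Unref()             // the RPC completes
	fmt.Println(released) // true
}
```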

@easwars easwars assigned dfawley and unassigned easwars Jan 7, 2026
Contributor Author

easwars commented Jan 7, 2026

@dfawley : Assigning back to you to make sure my understanding matches yours, before I start making the changes. Thanks.
