
CM: Add on-demand cluster discovery functionality #15857

Closed
wants to merge 45 commits into from

Conversation


@krnowak krnowak commented Apr 6, 2021

Commit Message:
The OdCdsApi interface is introduced as the API used for on-demand discovery. Its implementation is provided using the CdsApiHelper class, so OdCdsApiImpl handles a discovered cluster the same way CdsApiImpl does. On the ClusterManagerImpl side, a discovery manager (ClusterDiscoveryManager) is added to help deduplicate requests for the same cluster coming from within the same worker thread. Requests coming from different worker threads are further deduplicated in ClusterManagerImpl in the main thread. Each unique request for a cluster also receives a timeout to catch the case where a discovery fails, allowing the worker threads to handle the failure.

This also adds an explicit wildcard mode to the xDS protocol, to allow handling both wildcard and on-demand requests for the same resource type on the same stream.

Additional Description:

Risk Level:
High. A new feature not wired up anywhere yet. Changes in wildcard mode.

Testing:
As a part of #15523.

Docs Changes:
API docs, updated the xds-protocol docs about explicit wildcard mode.

Release Notes:

  • xds: added the explicit wildcard mode, which allows subscribing to specific resource names on top of a wildcard subscription. This means that the resource name * is reserved. This may mean that in some cases the initial wildcard subscription request on the stream will not be empty, but will have a non-empty list of resources with a special name among them. See :ref:`wildcard mode description <xds_protocol_resource_hints>` and :ref:`unsubscribing from resources <xds_protocol_unsubscribing_from_resources>` for details.

On-demand cluster discovery would be release-notes-worthy, but since it's not wired anywhere yet, it can't be used.

Platform Specific Features:
N/A

[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]
[Optional API Considerations:]

Each thread using this functionality will need to instantiate its
own OdCdsApi instance and pass it, together with a callback, to
ClusterManager's requestOnDemandClusterDiscovery. The cluster manager
uses the passed OdCdsApi to perform the discovery. When the discovery
is finished, ClusterManager invokes the callback in the context of the
requesting thread.

OdCdsApiImpl starts the subscription on the first update request.
Subsequent requests use the on-demand update functionality of the
subscription. That way, subscriptions are only started when actually
needed.
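
A minimal sketch of what a worker-thread call site might look like under this description. The callback type, its ClusterDiscoveryStatus argument, and the helper name are assumptions based on the names used elsewhere in this PR, not the exact signatures:

// Hypothetical call site in a worker thread (names are illustrative).
void requestClusterOnDemand(Upstream::ClusterManager& cm, Upstream::OdCdsApiSharedPtr odcds,
                            const std::string& cluster_name) {
  // The callback is kept alive by the caller and invoked in this thread's
  // context once the cluster becomes available or the discovery expires.
  auto callback = std::make_shared<Upstream::ClusterDiscoveryCallback>(
      [](Upstream::ClusterDiscoveryStatus status) {
        // Resume or fail the paused work depending on `status`.
      });
  // Dropping the returned handle withdraws this caller's interest without
  // affecting other callers waiting for the same cluster.
  auto handle = cm.requestOnDemandClusterDiscovery(odcds, cluster_name, callback);
  // Keep `handle` (and `callback`) alive for as long as the result matters.
}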

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
This is to address an issue raised during a review of the CdsApiHelper
refactor.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
@mattklein123 mattklein123 (Member) left a comment

Starting with some interface comments/questions. Thank you!

/wait

* @return ClusterDiscoveryCallbackHandlePtr the discovery process handle.
*/
virtual ClusterDiscoveryCallbackHandlePtr
requestOnDemandClusterDiscovery(OdCdsApiSharedPtr odcds, const std::string& name,
Member

Starting from a very high level: how does one allocate an OdCdsAPI, and why does one allocate it? It seems like this should all be handled by the cluster manager? Is the issue config? Should there be a CM API to allocate an on demand handle or something like that?

Contributor Author

Starting from a very high level: how does one allocate an OdCdsAPI, and why does one allocate it?

An extension could have a config source in its configuration, so it could be used to create an OdCdsApi. Extension config creation happens in the main thread, so this looks like a good moment to create an instance. The config source could be different from cds_config in bootstrap, which may be the reason why a separate OdCdsApi needs to be allocated.

It seems like this should all be handled by the cluster manager? Is the issue config? Should there be a CM API to allocate an on demand handle or something like that?

Could be I suppose. I can see adding something like:

class OdCdsApiHandle {
public:
  virtual ~OdCdsApiHandle() = default;
};

using OdCdsApiHandleSharedPtr = std::shared_ptr<OdCdsApiHandle>;

…

virtual OdCdsApiHandleSharedPtr ClusterManager::allocateOdCdsApi(…);
virtual ClusterDiscoveryCallbackHandlePtr
  requestOnDemandClusterDiscovery(OdCdsApiHandleSharedPtr odcds, …);

So I'm not sure. It's trading an interface with one function for another interface with no functions, plus one more function in the CM.

Member

Yeah ^ is what I would do. It's cleaner because now everything is contained within the CM and the handle is effectively opaque and can be de-duped based on config.

/wait

Contributor Author

I'm halfway through the refactor, so I have a couple of notes:

  • I have kept the OdCdsApiImpl class for now, it just does not subclass the already gone OdCdsApi interface. I think I'll rename the class to OdCdsApi later, since it's not an impl of anything anymore.
  • Having an opaque OdCdsApiHandle class basically forced me to write a function like the following inside cluster_manager_impl.cc:
OdCdsApiImpl& getOdCdsApiFromHandle(OdCdsApiHandleSharedPtr const& handle) {
  auto impl = std::dynamic_pointer_cast<OdCdsApiHandleImpl>(handle);
  ASSERT(impl != nullptr);
  return impl->get();
}

which feels kind of ew, but I have seen std::dynamic_pointer_cast being used in Envoy, so I guess this is ok.

This also means that it probably became difficult to keep the unit tests I wrote (the ODCDTest class in cluster_manager_impl_test.cc), because I can't mock OdCds. Passing a mock of the handle will trigger an assertion. I'll comment them out for now.

Member

You shouldn't need any casts. Just access the functionality from the handle (i.e. move the dynamic fetch functionality to the handle).

Contributor Author

Oh. My understanding was that you wanted the handle class to be completely opaque, so no other virtual functions other than the destructor. Adding a dynamic fetch functionality to the handle means renaming the old OdCdsApi interface to OdCdsApiHandle.

Member

Basically what I'm after is for the entire functionality to be exposed by the CM in isolation. So I think API handle -> fetch -> fetch handle makes sense. The API handle should be obtained from the CM itself based on config (which should be de-duped).

Contributor Author

The only wrinkle I can see here is that you can receive an API handle from the CM and then, instead of doing cm.requestOnDemandClusterDiscovery(api_handle, …), you can also do api_handle.updateOnDemand(clusterName). So maybe expose the discovery functionality only through the handle and drop the requestOnDemandClusterDiscovery function from the CM interface? Or, more precisely: move requestOnDemandClusterDiscovery from the CM to the API handle.

Member

Yes exactly, the only CM API is to get an API handle, then the request is done through the API handle. This makes sure the entire abstraction is hidden inside the CM, and then it's just an implementation detail of code sharing between CDS and ODCDS.
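
A rough sketch of the shape being agreed on here; the pointer types match the diff snippets quoted below and changed again later in the review, so treat this as an illustration rather than the final API:

// Opaque handle obtained from the cluster manager; all on-demand discovery
// goes through it, so CDS/ODCDS code sharing stays a CM implementation detail.
class OdCdsApiHandle {
public:
  virtual ~OdCdsApiHandle() = default;

  // Files a discovery request for `name`; the callback fires in the calling
  // thread's context when the cluster is added or the request expires.
  virtual ClusterDiscoveryCallbackHandlePtr
  requestOnDemandClusterDiscovery(const std::string& name,
                                  ClusterDiscoveryCallbackWeakPtr callback) PURE;
};

// The only ODCDS-related addition to the ClusterManager interface is then a
// factory method along the lines of:
//   virtual OdCdsApiHandleSharedPtr allocateOdCdsApi(/* ODCDS config source, ... */) PURE;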

/**
* File an on-demand request for a cluster.
*/
virtual void updateOnDemand(const std::string& cluster_name) PURE;
Member

Porting my comment over from the other PR: I'm not sure why this is needed at this interface level. From my perspective everything should be encapsulated by the cluster manager. (See above comment)

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>

krnowak commented Apr 8, 2021

I refactored the code, so now we have the handle interface that exposes the discovery functionality, and CM received a function to create those handles. Currently the implementation of the function is simple - it creates a new handle every time it's called. I have kept the OdCdsApi interface so I could mock it for the cluster manager unit tests, but the interface is an implementation detail now.

The code is still building on my machine, so there may be some lingering build issues to fix and I could end up pushing a few more commits. But I think it's more or less ready for another review round.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
@mattklein123 mattklein123 (Member) left a comment

Thanks, from an API perspective this looks right. Some high-level comments around readability:

  1. Please move the classes/function definitions into the appropriate visibility section.
  2. The code needs significantly more comments given how complicated it is (across both class level and definition level, as well as within each function). It will make reviewing easier also. So please work on that next and I can take another pass.

Thank you!

/wait

*/
virtual ClusterDiscoveryCallbackHandlePtr
requestOnDemandClusterDiscovery(const std::string& name,
ClusterDiscoveryCallbackWeakPtr callback) PURE;
Member

I think I commented on this previously, but in general it doesn't make sense to pass a weak ptr here. The callback should exist at the time of the call. I would let the callee store it as a weak ptr.

Contributor Author

Ok, I thought earlier it was only about the odcds weak pointer. Will change it.
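
A self-contained sketch of the ownership pattern being requested here; the type aliases and class name are assumptions for illustration, not Envoy's actual definitions:

#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

enum class ClusterDiscoveryStatus { Available, Missing };

using ClusterDiscoveryCallback = std::function<void(ClusterDiscoveryStatus)>;
using ClusterDiscoveryCallbackSharedPtr = std::shared_ptr<ClusterDiscoveryCallback>;
using ClusterDiscoveryCallbackWeakPtr = std::weak_ptr<ClusterDiscoveryCallback>;

class DiscoveryManagerSketch {
public:
  // The caller passes a shared_ptr, so the callback is guaranteed to exist at
  // the time of the call...
  void addCallback(const std::string& name, ClusterDiscoveryCallbackSharedPtr callback) {
    // ...and the callee stores only a weak_ptr, so a caller that went away
    // before the discovery finished is simply skipped on completion.
    callbacks_[name].push_back(callback);
  }

  void notify(const std::string& name, ClusterDiscoveryStatus status) {
    for (auto& weak_cb : callbacks_[name]) {
      if (auto cb = weak_cb.lock()) {
        (*cb)(status);
      }
    }
    callbacks_.erase(name);
  }

private:
  std::unordered_map<std::string, std::vector<ClusterDiscoveryCallbackWeakPtr>> callbacks_;
};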

*/
virtual OdCdsApiHandleSharedPtr
allocateOdCdsApi(const envoy::config::core::v3::ConfigSource& odcds_config,
const xds::core::v3::ResourceLocator* odcds_resources_locator,
Member

nit: I would use OptRef here, and clarify above that this is used, if provided, in preference to odcds_config.

Contributor Author

It is a pointer, because I copied that from CdsApiImpl. Should CdsApiImpl be fixed too?

Member

If it's easy to fix, yes, otherwise I wouldn't worry about it for now.
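
For illustration, the pointer-vs-OptRef difference looks roughly like this, assuming Envoy's OptRef helper; the surrounding signature is the one quoted above:

// Raw-pointer form (what CdsApiImpl currently uses; nullptr means "absent"):
//   const xds::core::v3::ResourceLocator* odcds_resources_locator
//
// OptRef form, making the optionality explicit in the type:
virtual OdCdsApiHandleSharedPtr
allocateOdCdsApi(const envoy::config::core::v3::ConfigSource& odcds_config,
                 OptRef<const xds::core::v3::ResourceLocator> odcds_resources_locator,
                 ProtobufMessage::ValidationVisitor& validation_visitor) PURE;

// Call sites then read:
//   if (odcds_resources_locator.has_value()) {
//     useLocator(odcds_resources_locator.ref());  // preferred over odcds_config
//   }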

source/common/upstream/cluster_manager_impl.h (outdated comment, resolved)
private:
ClusterDiscoveryManager& parent_;
ClusterDiscoveryCallbackWeakPtr cb_;
std::string name_;
Member

const

@@ -285,6 +286,42 @@ class ClusterManagerImpl : public ClusterManager, Logger::Loggable<Logger::Id::u
ClusterUpdateCallbacksHandlePtr
addThreadLocalClusterUpdateCallbacks(ClusterUpdateCallbacks&) override;

class OdCdsApiHandleImpl : public std::enable_shared_from_this<OdCdsApiHandleImpl>,
Member

Class definitions go at the top of the visibility section. I think this implementation detail can also be private.

Contributor Author

Can this be at least protected, so unit test could use it?

Member

You can make the test a friend if you need to test private details, though optimally the tests would just go through the main CM.

@krnowak krnowak (Contributor Author) Apr 9, 2021

I wanted to make it protected so that the test can construct its own handle instance containing a mock odcds instead of the real one. That way, when calling handle->requestOnDemandClusterDiscovery(…), the real CM code is tested, but it calls the mock's odcds.updateOnDemand. I'll push my updates soon, so you will see what I mean.


private:
ClusterManagerImpl& parent_;
OdCdsApiPtr odcds_;
Member

const

Comment on lines 321 to 323
ClusterDiscoveryCallbackHandlePtr
requestOnDemandClusterDiscovery(OdCdsApiHandleImplSharedPtr odcds_handle, const std::string& name,
ClusterDiscoveryCallbackWeakPtr callback);
Member

This can be private.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>

krnowak commented Apr 9, 2021

Updated. Other than addressing the review issues, I changed the use of deque to list (I can't remember what the reason for using deque was). That made the code a bit simpler and allowed me to move the discovery handle impl class into the discovery manager class.

Other than that - I've got a clang-tidy failure. It complains about some leak in libstdc++ code (search for M_pi): https://dev.azure.com/cncf/4684fb3d-0389-4e0b-8251-221942316e06/_apis/build/builds/71558/logs/144

Abort, Retry, Ignore? :)

@mattklein123 mattklein123 (Member) left a comment

Thanks, cool stuff. Flushing out some comments from this round.

/wait

* @param validation_visitor
* @return OdCdsApiHandleSharedPtr the ODCDS handle.
*/
virtual OdCdsApiHandleSharedPtr
Member

I think I see why you used shared_ptr here (to deal with lifetime issues), but in a perfect world, from an API perspective, this should be a unique_ptr return and any needed shared/weak ptr gymnastics should be done under the hood. From a user perspective I would expect to allocate a handle in my filter's (etc.) main-thread init, use that across different threads, and then destroy it when the filter is destroyed.

Contributor Author

Yeah, this makes sense. Thanks for the insight. I have made it a unique_ptr that holds a shared_ptr to OdCdsApi.
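
A sketch of the arrangement described in that reply. Class and member names are taken from the surrounding diff snippets, but the exact constructor and forwarding are assumptions:

// Returned to the caller as a plain unique_ptr (OdCdsApiHandlePtr), so the
// handle has clear, exclusive ownership at the call site...
class OdCdsApiHandleImpl : public OdCdsApiHandle {
public:
  OdCdsApiHandleImpl(ClusterManagerImpl& parent, OdCdsApiSharedPtr odcds)
      : parent_(parent), odcds_(std::move(odcds)) {}

  ClusterDiscoveryCallbackHandlePtr
  requestOnDemandClusterDiscovery(const std::string& name,
                                  ClusterDiscoveryCallbackSharedPtr callback) override {
    // ...while the shared OdCdsApi it holds can be captured in work posted to
    // the main thread, keeping the lifetime gymnastics under the hood.
    return parent_.requestOnDemandClusterDiscovery(odcds_, name, std::move(callback));
  }

private:
  ClusterManagerImpl& parent_;
  const OdCdsApiSharedPtr odcds_;
};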

Comment on lines 1188 to 1190
// TODO(krnowak): This timeout period is arbitrary. Should it be a parameter or do we keep it,
// but rather with a name?
timer->enableTimer(std::chrono::milliseconds(5000));
Member

Yes, the timeout should be passed as an API parameter. Also I think ClusterDiscoveryStatus should have a Timeout status to differentiate from Missing.

Contributor Author

Ok, will add the timeout parameter.

About the Missing and Timeout statuses - the thing is that the Missing status is only reported when the discovery times out, so I guess we could have Available and Timeout statuses. I'm not sure if there's a way for the gRPC subscription to tell me that the requested resource does not exist. At least I haven't seen it done in VHDS.
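
A sketch of the direction discussed here: a per-request timeout parameter plus a status that distinguishes a timed-out discovery. Naming is illustrative; the PR may settle on different spellings:

enum class ClusterDiscoveryStatus {
  // The cluster was discovered (or already existed) and is usable.
  Available,
  // The per-request timer fired before the discovery completed; with the
  // current protocol usage this is the only way a "missing" cluster surfaces.
  Timeout,
};

virtual ClusterDiscoveryCallbackHandlePtr
requestOnDemandClusterDiscovery(const std::string& name,
                                ClusterDiscoveryCallbackSharedPtr callback,
                                std::chrono::milliseconds timeout) PURE;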

Comment on lines 1168 to 1169
// If this gets called here, it means that we requested a discovery of a cluster without
// checking if that cluster is already known by cluster manager.
Member

Is this actually true? Couldn't this be a race between threads?

@krnowak krnowak (Contributor Author) Apr 12, 2021

Hm, it could be a rather long-winded race, true. Worker thread W1 checks for a cluster that is absent, gets preempted in favor of worker thread W2, which does the same and starts the discovery process. Then the main thread sends the gRPC request, receives the reply, warms up the cluster and adds it to the thread-local CMs. Then W1 gets a chance to run again, and since it decided that the cluster is missing, it starts the discovery and hits this case in the main thread. Long-winded to the point of starvation, so I'd say this is a highly unlikely possibility. But who knows what may happen under really high load. I'll update the comment.

return;
}

if (auto it = pending_cluster_creations_.find(name); it != pending_cluster_creations_.end()) {
Member

Please add debug/trace level logging throughout this code/PR. It's going to be very difficult to debug this and understand what is going on. Please also check coverage to make sure all various branches are hit in the new code.

Contributor Author

Will do.

Contributor Author

Added some, but I'm rather unsure about adding logging to the discovery manager - it's a thread local thing, so it might spam the log N times with the same message (especially when the discovery is finished and every worker thread is notified about the fact).

Member

I think it's OK to do that at trace level, and probably still worth it.

Contributor Author

Ok, will add them then.
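
For example, a per-worker trace line might look like this (ENVOY_LOG is Envoy's standard logging macro; the message text is illustrative):

// Emitted once per worker when the discovery for `name` completes; trace
// level keeps the N near-identical lines out of normal debug output.
ENVOY_LOG(trace, "odcds: running discovery callbacks for cluster {}", name);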

// invoke the cluster lifecycle callbacks, that will in turn invoke our callback.
odcds.updateOnDemand(name);
// Setup the discovery timeout timer to avoid keeping callbacks indefinitely.
auto timer_cb = Event::TimerCb([this, name] { notifyExpiredDiscovery(name); });
Member

nit: just inline this next line

Comment on lines 1204 to 1206
// Defer destroying the timer, so it's not destroyed during its callback. TimerPtr is a
// unique_ptr, which is not copyable, but std::function is copyable, we turn a move-only
// unique_ptr into a copyable shared_ptr and pass that to the std::function.
Member

None of this should be necessary. Timers are one shot and it should be safe to destroy them during callback invocation, so I think it's fine to just destroy the timer as this pending discovery expires. We shouldn't be passing timers between threads in any situation.

Contributor Author

This code is expected to be executed in the main thread, so the deferred destruction is also meant to happen in the main thread. But anyway, I'm happy to drop this - it will get rid of the clang-tidy complaint here too.
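
A sketch of the simpler arrangement being suggested, assuming the pending-discovery entry owns its Event::TimerPtr (the member and helper names are illustrative):

// Envoy timers are one-shot and may be destroyed from inside their own
// callback, so the timeout handler can directly erase the pending entry
// that owns the timer instead of deferring destruction.
auto timer = main_dispatcher.createTimer([this, name] {
  notifyExpiredDiscovery(name);
  // Destroys the TimerPtr stored in the entry; safe even though we are
  // currently executing inside that timer's callback.
  pending_cluster_creations_.erase(name);
});
timer->enableTimer(timeout);
pending_cluster_creations_[name].timer_ = std::move(timer);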

Comment on lines 1212 to 1214
if (!cluster_manager.has_value()) {
return;
}
Member

I don't think this is possible?

Contributor Author

It's an opt-ref, so I thought the check would be prudent. I'll remove it.

source/common/upstream/cluster_manager_impl.cc (outdated comment, resolved)
Comment on lines 1057 to 1059
auto cb = ClusterAddedCb([this](ThreadLocalCluster& cluster) {
processClusterName(cluster.info()->name(), ClusterDiscoveryStatus::Available);
});
Member

nit: just inline next line

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>

krnowak commented Apr 12, 2021

Updated. I'll have a look at the coverage when CI finishes it.


krnowak commented Apr 13, 2021

The tsan failure seems unrelated - two tests are flaky (//test/config_test:example_configs_test and //test/integration:integration_test), and some Python flaky-test reporting tool threw an exception.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>

krnowak commented Apr 13, 2021

Updated, and improved coverage a bit too, but there is one place I couldn't cover with the mock-based unit tests.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
This is mostly for testing the invoker behavior for a handle that was
destroyed when it was already prepared for processing.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
…ogress

This led me to add rather icky workarounds to simulate multiple
threads with mocks.

Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>

krnowak commented May 25, 2021

@mattklein123, @htuch, @dmitri-d, @markdroth: I think it's ready for review now.

I found it a bit hard to express the semantics of the explicit wildcard mode (as I called it) in the xds-protocol docs, because it feels that the general description of the protocol is done through the SotW point of view. I'll try to make a comparison between delta and SotW here:

  • initial implicit wildcard subscription request:
    • SotW: empty request (no resource names).
    • delta: empty request (no subscribe or unsubscribe resource names).
  • initial explicit wildcard subscription request with an extra resource foo (enabling explicit wildcard mode without adding any extra resource names makes little sense, but it's doable):
    • SotW: request with names * and foo
    • delta: request with subscribe names * and foo
  • subsequent request keeping the wildcard subscription and adding a resource bar:
    • SotW: request with names *, foo, and bar.
    • delta: request with a subscribe name bar.
  • subsequent request cancelling the wildcard subscription only:
    • SotW: request with names foo and bar.
    • delta: request with unsubscribe name *.
  • subsequent request enabling the wildcard subscription again:
    • SotW: request with names *, foo and bar.
    • delta: request with subscribe name *.
  • switching to explicit wildcard mode with an extra resource foo after the initial empty request was sent to enable implicit wildcard mode:
    • SotW: request with names * and foo.
    • delta: request with subscribe name foo (note the lack of *; we are already in wildcard mode, so there is no need to repeat it).
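
For concreteness, the delta-side requests above map onto the generated xDS protos roughly like this (the field names come from envoy.service.discovery.v3.DeltaDiscoveryRequest; the type URL is just an example):

#include "envoy/service/discovery/v3/discovery.pb.h"

envoy::service::discovery::v3::DeltaDiscoveryRequest initialExplicitWildcard() {
  envoy::service::discovery::v3::DeltaDiscoveryRequest req;
  req.set_type_url("type.googleapis.com/envoy.config.cluster.v3.Cluster");
  req.add_resource_names_subscribe("*");    // opt in to wildcard explicitly
  req.add_resource_names_subscribe("foo");  // plus a specific resource
  return req;
}

envoy::service::discovery::v3::DeltaDiscoveryRequest dropWildcardKeepFoo() {
  envoy::service::discovery::v3::DeltaDiscoveryRequest req;
  req.set_type_url("type.googleapis.com/envoy.config.cluster.v3.Cluster");
  req.add_resource_names_unsubscribe("*");  // cancel only the wildcard part
  return req;
}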


krnowak commented Jun 1, 2021

@mattklein123, @htuch, @dmitri-d, @markdroth: Friendly ping for a review. :) Thanks.

I'm not sure if the failure in the windows clang_cl CI is related in any way to my changes. The failed test target was //test/integration:quic_protocol_integration_test. The names of the failed tests seem to be UpstreamProtocols/DownstreamProtocolIntegrationTest.HittingDecoderFilterLimit/IPv6_HttpDownstream_Http3Upstream and UpstreamProtocols/DownstreamProtocolIntegrationTest.VeryLargeRequestHeadersRejected/IPv6_HttpDownstream_Http3Upstream. But the test output is a bit confusing.

@mattklein123 (Member)

@krnowak, @htuch is out this week. I would suggest trying to have @dmitri-d review this week and @htuch can do the final pass next week. I think the things being worked on right now are out of scope of what I can help with.


dmitri-d commented Jun 1, 2021

I'll take a look today.

if (names_removed_.find("*") != names_removed_.end()) {
// we explicitly cancel the wildcard subscription
mode_ = WildcardMode::Disabled;
} else if (!names_added_.empty()) {
Contributor

Hmm. I'm not sure about this: an explicit mode requires presence of a * in the names_added list; we only check if that list isn't empty. If another call to updateSubscriptionInterest is made before a request to the server is made, we may end up in a wrong state: for example if names_removed is empty in that call, we'll end up in WildcardMode::Disabled state.

Contributor Author

Hmm. I'm not sure about this: an explicit mode requires presence of a * in the names_added list;

Only on the initial request (after a stream reset, for example), or on the first request where we decide to opt in to the wildcard mode. When switching from implicit to explicit mode, there is no need to send *, since the server should already know that we are in the wildcard mode. I would say that with this PR, the server should not ignore delta requests with a non-empty resource_names_subscribe list from a client that opted in to wildcard mode.

I'll need to go through the documentation I wrote to make sure that it's clear.

we only check if that list isn't empty. If another call to updateSubscriptionInterest is made before a request to the server is made, we may end up in a wrong state: for example if names_removed is empty in that call, we'll end up in WildcardMode::Disabled state.

Sorry, I'm not following here. There isn't any logic that resets us to WildcardMode::Disabled on an empty names_removed_ list. Could you clarify?


dmitri-d commented Jun 1, 2021

I think "explicit" wildcard changes make sense, but (and I hate to be this guy): could you make these changes a dedicated PR please?

@markdroth (Contributor)

Sorry for the delay on this!

I found it a bit hard to express the semantics of the explicit wildcard mode (as I called it) in the xds-protocol docs, because it feels that the general description of the protocol is done through the SotW point of view.

I don't actually think it makes sense to consider this a separate mode for wildcard queries. It really isn't a new mode; it's just a new syntax for specifying the existing functionality.

I also don't think it's necessary to describe this totally differently for SotW and delta. The syntax differs between the two protocol variants, but the semantics are ultimately the same: the client determines what set of resources it is subscribed to at any one time, and the set of currently subscribed-to resources is a property of the stream that has to be tracked by both the client and the server. In SotW, the client must include the full set in every request, whereas in delta, the client can send just changes to the existing set, but in both cases, that set is a stable property of the stream in between requests. So I think we can describe the wildcard behavior in terms of that set of subscribed resources.

I think the right way to describe this in the protocol doc is to say that one of the subscribed resources can be *, which is a wildcard subscription. The old-style way of specifying a wildcard query is still supported, but it can be described as basically a syntax alias for the same thing -- in other words, if the set of subscribed resources is empty and has never previously been non-empty, that is a special case meaning that the set of subscribed resources contains only *.

Also, I agree with the comment from @dmitri-d about moving the xDS spec changes into a separate PR.


krnowak commented Jun 7, 2021

Please see #16855 for the separate PR for the explicit wildcard mode.


github-actions bot commented Jul 7, 2021

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale label (stalebot believes this issue/PR has not been touched recently) on Jul 7, 2021

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot closed this Jul 15, 2021

krnowak commented Jul 15, 2021

I'll update and reopen it when #16855 gets merged.


krnowak commented Jul 16, 2021

I'll pick up this work in August, when I'm back from holidays.

htuch pushed a commit that referenced this pull request Feb 4, 2022
The OdCdsApi interface is introduced as the API used for on-demand discovery. Its implementation is provided using the CdsApiHelper class, so OdCdsApiImpl handles a discovered cluster the same way CdsApiImpl does. On the ClusterManagerImpl side, a discovery manager (ClusterDiscoveryManager) is added to help deduplicate requests for the same cluster coming from within the same worker thread. Requests coming from different worker threads are further deduplicated in ClusterManagerImpl in the main thread. Each unique request for a cluster also receives a timeout to catch the case where a discovery fails, allowing the worker threads to handle the failure.

This is a continuation of #15857 - I could not reopen it, so I'm opening a new PR. I used the opportunity to rebase my changes on top of main.

Risk Level:
Low. A new feature not wired up anywhere yet.

Signed-off-by: Krzesimir Nowak <knowak@microsoft.com>
joshperry pushed a commit to joshperry/envoy that referenced this pull request Feb 13, 2022
This is a continuation of envoyproxy#15857 (same commit message as above).

Signed-off-by: Krzesimir Nowak <knowak@microsoft.com>
Signed-off-by: Josh Perry <josh.perry@mx.com>