Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(clustering) config export on demand #9362

Closed
wants to merge 4 commits into from

Conversation

StarlightIbuki
Copy link
Contributor

fix FT-3308

@bungle
Copy link
Member

bungle commented Sep 1, 2022

This reminds me of this:
#9082

@StarlightIbuki
Copy link
Contributor Author

This reminds me of this: #9082

yeah, we reduced 4 times of the process to 2, and now it should be 1

@bungle
Copy link
Member

bungle commented Sep 1, 2022

This also reminds me of this, @suika-kong please read carefully all the comments there:
#8840

@bungle
Copy link
Member

bungle commented Sep 1, 2022

While I think on demand is good, it does not solve the double problem. A lot of code will still be double and executed double when you have both types of clients connected. I think common code should be common and executed only once.

That is:

  1. there is no reason both wrpc and old do export
  2. there is no reason both calculate hashes
  3. there is no reason both json encode config
  4. there is no reason both compress the config

@StarlightIbuki
Copy link
Contributor Author

While I think on demand is good, it does not solve the double problem. A lot of code will still be double and executed double when you have both types of clients connected. I think common code should be common and executed only once.

That would be hard. We've discussed that earlier: the format of the 2 protocols are very different, and the most we could do here is to share the config table exported. I think that was the reason why we did not do it that time.

@bungle
Copy link
Member

bungle commented Sep 1, 2022

That would be hard. We've discussed that earlier: the format of the 2 protocols are very different, and the most we could do here is to share the config table exported. I think that was the reason why we did not do it that time.

Last time I checked it was not that hard. The wRPC is no different, it just wraps the above in its own protocol. What I said above is common, AFAIK. It means though a lot of code needs to be moved.

@bungle
Copy link
Member

bungle commented Sep 1, 2022

While I think on demand is good, it does not solve the double problem. A lot of code will still be double and executed double when you have both types of clients connected. I think common code should be common and executed only once.

That is:

  1. there is no reason both wrpc and old do export
  2. there is no reason both calculate hashes
  3. there is no reason both json encode config
  4. there is no reason both compress the config

I stand corrected, only the 1. and 2. are easily shareable. The 3. and 4. are not.

kong/clustering/control_plane.lua Outdated Show resolved Hide resolved
kong/clustering/wrpc_control_plane.lua Outdated Show resolved Hide resolved
@fffonion
Copy link
Contributor

fffonion commented Sep 1, 2022

@suika-kong if remove the duplicate code is too much work to do, let's not it do it in this PR (but sure to have a ticket to record it). As CP mostly takes to DP in either of the protocol, making it "on demand" should solve the CP 2x latency issue in most cases, let's focus on ship that and also solve the caching issue Aapo mentioned in 8840.


elseif err ~= "timeout" then
if err ~= "timeout" then
Copy link
Contributor

@flrgh flrgh Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ok is truth-y but self.clients is empty, this branch will be followed because err will be nil and generate a confusing error log entry (I think).

Maybe it would be more straightforward to put the self.clients check into push_config?

function _M:push_config()
  if not next(self.clients) then
    return true
  end

  local start = ngx_now()

  local payload, err = self:export_deflated_reconfigure_payload()
  if not payload then
  -- ...
end

...otherwise I think this needs to be changed to if err and err ~= "timeout" then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, maybe the self.clients check could be pushed to the event handler? This would be simple for both control_plane and wrpc_control_plane:

  kong.worker_events.register(function(_)
    if next(self.clients) and push_config_semaphore:count() <= 0 then
      -- the following line always executes immediately after the `if` check
      -- because `:count` will never yield, end result is that the semaphore
      -- count is guaranteed to not exceed 1
      push_config_semaphore:post()
    end
  end, "clustering", "push_config")

Copy link
Member

@dndx dndx Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flrgh The first approach has some issues, if we return true inside push_config, then the timer of push_config_loop is still reset which means that loop will wait for db_update_frequency before even pushing out another updated config.

The second issue might work (I did not spot immediate issue with it), but it seems to be a bit more dangerous as there are a lot of asynchronous going on between worker events and the semaphore. It's a bit harder to argue the correctness.

My thought:

If we change the push_config_loop function to do the following:

while not exiting() do
    local ok, err = push_config_semaphore:wait(1)
    if exiting() then
      return
    end

    if ok then
      if next(self.clients) do
        ok, err = pcall(self.push_config, self)
        ...

Does this also work? It seems to be simpler than the current approach and does not affects the coalescing timer. Basically, we consume the semaphore, but pretends nothing happened. The current code does the same thing, it just seems to be more complex on the error handling part.

@@ -305,7 +308,10 @@ local function push_config_loop(premature, self, push_config_semaphore, delay)
if exiting() then
return
end
if ok then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

dndx and others added 2 commits September 1, 2022 12:21
Co-authored-by: Aapo Talvensaari <aapo.talvensaari@gmail.com>
Co-authored-by: Michael Martin <flrgh@protonmail.com>
if ok then
-- we still need to wait for config for if no clients another run, just in case new client is in
if ok and next(self.clients) then

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Co-authored-by: Aapo Talvensaari <aapo.talvensaari@gmail.com>
@@ -540,7 +547,9 @@ local function push_config_loop(premature, self, push_config_semaphore, delay)
if exiting() then
return
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
end
end

@StarlightIbuki
Copy link
Contributor Author

Closing for @dndx 's PR. #9368

@bungle bungle deleted the feat/hybrid_cfg_export_on_demand branch October 14, 2022 06:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants