Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add locket client keepalive time and timeout to jobs #675

Closed
wants to merge 2 commits into from
Closed

Add locket client keepalive time and timeout to jobs #675

wants to merge 2 commits into from

Conversation

klapkov
Copy link
Contributor

@klapkov klapkov commented Jan 5, 2023

As discussed in this issue #627, with this PR we add a keepalive timeout to the locket client, which will improve its performance. The "Time" property of the gRPC client is also added to the locket client config, so operators can configure both of them depending on their needs. Here is the PR in the locket repo: cloudfoundry/locket#12

Copy link

@PlamenDoychev PlamenDoychev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally looks good to me.
Approved

@PlamenDoychev
Copy link

Correlates to cloudfoundry/locket#12

Comment on lines +91 to +97
diego.auctioneer.locket.client_keepalive_time:
description: "Period in seconds after which the locket gRPC client sends keepalive ping requests to the locket server it is connected to."
default: 10
diego.auctioneer.locket.client_keepalive_timeout:
description: "Timeout in seconds to receive a response to the keepalive ping. If a response is not received within this time, the locket client will reconnect to another server."
default: 20

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means it sends a keepalive every 10 seconds right? And that after 20 seconds it times out?

If that is the case, can you change the timeout to be slightly more than 2 times larger than the keep alive? That way if the first keep alive is missed, the second one has a chance to get there in time. Maybe 22 seconds?

Can you also add a template check (and test) for making sure the timeout is greater than the keepalive time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this makes sense. Im not sure where are the job templates being tested. In the bosh docs: https://bosh.io/docs/job-templates/ it is described that we need a bosh-template ruby gem. Currently we do not have something like this, should I create one for every job that we need to test ?

@klapkov klapkov requested a review from ameowlia January 19, 2023 06:42
@geofffranks
Copy link
Contributor

FYI, CI seems unhappy because our github-pr resource is confused about the remote branch also being develop. Will see if there's an easy fix for this

@klapkov
Copy link
Contributor Author

klapkov commented Jan 24, 2023

Thanks @geofffranks, should I just move the changes to a new branch and make a new PR ? It wouldn't take much time.

@geofffranks
Copy link
Contributor

yeah, that might be easiest in the short term @klapkov.

@geofffranks
Copy link
Contributor

you might be able to edit this PR to pull from the new branch, so we can keep the existing discussion history.

@klapkov klapkov closed this by deleting the head repository Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants