Validate leader endpoint URL #651

Merged
merged 4 commits into from
Jun 26, 2024

DavidePrincipi
Member

Fail fast if the leader FQDN does not work for any reason.

The most common failure case is a missing DNS record: the URL of the leader is displayed to help diagnose the issue.

In some secondary cases, the leader FQDN is valid and validation succeeds, but the VPN hostname cannot be resolved or does not work. For these cases:

  1. Ignore the ns-wireguard Firewalld service creation error. The service may already exist because it was created by a previous join-node attempt.
  2. Check whether the VPN hostname can be resolved, and abort join-cluster if it cannot. A new join attempt is then possible once the node has been removed from the cluster.
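
The hostname check in step 2 could look roughly like the sketch below. It is not the code from this PR: the function names, the injectable `resolver` parameter, and the exit status `3` are assumptions made for illustration. Only `socket.getaddrinfo` from the Python standard library is used.

```python
import socket
import sys


def hostname_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address.

    `resolver` is a hypothetical injection point that makes the check
    testable without touching real DNS; it defaults to the system resolver.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError; UnicodeError covers
        # hostnames that cannot be IDNA-encoded.
        return False


def require_resolvable(hostname, resolver=socket.getaddrinfo):
    """Abort with a non-zero exit status if `hostname` does not resolve."""
    if not hostname_resolves(hostname, resolver):
        print(f"Cannot resolve VPN hostname: {hostname}", file=sys.stderr)
        sys.exit(3)  # assumed exit code, not taken from the PR
```

Aborting before any cluster state is changed keeps the failure cheap: the operator fixes DNS and retries instead of cleaning up a half-joined node.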

Refs NethServer/dev#6958

The leader endpoint host address must be resolvable. Ensure it is so
before calling join-node.

If the API URL is unreachable, fail fast in the validation phase:

- Send a probe HEAD request to a well-known URL to check the connection
- Reduce the retries of the probe task, to speed up the existing check

If the ns-wireguard service was already created by a previous join
attempt, ignore the creation failure.
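
As a rough illustration of the probe HEAD request described in the commit messages above, here is a minimal sketch. The PR itself calls `requests.head`; this version swaps in the standard library's `urllib.request` so that it has no third-party dependency. The function name `probe_endpoint`, the `opener` parameter, and the 5-second default timeout are assumptions for illustration.

```python
import ssl
import urllib.error
import urllib.request


def probe_endpoint(url, tls_verify=True, timeout=5.0, opener=None):
    """Send a single HEAD request and return (reachable, reason).

    `opener` is a hypothetical injection point for tests; by default
    urllib.request.urlopen is used.
    """
    request = urllib.request.Request(url, method="HEAD")
    context = None
    if url.startswith("https://"):
        context = ssl.create_default_context()
        if not tls_verify:
            # check_hostname must be disabled before verify_mode is relaxed.
            context.check_hostname = False
            context.verify_mode = ssl.CERT_NONE
    open_url = opener or urllib.request.urlopen
    try:
        with open_url(request, timeout=timeout, context=context):
            pass
    except urllib.error.HTTPError as ex:
        # The server answered, so the endpoint is reachable even though
        # the status code signals an error.
        return True, f"HTTP {ex.code}"
    except (OSError, ValueError) as ex:
        # URLError (DNS failure, refused connection) and timeouts are
        # OSError subclasses; ValueError covers malformed URLs.
        return False, str(ex)
    return True, "ok"
```

A caller in the validation phase would report the returned reason together with the leader URL and exit early when the first element is `False`.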
Contributor

@andre8244 andre8244 left a comment

UI part LGTM, just a note about Weblate translations

core/ui/public/i18n/it/translation.json
sys.exit(2)

try:
    requests.head(endpoint_url, verify=request['tls_verify'], timeout=8.0)
Member

8 seconds is a very long time to wait for an HTTP response; I'd lower it to 5 or less.

@DavidePrincipi DavidePrincipi added the testing Start test suite label Jun 26, 2024
@DavidePrincipi DavidePrincipi merged commit 2fcdaf6 into main Jun 26, 2024
8 checks passed
@DavidePrincipi DavidePrincipi deleted the validate-leader-endpoint branch June 26, 2024 07:55
3 participants