Backport of Allocation API: fix "no path to region" errors for non-global regions into release/1.9.x #24682
Backport
This PR is auto-generated from #24644 to be assessed for backporting due to the inclusion of the label backport/1.9.x.
🚨 The person who merged in the original PR is: @tgross
This person should manually cherry-pick the original PR into a new backport PR, and close this one when the manual backport PR is merged in.
The below text is copied from the body of the original PR.
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients (as in the autoscaler) that mutation would happen multiple times and the address would be incorrectly parsed.
When making `alloc logs`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make an HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client.
This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster's region is not "global" and requests are sent to a non-leader.
Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed.
Fixes: #24635
Fixes: #24609
Ref: #16872
Ref: #23785
Ref: 4d3b75d
Ref: https://hashicorp.atlassian.net/browse/NET-11858
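For readers unfamiliar with the failure mode described above, here is a minimal Go sketch of how copying a cached, privately parsed URL can cause address parsing to be skipped for a derived client. This is not Nomad's actual code: the `Config`, `Client`, `NewClient`, `nodeClientBuggy`, and `nodeClientFixed` names, the `parsed` field, and the addresses are all hypothetical stand-ins. It models only the "parsing is skipped" step, not how that step leads to the region being set to `"global"`.

```go
package main

import (
	"fmt"
	"net/url"
)

// Config is a hypothetical stand-in for an API client configuration.
// It is NOT Nomad's api.Config; the field names are assumptions.
type Config struct {
	Address string
	Region  string
	parsed  *url.URL // private cache of the parsed Address
}

// Client is a hypothetical API client built from a Config.
type Client struct {
	cfg Config
}

// NewClient parses cfg.Address only when no cached URL is present.
func NewClient(cfg Config) (*Client, error) {
	if cfg.parsed == nil {
		u, err := url.Parse(cfg.Address)
		if err != nil {
			return nil, err
		}
		cfg.parsed = u
	}
	return &Client{cfg: cfg}, nil
}

// Host reports the host the client would actually talk to.
func (c *Client) Host() string { return c.cfg.parsed.Host }

// nodeClientBuggy copies the whole config, so the cached URL comes along
// and the new Address is never parsed.
func nodeClientBuggy(c *Client, addr string) (*Client, error) {
	cfg := c.cfg // struct copy includes the private parsed URL
	cfg.Address = addr
	return NewClient(cfg)
}

// nodeClientFixed copies only the public fields, so NewClient re-parses
// the new Address.
func nodeClientFixed(c *Client, addr string) (*Client, error) {
	cfg := Config{Address: addr, Region: c.cfg.Region}
	return NewClient(cfg)
}

func main() {
	server, err := NewClient(Config{Address: "http://10.0.0.1:4646", Region: "example"})
	if err != nil {
		panic(err)
	}
	buggy, _ := nodeClientBuggy(server, "http://10.0.0.2:4646")
	fixed, _ := nodeClientFixed(server, "http://10.0.0.2:4646")

	fmt.Println("buggy:", buggy.Host()) // still the original server's host
	fmt.Println("fixed:", fixed.Host()) // the node address that was requested
}
```

In this sketch the fix is to leave the cached field out of the copy so the constructor's normal parsing path runs again, which mirrors the approach described above without changing any public behavior.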
Testing & Reproduction steps
To reproduce, stand up a cluster with `region = "example"` with at least one client, in an environment where you have line-of-sight to all nodes (ex. local development environment should work fine). Deploy a job to that client and run `nomad alloc logs :alloc_id`.
Overview of commits