Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

acl: reduce complexity of token resolution process with alternative singleflighting #5480

Merged
merged 2 commits into from
Mar 14, 2019

Conversation

rboyer
Copy link
Member

@rboyer rboyer commented Mar 12, 2019

Switches acl resolution to use golang.org/x/sync/singleflight. For the identity/legacy lookups this is a drop-in replacement with the same overall approach to request coalescing.

For policies this is technically a change in behavior, but when considered holistically is approximately performance neutral (with the benefit of less code).

There are two goals with this blob of code (speaking specifically of policy resolution here):

  1. Minimize cross-DC requests.
  2. Minimize client-to-server LAN requests.

The previous iteration of this code was optimizing for the case of many possibly different tokens being resolved concurrently that have a significant overlap in linked policies such that deduplication would be worth the complexity. While this is laudable there are some things to consider that can help to adjust expectations:

  1. For v1.4+ policies are always replicated, and once a single policy shows up in a secondary DC the replicated data is considered authoritative for requests made in that DC. This means that our earlier concerns about minimizing cross-DC requests are irrelevant because there will be no cross-DC policy reads that occur.

  2. For Server nodes the in-memory ACL policy cache is capped at zero, meaning it has no caching. Only Client nodes run with a cache. This means that instead of having an entire DC's worth of tokens (what a Server might see) that can have policy resolutions coalesced these nodes will only ever be seeing node-local token resolutions. In a reasonable worst-case scenario where a scheduler like Kubernetes has "filled" a node with Connect services, even that will only schedule ~100 connect services per node. If every service has a unique token there will only be 100 tokens to coalesce and even then those requests have to occur concurrently AND be hitting an empty consul cache.

Instead of seeing a great coalescing opportunity for cutting down on redundant Policy resolutions, in practice it's far more likely given node densities that you'd see requests for the same token concurrently than you would for two tokens sharing a policy concurrently (to a degree that would warrant the overhead of the current variation of singleflighting.

Given that, this patch switches the Policy resolution process to only singleflight by requesting token (but keeps the cache as by-policy).

…ingleflighting

Switches acl resolution to use golang.org/x/sync/singleflight. For the
identity/legacy lookups this is a drop-in replacement with the same
overall approach to request coalescing.

For policies this is technically a change in behavior, but when
considered holistically is approximately performance neutral (with the
benefit of less code).

There are two goals with this blob of code (speaking specifically of
policy resolution here):

  1) Minimize cross-DC requests.
  2) Minimize client-to-server LAN requests.

The previous iteration of this code was optimizing for the case of many
possibly different tokens being resolved concurrently that have a
significant overlap in linked policies such that deduplication would be
worth the complexity. While this is laudable there are some things to
consider that can help to adjust expectations:

  1) For v1.4+ policies are always replicated, and once a single policy
  shows up in a secondary DC the replicated data is considered
  authoritative for requests made in that DC. This means that our
  earlier concerns about minimizing cross-DC requests are irrelevant
  because there will be no cross-DC policy reads that occur.

  2) For Server nodes the in-memory ACL policy cache is capped at zero,
  meaning it has no caching. Only Client nodes run with a cache. This
  means that instead of having an entire DC's worth of tokens (what a
  Server might see) that can have policy resolutions coalesced these
  nodes will only ever be seeing node-local token resolutions. In a
  reasonable worst-case scenario where a scheduler like Kubernetes has
  "filled" a node with Connect services, even that will only schedule
  ~100 connect services per node. If every service has a unique token
  there will only be 100 tokens to coalesce and even then those requests
  have to occur concurrently AND be hitting an empty consul cache.

Instead of seeing a great coalescing opportunity for cutting down on
redundant Policy resolutions, in practice it's far more likely given
node densities that you'd see requests for the same token concurrently
than you would for two tokens sharing a policy concurrently (to a degree
that would warrant the overhead of the current variation of
singleflighting.

Given that, this patch switches the Policy resolution process to only
singleflight by requesting token (but keeps the cache as by-policy).
@rboyer rboyer added the theme/acls ACL and token generation label Mar 12, 2019
@rboyer rboyer requested a review from a team March 12, 2019 19:50
agent/consul/acl_test.go Outdated Show resolved Hide resolved
Copy link
Member

@mkeeler mkeeler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great and the implementation is much simpler than before.

GitHub PR stats is misleading. 225 additions and 231 deletions. We deleted much more custom code but then pulled in the singleflight module which was 112 of those additions.

There are a few questions inline. All very minor things.

agent/consul/acl.go Outdated Show resolved Hide resolved
agent/consul/acl.go Outdated Show resolved Hide resolved
agent/consul/acl.go Outdated Show resolved Hide resolved
agent/consul/acl.go Outdated Show resolved Hide resolved
agent/consul/acl.go Outdated Show resolved Hide resolved
agent/consul/acl_test.go Outdated Show resolved Hide resolved
@rboyer rboyer requested a review from mkeeler March 12, 2019 22:29
Copy link
Member

@mkeeler mkeeler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@rboyer rboyer merged commit cd96af4 into master Mar 14, 2019
@rboyer rboyer deleted the f-acl-singleflight branch March 14, 2019 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/acls ACL and token generation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants