Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache ZedTokens for resources using NATS #209

Merged
merged 4 commits into from
Jan 16, 2024

Conversation

jnschaeffer
Copy link
Contributor

@jnschaeffer jnschaeffer commented Jan 14, 2024

Using full consistency when doing permissions checks is slow. In general, this is addressed by using ZedTokens to indicate minimum bounds on freshness when looking up cached data. Something has to keep track of those tokens, either on the client side or server-side.

This PR introduces worker caching of ZedTokens for resources on updates to relationships and updates the query engine to use those tokens when performing permissions checks. When a worker updates a relationship, it persists the ZedToken for all resources directly affected by that update to a NATS KV bucket. NATS KV writes are immediately consistent, so the new ZedToken for that resource is available to all consumers, including permissions-api API frontends. When the query engine performs a permissions check, it checks to see if a ZedToken is available for the resource.

If a ZedToken was found, that ZedToken is used along with the at_least_as_fresh SpiceDB API consistency strategy. If not, or if there was an error accessing NATS, the query engine falls back to the minimize_latency API consistency strategy. If the NATS KV bucket is configured with a TTL at least as high as the quantization interval for SpiceDB, this ensures that by the time the ZedToken is evicted from the cache, all SpiceDB frontends will be updated with data at least as fresh as the last relationship update for a resource. Clients that wish to force an update for a resource (e.g., making role changes immediately available to tenant users) can thus issue a relationship update to permissions-api and get the latest data for that resource.

This PR assumes that the KV bucket used already exists; permissions-api will not attempt to create it. This is because the intention is that the KV bucket has a TTL set to something close to the SpiceDB quantization interval, which permissions-api is not necessarily aware of.

Using full consistency when doing permissions checks is slow. In
general, this is addressed by using ZedTokens to indicate minimum
bounds on freshness when looking up cached data. Something has to keep
track of those tokens, either on the client side or server-side.

This commit introduces worker caching of ZedTokens for resources on
updates to relationships and updates the query engine to use those
tokens when performing permissions checks. When a worker updates a
relationship, it persists the ZedToken for all resources directly
affected by that update to a NATS KV bucket. NATS KV writes are
immediately consistent, so the new ZedToken for that resource is
available to all consumers, including permissions-api API
frontends. When the query engine performs a permissions check, it
checks to see if a ZedToken is available for the resource.

If a ZedToken was found, that ZedToken is used along with the
at_least_as_fresh SpiceDB API consistency strategy. If not, or if
there was an error accessing NATS, the query engine falls back to the
minimize_latency API consistency strategy. If the NATS KV bucket is
configured with a TTL at least as high as the quantization interval
for SpiceDB, this ensures that by the time the ZedToken is evicted
from the cache, all SpiceDB frontends will be updated with data at
least as fresh as the last relationship update for a resource. Clients
that wish to force an update for a resource (e.g., making role changes
immediately available to tenant users) can thus issue a relationship
update to permissions-api and get the latest data for that resource.

This commit assumes that the KV bucket used already exists;
permissions-api will not attempt to create it. This is because the
intention is that the KV bucket has a TTL set to something close to
the SpiceDB quantization interval, which permissions-api is not
necessarily aware of.

Signed-off-by: John Schaeffer <jschaeffer@equinix.com>
@jnschaeffer jnschaeffer requested review from a team as code owners January 14, 2024 21:11
This commit adds the necessary configs to the Helm chart to support
populating a ZedToken cache for permissions-api.

Signed-off-by: John Schaeffer <jschaeffer@equinix.com>
fishnix
fishnix previously approved these changes Jan 16, 2024
Copy link
Contributor

@fishnix fishnix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor stuff, this looks good 👍

}

// If an error happened when creating or updating the token, record it.
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ill be interested to see how often we hit this condition - i expect not a lot since its for an individual relationship

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not often, but I can think of a few cases:

  • NATS is unavailable
  • Something created a bucket entry before this function did
  • Something updated a bucket entry

// updateRelationshipZedTokens updates the NATS KV bucket for ZedTokens, setting the given ZedToken
// as the latest point in time snapshot for every resource in the given list of relationships.
func (e *engine) updateRelationshipZedTokens(ctx context.Context, rels []types.Relationship, zedToken string) {
resourceIDMap := map[string]struct{}{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this over just a slice here since you're just looping over the keys anyways?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relationships don't necessarily need to refer to unique resources (i.e., multiple relationships could be persisted that affect the same tenant). This just ensures that we aren't sending redundant ZedToken updates.

internal/query/zedtokens.go Show resolved Hide resolved
This commit adds NATS creds to the server deployment in the Helm
chart.

Signed-off-by: John Schaeffer <jschaeffer@equinix.com>
mikemrm
mikemrm previously approved these changes Jan 16, 2024
@jnschaeffer jnschaeffer dismissed stale reviews from mikemrm and fishnix via 534a7bf January 16, 2024 15:30
This commit adds tests for determineConsistency.

Signed-off-by: John Schaeffer <jschaeffer@equinix.com>
@jnschaeffer jnschaeffer merged commit 1d6d177 into infratographer:main Jan 16, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants