Enable SPIRE Agent LRU Cache #4224

Closed · rturner3 opened this issue Jun 5, 2023 · 7 comments · Fixed by #5383

Labels: priority/backlog (Issue is approved and in the backlog)


rturner3 commented Jun 5, 2023

Once the changes introduced by #3181 are validated, we should remove the feature flag surrounding the new cache design and make it the default behavior.

rturner3 added the priority/backlog label Jun 5, 2023

prasadborole1 (Contributor) commented:

We've been running the experimental LRU cache feature in our production environment for the past few months. We've also tried different cache sizes, and it's working as expected, with reduced CPU and memory usage.
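
For reference, a minimal agent.conf fragment along these lines is roughly what such a deployment might look like, assuming (as #3181 suggests) that setting the experimental x509_svid_cache_max_size field is what opts the agent into the LRU cache. All surrounding values are illustrative placeholders, not taken from this issue:

```hcl
agent {
  # Illustrative placeholder settings; only the experimental block is
  # relevant to this issue.
  data_dir       = "/opt/spire/data/agent"
  log_level      = "INFO"
  server_address = "spire-server.example.org"
  server_port    = "8081"
  trust_domain   = "example.org"

  experimental {
    # Soft limit on the number of X509-SVIDs kept in the agent's in-memory
    # cache. 1000 is only an example; tune it to the number of workloads
    # expected on the node.
    x509_svid_cache_max_size = 1000
  }
}
```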

azdagron added this to the 1.9.0 milestone Oct 5, 2023
rturner3 self-assigned this Oct 5, 2023

rturner3 commented Oct 5, 2023

The maintainers discussed how to safely roll this out. Here is the rough action plan, with more detail about the release cycles where each change will land:

In v1.9.0:

  • Wire up the LRU cache to be the new default cache with a reasonable default value for the maximum size soft limit (TBD, probably <= 1000) such that most users see no direct impact from this change. Retain the existing experimental config field x509_svid_cache_max_size to give users a way to tune the LRU behavior if truly needed.
  • Introduce an experimental configuration field to switch back from the LRU cache to the current cache behavior in case there are any unexpected issues (see the config sketch below).

We will allow this behavior change to bake in the v1.9.0 minor release series to ensure agent stability during this transition.
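
To make the two v1.9.0 items above concrete, here is a sketch of how they might surface in agent.conf. The x509_svid_cache_max_size name comes from this issue; the name of the switch-back field is purely hypothetical, since it had not been chosen at the time of this comment:

```hcl
agent {
  experimental {
    # Optional override of the default soft limit on cached X509-SVIDs
    # (default TBD, probably <= 1000 per the plan above).
    x509_svid_cache_max_size = 500

    # Hypothetical field name: the escape hatch from the second bullet,
    # switching the agent back to the pre-LRU cache behavior. Shown here
    # only for illustration; in practice you would set one knob or the
    # other, not both.
    use_legacy_cache = true
  }
}
```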

In v1.10.0:

  • Deprecate the x509_svid_cache_max_size configurable and emit a WARN level log on startup when it is set
  • Deprecate the experimental configurable to switch back to the current cache behavior and emit a WARN level log on startup when it is set

In v1.11.0:

  • Remove the experimental configurable to disable the LRU cache behavior and promote the x509_svid_cache_max_size configurable to be a stable setting
  • Remove the code for the existing cache implementation in favor of the LRU cache implementation

rturner3 assigned prasadborole1 and unassigned rturner3 Jan 3, 2024

rturner3 commented Jan 3, 2024

@prasadborole1 is planning to take on the changes targeted for v1.9.0.


edwbuck commented Apr 19, 2024

@azdagron @rturner3 This seems like an old backlog entry that is partially, or perhaps even fully, completed. Can one of you look into it in detail to help clean up the backlog (or indicate what is needed in a PR to get this into 1.10.0)?

amoore877 (Member) commented:

As part of this, https://github.com/spiffe/spire/blob/main/doc/images/cache_mgr.png probably also needs updating. I'll open a separate issue about SPIRE architecture diagramming in general: #5180

amoore877 (Member) commented:

Ref: #5383 (comment)

Reversal of course: the maintainers would like to make the setting a stable part of spire-agent.

amartinezfayo (Member) commented:

We have decided that in v1.11.0 instead of removing the x509_svid_cache_max_size configurable, we will move it out of the experimental section, promoting it to be a stable setting.
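
Assuming the field keeps its name when it is promoted out of the experimental section, the change in agent.conf would presumably look something like this sketch:

```hcl
agent {
  # Up to v1.10.x the setting lives under the experimental block:
  # experimental {
  #   x509_svid_cache_max_size = 1000
  # }

  # From v1.11.0 on, per the decision above, it becomes a stable,
  # top-level agent setting:
  x509_svid_cache_max_size = 1000
}
```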
