Feature request: TTL cache #1751

Solverj · 2019-09-12T05:57:34Z

Wanted feature

TTL cache for OPA (the only thing I see as missing to make OPA fully conform to ABAC (PIP)).

I was thinking along the lines of:

1. Policy polices a request
2. The attributes needed for the JWT content are checked in the cache first; if the cache has the attribute, retrieve it and go to step 4, else go to step 3
3. http.send and cache the response
4. Police the request
5. End

A possible implementation could use DATA (not updatable by PUT calls) as the cache, holding a plain JSON-structured map, with a background thread handling TTLs and deletion.

And of course the above would only be enabled through, e.g., opa run -s --cache-enabled --cache-ttl=3600 (in seconds) or something like that.
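
The lookup flow and TTL map described in this comment can be sketched roughly as follows. This is a minimal illustration in Python, not OPA code: the `TTLCache` class, its method names, and the injectable `clock` are all assumptions made for the example. It expires entries lazily on lookup, and `purge_expired` stands in for the work the proposed background thread would do.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical in-memory map whose entries expire ttl_seconds after storage."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the external call
        value = fetch()  # miss or expired: call out (http.send in the proposal)
        self._entries[key] = (now + self._ttl, value)
        return value

    def purge_expired(self) -> int:
        """Drop expired entries; a background thread could call this periodically."""
        now = self._clock()
        stale = [k for k, (expires, _) in self._entries.items() if expires <= now]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

A caller would key the cache on the JWT attribute in question and pass the external lookup as `fetch`, so the external resource is only contacted on a miss or after the TTL has elapsed.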

Use-case

A REST service handles a lot of customers, and each customer is divided into sub-users stationed in various locations. Each of these sub-divided clients needs fine-grained access control because of different payment plans for different data from the REST service. Taken together, the attribute data is several GB in size and can't be held in memory, so OPA instead uses a TTL cache for the attributes it needs, keyed by the specific JWT attribute. If the cache doesn't have the attributes needed, OPA issues an http.send to the external resource and stores the response in the cache, keyed by the JWT attribute used. Now OPA fully supports the ABAC paradigm.

tsandall · 2019-09-12T12:40:46Z

We've been talking about improving http.send to cache across queries. I just realized we did not have an issue filed to track that. This issue is related to it.

Ref #1753

krotscheck · 2020-03-17T17:28:26Z

A syntax I'm thinking of implementing in a plugin; open to comments.

default myval = 'foo'

myval = v {
    cache.has('key')
    v = cache.get('key')
}

myval = v {
    not cache.has('key')
    // do expensive things.
    cache.put('key', value, ttl)
    v = cache.get('key')
}

tsandall · 2020-03-17T19:16:20Z

hey @krotscheck, here are some thoughts.

Implementing this kind of caching as a set of built-in functions (cache.has, cache.put, etc.) would be problematic because it means that statements in the rule body have side effects, and therefore the order of execution matters.

If // do something expensive is a call-out to an external system via http.send or some other custom built-in function, the caching could be implemented inside the built-in function. For example:

myval = v {
  response := http.send({
     "method": "get",
     "url": "https://example.com",
     "ttl": "5m"})  # proposed parameter: how long the response may be cached
  v := response.body.some_value
}

If a more general-purpose caching mechanism is required (e.g., perhaps there's some compute-intensive operation happening on the result of http.send) then we could explore something broader, such as a declaration that caches the value of a rule:

cache myval = 5*60  # "cache the value of myval for no more than 5 minutes"

myval = v {
   # do expensive thing to get 'v'
}

Do you think the first approach would work for your use cases?

krotscheck · 2020-03-17T21:40:17Z

I think that'd work - as long as the cache can (also?/optionally?) be somehow keyed to the input. For instance, if I use a JWT as an input, it's unlikely that the policy response - no matter how expensive its calculation is - will change for the lifetime of the JWT. Any suggestions on that?
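
One way to picture a cache keyed to the input and bounded by the token's lifetime is sketched below. This is an illustrative Python sketch under stated assumptions, not something OPA provides: `DecisionCache`, `jwt_expiry`, and the `evaluate` callback are hypothetical names, and the token's `exp` claim is read without signature verification, so it assumes the token has already been validated elsewhere.

```python
import base64
import json
import time
from typing import Any, Callable, Dict, Tuple


def jwt_expiry(token: str) -> float:
    """Read the exp claim from a JWT payload WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return float(json.loads(base64.urlsafe_b64decode(padded))["exp"])


class DecisionCache:
    """Hypothetical cache of policy decisions, keyed by the raw token."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def decide(self, token: str, evaluate: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(token)
        if entry is not None and entry[0] > now:
            return entry[1]  # decision is reused until the token itself expires
        decision = evaluate(token)  # the expensive policy evaluation
        self._entries[token] = (jwt_expiry(token), decision)
        return decision
```

Tying the entry's expiry to `exp` means a cached decision never outlives the token that produced it, which matches the assumption that the answer is stable for the lifetime of the JWT.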

tsandall · 2020-03-18T15:13:52Z

> I think that'd work - as long as the cache can (also?/optionally?) be somehow keyed to the input.

Yes, this would be up to the built-in function. We actually have some built-in functions whose outputs are cached for the duration of the top-level policy query (e.g., time.now_ns() and http.send work this way). We do this to ensure that calls are deterministic (they return the same output given the same input). E.g., you would not want multiple time.now_ns() calls to return different times if invoked multiple times inside the policy. For time.now_ns() it's trivial because there are no parameters. For http.send we just use ALL of the input parameters as the cache "key".

If the // expensive thing is implemented as some custom built-in function, it's up to that built-in function's implementation to do the right thing. Today we don't have a framework for caching across policy queries but it's something we're interested in (#1753).
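
The per-query behaviour described above, where all input parameters together form the cache key, can be illustrated with a small Python sketch. It is not OPA's implementation: `QueryScopedCache` and its `call` method are hypothetical, and canonical JSON is just one assumed way of turning the parameters into a key.

```python
import json
from typing import Any, Callable, Dict


class QueryScopedCache:
    """Hypothetical memo table for one top-level query; discard it afterwards."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def call(self, name: str, params: Dict[str, Any],
             impl: Callable[[Dict[str, Any]], Any]) -> Any:
        # Canonical JSON, so the order of keys in params does not change the key.
        key = name + ":" + json.dumps(params, sort_keys=True)
        if key not in self._results:
            self._results[key] = impl(params)  # first call in this query: do the work
        return self._results[key]  # repeat calls return the identical result
```

Creating one such object per query gives deterministic repeat calls within that query, while a cache that should survive across queries would need a longer-lived store plus an expiry rule.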

tsandall · 2020-08-06T21:42:35Z

Now that http.send supports caching across queries (#1753) I think we can close this. Can revisit other caching options in the future if needed.

tsandall self-assigned this Sep 12, 2019

tsandall added enhancement feature-request labels Sep 12, 2019

tsandall closed this as completed Aug 6, 2020
