Entity Slicing using Data Tries #74
Signed-off-by: oflatt <oflatt@gmail.com>
Code is now available: cedar-policy/cedar#1102
Overall looks good, and I think this is a good idea. Some comments, particularly about the subset of Cedar that this RFC works with.
Co-authored-by: Craig Disselkoen <craigdissel@gmail.com> Signed-off-by: oflatt <oflatt@gmail.com>
What would it take to answer in Cedar schema?
in the following proposal:
Response should be able to handle sub element of type: |
Also re-emphasizing that:
for :
is a cumbersome example. One would prefer using
with policy shaped as:
|
I want to see an example with a
or
|
I want to see an example with an |
As currently proposed, high-scale systems would struggle to use this approach. It has three drawbacks. The first is the need for a cache of manifests, which is difficult to maintain at scale when policies change frequently and the volume of policies could be huge. The second is that the manifests are coarse-grained, for example requiring the loading of all ancestors of an entity, which can be impractical when the ancestor set is unbounded. The third is that it requires extra work on the client side to piece together the instructions in the manifest with the actual principal/resource in the request in order to generate a concrete plan for data retrieval.
For the situations that I'm personally most familiar with, I'd prefer a different approach. I'd rather give Cedar a policy slice and the parameters of a specific request (principal, action, resource), and have Cedar emit a document that tells me exactly which data to load for the request. In addition, this document would be in a well-defined interoperable representation like JSON that we can use as a standard interface to a Policy Information Point (PIP). We give this document to the PIP, and it returns JSON-formatted entity data. An example manifest may look like this:
If policies have multiple levels of references, like
To tie it all together, I would definitely use an API that returned an enum of either (1) the concrete authorization result, or (2) the data necessary to arrive at a concrete result. I would loop on this API and gather data until a concrete result is achieved or the max level is exceeded.
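The loop described in this comment can be sketched as follows. This is only an illustration under stated assumptions: `Step`, `Decision`, `LoopError`, `authorize_with_loading`, and the two callbacks are hypothetical names, not part of the Cedar API, and entities are modelled as plain strings and string maps.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for an entity identifier such as `User::"alice"`.
pub type EntityId = String;
/// Hypothetical stand-in for loaded entity data (attribute name -> value).
pub type EntityData = HashMap<String, String>;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Outcome of one authorization attempt: either a concrete answer, or the
/// set of entities that must be loaded before an answer can be reached.
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    Concrete(Decision),
    NeedData(BTreeSet<EntityId>),
}

#[derive(Debug, PartialEq, Eq)]
pub enum LoopError {
    MaxLevelExceeded,
}

/// Repeatedly call `attempt` with everything known so far. Whenever it asks
/// for more data, fetch it with `load` and try again, performing at most
/// `max_level` rounds of loading.
pub fn authorize_with_loading<A, L>(
    mut attempt: A,
    mut load: L,
    max_level: usize,
) -> Result<Decision, LoopError>
where
    A: FnMut(&HashMap<EntityId, EntityData>) -> Step,
    L: FnMut(&BTreeSet<EntityId>) -> HashMap<EntityId, EntityData>,
{
    let mut known: HashMap<EntityId, EntityData> = HashMap::new();
    let mut level = 0;
    loop {
        match attempt(&known) {
            Step::Concrete(decision) => return Ok(decision),
            Step::NeedData(ids) => {
                if level == max_level {
                    return Err(LoopError::MaxLevelExceeded);
                }
                known.extend(load(&ids));
                level += 1;
            }
        }
    }
}
```

In this sketch the caller never distinguishes between the principal, the resource, or an entity referenced by a policy: it loads whatever identifiers come back and tries again.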
That is a better formulation of what I tried to propose with #74 (comment), @D-McAdams. Going with a Cedar-formatted answer is the way. Now:
Therefore the slice answer should ask to resolve the whole equation. Having a first response that only solves Level 1 of the condition will make the next request for a slice more complex to evaluate. And in any case, the
Nice write up! I think the overall idea makes sense, although I do wonder whether this should be in the Cedar library vs. in a separate tool. Will take a look at the implementation PR next.
As written, entity manifests in this RFC do not support loading only parts of a Cedar Set, or only some of the ancestors in the entity hierarchy. This is because sets are loaded on the leaves of the access trie, with no way to specify which elements are requested.
To support this feature, we recommend that we take a constraint-based approach. Constraints would be attached to nodes in the access trie, specifying which elements of sets or the parent hierarchy are needed. Constraints form a small query language, so this would increase the complexity of the entity manifest.
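To make the limitation concrete, here is a minimal sketch of an access trie. It is an assumption for illustration only: the `AccessTrie` type, its fields, and its methods are simplified, hypothetical names, and the implementation in cedar-policy/cedar#1102 may be structured differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified access trie node. Each edge is an attribute
/// name; a path from the root records one chain of attribute accesses.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct AccessTrie {
    /// Attributes accessed on the value at this node.
    pub children: BTreeMap<String, AccessTrie>,
    /// All-or-nothing flag: either every ancestor of the entity at this
    /// node is loaded, or none is.
    pub ancestors_required: bool,
}

impl AccessTrie {
    /// Record an access path (attribute names below the root) and return
    /// the node at the end of that path.
    pub fn insert_path(&mut self, path: &[&str]) -> &mut AccessTrie {
        let mut node = self;
        for attr in path {
            node = node.children.entry(attr.to_string()).or_default();
        }
        node
    }

    /// Collect every root-to-leaf path. A set-valued attribute shows up
    /// only as a leaf path, so the whole set has to be loaded.
    pub fn leaf_paths(&self) -> Vec<Vec<String>> {
        if self.children.is_empty() {
            return vec![Vec::new()];
        }
        let mut paths = Vec::new();
        for (attr, child) in &self.children {
            for mut rest in child.leaf_paths() {
                rest.insert(0, attr.clone());
                paths.push(rest);
            }
        }
        paths
    }
}
```

In this shape a node carries no information about which set elements or which ancestors are needed. A constraint-based design would presumably add such a field to each node, which is where the extra complexity mentioned above would come from.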
RFC 76 provides some additional ideas about how to support partial loading of ancestors. Do you still think the constraint-based approach is the right one?
Good question. I'm not a big fan of the proposed @ancestors annotation. I think a more automatic solution like constraint annotations would be better.
As for partial evaluation, it would still make entity manifests more precise.
Action::"CreateDocument"
[
    context.is_authenticated
]
Since we expect entity manifests to be written and consumed by machines (right?), it might make more sense to use a structured JSON format. Our experience with writing parsers for human-readable formats that support comments has shown that they are easy to mess up 😅
I've implemented both a JSON format and a human-readable format. The human-readable format reuses the parser for Cedar expressions, so it's not super error-prone.
Personally, I've found the human-readable format useful for understanding what's going on. For example, a human can diff a new entity manifest with an old one to understand how a new policy changes the manifest.
A couple of reasons I think the simple approach is preferable and iteration is OK. One is that no matter what we design here, any real-world data gatherer will be looping and iterating. For example, given a policy that refers to
It's going to be much easier for me as a customer of Cedar to say "Here's everything I know, tell me exactly what to load", and then repeat until there's nothing left to load. It's a dirt-simple code path I can run in a loop.
Such an approach also makes entity loading consistent, regardless of whether the entities are the principal/resource, some other entity referenced in a policy, or some other entity literal. As a data gatherer, I simply don't care. Cedar tells me which entities to load, and I just go load them. This makes it dirt-simple to write a generic data loader.
I agree that iterating works for most if not all DBs. That being said, this can be done via one query in a relational DB with |
Also, while I agree that getting data out of the PIP might need round trips, getting it out of the PDP's policy store, as here, does not. The information needed is stored in single dedicated objects (policies), from which what is missing can be extrapolated.
I think such a document format merits an RFC.
IIUC, entities in the |
I generally like this idea as it provides a neat way to do a just-in-time batched retrieval of entity data relevant to a particular type of request.
I'm curious why a similar approach for finding the entities to load given a fully formed PARC wasn't included in the proposal.
pub struct RequestType {
    pub principal: EntityType,
    pub action: EntityUID,
    pub resource: EntityType,
}
Interesting. Why not allow EntityUIDs to be passed for the principal and resource?
The RequestType is used to index the entity manifest. When the manifest is computed, the actual principal and resource are not known (but their types are). Action IDs are known since there is a finite number of them.
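A minimal sketch of that indexing, under stated assumptions: `EntityType` and `EntityUID` below are simplified stand-ins for Cedar's real types, and `EntityManifest`, `AccessPaths`, and `lookup` are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a Cedar entity type, e.g. `User`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct EntityType(pub String);

/// Simplified stand-in for a Cedar entity UID, e.g. `User::"alice"`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct EntityUID {
    pub ty: EntityType,
    pub id: String,
}

/// Key of the manifest: principal and resource *types*, plus a concrete action.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct RequestType {
    pub principal: EntityType,
    pub action: EntityUID,
    pub resource: EntityType,
}

/// Placeholder for the data requirements stored per request type.
pub type AccessPaths = Vec<String>;

/// Hypothetical manifest: computed once from the policies, before any
/// concrete request is seen.
pub struct EntityManifest {
    pub per_request_type: HashMap<RequestType, AccessPaths>,
}

impl EntityManifest {
    /// At request time, derive the key from the concrete request by
    /// keeping only the types of the principal and resource.
    pub fn lookup(
        &self,
        principal: &EntityUID,
        action: &EntityUID,
        resource: &EntityUID,
    ) -> Option<&AccessPaths> {
        let key = RequestType {
            principal: principal.ty.clone(),
            action: action.clone(),
            resource: resource.ty.clone(),
        };
        self.per_request_type.get(&key)
    }
}
```

Two requests whose principals share a type and whose action is the same resolve to the same manifest entry, which is why concrete principal and resource UIDs are not needed when the manifest is built.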
Hi @D-McAdams, sorry for the late reply (I was out for a few weeks). You make some good points about the drawbacks of this proposal that I hope to address.
As for a JSON format, entity manifests map easily to JSON objects, and they look similar to what you have written. Let me know what you think; perhaps I've missed something. In my mind, the entity manifests as proposed in this RFC are the most general form of entity slicing, and other solutions can be implemented on top.
@jeffsec-aws asks:
That's a good question. Cedar schemas and entity manifests look quite similar, but entity manifests represent data paths through multiple types. While I think it's possible to represent an entity manifest as a Cedar schema as you showed, it requires encoding the data somehow. In other words, any encoding using Cedar schemas would abuse the format.
Co-authored-by: Kesha Hietala <khieta@amazon.com> Signed-off-by: oflatt <oflatt@gmail.com>
@patjakdev asks:
That's a good question. Having a fully formed PARC and having all the policies on hand does let you be a little more precise. You can do partial evaluation of the policies, get an entity manifest based on that, then do entity loading. |
@D-McAdams This PR implements an API that is similar to what you have outlined above: Users can compute an entity manifest from a policy slice, then implement an |