Proposed Fix to: Loss of Access with lower level ACL (Effective ACL Resource Algorithm) #102

Open
gibsonf1 opened this issue Nov 2, 2021 · 20 comments

gibsonf1 commented Nov 2, 2021

The current Effective ACL Resource Algorithm allows the following outcome, which the users involved would not intend:

teamx has default read/write/control access to a container that has a 5-level container hierarchy under it. teamx wants to collaborate with person1 and creates a non-default read/write ACL on a container 3 levels down. Unless teamx copies the top-level ACL when creating the person1 ACL, teamx will lose access there, because the new ACL becomes the effective ACL and the top-level one is no longer consulted.

Magnitude of the problem, by example: Alice has a pod with 1000 containers and gives Bob default read/write access to the root. Alice then collaborates with 5 other people in various containers, carefully copying Bob's access into each new ACL. At some point Alice would like to remove Bob's access, and by now there are 100 ACLs with copies of it spread across those 1000 containers. What does she do now?

A very simple solution to this avoiding all the copying and systems needed to manage where the copies are etc, is to make a very small revision to the algorithm such that the requesting agent is taken into consideration and only applicable acls are returned:

Effective ACL Resource Algorithm

    To determine the effective ACL resource of a resource, perform the following steps. Returns string (the URI of an ACL Resource).

        Let resource be the resource.
        Let agent be the requesting agent.
        Let aclResource be the ACL resource of resource.
        If resource has an associated aclResource with a representation, and that aclResource applies to the agent, return aclResource.
        Otherwise, repeat the steps using the container resource of resource.

This behaves in exactly the way the original algorithm intended, but it dramatically simplifies the use of ACLs and avoids the many access problems that arise when ACLs are not copied and deleted in line with users' intent.
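
To make the difference concrete, here is a minimal Python sketch of both lookups. It is only an illustration under stated assumptions: the in-memory `ACLS` table, the `.acl` suffix, and the reduction of "applies to the agent" to "names the agent directly" are all hypothetical simplifications, and the `acl:accessTo` versus `acl:default` distinction is ignored.

```python
from posixpath import dirname
from typing import Optional

# Hypothetical store: ACL URL -> agents named in it.
# A resource without an entry has no ACL representation of its own.
ACLS = {
    "/.acl": {"teamx"},
    "/a/b/c/.acl": {"person1"},
}

def acl_url(resource: str) -> str:
    """ACL resource of a resource (the '.acl' suffix is an assumption)."""
    return resource + ".acl"

def container_of(resource: str) -> Optional[str]:
    """Parent container of a resource, or None at the root."""
    if resource == "/":
        return None
    parent = dirname(resource.rstrip("/"))
    return parent if parent.endswith("/") else parent + "/"

def effective_acl_current(resource: Optional[str]) -> Optional[str]:
    """Current algorithm: the nearest ACL with a representation wins."""
    while resource is not None:
        if acl_url(resource) in ACLS:
            return acl_url(resource)
        resource = container_of(resource)
    return None

def effective_acl_proposed(resource: Optional[str], agent: str) -> Optional[str]:
    """Proposed algorithm: the nearest ACL that also applies to the agent wins."""
    while resource is not None:
        agents = ACLS.get(acl_url(resource))
        if agents is not None and agent in agents:
            return acl_url(resource)
        resource = container_of(resource)
    return None
```

With this data, `effective_acl_current("/a/b/c/d/e/")` stops at `/a/b/c/.acl`, which names only person1, so teamx is locked out. `effective_acl_proposed("/a/b/c/d/e/", "teamx")` keeps walking and returns `/.acl`.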

gibsonf1 commented Nov 2, 2021

Yes, @RubenVerborgh, the agent-class does need to be treated differently. Revised accordingly:

Effective ACL Resource Algorithm

    To determine the effective ACL resource of a resource, perform the following steps. Returns string (the URI of an ACL Resource).

        Let resource be the resource.
        Let agent be the requesting agent or agent-class.
        Let aclResource be the ACL resource of resource.
        If resource has an associated aclResource with a representation, and that aclResource applies to the agent, return aclResource.
        Otherwise, repeat the steps using the container resource of resource.

bblfish commented Nov 2, 2021

@gibsonf1 wrote

A very simple solution to this avoiding all the copying and systems needed to manage where the copies are etc,

To avoid copying data around, I have proposed adding an :imports relation to the ontology; see issue 210 of the authz panel. That would solve your problem, I think. I had not thought at the time about how it meshes with default reasoning, so let me try here.

Let us say The BBC has a set of default rules </default.acl> which give access to resources tagged over18 to people who can prove they are over 18, to resources tagged over13 to those who can prove they are older than 13, etc.
Let us say that there is some other resource in </health/2021/08/09/covid/questionaire/users/> where the rule should be that those who fill in the form and The BBC are the only ones able to write the info. But they would also be able to upload videos there and make them public.

Then by creating an ACL </health/2021/08/09/covid/questionaire/users/default.acl> like this

<> :imports </default.acl> .
# new rules to limit access to The BBC and those who created the subcontainer

One would not need to duplicate the rules further down the hierarchy.
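
As a rough illustration of how a server might evaluate such an ACL, here is a minimal Python sketch. The semantics of `:imports` are not fixed anywhere in this thread, so the sketch assumes the simplest reading: the rules of an ACL are its own rules plus those of every transitively imported ACL. The `ACL_DOCS` table and the rule strings are hypothetical placeholders.

```python
# Hypothetical store: ACL URL -> (imported ACL URLs, local rules).
ACL_DOCS = {
    "/default.acl": ([], ["over18 rule", "over13 rule"]),
    "/health/2021/08/09/covid/questionaire/users/default.acl": (
        ["/default.acl"],
        ["BBC and form authors may write"],
    ),
}

def collect_rules(acl, seen=None):
    """Return the local rules of an ACL plus those of all ACLs it imports."""
    seen = set() if seen is None else seen
    if acl in seen or acl not in ACL_DOCS:
        return []  # skip import cycles and missing documents
    seen.add(acl)
    imports, rules = ACL_DOCS[acl]
    collected = list(rules)
    for imported in imports:
        collected.extend(collect_rules(imported, seen))
    return collected
```

Under that assumption, removing a rule from `/default.acl` removes it everywhere it was imported, with no copies to track down.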

gibsonf1 commented Nov 2, 2021

@bblfish You can just do that with nested groups and the much simpler agent-based algorithm.

gibsonf1 changed the title from "Loss of Access with lower level ACL (Effective ACL Resource Algorithm)" to "Proposed Fix to: Loss of Access with lower level ACL (Effective ACL Resource Algorithm)" on Nov 2, 2021

bblfish commented Nov 2, 2021

If resource has an associated aclResource with a representation, and that aclResource applies to the agent, return aclResource. Otherwise, repeat the steps using the container resource of resource.

That requires the system to already know who the agent is. That kind of makes sense for the case where one wants to edit the ACL on one's own POD, but not for the much more widely needed cases where one wants to allow read or write access to a resource that one comes across by following links across the web. After all, the client may not know what credential it needs to present to gain read or write access. It may not be logged in. For example, in the BBC example I gave above, you as a client may not know what type of age credential you should present, or even whether an age credential is the right thing to present.

gibsonf1 commented Nov 2, 2021

@bblfish The system does know who the agent is given the webid in the token making the request.

For the age issue, that is a different but very interesting problem, how would the system know how old someone is for one thing? Assuming there were some way to capture that, you could use agent-classes that represent different age groups and give access to those classes.

bblfish commented Nov 2, 2021

@bblfish The system does know who the agent is given the webid in the token making the request.

Doing that only allows you to implement a very limited set of use cases, which are similar to closed social networks.

On open social networks, a person can have a multiplicity of WebIDs: one for work, one for home, one for the government, one for The BBC, etc. A person may be known under any of those WebIDs across the social web. Which one should the user choose? All of them? That would lead to a privacy problem for the user. Indeed, we have a use case for that: Minimal Credentials Disclosure.

For the age issue, that is a different but very interesting problem, how would the system know how old someone is for one thing?

For age, an option would be Verifiable Credentials of some form.

Assuming there were some way to capture that, you could use agent-classes that represent different age groups and give access to those classes.

Yes, I gave an example of how one could express that in OWL nearly a year ago in this comment. An OWL definition like this would do:

<#PersonOver21> owl:equivalentClass [ a owl:Restriction ;
      owl:onProperty :hasAge ;
      owl:someValuesFrom
          [ rdf:type rdfs:Datatype ;
            owl:onDatatype xsd:integer ;
            owl:withRestrictions ( [ xsd:minExclusive 21 ] [ xsd:maxInclusive 150 ] )
          ]
       ] .

There are more examples in the authorization panel meeting of 2021-04-28, where we discussed that at length, comparing it to ACP.

gibsonf1 commented Nov 2, 2021

@bblfish I'm not sure I understand. I, for one, will have only one personal pod that I authenticate with, where I want all my stuff to be, and of course others can do other things. But whatever webid you do authenticate with, that is the one the system will use for permissions. And there can only be one single webid in the auth token.

And that authenticated webid can be a member of countless groups or groups of groups, allowing any kind of social access you like.

bblfish commented Nov 2, 2021

@gibsonf1 you are describing the use case where you want to just store your own personal data on your store, and perhaps make some part of it public. That allows only two access modes really: public information and private information.

But the aim of Solid (Social Linked Data) is much wider: we want to create decentralised Social Networks where we can follow links across the web to arrive at other Pods around the world, so that we can read our friends' or colleagues' news feeds, comment on those, post pictures, arrange meetings, etc. Each resource can be access controlled in this system. Also, we can and will have different identifiers and credentials. So there is no way a client can know, before following a link to a resource in such an open system, under what credentials that resource is authorized. Hence we are back at my points above.

gibsonf1 commented Nov 2, 2021

@bblfish I'm a big fan of the use case you describe, which is fully compatible with this proposed acl fix etc. In our system, you can control permission down to a single triple, and we also now have non-personal agent pods such as business and project pods.

But for managing social access, you simply use vcard:hasMember and create groups with webids. Additionally, you can create a group agent pod and grant access to everyone in that group with acl:agentGroup group-webid. The members of the group can also be groups; it's a recursive algorithm.
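
The recursive membership check described above could look roughly like the following Python sketch. The `GROUPS` table stands in for dereferencing group documents and reading their `vcard:hasMember` values; all WebIDs are made up, and the `visited` set is an added safeguard (an assumption, not something stated above) so that groups which contain each other do not loop forever.

```python
# Hypothetical group documents: group WebID -> vcard:hasMember values,
# which may be agent WebIDs or other group WebIDs.
GROUPS = {
    "https://example.org/groups/project#us": {
        "https://example.org/groups/team#us",
        "https://carol.example/profile#me",
    },
    "https://example.org/groups/team#us": {
        "https://alice.example/profile#me",
        "https://example.org/groups/project#us",  # cycle, on purpose
    },
}

def is_member(agent, group, visited=None):
    """True if agent is a direct or nested member of group."""
    visited = set() if visited is None else visited
    if group in visited:
        return False  # already explored; avoids infinite recursion
    visited.add(group)
    members = GROUPS.get(group, set())
    if agent in members:
        return True
    # Recurse only into members that are themselves known groups.
    return any(is_member(agent, m, visited) for m in members if m in GROUPS)
```

Here Alice is a member of the project group only through the team group, and the lookup still terminates despite the cycle between the two groups.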

bblfish commented Nov 2, 2021

which is fully compatible with this proposed acl fix etc

I don't think it is: you seem to require the POD to know the identity of the client before the client has had time to read the ACL. Have I misunderstood?

gibsonf1 commented Nov 2, 2021

If I am a Solid server, and I get an authenticated request, I know the webid of the authenticated requestor as it's embedded in the encrypted token. The client doesn't get to read the acl unless they are the controller of that acl and request it explicitly. The client only sees the WAC-Allow header, which shows both the authenticated-user access and public access.
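
For readers unfamiliar with that header, here is a small Python sketch that builds a WAC-Allow value with the two permission groups mentioned above, `user` and `public`. The helper name `wac_allow` and the fixed ordering of modes are assumptions made for the example, not part of any server's API.

```python
def wac_allow(user_modes, public_modes):
    """Build a WAC-Allow header value from two collections of access modes."""
    order = ["read", "write", "append", "control"]

    def fmt(modes):
        # Emit only known modes, in a stable order.
        return " ".join(m for m in order if m in modes)

    return f'user="{fmt(user_modes)}", public="{fmt(public_modes)}"'
```

For example, `wac_allow({"read", "write"}, {"read"})` yields `user="read write", public="read"`, which tells the client what it may do without exposing the ACL itself.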

bblfish commented Nov 2, 2021

If I am a Solid server, and I get an authenticated request, I know the webid of the authenticated requestor as it's embedded in the encrypted token. The client doesn't get to read the acl unless they are the controller of that acl and request it explicitly.

Yes, that is fine for closed systems I described above here, but not for the open systems I described above.

As I explained, if you require your client to authenticate on an open system before it knows the rules of engagement, you are creating a privacy problem for the client: it has to try out every identifier randomly to access a resource. It is also slow and wasteful, but those problems pale in the light of the client privacy problem you have created. Indeed, the privacy problem is so big that it forces the only use case to be the closed one I described here.

This is really a high level conceptual problem I am describing here. That your store gives access to single triples or not has no bearing on it.

gibsonf1 commented Nov 2, 2021

@bblfish I'm clearly not understanding the issue. Can you define an open system? If that means public, then Solid fully supports that by giving access to foaf:Agent. Also, the group definitions, if intended for use by anyone, would need to be publicly readable. Also, maybe this discussion should be on the gitter forums, as it's not part of this particular wac issue?

bblfish commented Nov 2, 2021

Can you define an open system?

There are some new mathematical definitions of open systems that look very interesting and that show these to be the key difference between physics systems (closed) and biology (open). See the twitter thread for details.

But in our situation it means that new PODs can join and that there are links from one POD to another. The people and PODs in the network are not known in advance and can change over time, and there is no central authority to tell who is in and who is out.

In that system we want not just private/public protection but also protection for group members. That can be done without loss of privacy. I described that here in the issue on ACLs on ACLs. (I am no longer so sure that it is a good idea to have the inferencing on acl:accessTo, shown as dotted lines in that diagram, so just imagine the dotted lines to be solid.)

gibsonf1 commented Nov 2, 2021

It doesn't make sense to talk about this here, maybe you can post a concrete example of something Solid can't do that you are concerned about on the gitter specification channel? https://gitter.im/solid/specification

bblfish commented Nov 2, 2021

maybe you can post a concrete example of something Solid can't do

I believe that Solid and WAC can fulfil all our use cases, but not if we require the POD to know the user before the user has seen the access control rules. ACP wants to drop that limitation (see timbl's intervention at TPAC last week) and it is very easy for WAC to do that too. Indeed, the ACLs on ACLs proposal shows how.

Please take some time to read through the links I have already given you above. There is a lot of information there.

timbl commented Nov 8, 2021

Meanwhile, going back to the original proposal, I don't see the logic of it, in that you have introduced inheritance but only in a very limited case, making the system MUCH more complicated, and to limited benefit. I feel that if you roll this out, users will in the end be more confused, and there will be more cases in which they get nasty surprises than with WAC. You say an ACL file which "applies to" the agent is the one where the iteration stops, where "applies to" can be a group membership. So the scan up the tree does involve testing group membership (but, I gather, not class membership like AuthenticatedUser or Agent).

Random example

  • /foo has public R access for everything below
  • /foo/bar has public A access for everything below, and R for Family
  • /foo/bar/baz has public W access for everything below, and R for Friends

If Alice is accessing /foo/bar/baz/grimpsh, then if she is a member of Friends, she will see that the public have write access, but if she is not and is a member of Family, she will see that the public have Append access. If she is in neither group, she will think the public have read access.
This is because you are jumping out of the algorithm with a single ACL file, which gives you unintended consequences.
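
The three outcomes can be reproduced with a short Python sketch of the proposed walk. It rests on the reading stated above, that only group membership (not class membership) makes an ACL "apply", plus one extra assumption that the thread does not settle: if no ACL on the path names one of the agent's groups, the topmost ACL is used. The table layout and function name are hypothetical.

```python
# The example hierarchy: container -> public mode and per-group modes.
ACLS = {
    "/foo/": {"public": "read", "groups": {}},
    "/foo/bar/": {"public": "append", "groups": {"Family": "read"}},
    "/foo/bar/baz/": {"public": "write", "groups": {"Friends": "read"}},
}
# Containers of /foo/bar/baz/grimpsh, nearest first.
ANCESTORS = ["/foo/bar/baz/", "/foo/bar/", "/foo/"]

def public_mode_seen(agent_groups):
    """Public mode in the single ACL where the proposed walk stops."""
    for container in ANCESTORS:
        acl = ACLS[container]
        # The ACL "applies" only if it names a group the agent belongs to.
        if agent_groups.intersection(acl["groups"]):
            return acl["public"]
    # Assumption: fall back to the topmost ACL when nothing applies.
    return ACLS[ANCESTORS[-1]]["public"]
```

So `public_mode_seen({"Friends"})` gives `"write"`, `public_mode_seen({"Family"})` gives `"append"`, and `public_mode_seen(set())` gives `"read"`: three different views of what "public" means for the same resource.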

Other things people have considered for WAC in the past have been a stop_here flag to optionally exit the iteration up the tree, or an inherit() function (as Henry mentioned) to explicitly pull in other ACLs. Those seem to me to be more general in effect and more powerful, but at the same time simpler to specify, to implement, and to show to users.

Have you looked at ACP? It targets cases which need more complexity for more power.

gibsonf1 commented Nov 8, 2021

@timbl I think the key issue for the user is being able to control permission the way they intend, and to very easily change that control when they intend as well. With the current algorithm, a user has to copy the same intent from the top-level share to all lower-level shares, so that managing intent becomes impossible as soon as collaboration scales at all. From the user's perspective, if they grant someone root-level default read, they would likely not expect to have to grant it again in every additional ACL created further down the container hierarchy. And that same user would likely expect that when they revoke that root-level default access for a given agent, the agent no longer has access, when in fact they would then have to go through the entire hierarchy, find every ACL that copied the original intent, and revise it. Having to copy intent around this way will make permissions vastly more complex for the user to understand and extremely difficult to manage at any kind of scale. It's the way Windows Server 2000 worked, and it did not work well or scale, and given the complexity of copying permissions down the hierarchy, it would inevitably produce errors that tend to destroy access.

This proposal basically makes the data side radically less complex: 1 ACL will be sufficient to capture user intent instead of a growing number of ACLs in a container hierarchy. On the implementation side, implementations already have to do the agent check; this moves the agent check into ACL discovery. And it's very easy for any system to cache container ACLs going up the hierarchy for performance. Group discovery is the same issue as in the current scheme: if a group is involved, it needs to be resolved.

For the example

/foo has public R access for everything below
/foo/bar has public A access for everything below, and R for Family
/foo/bar/baz has public W access for everything below, and R for Friends

I think users need to be very careful about giving public default access to resources, and once done, act accordingly. We flag public access containers and resources in red to make it obvious to users when something is public. For this example, I'm not sure why the user would be mixing friends and family permissions in a public resource area, but they could easily add a default public ACL with no read, write, control or append modes to shut down public access in those cases where they wanted something private in a public default area.

The main idea here is that the user only adds ACLs at permission intent locations. If a user gives default access to a whole hierarchy of resources, and somewhere down the hierarchy decides that they want to change the permission in that location, they can add an ACL there, as their intent has changed, while leaving all else in place. And because the user distinctly controls permissions at these specific intent locations, they can just as easily change them at those same intent locations. Otherwise, it becomes impossible to track and remember intent when many ACLs are copied around.

srosset81 commented Dec 15, 2024

With @niko-ng, we have implemented WAC permissions the same way as described by @gibsonf1 in SemApps and ActivityPods ... since 2021. If the acl:default permissions of containers only apply to resources with no specific authorizations, many of the things that we do with WAC at the moment would be really complicated (almost impossible) to do.

niko-ng commented Dec 15, 2024

Hello everyone. Thanks @gibsonf1 for pointing out this issue, which is of great interest to us too at ActivityPods/SemApps.

I implemented WAC in 2020 for SemApps, which is the semantic toolbox/middleware on top of which ActivityPods is built. Sorry, the whole tracking issue is in French.
The design work started at the end of 2019, and coding was done in 2020, with the final commit on the 2nd of March 2021.
The first draft of WAC was published 4 weeks later, on the 29th of March 2021.

How is that possible?
Our understanding of WAC and our implementation are based on the early discussions that happened before the first draft was written. We collected information from forum posts, comments and GitHub issues here and there, and tried to make sense of the whole thing.
We realised a week ago, in a conversation in the Matrix channel, that our implementation is not compliant with the current specs.
This is because after completing my work in March 2021, I moved on to other tasks and projects and did not come back to WAC.
Also, our implementation has been used by several projects, especially ActivityPods, and the logic we support has been useful to everybody, so we didn't feel the need to change anything.

How we differ from the current specs

Our implementation does not take into account the concept of "effective" Authorization (because it appeared only in the draft and we had no prior knowledge of it).
But it does integrate the concept that Gibson explains in this issue, namely, that permissions should be checked for the Agent in question, and if nothing is found for that agent, we go up the tree, using the acl:default on parent containers.
We also went further on that concept, and also take into account the agent-class.
We do also check the agentGroup memberships.

Our algorithm is as follows:

  • we search only for the needed permission (R, W, A or C) in the aclResource of the target resource: for the specific webID of the current user, for acl:AuthenticatedAgent, or for foaf:Agent, and if none of those match, we check whether the user is a member of any group listed via acl:agentGroup
  • if we find nothing (no match, an empty acl:mode, or no aclResource for the resource), we move on to the parent container and repeat the same step, using acl:default
  • we recursively go up the tree until we reach the root container; if no permission is found there either, we deny access.
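
The steps above can be sketched in Python as follows. This is not the SemApps code, only an illustration under assumptions: the `ACLS` and `GROUP_MEMBERS` tables, the field names, and the short mode letters are hypothetical, the data reuses the /foo example from earlier in the thread, and implied modes (such as write implying append) are ignored.

```python
PUBLIC, AUTHENTICATED = "foaf:Agent", "acl:AuthenticatedAgent"

# Hypothetical data: resource path -> authorizations in its ACL.
ACLS = {
    "/foo/": [{"subjects": {PUBLIC}, "groups": set(), "modes": {"R"}, "default": True}],
    "/foo/bar/": [
        {"subjects": {PUBLIC}, "groups": set(), "modes": {"A"}, "default": True},
        {"subjects": set(), "groups": {"Family"}, "modes": {"R"}, "default": True},
    ],
    "/foo/bar/baz/": [
        {"subjects": {PUBLIC}, "groups": set(), "modes": {"W"}, "default": True},
        {"subjects": set(), "groups": {"Friends"}, "modes": {"R"}, "default": True},
    ],
}
GROUP_MEMBERS = {"Family": {"alice"}, "Friends": set()}

def parent(path):
    """Parent container of a path, or None at the root."""
    if path == "/":
        return None
    return path.rstrip("/").rsplit("/", 1)[0] + "/"

def matches(auth, webid):
    """Does the authorization cover this agent (webid is None if anonymous)?"""
    subjects = {PUBLIC}
    if webid is not None:
        subjects |= {webid, AUTHENTICATED}
    if auth["subjects"] & subjects:
        return True
    return any(webid in GROUP_MEMBERS.get(g, set()) for g in auth["groups"])

def is_allowed(resource, webid, mode):
    """Walk up the tree until some authorization grants the needed mode."""
    path, inherited = resource, False
    while path is not None:
        for auth in ACLS.get(path, []):
            if inherited and not auth["default"]:
                continue  # only acl:default rules apply to descendants
            if mode in auth["modes"] and matches(auth, webid):
                return True
        path, inherited = parent(path), True
    return False  # nothing found up to the root: deny
```

Because the walk looks for one specific mode rather than one specific ACL, `is_allowed("/foo/bar/baz/grimpsh", "alice", "W")` is granted by the public rule on /foo/bar/baz/, while her R access comes from the Family rule one level higher.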

This allows for flexible scenarios, and avoids the need to repeat permissions.
It also addresses the concern that @timbl raised above about Alice accessing /foo/bar/baz/grimpsh. With our algorithm, the check will find that the public has W access, whether or not she belongs to any of the groups. That is, if she needs W access, it will be given to her, as the rule on foaf:Agent (public) applies to her too.

In our algorithm, there is no way that a permission anywhere in the tree contradicts another, as we always look first for the most specific and constraining ACL.
Nor is there any way for a permission to be "masked".
This is why I was talking about that in the Matrix room, saying that we cannot "remove" permissions down the tree.
It is a conservative system where permissions can only be added down the tree, from generic ones to more specific ones, which is the use case we find most often in the wild.

caveats of our implementation

  • it isn't compliant with current specs
  • it can sometimes take a long time for the permission system to check and enforce access control, as permissions have to be checked in many places.
  • we cannot "mask" or "remove" perms down the tree of containers.

I would be glad to hear what you all say about this approach, and would like to see if the WAC specs could evolve too @csarven ?

We will soon reimplement the whole WAC system in SemApps/ActivityPods, as it will now be based on the quadstore of NextGraph. We would probably like to keep our algorithm, as it has proven to be very useful for app developers and end users alike. The implementation will be very different though, as I have learned from the previous work (4 years ago), and we will be using a matrix of permissions by user and resource, a kind of cache that flattens everything so it can be checked in O(1). This will be hidden from the APIs and will only be used internally to improve performance. Write performance will be slightly degraded; it is a tradeoff.
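
The planned design is not described in more detail here, so the following Python sketch only illustrates the general tradeoff: precompute the result of a slow tree-walking check for every (resource, agent) pair so that reads become a dictionary lookup. `flatten`, `slow_check`, and the sample data are all hypothetical.

```python
def flatten(resources, agents, modes, check):
    """Precompute check(resource, agent, mode) into a lookup table."""
    table = {}
    for resource in resources:
        for agent in agents:
            table[(resource, agent)] = frozenset(
                m for m in modes if check(resource, agent, m)
            )
    return table

def slow_check(resource, agent, mode):
    # Stand-in for a walk up the container tree:
    # here alice may read everything and nobody else may do anything.
    return agent == "alice" and mode == "R"

TABLE = flatten(["/a", "/b"], ["alice", "bob"], ["R", "W", "A", "C"], slow_check)

def allowed(resource, agent, mode):
    """Average-case O(1) permission check against the flattened table."""
    return mode in TABLE.get((resource, agent), frozenset())
```

The cost moves to writes: whenever an ACL or a group changes, the affected entries of the table have to be recomputed.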
