Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Resource Permissions and Sharing #4500

Open
5 tasks
DarshitChanpura opened this issue Jun 28, 2024 · 25 comments
Open
5 tasks

[Proposal] Resource Permissions and Sharing #4500

DarshitChanpura opened this issue Jun 28, 2024 · 25 comments
Assignees
Labels
enhancement New feature or request resource-permissions Label to track all items related to resource permissions Roadmap:Security Project-wide roadmap label triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable.

Comments

@DarshitChanpura
Copy link
Member

DarshitChanpura commented Jun 28, 2024

Resource Sharing

Status Quo

Currently, there is no mechanism provided by the security plugin for plugins to utilize to provide fine-grained access control to own resources created by each plugin. Some plugins have addressed this by implementing their own custom authorization mechanism, however it is not sustainable for each plugin to handle resource-level FGAC using their own custom implementation. For example, in anomaly detection, granting permissions to delete detectors requires the user to be granted cluster:admin/opendistro/ad/detector/delete, which allows the user to delete all detectors in the cluster. Anomaly Detection has implemented their own authz mechanism.To address this, we are introducing a new concept of Resource Sharing. This feature introduces a new way of authorization in addition to existing RBAC authorization model.

Abstract:

With Resource Sharing, the high-level idea is that a resource owner determines how a resource can be shared with others. A user should be able to control access to the resources created by them. A cluster admin will not be able to control access unless they are defined as REST or TLS admins.

This will enable a collaborative usage of OpenSearch like never before. Quip is one of the well known tools around for document sharing and collaboration. With quip you can define documents private to you, or you can grant access to a group of people, or you can mark the document as public. Resource Sharing is following a similar approach where different access control scopes and levels can be defined to permission resources.

Motivation:

Due to the absence of any mechanism for resource access control, many plugins have resorted to implement custom authorization schemes to curb access to resources defined by such plugins. Here is one such example of a known public documentation for ML-Commons plugin (link) to control access to model groups.

ML Commons implements a robust authorization model that categorizes access to model groups into three distinct modes: public, private, and restricted. These modes provide flexible control over who can view and interact with model groups:

  • Public: All users have access to the model group. However, this does not grant full control over the resource. Access is further governed by the permissions model. For instance, users with ml_read_access will only be able to list and view model groups, but not modify them. Only users with ml_full_access will have complete control over public models.
  • Private : Only the resource owner has access to the model group.
  • Restricted: The resource owner has shared access with specific trusted users or parties.

This approach is inspired by the ML Commons architecture and aims to provide a reusable mechanism for any plugin to leverage the Security plugin for resource sharing. By centralizing authorization within the Security plugin, plugin development is simplified, while ensuring that access control remains robust and scalable. This model also provides a clear pathway for migrating existing access control mechanisms from plugins to the Security framework.

Feature Implementation:

Components:

There are 3 major components to this design:

  • resource_sharing index: This index will be stored centrally and will contain information about how a particular resource is shared.
  • ResourcePlugin : The contract class defined in OpenSearch to be utilized by the plugins who choose to utilize the resource access control framework provided by Security plugin.
  • Resource Access Control Framework: This framework is at the heart of this design and involves defining concrete implementation of access control methods defined in the contract above. It also involves implementing index action listener to ensure the resource sharing information remains consistent across the nodes in the cluster.

Assumptions:

A Resource is a document stored within a plugin’s system index, containing metadata about that resource. Examples of resources include Model Groups in ML Commons, Anomaly Detectors in anomaly-detection, and Detectors in alerting plugins.
The .resource-sharing index will always maintain a compound key that references the location of the original resource. This compound key consists of two components:

  1. source_idx: The system index where the original resource is stored, as defined by the plugin.
  2. resource_id: The DocID corresponding to the resource document within the system index source_idx.

This structure ensures a clear and direct link between the shared resource and its original location.

Index design:

The resource_sharing index document has following entries:

  1. source_idx : The system index which resource metadata. This is usually the plugin’s system index.
  2. resource_id : The id of the resource document defined in the source_idx .
  3. created_by : Defined the user or a backend_role that created this resource.
  4. shared_with : Defines access scope for this resource. It contains users, roles and backend_roles this resource is shared with. Access scope is defined in 3 ways:
    1. public: A shared_with entry exists, and contains users entry mapped to * .
    2. restricted: A shared_with entry exists, and contains users entry not mapped to * .
    3. private: A shared_with entry is empty or doesn’t exist.
  5. In addition each shared_with will also contain the level of access that each user, role or backend_role may have. These are two scopes at present read_only and read_write.

Note: The relationship between entries in this index and the resource metadata stored in plugin’s index is many-to-one.

Here are 3 examples of entries in resource_sharing index, one for each scope.

  1. Private:
{
   "source_idx": ".plugins-ml-model-group",
   "resource_id": "<doc_id_of_model_group>",
   "created_by": {
      "user": "darshit",
      "backend_role": "",
   }
}
  1. Restricted:
{
   "source_idx": ".plugins-ml-model-group",
   "resource_id": "<doc_id_of_model_group>",
   "created_by": {
      "user": "darshit",
      "backend_role": "",
   },
   "share_with": {
      "read_only": {
         "users": [
            "derek"
         ],
         "roles": [],
         "backend_roles": []
      },
      "read_write": {
         "users": [
            "craig"
         ],
         "roles": [],
         "backend_roles": []
      }
   }
}
  1. Public:
{
   "source_idx": ".plugins-ml-model-group",
   "resource_id": "<doc_id_of_model_group>",
   "created_by": {
      "user": "darshit",
      "backend_role": "",
   },
   "share_with": {
      "read_only": {
         "users": ["*"],
         "roles": ["*"],
         "backend_roles": ["*"]
      },
      "read_write": {
         "users": [
            "*"
         ],
         "roles": ["*"],
         "backend_roles": ["*"]
      }
   }
}

NOTE: Each user, role and backend_role must be individually added as there is no pattern matching at present by design.

Code Implementation:

A new plugin type ResourcePlugin will be defined in OpenSearch. This is an extensible contract and will define core Java APIs to be used by plugins to check access:

Security will add a concrete implementation to these APIs to determine access to a particular resource, and plugins will call these APIs to verify a user’s access to requested resource.

Flow Diagram:
sequenceDiagram
    participant Plugins
    participant ResourcePlugin as ResourcePlugin<br>Defines Core APIs
    participant Security as Security<br>Concrete Implementation of APIs

    Plugins ->> ResourcePlugin: Call updateSharingInformation
    ResourcePlugin ->> Security: Forward to Security Plugin
    Security -->> ResourcePlugin: Return result
    ResourcePlugin -->> Plugins: Return result

    Plugins ->> ResourcePlugin: Call listAccessibleResourcesForUser
    ResourcePlugin ->> Security: Forward to Security Plugin
    Security -->> ResourcePlugin: Return accessible resources
    ResourcePlugin -->> Plugins: Return accessible resources

    Plugins ->> ResourcePlugin: Call hasPermission
    ResourcePlugin ->> Security: Forward to Security Plugin
    Security -->> ResourcePlugin: Return permission status
    ResourcePlugin -->> Plugins: Return permission status
Loading

Pros & Cons:

  • Pros
    • A highly scalable approach where the resource owner is empowered to determine how the resource is shared
    • Provides a clear migration path for plugins like ML Commons
  • Cons
    • No code re-use
    • Not all unknowns are identified
    • Introduces a new authorization model in addition to and different from Role Based Access Control

Test Plan

  1. Add local tests in the security plugin.
  2. Add mixed cluster tests. (To evaluate requests coming in during rolling-upgrade scenario)
  3. Add tests in anomaly detection or other plugins that could leverage this feature.

Will the feature require a security review?

This feature will require a security review to confirm that it correctly evaluates the scope of the requested resource.

Documentation

The documentation should be added for this feature detailing how users can utilize this feature.

--

Next Steps

  • Open a META issue that tracks this feature
    • Add a task to add ResourcePlugin definition in core
    • Open a task to add this feature in Security Plugin by extending the APIs offered (preferred if done via feature branch)
    • Open a follow-up issue to add tests for this feature
    • Open a PoC PR in anomaly detector plugin to demonstrate this feature

Expand to see old design:
## _High-Level Design_

### Status Quo

Currently, there is no mechanism provided by the security plugin for plugins to utilize to provide fine-grained access control to own resources created by each plugin. Some plugins have addressed this by implementing their own custom authorization mechanism, however it is not sustainable for each plugin to handle resource-level FGAC using their own custom implementation. For example, in [anomaly detection](https://opensearch.org/docs/latest/observing-your-data/ad/security/), granting permissions to delete detectors requires the user to be granted `cluster:admin/opendistro/ad/detector/delete`, which allows the user to delete all detectors in the cluster. Anomaly Detection has implemented their own authz mechanism. To avoid this the Security plugin should be the only place where authorization happens. To address this, we are introducing a new concept of Resource Permissions. This feature adds an additional step to the authorization evaluation, allowing plugins to define resource-specific permissions.


### **Feature Implementation**

Once plugins define the resource permissions associated with resources, cluster admins can assign the relevant resource permissions to roles, which will then be evaluated within the Security plugin. Resource permissions can be defined in two ways:

1. Starts with `resource:<plugin-identifier>/<permission>`  **(proposed)**
2. Starts with `<plugin-identifier>:<permission>`

The main difference is the presence of the standard prefix `resource:`, **which simplifies validation [here](https://github.com/opensearch-project/OpenSearch/blob/2069bd805804dd93bd69695a4c6521cc6f2b9bb6/server/src/main/java/org/opensearch/transport/TransportService.java#L1097)**. The structure would look like:

Allows users to use all Anomaly Detection functionality on selected resources, i.e., detectors

anomaly_full_access:
reserved: true
cluster_permissions:
- 'cluster_monitor'
- 'cluster:admin/opendistro/ad/'
index_permissions:
- index_patterns:
- '
'
allowed_actions:
- 'indices_monitor'
- 'indices:admin/aliases/get'
- 'indices:admin/mappings/get'
resource_permissions:
- identifier_pattern:
- 'detector1'
- 'detector2'
allowed_actions:
- 'resource:ad/detectors/read'



**This allows admins to add a layer of control, restricting access to selected detectors instead of all detectors. Plugins like [alerting](https://opensearch.org/docs/latest/observing-your-data/alerting/security/), [ml-commons](https://quip-amazon.com/CBKNACJHvnRY/ML-Commons-Security-model-access-control), and flow-framework can similarly benefit by defining permissions at the resource level.**



### **Evaluation Flow Diagram**

* ResourceAccessEvaluator.java
```mermaid
flowchart TD
    A[Resource Access Evaluator.evaluate]
    A --> B{Does request have resources defined?}
    B -->|No| C[Return response as incomplete]
    B -->|Yes| D{Do roles have resource_permissions defined?}
    D -->|No| E[Return response as denied and mark complete]
    D -->|Yes| F[Evaluate resource_permissions for all roles]
    F --> G{Do resource_permissions match with supplied resource names?}
    G -->|Yes| H[Return response as allowed and mark complete]
    G -->|No| E
    H --> J[Return presponse]
    E --> J
    C --> J
  • PrivilegesEvaluator.java’s evaluate() method with resource permissions check:
flowchart TD
    A[PrivilegesEvaluator.evaluate]
    A --> B[Resolve roles]
    B --> C{Is bulk request for default tenant?}
    C -->|Yes| D[Evaluate and return presponse]
    C -->|No| E[Resolve all indices in the request]
    E --> F[Evaluate snapshot restore request]
    F -->|Complete| G[Return response]
    F -->|Incomplete| H[Evaluate security index access request]
    H -->|Complete| I[Return response]
    H -->|Incomplete| J[Evaluate protected index access request]
    J -->|Complete| K[Return response]
    J -->|Incomplete| L[Evaluate Resource Access request]
    L -->|Complete| M[Return response]
    L -->|Incomplete| N[Evaluate PIT request]
    N -->|Complete| O[Return response]
    N -->|Incomplete| P{Is cluster permission request?}
    P -->|Yes| Q[Evaluate cluster-level permissions and return response]
    P -->|No| R[Evaluater terms aggregation request]
    R -->|Complete| S[Return response]
    R -->|Incomplete| T[Perform DNFOF evaluation if enabled]
    T --> U[Return result]
    style L fill:green,stroke:#333,stroke-width:4px
Loading

Low-Level Design

Design Questions

  • Are we allowing a mix of roles with and without resource permissions?
    • If yes, what is the default behavior when none of the roles resolved for a user contain a resource permission?
    • Should the user be allowed permission to that resource?
      • If yes, there won’t be backward compatibility issues, but it is not ideal as the feature can be circumvented by not adding any resource permissions.
      • (Proposed) If no, there might be backward compatibility issues, especially in mixed cluster requests, which need thorough testing.
  • Should we allow * (match_all) patterns for accessing resources? OR should they be individually listed under the role?
    • Note: detector_c* can still be allowed but the question here is, should * be prohibited as a resource_name pattern.
  • Should ressource_permissions be hidden behind feature flag?
    • No, the primary purpose of introducing this feature is to move the authorization logic, implemented by individual plugins with resources, to Security plugin.

File Changes

  1. ActionRequest.java
    1. This base class is extended by every plugin’s implementation of transport request calls. Hence, the proposal is to add a standard class property List<String> resources . This should then be implemented by plugins to populate this property when handling an actionRequest after which SecurityFilter.java will intercept this call to perform privilege evaluation. (It is important to note this change when creating a documentation to onboard plugin developers to this change)
      1. Note: this is done for standardization purposes, however this can be skipped and left on the plugin developers to add this property to each request class. (Not recommended)
  2. TransportService.java
    1. Add resource: as a valid transport action prefix here.
  3. RoleV7.java
    1. This class will now define a new type of class: Resource (Use the class Index as reference)
    2. Add this newly defined class as a property to the parent
  4. ResourceAccessEvaluator.java (new file)
    1. A new class that defines the evaluator method to verify the resource permissions.
    2. This method should assume the resource names are passed in the request (the string list defined above).
    3. If no resources are present in the list, it should assume that the plugin has not opted into this feature and so should bypass this authorization check to return a success response.
  5. PrivilegeEvaluator.java
    1. Modify evaluate() method to add a check that calls the evaluator defined above.
</details>






@github-actions github-actions bot added the untriaged Require the attention of the repository maintainers and may need to be prioritized label Jun 28, 2024
@shikharj05
Copy link
Contributor

shikharj05 commented Jul 1, 2024

Thanks for the proposal @DarshitChanpura ! This looks interesting! High-level comments-

From what I understand, this proposal is targeting current roles and adding new resource_permissions to the existing RBAC system. What are your thoughts about treating resources as entities and allowing admins/users to tie up ACLs with a resource? For example, treating detector as a resource and access control will be defined on this resource (i.e. close to the entity being evaluated).

This proposal talks in depth about plugins, can you add an example if such a resource is added by core. This could also pave the way for treating indices as resources and tying down an ACL with them.

As this is introducing a new field/permission model, should this be a new config_version ? Also, there's another RFC to retire V6 #4493 , I think we can use this opportunity to define how new permissions models/config_version changes should be added and older ones removed. Thoughts @DarshitChanpura @cwperks ?

@shikharj05
Copy link
Contributor

shikharj05 commented Jul 1, 2024

Adding few more comments on each design question-

Are we allowing a mix of roles with and without resource permissions?
If yes, what is the default behavior when none of the roles resolved for a user contain a resource permission?
Should the user be allowed permission to that resource?
If yes, there won’t be backward compatibility issues, but it is not ideal as the feature can be circumvented by not adding any resource permissions.
(Proposed) If no, there might be backward compatibility issues, especially in mixed cluster requests, which need thorough testing.

For backwards compatibility, I think this should be introduced with default behavior of treating * as a resource on a role that doesn't have any explicit definitions of resource_permissions. I agree that this might not be the best security behavior, however, it allows for smoother upgrades and easier migration path for customers. To negate circumventing of the feature, for any new roles being created, resource_permissions must be explicitly added (maybe, we can add similar requirement for updates to existing roles). Thoughts?

For example, an existing role without resource permissions specified would be treated as equal to the following-

# Allow users to read Anomaly Detection detectors and results
anomaly_read_access:
  reserved: true
  cluster_permissions:
    - 'cluster:admin/opendistro/ad/detector/info'
    - 'cluster:admin/opendistro/ad/detector/search'
    - 'cluster:admin/opendistro/ad/detector/validate'
    - 'cluster:admin/opendistro/ad/detectors/get'
    - 'cluster:admin/opendistro/ad/result/search'
    - 'cluster:admin/opendistro/ad/result/topAnomalies'
    - 'cluster:admin/opendistro/ad/tasks/search'
  resource_permissions:
    - '*'

This would allow users upgrading to newer versions of OpenSearch where these permissions are introduced while letting users/admins define more granular access control methods.

@shikharj05
Copy link
Contributor

Should we allow * (match_all) patterns for accessing resources? OR should they be individually listed under the role?
Note: detector_c* can still be allowed but the question here is, should * be prohibited as a resource_name pattern.

For the reasons mentioned in previous comment, I think * needs to be allowed.

@shikharj05
Copy link
Contributor

Should ressource_permissions be hidden behind feature flag?
No, the primary purpose of introducing this feature is to move the authorization logic, implemented by individual plugins with resources, to Security plugin.

I agree with the requirement, however, this could also break existing customers. If we can cover migrations/upgrades with enough tests, it should be okay to use a feature flag to enable it. However, I would prefer a flag to turn this functionality on/off smoothly for initial release as this could be a breaking change.

@shikharj05
Copy link
Contributor

File Changes

For file change, kindly consider the changes being brought in by #3870 and ensure changes are compatible with the RFC.

RoleV7.java

Since this introduces a new class in existing model, do you know if contents of this class are sent over transport channel with requests? If so, this could cause issues during ugprades in a mixed-version cluster.

@stephen-crawford
Copy link
Collaborator

[Triage] Hi @DarshitChanpura, thanks for filing this issue. This proposal/rfc is for the resource permissions you are working on. I will go ahead and mark it as triaged.

@stephen-crawford stephen-crawford added triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable. and removed untriaged Require the attention of the repository maintainers and may need to be prioritized labels Jul 1, 2024
@DarshitChanpura
Copy link
Member Author

DarshitChanpura commented Jul 1, 2024

@shikharj05 Thank you for your comments.

What are your thoughts about treating resources as entities and allowing admins/users to tie up ACLs with a resource? For example, treating detector as a resource and access control will be defined on this resource (i.e. close to the entity being evaluated).

Not sure if i completely understand what an ACL is. However, this proposal introduces 1 more layer which is closer to the resource being evaluated.

This proposal talks in depth about plugins, can you add an example if such a resource is added by core. This could also pave the way for treating indices as resources and tying down an ACL with them.

Index permissions should cover this correct? Or are you talking about merging resource permissions and index permissions? I'm not aware of any "resource" introduced by core.

As this is introducing a new field/permission model, should this be a new config_version ? Also, there's another RFC to retire V6 #4493 , I think we can use this opportunity to define how new permissions models/config_version changes should be added and older ones removed. Thoughts @DarshitChanpura @cwperks ?

We can consider this as a new config version, however the scope of #4493 might not overlap with this as that RFC discusses retiring v6 and this RFC talks about introducing a new property to a role. Having said that this should be be tested in mixed cluster scenario.

For backwards compatibility, I think this should be introduced with default behavior of treating * as a resource on a role that doesn't have any explicit definitions of resource_permissions. I agree that this might not be the best security behavior, however, it allows for smoother upgrades and easier migration path for customers. To negate circumventing of the feature, for any new roles being created, resource_permissions must be explicitly added (maybe, we can add similar requirement for updates to existing roles). Thoughts?

Proposed path for roles with no resource_permissions is to skip Resource Permissions evaluation entirely which essentially mimics * behavior. "If no resources are present in the list, it should assume that the plugin has not opted into this feature and so should bypass this authorization check to return a success response." If plugin has opted in to resource permissions, we should not allow * access to all resources as this would essentially by-pass resource permissions evaluation.

I would prefer a flag to turn this functionality on/off smoothly for initial release as this could be a breaking change.

Agreed.

File Changes
For file change, kindly consider the changes being brought in by #3870 and ensure changes are compatible with the RFC.
RoleV7.java

Since that is still in draft state I have not considered those changes here.

Since this introduces a new class in existing model, do you know if contents of this class are sent over transport channel with requests? If so, this could cause issues during ugprades in a mixed-version cluster.

Agreed. This will have to be tested. Can you elaborate on what contents will be sent over transport channel?

@DarshitChanpura
Copy link
Member Author

@nibix would love your inputs on this.

@cwperks
Copy link
Member

cwperks commented Jul 2, 2024

Resource permissions can be defined in two ways:

  1. Starts with <plugin-identifier>:<permission>

@DarshitChanpura I think a good aim for action naming is to make them as succinct as possible to help cluster admins understand what they do based off of this short name. We could have the prefix of the action name be the resource type and allow plugins to specify what the type of resource is.

i.e. In Anomaly-Detection the resource type could be anomalydetector and for ML-Commons it could be modelgroup. Think of indices as a type of resource and resource permissions as a generalization on index permissions.

anomalydetector:admin/create is more succinct then resource:ad/admin/create, but I could see how the resource: prefix could also be helpful for consistency.

@DarshitChanpura
Copy link
Member Author

anomalydetector:admin/create is more succinct then resource:ad/admin/create, but I could see how the resource: prefix could also be helpful for consistency.

I agree with succinct naming, however the prefixes for transport actions are statically defined here. If we were to define plugin names as prefixes instead of resource: it becomes impossible to validate the prefix and without this validation the transport action flow fails here. Hence, I think we should have a common prefix: resource:.

@nibix
Copy link
Collaborator

nibix commented Jul 2, 2024

@DarshitChanpura

Interesting proposal! That moves OpenSearch quite a bit into the direction of a whole application platform.

My comments (completely IMHO) below:

Configuration structure

Resource permissions can be defined in two ways:

  1. Starts with resource:<plugin-identifier>/<permission> (proposed)
  2. Starts with <plugin-identifier>:<permission>

I can understand the motivation for both. Both have pros and cons. The pro about <plugin-identifier>:<permission> is that the plugin is easy to recognize for the user. The pro about resource:<plugin-identifier>/<permission> is that the validation in TransportService and also the logic routing in PrivilegesEvaluator is more easy. However, it might be confusing that the first component after the resource: prefix is actually not the resource, but a string representing the plugin? How about using the format plugin:<plugin-identifier>/<resource>/<permission> instead?

However, I would like to take one step back first. Do I see it correctly that we have these dimensions?

  • Plugin: Like Anomaly detection, or alerting, ...
  • Plugin-specific resource type: Plugins might be managing more than one resource types, especially alerting seems to have quite a few
  • Individual instances of one resource type.

While I am pretty sure that these dimensions can be modeled with the proposed resource_permissions structure, I am wondering whether it still will be easy to grasp when there are several plugins with several resource types in the game.

Generally, to verify the real-world suitability of a new config format, I like to recommend to actually create some complete real-world examples, which incorporate several features for several different cases.

Backwards compatibility considerations

I have read the discussions about backwards compat. Just some random thoughts about this:

  • Are there considerations in which version these permissions shall be introduced? In a major release one might be more liberal about breaking changes than in a minor release.
  • As some of the plugins might have implemented their own authz mechanisms now, these also need to be considered in the migration process. As soon as the security plugin gets responsible for authz, the plugins authz mechanisms should be no longer in effect.
  • One way around this would be to require the plugins to introduce new APIs which are then protected by the new scheme, while the old APIs are protected by the old scheme and can be removed eventually. But that might also introduce further and bigger issues 🤔
  • I would strongly recommend not to assume that roles without resource permissions indicate full permissions in order to stay backwards compat. That would be very counter-intuitive. If such a strong backwards compatibility is required, I'd go with a feature flag instead.

Code infrastructure: Extending ActionRequest

I can think about two different way of extending ActionRequest. I am wondering about what way you were thinking.

Approach a): Add an empty method that can be overridden:

public abstract class ActionRequest extends TransportRequest {

    public ActionRequest() {
        super();
    }

    public ActionRequest(StreamInput in) throws IOException {
        super(in);
    }
 
    /**
      * NEW NEW NEW .. to be overridden by whoever wants to
      */
    public List<String> getResources() {
       return Collections.emptyList();
    }

    public abstract ActionRequestValidationException validate();

    public boolean getShouldStoreResult() {
        return false;
    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        super.writeTo(out);
    }
}

Approach b): Include the actual resource information here.

public abstract class ActionRequest extends TransportRequest {

    private final List<String> resources;

    public ActionRequest() {
        super();
        this.resources = Collections.emptyList();
    }

    public ActionRequest(List<String> resources) {
        super();
        this.resources = resources;
    }

    public ActionRequest(StreamInput in) throws IOException {
        super(in);
        if (in comes from resource aware node) {
           this.resources = in.readList();
        }
    }
 
    public List<String> getResources() {
       return this.resources;
    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        super.writeTo(out);

        if (out is resource aware node) {
           out.writeList(this.resources);
        }
    }
}
  • One advantage of approach b) would be that it makes it very easy for plugins to gradually enhance their existing actions.

  • One disadvantage of approach b) would be that such requests likely already carry the resource information - yet there is no standardized way to access and retrieve it. In approach b) there would be the need to store it redundantly - which would create the risk of inconsistencies.

  • Yet, both approaches would be a slight deviation from the present quite strict segmentation of action and request types by interfaces and super-classes (compare IndicesRequest, AliasesRequest, BroadcastRequest, ...). If ActionRequest is extended, it gets very easy to also add resource priv requirements to IndicesRequests. The question would be whether that makes sense.

  • One thing that needs to be clarified is: Shall such requests support resource wildcards (*) and resource name patterns (my_resource_*)? For indices requests, it is standard that wildcards and patterns are supported, so it would be intuitive if resource requests would be supporting these as well. However, resource requests can only carry the patterns. They are most likely lacking the dependencies to resolve these patterns to concrete resource names. This probably can be only done by a central component in the plugin, possibly the TransportAction instance which is responsible for processing the requests. Possibly, one needs to have a further interface which TransportActions need to implement in order to extract concrete resource names from requests.

Code infrastructure: Privilege meta data

I am deviating from the proposal a bit to make a related observation: In core OpenSearch, there is also no standardized interface to deduce required privileges from action requests. Instead, the security plugin has a quite involved logic to deduce required privileges from requests. See for example:

private Set<String> evaluateAdditionalIndexPermissions(final ActionRequest request, final String originalAction) {
// --- check inner bulk requests
final Set<String> additionalPermissionsRequired = new HashSet<>();
if (!isClusterPerm(originalAction)) {
additionalPermissionsRequired.add(originalAction);
}
if (request instanceof ClusterSearchShardsRequest) {
additionalPermissionsRequired.add(SearchAction.NAME);
}
if (request instanceof BulkShardRequest) {
BulkShardRequest bsr = (BulkShardRequest) request;
for (BulkItemRequest bir : bsr.items()) {
switch (bir.request().opType()) {
case CREATE:
additionalPermissionsRequired.add(IndexAction.NAME);
break;
case INDEX:
additionalPermissionsRequired.add(IndexAction.NAME);
break;
case DELETE:
additionalPermissionsRequired.add(DeleteAction.NAME);
break;
case UPDATE:
additionalPermissionsRequired.add(UpdateAction.NAME);
break;
}
}
}

private boolean getOrReplaceAllIndices(final Object request, final IndicesProvider provider, boolean allowEmptyIndices) {
final boolean isDebugEnabled = log.isDebugEnabled();
final boolean isTraceEnabled = log.isTraceEnabled();
if (isTraceEnabled) {
log.trace("getOrReplaceAllIndices() for " + request.getClass());
}
boolean result = true;
if (request instanceof BulkRequest) {
for (DocWriteRequest ar : ((BulkRequest) request).requests()) {
result = getOrReplaceAllIndices(ar, provider, false) && result;
}
} else if (request instanceof MultiGetRequest) {
for (ListIterator<Item> it = ((MultiGetRequest) request).getItems().listIterator(); it.hasNext();) {
Item item = it.next();
result = getOrReplaceAllIndices(item, provider, false) && result;
/*if(item.index() == null || item.indices() == null || item.indices().length == 0) {
it.remove();
}*/
}
} else if (request instanceof MultiSearchRequest) {
for (ListIterator<SearchRequest> it = ((MultiSearchRequest) request).requests().listIterator(); it.hasNext();) {
SearchRequest ar = it.next();
result = getOrReplaceAllIndices(ar, provider, false) && result;
/*if(ar.indices() == null || ar.indices().length == 0) {
it.remove();
}*/
}
} else if (request instanceof MultiTermVectorsRequest) {
for (ActionRequest ar : (Iterable<TermVectorsRequest>) () -> ((MultiTermVectorsRequest) request).iterator()) {
result = getOrReplaceAllIndices(ar, provider, false) && result;
}
} else if (request instanceof PutMappingRequest) {
PutMappingRequest pmr = (PutMappingRequest) request;
Index concreteIndex = pmr.getConcreteIndex();
if (concreteIndex != null && (pmr.indices() == null || pmr.indices().length == 0)) {
String[] newIndices = provider.provide(new String[] { concreteIndex.getName() }, request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((PutMappingRequest) request).indices(newIndices);
((PutMappingRequest) request).setConcreteIndex(null);
} else {
String[] newIndices = provider.provide(((PutMappingRequest) request).indices(), request, true);
if (checkIndices(request, newIndices, false, allowEmptyIndices) == false) {
return false;
}
((PutMappingRequest) request).indices(newIndices);
}
} else if (request instanceof RestoreSnapshotRequest) {
if (clusterInfoHolder.isLocalNodeElectedClusterManager() == Boolean.FALSE) {
return true;
}
final RestoreSnapshotRequest restoreRequest = (RestoreSnapshotRequest) request;
final SnapshotInfo snapshotInfo = SnapshotRestoreHelper.getSnapshotInfo(restoreRequest);
if (snapshotInfo == null) {
log.warn(
"snapshot repository '" + restoreRequest.repository() + "', snapshot '" + restoreRequest.snapshot() + "' not found"
);
provider.provide(new String[] { "*" }, request, false);
} else {
final List<String> requestedResolvedIndices = IndexUtils.filterIndices(
snapshotInfo.indices(),
restoreRequest.indices(),
restoreRequest.indicesOptions()
);
final List<String> renamedTargetIndices = renamedIndices(restoreRequest, requestedResolvedIndices);
// final Set<String> indices = new HashSet<>(requestedResolvedIndices);
// indices.addAll(renamedTargetIndices);
if (isDebugEnabled) {
log.debug("snapshot: {} contains this indices: {}", snapshotInfo.snapshotId().getName(), renamedTargetIndices);
}
provider.provide(renamedTargetIndices.toArray(new String[0]), request, false);
}
} else if (request instanceof IndicesAliasesRequest) {
for (AliasActions ar : ((IndicesAliasesRequest) request).getAliasActions()) {
result = getOrReplaceAllIndices(ar, provider, false) && result;
}
} else if (request instanceof DeleteRequest) {
String[] newIndices = provider.provide(((DeleteRequest) request).indices(), request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((DeleteRequest) request).index(newIndices.length != 1 ? null : newIndices[0]);
} else if (request instanceof UpdateRequest) {
String[] newIndices = provider.provide(((UpdateRequest) request).indices(), request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((UpdateRequest) request).index(newIndices.length != 1 ? null : newIndices[0]);
} else if (request instanceof SingleShardRequest) {
final SingleShardRequest<?> singleShardRequest = (SingleShardRequest<?>) request;
final String index = singleShardRequest.index();
String[] indices = provider.provide(index == null ? null : new String[] { index }, request, true);
if (!checkIndices(request, indices, true, allowEmptyIndices)) {
return false;
}
singleShardRequest.index(indices.length != 1 ? null : indices[0]);
} else if (request instanceof FieldCapabilitiesIndexRequest) {
// FieldCapabilitiesIndexRequest does not support replacing the indexes.
// However, the indexes are always determined by FieldCapabilitiesRequest which will be reduced below
// (implements Replaceable). So IF an index arrives here, we can be sure that we have
// at least privileges for indices:data/read/field_caps
FieldCapabilitiesIndexRequest fieldCapabilitiesRequest = (FieldCapabilitiesIndexRequest) request;
String index = fieldCapabilitiesRequest.index();
String[] newIndices = provider.provide(new String[] { index }, request, true);
if (!checkIndices(request, newIndices, true, allowEmptyIndices)) {
return false;
}
} else if (request instanceof IndexRequest) {
String[] newIndices = provider.provide(((IndexRequest) request).indices(), request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((IndexRequest) request).index(newIndices.length != 1 ? null : newIndices[0]);
} else if (request instanceof Replaceable) {
String[] newIndices = provider.provide(((Replaceable) request).indices(), request, true);
if (checkIndices(request, newIndices, false, allowEmptyIndices) == false) {
return false;
}
((Replaceable) request).indices(newIndices);
} else if (request instanceof BulkShardRequest) {
provider.provide(((ReplicationRequest) request).indices(), request, false);
// replace not supported?
} else if (request instanceof ReplicationRequest) {
String[] newIndices = provider.provide(((ReplicationRequest) request).indices(), request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((ReplicationRequest) request).index(newIndices.length != 1 ? null : newIndices[0]);
} else if (request instanceof MultiGetRequest.Item) {
String[] newIndices = provider.provide(((MultiGetRequest.Item) request).indices(), request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((MultiGetRequest.Item) request).index(newIndices.length != 1 ? null : newIndices[0]);
} else if (request instanceof CreateIndexRequest) {
String[] newIndices = provider.provide(((CreateIndexRequest) request).indices(), request, true);
if (checkIndices(request, newIndices, true, allowEmptyIndices) == false) {
return false;
}
((CreateIndexRequest) request).index(newIndices.length != 1 ? null : newIndices[0]);
} else if (request instanceof ResizeRequest) {
// clone or shrink operations
provider.provide(((ResizeRequest) request).indices(), request, true);
provider.provide(((ResizeRequest) request).getTargetIndexRequest().indices(), request, true);
} else if (request instanceof CreateDataStreamAction.Request) {
provider.provide(((CreateDataStreamAction.Request) request).indices(), request, false);
} else if (request instanceof ReindexRequest) {
result = getOrReplaceAllIndices(((ReindexRequest) request).getDestination(), provider, false) && result;
result = getOrReplaceAllIndices(((ReindexRequest) request).getSearchRequest(), provider, false) && result;
} else if (request instanceof BaseNodesRequest) {
// do nothing
} else if (request instanceof MainRequest) {
// do nothing
} else if (request instanceof ClearScrollRequest) {
// do nothing
} else if (request instanceof SearchScrollRequest) {
// do nothing
} else if (request instanceof PutComponentTemplateAction.Request) {
// do nothing
} else {
if (isDebugEnabled) {
log.debug(request.getClass() + " not supported (It is likely not a indices related request)");
}
result = false;
}
return result;
}
(especially this one is pure horror)

Could it make sense to extend the proposal to enhance ActionRequests and TransportActions to provide their required privileges - independently whether these are plugin resources or core indices, core aliases or core data streams?

New config version

So far the proposal for the new configuration is a pure super-set of the previous configuration. Thus, the new properties can be considered as a "progressive enhancement", which does not make a new config version necessary.

I guess that introducing new config versions introduces significant pains (migration processes, tools, etc), which should be avoided if possible.

@cwperks
Copy link
Member

cwperks commented Jul 2, 2024

New config version

So far the proposal for the new configuration is a pure super-set of the previous configuration. Thus, the new properties can be considered as a "progressive enhancement", which does not make a new config version necessary.

I guess that introducing new config versions introduces significant pains (migration processes, tools, etc), which should be avoided if possible.

This would be ideal. I also don't believe a new config version is necessary.

I agree with succinct naming, however the prefixes for transport actions are statically defined here. If we were to define plugin names as prefixes instead of resource: it becomes impossible to validate the prefix and without this validation the transport action flow fails here. Hence, I think we should have a common prefix: resource:.

That data structure can be treated as a registry and an extension point could be introduced for plugins to register resource types. I see "views" is in that list and that could be considered a resource type. Since the new section of the role would be called resource_permissions: there would be some redundancy with all action names prefixed with resource: but we can iterate to determine the ideal path forward. As @nibix mentioned, it would be useful in the action name to both has the plugin identifier (or commonly-used shortname) + resource type to allow plugins to create more than one resource type.

@shikharj05
Copy link
Contributor

One way around this would be to require the plugins to introduce new APIs which are then protected by the new scheme, while the old APIs are protected by the old scheme and can be removed eventually. But that might also introduce further and bigger issues 🤔

+1, while this offers easier migration path, however this will make authorization decisions complicated with custom authZ. @DarshitChanpura - can you help list plugins performing custom authZ within the opensearch-project? I think this proposal can cover migrating these plugins away from custom authZ.

I would strongly recommend not to assume that roles without resource permissions indicate full permissions in order to stay backwards compat. That would be very counter-intuitive. If such a strong backwards compatibility is required, I'd go with a feature flag instead.

I think it should be okay to introduce this feature with a flag. i.e. fail-open with * as resource_permissions or fail-close in case of missing resource_permissions. We can also offer something like a migration period/dual-mode flag like plugins.security_config.ssl_dual_mode_enabled

@shikharj05
Copy link
Contributor

Not sure if i completely understand what an ACL is. However, this proposal introduces 1 more layer which is closer to the resource being evaluated.

For example, consider Linux OS. Permissions on a file can be defined for file owner, group owner & others. i.e. resource permissions are dictated by ACL defined on the file.

As an example, for OpenSearch, this could possibly translate to a Super Admin user being able to create a detector in AD plugin and adding an ACL to it that looks like - CRUD for super admin users & RU for users with anomaly_full_access role and R for others.

For an incoming request, security plugin would first evaluate permissions based on the identity (caller of the request); next, request would be evaluated against resource(in this case a detector) ACL.
Thoughts?

@DarshitChanpura
Copy link
Member Author

@nibix

Are there considerations in which version these permissions shall be introduced? In a major release one might be more liberal about breaking changes than in a minor release.
As some of the plugins might have implemented their own authz mechanisms now, these also need to be considered in the migration process. As soon as the security plugin gets responsible for authz, the plugins authz mechanisms should be no longer in effect.
One way around this would be to require the plugins to introduce new APIs which are then protected by the new scheme, while the old APIs are protected by the old scheme and can be removed eventually. But that might also introduce further and bigger issues 🤔
I would strongly recommend not to assume that roles without resource permissions indicate full permissions in order to stay backwards compat. That would be very counter-intuitive. If such a strong backwards compatibility is required, I'd go with a feature flag instead.

These are great thoughts/suggestions. This feature would require end to end testing to understand the migration path for plugins with custom authz schemes. However, the idea is to enforce those plugins to use security plugin as the standard for any authorization which would be considered breaking change for plugins but the end user experience remains the same (except now they have to have resource permissions, of course).
I agree with not assuming no permission as full permissions and hence suggested an alternative to use feature flag. However, the issue with feature flag is that when it is enabled all plugins must have been onboarded to this new authz scheme, else we'll have to maintain a list of plugins which have onboarded, as a custom setting, and then only evaluate requests pertaining to those plugins, which is a whole new set of problems. I'm strongly in favor of nudging plugins to introduce new APIs that can be hidden behind feature flag until we finally deprecate the current APIs.

Code infrastructure: Extending ActionRequest

I'm leaning towards approach a which makes the implementation simpler and easy to extend, but with a twist. If we are nudging plugins to introduce new APIs, we can extend ActionRequest to a new class ResourceActionRequest and have the plugins using custom authz extend this class instead.

One thing that needs to be clarified is: Shall such requests support resource wildcards () and resource name patterns (my_resource_)? For indices requests, it is standard that wildcards and patterns are supported, so it would be intuitive if resource requests would be supporting these as well. However, resource requests can only carry the patterns. They are most likely lacking the dependencies to resolve these patterns to concrete resource names. This probably can be only done by a central component in the plugin, possibly the TransportAction instance which is responsible for processing the requests. Possibly, one needs to have a further interface which TransportActions need to implement in order to extract concrete resource names from requests.

My understanding is that resource names would be extracted from the API request as is before forwarding those to transport action calls. I'm not entirely sure if we need to expand and resolve resource names similar to index-pattern. If we do then, we will have to develop an API to fetch all resources and resolve them based on the pattern supplied, similar to IndexResolverReplacer. This might need a deeper look.

@cwperks

That data structure can be treated as a registry and an extension point could be introduced for plugins to register resource types. I see "views" is in that list and that could be considered a resource type. Since the new section of the role would be called resource_permissions: there would be some redundancy with all action names prefixed with resource: but we can iterate to determine the ideal path forward. As @nibix mentioned, it would be useful in the action name to both has the plugin identifier (or commonly-used shortname) + resource type to allow plugins to create more than one resource type.

This sounds like a good approach. However it also relies to plugins registering the resource types. If we are to move with this path then we would have to highlight that all plugins that will be utilizing this new authz scheme should register the resource types. AFAIK, keeping a consistent prefix avoids a lot of problems around validation and we already have redundancy in cluster_permissions and index_permissions.

@shikharj05

@DarshitChanpura - can you help list plugins performing custom authZ within the opensearch-project? I think this proposal can cover migrating these plugins away from custom authZ

From my knowledge and talking with @cwperks, these plugins currently perform custom authZ: alerting, anomaly-detection, ml-commons and flow-framework.

For an incoming request, security plugin would first evaluate permissions based on the identity (caller of the request); next, request would be evaluated against resource(in this case a detector) ACL.
Thoughts?

I'm not sure if we have a concept of ACL today, but wouldn't it be similar to creating roles like admin, developer, regular-user and then mapping users to these roles as required to grant them access?

@cwperks
Copy link
Member

cwperks commented Jul 17, 2024

@DarshitChanpura I had some more recent thoughts on Resource Permissions and wanted to run them by you. I think Resource Permissions are fundamentally about sharing a resource created by some user.

I was thinking about what it would take to create a Searchable Photo Album plugin (Hackathon idea) where users of the plugin can share their photo albums with others to add photos, add captions, etc.

Each resource should have a resource user, the user that creates the resource. The resource user, should have the ability to allow others to interact with their resources (either explicitly adding someone by username or perhaps someone that shares a role/backend role)

I'm starting to think about the Job-Scheduler SPI model where the security plugin can listen to index operation on resource indexes and then store the resource user centrally along with the docId of the resource, the resource index, the resource user and what level of access (allowed_actions) in a .resource_sharing index.

Also I'm now thinking that resources should not have unique names. i.e. You and I can both have photo albums called Tulum 2024, but I should be able to add you to my album to add photos.

Let's say user1 is trying to upload a photo to user2's photo album, how would the authz work?

  1. PUT /_plugins/photoalbum/{albumId}/upload <- Assume the resource user with {albumId} belongs to user2

  2. When authorizing the request, the security plugin can use the permissions stored in the .resource-sharing index to determine if the user trying to interact with the resource has access.

For a user's own resource, they can continue to be authorized using the cluster_permissions section of a role definition.

I'm still thinking through this idea some and I will reply back with another comment once its more complete.

@shikharj05
Copy link
Contributor

shikharj05 commented Jul 17, 2024

@cwperks I think your idea also aligns with Resource ACLs. ACLs would allow a user to create a resource and define an ACL on it. i.e. permissions will be defined on the resource and not defined on user requesting access.

@DarshitChanpura
Copy link
Member Author

DarshitChanpura commented Jul 17, 2024

+1 to @shikharj05's comment. I like this idea @cwperks, however we are moving towards a more ACL approach where the resource "owner/creator" has the right to define ACL's to the resource.

At present, the "owner" of these resources is admin or someone who has access to manage roles, and these admins can then define access levels to individual resources or resource patterns. Allowing the ability to share resources is an administerial task. For what you are suggesting, the idea of a cluster admin might change to allow a regular user more control over their own resources. I personally am in favor of adding such a feature however this might be a v2 of this or a separate feature overall.

@DarshitChanpura
Copy link
Member Author

Outstanding Questions:

  1. How are we planning to roll-out this feature?
    • To allow for smoother upgrades, and maintain backwards compatibility, this feature will initially be launched with three modes (settings) : disabled, fail-open and enabled.
      • plugins.security.resource_permissions.enabled: false - resource permission evaluation is completely disabled.
      • plugins.security.resource_permissions.fail_open: true - resource permissions will be evaluated and all failures and successes will be logged, and eventually authorization will be deemed successful. Requires enabled to be set to true.
      • plugins.security.resource_permissions.enabled: true - resource permissions evaluation will be enforced and any failures will cause the request to fail.
    • To accommodate these flags, plugins will initially be launched with a code flag that switches between in-house authorization and centralized security plugin authorization based on plugins.security.resource_permissions.enabled is set to false or true.
      • This is prevent us from taking a step backwards where we have on-boarded all plugins onto the new authz scheme, but then have the feature flag disabled, thus allowing all requests to pass through.

Please let me know your thoughts and any questions.

@DarshitChanpura
Copy link
Member Author

@shikharj05 @nibix @cwperks I have updated the design to follow a new approach that is more scalable. Please review it and add comments. If you would prefer to review via google doc, lmk.
(Thanks @cwperks for helping me draft this).

@DarshitChanpura
Copy link
Member Author

DarshitChanpura commented Aug 20, 2024

This feature will be released under @experimental flag until a PoC is complete.

@DarshitChanpura DarshitChanpura changed the title [Proposal] Resource Permissions [Proposal] Resource Permissions and Sharing Aug 20, 2024
@stephen-crawford
Copy link
Collaborator

Nice job @DarshitChanpura specifically taking all the feedback in stride and reworking the design to fit the comments and concerns :)

@kkhatua kkhatua added Roadmap:Security Project-wide roadmap label enhancement New feature or request labels Aug 26, 2024
@DarshitChanpura
Copy link
Member Author

@shikharj05 @nibix @willyborankin @cwperks @derek-ho I have update the design slightly to include scopes for individual or group access to a resource. Please review whenever you get a chance and let me know if there are any comments/concerns.

@cwperks
Copy link
Member

cwperks commented Sep 17, 2024

In addition each shared_with will also contain the level of access that each user, role or backend_role may have. These are two scopes at present read_only and read_write.

@DarshitChanpura From an implementation perspective how will security differentiate between read operation and write operation? Would it be done in the hasPermission(Resource resource) call? i.e. A plugin utilizing these methods has to pass what type of operation it is?

@DarshitChanpura
Copy link
Member Author

hasPermission will be tweaked to hasPermission(resource, scope) where scope will default to read_only.

Plugin owners are responsible for calling with correct scope. Let me know if you have a different approach in mind

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request resource-permissions Label to track all items related to resource permissions Roadmap:Security Project-wide roadmap label triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable.
Projects
Status: New
Development

No branches or pull requests

6 participants