[RFC] Aligning Access and Visibility in OpenSearch #4069

Open
peternied opened this issue Feb 22, 2024 · 5 comments
Labels
help wanted Community contributions are especially encouraged for these issues. triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable.

Comments

@peternied
Member

peternied commented Feb 22, 2024

Problem Statement

When an OpenSearch cluster admin, Alice, creates a Quarterly Sales dashboard and shares it with user Bill, Alice does not know whether Bill will see the same data.

How can this be?

In the Security Plugin there are several permission rules that are computed based on the user sending the query. These alter the query results, and their effect is invisible to Bill.

  1. Permissions for indexes are additive: to read an index, a user needs a role that grants them permission to that index. Sales data could be in indices named sales-na-YYYY-MM and sales-eu-YYYY-MM, and Bill might only have access to the EU region.
  2. Document-level security features (including row, column, and field-masking rules) are additive: if a user has a role with these features enabled, it adds a restriction on the data. For example, sensitive sales data is filtered out by a DLS rule, customer-name == Frodo Baggins.
  3. Security supports operating in a mode where queries are edited in flight to remove indexes the user does not have permission to access. Bill does not have access to the NA region sales data, and those results are silently filtered out.

Bill's view of the dashboard would be missing data, and he would be unaware of it. Even if he brought this to Alice's attention, she does not have a straightforward way to know what data Bill is missing, nor whether he needs more permissions (to sales-na-*) or fewer permissions (so the sensitive sales data filter is not applied).
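
To make the three rules above concrete, here is a minimal sketch of how the same dashboard query can silently produce different totals for Alice and Bill. Everything in it is hypothetical: the `INDICES` data, the `ROLES` table, and the `run_query` helper are a toy in-memory model for illustration and not the Security Plugin's actual API or evaluation order.

```python
from fnmatch import fnmatchcase

# Toy in-memory "cluster" (hypothetical data, for illustration only).
INDICES = {
    "sales-na-2024-01": [
        {"customer-name": "Frodo Baggins", "amount": 100},
        {"customer-name": "Sam Gamgee", "amount": 50},
    ],
    "sales-eu-2024-01": [
        {"customer-name": "Frodo Baggins", "amount": 70},
        {"customer-name": "Merry Brandybuck", "amount": 30},
    ],
}

# Hypothetical role model: index patterns a user may read, plus a
# DLS-style rule expressed as a set of customer names to filter out.
ROLES = {
    "alice": {"index_patterns": ["sales-*"], "dls_exclude": set()},
    "bill": {"index_patterns": ["sales-eu-*"], "dls_exclude": {"Frodo Baggins"}},
}


def run_query(user, query_pattern):
    """Return (indices actually searched, total sales amount) for a user."""
    role = ROLES[user]
    # Expand the dashboard's index pattern against all indices.
    requested = [name for name in INDICES if fnmatchcase(name, query_pattern)]
    # Rules 1 and 3: silently drop indices the user has no permission for.
    allowed = [
        name
        for name in requested
        if any(fnmatchcase(name, p) for p in role["index_patterns"])
    ]
    # Rule 2: apply the DLS-style filter to the remaining documents.
    docs = [
        doc
        for name in allowed
        for doc in INDICES[name]
        if doc["customer-name"] not in role["dls_exclude"]
    ]
    return allowed, sum(doc["amount"] for doc in docs)


alice_result = run_query("alice", "sales-*")
bill_result = run_query("bill", "sales-*")
print("alice:", alice_result)
print("bill: ", bill_result)
```

Neither call raises an error or a warning, so nothing in Bill's result tells him that an entire region and one customer's documents were dropped.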

flowchart TB
    A["Quarterly Sales Dashboard Query: sales-*"] -->B1["Alice: Expand Query Indices<br/>sales-na-jan, sales-na-feb, sales-eu-jan, sales-eu-feb"]
    A -->B2["Bill: Expand Query Indices<br/>sales-na-jan, sales-na-feb, sales-eu-jan, sales-eu-feb"]
    
    subgraph Alice ["Alice's Flow"]
    B1 --> C1{"Resolve User Permissions"}
    C1 -->|No Filters Applied| E1["Filtered Indices:<br/>sales-na-jan, sales-na-feb, sales-eu-jan, sales-eu-feb"]
    E1 --> F1{Apply DLS Rules}
    F1 -->|No DLS Rules Applied| G1["Final Query:<br/>sales-na-jan, sales-na-feb, sales-eu-jan, sales-eu-feb"]
    G1 --> H1["Run Query"]
    end
    
    subgraph Bill ["Bill's Flow"]
    B2 --> C2{"Resolve User Permissions"}
    C2 -->|Filters Applied| E2["Filtered Indices:<br/>sales-eu-jan, sales-eu-feb"]
    E2 --> F2{Apply DLS Rules}
    F2 -->|Exclude 'Frodo Baggins'| G2["Final Query:<br/>sales-eu-jan, sales-eu-feb Exclude 'Frodo Baggins'"]
    G2 --> H2["Run Query"]
    end

    H1 --> I["All Indices Data"]
    H2 --> I
    
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style I fill:#bff,stroke:#333,stroke-width:2px

Proposal

There should be a resource that combines these different access control rules, so that if access is granted to that resource, Alice has confidence that Bill sees exactly the same view of the data. If Bill does not have access, there is a clear path to resolve the access question: "Alice, could you grant me access to the 'Company wide sales data' resource so I can see the Quarterly Sales dashboard?"

This resource would be fundamentally different from the existing index, index pattern, alias, and datastream resources, which require explicit permissions. Administering these resources would require permissions on the 'targeted' indexes.

This resource would need a new way to be granted and reviewed, since the existing permissions model only allows for cluster-wide, index, and tenant permissions.

I suggest this is called a View, as in "Alice, could you grant me access to the 'Company wide sales view'?"
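
One way to picture the proposal is the sketch below. It is only an illustration under assumptions: the RFC does not define a data model, so the `View` class, `create_view`, `search_view`, and `ViewAccessDenied` are hypothetical names, and the in-memory `DATA` dictionary stands in for real indices. It shows the three properties described above: creating a view requires permissions on the targeted indexes, everyone granted the view sees identical results, and a user without access gets an explicit denial that names the view.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Set


class ViewAccessDenied(Exception):
    """Raised when a user queries a view they have not been granted."""


@dataclass
class View:
    # Hypothetical shape of a view: a name, the index patterns it targets,
    # and the users who have been granted access to it.
    name: str
    target_patterns: List[str]
    granted_users: Set[str] = field(default_factory=set)


def create_view(name, target_patterns, creator_indices, all_indices):
    """Create a view only if the creator can read every targeted index."""
    targeted = [
        index
        for index in all_indices
        if any(fnmatchcase(index, p) for p in target_patterns)
    ]
    missing = [index for index in targeted if index not in creator_indices]
    if missing:
        raise PermissionError(f"creator lacks permissions on: {missing}")
    return View(name, list(target_patterns))


def search_view(view, user, data):
    """Return the same documents for every granted user, or fail loudly."""
    if user not in view.granted_users:
        raise ViewAccessDenied(f"{user} needs access to the '{view.name}' view")
    return [
        doc
        for index in sorted(data)
        if any(fnmatchcase(index, p) for p in view.target_patterns)
        for doc in data[index]
    ]


# Toy data standing in for real indices (hypothetical).
DATA = {"sales-na-2024-01": ["na-order-1"], "sales-eu-2024-01": ["eu-order-1"]}

view = create_view(
    "Company wide sales view", ["sales-*"],
    creator_indices=set(DATA), all_indices=DATA,
)
view.granted_users.update({"alice", "bill"})
print(search_view(view, "alice", DATA) == search_view(view, "bill", DATA))
```

In this model the failure mode changes from silently missing data to an explicit error that tells the user which view to request access to.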

Additional Context

Related issues

@peternied peternied added the help wanted Community contributions are especially encouraged for these issues. label Feb 22, 2024
@github-actions github-actions bot added the untriaged Require the attention of the repository maintainers and may need to be prioritized label Feb 22, 2024
@varun-lodaya
Contributor

Are you trying to just abstract all the permissions into a common logical permission and then apply it, or will this work differently than existing permissions? Also, this sounds like coarse-grained access control, right? How will you customize it and let users configure fine-grained controls here?

@peternied
Member Author

@varun-lodaya Thanks for taking a look - the utility of fine vs coarse is dependent on what resource is being permissioned. The problem raised by this issue is that it can be unclear how different levels of fine-grained control operate. I don't think there is a requirement for a single permissions model; the existing model creates these problems and the question is whether we should address it.

We've had many issues reported to back up this problem statement and I'll collect more and update the description to include them.

Do you agree with the premise of the problem? It sounds like you might have an alternative proposal in mind.

@derek-ho
Contributor

[Triage] It sounds like this issue is trying to find a path forward for an issue that comes up often regarding how we handle access for certain data sets.

@derek-ho derek-ho added triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable. and removed untriaged Require the attention of the repository maintainers and may need to be prioritized labels Feb 26, 2024
@msfroh

msfroh commented Apr 6, 2024

I've been trying to think about how to use the "views" concept to support multi-tenant clusters.

I wrote up a straw API call example to create a view, specify who can access it, and provide a series of processors that limit the requests/responses on the views: https://docs.google.com/document/d/1AUEPXw5-P7UxW_UhdLdXw_LZTkCVyQdYEnh676WGlC4/edit?usp=sharing

The analogy I gave on another issue is that your cluster is an amusement park and a view is an entrance. The entrance you use determines which credentials are required and what wrist-band you're issued (which determines what rides/amenities you can access).

@peternied -- I don't know if this is in line with what you had in mind, but it makes sense to me.

@peternied
Member Author

@msfroh Thanks for writing up that doc. I like many of the ideas you proposed. As 'views' has an implication in the DB space, I wouldn't mind an alternative, along the lines of data_park_entrance_gate_name 🤣


4 participants