
RFC: Object level security, Phase 1 #93115

Merged — 18 commits — Apr 12, 2021
Conversation

legrego
Member

@legrego legrego commented Mar 1, 2021

Summary

This RFC describes the first phase of Object Level Security (#82725)

View rendered RFC

@legrego legrego added the RFC label Mar 5, 2021
@legrego legrego marked this pull request as ready for review March 5, 2021 14:48
@legrego legrego added the Team:Security Team focused on: Auth, Users, Roles, Spaces, Audit Logging, and more! label Mar 5, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-security (Team:Security)

@legrego legrego added release_note:skip Skip the PR/issue when compiling release notes v8.0.0 labels Mar 5, 2021
@legrego legrego requested review from jportner, azasypkin and a team March 5, 2021 14:51
@mattkime
Contributor

mattkime commented Mar 5, 2021

There's one more use case that I think should be accounted for: telemetry. I think the needs regarding collecting telemetry on saved objects would be filled by some sort of administrator role, but it's worth specifically calling out just to be safe.

@legrego
Member Author

legrego commented Mar 6, 2021

There's one more use case that I think should be accounted for: telemetry. I think the needs regarding collecting telemetry on saved objects would be filled by some sort of administrator role, but it's worth specifically calling out just to be safe.

@mattkime to clarify, are you interested in making sure that private object telemetry is reportable? If so, I don't think any of the changes in this RFC will impact telemetry collection one way or another. As far as I'm aware, telemetry uses a bare saved objects client, which does not include our security wrapper. This wrapper is what would be responsible for securing access to private objects.

If that addresses your concern, then I'll update the RFC to make note of this. Otherwise I'm happy to discuss further
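The layering legrego describes can be sketched roughly as follows. All names here (`BareRepository`, `SecurityWrapper`) are illustrative, not the actual Kibana interfaces: telemetry reads through the bare client and is unaffected, while end-user requests pass through the security wrapper.

```typescript
interface SavedObjectSketch {
  id: string;
  type: string;
  accessControl?: { owner: string };
}

// The "bare" saved objects client: no security checks, which is why
// telemetry collection would be unaffected by this RFC.
class BareRepository {
  private store = new Map<string, SavedObjectSketch>();
  create(obj: SavedObjectSketch): SavedObjectSketch {
    this.store.set(obj.id, obj);
    return obj;
  }
  get(id: string): SavedObjectSketch | undefined {
    return this.store.get(id);
  }
}

// The security wrapper enforces ownership of `private` objects before
// delegating to the repository.
class SecurityWrapper {
  constructor(
    private readonly repo: BareRepository,
    private readonly user: string
  ) {}
  get(id: string): SavedObjectSketch | undefined {
    const obj = this.repo.get(id);
    if (obj?.accessControl && obj.accessControl.owner !== this.user) {
      throw new Error('Forbidden: private object owned by another user');
    }
    return obj;
  }
}
```

In this model, a telemetry collector holding the bare repository sees every object, while the same `get` through the wrapper is rejected for a non-owner.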

@mattkime
Contributor

mattkime commented Mar 8, 2021

@legrego That sounds good to me. My concern comes from working with spaces where there's not really a way to work with saved objects across spaces.

Contributor

@jportner jportner left a comment


Overall I like it and it sounds straightforward. Nits/feedback below.

We haven't really said this anywhere but this proposal is basically to implement Discretionary Access Control (DAC) for Saved Objects. In future phases we would allow an object's owner (subject) to grant read/write to other users or roles (subjects).

Another thing to consider: do we ever want to make a private object owned by a role instead of a user? Maybe that would be a totally separate classification -- not 'public' or 'private' but something else ('privateGroup').

Comment on lines 148 to 149
#### Performance considerations
When overwriting existing objects, the security wrapper must first retrieve all of the existing `private` objects to ensure that the user is authorized. This requires another round-trip to `get`/`bulk-get` all `private` objects so we can authorize the operation.
Contributor


In the SOR, we already do pre-flight requests for multi-namespace object types when creating, updating, or deleting objects, because we need to ensure that the object exists in the context of the current space.

Rather than adding an additional pre-flight request to the Security wrapper, perhaps we should consider exposing the existing pre-flight requests from the SOR to its wrappers? Maybe allow the wrapper to pass in a handler/callback for those requests. Maybe that's an optimization we can make in a later phase, but I wanted to put it forward for consideration.
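The extra round-trip under discussion can be sketched like this (illustrative only, not Kibana's implementation): before an overwrite, the security wrapper bulk-gets the existing objects so it can verify ownership of any that are `private`. `bulkGet` stands in for the additional request to Elasticsearch.

```typescript
interface ExistingObject {
  id: string;
  owner?: string; // present only on `private` objects
}

function authorizeOverwrite(
  ids: string[],
  currentUser: string,
  bulkGet: (ids: string[]) => ExistingObject[]
): boolean {
  const existing = bulkGet(ids); // the extra pre-flight round-trip
  // Public objects (no owner) pass; private objects must belong to the user.
  return existing.every(
    (o) => o.owner === undefined || o.owner === currentUser
  );
}
```

Exposing the repository's existing pre-flight result to wrappers, as suggested above, would let this check piggyback on a fetch that already happens for multi-namespace types.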

Member Author


Thanks for calling this out, I meant to explain my thinking here. I agree that an additional pre-flight request from the security wrapper is less than ideal, but I'm hesitant to over-engineer an optimal solution in this first phase, since I don't expect that we'll make heavy use of private multi-space objects in the near future.

Refresh my memory: why do the current set of pre-flight namespace checks live in the repository? I want to make sure I'm not overlooking a scenario where the repository should also be validating the accessControl.

Contributor

@jportner jportner Mar 10, 2021


When you take an action on an object, the repository has to retrieve it and make sure the object exists in the current space context. This was not needed when we had single-namespace objects (whose raw document IDs contained the namespace), but now that we have multi-namespace objects we need to do that preflight check.

At the moment the preflight check only occurs on multi-namespace object types, which is rare, but in the future most object types will be multi-namespace...

Edit: agree that we don't need to necessarily over-engineer something for Phase 1, but I also think we should plan ahead for the future when the performance hit will be more egregious.

Member Author


Sorry I should have been clearer: why does the repository have to perform the check, as opposed to say, the spaces wrapper?

Contributor


Because if the Spaces plugin is disabled, the repository should still behave this way (e.g., you should only be able to interact with objects that were in the default space before the Spaces plugin was disabled)

Member Author


It seems like this auditing limitation might come up in a couple of other places. Do we have the same problem when:

  • updating a `namespaceType: 'multiple'` saved object that exists, but not in the current namespace?
  • getting or updating a saved object that doesn't exist?
  • performing an operation which doesn't succeed because of a network error?

Yes, these situations are all possible today. We "mitigate" this by explaining that the audit log records an outcome of unknown for these scenarios. If you want a definitive answer, then you have to correlate Kibana audit logs with the corresponding Elasticsearch audit logs to be sure that the request actually completed. We can definitively tell you when the request has failed due to authorization checks in the wrapper, but beyond that the audit log will effectively tell the auditor that the user was authorized to perform this action, but we don't know if it completed successfully or not.
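The outcome semantics described above can be sketched as follows (the event shape is illustrative, not Kibana's actual audit schema): a denial by the wrapper is definitive, but once an authorized request is handed off, the wrapper cannot observe whether Elasticsearch actually completed it.

```typescript
type AuditOutcome = 'failure' | 'unknown' | 'success';

function auditOutcome(
  authorized: boolean,
  observedCompletion?: boolean
): AuditOutcome {
  // Denied by the wrapper's authorization check: a definitive failure.
  if (!authorized) return 'failure';
  // Authorized, but completion is only known if something reported it back;
  // otherwise the auditor must correlate with Elasticsearch audit logs.
  return observedCompletion === true ? 'success' : 'unknown';
}
```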

I'm quite nervous about the complexity of saved objects as a whole. My impression is that there's only a handful of people that somewhat grasp how it all fits together (and I find myself constantly learning some new behaviour like audit logging limitations). This implementation complexity also comes with a very complex set of tests where the amount of permutations keep steadily growing. So anything that could lower complexity feels worthwhile to explore.

No disagreement here on any of this!

We can always decide to adopt some technical debt / complexity in the short term, but I think a large part of the value of an RFC is starting to shape a more concrete long term technical vision so that if we adopt additional complexity we have some fleshed out ideas for how we can pay it down.

I can spend some time on this if that'd be helpful. Long-term, it seems like the structure of the saved objects service may change somewhat drastically (i.e. potentially deprecating the SOC and generic APIs). I know we've had some long-form discussions, but do we have anything more concise that explains what we're targeting? We probably do, I just can't find it 🙂.
If I know what we're targeting, then I can try to tailor the approach to match that vision.
If not, I can enumerate these other ideas with the assumption that the current architecture will remain largely in place, and then we can figure out what that looks like in the evolved architecture.

Contributor


do we have anything more concise that explains what we're targeting

We need another RFC 🙈

I'm working on writing down a longer-term strategy in 7.13. I think the most impactful change for auditing might be deprecating the saved objects client and generic APIs in favour of plugins building custom HTTP APIs. Previously, auditing had to happen at the saved objects level, but if we're adding another layer into our architecture, is that still the best place?

Maybe as a first step we can start by thinking through the ideal end state of audit logging and work back from that. This might be better suited to a sync discussion, but some questions we could explore:

  • If all data access happened through HTTP APIs which responded with 401, error, or success, could the URL and status code be an adequate replacement for generic saved_object_create/get/find/delete events?
  • There seem to be two classes of audit logs: generic saved_object_* events and events that are specific to the alerting domain. Do we anticipate wanting to add more domain-specific audit events?
  • From an auditing perspective, what do we consider as interesting security events that need to be logged? If a plugin has a hidden saved object exposed through an HTTP API and rejects a request because of some domain-specific business rule, should this create an event? E.g. "only users with super-powers are allowed to break through steel doors"

Member Author


Maybe as a first step we can start by thinking through the ideal end state of audit logging and work back from that. This might be better suited to a sync discussion, but some questions we could explore:

I forked this off into a shared doc to continue the discussion in a more sane medium than GH comment threads

Member Author


@rudolf I started describing some alternate approaches in 682f6c8 (#93115)

Member Author


Of the alternatives, I'm currently favoring 5.2.2 - Pluggable authorization, but not necessarily coupled to this first phase of OLS. I think this model could work within the context of our "Persistence layer"/"Saved objects are data" discussions.

I'm not opposed to reevaluating if our timelines end up converging, though

Member

@azasypkin azasypkin left a comment


ACK: started to review, but haven't managed to finish it yet, will continue tomorrow morning.

#### Attaching Access Controls

This wrapper will be responsible for attaching an access control specification to all private objects before they are created.
It will also allow users to provide their own access control specification in order to support the import/create use cases.
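The behavior quoted above might be sketched like this (hypothetical names, not the final API): by default the wrapper stamps the current user as owner, but callers such as import may supply their own access control specification to preserve the original owner.

```typescript
interface AccessControl {
  owner: string;
}

interface CreateOptionsSketch {
  // Supplied during import to preserve ownership from the exported object.
  accessControl?: AccessControl;
}

function attachAccessControl(
  currentUser: string,
  options: CreateOptionsSketch = {}
): AccessControl {
  // Default: the creating user owns the private object.
  return options.accessControl ?? { owner: currentUser };
}
```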
Member


question: how exactly is this supposed to support the import use case? You mean it will provide a way to specify what owner should be set for all private objects during import?

Member Author


Each private object will have its access specification attached during export, so that it may be preserved on import

Comment on lines 148 to 149
#### Performance considerations
When overwriting existing objects, the security wrapper must first retrieve all of the existing `private` objects to ensure that the user is authorized. This requires another round-trip to `get`/`bulk-get` all `private` objects so we can authorize the operation.
Contributor


Agree with @kobelb. We decided to build the namespace infrastructure into the repository and the same might make sense for private objects. Especially now that licensing is not the primary driver for how we architect these layers.

This wrapper will be responsible for attaching an access control specification to all private objects before they are created.
It will also allow users to provide their own access control specification in order to support the import/create use cases.

Similar to the way we treat `namespaces`, it will not be possible to change an access control specification via the `update`/`bulk_update` functions in this first phase. We may consider adding a dedicated function to update the access control specification, similar to what we've done for sharing to spaces.
Contributor


Why should changing the ACL / owner be a separate API? Could we just prevent changing owners from the client but allow changing the ACL from the repository?

This is very much forward-looking, but in a world where we've deprecated the saved objects client and plugins only consume the repository from behind their API controllers, it feels like it could simplify things if a plugin is able to just update a saved object and its ACL in a single API method.

Member Author


I don't expect to need either of these mechanisms for this first phase of work, so this is something we can discuss in more detail when the time comes. My primary motivation for making this a separate API is audit logging. Auditors might want to know when an object has changed owners, or when a user has chosen to share their object with another user.

The normal update API doesn't record this level of granularity in the audit logs. We could add conditional logic to audit ACL updates differently from other updates, but my preference would be to treat this as a distinct operation.

Secondly, it's plausible that a user would be authorized to update a saved object, but unauthorized to adjust the ACL. Allowing the ACL to be updated via update would further complicate the authorization logic here.

/**
 * * `public` (default): instances of this saved object type will be accessible to all users within the given namespace, who are authorized to act on objects of this type.
 * * `private`: instances of this saved object type will belong to the user who created them, and will not be accessible by other users, except for administrators.
 */
export type SavedObjectsAccessClassification = 'public' | 'private';
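A hypothetical sketch of where the classification would be declared — at type registration, alongside fields like `namespaceType`. The registration shape below is illustrative, not the final API.

```typescript
type SavedObjectsAccessClassification = 'public' | 'private';

interface SavedObjectsTypeSketch {
  name: string;
  namespaceType: 'single' | 'multiple' | 'agnostic';
  // Applies to every instance of the type; defaults to 'public'.
  accessClassification?: SavedObjectsAccessClassification;
}

// A plausible private type: settings that belong to a single user.
const userSettingsType: SavedObjectsTypeSketch = {
  name: 'user-settings',
  namespaceType: 'single',
  accessClassification: 'private',
};
```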
Contributor


When a saved object is public, it means we can quickly skip the performance cost of ACL checking. But how does this fit into the longer-term vision of OLS? I understand that transitioning from private to public is out of scope, but if this is something that's applied to the saved object type instead of a single object, it wouldn't be possible to later share this saved object.

It feels like restricting this to a type moves us away from true OLS, where the permissions are solely determined by the object itself. Are private types just technical debt until we can solve the performance problems, or do we expect to always have this distinction?

Member Author


Good question. When I discussed this first phase with @arisonl, he cautioned that the future of OLS is uncertain at this point. While the meta issue describes a potential end state, it's not clear if/when sharing "public" objects (such as dashboards) will be possible.

Having said that, I could envision a third type of classification for objects that are sharable to other users. At that point, we could design the transition for moving public objects to become sharable. My current thinking is that private objects will never be sharable. They contain information that is specific to the current user, so it doesn't make sense to allow them to be shared.


# 6. Adoption strategy

Adoption for net-new features is hopefully straightforward. Like most saved object features, the saved objects service will transparently handle all authorization and auditing of these objects, so long as they are properly registered.
Contributor


Thinking through our security model, we would only be able to enforce security if Elasticsearch is behind a firewall preventing users from connecting directly, and users don't have access to the developer console. Since the same is true of spaces, is this something users are already familiar with? I couldn't find anything in our documentation.

Member Author


We expect that a vast majority of Kibana users will not have direct access to the .kibana index. We have a note in the breaking changes doc which implies this, but we don't do a great job of documenting this today 🙈.


Comment on lines 25 to 26
OLS allows saved objects to be owned by individual users. This allows Kibana to store information that is specific
to each user, which enables further customization and collaboration throughout our solutions.
Contributor


I see the value in being able to designate a saved object as belonging to a specific user identity. But it's probably worth asking, do we also need security? Just like a user is able to completely mess up or delete another user's dashboard would it matter if there wasn't a hard control to prevent accessing a saved object belonging to another user, but this was simply enforced in the UI layer?

Member Author


It's a good question. I think there are some cases where recording identity would be enough, but there are other cases where we'd want to enforce security.

Take user preferences as an example: There isn't a ton of value in preventing Alice from reading Bob's preferences, but it would be really troubling if Alice could modify or delete Bob's preferences. Since all users would be authorized to update/delete these SOs, even a traditionally "readonly" user might be authorized to bulk delete the preferences of all other users. That's not something that an administrator could easily fix, unless they happened to have a snapshot of the Kibana index that they were willing to restore.
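The asymmetry in this example can be sketched with a hypothetical model: reads of another user's preferences may be tolerable, but writes must be hard-blocked rather than merely hidden in the UI.

```typescript
interface UserPrefs {
  owner: string;
  theme: string;
}

function updatePrefs(prefs: UserPrefs, actor: string, theme: string): UserPrefs {
  if (prefs.owner !== actor) {
    // Even a traditionally "readonly" role must not be able to clobber
    // another user's preferences — UI-only enforcement would not stop this.
    throw new Error(`${actor} may not modify preferences owned by ${prefs.owner}`);
  }
  return { ...prefs, theme };
}
```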

@legrego legrego requested a review from rudolf April 2, 2021 12:28
Contributor

@rudolf rudolf left a comment


Thanks for fleshing this out, I'm 👍 for the design.

If we're deprecating saved objects in the long term, then having some additional technical debt that's isolated to the wrapper might be ideal. That way, if we move it to the Kibana Persistence service at a later stage, it's easy to remove all of the private-objects technical debt. It also doesn't feel like it makes sense to try to improve the architecture in Saved Objects directly; we should rather use these ideas in how we build the persistence service.

Contributor

@jportner jportner left a comment


LGTM!

A few notes:

  • 3.2 Saved Objects API: Looking at create / bulk_create again, in the implementation I think we'll want to ensure that the object's owner is maintained even in the event of an overwrite (e.g., user "larry" created a profile, then user "admin" overwrites that profile -- the profile should still be owned by "larry"). I'm guessing you already intended this behavior, and I don't think it necessarily has to be in the RFC, but I wanted to point it out!
  • 5.2 Re-using the repository's pre-flight checks: I thought of another option as an alternative to a callback as described in 5.2.3. Instead of that, any client wrapper could fetch the object/s on its own and pass that down to the repository in an options field (preflightObject/s?) so the repository can reuse that result if it's defined, instead of initiating an entire additional preflight check. That resolves our problem without much additional complexity.
    Of course we don't want consumers (mis)using this field; we can either mark it as @internal or explore creating a separate "internal SOC" interface that is only meant to be used by the SOC wrappers.
  • 8.3 Behavior when security is disabled: I think this is resolved, private objects should be accessible if the Security plugin is disabled

@legrego
Member Author

legrego commented Apr 8, 2021

3.2 Saved Objects API: Looking at create / bulk_create again, in the implementation I think we'll want to ensure that the object's owner is maintained even in the event of an overwrite (e.g., user "larry" created a profile, then user "admin" overwrites that profile -- the profile should still be owned by "larry"). I'm guessing you already intended this behavior, and I don't think it necessarily has to be in the RFC, but I wanted to point it out!

I've always considered an overwrite to be the same as a delete + create. In that scenario, I wouldn't expect the object's owner to be maintained.

5.2 Re-using the repository's pre-flight checks: I thought of another option as an alternative to a callback as described in 5.2.3. Instead of that, any client wrapper could fetch the object/s on its own and pass that down to the repository in an options field (preflightObject/s?) so the repository can reuse that result if it's defined, instead of initiating an entire additional preflight check. That resolves our problem without much additional complexity.
Of course we don't want consumers (mis)using this field, we can either mark it as @internal or we could explore creating a separate "internal SOC" interface that is only meant to be used by the SOC wrappers.

I'll update the alternatives to add this suggestion - thanks!

8.3 Behavior when security is disabled: I think this is resolved, private objects should be accessible if the Security plugin is disabled

++ I'll update the RFC to reflect this as well

Member

@azasypkin azasypkin left a comment


Looks great, thanks!


The security wrapper will ensure that the user is authorized to share/unshare all existing `private` objects.
#### Performance considerations
Similar to the "create / override" scenario above, the security wrapper must first retrieve all of the existing `private` objects to ensure that the user is authorized. This requires another round-trip to `get`/`bulk-get` all `private` objects so we can authorize the operation.
Member


question:

This requires another round-trip to get/bulk-get all private objects

If I read it correctly, addToNamespaces / deleteFromNamespaces work with single objects, so we don't need to do a bulk-get here? Feel free to ignore this comment if we plan to extend these APIs to work with multiple objects in the future.

Member Author


This will be changing in the future to support bulk operations via a new updateObjectsSpaces function (https://github.com/jportner/kibana/blob/4d0b48e9828e27bc645a2a0b4883acc3cc4a3cc7/src/core/server/saved_objects/service/lib/repository.ts#L1271-L1271).

Otherwise, you're right that the current implementation does not support bulk operations

Member


Good to know, thanks!

@legrego legrego added v7.13.0 auto-backport Deprecated - use backport:version if exact versions are needed labels Apr 12, 2021
@legrego
Member Author

legrego commented Apr 12, 2021

Thanks all for your feedback and thoughtful comments!

@legrego legrego merged commit 7e2ffc0 into elastic:master Apr 12, 2021
@legrego legrego deleted the rfc/ols-phase-1 branch April 12, 2021 16:27
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Apr 12, 2021
Co-authored-by: Joe Portner <5295965+jportner@users.noreply.github.com>
@kibanamachine
Contributor

💚 Backport successful

Branch: 7.x

This backport PR will be merged automatically after passing CI.

kibanamachine added a commit that referenced this pull request Apr 12, 2021
Co-authored-by: Joe Portner <5295965+jportner@users.noreply.github.com>

Co-authored-by: Larry Gregory <larry.gregory@elastic.co>
Co-authored-by: Joe Portner <5295965+jportner@users.noreply.github.com>