Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Change Proposal] Add support for specifying Elasticsearch privileges #203

Closed
3 tasks done
axw opened this issue Aug 1, 2021 · 16 comments
Closed
3 tasks done

[Change Proposal] Add support for specifying Elasticsearch privileges #203

axw opened this issue Aug 1, 2021 · 16 comments
Labels
discuss Issue needs discussion Team:Fleet Label for the Fleet team Team:Integrations Label for the Integrations team

Comments

@axw
Copy link
Member

axw commented Aug 1, 2021

In elastic/kibana#94591 and elastic/kibana#97366, Kibana added support for informing fleet-server of the privileges that should be assigned to an agent API Key. There was mention of a pending proposal in one of the PRs for packages to specify privileges ("permissions"):

To do so we look into the policy's PacakgePolicy and the enabled inputs. We link that with the package DataStream to determine which indices it uses and which permissions it needs. The permissions will be specified in the package's data_streams/*/manifest.yml (pending proposal).

We would like to do just this in APM Server, to enable servers to periodically refresh and read a single non-sensitive data stream: elastic/apm-server#5490. Specifically, we require the maintenance and read index privileges for the traces-apm.sampled-* index pattern. I don't think the proposal was ever made, so I'm proposing it now.

The way the Kibana code is written, a permissions property would be expected in data_stream/<data_stream>/manifest.yml, e.g.

title: APM tail-sampled traces
type: traces
dataset: apm.sampled
ilm_policy: traces-apm.sampled-default_policy
permissions:
  cluster: ['monitor']
  # Additional index privileges are required for tail-based sampling:
  #   maintenance: for refreshing sampling decisions
  #   read: for reading sampling decisions
  indices: ['auto_configure', 'create_doc', 'maintenance', 'read']
elasticsearch:
  index_template:
    settings:
      # Create a single shard per index, so we can use
      # global checkpoints as a way of limiting search
      # results.
      number_of_shards: 1

Given that these permissions are Elasticsearch privileges, I think it might make more sense to define them as such explicitly. e.g.

title: APM tail-sampled traces
type: traces
dataset: apm.sampled
ilm_policy: traces-apm.sampled-default_policy
elasticsearch:
  privileges:
    cluster: ['monitor']
    indices: ['auto_configure', 'create_doc', 'maintenance', 'read']
  index_template:
    settings:
      # Create a single shard per index, so we can use
      # global checkpoints as a way of limiting search
      # results.
      number_of_shards: 1

Implementation issues:

  1. Add support for specifying Elasticsearch privileges in data stream manifest #210
  2. util: add support for elasticsearch.privileges.indices to DataStream type package-registry#712
  3. [Fleet] Add support for Elasticsearch privileges in data stream manifest kibana#109047
@axw axw added the discuss Issue needs discussion label Aug 1, 2021
@ruflin ruflin added Team:Fleet Label for the Fleet team Team:Integrations Label for the Integrations team labels Aug 2, 2021
@ruflin
Copy link
Member

ruflin commented Aug 2, 2021

Thanks for putting together directly a proposal. I think there are at least 2 ways to approach this. First as proposed here that the manifest hands over a predefined list of permissions to Fleet and Fleet just adds it to the policy without knowing what is inside. This gives a lot of freedom to the package dev to basically do whatever is needed.

The second part is around more "opinionated" config options that do not directly represent the Elasticsearch security permissions. For example the user can only set the following:

elasticsearch.permissions:
  read: true

With this, we would ensure that data streams are always only have a small subset of the permissions and Fleet would "understand" what permissions were given and could even have special visualisations around it to show what kind of permissions are there / required.

I think the first one is easy to implement and gives APM exactly what is needed but I'm not sure if we would benefit long term more from option 2 to keep more control. Opinions?

@axw
Copy link
Member Author

axw commented Aug 2, 2021

I don't see a compelling reason to have an alternative set of privilege names.

Fleet could restrict the allowed privileges without coming up with its own mapping -- it would just need to restrict the values that can be specified in elasticsearch.privileges.{cluster,indices} based on some set of known Elasticsearch privileges.

Ultimately I think we would be better served by not having such hard-coded restrictions, but instead:

  • have a minimal set of privileges that any integration can use, i.e. the defaults we have today
  • any package-specified privileges beyond those should prompt the user to explicitly approve the privileges at integration install time, much like you do when you install an app on an app store. Because APM would/should be installed automatically by Kibana, it would be a special case and would not require explicit opt-in.

could even have special visualisations around it to show what kind of permissions are there / required

You would know better than me, but I would expect integrations with additional privileges to be uncommon. Since it would be uncommon, this could just be a basic list of the exact Elasticsearch privilege names and a link to https://www.elastic.co/guide/en/elasticsearch/reference/current/security-privileges.html for an explanation of what they mean.

@exekias
Copy link
Contributor

exekias commented Aug 2, 2021

I don't expect many integrations needing this, hence I consider it advanced usage.

Adding a translation layer also means we need to document & keep it up to date, later ES breaking changes may create some headaches for us too.

Since more integration developers will be coming it's really important that we prompt the user when installing one with extra privileges for them to inspect. I'm wondering if indices is mostly safe as it only affects the integration data streams (am I right?). cluster, of course, has a wider impact.

@mostlyjason
Copy link

@axw have you considered how this change affects the deployment model of our users? Many operate Elastic Agent on Endpoints that do not have a direct connection to Elasticsearch for security reasons. They send data only through proxies like Logstash or Kafka. We also considering a change to introduce a secure proxy to avoid exposing ES credentials on endpoints.

For tail based sampling, this would require APM server to run in centralized trusted environment. Have you considered a local version of tail based sampling that can operate on endpoints without ES access?

@axw
Copy link
Member Author

axw commented Aug 3, 2021

@mostlyjason yes, we have considered this. What we'll be delivering initially will have some restrictions; we may introduce options for other deployment modes later. The alternative is to have centralised APM Servers with trace ID routing, which introduces other complications and costs.

For tail based sampling, this would require APM server to run in centralized trusted environment. Have you considered a local version of tail based sampling that can operate on endpoints without ES access?

Not necessarily centralised, but yes they do need to have direct access to Elasticsearch for now. Our tail-based sampling is coherent, meaning that all events in the trace are sampled according to the same decision. This either requires all events to go through the same APM Server, or for APM Servers to coordinate. For various reasons that I won't go into here, we have gone with the latter.

There are two ways in which we expect APM to be deployed: on trusted hosts for backend services, and as a central service (e.g. ESS) for RUM/untrusted applications. We can later introduce another proxy-type service for this feature, which would take care of reading from Elasticsearch and making that data available to APM Servers. Another option would be to chain servers, with a central or regional server with direct access to Elasticsearch performing the tail-based sampling.

@axw
Copy link
Member Author

axw commented Aug 3, 2021

@ruflin @exekias it sounds like we're aligned in our thinking.

Since more integration developers will be coming it's really important that we prompt the user when installing one with extra privileges for them to inspect. I'm wondering if indices is mostly safe as it only affects the integration data streams (am I right?). cluster, of course, has a wider impact.

I agree, but I imagine this will take a while to design and implement - and I would like to not have to wait for that to enable tail-based sampling. We shouldn't YOLO security either though.

As a compromise, how about the following proposal?

  • Add elasticsearch.privileges.indices to package-spec/package-registry. No support for specifying cluster privileges initially.
  • Kibana (Fleet) would initially restrict the allowed index privileges to: auto_configure, create_doc, maintenance, monitor, read, read_cross_cluster

Does this sound reasonable? @jen-huang sound okay from a Fleet perspective?

@exekias
Copy link
Contributor

exekias commented Aug 3, 2021

@mtojek @jsoriano thoughts?

@mtojek
Copy link
Contributor

mtojek commented Aug 4, 2021

Sorry, I miss the context/justification behind this feature (not Elasticsearch expert), so not blocking this initiative.

@jsoriano
Copy link
Member

jsoriano commented Aug 4, 2021

Andrew's proposal sounds good to me, I think it makes sense to add this kind of settings to packages.

Beyond the change in the spec, something that is not completely clear to me is how elasticsearch.privileges.indices would be interpreted. Will this be checked when enrolling an agent, to confirm that it is using an API key with the required privileges? (If that is the case maybe we can call the setting elasticsearch.required_privileges.indices.

@ruflin
Copy link
Member

ruflin commented Aug 4, 2021

@jsoriano The way this works in Fleet is that Fleet creates a permission block in the policy. Fleet-server uses this permission block to create the API with these permissions for the Elastic Agent. So the Elastic Agent will always receive only an API key with these permissions.

@exekias
Copy link
Contributor

exekias commented Aug 12, 2021

As a compromise, how about the following proposal?

  • Add elasticsearch.privileges.indices to package-spec/package-registry. No support for specifying cluster privileges initially.
  • Kibana (Fleet) would initially restrict the allowed index privileges to: auto_configure, create_doc, maintenance, monitor, read, read_cross_cluster

Does this sound reasonable? @jen-huang sound okay from a Fleet perspective?

This looks reasonable to me, WDYT @jen-huang? Once this go in the spec we would need code handling permissions on the kibana side

@jen-huang
Copy link
Contributor

jen-huang commented Aug 16, 2021

Add elasticsearch.privileges.indices to package-spec/package-registry. No support for specifying cluster privileges initially.

++

Kibana (Fleet) would initially restrict the allowed index privileges to: auto_configure, create_doc, maintenance, monitor, read, read_cross_cluster

Currently Fleet defaults to auto_configure and create_doc. Are we sure we need to expand these default privileges?

Once this proposal is closed, please open a corresponding Kibana issue. Thanks!

Edit: I accidentally closed this :D

@jen-huang jen-huang reopened this Aug 16, 2021
@axw
Copy link
Member Author

axw commented Aug 18, 2021

Currently Fleet defaults to auto_configure and create_doc. Are we sure we need to expand these default privileges?

They should not be default across the board. There's one data stream for which APM needs additional privilegesw. Fleet should consider these to be the maximum set of allowable privileges, but should continue to default to the privileges given today.

@axw
Copy link
Member Author

axw commented Aug 18, 2021

Calling this accepted. I've created implementation issues, will close this once they've been implemented.

@axw
Copy link
Member Author

axw commented Sep 21, 2021

All implementation tasks are complete. I'm going to put together an end-to-end test, and once I've confirmed everything is working well together I'll close this.

@axw
Copy link
Member Author

axw commented Sep 22, 2021

I've manually confirmed the changes are working. The end-to-end test will take a little while longer to put together for unrelated reasons, so I'll close this now. Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs discussion Team:Fleet Label for the Fleet team Team:Integrations Label for the Integrations team
Projects
None yet
Development

No branches or pull requests

7 participants