Service tokens ignore privileges additionally defined by packages #1048

Closed
simitt opened this issue Dec 30, 2021 · 16 comments · Fixed by elastic/elasticsearch#82600
Labels: bug, Team:Elastic-Agent-Control-Plane, v8.1.0

simitt commented Dec 30, 2021

Removing the username/password and enforcing usage of service tokens (#1006) revealed a bug related to the privileges that the resulting API Keys have. It seems that any additionally defined privileges of the packages are just ignored.

Description of the Problem

  1. The apmpackage specifies additional privileges for the traces-apm.sampled-default data stream:
elasticsearch:
  privileges:
    indices: [auto_configure, create_doc, maintenance, monitor, read]
  1. After setting up an agent policy that contains the apm integration and enrolling an Elastic Agent, state/data/state.yml lists the privileges for the data streams as configured in the apmpackage:
bash-4.2$ cat state/data/state.yml

action:
    ...
    inputs:
    - apm-server:
       ...
      data_stream:
        namespace: default
      id: c0f3d4d1-acc9-447c-a443-f9049dab8ee5
      meta:
        package:
          name: apm
          version: 8.1.0
      name: apm_systemtest_2-apm
      revision: 1
      type: apm
      use_output: default
    output_permissions:
      default:
        _elastic_agent_checks:
          cluster:
          - monitor
        _elastic_agent_monitoring:
         ...
        apm_systemtest_2-apm:
          cluster:
          - cluster:monitor/main
          indices:
          - names:
            - logs-apm.app-default
            privileges:
            - auto_configure
            - create_doc
          - names:
            - metrics-apm.app.*-default
            privileges:
            - auto_configure
            - create_doc
          - names:
            - logs-apm.error-default
            privileges:
            - auto_configure
            - create_doc
          - names:
            - metrics-apm.internal-default
            privileges:
            - auto_configure
            - create_doc
          - names:
            - metrics-apm.profiling-default
            privileges:
            - auto_configure
            - create_doc
          - names:
            - traces-apm.rum-default
            privileges:
            - auto_configure
            - create_doc
          - names:
            - traces-apm.sampled-default
            privileges:
            - auto_configure
            - create_doc
            - maintenance
            - monitor
            - read
          - names:
            - traces-apm-default
            privileges:
            - auto_configure
            - create_doc
    outputs:
      default:
        api_key: vOL9Cn4BqaHYuqz-hbSb:Iboop-tIQjKH7tLzzFH0Yg
        hosts:
        - http://elasticsearch:9200
        type: elasticsearch
    revision: 2
  1. Base64-encoding this API Key and querying for its actual privileges shows that some privileges are missing:
simitt@simmac ~ % curl -H "Authorization: ApiKey dk9MOUNuNEJxYUhZdXF6LWhiU2I6SWJvb3AtdElRaktIN3RMenpGSDBZZw==" -X GET "localhost:9200/_security/user/_has_privileges?pretty" -H 'Content-Type: application/json' -d'
{
  "cluster": [ "cluster:monitor/main" ],
  "index" : [
    {
      "names": [ "logs-apm.app-default", "metrics-apm.app.*-default", "logs-apm.error-default", "metrics-apm.internal-default", "metrics-apm.profiling-default", "traces-apm.rum-default", "traces-apm-default" ],
      "privileges": [ "auto_configure","create_doc" ]
    },
    {
      "names": [ "traces-apm.sampled-default" ],
      "privileges": [ "auto_configure","create_doc","maintenance","monitor","read" ]
    }
  ]
}
'



{
  "username" : "elastic/fleet-server",
  "has_all_requested" : false,
  "cluster" : {
    "cluster:monitor/main" : true
  },
  "index" : {
    "logs-apm.app-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "logs-apm.error-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "metrics-apm.app.*-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "metrics-apm.internal-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "metrics-apm.profiling-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "traces-apm-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "traces-apm.rum-default" : {
      "create_doc" : true,
      "auto_configure" : true
    },
    "traces-apm.sampled-default" : {
      "read" : false,
      "create_doc" : true,
      "auto_configure" : true,
      "monitor" : false,
      "maintenance" : false
    }
  },
  "application" : { }
}
  1. Verifying read privileges with a concrete request confirms that they are indeed missing:
simitt@simmac ~ % curl -i -H "Authorization: ApiKey dk9MOUNuNEJxYUhZdXF6LWhiU2I6SWJvb3AtdElRaktIN3RMenpGSDBZZw==" "http://localhost:9200/traces-apm-default/_search"
HTTP/1.1 403 Forbidden
X-elastic-product: Elasticsearch
content-type: application/json;charset=utf-8
content-length: 621

{"error":{"root_cause":[{"type":"security_exception","reason":"action [indices:data/read/search] is unauthorized for API key id [vOL9Cn4BqaHYuqz-hbSb] of user [elastic/fleet-server] on indices [traces-apm-default,.ds-traces-apm-default-2021.12.30-000001], this action is granted by the index privileges [read,all]"}],"type":"security_exception","reason":"action [indices:data/read/search] is unauthorized for API key id [vOL9Cn4BqaHYuqz-hbSb] of user [elastic/fleet-server] on indices [traces-apm-default,.ds-traces-apm-default-2021.12.30-000001], this action is granted by the index privileges [read,all]"},"status":403}%
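For anyone reproducing the steps above, here is a minimal Python sketch (not part of the original report). It shows how the `ApiKey` header value can be derived from the `api_key` entry in state.yml, and how the privileges reported as `false` can be pulled out of a `_has_privileges` response. The helper names `api_key_header` and `missing_privileges` are hypothetical, and the response is only an excerpt of the one shown above.

```python
import base64


def api_key_header(api_key: str) -> str:
    """Build the Authorization header value from an "id:secret" API key string."""
    encoded = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    return f"ApiKey {encoded}"


def missing_privileges(has_privileges_response: dict) -> dict:
    """Map each index name to the sorted privileges reported as false."""
    missing = {}
    for index_name, privileges in has_privileges_response.get("index", {}).items():
        denied = sorted(name for name, granted in privileges.items() if not granted)
        if denied:
            missing[index_name] = denied
    return missing


# Excerpt of the response shown above (other indices omitted for brevity).
response = {
    "has_all_requested": False,
    "index": {
        "traces-apm-default": {"create_doc": True, "auto_configure": True},
        "traces-apm.sampled-default": {
            "read": False,
            "create_doc": True,
            "auto_configure": True,
            "monitor": False,
            "maintenance": False,
        },
    },
}

print(api_key_header("vOL9Cn4BqaHYuqz-hbSb:Iboop-tIQjKH7tLzzFH0Yg"))
print(missing_privileges(response))
```

Running it against the excerpt reports `maintenance`, `monitor`, and `read` as missing on traces-apm.sampled-default, which matches the 403 above.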
@simitt simitt added bug Something isn't working v8.0.0 labels Dec 30, 2021
@michel-laterman
Contributor

I'm not sure if this is related, but I've noticed that the service_token that Kibana issues is unable to list policies:

elastic-agent|retrieve-service-token-container⚡ ⇒ curl -XPOST localhost:5601/api/fleet/service-tokens -u elastic:changeme -H "kbn-xsrf: value"
{"name":"token-1640912238005","value":"AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE2NDA5MTIyMzgwMDU6ak53LTZHdkFTZnFrN0dTal9SM3BiUQ"}%
elastic-agent|retrieve-service-token-container⚡ ⇒ curl localhost:5601/api/fleet/agent_policies -H "Authorization: Bearer AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE2NDA5MTIyMzgwMDU6ak53LTZHdkFTZnFrN0dTal9SM3BiUQ"
{"items":[],"total":0,"page":1,"perPage":20}%
elastic-agent|retrieve-service-token-container⚡ ⇒ curl localhost:5601/api/fleet/agent_policies -u elastic:changeme
{"items":[{"id":"499b5aa7-d214-5b5d-838b-3cd76469844e","namespace":"default","monitoring_enabled":["logs","metrics"],"name":"Default Fleet Server policy","description":"Default Fleet Server agent policy created by Kibana","is_default":false,"is_default_fleet_server":true,"is_preconfigured":true,"status":"active","is_managed":false,"revision":1,"updated_at":"2021-12-31T00:54:04.727Z","updated_by":"system","package_policies":["default-fleet-server-agent-policy"],"agents":1},{"id":"2016d7cc-135e-5583-9758-3ba01f5a06e5","namespace":"default","monitoring_enabled":["logs","metrics"],"name":"Default policy","description":"Default agent policy created by Kibana","is_default":true,"is_preconfigured":true,"status":"active","is_managed":false,"revision":1,"updated_at":"2021-12-31T00:54:02.703Z","updated_by":"system","package_policies":["default-system-policy"],"agents":0}],"total":2,"page":1,"perPage":20}%

@jlind23
Contributor

jlind23 commented Jan 5, 2022

@simitt do you have any logs on fleet-server side while trying to create the token?
cc @blakerouse @michel-laterman

@blakerouse
Contributor

@simitt So the permissions show up when using username/password, but not with the service token? If that is the case: Fleet Server calls the same API with the same contents regardless of whether it uses basic auth or a service token, which makes me think it's an issue on the Elasticsearch side.

As @jlind23 asked, do you have any Elasticsearch logs for this? They might show errors about not being able to assign those permissions.

@axw
Member

axw commented Jan 6, 2022

There is a single log record from Elasticsearch in our test setup:

elasticsearch_1 | {"@timestamp":"2022-01-06T04:25:38.550Z", "log.level": "WARN", "message":"path: /traces-*/_search, params: {size=200, index=traces-*}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[aba19b0b0e12][search][T#7]","log.logger":"rest.suppressed","elasticsearch.cluster.uuid":"6MdsLK9FS22dUqR23UdxOw","elasticsearch.node.id":"fjTJPyHZR7W4kZyC31hPgg","elasticsearch.node.name":"aba19b0b0e12","elasticsearch.cluster.name":"docker-cluster","error.type":"org.elasticsearch.action.search.SearchPhaseExecutionException","error.message":"all shards failed","error.stack_trace":"Failed to execute phase [query], all shards failed; shardFailures {[fjTJPyHZR7W4kZyC31hPgg][.ds-traces-apm.sampled-default-2022.01.06-000001][0]: [.ds-traces-apm.sampled-default-2022.01.06-000001/UFpv0gZsQ4SMVnD9d5Q_2A][[.ds-traces-apm.sampled-default-2022.01.06-000001][0]] org.elasticsearch.action.NoShardAvailableActionException: [aba19b0b0e12][127.0.0.1:9300][indices:data/read/search[phase/query]]\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:544)\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:491)\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:343)\n\tat org.elasticsearch.action.ActionListener$Delegating.onFailure(ActionListener.java:66)\n\tat org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:48)\n\tat org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:651)\n\tat org.elasticsearch.transport.TransportService$4.handleException(TransportService.java:724)\n\tat org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1349)\n\tat 
org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1458)\n\tat org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1432)\n\tat org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:50)\n\tat org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:47)\n\tat org.elasticsearch.action.ActionRunnable.onFailure(ActionRunnable.java:77)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:28)\n\tat org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:775)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat java.base/java.lang.Thread.run(Thread.java:833)\n}\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:725)\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:412)\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:757)\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:509)\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:343)\n\tat org.elasticsearch.action.ActionListener$Delegating.onFailure(ActionListener.java:66)\n\tat org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:48)\n\tat 
org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:651)\n\tat org.elasticsearch.transport.TransportService$4.handleException(TransportService.java:724)\n\tat org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1349)\n\tat org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1458)\n\tat org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1432)\n\tat org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:50)\n\tat org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:47)\n\tat org.elasticsearch.action.ActionRunnable.onFailure(ActionRunnable.java:77)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:28)\n\tat org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:775)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat java.base/java.lang.Thread.run(Thread.java:833)\nCaused by: [.ds-traces-apm.sampled-default-2022.01.06-000001/UFpv0gZsQ4SMVnD9d5Q_2A][[.ds-traces-apm.sampled-default-2022.01.06-000001][0]] org.elasticsearch.action.NoShardAvailableActionException: [aba19b0b0e12][127.0.0.1:9300][indices:data/read/search[phase/query]]\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:544)\n\tat 
org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:491)\n\t... 18 more\n"}

The error message doesn't explicitly say so, but I'm pretty sure this is due to lack of read privilege on traces-apm.sampled-default.

API Keys have at most the privileges of the authenticated user. When fleet-server is run with the root user (as we were doing with user/pass auth in our tests previously), that's everything, so fleet-server would produce API Keys with the additional privileges we request.

The elastic/fleet-server service account has limited index privileges:

            new RoleDescriptor.IndicesPrivileges[] {
                RoleDescriptor.IndicesPrivileges.builder()
                    .indices(
                        "logs-*",
                        "metrics-*",
                        "traces-*",
                        "synthetics-*",
                        ".logs-endpoint.diagnostic.collection-*",
                        ".logs-endpoint.action.responses-*"
                    )
                    .privileges("write", "create_index", "auto_configure")
                    .build(),

Thus, the maximum set of index privileges for API Keys issued by fleet-server is write, create_index, and auto_configure.
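To illustrate the limiting behaviour described above, here is a simplified Python model. It is a sketch, not how Elasticsearch implements privilege resolution. The index patterns and privilege names come from this thread. The `IMPLIED` table, the glob matching via `fnmatchcase`, and the helper names are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Index privileges of the elastic/fleet-server service account, as quoted above.
SERVICE_ACCOUNT = [
    {
        "names": [
            "logs-*",
            "metrics-*",
            "traces-*",
            "synthetics-*",
            ".logs-endpoint.diagnostic.collection-*",
            ".logs-endpoint.action.responses-*",
        ],
        "privileges": {"write", "create_index", "auto_configure"},
    },
]

# Simplified assumption: "write" covers "create_doc". Real privilege
# resolution in Elasticsearch works on action patterns and is more involved.
IMPLIED = {"write": {"create_doc"}}


def owner_privileges(index_name: str) -> set:
    """Collect the privileges the service account holds on index_name."""
    granted = set()
    for entry in SERVICE_ACCOUNT:
        if any(fnmatchcase(index_name, pattern) for pattern in entry["names"]):
            for privilege in entry["privileges"]:
                granted.add(privilege)
                granted |= IMPLIED.get(privilege, set())
    return granted


def effective_privileges(index_name: str, requested: set) -> set:
    """An API key is limited to what its owner holds: take the intersection."""
    return requested & owner_privileges(index_name)


requested = {"auto_configure", "create_doc", "maintenance", "monitor", "read"}
effective = effective_privileges("traces-apm.sampled-default", requested)
print(sorted(effective))              # privileges the API key can actually use
print(sorted(requested - effective))  # requested, but not effective
```

Under these assumptions the model yields the same split as the `_has_privileges` output in the issue description: auto_configure and create_doc are effective, while maintenance, monitor, and read are not.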

@ruflin this undoes elastic/package-spec#203. How would you suggest we proceed?

@jlind23
Contributor

jlind23 commented Jan 6, 2022

@axw It may be a dumb question but couldn't we add the read right to the elastic/fleet-server service account?

@axw
Member

axw commented Jan 6, 2022

@jlind23 we could, I'm just not sure if this was intended. If not, then yes we could expand the privileges; if it is intentionally/necessarily limited, we need to go back to the drawing board on elastic/package-spec#203.

@jlind23
Contributor

jlind23 commented Jan 6, 2022

So you can specify privileges in elastic package only if those privileges are already part of fleet-server ones right?
If yes, then I am pretty sure that it should be extended. Who owns the decision there? @ruflin ?

@axw
Member

axw commented Jan 6, 2022

So you can specify privileges in elastic package only if those privileges are already part of fleet-server ones right?

Right. The ones in the elastic/fleet-server service account define the maximum set of privileges an agent's API Key can be assigned.

If yes, then I am pretty sure that it should be extended. Who owns the decision there? @ruflin ?

Yes, I think so. If not he can redirect :)

marclop added a commit to marclop/apm-server that referenced this issue Jan 10, 2022
Skips the `TestTailSampling` systemtest until elastic/fleet-server#1048
is resolved.

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
marclop added a commit to elastic/apm-server that referenced this issue Jan 10, 2022
Skips the `TestTailSampling` systemtest until elastic/fleet-server#1048
is resolved.

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
mergify bot pushed a commit to elastic/apm-server that referenced this issue Jan 10, 2022
Skips the `TestTailSampling` systemtest until elastic/fleet-server#1048
is resolved.

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
(cherry picked from commit 85b95c3)
marclop pushed a commit to elastic/apm-server that referenced this issue Jan 10, 2022
Skips the `TestTailSampling` systemtest until elastic/fleet-server#1048
is resolved.

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
(cherry picked from commit 85b95c3)
@ruflin
Member

ruflin commented Jan 10, 2022

As mentioned above, the current assumption is that fleet-server service account has all the permissions needed to hand out to Elastic Agents for ingesting data as it is creating the API keys. Unfortunately in the context of elastic/package-spec#203 we missed this very important detail :-( I wonder why this only showed up now with the removal of username / password as I was hoping we already used service tokens before in our test suites.

We likely need a short term fix which is adding the very specific permissions to the fleet-server service account for apm-server to work properly. In parallel we should go back to the drawing board. How broad should the permissions be that the fleet-server service account has? How generic can the permission definition be in a package? Is there another way (through Fleet?) that additional permissions can be passed down?

@axw Short term, what are the exact additional permissions we need?

@simitt
Author

simitt commented Jan 10, 2022

Short term, what are the exact additional permissions we need?

The apmpackage specifies the following additional privileges:

  • 'cluster:monitor/main' on cluster
  • [auto_configure, create_doc, maintenance, monitor, read] on the traces-apm.sampled-default index

Testing revealed that the monitor and read privileges for the traces-apm.sampled-default index are missing. As shown in the initial description, the API Key seemed to have all other privileges.

@ruflin
Member

ruflin commented Jan 11, 2022

I assume the read permissions on this data stream are needed for multiple apm-servers to sync up on sampled traces? What is the monitor permission used for?

My currently proposed path forward:

  • Evaluate for apm-server if there is a way to get around these permissions as read is something we don't allow in general
  • If no workaround found, add these 2 permissions to the fleet-server
  • Figure out a long term solution for permissions which extend beyond fleet-server
  • In package-spec, validate that only permissions are used / supported which are given to fleet-server
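
As a rough idea of what the last point could look like, here is a Python sketch of a manifest check. Everything in it is an assumption for illustration: the allow-list is derived from the service account privileges quoted earlier in this thread (plus create_doc, which the `_has_privileges` output shows as granted), and `validate_data_stream` is a hypothetical helper, not an existing package-spec function.

```python
# Assumed allow-list: the index privileges quoted earlier in this thread for
# the elastic/fleet-server service account, plus "create_doc", which the
# has_privileges output shows as granted. Not taken from package-spec.
ALLOWED_INDEX_PRIVILEGES = frozenset(
    {"write", "create_index", "auto_configure", "create_doc"}
)


def validate_data_stream(manifest: dict) -> list:
    """Return one error message per index privilege outside the allow-list."""
    declared = (
        manifest.get("elasticsearch", {})
        .get("privileges", {})
        .get("indices", [])
    )
    return [
        f"index privilege {privilege!r} is not available to fleet-server"
        for privilege in declared
        if privilege not in ALLOWED_INDEX_PRIVILEGES
    ]


# The traces-apm.sampled-default manifest fragment from the issue description.
sampled_manifest = {
    "elasticsearch": {
        "privileges": {
            "indices": ["auto_configure", "create_doc", "maintenance", "monitor", "read"]
        }
    }
}

for error in validate_data_stream(sampled_manifest):
    print(error)
```

With a check like this, the apm package would have been flagged for maintenance, monitor, and read at validation time rather than at runtime.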

@axw
Member

axw commented Jan 11, 2022

What is the monitor permission used for?

monitor is needed for index stats requests, which we perform in order to fetch the global checkpoint which is used for tailing the data stream. We also need maintenance for refreshing the data stream.
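
For readers unfamiliar with that flow, here is a small Python sketch of the checkpoint extraction. It assumes the response shape of an index stats request with `level=shards` (per-shard entries carrying `routing.primary` and `seq_no.global_checkpoint`), which is my understanding of the API rather than something stated in this thread. It is not apm-server's actual implementation. The function name and the sample values are hypothetical.

```python
def primary_global_checkpoints(stats_response: dict) -> dict:
    """Map each backing index to {shard number: global checkpoint} for primaries."""
    checkpoints = {}
    for index_name, index_stats in stats_response.get("indices", {}).items():
        for shard_number, copies in index_stats.get("shards", {}).items():
            for copy in copies:
                # Only the primary copy's checkpoint is recorded in this sketch.
                if copy.get("routing", {}).get("primary"):
                    checkpoint = copy["seq_no"]["global_checkpoint"]
                    checkpoints.setdefault(index_name, {})[int(shard_number)] = checkpoint
    return checkpoints


# Hypothetical, trimmed stats response; the checkpoint values are made up.
sample = {
    "indices": {
        ".ds-traces-apm.sampled-default-2022.01.06-000001": {
            "shards": {
                "0": [
                    {"routing": {"primary": True}, "seq_no": {"global_checkpoint": 41}},
                    {"routing": {"primary": False}, "seq_no": {"global_checkpoint": 39}},
                ]
            }
        }
    }
}

print(primary_global_checkpoints(sample))
```

The stats request itself needs monitor, the refresh that precedes it needs maintenance, and the subsequent search needs read.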

My currently proposed path forward:

  • Evaluate for apm-server if there is a way to get around these permissions as read is something we don't allow in general

We have already evaluated this. The answer is no, not without:

  • introducing external components to our architecture (e.g. Redis) or
  • changing users' deployment architectures (adding smart proxies to implement trace ID routing through to a single apm-server), and preventing certain deployment architectures (e.g. multi-cloud, with separate APM Servers / ES clusters for security reasons)

If no workaround found, add these 2 permissions to the fleet-server

Can we add it for just the one data stream that APM Server needs it? Seems a little bit dirty to have that defined in Elasticsearch, but I think it would be ideal to limit the impact.

Figure out a long term solution for permissions which extend beyond fleet-server

👍 I wonder if apm-server should get its own service account. But let's take this somewhere else.

In package-spec, validate that only permissions are used / supported which are given to fleet-server

👍 I forget if we're doing this at the moment, but I'm pretty sure we are validating them in Kibana at least.

@ruflin
Member

ruflin commented Jan 12, 2022

Can we add it for just the one data stream that APM Server needs it? Seems a little bit dirty to have that defined in Elasticsearch, but I think it would be ideal to limit the impact.

I think yes. Could this index contain any confidential info or is purely tracking?

@axw
Member

axw commented Jan 13, 2022

I think yes. Could this index contain any confidential info or is purely tracking?

No. It contains:

  • @timestamp
  • event.ingested
  • data_stream.*
  • observer.id (apm-server UUID)
  • trace.id

@axw
Member

axw commented Jan 14, 2022

Elasticsearch PR opened to add privileges for that one data stream pattern: elastic/elasticsearch#82600

@axw axw added the v8.1.0 label Jan 24, 2022
@axw axw removed the v8.0.0 label Jan 24, 2022
@axw
Member

axw commented Jan 24, 2022

@simitt I've retargeted the issues and PR to only 8.1.0, to minimise risk.

6 participants