
Bug 1839621: Add an ability to proxy requests to Che Workspace #5332

Merged 14 commits into openshift:master on May 27, 2020

Conversation

sleshchenko (Contributor) commented May 7, 2020

What does this PR do?

It solves https://bugzilla.redhat.com/show_bug.cgi?id=1839621

It adds the following backend endpoints:

  • /api/terminal/proxy/[namespace]/[workspace-name]/[path]: proxies requests on [path] to the workspace in the given namespace, attaching the user's bearer token. This endpoint is used to construct a kubeconfig in the user's workspace, allowing command-line tools there to access the cluster as the user.
  • /api/terminal/available/: returns 204 if the proxy is available, 503 otherwise. Used to check whether the endpoint above will process requests (see security considerations below).

This PR is needed for the OpenShift Console to open a terminal. See the screencast for how it works:

Screencast (console-terminal recording omitted)

Setup

Note: To ease testing, this PR includes some commits containing an early implementation of the frontend UI that will be used. These commits will be dropped from this PR before merging; without them, the changes are much harder to test.

  1. Deploy changes from this PR

The pre-built image can be deployed with:

oc patch consoles.operator.openshift.io cluster --patch '{ "spec": { "managementState": "Unmanaged" } }' --type=merge
oc set image deploy console console=sleshchenko/console:apiTerminal-27-05-2020 -n openshift-console

The image tag includes a date so you can verify it's up to date; the image will be updated after any changes to the PR.

  2. Set up the che-workspace-operator
    git clone https://github.com/che-incubator/che-workspace-operator.git
    cd che-workspace-operator
    export WEBHOOK_ENABLED=true
    export DEFAULT_ROUTING=basic
    make deploy
    
    Changing WEBHOOK_ENABLED above to false should disable all endpoints

Testing

Note: it's necessary to grant additional permissions to the console SA, as in openshift/console-operator#432

Testing the /api/terminal/available/ endpoint

Use curl from a terminal:

curl -k -H "Authorization: Bearer $(oc whoami -t)" \
  ${cluster_URL}/api/terminal/available/ --verbose

This should return 204 if WEBHOOK_ENABLED=true, and 503 otherwise.

Testing the main proxy endpoint:

Cases:

  • Log in as a cluster-admin user, click the terminal button, and wait for the workspace to be running. You should see a request sent to .../exec/init with a 403 response and the message "Terminal is disabled for cluster-admin users." (screenshot omitted)

  • Log in as a regular user and click the terminal button. You should see a request to .../exec/init with a 200 status, and the terminal should start. Once the terminal is initialized, oc whoami should list the current user as logged in. Note: a few front-end workarounds are required to make this work:

    • Front end currently depends on listing workspaces at cluster scope. To enable this easily, bind the view-workspaces clusterrole to the user:
      oc adm policy add-cluster-role-to-user view-workspaces ${testuser}
      
    • On first run, the terminal seems to get stuck loading. It's necessary to close the panel and reopen it for the terminal to initialize.

    You can also exec directly into the dev container in the workspace pod, and test oc there.

In both cases, if WEBHOOK_ENABLED=false was set when deploying the operator, requests to .../exec/init should return 403 with the message "Terminal endpoint is disabled: workspace operator is not deployed."

Workflow implementation details

The current implementation uses the following flow:

  1. Frontend creates a namespace (if necessary) and initializes a Che workspace
  2. The che-workspace-operator creates a deployment for the requested workspace
  3. The frontend POSTs the user's token to the backend proxy, which forwards it to the workspace's /exec/init endpoint
  4. The workspace component creates a kubeconfig file for the passed token
  5. The front end does a regular exec into the workspace container

Related issues/PRs

It depends on openshift/console-operator#432

Security concerns

We're adding a proxy to the backend that submits a user's auth token to a URL defined in a status field. As a result, it's necessary to carefully consider cases to avoid sending the auth token to an unintended recipient. The current security model on the workspace-operator side is:

  • Upon creation, workspaces get a creator annotation (org.eclipse.che.workspace/creator) that is set to the user's UID. This is protected by a webhook, and the annotation is read on the backend to ensure the token is being sent to a workspace that was created by the current user.
  • Exec access to any containers in a workspace is restricted to the creator, via webhook

To support these backend changes, requests to the new backend endpoints perform some checks before proxying the user's token:

  • Check if user is cluster-admin or otherwise has strong privileges, and if so deny the request.
    • This is done by checking if the user token can create pods in the openshift-operators namespace
  • Check that webhooks deployed by the che-workspace-operator are present on the cluster, and if not, deny the request.

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. component/backend Related to backend labels May 7, 2020
openshift-ci-robot (Contributor)

Hi @sleshchenko. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 7, 2020
spadgett (Member) commented May 8, 2020

/uncc @rhamilto
/cc @benjaminapetersen

@openshift-ci-robot openshift-ci-robot requested review from benjaminapetersen and removed request for rhamilto May 8, 2020 17:02
benjaminapetersen (Contributor)

Do you have a doc for setup / testing this?

@sleshchenko sleshchenko force-pushed the apiTerminal branch 2 times, most recently from 1956703 to af9aa8e Compare May 12, 2020 08:14
amisevsk (Contributor)

@spadgett @benjaminapetersen Regarding our security concerns around the backend endpoint, @sleshchenko and I came up with a potential solution. Currently, the webhooks used by the operator to secure workspaces are owned by the controller, so removing the controller removes the webhooks as well. If we removed that ownerRef and kept the webhooks in the cluster when the controller is removed, this would provide security as long as the webhooks aren't manually removed:

  • If no operator exists and a user creates a malicious workspace object, they can't direct other users to share their tokens with it, because the backend depends on the workspace status field, which is generally not modifiable.
  • If the operator is running, the current webhooks prevent anyone from hijacking a workspace or exec'ing into a running workspace to obtain credentials.
  • If the operator deployment has been removed but the webhooks still exist, their failure policy means that existing workspaces cannot be updated to steal data. At this point, new workspaces cannot be created until the webhooks are removed.

This leaves an opening for potential privilege escalation for users who can modify webhooks or workspace/status, but those are fairly nonstandard privileges.

WDYT?

@sleshchenko: (comment minimized)

Two review threads on pkg/terminal/proxy.go (outdated, resolved)
amisevsk (Contributor) commented May 21, 2020

Since I can't edit the PR description, here is the current flow I'm using to test the changes (now also included in the PR description above).


@sleshchenko sleshchenko marked this pull request as ready for review May 22, 2020 13:19
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 22, 2020
christianvogt (Contributor)

/ok-to-test

@openshift-ci-robot openshift-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label May 22, 2020
amisevsk (Contributor)

Added a few minor fixups:

  • Move endpoint api/terminal/[namespace]/[workspacename]/[path] -> api/terminal/proxy/[namespace]/[workspacename]/[path]
  • Fix some missed returns in endpoint handling.

I also wanted to share a process for testing changes locally on crc:

Testing backend proxy while running bridge locally

  1. Get console SA token and ca.crt, store in /var/run/secrets/kubernetes.io/serviceaccount:

    cd /tmp/
    SA_SECRET=$(oc get sa console -n openshift-console -o yaml | yq -r '.secrets[].name' | grep "console-token-.*")
    oc get secret ${SA_SECRET} -n openshift-console -o json | jq -r '.data["ca.crt"]' | base64 -d > ca.crt
    oc get secret ${SA_SECRET} -n openshift-console -o json | jq -r '.data["token"]'  | base64 -d > token
    
    sudo mv token  /var/run/secrets/kubernetes.io/serviceaccount/token && \
         sudo mv ca.crt /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  2. Set expected env vars and run bridge

    export KUBERNETES_SERVICE_PORT=6443
    export KUBERNETES_SERVICE_HOST=api.crc.testing
    ./examples/run-bridge.sh

amisevsk (Contributor)

Any guidance on these CI failures? I don't know how to resolve:

msg="Error: Error creating service account: googleapi: Error 429: Maximum number of service accounts on project reached., rateLimitExceeded"

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
christianvogt (Contributor)

/retest

Several review threads on pkg/terminal/client.go and pkg/terminal/proxy.go (outdated, resolved)
r2.URL.Path = path

// TODO a new proxy per request is created. Must be revised and probably changed
terminalProxy := proxy.NewProxy(&proxy.Config{
spadgett (Member) commented May 22, 2020

I'm a little unclear why we need to use a proxy. If this is just a single, well-known endpoint for initializing the terminal, why not just call that endpoint directly from the backend? This would also tighten the security because you couldn't proxy with the token to other workspace paths.

Are there other requests we intend to proxy?

Contributor reply:

The requests currently proxied to the workspace are

We can probably just make those requests directly. I think the proxy may currently be design cruft from our initial goal of serving the proxied console frontend.

Removing the proxy may interfere with future plans to reuse some of the features provided by the front-end (e.g. saving history when reopening). @sleshchenko would have to confirm on that front.

Contributor reply:

request.Clone instead?

Contributor reply:

if it's relevant at this point.

Contributor reply:

I looked into writing out such a solution and feel like it introduces some unnecessary complexity and room for error (we'd basically be writing a simplified implementation of reverseProxy.ServeHTTP at a certain point). If there are no significant objections, I'd prefer to keep it as-is for now and potentially look into it in the future.

Regarding the security concerns:

  • It's likely we'll have additional API paths on the workspace side for 4.6, as we intend to look into reusing some of the cloudshell UI
  • Maliciously proxying the user's token to different paths is not a significant concern since anything malicious you could do with an alternate path you could do with the default path (e.g. log token on /activity/tick). If we want to restrict API paths, it'd be easier to check against a list of "approved" paths.

There may be some headers/other data that is processed by reverseProxy that's problematic that I'm unaware of. If it's important I can work on this today and have changes by tomorrow. @benjaminapetersen @spadgett WDYT?

Member reply:

Do we need the token for these other requests? I'd expect only to need the token once on init.

We should only pass the token when we need to.

There may be some headers/other data that is processed by reverseProxy that's problematic that I'm unaware of.

It's very likely. We should not be passing cookies for instance. Probably some other headers could cause problems.

If there are only a handful of requests we'll need to make, it would be better to support only those explicitly. The proxy seems unnecessary to me here. I'd think removing the proxy would reduce complexity as opposed to adding to it.

Contributor reply:

@spadgett makes sense to me, I'll work on it.

Contributor reply:

Implemented, PTAL.

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
Reduce duplication by extracting k8s cluster config struct to separate
method.

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
amisevsk (Contributor)

/retest

christianvogt (Contributor)

/retitle Bug 1839621: Add an ability to proxy requests to Che Workspace

@openshift-ci-robot openshift-ci-robot changed the title Add an ability to proxy requests to Che Workspace Bug 1839621: Add an ability to proxy requests to Che Workspace May 25, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label May 25, 2020
openshift-ci-robot (Contributor)

@sleshchenko: This pull request references Bugzilla bug 1839621, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.5.0) matches configured target release for branch (4.5.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1839621: Add an ability to proxy requests to Che Workspace

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label May 25, 2020
Limit /api/terminal/proxy endpoint from being a full reverseProxy to a
workspace to simplify code.

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
openshift-ci-robot (Contributor)

@sleshchenko: This pull request references Bugzilla bug 1839621, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.5.0) matches configured target release for branch (4.5.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1839621: Add an ability to proxy requests to Che Workspace

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

christianvogt (Contributor)

/retest

spadgett (Member) left a comment

/lgtm
/approve

openshift-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sleshchenko, spadgett

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 27, 2020
@openshift-merge-robot openshift-merge-robot merged commit c4ccb7d into openshift:master May 27, 2020
openshift-ci-robot (Contributor)

@sleshchenko: All pull requests linked via external trackers have merged: openshift/console#5332, openshift/console-operator#432. Bugzilla bug 1839621 has been moved to the MODIFIED state.

In response to this:

Bug 1839621: Add an ability to proxy requests to Che Workspace

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sleshchenko sleshchenko deleted the apiTerminal branch May 29, 2020 09:11
@spadgett spadgett added this to the v4.5 milestone Jun 1, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. component/backend Related to backend component/core Related to console core functionality lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.

7 participants