Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail DevWorkspaces when DevWorkspaceRouting controller sees error #1199

Merged
merged 2 commits into from
Oct 27, 2023

Conversation

amisevsk
Copy link
Collaborator

What does this PR do?

Mark DevWorkspaceRoutings as failed if any of the sync operations returns an UnrecoverableSyncError rather than ignoring it and treating it as a transient error (i.e. retrying).

This allows workspaces to fail with a meaningful message more quickly, rather than waiting for timeout and failing then.

What issues does this PR fix or reference?

Closes #1198

Is it tested? How?

To test, create a resource quota, then create workspaces that would violate that quota and ensure they fail.

For example, for services,

  1. Create a resource quota specifying e.g. a maximum number of services in a namespace:
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: resource-quota-services
    spec:
      hard:
        count/services: "1"
  2. Create two workspaces that have at least one endpoint (resulting in DWO attempting to create two services)
  3. Workspace should fail with message
    Failed to set up networking for workspace: services "workspace82736d3c797e4150-service" is forbidden: exceeded quota: resource-quota-services, requested: count/services=1, used: count/services=1, limited: count/services=1
    

We should verify that such quotas are respected for routes and ingresses as well, as applicable.

PR Checklist

  • E2E tests pass (when PR is ready, comment /test v8-devworkspace-operator-e2e, v8-che-happy-path to trigger)
    • v8-devworkspace-operator-e2e: DevWorkspace e2e test
    • v8-che-happy-path: Happy path for verification integration with Che

Mark DevWorkspaceRoutings as failed if any of the sync operations
returns an UnrecoverableSyncError rather than ignoring it and treating
it as a transient error (i.e. retrying).

This allows workspaces to fail with a meaningful message more quickly,
rather than waiting for timeout and failing then.

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
Copy link
Collaborator

@AObuchow AObuchow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me and works as expected for the provided test-case on service creation.
I also tested this for routes (on OpenShift) and ingresses on minikube, and the workspaces failed as expected.

For all tests, I used created 2 workspaces based on the following devworkspace:

kind: DevWorkspace
apiVersion: workspace.devfile.io/v1alpha2
metadata:
  name: endpoint-1
spec:
  started: true
  routingClass: 'basic'
  template:
    components:
      - name: web-terminal
        container:
          image: quay.io/wto/web-terminal-tooling:next
          memoryRequest: 256Mi
          memoryLimit: 512Mi
          mountSources: true
          command:
           - "tail"
           - "-f"
           - "/dev/null"
          endpoints:
            - name: 'http'
              protocol: http
              targetPort: 8080
              attributes:
                discoverable: true
              exposure: public

Routes

I used the following resource quota:

kind: ResourceQuota
apiVersion: v1
metadata:
  name: routes-quota
  namespace: default
spec:
  hard:
    count/routes.route.openshift.io: '1'
status:
  hard:
    count/routes.route.openshift.io: '1'

The second devworkspace failed as expected:

$ kubectl get dw -w  
NAME         DEVWORKSPACE ID             PHASE     INFO  
endpoint-1   workspace7144cffb567f4a35   Running   Workspace is running  
endpoint-2   workspace5604dcd5bb8a403f   Failed    Failed to set up networking for workspace: routes.route.openshift.io "workspace5604dcd5bb8a403f-http" is forbidden: exce  
eded quota: routes-quota, requested: count/routes.route.openshift.io=1, used: count/routes.route.openshift.io=1, limited: count/routes.route.openshift.io=1

Ingresses

I used the following resource quota:

kind: ResourceQuota
apiVersion: v1
metadata:
  name: ingresses-quota
spec:
  hard:
    count/ingresses: '1'

The second devworkspace failed accordingly:

$ kubectl get dw    
NAME         DEVWORKSPACE ID             PHASE     INFO  
endpoint-1   workspace590265b135274cb0   Running   Workspace is running  
endpoint-2   workspacef99194ee40f24e11   Failed    Failed to set up networking for workspace: ingresses.networking.k8s.io "workspacef99194ee40f24e11-http" is forbidden: ex  
ceeded quota: ingresses-quota, requested: count/ingresses.networking.k8s.io=1, used: count/ingresses.networking.k8s.io=1, limited: count/ingresses.networking.k8s.io=1

@AObuchow
Copy link
Collaborator

The Check sources GitHub action is failing due to a DWR test-case I wrote that was abusing the bug that this PR addresses.

IMO we should just remove the Gets Preparing status test case & the associated createsPreparingDWR() function.

@amisevsk let me know if you'd like me to add an extra commit to the PR for this.

Drop devworkspacerouting reconciler test for "gets preparing status", as
this test relied on a bug (403 forbidden responses to API calls were
ignored) in the controller.

Since there's no good way to create a 'preparing' DevWorkspaceRouting,
we can't test this specific function.

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm label Oct 27, 2023
@openshift-ci openshift-ci bot added the lgtm label Oct 27, 2023
@openshift-ci
Copy link

openshift-ci bot commented Oct 27, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amisevsk, AObuchow

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov
Copy link

codecov bot commented Oct 27, 2023

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (800f386) 53.01% compared to head (c73d089) 52.89%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1199      +/-   ##
==========================================
- Coverage   53.01%   52.89%   -0.13%     
==========================================
  Files          84       84              
  Lines        7583     7595      +12     
==========================================
- Hits         4020     4017       -3     
- Misses       3275     3290      +15     
  Partials      288      288              
Files Coverage Δ
...s/controller/devworkspacerouting/sync_ingresses.go 64.06% <0.00%> (ø)
...lers/controller/devworkspacerouting/sync_routes.go 64.78% <0.00%> (ø)
...rs/controller/devworkspacerouting/sync_services.go 64.06% <0.00%> (-3.13%) ⬇️
...workspacerouting/devworkspacerouting_controller.go 48.90% <0.00%> (-3.17%) ⬇️

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@amisevsk amisevsk merged commit b512c88 into devfile:main Oct 27, 2023
4 of 6 checks passed
@amisevsk amisevsk deleted the dwr-reconciler-errors branch October 27, 2023 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unrecoverable sync errors are ignored in the DevWorkspaceRouting reconciler
2 participants