
bug? flake? TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled #2603

Closed
ncdc opened this issue Jan 11, 2023 · 5 comments
Labels
area/workspaces kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ncdc
Member

ncdc commented Jan 11, 2023

From https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/kcp-dev_kcp/2602/pull-ci-kcp-dev-kcp-main-e2e-shared/1613279921029779456

After all the shards are deleted, we create a workspace, which should be unschedulable, but somehow it gets scheduled and then initialized?

=== RUN   TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
=== PAUSE TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:170: Saving test artifacts for test "TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled" under "/logs/artifacts/TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled/1636451888".
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:170: Starting kcp servers...
    kcp.go:430: running: /go/src/github.com/kcp-dev/kcp/bin/kcp start --root-directory /tmp/TestWorkspaceControlleradd_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled3772238024/001/kcp/main --secure-port=37495 --embedded-etcd-client-port=43281 --embedded-etcd-peer-port=39257 --embedded-etcd-wal-size-bytes=5000 --kubeconfig-path=/tmp/TestWorkspaceControlleradd_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled3772238024/001/kcp/main/admin.kubeconfig --feature-gates=CustomResourceValidationExpressions=true --audit-log-path /logs/artifacts/TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled/1636451888/artifacts/kcp/main/kcp.audit
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    kcp.go:728: success contacting https://10.128.134.92:37495/readyz
    kcp.go:728: success contacting https://10.128.134.92:37495/livez
    controller_test.go:170: Started kcp servers after 19.607612204s
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:175: Created root:organization workspace root:e2e-workspace-n7kxg as /clusters/n6xcdh5ewikvarda
I0111 21:20:40.250327   30383 shared_informer.go:255] Waiting for caches to sync for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
I0111 21:20:40.250487   30383 reflector.go:219] Starting reflector *v1beta1.Workspace (0s) from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.250515   30383 reflector.go:255] Listing and watching *v1beta1.Workspace from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.361509   30383 shared_informer.go:285] caches populated
I0111 21:20:40.361562   30383 shared_informer.go:262] Caches are synced for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
I0111 21:20:40.361752   30383 shared_informer.go:255] Waiting for caches to sync for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
I0111 21:20:40.361920   30383 reflector.go:219] Starting reflector *v1alpha1.Shard (0s) from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.361951   30383 reflector.go:255] Listing and watching *v1alpha1.Shard from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.462862   30383 shared_informer.go:285] caches populated
I0111 21:20:40.462930   30383 shared_informer.go:262] Caches are synced for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:190: Get a list of current shards so that we can schedule onto a valid shard later
    controller_test.go:190: Delete all pre-configured shards, we have to control the creation of the workspace shards in this test
    controller_test.go:190: Create a workspace without shards
    controller_test.go:190: Expect workspace to be unschedulable
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:190: 
        	Error Trace:	controller_test.go:116
        	            				controller_test.go:190
        	Error:      	Received unexpected error:
        	            	expected state not found: context deadline exceeded, 11 errors encountered while processing 11 events: [Workspace.tenancy.kcp.io "steve" not found, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions(nil), expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceInitialized", Status:"False", Severity:"Info", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"InitializerExists", Message:"Initializers still exist: [root:universal system:apibindings]"}, v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceInitialized", Status:"False", Severity:"Info", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"InitializerExists", Message:"Initializers still exist: [system:apibindings]"}, v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceInitialized", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 41, 0, time.Local), Reason:"", Message:""}, v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}]
        	Test:       	TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
        	Messages:   	did not see workspace marked unschedulable
    kcp.go:417: cleanup: canceling context
    kcp.go:421: cleanup: waiting for shutdownComplete
I0111 21:21:10.600893   30383 reflector.go:536] k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167: Watch close - *v1beta1.Workspace total 10 items received
I0111 21:21:10.600938   30383 reflector.go:536] k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167: Watch close - *v1alpha1.Shard total 1 items received
    kcp.go:425: cleanup: received shutdownComplete
    --- FAIL: TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled (50.95s)
@ncdc ncdc added kind/bug Categorizes issue or PR as related to a bug. area/workspaces kind/flake Categorizes issue or PR as related to a flaky test. labels Jan 11, 2023
@ncdc ncdc added this to kcp Jan 11, 2023
@github-project-automation github-project-automation bot moved this to New in kcp Jan 11, 2023
@nrb nrb moved this from New to Backlog in kcp Jan 17, 2023
@nrb
Contributor

nrb commented Jan 24, 2023

This appears to be a timing issue where the shard is deleted from etcd, but is still in the informer cache. The workspace controller sees the shard as valid, and marks the workspace as scheduled.

@sttts Does that sound like a valid explanation of why this edge case might happen?

@kcp-ci-bot
Contributor

Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kcp-ci-bot kcp-ci-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 28, 2024
@kcp-ci-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kcp-ci-bot kcp-ci-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 28, 2024
@kcp-ci-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

@kcp-ci-bot
Contributor

@kcp-ci-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-project-automation github-project-automation bot moved this from In Progress to Done in kcp Jun 27, 2024
Projects
Status: Done
3 participants