
bug? flake? TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled #2603

Closed
ncdc opened this issue Jan 11, 2023 · 5 comments
Labels
area/workspaces kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ncdc
Member

ncdc commented Jan 11, 2023

From https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/kcp-dev_kcp/2602/pull-ci-kcp-dev-kcp-main-e2e-shared/1613279921029779456

After all the shards are deleted, we create a workspace, which should be unschedulable, but somehow it gets scheduled and then initialized?

=== RUN   TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
=== PAUSE TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:170: Saving test artifacts for test "TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled" under "/logs/artifacts/TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled/1636451888".
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:170: Starting kcp servers...
    kcp.go:430: running: /go/src/github.com/kcp-dev/kcp/bin/kcp start --root-directory /tmp/TestWorkspaceControlleradd_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled3772238024/001/kcp/main --secure-port=37495 --embedded-etcd-client-port=43281 --embedded-etcd-peer-port=39257 --embedded-etcd-wal-size-bytes=5000 --kubeconfig-path=/tmp/TestWorkspaceControlleradd_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled3772238024/001/kcp/main/admin.kubeconfig --feature-gates=CustomResourceValidationExpressions=true --audit-log-path /logs/artifacts/TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled/1636451888/artifacts/kcp/main/kcp.audit
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    kcp.go:728: success contacting https://10.128.134.92:37495/readyz
    kcp.go:728: success contacting https://10.128.134.92:37495/livez
    controller_test.go:170: Started kcp servers after 19.607612204s
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:175: Created root:organization workspace root:e2e-workspace-n7kxg as /clusters/n6xcdh5ewikvarda
I0111 21:20:40.250327   30383 shared_informer.go:255] Waiting for caches to sync for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
I0111 21:20:40.250487   30383 reflector.go:219] Starting reflector *v1beta1.Workspace (0s) from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.250515   30383 reflector.go:255] Listing and watching *v1beta1.Workspace from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.361509   30383 shared_informer.go:285] caches populated
I0111 21:20:40.361562   30383 shared_informer.go:262] Caches are synced for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
I0111 21:20:40.361752   30383 shared_informer.go:255] Waiting for caches to sync for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
I0111 21:20:40.361920   30383 reflector.go:219] Starting reflector *v1alpha1.Shard (0s) from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.361951   30383 reflector.go:255] Listing and watching *v1alpha1.Shard from k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167
I0111 21:20:40.462862   30383 shared_informer.go:285] caches populated
I0111 21:20:40.462930   30383 shared_informer.go:262] Caches are synced for TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:190: Get a list of current shards so that we can schedule onto a valid shard later
    controller_test.go:190: Delete all pre-configured shards, we have to control the creation of the workspace shards in this test
    controller_test.go:190: Create a workspace without shards
    controller_test.go:190: Expect workspace to be unschedulable
=== CONT  TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
    controller_test.go:190: 
        	Error Trace:	controller_test.go:116
        	            				controller_test.go:190
        	Error:      	Received unexpected error:
        	            	expected state not found: context deadline exceeded, 11 errors encountered while processing 11 events: [Workspace.tenancy.kcp.io "steve" not found, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions(nil), expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceInitialized", Status:"False", Severity:"Info", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"InitializerExists", Message:"Initializers still exist: [root:universal system:apibindings]"}, v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceInitialized", Status:"False", Severity:"Info", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"InitializerExists", Message:"Initializers still exist: [system:apibindings]"}, v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}, expected an unschedulable workspace, got status.conditions: v1alpha1.Conditions{v1alpha1.Condition{Type:"WorkspaceInitialized", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 41, 0, time.Local), Reason:"", Message:""}, v1alpha1.Condition{Type:"WorkspaceScheduled", Status:"True", Severity:"", LastTransitionTime:time.Date(2023, time.January, 11, 21, 20, 40, 0, time.Local), Reason:"", Message:""}}]
        	Test:       	TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled
        	Messages:   	did not see workspace marked unschedulable
    kcp.go:417: cleanup: canceling context
    kcp.go:421: cleanup: waiting for shutdownComplete
I0111 21:21:10.600893   30383 reflector.go:536] k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167: Watch close - *v1beta1.Workspace total 10 items received
I0111 21:21:10.600938   30383 reflector.go:536] k8s.io/client-go@v0.0.0-20230109113100-c493866a854f/tools/cache/reflector.go:167: Watch close - *v1alpha1.Shard total 1 items received
    kcp.go:425: cleanup: received shutdownComplete
    --- FAIL: TestWorkspaceController/add_a_shard_after_a_workspace_is_unschedulable,_expect_it_to_be_scheduled (50.95s)
@ncdc ncdc added kind/bug Categorizes issue or PR as related to a bug. area/workspaces kind/flake Categorizes issue or PR as related to a flaky test. labels Jan 11, 2023
@ncdc ncdc added this to kcp Jan 11, 2023
@github-project-automation github-project-automation bot moved this to New in kcp Jan 11, 2023
@nrb nrb moved this from New to Backlog in kcp Jan 17, 2023
@nrb
Contributor

nrb commented Jan 24, 2023

This appears to be a timing issue where the shard is deleted from etcd, but is still in the informer cache. The workspace controller sees the shard as valid, and marks the workspace as scheduled.

@sttts Does that sound like a valid explanation of why this edge case might happen?

@kcp-ci-bot
Contributor

Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kcp-ci-bot kcp-ci-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 28, 2024
@kcp-ci-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kcp-ci-bot kcp-ci-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 28, 2024
@kcp-ci-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

@kcp-ci-bot
Contributor

@kcp-ci-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-project-automation github-project-automation bot moved this from In Progress to Done in kcp Jun 27, 2024
Projects
Status: Done
3 participants