Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug/flake: TestAPIExportAuthorizers - APIExport view missing queueing for related APIExports (via claims) #2713

Closed
ncdc opened this issue Jan 31, 2023 · 11 comments · Fixed by #2720
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test.

Comments

@ncdc
Copy link
Member

ncdc commented Jan 31, 2023

From https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/kcp-dev_kcp/2707/pull-ci-kcp-dev-kcp-main-e2e-sharded/1620426348143054848 for PR #2707

=== RUN   TestAPIExportAuthorizers
=== PAUSE TestAPIExportAuthorizers
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:69: shared kcp server will target configuration "/go/src/github.com/kcp-dev/kcp/.kcp/admin.kubeconfig"
=== CONT  TestAPIExportAuthorizers
    kcp.go:980: waiting for readiness for server at https://10.128.38.8:6444
=== CONT  TestAPIExportAuthorizers
    kcp.go:1023: success contacting https://10.128.38.8:6444/livez
=== CONT  TestAPIExportAuthorizers
    kcp.go:1023: success contacting https://10.128.38.8:6444/readyz
    kcp.go:1001: server at https://10.128.38.8:6444 is ready
=== CONT  TestAPIExportAuthorizers
    assertions.go:1691: Waiting for condition, but got: workspace phase is Initializing, not Ready
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:74: Created root:organization workspace root:e2e-workspace-ccdmr as /clusters/2j1clh7hrfj2r5l4 on shard "shard-1"
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:77: Created root:universal workspace root:e2e-workspace-ccdmr:service-provider-1 as /clusters/1huhzk2qxbm3y4si on shard "shard-1"
    authorizer_test.go:78: Created root:universal workspace root:e2e-workspace-ccdmr:service-provider-2 as /clusters/2hf0bx9k2m0adhb5 on shard "shard-1"
    authorizer_test.go:79: Created root:universal workspace root:e2e-workspace-ccdmr:tenant as /clusters/1nyczkrpz8rtgduv on shard "shard-1"
    authorizer_test.go:80: Created root:universal workspace root:e2e-workspace-ccdmr:tenant-shadowed-crd as /clusters/4x24bik73ykm94us on shard "shard-1"
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:93: Giving users [service-provider-1-admin service-provider-2-admin tenant-user] member access to workspace "root:e2e-workspace-ccdmr"
    authorizer_test.go:94: Giving users [service-provider-1-admin] member access to workspace "root:e2e-workspace-ccdmr:service-provider-1"
    authorizer_test.go:95: Giving users [service-provider-2-admin] member access to workspace "root:e2e-workspace-ccdmr:service-provider-2"
    authorizer_test.go:96: Giving users [tenant-user] member access to workspace "root:e2e-workspace-ccdmr:tenant"
    authorizer_test.go:97: Giving users [tenant-user] member access to workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd"
    authorizer_test.go:99: install sherriffs API resource schema, API export, permissions for tenant-user to be able to bind to the export in service provider workspace "root:e2e-workspace-ccdmr:service-provider-1"
    authorizer_test.go:100: applying "apis.kcp.io/v1alpha1, Kind=APIResourceSchema" workspace "root:e2e-workspace-ccdmr:service-provider-1" name "today.sheriffs.wild.wild.west"
    authorizer_test.go:100: applying "apis.kcp.io/v1alpha1, Kind=APIExport" workspace "root:e2e-workspace-ccdmr:service-provider-1" name "wild.wild.west"
    authorizer_test.go:100: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRole" workspace "root:e2e-workspace-ccdmr:service-provider-1" name "tenant-user-bind-apiexport"
    authorizer_test.go:100: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding" workspace "root:e2e-workspace-ccdmr:service-provider-1" name "tenant-user-bind-apiexport"
    authorizer_test.go:133: get the sheriffs apiexport's generated identity hash
    authorizer_test.go:148: Found identity hash: 09cc3909865b33171cc220d31d5d3e8c2681848f2aabe2b76e047b3e32704be7
    authorizer_test.go:150: install cowboys API resource schema, API export, and permissions for tenant-user to be able to bind to the export in second service provider workspace "root:e2e-workspace-ccdmr:service-provider-2"
    authorizer_test.go:151: applying "apis.kcp.io/v1alpha1, Kind=APIResourceSchema" workspace "root:e2e-workspace-ccdmr:service-provider-2" name "today.cowboys.wildwest.dev"
    authorizer_test.go:151: applying "apis.kcp.io/v1alpha1, Kind=APIExport" workspace "root:e2e-workspace-ccdmr:service-provider-2" name "today-cowboys"
    authorizer_test.go:151: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRole" workspace "root:e2e-workspace-ccdmr:service-provider-2" name "tenant-user-bind"
    authorizer_test.go:151: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding" workspace "root:e2e-workspace-ccdmr:service-provider-2" name "tenant-user-bind"
    authorizer_test.go:151: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRole" workspace "root:e2e-workspace-ccdmr:service-provider-2" name "tenant-user-maximum-permission-policy"
    authorizer_test.go:151: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding" workspace "root:e2e-workspace-ccdmr:service-provider-2" name "tenant-user-maximum-permission-policy"
    authorizer_test.go:207: bind cowboys and claimed sherriffs in the tenant workspace "root:e2e-workspace-ccdmr:tenant"
    authorizer_test.go:209: applying "apis.kcp.io/v1alpha1, Kind=APIBinding" workspace "root:e2e-workspace-ccdmr:tenant" name "wild.wild.west"
    authorizer_test.go:209: applying "apis.kcp.io/v1alpha1, Kind=APIBinding" workspace "root:e2e-workspace-ccdmr:tenant" name "cowboys"
    authorizer_test.go:257: Make sure ["wildwest.dev", "wild.wild.west"] API groups shows up in consumer workspace "root:e2e-workspace-ccdmr:tenant" group discovery
    authorizer_test.go:268: Install cowboys CRD and also bind the conflicting cowboys API export in tenant workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd"
    authorizer_test.go:269: applying "apiextensions.k8s.io/v1, Kind=CustomResourceDefinition" workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd" name "cowboys.wildwest.dev"
    authorizer_test.go:289: Waiting for cowboys CRD to be ready in tenant workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd"
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:303: Create a cowboys APIBinding in consumer workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd" that points to the today-cowboys export from "root:e2e-workspace-ccdmr:service-provider-2" but shadows a local cowboys CRD at the same time
    authorizer_test.go:304: applying "apis.kcp.io/v1alpha1, Kind=APIBinding" workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd" name "cowboys"
    authorizer_test.go:337: Waiting for cowboys APIBinding in consumer workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd" to have the condition "BindingUpToDate" mentioning the conflict with the shadowing local cowboys CRD
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:355: applying "wildwest.dev/v1alpha1, Kind=Cowboy" workspace "root:e2e-workspace-ccdmr:tenant" namespace "default" name "cowboy-via-api-binding"
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:363: applying "wildwest.dev/v1alpha1, Kind=Cowboy" workspace "root:e2e-workspace-ccdmr:tenant-shadowed-crd" namespace "default" name "cowboy-via-crd"
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:371: Create virtual workspace client for "today-cowboys" APIExport in workspace "root:e2e-workspace-ccdmr:service-provider-2"
    authorizer_test.go:388: verify that service-provider-2-admin cannot list sherrifs resources via virtual apiexport apiserver because we have no local maximal permissions yet granted
    authorizer_test.go:398: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRole" workspace "root:e2e-workspace-ccdmr:service-provider-1" name "service-provider-2-admin-maximum-permission-policy"
    authorizer_test.go:398: applying "rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding" workspace "root:e2e-workspace-ccdmr:service-provider-1" name "service-provider-2-admin-maximum-permission-policy"
    authorizer_test.go:412: verify that service-provider-2-admin can lists all claimed resources using a wildcard request
=== CONT  TestAPIExportAuthorizers
    assertions.go:1691: Waiting for condition, but got: error while waiting to list "wild.wild.west/v1alpha1, Resource=sheriffs": the server could not find the requested resource
=== CONT  TestAPIExportAuthorizers
    authorizer_test.go:417: 
        	Error Trace:	util.go:327
        	            				authorizer_test.go:417
        	Error:      	Condition never satisfied
        	Test:       	TestAPIExportAuthorizers
        	Messages:   	listing claimed resources failed
--- FAIL: TestAPIExportAuthorizers (48.53s)
@ncdc ncdc added the kind/flake Categorizes issue or PR as related to a flaky test. label Jan 31, 2023
@ncdc ncdc added this to kcp Jan 31, 2023
@github-project-automation github-project-automation bot moved this to New in kcp Jan 31, 2023
@ncdc ncdc moved this from New to Next in kcp Jan 31, 2023
@ncdc ncdc moved this from Next to In Progress in kcp Jan 31, 2023
@ncdc ncdc self-assigned this Jan 31, 2023
@ncdc
Copy link
Member Author

ncdc commented Jan 31, 2023

I can reproduce this locally. I've added extra logging to pkg/virtual/apiexport/controllers/apireconciler/apiexport_apireconciler_reconcile.go to see what's going on. What I'm seeing is that

exports, err := c.apiExportIndexer.ByIndex(indexers.APIExportByIdentity, pc.IdentityHash)

sometimes returns 0 exports, meaning kcp can't find the APIExport for sheriffs in this case, so it never adds it to discovery. Not sure why yet.

@davidfestal
Copy link
Member

By any chance, is this flake the same as what @jmprusi seemed to have tried to fix in PR https://github.com/kcp-dev/kcp/pull/2417/files#diff-8d6f5573691c6741b1615098ed0f51a30afa2e1e9a906f19d5cbfceaef754930 ?

@ncdc
Copy link
Member Author

ncdc commented Jan 31, 2023

@davidfestal yes, although that's not really a fix 😄

@ncdc
Copy link
Member Author

ncdc commented Jan 31, 2023

More data from a local test failure (sharded, 1 shard):

  • 14:53:34.279516: httplog records patching of APIExport to set identity hash
  • 14:53:34.279628: kcp sees identity hash during indexing (IndexAPIExportByIdentity)
  • 14:53:34.281443: vw server IndexAPIExportByIdentity does NOT see hash (i.e. it's empty)
  • 14:53:35.029849: vw server apiexport_apireconciler_controller.go first queues today-cowboys APIExport
  • 14:53:35.029995: vw server apiexport_apireconciler_reconcile.go lists exports for identity; gets 0
  • 14:53:35.116657: e2e first tries to list sheriffs
  • 14:53:35.226938: vw server IndexAPIExportByIdentity sees hash
  • 14:54:05.042799: e2e last tries to list sheriffs; test fails
  • 14:54:05.383739: vw server apiexport_apireconciler_controller.go queues today-cowboys APIExport because APIResourceSchema was deleted
  • 14:54:05.383953: vw server apiexport_apireconciler_reconcile.go lists exports for identity; gets 1
  • 14:54:05.384139: vw server creates API definition for sheriffs

@ncdc
Copy link
Member Author

ncdc commented Jan 31, 2023

Note: the APIExport informer here is from the cache server

@ncdc
Copy link
Member Author

ncdc commented Jan 31, 2023

Oh I think when the APIExport that has sheriffs gets updated, we need to queue all other APIExports that have claims against its (sheriffs) identity. Testing this out.

@ncdc ncdc changed the title flake: TestAPIExportAuthorizers bug: TestAPIExportAuthorizers Jan 31, 2023
@ncdc ncdc added the kind/bug Categorizes issue or PR as related to a bug. label Jan 31, 2023
@ncdc ncdc changed the title bug: TestAPIExportAuthorizers bug/flake: TestAPIExportAuthorizers Jan 31, 2023
@ncdc
Copy link
Member Author

ncdc commented Jan 31, 2023

Root cause:

  • given 2 APIExports A and B
  • B has a permission claim for A's identity hash
  • A is updated
  • We never enqueued B as a result

@lionelvillard
Copy link
Contributor

@lionelvillard lionelvillard reopened this Feb 1, 2023
@github-project-automation github-project-automation bot moved this from Done to In Progress in kcp Feb 1, 2023
@ncdc
Copy link
Member Author

ncdc commented Feb 1, 2023

Different root cause. Let's open a new one.
/close

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 1, 2023

@ncdc: Closing this issue.

In response to this:

Different root cause. Let's open a new one.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot closed this as completed Feb 1, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in kcp Feb 1, 2023
@ncdc ncdc changed the title bug/flake: TestAPIExportAuthorizers bug/flake: TestAPIExportAuthorizers - APIExport view missing queueing for related APIExports (via claims) Feb 1, 2023
@ncdc
Copy link
Member Author

ncdc commented Feb 1, 2023

#2732 for the new one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test.
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants