Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coscheduling: integration-test for PodGroup status #549

Merged
merged 1 commit into from
Apr 6, 2023

Conversation

Gekko0114
Copy link
Member

What type of PR is this?

/kind bug
/kind feature

What this PR does / why we need it:

Added integration test for PodGroup Status.
Fixed some logic to ensure smooth running of the Reconcile function.

Which issue(s) this PR fixes:

Fixes #516

Special notes for your reviewer:

Thank you for reviewing this PR.
Please let me know if I have made any mistakes or if there are any improvements that could be made.
Your comments and feedback are appreciated.

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 19, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @Gekko0114. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 19, 2023
test/util/utils.go Outdated Show resolved Hide resolved
pkg/controllers/podgroup_controller.go Outdated Show resolved Hide resolved
@Gekko0114 Gekko0114 force-pushed the integration-test branch 2 times, most recently from 2f59587 to ae9b4ed Compare March 20, 2023 12:45
@zwpaper
Copy link
Member

zwpaper commented Mar 20, 2023

I am a little bit busy past few days, will make sure the testing logic works locally in the following days

@Gekko0114
Copy link
Member Author

Thank you so much.
Do you mean you will work on fixing the Reconcile logic or the PostBind logic? I will appreciate it.

@Gekko0114
Copy link
Member Author

Or if you explain your idea, perhaps I can work on it instead.

@zwpaper
Copy link
Member

zwpaper commented Mar 24, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 24, 2023
@zwpaper
Copy link
Member

zwpaper commented Mar 24, 2023

Or if you explain your idea, perhaps I can work on it instead.

sorry for the late reply, I mean I will check the testing logic locally, and I was started working on it, will respond in not long later

@Gekko0114
Copy link
Member Author

Hi @zwpaper,
Thank you so much for your confirmation. I will double-check it.

@Gekko0114 Gekko0114 force-pushed the integration-test branch 2 times, most recently from 78f5a6b to 695f01b Compare March 26, 2023 13:46
@Gekko0114
Copy link
Member Author

Hi @zwpaper,

I checked it again but it seems to be working fine.
When I ran this code again, all test cases passed successfully. When I changed want status to something else, the test failed.

I have included the logs for same test cases, and also the logs when I changed the want status to something else.

I tried this test code on macOS and latest repository version.
Could you please cherry-pick my PR branch and try running the test again? If it still doesn't work, could you provide more details about your logs?

(BTW, I've updated the test code to use gocmp.Diff, but this is not related to this issue.)

Running successfully for same test cases.

=== RUN TestPodGroupController/The_status_of_no_podGroup_label_pod_changes_from_pending_to_running
I0326 22:52:17.182295 15173 podgroup_controller_test.go:281] "Creating pod " podName="t1-p1-1"
I0326 22:52:17.184989 15173 podgroup_controller_test.go:281] "Creating pod " podName="t1-p1-2"
I0326 22:52:17.187493 15173 podgroup_controller_test.go:281] "Creating pod " podName="t1-p1-3"
podgroup_controller_test.go:357: Case The status of no podGroup label pod changes from pending to running finished
--- PASS: TestPodGroupController (6.66s)
--- PASS: TestPodGroupController/Statuses_of_all_pods_change_from_pending_to_running (2.92s)
--- PASS: TestPodGroupController/Statuses_of_all_pods_change_from_running_to_succeeded (0.89s)
--- PASS: TestPodGroupController/The_status_of_one_pod_changes_from_running_to_succeeded (0.85s)
--- PASS: TestPodGroupController/The_status_of_the_pod_changes_from_running_to_failed (0.88s)
--- PASS: TestPodGroupController/The_status_of_no_podGroup_label_pod_changes_from_pending_to_running (0.87s)
PASS
E0326 22:52:18.052442 15173 scheduling_queue.go:1072] "Error while retrieving next pod from scheduling queue" err="scheduling queue is closed"
ok sigs.k8s.io/scheduler-plugins/test/integration 31.796s

Failed when changing want status.
(Changing line266 to this util.UpdatePGStatus(util.MakePG("pg1-1", ns, 3, nil, nil), "Pending", "", 0, 0, 0, 1))

=== RUN TestPodGroupController/The_status_of_no_podGroup_label_pod_changes_from_pending_to_running
I0326 22:54:35.610528 15474 podgroup_controller_test.go:281] "Creating pod " podName="t1-p1-1"
I0326 22:54:35.613804 15474 podgroup_controller_test.go:281] "Creating pod " podName="t1-p1-2"
I0326 22:54:35.616372 15474 podgroup_controller_test.go:281] "Creating pod " podName="t1-p1-3"
podgroup_controller_test.go:349:   v1alpha1.PodGroupStatus{
   ... // 3 identical fields
   Running: 0,
   Succeeded: 0,
-  Failed: 0,
+  Failed: 1,
   ... // 1 ignored field
  }
podgroup_controller_test.go:355: The status of no podGroup label pod changes from pending to running Waiting next PodGroup error: timed out waiting for the condition
--- FAIL: TestPodGroupController (16.53s)
--- PASS: TestPodGroupController/Statuses_of_all_pods_change_from_pending_to_running (2.89s)
--- PASS: TestPodGroupController/Statuses_of_all_pods_change_from_running_to_succeeded (0.89s)
--- PASS: TestPodGroupController/The_status_of_one_pod_changes_from_running_to_succeeded (0.86s)
--- PASS: TestPodGroupController/The_status_of_the_pod_changes_from_running_to_failed (0.86s)
--- FAIL: TestPodGroupController/The_status_of_no_podGroup_label_pod_changes_from_pending_to_running (10.72s)
FAIL
E0326 22:54:46.331361 15474 scheduling_queue.go:1072] "Error while retrieving next pod from scheduling queue" err="scheduling queue is closed"
FAIL sigs.k8s.io/scheduler-plugins/test/integration 41.908s
FAIL

@zwpaper
Copy link
Member

zwpaper commented Mar 26, 2023

Oh my bad, I mistakenly change the args, it work after a second look.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 26, 2023
@zwpaper
Copy link
Member

zwpaper commented Mar 26, 2023

@PiotrProkop I have no more questions, is this one ok for you?

@Gekko0114 Gekko0114 force-pushed the integration-test branch 2 times, most recently from 206d5d4 to 4db6b00 Compare March 28, 2023 04:38
@Gekko0114
Copy link
Member Author

Gekko0114 commented Mar 28, 2023

Hi @denkensk,

Thanks for your comments!
I fixed it, so could you have a look again ?
Thanks,

@Gekko0114
Copy link
Member Author

Hi @denkensk,

Sorry for ping you multiple times.
Could you approve it?
I have fixed comments. Additionally, this PR will fix a small bug in the PodGroup, so I would like to merge it ASAP

@denkensk
Copy link
Member

denkensk commented Apr 3, 2023

/lgtm
/approve

Thanks for your contribution. @Gekko0114
And sorry for the late reply.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 3, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: denkensk, Gekko0114

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 3, 2023
@Gekko0114
Copy link
Member Author

@denkensk,
I apologize for reminding you, and thank you for your prompt confirmation!

@Gekko0114
Copy link
Member Author

/test

@k8s-ci-robot
Copy link
Contributor

@Gekko0114: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test pull-scheduler-plugins-integration-test-master
  • /test pull-scheduler-plugins-unit-test-master
  • /test pull-scheduler-plugins-verify
  • /test pull-scheduler-plugins-verify-build-master

Use /test all to run all jobs.

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Gekko0114
Copy link
Member Author

/test pull-scheduler-plugins-integration-test-master

@Gekko0114
Copy link
Member Author

Gekko0114 commented Apr 3, 2023

When I run make integration-test "ARGS=-run TestPodGroupController, test can be passed.
However, make integration-test can't be passed... I will investigate on it.

panic: close of closed channel
goroutine 9857 [running]:
sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler()
/home/prow/go/src/sigs.k8s.io/scheduler-plugins/vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go:31 +0x25
sigs.k8s.io/scheduler-plugins/test/integration.TestPodGroupController.func1()
/home/prow/go/src/sigs.k8s.io/scheduler-plugins/test/integration/podgroup_controller_test.go:78 +0x32
created by sigs.k8s.io/scheduler-plugins/test/integration.TestPodGroupController
/home/prow/go/src/sigs.k8s.io/scheduler-plugins/test/integration/podgroup_controller_test.go:77 +0x3dd
FAIL sigs.k8s.io/scheduler-plugins/test/integration 356.228s

@denkensk
Copy link
Member

denkensk commented Apr 4, 2023

I apologize for reminding you, and thank you for your prompt confirmation!

@Gekko0114 It's okay, it's my responsibility. Thanks for your contribution again.

@denkensk
Copy link
Member

denkensk commented Apr 4, 2023

When I run make integration-test "ARGS=-run TestPodGroupController, test can be passed.
However, make integration-test can't be passed... I will investigate on it.

/hold for waiting

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 4, 2023
@Gekko0114
Copy link
Member Author

It seems that this error is caused by controller-runtime's SetupSignalHandler.
I've created an issue.
Please let me know if you can fix this issue.

In this #549, we have created integration test using mgr.Start(ctrl.SetupSignalHandler()). However, as another test also uses mgr.Start(ctrl.SetupSignalHandler()), we are encountering a panic error.

panic: close of closed channel
goroutine 9857 [running]:
sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler()
/home/prow/go/src/sigs.k8s.io/scheduler-plugins/vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go:31 +0x25
sigs.k8s.io/scheduler-plugins/test/integration.TestPodGroupController.func1()
/home/prow/go/src/sigs.k8s.io/scheduler-plugins/test/integration/podgroup_controller_test.go:78 +0x32
created by sigs.k8s.io/scheduler-plugins/test/integration.TestPodGroupController
/home/prow/go/src/sigs.k8s.io/scheduler-plugins/test/integration/podgroup_controller_test.go:77 +0x3dd
FAIL sigs.k8s.io/scheduler-plugins/test/integration 356.228s

I think this is because ctrl.SetupSignalHandler is being executed twice in a single go test.
Could you suggest how we can implement integration test in such cases?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2023
@Gekko0114
Copy link
Member Author

Hi @denkensk ,

Could you review and approve it again?

I fixed the issue by creating this param and shared this between integration-tests.
var signalHandler = ctrl.SetupSignalHandler()

@denkensk
Copy link
Member

denkensk commented Apr 6, 2023

/lgtm
/unhold

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Apr 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit fc0b16b into kubernetes-sigs:master Apr 6, 2023
@Gekko0114 Gekko0114 deleted the integration-test branch April 15, 2023 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Coscheduling] test: need a integration test for PodGroup Status
5 participants