Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add unit test for jobset controller #485

Closed
wants to merge 1 commit into from

Conversation

googs1025
Copy link
Member

issue from #476
It's a bit embarrassing. I accidentally deleted the PR during synchronization and resubmitted a new one. Sorry.
When there is rebase, I don’t know how to do it, and I often close the submission.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: googs1025
Once this PR has been reviewed and has the lgtm label, please assign danielvegamyhre for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 28, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @googs1025. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 28, 2024
Copy link

netlify bot commented Mar 28, 2024

Deploy Preview for kubernetes-sigs-jobset canceled.

Name Link
🔨 Latest commit 96fdbd2
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-jobset/deploys/662baff29528d60009af0571

@danielvegamyhre
Copy link
Contributor

When there is rebase, I don’t know how to do it, and I often close the submission.

No worries! I will take a look at this tomorrow.

@danielvegamyhre
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 28, 2024
@googs1025 googs1025 force-pushed the main branch 4 times, most recently from 0abec76 to 780374e Compare March 28, 2024 06:47
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 29, 2024
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 29, 2024
@danielvegamyhre
Copy link
Contributor

danielvegamyhre commented Apr 4, 2024

@googs1025 I am refactoring createJobs() to make it more testable in #496 since right now this function does a lot things unrelated to creating jobs (enforcing startup policy rules, suspend/resume, etc.) which makes it harder to unit test thoroughly.

This will make it easier for you to write the tests, and easier for me to review them as well.

I added some preliminary notes on some of your test in the meantime.

Comment on lines 1235 to 1247
name: "create jobs with DNSHostname",
js: testutils.MakeJobSet(jobSetName, ns).EnableDNSHostnames(true).
ReplicatedJob(testutils.MakeReplicatedJob("replicated-job-1").
Job(testutils.MakeJobTemplate("test-job", ns).Obj()).
Replicas(1).
Obj()).Obj(),
jobs: childJobs{
active: []*batchv1.Job{
makeJob(&makeJobArgs{
jobSetName: jobSetName,
replicatedJobName: "replicated-job-1",
jobName: "test-jobset-replicated-job-1-test-job-0",
ns: ns,
replicas: 1,
jobIdx: 0,
}).Parallelism(1).
Completions(2).
Ready(1).
Active(1).Obj(),
},
},
expectedJobsCreated: []*batchv1.Job{
makeJob(&makeJobArgs{
jobSetName: jobSetName,
replicatedJobName: "replicated-job-1",
jobName: "test-jobset-replicated-job-1-test-job-0",
ns: ns,
replicas: 1,
jobIdx: 0,
}).Parallelism(1).
Completions(2).
Ready(1).
Active(1).Obj(),
},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing in the createJobs() function anymore that depends on enableDNShostnames, we can remove this test case

},
},
{
name: "create jobs with StartupPolicyOrder is InOrder",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this test case, we should have 2 replicated jobs, and ensure we only see Job creation calls for the jobs in the 1st replicated Job.

},
},
errorFlag: true,
expectedErr: errors.Join(errors.New("job \"test-jobset-replicated-job-1-0\" creation failed with error: create pod error")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to do errors.Join() on a single error, do you find you needed to in order for the test to pass?

return nil
}
if tc.errorFlag || job == nil {
return errors.New("create pod error")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return errors.New("create pod error")
return errors.New("example error")

Let's keep the error message generic and obvious test error, so it's clear when reading the test cases we aren't actually expecting some pod related error.

@googs1025
Copy link
Member Author

@danielvegamyhre Thank you for your response. I will update the content of this PR after you merge your pr.

if tc.errorFlag || job == nil {
return errors.New("delete pod error")
}
job = tc.jobs[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you overriding the job that was deleted with the test case job at index 0 every time?

In the jobsDeleted slice, we should just track the actual job objects that were in the Delete() calls.

expected bool
}{
{
name: "test jobSetFinished with not finished",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name: "test jobSetFinished with not finished",
name: "jobset not finished",

Obj()).Obj(),
},
{
name: "test jobSetFinished with finished",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name: "test jobSetFinished with finished",
name: "jobset finished",

expected: true,
},
}
for _, tc := range tests {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test case for a failed JobSet? It should be considered finished as well.

tests := []struct {
name string
jobs []*batchv1.Job
expectedJobsUpdated []*batchv1.Job
Copy link
Contributor

@danielvegamyhre danielvegamyhre Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expectedJobsUpdated []*batchv1.Job
wantResumed []*batchv1.Job

name string
js *jobset.JobSet
jobs childJobs
expectedErr error
Copy link
Contributor

@danielvegamyhre danielvegamyhre Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename expectedErr to wantErr everywhere in this file? It's more concise and generally adheres to the "want/got" pattern usually seen in Go testing

js *jobset.JobSet
jobs childJobs
expectedErr error
errorFlag bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename errorFlag to forceError everywhere in this file? I think this name makes the parameter's purpose more clear.

if tc.errorFlag || job == nil {
return errors.New("create pod error")
}
job = tc.jobs.active[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to remove this line, should be tracking the job passed in the Create() call, not the defined in the test case.

record record.EventRecorder
}

func makeClient(interceptor interceptor.Funcs, initObjs ...client.Object) *fakeClient {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func makeClient(interceptor interceptor.Funcs, initObjs ...client.Object) *fakeClient {
func makeFakeClient(interceptor interceptor.Funcs, initObjs ...client.Object) *fakeClient {

Comment on lines 1207 to 1213
makeJob(&makeJobArgs{
jobSetName: jobSetName,
replicatedJobName: "replicated-job-1",
jobName: "test-jobset-replicated-job-1-test-job-0",
ns: ns,
replicas: 1,
jobIdx: 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these makeJob() calls in the test cases are identical, let's just store 1-2 test jobs as variable above the test cases, then re-use those variables as much as possible in the test cases. This applies to all the tests in this file.

This should drastically reduce the number of lines in this file and improve the readability of the test cases.

@googs1025
Copy link
Member Author

Maybe this PR can wait until @danielvegamyhre these PRs #496 #494 are merged before I make new changes.

@googs1025
Copy link
Member Author

We can prioritize handling this PR #490 first.

@danielvegamyhre
Copy link
Contributor

We can prioritize handling this PR #490 first.

Sounds good I will take a look at #490 again

@danielvegamyhre
Copy link
Contributor

/hold

until comments addressed

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 23, 2024
@danielvegamyhre
Copy link
Contributor

@googs1025 I am refactoring createJobs() to make it more testable in #496 since right now this function does a lot things unrelated to creating jobs (enforcing startup policy rules, suspend/resume, etc.) which makes it harder to unit test thoroughly.

This will make it easier for you to write the tests, and easier for me to review them as well.

I added some preliminary notes on some of your test in the meantime.

The refactor is complete, feel free to resume this PR

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2024
@googs1025
Copy link
Member Author

@danielvegamyhre Okay, I plan to spend some time next week working on this PR

@danielvegamyhre
Copy link
Contributor

@googs1025 Ok please submit PRs with tests for only 1 function at a time, so that it is easier to review.

Add comment explaining why we don't unconditionally compute firstFailedJob
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 26, 2024
@k8s-ci-robot
Copy link
Contributor

@googs1025: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-jobset-verify-main 96fdbd2 link true /test pull-jobset-verify-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@danielvegamyhre
Copy link
Contributor

@googs1025 are you still working on this?

@googs1025
Copy link
Member Author

@googs1025 are you still working on this?

Yes, I have been quite busy with work recently. I will start this PR next week.
thanks for reminding

@danielvegamyhre
Copy link
Contributor

Just following up on this @googs1025, do you still plan to finish this?

@googs1025
Copy link
Member Author

Just following up on this @googs1025, do you still plan to finish this?
@danielvegamyhre
Sorry, maybe I can let this issue go, I have other things with higher priority to do

@googs1025 googs1025 closed this May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants