Assessment runner refactors #2123

stevenhorsman · 2024-10-21T16:19:04Z

Various refactors to the assessment helper and assessment runner to try and abstract out complexity of the large runner Assess section, to make it easier to read and reduce duplication

Drive by fix to drop the rust version post the merge of test/e2e: kbs: Use the cached kbs-client #2121

wainersm · 2024-10-28T20:59:28Z

src/cloud-api-adaptor/test/e2e/assessment_helpers.go

+			return podErr
+		}
+
+		t.Logf("podEvents: %+v\n", podEventsDescriptions)


Hi @stevenhorsman !

What does the "+" in "+v" do? I vaguely remember I had to remove it to print the string correctly, I just don't remember where.

The %+v includes the field names of the struct. In this case it's a string IIRC, so I should just switch to %s instead. I think the %+v was from an earlier version. Sorry about that.

wainersm · 2024-10-28T21:00:49Z

src/cloud-api-adaptor/test/e2e/assessment_helpers.go

+
+		t.Logf("podEvents: %+v\n", podEventsDescriptions)
+		if !strings.Contains(podEventsDescriptions, expectedPodEventDescribe) {
+			err = fmt.Errorf("error: Pod Descriptions don't contain Expected String %s", expectedPodEventDescribe)


Nit: s/Pod Descriptions/Pod Events/

Thanks - I usually think of the pod events in the context of kubectl describe, so my inconsistencies leak in.

mkulke · 2024-10-29T07:56:50Z

src/cloud-api-adaptor/test/e2e/assessment_helpers.go

+	pods, err := GetPodNamesByLabel(ctx, client, t, caaPod.GetObjectMeta().GetNamespace(), "app", "cloud-api-adaptor")
+	if err != nil {
+		return nil, fmt.Errorf("getCaaPod: GetPodNamesByLabel failed: %v", err)
+	}
+	caaPod.Name = pods.Items[0].Name


should we assert that it's exactly 1 pod? (sometimes there's e.g. a crashed pod in "Terminating" State and another one in "Creating")

Yes definitely. I will add that in. Thanks for the suggestion.

wainersm · 2024-10-29T12:56:35Z

src/cloud-api-adaptor/test/e2e/assessment_helpers.go

+	return descriptionsBuilder.String(), nil
+}
+
+func ComparePodEventWarningDescriptions(ctx context.Context, t *testing.T, client klient.Client, pod v1.Pod, expectedPodEvent string) error {


Suggestion: document the function so users understand the logic easily. You did it very well on the commit message, btw.

wainersm · 2024-10-29T14:26:10Z

src/cloud-api-adaptor/test/e2e/assessment_helpers.go

-			if err != nil {
-				return false, err
-			}
+	caaPod.Name = pods.Items[0].Name


The nodeName parameter is not longer used (can be removed) and first caa pod is picked up. Is there a bad consequence if running the test on multiple-workers type of cluster?

That's an interesting point. I think we need to pass the nodeName into the CAA pod so we can check we are getting the correct one. Let me try that

Ok, I've added 69e05f0 which I hope addresses your concern?

I think it does address. thanks @stevenhorsman !

We have some repeated patterns of looking through list of pods to find the one matching our expected name and then extracting some information from it. Rather than repeat this in multiple places we can refactor this pattern out in assessment_helpers and re-use it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>

I used the `PodEventExtractor` method to compare against an expected error message, but didn't realise at the time that it just watched for the first warning message and returned that. We then use this output in `assessment_runner` to see if our expected error description is there. Note: As we use it in the `Assess` phase of testing and the pod should be in the expected state after the `WithSetup`, we should be able to replace the need for the watch and just get a list of the events - Rename to clarify that only warning (and error) events are considered and create a `ComparePodEventDescriptions` method in `assessment_helper` to help simplify the logic in `assessment_runner` and process all warning events looking for a description match, not just the first match. If no error was expected then mark the test as failing. - In some of the tests that expect failure events, as we are no longer watching for the first error, we should wait for pod to have gone into failed state before processing the events, in others we add a retry to check for the message. Based on the above logic we can simplify the instance type testing Signed-off-by: stevenhorsman <steven@uk.ibm.com>

Refactor out the instance type check to the assessment_helper, to make the assessment_runner flow easier to read Signed-off-by: stevenhorsman <steven@uk.ibm.com>

Refactor out the logic that checks the alternativeImage is used to make the assessment_runner flow cleaner Signed-off-by: stevenhorsman <steven@uk.ibm.com>

- Refactor out the logic that checks if nydus snapshotter was used for image pull, to make the assessment_runner flow cleaner. - Refactor IsPulledWithNydusSnapshotter to use remove the duplicated code and re-use other helper functions Signed-off-by: stevenhorsman <steven@uk.ibm.com>

Create new methods for VerifyCaaPodLogContains and getCaaPod to avoid duplication and improve readability. Signed-off-by: stevenhorsman <steven@uk.ibm.com>

No tests are using the `WithFailReason` option, so remove the code to reduce clutter Signed-off-by: stevenhorsman <steven@uk.ibm.com>

The [Go Wiki]|(https://go.dev/wiki/TestComments) states > On a practical level, prefer calling t.Error over t.Fatal. > t.Fatal is usually only appropriate when some piece > of test setup fails, without which you cannot run the > test at all. - Update t.Fatal's in assess methods to t.Error's - Pass pointers to pods rather than the Pod, like the Kubernetes e2e pod base does Signed-off-by: stevenhorsman <steven@uk.ibm.com>

Refactor out the logic that checks if the image pull timing, to make the assessment_runner flow cleaner. Signed-off-by: stevenhorsman <steven@uk.ibm.com>

After confidential-containers#2121 the rust version is no longer needed Signed-off-by: stevenhorsman <steven@uk.ibm.com>

As Wainer pointed out in the review, if we ran the e2e tests on a cluster with multiple worker nodes, then our logic for only checking for a single CAA pod might fail, so add an additional nodeName filter, so get the CAA that is running on the same node as the pod. Signed-off-by: stevenhorsman <steven@uk.ibm.com>

wainersm

Hey @stevenhorsman ! Great refactoring, the code is much nicer to read indeed!

stevenhorsman added the test_e2e_libvirt Run Libvirt e2e tests label Oct 21, 2024

stevenhorsman force-pushed the assessment-runner-refactors branch 9 times, most recently from 41d51c4 to 024978b Compare October 25, 2024 12:01

stevenhorsman marked this pull request as ready for review October 25, 2024 13:21

stevenhorsman requested a review from a team as a code owner October 25, 2024 13:21

stevenhorsman requested a review from wainersm October 25, 2024 13:41

wainersm reviewed Oct 28, 2024

View reviewed changes

mkulke approved these changes Oct 29, 2024

View reviewed changes

stevenhorsman force-pushed the assessment-runner-refactors branch from 024978b to 30a7d81 Compare October 29, 2024 08:58

wainersm reviewed Oct 29, 2024

View reviewed changes

stevenhorsman added 10 commits October 29, 2024 14:55

test/e2e: InstanceType refactor

003edef

Refactor out the instance type check to the assessment_helper, to make the assessment_runner flow easier to read Signed-off-by: stevenhorsman <steven@uk.ibm.com>

test/e2e: Refactor out VerifyAlternateImage

c6d937a

Refactor out the logic that checks the alternativeImage is used to make the assessment_runner flow cleaner Signed-off-by: stevenhorsman <steven@uk.ibm.com>

test/e2e: Create more helper functions

54993fa

Create new methods for VerifyCaaPodLogContains and getCaaPod to avoid duplication and improve readability. Signed-off-by: stevenhorsman <steven@uk.ibm.com>

test/e2e: Remove fail reason test case option

98855c2

No tests are using the `WithFailReason` option, so remove the code to reduce clutter Signed-off-by: stevenhorsman <steven@uk.ibm.com>

test/e2e: Refactor VerifyImagePullTimer

3790a9a

Refactor out the logic that checks if the image pull timing, to make the assessment_runner flow cleaner. Signed-off-by: stevenhorsman <steven@uk.ibm.com>

versions: Remove rust version

e1767d2

After confidential-containers#2121 the rust version is no longer needed Signed-off-by: stevenhorsman <steven@uk.ibm.com>

stevenhorsman force-pushed the assessment-runner-refactors branch from 30a7d81 to e1767d2 Compare October 29, 2024 15:00

stevenhorsman force-pushed the assessment-runner-refactors branch from d701c6e to 69e05f0 Compare October 29, 2024 15:40

wainersm approved these changes Oct 29, 2024

View reviewed changes

wainersm merged commit 5672559 into confidential-containers:main Oct 29, 2024
28 checks passed

stevenhorsman deleted the assessment-runner-refactors branch October 30, 2024 08:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Assessment runner refactors #2123

Assessment runner refactors #2123

stevenhorsman commented Oct 21, 2024 •

edited

Loading

wainersm Oct 28, 2024

stevenhorsman Oct 29, 2024

wainersm Oct 28, 2024

stevenhorsman Oct 29, 2024

mkulke Oct 29, 2024

stevenhorsman Oct 29, 2024

wainersm Oct 29, 2024

wainersm Oct 29, 2024

stevenhorsman Oct 29, 2024

stevenhorsman Oct 29, 2024 •

edited

Loading

wainersm Oct 29, 2024

wainersm left a comment

Assessment runner refactors #2123

Assessment runner refactors #2123

Conversation

stevenhorsman commented Oct 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stevenhorsman Oct 29, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wainersm left a comment

Choose a reason for hiding this comment

stevenhorsman commented Oct 21, 2024 •

edited

Loading

stevenhorsman Oct 29, 2024 •

edited

Loading