Create Agent policies per each test execution #1866

mrodm · 2024-05-24T11:44:52Z

Relates #787

This PR adds a new Agent Policy that will be created and configured per each system test execution.

This new Agent Policy brings the following benefits:

- Each test ingests documents into its own Data Stream.
- There should be no need to manage the deletion of documents, since every execution creates a new Data Stream.
- Previously, with a shared Agent Policy, the data streams could contain documents from previous runs, and sometimes it was not possible to check whether all documents had been removed, as happens in this PR when running system tests per stages:
  - Update installation/uninstallation package process in system tests #1845
  - https://buildkite.com/elastic/elastic-package/builds/3162

After each test run, the new Agent Policy is deleted, as well as the Data Stream created for it.

mrodm · 2024-05-24T19:14:29Z

/test

elasticmachine · 2024-05-24T19:49:56Z

💚 Build Succeeded

Buildkite Build
Commit: 7fd7350

History

💔 Build #3221 failed 7fd7350
💚 Build #3213 succeeded 3c98df1

cc @mrodm

mrodm · 2024-05-28T19:12:25Z

internal/testrunner/runners/system/runner.go

+	if !r.options.RunIndependentElasticAgent {
+		// keep same behaviour as previously when Elastic Agent of the stack is used.
+		return false
+	}


I thought about keeping the same behaviour when using the Elastic Agent of the stack. WDYT?

mrodm · 2024-05-29T09:12:27Z

internal/testrunner/runners/system/runner.go

+		tdErr := r.tearDownTest(ctx)
+		if tdErr != nil {
+			logger.Errorf("failed to tear down runner: %s", tdErr.Error())
+		}


Required to delete the policy created just for testing and also the related Data Streams.

mrodm · 2024-05-29T11:18:14Z

test integrations

mrodm · 2024-06-05T16:00:26Z

test integrations

elasticmachine · 2024-06-05T16:06:16Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#10005

jsoriano

I like the concept, but I wonder if we need as many policies as we have. We are creating three kinds of policies now.

jsoriano · 2024-06-05T16:05:51Z

internal/testrunner/runners/system/runner.go

+	policyToAssignDatastreamTests := policyToTest
+	if r.shouldCreateNewAgentPolicyForTest() {


Do we need the policyToTest? Couldn't we always use the policy created here?

If I just update policyToTest to be like this one, meaning that the namespace is random, it would work only while the system tests are not run per stages (--setup, --no-provision, --tear-down).

When these flags are used, the same data stream created in the first run would be used, retrieved from the service state:

elastic-package/internal/testrunner/runners/system/runner.go

Line 859 in 858808b

policyToTest = &serviceStateData.CurrentPolicy

That would mean that the method to delete old documents from the data stream could not be deleted, since that data stream would certainly contain documents.

Not tested, but maybe policyToTest could always be created, and the one from the state file used only to assign the policy back to the agent in the scenarios where the tests are performed in stages.

And for those test runs without these flags (common usage), a new data stream would be created for each run. Logic could be added to remove those data streams in one of the handlers.

In this PR these 3 policies are used as follows:

- one for enrolling without any package policy
  - there were issues when the stack had been running for more than X minutes: agents were not able to enroll using the default policy.
- one to assign the package policy to the agent
  - this one would be used to assign the policy back when running per stages after each --no-provision run
- one for the actual testing
  - this policy is the one that the Elastic Agent would be using to run the system tests.

I'll try to remove one agent policy. I'll need to test with system tests running per stages too.

jsoriano

LGTM. I think we can go further in the direction of using an isolated policy and data stream for testing, so we don't need to clean documents at any moment. But we can incrementally go in this direction.

jsoriano · 2024-06-06T10:33:39Z

internal/testrunner/runners/system/runner.go

@@ -920,7 +919,7 @@ func (r *runner) prepareScenario(ctx context.Context, config *testConfig, svcInf
 		policyEnroll := kibana.Policy{
 			Name:        fmt.Sprintf("ep-test-system-enroll-%s-%s-%s", r.options.TestFolder.Package, r.options.TestFolder.DataStream, testTime),
 			Description: fmt.Sprintf("test policy created by elastic-package to enroll agent for data stream %s/%s", r.options.TestFolder.Package, r.options.TestFolder.DataStream),
-			Namespace:   "ep",
+			Namespace:   "enrollep",


This could be also randomized, right?

Sure, I'll update it.
This policy has no package policy assigned so it will not create any data stream in any case, but better to have it randomized to avoid future issues.

jsoriano · 2024-06-06T10:35:42Z

internal/testrunner/runners/system/runner.go

 	r.resetAgentLogLevelHandler = func(ctx context.Context) error {
+		if r.options.RunTestsOnly {
+			return nil
+		}


Nit. In these cases, could we directly avoid adding the handler?

Suggested change

-	r.resetAgentLogLevelHandler = func(ctx context.Context) error {
-		if r.options.RunTestsOnly {
-			return nil
-		}
+	if r.options.RunTestsOnly {
+		r.resetAgentLogLevelHandler = func(ctx context.Context) error {

It could be done, but there are handlers that have more conditions to not run. For instance:

	r.removeAgentHandler = func(ctx context.Context) error {
		if r.runTestsOnly {
			return nil
		}
		// When not using independent agents, service deployers like kubernetes or custom agents create new Elastic Agent
		if !r.runIndependentElasticAgent && !svcInfo.Agent.Independent {
			return nil
		}

I think I'll keep all the handlers as they are to keep the same pattern in all of them.

mrodm · 2024-06-06T12:11:49Z

test integrations

mrodm · 2024-06-06T12:18:22Z

internal/testrunner/runners/system/runner.go

+		logger.Debug("creating test policy...")
+		policyToAssignDatastreamTests := kibana.Policy{
 			Name:        fmt.Sprintf("ep-test-system-%s-%s-%s", r.testFolder.Package, r.testFolder.DataStream, testTime),
 			Description: fmt.Sprintf("test policy created by elastic-package test system for data stream %s/%s", r.testFolder.Package, r.testFolder.DataStream),
-			Namespace:   "ep",
+			Namespace:   common.CreateTestRunID(),


Now, every execution will create a new Agent Policy for testing, except when running with --tear-down. This tear down stage does not run any tests, so it is not needed.
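To illustrate why a random namespace isolates each run, here is a minimal sketch. It relies on the Elastic data stream naming scheme `<type>-<dataset>-<namespace>`. The helper `newTestRunID` is a hypothetical stand-in for `common.CreateTestRunID` (its real implementation is not shown in this conversation), and the `logs` / `nginx.access` values are only example inputs.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newTestRunID is a hypothetical stand-in for common.CreateTestRunID.
// Assumption: it returns a short random numeric string usable as a namespace.
func newTestRunID() string {
	return fmt.Sprintf("%d", 10000+rand.Intn(90000))
}

// dataStreamName builds a name following the Elastic data stream
// naming scheme <type>-<dataset>-<namespace>.
func dataStreamName(dsType, dataset, namespace string) string {
	return fmt.Sprintf("%s-%s-%s", dsType, dataset, namespace)
}

func main() {
	// Fixed namespace: every run writes to the same data stream,
	// so documents from previous runs can leak into the next one.
	fmt.Println(dataStreamName("logs", "nginx.access", "ep"))

	// Random namespace per run: each run gets a data stream of its own,
	// which can simply be deleted during tear down.
	fmt.Println(dataStreamName("logs", "nginx.access", newTestRunID()))
}
```

With a per-run namespace there is no shared data stream to wipe between runs; cleaning up becomes deleting the policy and the data stream that belong to that run.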

elasticmachine · 2024-06-06T12:19:01Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#10005

mrodm · 2024-06-06T13:17:49Z

test integrations

elasticmachine · 2024-06-06T13:23:08Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#10092

jsoriano

Great change 👍

jsoriano · 2024-06-06T13:45:54Z

internal/testrunner/runners/system/runner.go

 	if r.resetAgentPolicyHandler != nil {
 		if err := r.resetAgentPolicyHandler(cleanupCtx); err != nil {
 			return err
 		}
 		r.resetAgentPolicyHandler = nil
 	}

+	// Shutting down the service should be run one of the first actions
+	// to ensure that resources created by terraform are deleted even if other
+	// errors fail.


Maybe we should run all handlers, and report all returned errors if any. (For a future change).

That's a good point.

There could be handlers that depend on others, but if so there will be another error. Or the error could be detected and the dependent handler not executed.

As you mentioned, for a future change.

jsoriano · 2024-06-06T13:52:22Z

internal/testrunner/runners/system/runner.go

+		// There are some issues when the stack is running for some time,
+		// agents cannot enroll with the default policy
+		// This enroll policy must be created even if independent Elastic Agents are not used. Agents created
+		// in Kubernetes or Custom Agents require this enroll policy too (service deployer).


Good to comment on this, I didn't remember the issues after running for some time 👍

jsoriano · 2024-06-06T13:56:27Z

internal/testrunner/runners/system/runner.go

+		// This allows us to ensure that the Agent Policy used for testing is
+		// assigned to the agent with all the required changes (e.g. Package DataStream)
+		logger.Debug("creating test policy...")
+		policyToAssignDatastreamTests := kibana.Policy{


Nit. I find this name confusing. Why do we have to mention that this policy is going to have data streams assigned? As this is only used in this reduced scope, could we just call it policy?

Suggested change

policyToAssignDatastreamTests := kibana.Policy{

policy := kibana.Policy{

I just wrote that name while developing the feature and I didn't re-check it 😅
I'll rename it to just policy 👍

jsoriano · 2024-06-06T14:02:33Z

internal/testrunner/runners/system/runner.go

-		return err
-	}
-	return nil
-}


👏 🔥

I think we are going to avoid some flakiness cases by removing this. And probably tests will be slightly faster.

elasticmachine · 2024-06-06T15:13:36Z

💚 Build Succeeded

Buildkite Build
Commit: 0837447

History

💚 Build #3359 succeeded f6ee2bd
💔 Build #3358 failed 7dd578f
💚 Build #3357 succeeded 7cfa6db
💔 Build #3356 failed f39c935
💚 Build #3349 succeeded b3b7a1d
💚 Build #3333 succeeded 8488344

cc @mrodm

mrodm self-assigned this May 24, 2024

mrodm added 8 commits May 28, 2024 09:00

Create policy for no-provision - missing to re-assign policy

da98733

Ensure test policy is re-assigned

dd9bf60

Delete agent policies created for each test

df6268a

delete datastream created for testing

ed1411e

Apply changes to all scenarios

1c55b1c

Skip tests with terraform

2bca8d8

Fix env. variable name

60c34da

Change order tear down handlers

402b631

mrodm force-pushed the create-policy-per-test branch from 7fd7350 to 402b631 Compare May 28, 2024 07:02

mrodm added 9 commits May 28, 2024 09:07

Exit loop if getDocs returns zero documents

e7ea9ff

Reorder handlers and change condition

8e95b4d

Add new tear down handler

a421e83

Update comment

8c9c5d8

Add one more test package

c11361f

Remove exceptions in wait loop to delete docs

7d71562

Update some logger calls to use the right formats

e4e8751

Restore packages skipped

f43dc77

Remove empty line

df1835c

mrodm commented May 28, 2024

mrodm added 3 commits May 28, 2024 21:16

Remove mustBeZero parameter from delete docs function

d9e8000

Add handler to clean test scenario - test only stage

7769e3c

Skip handlers that should not be executed with --no-provision

91676d7

mrodm commented May 29, 2024

mrodm marked this pull request as ready for review May 29, 2024 09:13

mrodm requested a review from a team May 29, 2024 09:13

mrodm added 5 commits June 5, 2024 14:08

Change condition in handler

46b3ff8

Merge remote-tracking branch 'upstream/main' into create-policy-per-test

2570fee

Use helper from common

2379f81

Add comment

8372d73

Reassign policy back to agent for both independent and stack agents

8488344

jsoriano approved these changes Jun 5, 2024

Update namespace in policyEnroll

b3b7a1d

jsoriano approved these changes Jun 6, 2024

mrodm added 4 commits June 6, 2024 12:54

Avoid creating a new policy if no stages are used

f39c935

Merge upstream/main onto create-policy-per-test

7cfa6db

Remove code to wait until docs are removed from datastream (and wipe …

3e94058

…handler)

Randomized namespace in enroll Agent Policy

7dd578f

mrodm commented Jun 6, 2024

mrodm requested a review from jsoriano June 6, 2024 12:18

Create enroll policy if not stages are used

f6ee2bd

elasticmachine mentioned this pull request Jun 6, 2024

Test elastic-package#1866 - DO NOT MERGE elastic/integrations#10092

Closed

jsoriano approved these changes Jun 6, 2024

Rename variable

0837447

jsoriano approved these changes Jun 6, 2024

mrodm enabled auto-merge (squash) June 6, 2024 14:59

mrodm merged commit 3cc9d99 into elastic:main Jun 6, 2024
3 checks passed

mrodm deleted the create-policy-per-test branch June 6, 2024 15:13

mrodm mentioned this pull request Jun 6, 2024

Move installation package in policy and system tests #1892

Merged
