-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Agent policies per each test execution #1866
Conversation
/test |
💚 Build Succeeded
History
cc @mrodm |
7fd7350
to
402b631
Compare
if !r.options.RunIndependentElasticAgent { | ||
// keep same behaviour as previously when Elastic Agent of the stack is used. | ||
return false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought to keep the same behaviour when using Elastic Agent of the stack. WDYT ?
tdErr := r.tearDownTest(ctx) | ||
if tdErr != nil { | ||
logger.Errorf("failed to tear down runner: %s", tdErr.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required to delete the policy created just for testing and also the related Data Streams.
test integrations |
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#10005 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the concept, but I wonder if we need as many policies as we have. We are creating three kinds of policies now.
policyToAssignDatastreamTests := policyToTest | ||
if r.shouldCreateNewAgentPolicyForTest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the policyToTest
? Couldn't we always use the policy created here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I just update the policyToTest
to be like this one, that means that the namespace is random, it would work just while the system tests are not run per stages (--setup, --no-provision, --tear-down).
When these flags are used it would be used the same data stream created in the first run, retrieved from the service state:
policyToTest = &serviceStateData.CurrentPolicy |
That would mean that the method to delete old documents from the data stream could not be delete since that data stream would contain documents for sure.
Not tested, maybe it could be created always the policyToTest
and just used the one from the state file to assign back the policy to the agent in the scenarios where the tests are performed in stages.
And for those test runs without these flags (common usage), there would be data streams created for each run:
It could be added the logic to remove those data streams in one of the handlers.
In this PR these 3 policies are used as following:
- one for enrolling without any package policy
- there were having issues when the stack is running for more than X minutes that agents were not able to enroll using the default policy.
- one to assign the package policy to the agent
- this one would be used to assign back when running per stages after each --no-provision run
- one for the actual testing
- this policy is the one that would be using the Elastic Agent to run the system tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give it a try to remove one agent policy. I'll need to test with system test running per stages too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think we can go further in the direction of using an isolated policy and data stream for testing, so we don't need to clean documents at any moment. But we can incrementally go in this direction.
@@ -920,7 +919,7 @@ func (r *runner) prepareScenario(ctx context.Context, config *testConfig, svcInf | |||
policyEnroll := kibana.Policy{ | |||
Name: fmt.Sprintf("ep-test-system-enroll-%s-%s-%s", r.options.TestFolder.Package, r.options.TestFolder.DataStream, testTime), | |||
Description: fmt.Sprintf("test policy created by elastic-package to enroll agent for data stream %s/%s", r.options.TestFolder.Package, r.options.TestFolder.DataStream), | |||
Namespace: "ep", | |||
Namespace: "enrollep", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be also randomized, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll update it.
This policy has no package policy assigned so it will not create any data stream in any case, but better to have it randomized to avoid future issues.
r.resetAgentLogLevelHandler = func(ctx context.Context) error { | ||
if r.options.RunTestsOnly { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. In these cases, could we directly avoid adding the handler?
r.resetAgentLogLevelHandler = func(ctx context.Context) error { | |
if r.options.RunTestsOnly { | |
return nil | |
} | |
if r.options.RunTestsOnly { | |
r.resetAgentLogLevelHandler = func(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be done, but there are handlers that have more conditions to not run. For instance
r.removeAgentHandler = func(ctx context.Context) error {
if r.runTestsOnly {
return nil
}
// When not using independent agents, service deployers like kubernetes or custom agents create new Elastic Agent
if !r.runIndependentElasticAgent && !svcInfo.Agent.Independent {
return nil
}
I think I'll keep all the handlers as they are to keep the same pattern in all of them.
test integrations |
logger.Debug("creating test policy...") | ||
policyToAssignDatastreamTests := kibana.Policy{ | ||
Name: fmt.Sprintf("ep-test-system-%s-%s-%s", r.testFolder.Package, r.testFolder.DataStream, testTime), | ||
Description: fmt.Sprintf("test policy created by elastic-package test system for data stream %s/%s", r.testFolder.Package, r.testFolder.DataStream), | ||
Namespace: "ep", | ||
Namespace: common.CreateTestRunID(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, every execution will create a new Agent Policy for testing, except when running with --tear-down
. This tear down stage does not run any tests, so it is not needed.
Created or updated PR in integrations repository to test this version. Check elastic/integrations#10005 |
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#10092 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great change 👍
if r.resetAgentPolicyHandler != nil { | ||
if err := r.resetAgentPolicyHandler(cleanupCtx); err != nil { | ||
return err | ||
} | ||
r.resetAgentPolicyHandler = nil | ||
} | ||
|
||
// Shutting down the service should be run one of the first actions | ||
// to ensure that resources created by terraform are deleted even if other | ||
// errors fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should run all handlers, and report all returned errors if any. (For a future change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point.
There could be handlers that depend on others, but if so there will be another error. Or it could be detected the error and no execute the handler.
As you mentioned, for a future change.
// There are some issues when the stack is running for some time, | ||
// agents cannot enroll with the default policy | ||
// This enroll policy must be created even if independent Elastic Agents are not used. Agents created | ||
// in Kubernetes or Custom Agents require this enroll policy too (service deployer). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to comment on this, I didn't remember the issues after running for some time 👍
// This allows us to ensure that the Agent Policy used for testing is | ||
// assigned to the agent with all the required changes (e.g. Package DataStream) | ||
logger.Debug("creating test policy...") | ||
policyToAssignDatastreamTests := kibana.Policy{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. I find this name confusing. Why do we have to mention that this policy is going to have data streams assigned? As this is only used in this reduced scope, could we just call it policy
?
policyToAssignDatastreamTests := kibana.Policy{ | |
policy := kibana.Policy{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wrote that name while developing the feature and I didn't re-check it 😅
I'll rename it with just policy
👍
return err | ||
} | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 🔥
I think we are going to avoid some flakiness cases by removing this. And probably tests will be slightly faster.
💚 Build Succeeded
History
cc @mrodm |
Relates #787
This PR adds a new Agent Policy that will be created and configured per each system test execution.
This new Agent Policy is going to help us:
After each test run defined that new Agent Policy is deleted as well as the Data Stream created for it.