Agent status scenario #2875

Merged
merged 54 commits into from
Aug 21, 2023
Changes from 25 commits
da72c37
Update version to dummy 1.0.0.0'
maddieford Nov 8, 2022
59dbd22
Revert version change
maddieford Nov 8, 2022
633a826
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Nov 21, 2022
14a743f
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Dec 8, 2022
54ea0f3
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Jan 10, 2023
e79c4c5
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Feb 8, 2023
498b612
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Feb 14, 2023
1e269f4
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Mar 13, 2023
7b49e76
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Mar 24, 2023
0a426cc
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 6, 2023
17fbf6a
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 7, 2023
995cbb9
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 13, 2023
eaadc83
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 24, 2023
fb03e07
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 27, 2023
6a8e0d6
Merge remote-tracking branch 'upstream/develop' into develop
maddieford May 19, 2023
b4951c8
Merge branch 'develop' of github.com:Azure/WALinuxAgent into develop
maddieford Jun 6, 2023
c6d9300
Merge branch 'develop' of github.com:maddieford/WALinuxAgent into dev…
maddieford Jun 23, 2023
f650fe4
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Jul 10, 2023
a10bdfa
Merge branch 'develop' of github.com:maddieford/WALinuxAgent into dev…
maddieford Jul 10, 2023
d6624d0
Create files for agent status scenario
maddieford Jul 11, 2023
af0ec80
Add agent status test logic
maddieford Jul 11, 2023
f1684ad
fix pylint error
maddieford Jul 11, 2023
684d705
Add comment for retry
maddieford Jul 11, 2023
4985805
Merge branch 'develop' into agent_status_scenario
maddieford Jul 11, 2023
2505f69
Merge branch 'develop' into agent_status_scenario
maddieford Jul 11, 2023
84f207e
Mark failures as exceptions
maddieford Jul 20, 2023
08778ee
Merge branch 'develop' into agent_status_scenario
maddieford Jul 20, 2023
75b37a8
Improve messages in logs
maddieford Jul 25, 2023
78e9026
Improve comments
maddieford Jul 25, 2023
85c2878
Update comments
maddieford Jul 25, 2023
324fc3c
Check that agent status updates without processing additional goal st…
maddieford Aug 6, 2023
37f548f
Remove unused agent status exception
maddieford Aug 6, 2023
808f3ae
Update comment
maddieford Aug 6, 2023
64b2bea
Clean up comments, logs, and imports
maddieford Aug 6, 2023
5adcb61
Exception should inherit from baseexception
maddieford Aug 6, 2023
fa977e9
Import datetime
maddieford Aug 6, 2023
44d819a
Import datetime
maddieford Aug 6, 2023
a6255f8
Import timedelta
maddieford Aug 6, 2023
fac1779
instance view time is already formatted
maddieford Aug 6, 2023
919b9c7
Increse status update time
maddieford Aug 6, 2023
35fd0c3
Increse status update time
maddieford Aug 6, 2023
72ac70a
Increse status update time
maddieford Aug 6, 2023
06ca0ae
Increase timeout
maddieford Aug 6, 2023
a806360
Update comments and timeoutS
maddieford Aug 6, 2023
c96edba
Allow retry if agent status timestamp isn't updated after 30s
maddieford Aug 6, 2023
baa9798
Remove unused import
maddieford Aug 7, 2023
63c73cc
Merge branch 'develop' into agent_status_scenario
maddieford Aug 11, 2023
ae708e5
Update time value in comment
maddieford Aug 11, 2023
7da72e1
address PR comments
maddieford Aug 11, 2023
9c8a733
Check if properties are None
maddieford Aug 11, 2023
fc673e7
Make types & errors more readable
maddieford Aug 12, 2023
e2b3b3c
Re-use vm_agent variable
maddieford Aug 12, 2023
2e1f887
Add comment for dot operator
maddieford Aug 14, 2023
fc394af
Merge branch 'develop' into agent_status_scenario
maddieford Aug 21, 2023
2 changes: 1 addition & 1 deletion tests_e2e/orchestrator/runbook.yml
@@ -51,7 +51,7 @@ variable:
   #
   # The test suites to execute
   - name: test_suites
-    value: "agent_bvt, no_outbound_connections, extensions_disabled, agent_not_provisioned, fips, agent_ext_workflow"
+    value: "agent_bvt, no_outbound_connections, extensions_disabled, agent_not_provisioned, fips, agent_ext_workflow, agent_status"
   - name: cloud
     value: "AzureCloud"
     is_case_visible: true
9 changes: 9 additions & 0 deletions tests_e2e/test_suites/agent_status.yml
@@ -0,0 +1,9 @@
#
# This scenario validates the agent status is updated without any other goal state changes
Member commented:
"any", rather than "any other"

#
name: "AgentStatus"
tests:
- "agent_status/agent_status.py"
images:
- "endorsed"
- "endorsed-arm64"
143 changes: 143 additions & 0 deletions tests_e2e/tests/agent_status/agent_status.py
@@ -0,0 +1,143 @@
#!/usr/bin/env python3
import datetime
Member commented:
let's move those imports below, with the rest of them

from time import sleep

from assertpy import assert_that
# Microsoft Azure Linux Agent
#
# Copyright 2018 Microsoft Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#
# Validates the agent status is updated without any other goal state changes
#

from azure.mgmt.compute.models import VirtualMachineInstanceView

from tests_e2e.tests.lib.agent_test import AgentTest
from tests_e2e.tests.lib.agent_test_context import AgentTestContext
from tests_e2e.tests.lib.logging import log
from tests_e2e.tests.lib.ssh_client import SshClient
from tests_e2e.tests.lib.virtual_machine_client import VirtualMachineClient


def validate_instance_view_vmagent_status(instance_view: VirtualMachineInstanceView):
Member commented:
why use global functions? shouldn't these be methods of the test class?

Contributor (Author) replied:
PyCharm suggested making them functions since they were static

Contributor (Author) replied:
I think they should be methods of the test class though; I'll move them into the test class.

    is_valid = True
Member commented:
All those checks should probably raise exceptions on failure. That would simplify the code, since there would be no need for the valid flag, and would also allow you to include the actual error in the main assert of the test (see comment below).

Contributor (Author) replied:
updated
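
A minimal sketch of the exception-based approach suggested in this thread, not the code that was merged. `InstanceViewError`, `AgentStatusEntry`, and `validate_agent_status` are hypothetical names, and the dataclass is a stand-in for the SDK status object so that the sketch runs without Azure dependencies. Raising on the first failed check also avoids a hazard of the flag style in this diff, where a later assignment to `is_valid` can overwrite an earlier failure.

```python
from dataclasses import dataclass
from typing import Optional


class InstanceViewError(Exception):
    """Hypothetical error type: raised when an instance view field fails validation."""


@dataclass
class AgentStatusEntry:
    """Stand-in for the SDK status object; holds only the two fields checked here."""
    message: Optional[str]
    display_status: Optional[str]


def validate_agent_status(status: AgentStatusEntry) -> None:
    # Each failed check raises immediately, so no is_valid flag is needed
    # and the caller receives the specific reason for the failure.
    if status.message is None:
        raise InstanceViewError("agent status message is missing")
    if 'unresponsive' in status.message:
        raise InstanceViewError("agent is unresponsive: {0}".format(status.message))
    if status.display_status is None:
        raise InstanceViewError("agent display status is missing")
    if 'Not Ready' in status.display_status:
        raise InstanceViewError("agent status is not ready: {0}".format(status.display_status))
```

A caller can catch `InstanceViewError` inside the retry loop, log it, and keep the last one for the final assertion message.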

    status = instance_view.vm_agent.statuses[0]

    # Validate message field
    message = status.message
    if message is None:
        is_valid = False
        log.info("Instance view is missing an agent status message, waiting to retry...")
    elif 'unresponsive' in message:
        is_valid = False
        log.info("Instance view shows unresponsive agent, waiting to retry...")

    # Validate display status field
    display_status = status.display_status
    if display_status is None:
        is_valid = False
        log.info("Instance view is missing an agent display status, waiting to retry...")
    elif 'Not Ready' in display_status:
        is_valid = False
        log.info("Instance view shows agent status is not ready, waiting to retry...")

    return is_valid


def validate_instance_view_vmagent(instance_view: VirtualMachineInstanceView):
    is_valid = True

    # Validate vm_agent_version field
    vm_agent_version = instance_view.vm_agent.vm_agent_version
    if vm_agent_version is None:
        is_valid = False
        log.info("Instance view is missing agent version, waiting to retry...")
    elif 'Unknown' in vm_agent_version:
        is_valid = False
        log.info("Instance view shows agent version is unknown, waiting to retry...")

    # Validate statuses field
    statuses = instance_view.vm_agent.statuses
    if statuses is None:
        is_valid = False
        log.info("Instance view is missing agent statuses, waiting to retry...")
    elif len(statuses) < 1:
        is_valid = False
        log.info("Instance view is missing an agent status entry, waiting to retry...")
    else:
        is_valid = validate_instance_view_vmagent_status(instance_view=instance_view)

    return is_valid


def validate_instance_view(instance_view: VirtualMachineInstanceView):
    instance_view_is_valid = True

    if instance_view.vm_agent is None:
        instance_view_is_valid = False
        log.info("Instance view is missing vm agent, waiting to retry...")
    else:
        instance_view_is_valid = validate_instance_view_vmagent(instance_view=instance_view)

    if instance_view.statuses is None:
        instance_view_is_valid = False
        log.info("Instance view is missing statuses, waiting to retry...")

    if instance_view_is_valid:
        log.info("Instance view is valid, agent version: {0}, status: {1}"
                 .format(instance_view.vm_agent.vm_agent_version,
                         instance_view.vm_agent.statuses[0].display_status))

    return instance_view_is_valid


class AgentStatus(AgentTest):
    def __init__(self, context: AgentTestContext):
        super().__init__(context)
        self._ssh_client = SshClient(
Contributor commented:
AgentTestContext already has a method for this; reusing self._context.create_ssh_client() would simplify things.

            ip_address=self._context.vm_ip_address,
            username=self._context.username,
            private_key_file=self._context.private_key_file)

    def run(self):
        log.info("")
        log.info("*******Verifying the agent status*******")

        vm = VirtualMachineClient(self._context.vm)

        timeout = datetime.datetime.now() + datetime.timedelta(minutes=5)
        instance_view_is_valid = False

        # Retry validating instance view with timeout of 5 minutes
        while datetime.datetime.now() < timeout and not instance_view_is_valid:
            instance_view = vm.get_instance_view()
            log.info("")
            log.info("Validating VM Instance View...")
            log.info("Instance view of VM is:\n%s", instance_view.serialize())
Member commented:
can you pretty-print the serialization?
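
One way to do this, sketched under the assumption that `instance_view.serialize()` returns a JSON-compatible dict: pass the result through `json.dumps` with an indent. The `pretty` helper and the sample dict below are illustrative stand-ins, not real instance view data.

```python
import json


def pretty(serialized: dict) -> str:
    # indent gives one key per line; sort_keys keeps the output stable between runs
    return json.dumps(serialized, indent=4, sort_keys=True)


# Stand-in for the dict returned by instance_view.serialize(); keys and values are illustrative only.
serialized_view = {
    "vmAgent": {
        "vmAgentVersion": "0.0.0.0",
        "statuses": [{"displayStatus": "Ready"}],
    }
}

print(pretty(serialized_view))
```

In the test, the log call would then take `pretty(instance_view.serialize())` as its argument instead of the raw dict.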


            instance_view_is_valid = validate_instance_view(instance_view)
            if not instance_view_is_valid:
                log.info("")
                log.info("Instance view is not valid, waiting 10s before retry...")
                sleep(10)
Contributor commented:
Sleeping 10 seconds with a 5-minute timeout retries too aggressively. I don't think such frequent retries are needed here, since we decided to wait for 5 minutes.
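
A sketch of a polling helper that makes the retry interval an explicit parameter, so it can be tuned separately from the overall timeout. `poll_until` is a hypothetical name and the 30-second delay in the usage note is an arbitrary example, not a value taken from this review. The clock and sleep functions are injectable so the helper can be exercised without real waiting.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool], timeout_s: float, delay_s: float,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() until it returns True or timeout_s elapses, waiting delay_s between attempts."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        # Stop if the next attempt would start after the deadline
        if clock() + delay_s > deadline:
            return False
        sleep(delay_s)
```

Ignoring the time each check takes, `poll_until(check, timeout_s=300, delay_s=30)` calls `check` at most 11 times, compared with 31 times at a 10-second delay.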


        log.info("")
        assert_that(instance_view_is_valid).described_as("Timeout has expired, instance view is not valid.").is_true()
Member commented:
the error message should include why the view is not valid
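
A sketch of how the failure reason could reach the final error message, assuming the validation step raises on failure rather than returning a flag. `validate_with_retries` is a hypothetical helper, `ValueError` stands in for a test-specific exception type, and sleeping between attempts is left out to keep the focus on the message.

```python
from typing import Callable, Optional


def validate_with_retries(validate: Callable[[], None], attempts: int) -> None:
    """Run validate() up to `attempts` times; if every attempt fails, report the last reason."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            validate()
            return
        except ValueError as e:  # stand-in for a test-specific exception type
            last_error = e
    raise AssertionError(
        "Timeout has expired, instance view is not valid: {0}".format(last_error))
```

The same idea applies with `assert_that(...).described_as(...)`: build the description from the last recorded error instead of a fixed string.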



if __name__ == "__main__":
    AgentStatus.run_from_command_line()