Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvising pre_checks for test_reload config #7953

Merged
merged 6 commits into from
Jun 26, 2023

Conversation

ansrajpu-git
Copy link
Contributor

@ansrajpu-git ansrajpu-git commented Apr 4, 2023

Description of PR

This PR is an add-on to the PR #7289.To resolve intermittent failures in test_reload_configuration_checks,

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

test_config_reload is failing intermittently at:

  • Line 97. Even though wait parameter to reboot is being passed as 5, but if plt_reboot_dict is defined in the inventory, this wait time would be overwritten to value in plt_reboot_dict - Line 220

  • Line 108. After successful config reload, we are not checking for database service has started before executing config reload again. If database service has not started, we see the following exception:

"Traceback (most recent call last):File \"/usr/local/bin/config\", line 8, in <module>
    sys.exit(config())\n  File \"/usr/local/lib/python3.9/dist-packages/click/core.py\", line 764, in __call__
   return self.main(*args, **kwargs)\n  File \"/usr/local/lib/python3.9/dist-packages/click/core.py\", line 717, in main\n    rv = self.invoke(ctx)  File \"/usr/local/lib/python3.9/dist-packages/click/core.py\", line 1134, in invoke   Command.invoke(self, ctx) File \"/usr/local/lib/python3.9/dist-packages/click/core.py\", line 956, in invoke    return ctx.invoke(self.callback, **ctx.params)  File \"/usr/local/lib/python3.9/dist-packages/click/core.py\", line 555, in invoke    return callback(*args, **kwargs)  File \"/usr/local/lib/python3.9/dist-packages/click/decorators.py\", line 17, in new_func    return f(get_current_context(), *args, **kwargs)  File \"/usr/local/lib/python3.9/dist-packages/config/main.py\", line 1104, in config    ctx.obj = Db()  File \"/usr/local/lib/python3.9/dist-packages/utilities_common/db.py\", line 12, in __init__    self.cfgdb.connect()  File \"/usr/lib/python3/dist-packages/swsscommon/swsscommon.py\", line 1862, in connect    return _swsscommon.ConfigDBConnector_Native_connect(self, wait_for_init, retry_on)\nRuntimeError: Unable to connect to redis: Cannot assign requested address"

How did you do it?

-the wait time should only be changed to value defined in plt_reboot_dict only when its value is 0. Else should pick the value sent in the function call.
-Check if interfaces-config.service is exited successfully & database containers has started before doing config reload for the second time

How did you verify/test it?

Ran the test against T2 chassis multiple times

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Copy link
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The detected issues may be old or new. For new issues, please try to fix them.

For old issues, it is not mandatory to fix them because they were not caused by this change. It is unfair to blame
author of this pull request. But if you can take extra effort to fix the old issues as well, that would be great!

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/common/reboot.py:381:27: E127 continuation line over-indented for visual indent
tests/common/reboot.py:390:30: E127 continuation line over-indented for visual indent
tests/common/reboot.py:393:26: E127 continuation line over-indented for visual indent
tests/platform_tests/test_reload_config.py:12:1: F401 'tests.common.fixtures.conn_graph_facts.conn_graph_facts' imported but unused
tests/platform_tests/test_reload_config.py:25:1: E302 expected 2 blank lines, found 1
tests/platform_tests/test_reload_config.py:25:64: F811 redefinition of unused 'conn_graph_facts' from line 12
tests/platform_tests/test_reload_config.py:50:33: E231 missing whitespace after ':'
tests/platform_tests/test_reload_config.py:78:1: E302 expected 2 blank lines, found 1
tests/platform_tests/test_reload_config.py:78:82: F811 redefinition of unused 'conn_graph_facts' from line 12

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  1. Go to repository root folder
  2. Install the pre-commit hooks:
pre-commit install
  1. Use pre-commit to check staged file:
pre-commit
  1. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@ansrajpu-git ansrajpu-git changed the title Asrajput Improvising system pre_checks for test_reload config Apr 4, 2023
@ansrajpu-git ansrajpu-git changed the title Improvising system pre_checks for test_reload config Improvising pre_checks for test_reload config Apr 4, 2023
@mssonicbld
Copy link
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/platform_tests/test_reload_config.py:118:1: W191 indentation contains tabs
tests/platform_tests/test_reload_config.py:118:1: E101 indentation contains mixed spaces and tabs
tests/platform_tests/test_reload_config.py:118:5: E128 continuation line under-indented for visual indent

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  1. Go to repository root folder
  2. Install the pre-commit hooks:
pre-commit install
  1. Use pre-commit to check staged file:
pre-commit
  1. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@rlhui rlhui requested a review from wenyiz2021 June 23, 2023 08:34
if plt_reboot_ctrl:
wait = plt_reboot_ctrl['wait']
timeout = plt_reboot_ctrl['timeout']
if plt_reboot_ctrl:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change indention here?

Copy link
Contributor

@wenyiz2021 wenyiz2021 Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a feasible fix, it'll only overwrite timeout if wait == 0, however, the intention for having plt_reboot_ctrl is to give privilege for some TBs(chassis) to predefine a new timeout/wait that has more tolerance.

I just checked the wait is only used

time.sleep(wait)
, which is time for dut to be stabilize, and 5 sec is not enough for chassis. why do we want to keep it as 5?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

**the wait has kept for small duration intentionally to verifythe test -for reload configuration checks. The test is simulated to perform config reload before the dut is stablize and expect a check to the proper message.
***The plt_reboot_ctrl should only apply, if the (wait) time is not explicitly provided for a known objective

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @ansrajpu-git, please check my latest change, I have added a flag to control this, instead of overwriting both timeout and wait only if wait is 0.
please see if it works for you too?

@wenyiz2021
Copy link
Contributor

wenyiz2021 commented Jun 23, 2023

hi @ansrajpu-git, thanks for your fix.
I have added some additional change:

  1. added a flag plt_reboot_ctrl_overwrite in reboot function to decide whether we want to overwrite wait and timeout. by default is to overwrite, in this test test_reload_configuration_checks we don't want to wait for that long, we want to catch transient error, so the flag is False.
  2. changed to run on every sku with enum_rand_one_per_hwsku_hostname

I have confirmed this worked on Arista chassis, please also confirm if it works for your case

@mssonicbld
Copy link
Collaborator

@ansrajpu-git PR conflicts with 202205 branch

@rlhui
Copy link

rlhui commented Jul 7, 2023

@ansrajpu-git please raise an explicit PR for 202205, thanks

yejianquan pushed a commit that referenced this pull request Jul 7, 2023
…#8843)

Description of PR
This is cherry-pick of #7953

* Update multi_asic.py

* Update sonic.py

* added flag plt_reboot_ctrl_overwrite in reboot function to decide whether we want to overwrite wait and timeout.

added a flag plt_reboot_ctrl_overwrite in reboot function to decide whether we want to overwrite wait and timeout. by default is to overwrite, in this test test_reload_configuration_checks we don't want to wait for that long, we want to catch transient error, so the flag is False.

* changed to run on every sku with enum_rand_one_per_hwsku_hostname

co-authorized by: jianquanye@microsoft.com
wangxin pushed a commit that referenced this pull request Jul 19, 2023
This PR has a few fixes missed during cherry-pick of #7953 into 202205 earlier via PR 8843

Co-authored-by: mlok <marty.lok@nokia.com>
mrkcmo pushed a commit to Azarack/sonic-mgmt that referenced this pull request Oct 3, 2023
* Fix for test_config_reload.py_Retry later error within a short interval after reboot, by adding a flag, by default we overwrite with plt_reboot_ctrl in inventory, this test does not overwrite, keep small interval
* Adding checks for system up before config_reload is initiated
* change to run on every sku

---------

Co-authored-by: wenyiz2021 <91497961+wenyiz2021@users.noreply.github.com>
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Jan 25, 2024
* Fix for test_config_reload.py_Retry later error within a short interval after reboot, by adding a flag, by default we overwrite with plt_reboot_ctrl in inventory, this test does not overwrite, keep small interval
* Adding checks for system up before config_reload is initiated
* change to run on every sku

---------

Co-authored-by: wenyiz2021 <91497961+wenyiz2021@users.noreply.github.com>
@JibinBao JibinBao mentioned this pull request Feb 8, 2024
8 tasks
JibinBao added a commit to JibinBao/sonic-mgmt that referenced this pull request Feb 8, 2024
1. Revert PR: sonic-net#7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: sonic-net#7953 due to the above same reason

Change-Id: Ifde0a83020bb06fb191a646e5ca91738b864ad7f
JibinBao added a commit to JibinBao/sonic-mgmt that referenced this pull request Feb 8, 2024
1. Revert PR: sonic-net#7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: sonic-net#7953 due to the above same reason

Change-Id: I022a948cb2e36ba7d5538258159ea047d13d80a7
@JibinBao JibinBao mentioned this pull request Feb 8, 2024
8 tasks
wangxin pushed a commit that referenced this pull request Mar 4, 2024
1. Revert PR: #7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: #7953 due to the above same reason
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Mar 4, 2024
1. Revert PR: sonic-net#7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: sonic-net#7953 due to the above same reason
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Mar 4, 2024
1. Revert PR: sonic-net#7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: sonic-net#7953 due to the above same reason
wangxin pushed a commit that referenced this pull request Mar 4, 2024
Cherry-pick #11674 to 202205 branch
1. Revert PR: #7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: #7953 due to the above same reason
mssonicbld pushed a commit that referenced this pull request Mar 4, 2024
1. Revert PR: #7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: #7953 due to the above same reason
mssonicbld pushed a commit that referenced this pull request Mar 4, 2024
1. Revert PR: #7289, because the relevant issue has been fixed by PR:sonic-net/sonic-buildimage#15080
2. Remove  wait_until(60, 1, 0, check_interfaces_config_service_status, duthost) in PR: #7953 due to the above same reason
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants