[202012][fast-reboot] Remove FLEX_COUNTER_TABLE from config_db.json b… #1774

Merged
merged 1 commit into from
Aug 25, 2021

vaibhavhd
Contributor

Fix sonic-net/sonic-buildimage#8523

What I did

Remove FLEX_COUNTER_TABLE from config_db.json before fast-reboot to allow delaying FLEX counter polling after fast-reboot.

Delaying FLEX counter polling is important to keep fastboot dataplane downtime under 30s.

How I did it

In the going-down path, add a step that modifies config_db.json to remove the FLEX_COUNTER_TABLE key and its value.
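
As a rough illustration of this step, the following Python sketch drops the top-level FLEX_COUNTER_TABLE key from a config_db.json-style file and writes the result to a `.new` sibling file. The PR itself performs the step with jq in the fast-reboot script; the helper name `remove_table` and the sample data are hypothetical and used only for illustration.

```python
import json
import os
import tempfile

TABLE = "FLEX_COUNTER_TABLE"


def remove_table(config_path, table=TABLE):
    """Write a copy of the config without `table` to '<config_path>.new'.

    Hypothetical helper; the original file is left untouched.
    """
    with open(config_path) as f:
        config = json.load(f)
    # pop() with a default is a no-op when the table is already absent.
    config.pop(table, None)
    new_path = config_path + ".new"
    with open(new_path, "w") as f:
        json.dump(config, f, indent=4)
    return new_path


if __name__ == "__main__":
    # Illustrative sample data, not a real device configuration.
    sample = {
        "FLEX_COUNTER_TABLE": {"PORT": {"FLEX_COUNTER_STATUS": "enable"}},
        "DEVICE_METADATA": {"localhost": {"hostname": "sonic"}},
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config_db.json")
        with open(path, "w") as f:
            json.dump(sample, f)
        with open(remove_table(path)) as f:
            print(sorted(json.load(f)))
```

Writing to a separate `.new` file keeps the original intact until the caller decides to replace it; how the replacement happens is outside this sketch.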

How to verify it

Repro'd the issue in the latest 202012 image.

With the fix, the counter polling is delayed and downtime is back to normal.

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

# Remove FLEX_COUNTER_TABLE from config_db.json
# This is done so that in fast-reboot recovery path, FLEX_COUNTER polling is delayed.
# Delayed FLEX_COUNTER polling is an attempt to keep dataplane downtime below the 30s threshold
jq --indent 4 'del(.FLEX_COUNTER_TABLE)' ${CONFIG_DB_FILE} > ${CONFIG_DB_FILE}.new
Contributor

How do we restore the config if the user has made some changes?

Contributor Author

@vaibhavhd vaibhavhd Aug 18, 2021

For branches (201811, 201911, 202012):
Any user-made changes to FLEX_COUNTER_TABLE will be reset. This is acceptable for now because, for upgrades, we delete the /host/old_config/config_db.json file anyway (removing all of the old config and letting the new image take the default config).

For newer branches (2021**):
The PRs below aim to preserve user configuration across reboots. The idea is to maintain a delay indicator that OA checks to decide whether to enable or disable flex counter polling.
sonic-net/sonic-buildimage#8500
sonic-net/sonic-swss-common#523
sonic-net/sonic-swss#1877
#1768

Adding you to the discussion thread about this topic.
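
The delay-indicator idea for the newer branches could look roughly like the sketch below. It is not taken from the linked PRs: the field name `DELAY_INDICATOR`, the `"true"` value, and the function `should_poll` are hypothetical, and the use of `FLEX_COUNTER_STATUS` as the user-facing setting is an assumption. The point it illustrates is that the user's setting stays in the config while polling is held off.

```python
# Hypothetical field name for the delay indicator; not taken from the PRs.
DELAY_FIELD = "DELAY_INDICATOR"


def should_poll(entry):
    """Return True if polling for this counter group should run now.

    `entry` is a dict of one counter group's settings (assumed layout).
    """
    # The user's choice is kept in the config rather than deleted.
    user_enabled = entry.get("FLEX_COUNTER_STATUS") == "enable"
    # While the delay indicator is set, polling is held off regardless.
    delayed = entry.get(DELAY_FIELD) == "true"
    return user_enabled and not delayed


if __name__ == "__main__":
    during_boot = {"FLEX_COUNTER_STATUS": "enable", DELAY_FIELD: "true"}
    after_boot = {"FLEX_COUNTER_STATUS": "enable", DELAY_FIELD: "false"}
    print(should_poll(during_boot), should_poll(after_boot))
```

Unlike deleting the table, this approach does not lose the user's setting: clearing the indicator is enough to resume polling.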

@vaibhavhd
Contributor Author

/Azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vaibhavhd vaibhavhd merged commit a72e407 into sonic-net:202012 Aug 25, 2021
@vaibhavhd vaibhavhd deleted the fastboot-handle-flexcounter branch August 25, 2021 15:59
vdahiya12 added a commit to vdahiya12/sonic-utilities that referenced this pull request Aug 26, 2021
vaibhavhd added a commit to sonic-net/sonic-buildimage that referenced this pull request Aug 26, 2021
Update sonic-utilities submodule to latest in 202012 branch:

[show priority-group drop counters] Add user info output when user want to check PG counters and polling are disabled sonic-net/sonic-utilities#1678
[route_check] Filter out VNET routes sonic-net/sonic-utilities#1612
[Show] Update the subcommands of Kdump. sonic-net/sonic-utilities#1682
Add mock support for swsscommon classes sonic-net/sonic-utilities#1780
[acl_loader]: add iptype match to the rules for dataplane acl sonic-net/sonic-utilities@205aff8
[202012][fast-reboot] Remove FLEX_COUNTER_TABLE from config_db.json before fast-reboot sonic-net/sonic-utilities#1774
vaibhavhd added a commit that referenced this pull request Sep 9, 2021
…boot (#1804)

To reduce fastboot dataplane downtime, delay flex counters by removing the flex_counter table from config_db before reboot.
This is porting changes from 202012 branch to 201911 branch.
Related PR for 202012 branch: #1774
4 participants