-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check platform reboot cause to see if any reset happened during fast/warm-reboot #7920
Changes from 1 commit
0cef668
61192e8
286e115
b0bc476
fa40a2f
f4a07cc
c9c135c
3358e50
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,6 +60,17 @@ function getMountPoint() | |
echo $1 | python -c "import sys, json, os; mnts = [x for x in json.load(sys.stdin)[0]['Mounts'] if x['Destination'] == '/usr/share/sonic/hwsku']; print '' if len(mnts) == 0 else os.path.abspath(mnts[0]['Source'])" 2>/dev/null | ||
} | ||
|
||
function isPlatformReset() | ||
{ | ||
output=$(python3 -c "import sonic_platform.platform; p = sonic_platform.platform.Platform(); c = p.get_chassis(); hw_rc_major, hw_rc_minor = c.get_reboot_cause(); print(hw_rc_major)" 2>/dev/null) | ||
echo "${output}" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe this should echo more context, rather than just the reboot cause? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @sujinmkang while testing in 201811 image, This line(66) causing changes in getBootType() function. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Removing this line (echo "${output}") resolved the issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I commented about it yesterday but it said pending status. |
||
if [[ "${output}" != "Non-Hardware" ]]; then | ||
return 0 | ||
else | ||
return 1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe add a comment here to explain this logic? |
||
fi | ||
} | ||
|
||
function getBootType() | ||
{ | ||
# same code snippet in files/scripts/syncd.sh | ||
|
@@ -76,6 +87,9 @@ function getBootType() | |
*) | ||
TYPE='cold' | ||
esac | ||
if isPlatformReset; then | ||
TYPE='cold' | ||
fi | ||
echo "${TYPE}" | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sujinmkang : On checking the logs double time, found out one of the process(database.sh) is mentioning it as cold reset despite fast-boot happened. I suspect this is because either determine-reboot-cause or process-reboot-cause function is not up until that time?
Will there be any impact if database is still considering the BOOT_TYPE as cold?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@santhosh-kt we're not using determine-reboot-cause or process-reboot-cause for this determination but only using platform api. Probably does the database always consider the boot_type as cold even without this change? Since we're sending the same boot_type to all container starting script. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry! I misunderstood. Except database.sh script all other daemon scripts are called lately if I understood.
So all those scripts(except database.sh) could have easily interpreted the boot type since get_reboot_cause() is showing correct value as track_reboot_reason.sh script got executed before these scripts.
But for database.sh, this get_reboot_cause() is showing wrong value as it runs even before s6100_platform.sh itself.
My question is, will wrong BOOT_TYPE has any impact in database.sh script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@santhosh-kt can we detect the platform driver got initialized or not? Or should we check the exception from platform api call to see the platform driver initialization status? Any other suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@santhosh-kt I checked the database container start code, I don't think the boot_type is be used for database container starts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yxieca I don't think the database container is starting differently based on the boot-type. do you have any comments or concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Database container start code has a special handling during warm reboot: https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/docker_image_ctl.j2#L87