-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Performance monitoring update to CMIS_and_C-CMIS_support_for_ZR.md #1258
base: master
Are you sure you want to change the base?
Conversation
@keboliu @mihirpat1 @andywongarista please review |
@qinchuanares Please review |
@jaganbal-a you need to sign the EASY CLA |
The code PRs need be added to this HLD PR |
community review recording https://zoom.us/rec/share/p-EqFKsSHinsWM0mPs0sPlFURq7mCmDVw1URjKLbpcwv3Ib0FVei-9LpTvFubNo.M3RwumppZ9d6y0Uc |
Review comments from the community review.
|
Addressing the OCP review comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a can you please help to add the code PRs? Thanks.
@zhangyanzhao , I am working on the code changes and I will post once I am complete with unit testing. But this PR is for the high level design. so please approve if you do not have any further question on the HLD. |
7.Performance Monitoring - 400-G ZR module
@lguohan , Pls review the document. |
@jaganbal-a code PRs need be included in the HLD before HLD can be merged, can you please help to add the code PRs? Thanks. |
Improvement
Update to the flow diagram
Refer to the HLD for problem description and design. HLD: sonic-net/SONiC#1258
|
||
#show int trans pm history 15min window | ||
commands: | ||
1 - 11 PM window number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a what would the output be for window number history for which sample is yet not collected.? Say after boot, 15 minutes has passed and user wants to see window# 11 for 15min window
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
root@sonic:/home/cisco# show int trans pm history 15min window 11 -n asic2 Ethernet232
Tue Oct 24 17:33:41 UTC 2023
PM window: 15min
Ethernet232: Transceiver performance monitoring data not available for the requested window
|
||
#show int trans pm history 60sec window | ||
commands: | ||
1 - 14 PM window number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a can you show the calculation for the window range. Why 1-14? Why not 1-20?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For 60sec window size, by design 15 windows are allocated.
In that, 14 windows are used for history -numbering from 1 to 14 and 1 window is used for viewing current statistics.
For 15min window size, 12 windows are allocated.
In that, 11 windows are used for history and 1 window is used for current.
For 24Hrs window size, 2 windows are allocated.
In that, 1 window used for history and another window used for current.
|
||
##### 7.5.1 Configurations | ||
1. performance monitor enable - Global CLI to enable PM on all ports. | ||
Before this configuration get implemented, PM will be enabled by default on all ports |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a is this future implementation? i.e per port configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prgeor , Yes, this will future implementation. Global CLI to enable/disable PM on all ports.
Raised a github issue :- sonic-net/sonic-platform-daemons#402
- 60sec and | ||
- 120sec | ||
|
||
It is recommended to choose 30sec, platforms that have high CPU load can choose 120sec as PM interval. By default 60sec is the PM interval when no input provided by platform, please refer '7.5.4' for platform input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a can you provide a hyper link to section 7.5.4 https://stackoverflow.com/questions/2822089/how-to-link-to-part-of-the-same-document-in-markdown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to hyperlink
The 60sec time window are filled as it is read from module if the PM interval is 60sec, the 60sec sample collected from module is then sampled for 15mins and 24Hr window every 60seconds. | ||
|
||
|
||
<img src ="https://user-images.githubusercontent.com/97986478/236910604-bbb52e77-5c62-4a7e-b64d-ff883d9b57d3.png" width=60% height=50%> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a why is the avg rx power remain constant at -10.61
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected the avg in the example and added pm window end time.
sampled_pm_dict['prefec_ber_min'] = min(float(pm_data_dict1['prefec_ber_min']), float(pm_data_dict2['prefec_ber_min'])) | ||
sampled_pm_dict['prefec_ber_max'] = max(float(pm_data_dict1['prefec_ber_max']), float(pm_data_dict2['prefec_ber_max'])) | ||
|
||
sampled_pm_dict['uncorr_frames_avg'] = self.average_of_two_val(float(pm_data_dict1['uncorr_frames_avg']), float(pm_data_dict2['uncorr_frames_avg'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a one example for min, max, avg is sufficient. We dont have to show for all PM parameters
Each window field(Key) in TRANSCEIVER_PM_WINDOW_STATS have a value string comprises of following fields. | ||
|
||
pm_win_start_time = 1*255VCHAR ; PM statistics start time for the window. | ||
pm_win_end_time = 1*255VCHAR ; PM statistics end time for the window. | ||
pm_win_current = 1*255VCHAR ; PM statistics collection is Progressing on this window. (True/False) | ||
prefec_ber_avg = FLOAT ; prefec ber avg | ||
prefec_ber_min = FLOAT ; prefec ber min | ||
prefec_ber_max = FLOAT ; prefec ber max | ||
cd_avg = FLOAT ; chromatic dispersion avg | ||
cd_min = FLOAT ; chromatic dispersion min | ||
cd_max = FLOAT ; chromatic dispersion max | ||
dgd_avg = FLOAT ; differential group delay avg | ||
dgd_min = FLOAT ; differential group delay min | ||
dgd_max = FLOAT ; differential group delay max | ||
sopmd_avg = FLOAT ; second order polarization mode dispersion avg | ||
sopmd_min = FLOAT ; second order polarization mode dispersion min | ||
sopmd_max = FLOAT ; second order polarization mode dispersion max | ||
pdl_avg = FLOAT ; polarization dependent loss avg | ||
pdl_min = FLOAT ; polarization dependent loss min | ||
pdl_max = FLOAT ; polarization dependent loss max | ||
osnr_avg = FLOAT ; optical signal to noise ratio avg | ||
osnr_min = FLOAT ; optical signal to noise ratio min | ||
osnr_max = FLOAT ; optical signal to noise ratio max | ||
esnr_avg = FLOAT ; electrical signal to noise ratio avg | ||
esnr_min = FLOAT ; electrical signal to noise ratio min | ||
esnr_max = FLOAT ; electrical signal to noise ratio max | ||
cfo_avg = FLOAT ; carrier frequency offset avg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaganbal-a we don't need to copy these again from section 2.1.5. A reference to 2.1.5 is sufficient. can you remove this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed the redundant fields and added link.
@zhangyanzhao |
please check the description for the code PRs. |
Adding performance monitoring HLD .
Please refer to below code PR for the implementation
sonic-platform-common: sonic-net/sonic-platform-common#402
sonic-platform-daemons :sonic-net/sonic-platform-daemons#390
sonic-utilities : sonic-net/sonic-utilities#2927