-
Notifications
You must be signed in to change notification settings - Fork 539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle Port oper down error status notification #3350
Conversation
@moshemos @eddyk-nvidia please review this PR |
|
||
if (port.m_portOperErrorToEvent.find(error_status) == port.m_portOperErrorToEvent.end()) | ||
{ | ||
++errors; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if error_status is 0, no need query the map. If it is not 0 and not in the map, we probably need record an error log.
static const std::unordered_map<sai_port_error_status_t, std::string> db_key_errors; | ||
|
||
private: | ||
sai_port_error_status_t m_errorFlag = SAI_PORT_ERROR_STATUS_CLEAR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need this member? I think the port event from redis already contains the error status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Junchao-Mellanox probably not required but i thought in case i need to take OA coredump for future debubbing i don't need redis dump.
private: | ||
sai_port_error_status_t m_errorFlag = SAI_PORT_ERROR_STATUS_CLEAR; | ||
size_t m_errorCount = 0; | ||
std::string m_dbKeyError; // DB key for this port error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need this member? The DB key name is already in the static map
sai_port_error_status_t m_errorFlag = SAI_PORT_ERROR_STATUS_CLEAR; | ||
size_t m_errorCount = 0; | ||
std::string m_dbKeyError; // DB key for this port error | ||
std::time_t m_eventTime = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need this member? The event time can be get whenever there is an error event. Seems no need to save it in a class.
@@ -193,6 +232,9 @@ class Port | |||
sai_object_id_t m_system_side_id = 0; | |||
sai_object_id_t m_line_side_id = 0; | |||
|
|||
/* Port oper error status to event map*/ | |||
std::unordered_map<sai_port_error_status_t, PortOperErrorEvent> m_portOperErrorToEvent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i suppose we only need a std::unordered_map<sai_port_error_status_t, uint32_t> map here to record the occur count for each error type.
const sai_port_error_status_t error_status = error.first; | ||
std::string error_name = error.second; | ||
|
||
port.m_portOperErrorToEvent[error_status] = PortOperErrorEvent(error_status, error_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no sure this is required
|
||
auto key = pevent->getDbKey(); | ||
vector<FieldValueTuple> tuples; | ||
FieldValueTuple tup1("oper_error_status", std::to_string(port.m_oper_error_status)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the latest change to notification structure - opencomputeproject/SAI#2087
Was discussed in the SAI community meeting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eddyk-nvidia that is for future use case. I can't wait for that to merge and start using as its already not meeting 202411 release time.
/azpw run |
/AzurePipelines run |
Azure Pipelines successfully started running 1 pipeline(s). |
023e14f
to
6955d48
Compare
Signed-off-by: Prince George <prgeor@microsoft.com>
Handler Port oper down error status notification as per sai_port_error_status_t flag as part of existing port oper status notification sai_port_oper_status_notification_t What I did Whenever Orchagent gets the notification for port_state_change for link down event, OA will update the error count for the error flags that are set and the timestamp of the error port down event in new STATE_DB table PORT_OPERR_TABLE Why I did it To correlate various error events like MAC local and remote fault with the link oper down status notification.
Handler Port oper down error status notification as per sai_port_error_status_t flag as part of existing port oper status notification
sai_port_oper_status_notification_t
What I did
Whenever Orchagent gets the notification for
port_state_change
for link down event, OA will update the error count for the error flags that are set and the timestamp of the error port down event in new STATE_DB tablePORT_OPERR_TABLE
Why I did it
To correlate various error events like MAC local and remote fault with the link oper down status notification.
How I verified it
Since we don't have real hardware to test this modified notification, I modified the sairedis to generate this new notification on every link down event