Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue: port_type is referenced before initialized #2323

Merged

Conversation

stephenxs
Copy link
Collaborator

@stephenxs stephenxs commented Aug 20, 2022

Signed-off-by: Stephen Sun stephens@nvidia.com

What I did

Followup fixes : #2312

How I did it

How to verify it

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@judyjoseph judyjoseph merged commit e14f679 into sonic-net:master Aug 20, 2022
@stephenxs stephenxs deleted the fix-port-type-refer-before-init branch August 21, 2022 08:23
@mlok-nokia
Copy link
Contributor

mlok-nokia commented Aug 22, 2022

This may not a right fix. One of the real issue is the function call doesn't handle the internal port correctly. Role such as: Int, Inb and Rec. On a platform with those ports. The following is the output:

admin@sonic:~$ show inter status
Invalid port 'Ethernet-Rec0'
Invalid port 'Ethernet-Rec0'
Invalid port 'Ethernet-IB0'
Invalid port 'Ethernet-IB0'
Invalid port 'Ethernet-IB1'
Invalid port 'Ethernet-IB1'
Invalid port 'Ethernet-Rec1'
Invalid port 'Ethernet-Rec1'
     Interface                            Lanes    Speed    MTU    FEC          Alias            Vlan    Oper    Admin                                             Type    Asym PFC
--------------  -------------------------------  -------  -----  -----  -------------  --------------  ------  -------  -----------------------------------------------  ----------
     Ethernet0          72,73,74,75,76,77,78,79     400G   9100    N/A      Ethernet0  PortChannel101      up       up  QSFP-DD Double Density 8X Pluggable Transceiver         off
     Ethernet1          80,81,82,83,84,85,86,87     400G   9100    N/A      Ethernet1          routed    down     down                                              N/A         off
     Ethernet2          88,89,90,91,92,93,94,95     400G   9100    N/A      Ethernet2          routed    down     down                                              N/A         off
     Ethernet3      96,97,98,99,100,101,102,103     400G   9100    N/A      Ethernet3          routed    down     down                                              N/A         off
     Ethernet4  104,105,106,107,108,109,110,111     400G   9100    N/A      Ethernet4          routed    down     down                                              N/A         off
     Ethernet5  112,113,114,115,116,117,118,119     400G   9100    N/A      Ethernet5          routed    down     down                                              N/A         off
     Ethernet6  120,121,122,123,124,125,126,127     400G   9100    N/A      Ethernet6          routed    down     down                                              N/A         off
     Ethernet7  128,129,130,131,132,133,134,135     400G   9100    N/A      Ethernet7          routed    down     down                                              N/A         off
     Ethernet8  136,137,138,139,140,141,142,143     400G   9100    N/A      Ethernet8  PortChannel101      up       up  QSFP-DD Double Density 8X Pluggable Transceiver         off
     Ethernet9          64,65,66,67,68,69,70,71     400G   9100    N/A      Ethernet9          routed    down     down                                              N/A         off

@judyjoseph
Copy link
Contributor

This may not a right fix. One of the real issue is the function call doesn't handle the internal port correctly. Role such as: Int, Inb and Rec. On a platform with those ports. The following is the output:

admin@sonic:~$ show inter status
Invalid port 'Ethernet-Rec0'
Invalid port 'Ethernet-Rec0'
Invalid port 'Ethernet-IB0'
Invalid port 'Ethernet-IB0'
Invalid port 'Ethernet-IB1'
Invalid port 'Ethernet-IB1'
Invalid port 'Ethernet-Rec1'
Invalid port 'Ethernet-Rec1'
     Interface                            Lanes    Speed    MTU    FEC          Alias            Vlan    Oper    Admin                                             Type    Asym PFC
--------------  -------------------------------  -------  -----  -----  -------------  --------------  ------  -------  -----------------------------------------------  ----------
     Ethernet0          72,73,74,75,76,77,78,79     400G   9100    N/A      Ethernet0  PortChannel101      up       up  QSFP-DD Double Density 8X Pluggable Transceiver         off
     Ethernet1          80,81,82,83,84,85,86,87     400G   9100    N/A      Ethernet1          routed    down     down                                              N/A         off
     Ethernet2          88,89,90,91,92,93,94,95     400G   9100    N/A      Ethernet2          routed    down     down                                              N/A         off
     Ethernet3      96,97,98,99,100,101,102,103     400G   9100    N/A      Ethernet3          routed    down     down                                              N/A         off
     Ethernet4  104,105,106,107,108,109,110,111     400G   9100    N/A      Ethernet4          routed    down     down                                              N/A         off
     Ethernet5  112,113,114,115,116,117,118,119     400G   9100    N/A      Ethernet5          routed    down     down                                              N/A         off
     Ethernet6  120,121,122,123,124,125,126,127     400G   9100    N/A      Ethernet6          routed    down     down                                              N/A         off
     Ethernet7  128,129,130,131,132,133,134,135     400G   9100    N/A      Ethernet7          routed    down     down                                              N/A         off
     Ethernet8  136,137,138,139,140,141,142,143     400G   9100    N/A      Ethernet8  PortChannel101      up       up  QSFP-DD Double Density 8X Pluggable Transceiver         off
     Ethernet9          64,65,66,67,68,69,70,71     400G   9100    N/A      Ethernet9          routed    down     down                                              N/A         off

@mlok-nokia have you taken in both fixes in #2313 & #2323. I see the o/p correct without the "Invalid port" messages

@mlok-nokia
Copy link
Contributor

mlok-nokia commented Aug 22, 2022

This may not a right fix. One of the real issue is the function call doesn't handle the internal port correctly. Role such as: Int, Inb and Rec. On a platform with those ports. The following is the output:

admin@sonic:~$ show inter status
Invalid port 'Ethernet-Rec0'

@mlok-nokia have you taken in both fixes in #2313 & #2323. I see the o/p correct without the "Invalid port" messages

I did. I used the file from the latest master branch. The print out of "Invalid port" is in the function "logical_port_name_to_physical_port_list()" which is expected to display this line before the exception happened for internal port name.

@mlok-nokia
Copy link
Contributor

mlok-nokia commented Aug 22, 2022

I think we should import and add the following line before the "porttype = None":
if port_name.startswith((backplane_prefix(), inband_prefix(), recirc_prefix())):
return False

@stephenxs
Copy link
Collaborator Author

I think we should import and add the following line before the "porttype = None": if port_name.startswith((backplane_prefix(), inband_prefix(), recirc_prefix())): return False

ok. thanks. is there any other case that can fail logical_port_name_to_physical_port_list?

@stephenxs
Copy link
Collaborator Author

I think we should import and add the following line before the "porttype = None": if port_name.startswith((backplane_prefix(), inband_prefix(), recirc_prefix())): return False

ok. thanks. is there any other case that can fail logical_port_name_to_physical_port_list?

Maybe we can use this logic to cover more possible failure scenarios and be scalable for further extension.
@mlok-nokia @judyjoseph how do you think

    if not port_name or not port_name.startswith(front_panel_prefix()):
        return False
    for _, prefix in SONIC_INTERFACE_PREFIXES:
        if prefix != front_panel_prefix() and port_name.startswith(prefix):
            return False

@stephenxs
Copy link
Collaborator Author

fix: #2327

@mlok-nokia
Copy link
Contributor

mlok-nokia commented Aug 23, 2022

Here is my suggestion -- As what I understand, "Port-Name" should be all names which are defined in the port_config.ini. It should not contain "vlan", "PortChhannel" --- these type of interfaces. Most of the places use the following logic to check if the port_name is a internal interface or not. I think that using the same logic and convention is easy to be maintained and enhanced. In the future, if there is a new type of internal inter face is created, we can grep these functions and add the new type.

if port_name.startswith((backplane_prefix(), inband_prefix(), recirc_prefix())):
return False

@mlok-nokia
Copy link
Contributor

mlok-nokia commented Aug 23, 2022

Also, there is another issue I am not sure if it is related to this change. When issue the CLI command "show interface status" on a multi-asic platform (2 asics), the following error has been logged twice. If it is expected, it should not be logged as Error. Otherwise, the OC test case will fail on the LogAnalyzer.

Aug 23 00:50:08.594950 sonic ERR Chassis: :- initializeGlobalConfig: SonicDBConfig Global config is already initialized
Aug 23 00:50:08.599570 sonic ERR Chassis: :- initializeGlobalConfig: SonicDBConfig Global config is already initialized
Aug 23 00:50:13.862567 sonic ERR Chassis: :- initializeGlobalConfig: SonicDBConfig Global config is already initialized
Aug 23 00:50:13.867504 sonic ERR Chassis: :- initializeGlobalConfig: SonicDBConfig Global config is already initialized

@judyjoseph
Copy link
Contributor

I think we should import and add the following line before the "porttype = None": if port_name.startswith((backplane_prefix(), inband_prefix(), recirc_prefix())): return False

ok. thanks. is there any other case that can fail logical_port_name_to_physical_port_list?

Maybe we can use this logic to cover more possible failure scenarios and be scalable for further extension. @mlok-nokia @judyjoseph how do you think

    if not port_name or not port_name.startswith(front_panel_prefix()):
        return False
    for _, prefix in SONIC_INTERFACE_PREFIXES:
        if prefix != front_panel_prefix() and port_name.startswith(prefix):
            return False

The logic of identifying the ports are already present here https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_sfp/sfputilhelper.py#L75.

I didn't notice that this function "logical_port_name_to_physical_port_list()" was a local function.

My thought would be not to duplicate logic, instead just reuse the platform_sfputil.get_logical_to_physical(port_name) call itself in the try block which will fix everything. @mlok-nokia @stephenxs any thoughts

        try:
            physical_port = platform_sfputil.get_logical_to_physical(port_name)
            if physical_port:
                port_type = platform_chassis.get_port_or_cage_type(physical_port[0])
        except Exception as e:
            pass

@stephenxs
Copy link
Collaborator Author

The logic of identifying the ports are already present here https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_sfp/sfputilhelper.py#L75.

I didn't notice that this function "logical_port_name_to_physical_port_list()" was a local function.

My thought would be not to duplicate logic, instead just reuse the platform_sfputil.get_logical_to_physical(port_name) call itself in the try block which will fix everything. @mlok-nokia @stephenxs any thoughts

Completely agree that we should not duplicate the logic. Let me check a bit more and get back to you.
Btw, let's discuss this in the new PR.

@stephenxs
Copy link
Collaborator Author

initializeGlobalConfig

It doesn't look relevant to this change.
The log is printed from initializeGlobalConfig

void SonicDBConfig::initializeGlobalConfig(const string &file)
{
    std::string local_file, dir_name, ns_name;
    std::unordered_map<std::string, SonicDBInfo> db_entry;
    std::unordered_map<std::string, RedisInstInfo> inst_entry;
    std::unordered_map<int, std::string> separator_entry;
    std::lock_guard<std::recursive_mutex> guard(m_db_info_mutex);

    SWSS_LOG_ENTER();

    if (m_global_init)
    {
        SWSS_LOG_ERROR("SonicDBConfig Global config is already initialized");
        return;
    }

and exposed to python via load_sonic_global_db_config

    static void initializeGlobalConfig(const std::string &file = DEFAULT_SONIC_DB_GLOBAL_CONFIG_FILE);
#ifdef SWIG
    %pythoncode %{
        ## TODO: the python function and C++ one is not on-par
        @staticmethod
        def load_sonic_global_db_config(global_db_file_path=DEFAULT_SONIC_DB_GLOBAL_CONFIG_FILE, namespace=None):
            SonicDBConfig.initializeGlobalConfig(global_db_file_path)
    %}
#endif

which is called from load_db_config

def load_db_config():
    '''
    Load the correct database config file:
     - database_global.json for multi asic
     - database_config.json for single asic
    '''
    if is_multi_asic():
        if not swsscommon.SonicDBConfig.isGlobalInit():
            swsscommon.SonicDBConfig.load_sonic_global_db_config()
    else:
        if not swsscommon.SonicDBConfig.isInit():
            swsscommon.SonicDBConfig.load_sonic_db_config()

it will always check before calling it so it should not run into error if it is called from this func.
this is the most common place which is called from CLI.
but if it is called from cpp code, I'm not sure whether it will check before calling it.

yxieca pushed a commit that referenced this pull request Aug 26, 2022
Signed-off-by: Stephen Sun <stephens@nvidia.com>
preetham-singh pushed a commit to preetham-singh/sonic-utilities that referenced this pull request Nov 21, 2022
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"show int status" command fails on multi-asic/chassis devices.
4 participants