-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VoQ configuration using minigraph.xml file #5991
Conversation
b05c8bd
to
302539b
Compare
302539b
to
307ee88
Compare
307ee88
to
2cac425
Compare
else: | ||
if child.tag == str(QName(ns, "DpgDec")): | ||
(intfs, lo_intfs, mvrf, mgmt_intf, vlans, vlan_members, pcs, pc_members, acls, vni, tunnel_intfs, dpg_ecmp_content) = parse_dpg(child, asic_name) | ||
(intfs, lo_intfs, mvrf, mgmt_intf, voq_inband_intfs, vlans, vlan_members, pcs, pc_members, acls, vni, tunnel_intfs, dpg_ecmp_content) = parse_dpg(child, asic_name) | ||
host_lo_intfs = parse_host_loopback(child, hostname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the minigraph.xml this is defined under each asic.. We don't need this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For single-asic linecards this is defined under the host's DpgDec
section.
elif child.tag == str(QName(ns, "LinkMetadataDeclaration")): | ||
linkmetas = parse_linkmeta(child, hostname) | ||
elif child.tag == str(QName(ns, "DeviceInfos")): | ||
(port_speeds_default, port_descriptions) = parse_deviceinfo(child, hwsku) | ||
(port_speeds_default, port_descriptions, sys_ports) = parse_deviceinfo(child, hwsku) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sysports are also for each asic. I dont think we need it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sysports a defined globally but needs to be programmed by each asic.
@@ -1268,6 +1347,9 @@ def parse_xml(filename, platform=None, port_config_file=None, asic_name=None, hw | |||
results['INTERFACE'] = phyport_intfs | |||
results['VLAN_INTERFACE'] = vlan_intfs | |||
|
|||
if sys_ports: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a unit tests for this ?
@@ -1115,6 +1143,141 @@ | |||
<Speed>40000</Speed> | |||
</a:EthernetInterface> | |||
</EthernetInterfaces> | |||
<SystemPorts> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the system ports needs to be defined inside each ASIC, not at the line card level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each asic needs to know about all the system ports in the system. Currently system ports are initialized during orchagent init so this makes it easier to have them defined in DeviceInfo
and parsed by each asic (either in multi-asic setup or in single-asic linecards) than having them duplicated for each asic in the minigraph.xml
src/sonic-config-engine/minigraph.py
Outdated
speed = sysport.find(str(QName(ns, "Speed"))).text | ||
num_voq = sysport.find(str(QName(ns, "NumVoq"))).text | ||
key = "%s|%s" % (device, portname) | ||
if slot_id is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't slot_id a required attribute ? will handling of system ports in swss work, if slot-id is not there in the key ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slot_id
is required to identify each linecard in a chassis. For a fixed system with multiple asics this is not required since the device
is enough to identify the each asic of the system.
In the end slot_id
is only necessary to build a unique key to identify each asic in the system. If that helps we can make it mandatory for single-asic systems even if this is not required.
@@ -808,7 +823,15 @@ def parse_meta(meta, hname): | |||
cloudtype = value | |||
elif name == "ResourceType": | |||
resource_type = value | |||
return syslog_servers, dhcp_servers, ntp_servers, tacacs_servers, mgmt_routes, erspan_dst, deployment_id, region, cloudtype, resource_type | |||
elif name == "SwitchId": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is configuration is for the Line card host not asic, do we need these attributes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are required in the case of single asic linecards. In this case the linecard is a single host device. For multi-asic linecards or multi-asic fixed systems these attributes needs to be parsed for each asic in parse_asic_meta
.
Hi @mlorrillere, The multi_npu minigraph.xml has configuration which is not applicable to the VoQ chassis and vice-versa. Can you please add a new sample minigraph.xml and unit tests for VoQ chassis, instead of modifying the multi_npu ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as comments
Sure, I'll add a dedicated unit test for VoQ minigraph.xml |
src/sonic-config-engine/minigraph.py
Outdated
# on Voq system each asic has a instance_name | ||
if instance_name is not None: | ||
results['DEVICE_METADATA']['localhost']['instance_name'] = instance_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not use the "instance_name" anywhere in the voq implementation. If this is not used anywhere we can remove this.
But we need "slot_id" and "device" in DEVICE_METADATA to derive the system lag name in the same way how system port name is derived (Refer: Voq LAG HLD sonic-net/SONiC#697)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "instance_name" here has the same meaning as "asic_name" in the VoQ LAG HLD, so I will rename "instance_name" -> "asic_name".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that there is an api in minigraph.py that extracts asic_id from asic_name. This indicates that asic_name will be either of the two forms like asic0, asic1, etc. or asic<PCI device id> form.
If we have to have the asic_id in the format PCI device id, I do not think it will be good idea to use it for system lag name.
Therefore, I recommend to change the voq lag hld to use "slot_id" and "device" and add these attributes in device meta data so that both system port name and system lag name will be derived in similar way (using slot_id and device). We can get rid of "instance_name".
If this is fine with all, I'll take care of updating voq lag hld.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that asic_id
in minigraph.py has the same meaning as asic_id
from CONFIG_DB. The former is used to extract the instance ID as described by systemd services, while the later is intended to be used internally by the platform to identify a particular asic (like PCI device ID, see also https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-orchagent/orchagent.sh#L28-L42).
I think asic_id
in minigraph.py is misleading as it really is the "instance name" as defined by systemd.unit manual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like asic_name
in minigraph.py
is actually the asic's hostname. Adding device
to minigraph.xml will basically replicate asic's Hostname
. I think it would be fair here to simply do:
results['DEVICE_METADATA']['localhost']['asic_name'] = asic_name
or, if it makes more sense to name it device_name
in CONFIG_DB:
results['DEVICE_METADATA']['localhost']['device_name'] = asic_name
Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like
asic_name
inminigraph.py
is actually the asic's hostname. Addingdevice
to minigraph.xml will basically replicate asic'sHostname
. I think it would be fair here to simply do:results['DEVICE_METADATA']['localhost']['asic_name'] = asic_nameor, if it makes more sense to name it
device_name
in CONFIG_DB:results['DEVICE_METADATA']['localhost']['device_name'] = asic_nameLet me know what you think.
I think both in minigraph and config_db "asic_name" takes the value "asic0", "asic1" and so on. Also note that asic_name is nothing but the namespace passed to parse_xml() by sonic-cfggen. And we have a separate config_db attribute for "hostname" which is "Hostname" in minigraph. So I guess "asic_name" cannot be hostname but should be considered as asic instance name. Therefore, we do not need "InstanceName" and "Device" in minigraph instead use the "asic_name".
The "hostname" which is used for both single asic and multi-asic should be equivalent to the line card in chassis context. So we can consider the "Hostname" for "slot_id" and avoid introducing "SlotId" in minigraph.
So my recommendations are:
(1) Set config db ['DEVICE_METADATA']['localhost']['asic_name'] = asic_name, as you suggested.
(2) Set config_db ['DEVICE_METADATA']['localhost']['hostname'] = hostname
(3) Remove "InstanceName" and use "asic_name" for instance name if instance name is needed
(4) For system port table: Remove "SlotId" and "Device". Use "hostname" and "asic_name" instead and derive system port key using the format "hostname|asic_name|portname"
I think (1) and (2) are already there as part of multi-asic. I feel that it is better to extend the use of existing multi-asic attributes for voq chassis systems also instead of introducing new attributes in config db and minigraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @vganesan-nokia ,
Regarding this point:
(4) For system port table: Remove "SlotId" and "Device". Use "hostname" and "asic_name" instead and derive system port key using the format "hostname|asic_name|portname"
Just to confirm that we are on the same page, we still need to add Hostname
and AsicName
to each SystemPort
because they are defined globally. We can't infer hostname
and asic_name
while parsing each individual system port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. We need "hostname" and "asic_name" in the DEVICE_METADATA and should use these for deriving the system port name something like "<hostname>|<asic_name>|Ethernet1". The recommendation (4) is to say that we do not need to define "SlotId", 'Device" and "InstanceName" in minigraph. We already have "hostname" and "asic_name" and "asic_name" is same as instance name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to define these in the minigraph, but we still do need to use them in SystemPort
, i.e.:
<SystemPort>
<Name>Ethernet1/3</Name>
<Hostname>multi_npu_platform_01</Hostname>
<AsicName>ASIC0</AsicName>
[...]
</SystemPort>
The reason for that is that the minigraph contains all system ports defined in the chassis, so each system port has to include hostname and asic_name of the system port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sounds good to me. It will be good to have the values of HostName and AsicName consistent with values in DEVICE_METADATA "hostname" and "asic_name". The reason is: the values of ['DEVICE_METADATA']['localhost']['hostname'] and ['DEVICE_METADATA']['localhost']['asic_name'] are used for deriving the system lag name. It will be good to have naming convention consistent for system port also. So for here, if we make sure HostName and AsicName, use the same corresponding values (the values of ['DEVICE_METADATA']['localhost']['hostname'] and ['DEVICE_METADATA']['localhost']['asic_name'], it will be great.
@@ -1192,6 +1252,25 @@ def parse_xml(filename, platform=None, port_config_file=None, asic_name=None, hw | |||
current_device['sub_role'] = sub_role | |||
results['DEVICE_METADATA']['localhost']['sub_role'] = sub_role | |||
results['DEVICE_METADATA']['localhost']['asic_name'] = asic_name | |||
# on Voq system each asic has a switch_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VoQ system needs "asic_id" also. Setting of asic_id in ['DEVICE_METADATA']['localhost']['asic_id''] is done in sonic-cfggen. But this asic_id will be in simple number format like 0, 1, etc. Please make sure that this is over-written by asic_id (in either PCI device id format or simple number) from default_config.json before orchagent starts as discussed in sonic-net/sonic-utilities#1228.
2cac425
to
d5fe425
Compare
This pull request introduces 1 alert when merging d5fe4253932597614b97d179bfbbe0ed59ee4a5a into 43d4d45 - view on LGTM.com new alerts:
|
973a926
to
509a315
Compare
511b834
to
6d511db
Compare
This pull request introduces 1 alert when merging 6d511db8ebd653b4623dae5accd953d2ae2df6c7 into aa63c71 - view on LGTM.com new alerts:
|
6d511db
to
8448c14
Compare
This pull request introduces 1 alert when merging 8448c1448f903ca490533be65dddece4451b188f into aa63c71 - view on LGTM.com new alerts:
|
8448c14
to
d00be13
Compare
Hi @vganesan-nokia and @arlakshm , Last changes include updated unit test with a dedicated sample minigraph XML file and address other comments related to InstanceName and SystemPorts . |
output = json.loads(self.run_script(argument)) | ||
self.assertEqual(output['localhost']['switch_id'], '0') | ||
self.assertEqual(output['localhost']['switch_type'], 'voq') | ||
self.assertEqual(output['localhost']['max_cores'], '16') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we checking for the "asic_id", "asic_name", and "hostname" in any other test case? If not would you please add verification (assertEqual) for ['localhost']['asic_id'], ['localhost']['hostname'] and ['localhost']['asic_name'] here? These are mandatory attributes if "switch_type" is "voq".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asic_id
and asic_name
are multi-asic attributes, not specific to VoQ switches. I don't think we need either of them on single-asic if switch_type
is "voq"
. On single-asic devices where asic_name
is not specified the system port key is "hostname|portname".
For checks:
asic_name
is tested in test_multinpu_cfggen.py but notasic_id
.hostname
is not specific to VoQ switches but I didn't see any check for this one.
I'll update test_cfggen.py to add a check for hostname
and test_multinpu_cfggen.py
for asic_id
.
@mlorrillere, thanks for addressing comments. Things look good. I have left a minor comment in test_cfggen.py. |
4b03311
to
04c80fc
Compare
Hi @vganesan-nokia , Last changes include making |
04c80fc
to
fba8f80
Compare
Contains the following changes: - Add support for system ports configuration to minigraph - Add support for SwitchId, SwitchType and MaxCores to minigraph - Add support for inband vlan configuration in minigraph - `asic_name` is now a mandatory attribute in CONFIG_DB on VoQ switches
fba8f80
to
de5c9f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This commit contains the following changes to support for configuring a VoQ switch using a minigraph.xml file.: - Add support for system ports configuration to minigraph - Add support for SwitchId, SwitchType and MaxCores to minigraph - Add support for inband vlan configuration in minigraph - `asic_name` is now a mandatory attribute in CONFIG_DB on VoQ switches Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
This commit contains the following changes to support for configuring a VoQ switch using a minigraph.xml file.: - Add support for system ports configuration to minigraph - Add support for SwitchId, SwitchType and MaxCores to minigraph - Add support for inband vlan configuration in minigraph - `asic_name` is now a mandatory attribute in CONFIG_DB on VoQ switches Co-authored-by: Maxime Lorrillere <mlorrillere@arista.com>
Contains the following changes:
asic_name
inCONFIG_DB
mandatory for VoQ switchessample-voq-graph.xml
and updatedtest_cfggen.py
to test for VoQ switches config attributes- Why I did it
Adding support for configuring a VoQ switch using a minigraph.xml file.
- How I did it
Added the following new sections to the minigraph.xml file:
VoqInbandInterfaces
toDeviceDataPlaneInfo
Contains a list of
VoQInbandInterface
for this device. Example:SystemPorts
inDeviceInfo
Contains the list of all system ports present in the system as
SystemPort
elements. Example:SwitchId
toDeviceMetadata
This is used to set the switch ID of a device. Example:
- How to verify it