diff --git a/doc/mpls/MPLS_hld.md b/doc/mpls/MPLS_hld.md new file mode 100644 index 00000000000..61bcbca5a71 --- /dev/null +++ b/doc/mpls/MPLS_hld.md @@ -0,0 +1,782 @@ +# MPLS for SONiC High Level Design Document # + +## Table of Content +- [MPLS for SONiC High Level Design Document](#mpls-for-sonic-high-level-design-document) + - [Table of Content](#table-of-content) + - [Revision](#revision) + - [Scope](#scope) + - [Definitions/Abbreviations](#definitionsabbreviations) + - [Overview](#overview) + - [Requirements](#requirements) + - [Functional Requirements](#functional-requirements) + - [Configuration and Management Requirements](#configuration-and-management-requirements) + - [Scalability Requirements](#scalability-requirements) + - [Warm Boot Requirements](#warm-boot-requirements) + - [Future Requirements](#future-requirements) + - [Architecture Design](#architecture-design) + - [High-Level Design](#high-level-design) + - [Overview](#overview-1) + - [Database Changes](#database-changes) + - [APPL DB](#appl-db) + - [INTF TABLE](#intf-table) + - [ROUTE TABLE](#route-table) + - [LABEL ROUTE TABLE](#label-route-table) + - [CONFIG DB](#config-db) + - [INTERFACE](#interface) + - [PORTCHANNEL INTERFACE](#portchannel-interface) + - [VLAN INTERFACE](#vlan-interface) + - [CRM Config](#crm-config) + - [ASIC DB](#asic-db) + - [ROUTER INTERFACE](#router-interface-1) + - [INSEG ENTRY](#inseg-entry) + - [NEXT HOP](#next-hop-2) + - [Software Modules](#software-modules) + - [NetLink](#netlink) + - [Functions](#functions) + - [IntfMgr](#intfmgr) + - [Functions](#functions-1) + - [FPM Syncd](#fpm-syncd) + - [Functions](#functions-2) + - [IntfsOrch](#intfsorch) + - [Functions](#functions-3) + - [RouteOrch](#routeorch) + - [Functions](#functions-4) + - [NeighOrch](#neighorch) + - [Functions](#functions-5) + - [CrmOrch](#crmorch) + - [Functions](#functions-6) + - [Label/LabelStack](#label-labelstack) + - [NextHopKey](#nexthopkey) + - [Syncd](#syncd) + - [SAI API](#sai-api) + - 
[Router Interface](#router-interface-1) + - [MPLS](#mpls) + - [Next Hop](#next-hop-2) + - [Configuration and management](#configuration-and-management) + - [CLI Enhancements](#cli-enhancements) + - [Config DB Enhancements](#config-db-enhancements) + - [YANG Model Enhancements](#yang-model-enhancements) + - [SONiC Interface](#sonic-interface) + - [SONiC VLAN](#sonic-vlan) + - [SONiC PortChannel](#sonic-portchannel) + - [SONiC CRM](#sonic-crm) + - [Warmboot and Fastboot Design Impact](#warmboot-and-fastboot-design-impact) + - [Restrictions/Limitations](#restrictionslimitations) + - [Testing Requirements/Design](#testing-requirementsdesign) + - [Unit Test cases](#unit-test-cases) + - [System Test cases](#system-test-cases) + - [Open/Action items - if any](#openaction-items---if-any) + +### Revision +| Rev | Date | Author | Change Description | +| :--- | :--------- | :------ | :---------------- | +| 0.1 | Jan-10-2021 | A Pokora | Initial version | +| 0.2 | Jan-19-2021 | A Pokora | Updates from MPLS sub-community review | +| 0.3 | Jun-14-2021 | A Pokora | Updates from MPLS sub-community code-review | +| 1.0 | Dec-08-2021 | A Pokora | Final updates to reflect committed changes | + +### Scope + +This document provides general information about the initial support for MPLS in SONiC infrastructure. The focus of this initial MPLS support is to expand existing SONiC infrastructure for IPv4/IPv6 routing to include equivalent MPLS functionality. The expected use case for this initial MPLS support is static LSP routing. + +### Definitions/Abbreviations +| Abbreviation | Description | +| :---------- | :----------------------------------- | +| CRM | Critical Resource Monitoring | +| cRPD | Containerized Routing Protocol Daemon | +| LSP | Label-Switched Path | +| MPLS | Multi-Protocol Label Switching | +| RIF | Router Interface | + +### Overview +This document provides general information about the initial support for MPLS in SONiC infrastructure. 
+ +### Requirements +This section describes the requirements for the initial support for MPLS in SONiC infrastructure. + +#### Functional Requirements +- Support for MPLS enable/disable per RIF. +- Support for MPLS Push, Pop, and Swap label operations, including MPLS implicit-null and explicit-null behavior. +- Support for bulk MPLS in-segment entry SAI programming. +- Support for MPLS type next-hop SAI programming. +- Support in CRM for accounting of MPLS in-segment entries and MPLS next-hops. +- Support for VS platform SAI for test purposes. + +#### Configuration and Management Requirements +- SONiC CLI support for configuring MPLS enable/disable per RIF. +- SONiC CLI support for displaying MPLS state per RIF. +- SONiC CLI support for configuring CRM thresholds for MPLS in-segment entries and MPLS type next-hops. +- SONiC CLI support for displaying CRM thresholds and accounting for MPLS in-segment entries and MPLS type next-hops. + +#### Scalability Requirements +- MPLS in-segment entries are supported up to the maximum number the ASIC can hold. +- An error is logged to syslog for every MPLS route attempted after the maximum limit is reached. +- A CRM notification is raised upon reaching configurable scaling thresholds. + +#### Warm Boot Requirements +- MPLS functionality continues across warm reboot. +- Support for planned system warm restart. +- Support for SWSS docker warm restart. + +#### Future Requirements +- SONiC CLI support for MPLS operational commands. +- FRR Zebra FPM support for MPLS in-segment entries and MPLS next-hops. +- Support for VRFs. + +### Architecture Design +For MPLS, SONiC SwSS infrastructure route and next-hop support is extended to include an optional MPLS label stack in addition to the existing IPv4/IPv6 address information. 
+ +### High-Level Design + +#### Overview +![Overview diagram](images/MPLS_overview_diagram.png "Overview of MPLS components") + +**Figure 1: Overview of the data flow and related components of MPLS** + +#### Database Changes +This section describes the modifications to SONiC Databases to support MPLS. + +##### APPL DB + +###### INTF TABLE +The existing INTF_TABLE in the APPL_DB is enhanced to accept a new "mpls" enable/disable attribute. + +``` +INTF_TABLE|{{interface_name}} + "mpls":{{enable|disable}} (OPTIONAL) + +; Defines schema for MPLS configuration attribute +key = INTERFACE:ifname ; Interface name +; field = value +mpls = "enable" / "disable" ; Enable/disable MPLS function. Default "disable" +``` + +###### ROUTE TABLE +The existing ROUTE_TABLE for IPv4/IPv6 prefix routes in the APPL_DB is enhanced to accept an optional "mpls_nh" attribute that is applicable when an MPLS push operation is configured. The format of the "mpls_nh" attribute string for IPv4/IPv6 prefix routes is: "push\/.../\". + +For IP forward-only next-hops, the "mpls_nh" attribute is not applicable. If the IPv4/IPv6 prefix route is associated with a single IP forward-only next-hop or a next-hop group consisting only of these next-hops, then the "mpls_nh" attribute will not be present. If the IPv4/IPv6 prefix route is associated with a next-hop group with a mix of MPLS push and IP forward-only next-hops, then each IP forward-only next-hop will be represented by "na" in the "mpls_nh" attribute. + +For all next-hop types, the formats of the "nexthop" and "ifname" attributes are unchanged from previous releases. + +``` +"ROUTE_TABLE":{{prefix}} + "nexthop":{{nexthop_list}} + "ifname":{{ifname_list}} + "mpls_nh":{{mpls_nh_list}} + +; Defines schema for IPv4/IPv6 route table attributes +key = ROUTE_TABLE:prefix ; IPv4/IPv6 prefix +; field = value +nexthop = STRING ; Comma-separated list of IP gateways. +ifname = STRING ; Comma-separated list of interfaces. 
+mpls_nh = STRING ; Comma-separated list of MPLS next-hop info. +``` + +###### LABEL ROUTE TABLE +A new LABEL_ROUTE_TABLE is introduced to the APPL_DB for MPLS in-segment entries. The LABEL_ROUTE_TABLE uses the ingress MPLS label as its lookup key, instead of the IP prefix used by the ROUTE_TABLE. +The LABEL_ROUTE_TABLE accepts the same attributes as ROUTE_TABLE: +- A "nexthop" formatted-string attribute containing a list of IP gateways. +- An "ifname" attribute containing a list of interfaces. +- An "mpls_nh" attribute containing a list of MPLS next-hop info. + +For MPLS in-segment routes, the "mpls_nh" attribute is applicable when an MPLS swap operation is configured. The format of the "mpls_nh" attribute for MPLS in-segment routes is: "swap\/../\". + +For MPLS pop and IP forward-only operations, the "mpls_nh" attribute is not applicable. If the MPLS in-segment entry is associated with a single MPLS pop or IP forward-only next-hop or a next-hop group consisting only of these next-hops, then the "mpls_nh" attribute will not be present. If the MPLS in-segment entry is associated with a next-hop group with a mix of MPLS swap and MPLS pop/IP forward-only next-hops, then each MPLS pop/IP forward-only next-hop will be represented by "na" in the "mpls_nh" attribute. + +The LABEL_ROUTE_TABLE will contain an additional "mpls_pop" attribute for each MPLS in-segment entry. The value of "mpls_pop" will be "0" if the ingress MPLS label is to be retained (ie, IP forward-only next-hop). The value of "mpls_pop" will be "1" if the ingress MPLS label is to be removed (ie, MPLS pop or MPLS swap next-hop). + +For all next-hop types, the formats of the "nexthop" and "ifname" attributes are unchanged from previous releases. 
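To make the "mpls_nh" encoding concrete, here is a minimal parsing sketch. It assumes each comma-separated entry is either "na" or the keyword "push"/"swap" followed by one or more decimal labels separated by "/" (for example "swap200" or "push100/200"). The names `MplsAction`, `MplsNextHop`, `parseMplsNextHop`, and `parseMplsNextHopList` are hypothetical and are not taken from the SWSS code base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not the SWSS definitions.
enum class MplsAction { None, Push, Swap };

struct MplsNextHop
{
    MplsAction action = MplsAction::None;  // None corresponds to "na"
    std::vector<uint32_t> labels;          // labels in the order they appear
};

// Split a string on a single-character delimiter.
std::vector<std::string> splitOn(const std::string &s, char delim)
{
    std::vector<std::string> out;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim))
        out.push_back(item);
    return out;
}

// Parse one "mpls_nh" entry: "na", "push<l1>/<l2>/...", or "swap<l1>/...".
MplsNextHop parseMplsNextHop(const std::string &entry)
{
    MplsNextHop nh;
    if (entry == "na")
        return nh;  // IP forward-only or MPLS pop next-hop: no labels

    if (entry.compare(0, 4, "push") == 0)
        nh.action = MplsAction::Push;
    else if (entry.compare(0, 4, "swap") == 0)
        nh.action = MplsAction::Swap;
    else
        throw std::invalid_argument("unknown mpls_nh entry: " + entry);

    for (const auto &tok : splitOn(entry.substr(4), '/'))
    {
        std::size_t pos = 0;
        unsigned long value = std::stoul(tok, &pos);
        // An MPLS label is a 20-bit value.
        if (pos != tok.size() || value > 0xFFFFF)
            throw std::invalid_argument("bad MPLS label: " + tok);
        nh.labels.push_back(static_cast<uint32_t>(value));
    }
    if (nh.labels.empty())
        throw std::invalid_argument("missing label in: " + entry);
    return nh;
}

// Parse the whole comma-separated "mpls_nh" attribute.
std::vector<MplsNextHop> parseMplsNextHopList(const std::string &attr)
{
    std::vector<MplsNextHop> out;
    for (const auto &entry : splitOn(attr, ','))
        out.push_back(parseMplsNextHop(entry));
    return out;
}
```

Under these assumptions, `parseMplsNextHopList("swap200,na,swap300/400")` yields three entries whose positions line up with the corresponding "nexthop" and "ifname" list positions, with the middle one carrying no labels.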
+ +For MPLS "implicit-null" operations, the "mpls_nh" attribute is not present and the expected "mpls_pop" attribute value is "1" (ie, it is an MPLS pop next-hop). + +For MPLS "explicit-null" operations, the expected "mpls_nh" attribute value is "swap0" and the expected "mpls_pop" attribute value is "1" (ie, it is a special case of an MPLS swap next-hop). + +``` +"LABEL_ROUTE_TABLE":{{mpls_label}} + "nexthop":{{nexthop_list}} + "ifname":{{ifname_list}} + "mpls_nh":{{mpls_nh_list}} + "mpls_pop":{{mpls_pop}} + +; Defines schema for MPLS label route table attributes +key = LABEL_ROUTE_TABLE:mpls_label ; MPLS label +; field = value +nexthop = STRING ; Comma-separated list of nexthops. +ifname = STRING ; Comma-separated list of interfaces. +mpls_nh = STRING ; Comma-separated list of MPLS NH info. +mpls_pop = STRING ; Number of ingress MPLS labels to POP +``` + +##### CONFIG DB + +###### INTERFACE +The existing INTERFACE table is enhanced to accept a new "mpls" enable/disable attribute. + +``` +INTERFACE|{{ifname}} + "mpls":{{enable|disable}} (OPTIONAL) + +; Defines schema for MPLS configuration attribute +key = INTERFACE:ifname ; Interface name +; value annotations +ifname = 1*64VCHAR ; name of the Interface +; field = value +mpls = "enable"/"disable" ; Enable/disable MPLS function. Default "disable" +``` + +###### PORTCHANNEL INTERFACE +The existing PORTCHANNEL_INTERFACE table is enhanced to accept a new "mpls" enable/disable attribute. + +``` +PORTCHANNEL_INTERFACE|{{ifname}} + "mpls":{{enable|disable}} (OPTIONAL) + +; Defines schema for MPLS configuration attributes +key = PORTCHANNEL_INTERFACE:ifname ; Port Channel Interface name +;value annotations +ifname = 1*64VCHAR ; name of the Interface (Port Channel) +; field = value +mpls = "enable"/"disable" ; Enable/disable MPLS function. Default "disable" +``` + +###### VLAN INTERFACE +The existing VLAN_INTERFACE table is enhanced to accept a new "mpls" enable/disable attribute. 
+ +``` +VLAN_INTERFACE|{{ifname}} + "mpls":{{enable|disable}} (OPTIONAL) + +; Defines schema for MPLS configuration attributes +key = VLAN_INTERFACE:ifname ; VLAN Interface name +;value annotations +ifname = 1*64VCHAR ; name of the Interface (VLAN) +; field = value +mpls = "enable"/"disable" ; Enable/disable MPLS function. Default "disable" +``` + +###### CRM Config +The existing CRM Config stanza is enhanced to include new MPLS in-segment entry and MPLS next-hop attributes. These attributes parallel existing CRM configuration for other resource types (eg, IPv4/IPv6 routes and next-hops). + +``` +CRM + Config + "mpls_inseg_threshold_type":{{percentage|used|free}} (OPTIONAL) + "mpls_inseg_high_threshold":{{UINT32}} (OPTIONAL) + "mpls_inseg_low_threshold":{{UINT32}} (OPTIONAL) + + "mpls_nexthop_threshold_type":{{percentage|used|free}} (OPTIONAL) + "mpls_nexthop_high_threshold":{{UINT32}} (OPTIONAL) + "mpls_nexthop_low_threshold":{{UINT32}} (OPTIONAL) + +; Defines schema for CRM MPLS in-segment entry and MPLS next-hop configuration attributes +; field = value +mpls_inseg_threshold_type = "percentage"/"used"/"free" ; Threshold type. Default "percentage" +mpls_inseg_high_threshold = UINT32 ; High threshold. Default value = 85 +mpls_inseg_low_threshold = UINT32 ; Low threshold. Default value = 70 + +mpls_nexthop_threshold_type = "percentage"/"used"/"free" ; Threshold type. Default "percentage" +mpls_nexthop_high_threshold = UINT32 ; High threshold. Default value = 85 +mpls_nexthop_low_threshold = UINT32 ; Low threshold. Default value = 70 +``` + +#### ASIC DB + +##### ROUTER INTERFACE +Support for a new attribute is introduced to the ASIC_DB for the existing ROUTER_INTERFACE object type: SAI_ROUTER_INTERFACE_ATTR_ADMIN_MPLS_STATE. The definition of this attribute can be found in sairouterinterface.h. + +##### INSEG ENTRY +Support for a new object type is introduced to the ASIC_DB: SAI_OBJECT_TYPE_INSEG_ENTRY. 
The full definition of this object type can be found in saimpls.h. + +##### NEXT HOP +Support for new attributes is introduced to the ASIC_DB for the existing NEXT_HOP object type: SAI_NEXT_HOP_ATTR_LABELSTACK and SAI_NEXT_HOP_ATTR_OUTSEG_TYPE. The definition of these attributes can be found in sainexthop.h. + +#### Software Modules +This section describes modifications to SONiC infrastructure software modules to support MPLS. + +##### NetLink Library +The Netlink library (libnl3) is an existing open source library imported by SONiC to parse and format Netlink messages. +Modifications to the existing NetLink library MPLS implementation were needed to support MPLS attributes. +###### Functions +New Netlink message next-hop accessors were added to retrieve attributes in a nested MPLS encapsulation (RTA_ENCAP_TYPE of LWTUNNEL_IPTUNNEL_MPLS) stanza. The following accessors retrieve the value for the attributes of MPLS next-hop destination (MPLS_IPTUNNEL_DST) and TTL (MPLS_IPTUNNEL_TTL): +``` + /* Accessor to retrieve MPLS destination */ + extern struct nl_addr * rtnl_route_nh_get_encap_mpls_dst(struct rtnl_nexthop *); + /* Accessor to retrieve MPLS TTL */ + extern uint8_t rtnl_route_nh_get_encap_mpls_ttl(struct rtnl_nexthop *); +``` + +##### IntfMgr +IntfMgr is an existing daemon in the SWSS container that monitors operations in CONFIG_DB on INTERFACE, PORTCHANNEL_INTERFACE, and VLAN_INTERFACE tables. + +For MPLS, IntfMgr is modified to additionally process the "mpls" enable/disable attribute from the CONFIG_DB and propagate this attribute to APPL_DB. +###### Functions +The following are functions for IntfMgr: +``` + /* MPLS enable/disable per Interface */ + bool IntfMgr::setIntfMpls(const std::string &alias, const std::string &mpls); +``` +This function sets the Linux kernel variable net.mpls.interface.\ to enable/disable MPLS on the specified interface. 
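As an illustration of the kernel side of this step, the sketch below maps the "mpls" attribute onto the per-interface MPLS input control that Linux exposes as `net.mpls.conf.<interface>.input`, reachable through procfs at `/proc/sys/net/mpls/conf/<interface>/input`. The helper names are hypothetical, and the real `IntfMgr::setIntfMpls` may well apply the setting differently (for example by invoking the `sysctl` command), so treat this as a sketch of the mapping only.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: procfs path of the per-interface MPLS input control.
std::string mplsInputSysctlPath(const std::string &ifname)
{
    return "/proc/sys/net/mpls/conf/" + ifname + "/input";
}

// Hypothetical helper: map the "mpls" attribute to the kernel value.
// Returns false for anything other than "enable" or "disable".
bool mplsAttrToSysctlValue(const std::string &mpls, std::string &value)
{
    if (mpls == "enable")
    {
        value = "1";
        return true;
    }
    if (mpls == "disable")
    {
        value = "0";
        return true;
    }
    return false;
}

// Sketch of the write itself. Fails (returns false) when the attribute is
// invalid or the file cannot be opened, e.g. when the MPLS kernel module is
// not loaded or the caller lacks privileges.
bool setIntfMplsSketch(const std::string &ifname, const std::string &mpls)
{
    std::string value;
    if (!mplsAttrToSysctlValue(mpls, value))
        return false;

    std::ofstream file(mplsInputSysctlPath(ifname));
    if (!file.is_open())
        return false;

    file << value << std::endl;
    return file.good();
}
```

Keeping the path and value mapping in separate functions lets them be checked without root privileges; only `setIntfMplsSketch` touches the system.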
+ +##### FPM Syncd +FPM Syncd is an existing daemon in the BGP container that monitors NetLink route messages (RTM_NEWROUTE and RTM_DELROUTE) from the SONiC routing stack FPM socket for route and next-hop information. + +New support has been added to FPM Syncd for MPLS to process MPLS related route and next-hop information in the received NetLink messages and propagate this information to the APPL_DB. +###### Functions +The following are new functions for fpmsyncd: +``` + /* Handler for rtnl messages with AF_MPLS route */ + void RouteSync::onLabelRouteMsg(int nlmsg_type, struct nl_object *obj); + /* Handler for rtnl messages with IP and/or MPLS next-hops */ + void RouteSync::getNextHopList(struct rtnl_route *route_obj, string& gw_list, + string& mpls_list, string& intf_list); + +``` + +##### IntfsOrch +IntfsOrch is an existing component of the OrchAgent daemon in the SWSS container. IntfsOrch monitors operations on Interface related tables in APPL_DB and converts those operations into SAI commands to manage the RIF object. + +For MPLS, IntfsOrch has been extended to detect the new per-RIF "mpls" enable/disable attribute in the APPL_DB and propagate this configuration to the ASIC_DB via SAI_ROUTER_INTERFACE_ATTR_ADMIN_MPLS_STATE. This MPLS behavior parallels the existing IntfsOrch behavior of SAI_ROUTER_INTERFACE_ATTR_ADMIN_V4_STATE and SAI_ROUTER_INTERFACE_ATTR_ADMIN_V6_STATE for IPv4/IPv6. +###### Functions +The following are new functions for IntfsOrch: +``` + /* Handler to enable/disable MPLS per Interface */ + bool IntfsOrch::setRouterIntfMpls(const Port& port); +``` + +##### RouteOrch +RouteOrch is an existing component of the OrchAgent daemon in the SWSS container. RouteOrch monitors operations on Route related tables in APPL_DB and converts those operations into SAI commands to manage IPv4/IPv6 route and MPLS in-segment entries. 
Additionally, RouteOrch coordinates next-hop object operations with NeighOrch and converts operations into SAI commands to manage next-hop group objects. + +For MPLS, RouteOrch was modified to monitor updates to the new APPL_DB LABEL_ROUTE_TABLE. RouteOrch translates all updates to LABEL_ROUTE_TABLE into equivalent SAI requests for the SAI MPLS inseg API. +Next-hop processing for updates from both the new LABEL_ROUTE_TABLE and the existing ROUTE_TABLE has been extended to detect possible MPLS attributes and propagate this additional information to NeighOrch for SAI handling. +###### Functions +The following are new functions for RouteOrch: +``` + /* Consumer handler for all events in APPL_DB LABEL_ROUTE_TABLE */ + void RouteOrch::doLabelTask(Consumer& consumer); + /* Handler to process new MPLS route from LABEL_ROUTE_TABLE */ + bool RouteOrch::addLabelRoute(LabelRouteBulkContext& ctx, const NextHopGroupKey&); + /* Handler to process MPLS route removal from LABEL_ROUTE_TABLE */ + bool RouteOrch::removeLabelRoute(LabelRouteBulkContext& ctx); +``` + +##### NeighOrch +NeighOrch is an existing component of the OrchAgent daemon in the SWSS container. NeighOrch monitors operations on Neighbor related tables in APPL_DB. Additionally, NeighOrch coordinates next-hop operations with RouteOrch and converts operations into SAI commands to manage next-hop objects. + +For MPLS, NeighOrch has been extended to send create/remove SAI requests for MPLS next-hop objects (ie, next-hop objects of type SAI_NEXT_HOP_TYPE_MPLS) when associated neighbor objects are created/removed. This MPLS next-hop behavior parallels the existing IPv4/IPv6 next-hop behavior in NeighOrch. +###### Functions +Existing NeighOrch functions are updated to take a NextHopKey parameter instead of an IpAddress, and their visibility is raised to public so that RouteOrch can call them. 
+``` + bool addNextHop(const NextHopKey&); + bool removeNextHop(const NextHopKey&); +``` + +##### CrmOrch +CrmOrch is an existing component of the OrchAgent daemon in the SWSS container. CrmOrch monitors resource usage in the SONiC system and triggers alarms when configurable thresholds are reached. + +For MPLS, CrmOrch has been extended to monitor the number of MPLS in-segment entries and MPLS next-hops against the platform-specific number of entries available. To facilitate this, new CRM resource types of CRM_MPLS_INSEG and CRM_MPLS_NEXTHOP have been added to CrmOrch. +CRM_MPLS_INSEG has been mapped to the existing SAI object type SAI_OBJECT_TYPE_INSEG_ENTRY for querying via sai_object_type_get_availability(). This MPLS behavior parallels the existing CRM_IPV4_ROUTE behavior with SAI_SWITCH_ATTR_AVAILABLE_IPV4_ROUTE_ENTRY and CRM_IPV6_ROUTE behavior with SAI_SWITCH_ATTR_AVAILABLE_IPV6_ROUTE_ENTRY. +CRM_MPLS_NEXTHOP has been mapped to the existing SAI object type SAI_OBJECT_TYPE_NEXT_HOP with SAI_NEXT_HOP_ATTR_TYPE of SAI_NEXT_HOP_TYPE_MPLS for querying via sai_object_type_get_availability(). This MPLS behavior parallels the existing CRM_IPV4_NEXTHOP behavior with SAI_SWITCH_ATTR_AVAILABLE_IPV4_NEXTHOP_ENTRY and CRM_IPV6_NEXTHOP behavior with SAI_SWITCH_ATTR_AVAILABLE_IPV6_NEXTHOP_ENTRY. +###### Functions +No new functions were required for CrmOrch MPLS in-segment entry and next-hop support. + +##### Label/LabelStack +Label and LabelStack are new type utilities of the OrchAgent daemon in the SWSS container. +These types are introduced to represent the MPLS label or label stack when found in an MPLS in-segment entry or next-hop. +``` +typedef uint32_t Label; +struct LabelStack +{ + std::vector