Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CiscoBgp4MIB implementation changes #667

Merged
merged 7 commits into from
Sep 4, 2020

Conversation

SuvarnaMeenakshi
Copy link
Contributor

This document captures the current implementation design for CiscoBgp4MIB and propose new design change required to support multi-asic platform.

SuvarnaMeenakshi and others added 3 commits August 26, 2020 16:10
CiscoBgp4MIB and changes for multi-asic platform.

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
Co-authored-by: Gen-Hwa Chiang <gechiang@microsoft.com>
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
Co-authored-by: Gen-Hwa Chiang <gechiang@microsoft.com>
@SuvarnaMeenakshi SuvarnaMeenakshi changed the title Snmp bgp CiscoBgp4MIB implementation changes Aug 27, 2020
This document captures the current implementation design for CiscoBgp4MIB and propose new design change required to support multi-asic platform.

## Current Design
Snmp docker has two main processes snmpd (master agent), snmp_ax_impl (sub-agent) providing data for some of the MIB tables. Snmp_ax_impl mostly gets data from redis database. For multi-asic platform, changes are made so that snmp_ax_impl connects to all namespace redis databases and provide cumulative result for SNMP query. Currently the data required for CiscoBgp4MIB is retrieved from Bgpd using VTYSH socket. snmp_ax_impl connects to bgpd vtysh via tcp socket and retreives the BGP neighbour information required for CiscoBgp4MIB.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VTYSH [](start = 402, length = 5)

bgpd #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as per comment.

This document captures the current implementation design for CiscoBgp4MIB and propose new design change required to support multi-asic platform.

## Current Design
Snmp docker has two main processes snmpd (master agent), snmp_ax_impl (sub-agent) providing data for some of the MIB tables. Snmp_ax_impl mostly gets data from redis database. For multi-asic platform, changes are made so that snmp_ax_impl connects to all namespace redis databases and provide cumulative result for SNMP query. Currently the data required for CiscoBgp4MIB is retrieved from Bgpd using VTYSH socket. snmp_ax_impl connects to bgpd vtysh via tcp socket and retreives the BGP neighbour information required for CiscoBgp4MIB.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vtysh [](start = 446, length = 5)

daemon #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as per comment.


## Design considerations for multi-asic platform
### Extending current design
In multi-asic platform, SNMP docker will be running on the host. BGP docker will be running per asic, in separate network namespace. If the currrent design has to be extended, then SNMP docker will have to connect to each BGP via TCP or unix socket. As each BGPd is in a different namespace, BGP docker in each namespace can interact with the snmp_ax_impl in host using docker0 bridge. Docker0 bridge has veth pairs with one interface of the pair on the host and another interface(eth0) inside the namespace. BGPd inside the BGP docker in namespace can open TCP socket to listen on eth0 IP address of the namespace instead of localhost. Snmp_ax_impl can then connect to TCP sockets on each namespace and retrieve data. Another option is to use UNIX socket /var/run/bgpd.vty. For TCP socket, Bgpd in each BGP docker will open socket 240.127.1.x 2065 to talk to VTYSH, currently socket is opened on localhost. If Unix socket is to be used, then /var/run/bgpd.vty of each BGP docker can be used by snmp_ax_impl to get the required data. The issue with this approach is that there will be N number of sockets opened for N asic platform. Also, each BGP docker will have to be updated to use the docker0 IP address network to open the socket.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate network namespace [](start = 105, length = 26)

corresponding network namespaces #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as per comment.

In multi-asic platform, SNMP docker will be running on the host. BGP docker will be running per asic, in separate network namespace. If the currrent design has to be extended, then SNMP docker will have to connect to each BGP via TCP or unix socket. As each BGPd is in a different namespace, BGP docker in each namespace can interact with the snmp_ax_impl in host using docker0 bridge. Docker0 bridge has veth pairs with one interface of the pair on the host and another interface(eth0) inside the namespace. BGPd inside the BGP docker in namespace can open TCP socket to listen on eth0 IP address of the namespace instead of localhost. Snmp_ax_impl can then connect to TCP sockets on each namespace and retrieve data. Another option is to use UNIX socket /var/run/bgpd.vty. For TCP socket, Bgpd in each BGP docker will open socket 240.127.1.x 2065 to talk to VTYSH, currently socket is opened on localhost. If Unix socket is to be used, then /var/run/bgpd.vty of each BGP docker can be used by snmp_ax_impl to get the required data. The issue with this approach is that there will be N number of sockets opened for N asic platform. Also, each BGP docker will have to be updated to use the docker0 IP address network to open the socket.

### New design proposal using STATE_DB
To avoid multiple-socket approach, the data required by CiscoBgp4MIB can be populated by a new daemon in STATE_DB. This data can be retrieved by snmp_ax_impl from each namespace for multi-asic platform. Current implementaion of CiscoBgp4MIB in SONiC retrive the below information from the device:
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

daemon in STATE_DB [](start = 95, length = 18)

There is no daemon in Redis database. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified the statement.

2. Neighbor BGP state
Currently snmp_ax_impl parses the data from Bgpd vtysh to get 'neighbor ip' and 'state'.

Proposal is to add a new daemon called 'bgpmond' which will populate STATE_DB with the above infomration.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bgpmond [](start = 40, length = 7)

Suggest name bgpsyncd
@lguohan do you have any concern? #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft
Guohan suggested to use "bgpmon" instead of "bgpmond".
Let me know if you are ok with this name change.
Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated document with the name as suggested.

```
NEIGH_STATE_TABLE {
"<neigh_ip>" {
"State" : "active/connect/idle/etc"
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

active/connect/idle/etc [](start = 19, length = 23)

Could you list all possible States ? https://github.com/Azure/sonic-snmpagent/blob/c9b4210e364ccfe571032ae9a9c605af222005fd/src/sonic_ax_impl/lib/quaggaclient.py#L5 #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft
Will update the doc to enumerate all 7 possible states as following:
"Idle/Connect/Active/OpenSent/OpenConfirm/Established/Clearing"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as per comment.

Currently, NEIGH_STATE_TABLE will be used by SNMP. This table can be used by telemetry or any other docker in future.

### Bgpmond daemon to update STATE_DB
This is the new daemon that runs inside of the BGP docker. It will periodically (every 15 seconds) pull the bgp neighbor information by calling "show bgp summary json" and use the output to update the State DB accordingly. In order to prevent unnecessary update to the State DB, a copy of each neighbor state is cached and used to detect if there are any changes from each newly pulled neighbor state. Only when there is a change, then that particular entry is updated. In a steady state situation, there is rarely a need to update the state DB. If the neighbor is deleted from configuration, the corresponding state DB entry will also be cleaned up.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the [](start = 43, length = 3)

each #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft
Will fix the wording as following: "This is the new daemon that runs inside of each BGP docker..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as per comment.


### Bgpmond daemon to update STATE_DB
This is the new daemon that runs inside of the BGP docker. It will periodically (every 15 seconds) pull the bgp neighbor information by calling "show bgp summary json" and use the output to update the State DB accordingly. In order to prevent unnecessary update to the State DB, a copy of each neighbor state is cached and used to detect if there are any changes from each newly pulled neighbor state. Only when there is a change, then that particular entry is updated. In a steady state situation, there is rarely a need to update the state DB. If the neighbor is deleted from configuration, the corresponding state DB entry will also be cleaned up.
In the future, if there are additional neighbor information that is needed we can add it to this new state table.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, [](start = 0, length = 15)

Why 'In the future'? I think this should be in the initial design and implementation. #Closed

Copy link
Contributor

@gechiang gechiang Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft
Currently SONiC supports "1.3.6.1.4.1.9.9.187 CiscoBgp4MIB : subagent : sonic_ax_impl -- This table has only BGP peer IP address and STATE".
With this PR, we are in parity with the current SONiC support by providing the BGP peer IP and state in the STATE DB.
The documentation mentioned "In the future, if there are additional neighbor information that is needed..." is saying when more specific requirement is raised by future features, we can come back and add additional info to the state DB as needed".
Hope this answered your question. If not, please clarify. Thanks!

Copy link
Contributor

@qiluo-msft qiluo-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As comments

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
Co-authored-by: Gen-Hwa Chiang <gechiang@microsoft.com>
Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
@pavel-shirshov
Copy link

From my perspective it doesn't worth to update the information each 15 seconds.
I'd suggest track changes in bgpd.log, and. if you see any changes, check output.
Otherwise it is cpu consuming to retrieve Nxjsons, and parse them every 15 seconds

@gechiang
Copy link
Contributor

gechiang commented Aug 28, 2020

From my perspective it doesn't worth to update the information each 15 seconds.
I'd suggest track changes in bgpd.log, and. if you see any changes, check output.
Otherwise it is cpu consuming to retrieve Nxjsons, and parse them every 15 seconds

@pavel-shirshov Thanks for the suggestion. I was not very happy with blindly keep requesting for the neighbor state info without knowing there was any changes from previous state gathering but could not think of a good trigger that I can use as to when it is time to inspect the neighbor states. I think what you suggested was a great way to reduce any unnecessary state polling activities. Were you referring to the frr.log file under the following path: "/var/log/frr/frr.log"? A simple check on a cached file modified timestamp should be a good trigger to kick off the state polling. this frr.log file modified time check will be done periodically at 15 seconds cycle and only when there is a change it shall trigger the state polling request to update the corresponding STATE-DB entries of NEIGHBOR_STATE_TABLE

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
Co-authored-by: Gen-Hwa Chiang <gechiang@microsoft.com>
Copy link

@pavel-shirshov pavel-shirshov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: SuvarnaMeenakshi <sumeenak@microsoft.com>
Co-authored-by: Gen-Hwa Chiang <gechiang@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants