Check interface status before starting bgp (sonic-net#19189)
Why I did it
The following PR made bgp start after swss:
sonic-net#12381

However, bgp still starts before interface initialization has finished, as the timestamps below show:

Jun 12 04:53:59.768546 bjw-can-7050qx-1 NOTICE root: Starting swss service...
...
Jun 12 04:54:12.725418 bjw-can-7050qx-1 NOTICE admin: Starting bgp service...
...
Jun 12 04:54:43.036682 bjw-can-7050qx-1 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet0 oper state set from down to up
Jun 12 04:54:43.191143 bjw-can-7050qx-1 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet4 oper state set from down to up
Jun 12 04:54:43.207343 bjw-can-7050qx-1 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from down to up

Work item tracking
Microsoft ADO (number only):
26557087
How I did it
Check the interface status before starting bgp.
The wait times out after about 60 s; a warning message is logged for any interface that is still down.
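
The condition the script polls can also be checked by hand for a single interface. The following is a minimal sketch, assuming a hypothetical interface `Ethernet0` with prefix `10.0.0.0/31` (neither value comes from this commit); the key layout and the `sonic-db-cli STATE_DB HGET` call are the ones used by the script in this change.

```shell
#!/usr/bin/env bash
# Hypothetical interface name and prefix; substitute values from your own config.
iface="Ethernet0"
cidr="10.0.0.0/31"

# Same key layout the wait script queries in STATE_DB.
key="INTERFACE_TABLE|${iface}|${cidr}"

if command -v sonic-db-cli >/dev/null 2>&1; then
    # The wait script treats the interface as ready when this prints "ok".
    state=$(sonic-db-cli STATE_DB HGET "$key" "state" 2>/dev/null)
else
    state="unknown (sonic-db-cli not available on this host)"
fi

echo "${key}: ${state:-<not set>}"
```

An empty result means the `state` field has not been written yet, which is the case the script keeps waiting on.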

How to verify it
Build a debug image, boot it, then check the syslog and the bgp process.

syslog:1098:Jun 3 03:10:30.338071 str-a7060cx-acs-10 INFO bgp#root: [bgpd] It took 0.498398 seconds for interface to become ready
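
When checking several boots, it can help to pull just the measured wait time out of that message. The sketch below parses the sample line quoted above; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Sample syslog line copied from the verification output above.
line='Jun 3 03:10:30.338071 str-a7060cx-acs-10 INFO bgp#root: [bgpd] It took 0.498398 seconds for interface to become ready'

# Extract the number of seconds reported by the wait script.
wait_seconds=$(printf '%s\n' "$line" \
    | sed -n 's/.*\[bgpd\] It took \([0-9.]*\) seconds.*/\1/p')

echo "bgpd waited ${wait_seconds} s for interfaces"
```

On a device, the same `sed` expression can be fed from the syslog file instead of the sample line; the location of that file depends on the image and is not stated in this commit.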
lipxu authored Aug 15, 2024
1 parent 3b27e1c commit 08f8cb6
Showing 4 changed files with 87 additions and 1 deletion.
1 change: 1 addition & 0 deletions dockers/docker-fpm-frr/Dockerfile.j2
@@ -54,6 +54,7 @@ COPY ["TSC", "/usr/bin/TSC"]
COPY ["TS", "/usr/bin/TS"]
COPY ["files/supervisor-proc-exit-listener", "/usr/bin"]
COPY ["zsocket.sh", "/usr/bin/"]
COPY ["bgpd_wait_for_intf.sh.j2", "/usr/share/sonic/templates/"]
RUN chmod a+x /usr/bin/TSA && \
    chmod a+x /usr/bin/TSB && \
    chmod a+x /usr/bin/TSC && \
70 changes: 70 additions & 0 deletions dockers/docker-fpm-frr/bgpd_wait_for_intf.sh.j2
@@ -0,0 +1,70 @@
#!/usr/bin/env bash

# Define global timeout in seconds
GLOBAL_TIMEOUT=60
GLOBAL_TIMEOUT_REACHED="false"

function wait_iface_ready
{
    IFACE_NAME=$1
    IFACE_CIDR=$2
    START_TIME=$3

    # First phase: wait for all interfaces until the global timeout is reached
    while [ "$GLOBAL_TIMEOUT_REACHED" == "false" ]; do
        CURRENT_TIME=$(date +%s.%N)
        ELAPSED_TIME=$(awk -v current_time=$CURRENT_TIME -v start_time=$START_TIME 'BEGIN {print current_time - start_time}')

        # Check if global timeout is reached
        if (( $(awk -v elapsed_time=$ELAPSED_TIME -v global_timeout=$GLOBAL_TIMEOUT 'BEGIN {print (elapsed_time >= global_timeout)}') )); then
            GLOBAL_TIMEOUT_REACHED="true"
            break
        fi

        RESULT=$(sonic-db-cli STATE_DB HGET "INTERFACE_TABLE|${IFACE_NAME}|${IFACE_CIDR}" "state" 2> /dev/null)
        if [ x"$RESULT" == x"ok" ]; then
            return 0
        fi
        sleep 0.5
    done

    # Second phase: apply per-interface timeout
    # Counter to track the number of iterations
    ITERATION=0
    while [ $ITERATION -lt 3 ]; do
        RESULT=$(sonic-db-cli STATE_DB HGET "INTERFACE_TABLE|${IFACE_NAME}|${IFACE_CIDR}" "state" 2> /dev/null)
        if [ x"$RESULT" == x"ok" ]; then
            return 0
        fi

        sleep 0.5
        ((ITERATION++))
    done

    logger -p warning "[bgpd] warning: Interface ${IFACE_NAME} not ready."
    return 1
}

start=$(date +%s.%N)

{% for (name, prefix) in INTERFACE|pfx_filter %}
{% if prefix | ipv4 %}
wait_iface_ready {{ name }} {{ prefix }} $start
{% endif %}
{% endfor %}

{% for (name, prefix) in VLAN_INTERFACE|pfx_filter %}
{% if prefix | ipv4 %}
wait_iface_ready {{ name }} {{ prefix }} $start
{% endif %}
{% endfor %}

{% for (name, prefix) in PORTCHANNEL_INTERFACE|pfx_filter %}
{% if prefix | ipv4 %}
wait_iface_ready {{ name }} {{ prefix }} $start
{% endif %}
{% endfor %}

end=$(date +%s.%N)
timespan=$(awk -v start=$start -v end=$end 'BEGIN {print end - start}')
logger -p info "[bgpd] It took ${timespan} seconds for interfaces to become ready"
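
The two-phase wait above can be tried off-device. The following is a sketch only, not part of the commit: `poll_state` is a hypothetical stand-in for the `sonic-db-cli` query, the global timeout is shortened from 60 s to 2 s, and the sleep is shortened from 0.5 s to 0.1 s. It assumes GNU `date` (for `%N`) and `awk`, as the script itself does.

```shell
#!/usr/bin/env bash
GLOBAL_TIMEOUT=2                 # shortened for the demo (60 in the commit)
GLOBAL_TIMEOUT_REACHED="false"

# Hypothetical stub: sets RESULT to "ok" from the READY_AFTER-th poll onward.
function poll_state
{
    POLLS=$((POLLS + 1))
    if [ "$POLLS" -ge "$READY_AFTER" ]; then RESULT="ok"; else RESULT=""; fi
}

function wait_iface_ready_demo
{
    local start_time=$1 now i=0
    # Phase 1: all interfaces share one global time budget.
    while [ "$GLOBAL_TIMEOUT_REACHED" == "false" ]; do
        now=$(date +%s.%N)
        if (( $(awk -v now="$now" -v start="$start_time" -v limit="$GLOBAL_TIMEOUT" \
                'BEGIN {print (now - start >= limit)}') )); then
            GLOBAL_TIMEOUT_REACHED="true"
            break
        fi
        poll_state
        if [ "$RESULT" == "ok" ]; then return 0; fi
        sleep 0.1
    done
    # Phase 2: once the budget is spent, each interface gets three more polls.
    while [ "$i" -lt 3 ]; do
        poll_state
        if [ "$RESULT" == "ok" ]; then return 0; fi
        sleep 0.1
        i=$((i + 1))
    done
    return 1
}

# Case 1: budget still available, interface ready on the third poll.
POLLS=0; READY_AFTER=3; early_rc=0
wait_iface_ready_demo "$(date +%s.%N)" || early_rc=$?
early_polls=$POLLS

# Case 2: budget already spent, interface never ready within three polls.
GLOBAL_TIMEOUT_REACHED="true"
POLLS=0; READY_AFTER=10; late_rc=0
wait_iface_ready_demo "$(date +%s.%N)" || late_rc=$?
late_polls=$POLLS

echo "early: rc=${early_rc} polls=${early_polls}; late: rc=${late_rc} polls=${late_polls}"
```

The second case illustrates why the total wait is "about" 60 s rather than exactly 60 s: after the shared budget runs out, every remaining interface that is still down adds up to three more polls (1.5 s at the commit's 0.5 s interval) before the warning is logged.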
3 changes: 3 additions & 0 deletions dockers/docker-fpm-frr/docker_init.sh
@@ -11,6 +11,7 @@ CFGGEN_PARAMS=" \
-t /usr/share/sonic/templates/supervisord/critical_processes.j2,/etc/supervisor/critical_processes \
-t /usr/share/sonic/templates/isolate.j2,/usr/sbin/bgp-isolate \
-t /usr/share/sonic/templates/unisolate.j2,/usr/sbin/bgp-unisolate \
-t /usr/share/sonic/templates/bgpd_wait_for_intf.sh.j2,/usr/bin/bgpd_wait_for_intf.sh \
"

FRR_VARS=$(sonic-cfggen $CFGGEN_PARAMS)
@@ -111,4 +112,6 @@ TZ=$(cat /etc/timezone)
rm -rf /etc/localtime
ln -sf /usr/share/zoneinfo/$TZ /etc/localtime

chmod +x /usr/bin/bgpd_wait_for_intf.sh

exec /usr/local/bin/supervisord
14 changes: 13 additions & 1 deletion dockers/docker-fpm-frr/frr/supervisord/supervisord.conf.j2
@@ -76,6 +76,18 @@ dependent_startup=true
dependent_startup_wait_for=zebra:running
{% endif %}

[program:bgpd_wait_for_intf]
command=/usr/bin/bgpd_wait_for_intf.sh
priority=5
stopsignal=KILL
autostart=false
autorestart=false
startsecs=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=zsocket:exited

[program:bgpd]
command=/usr/lib/frr/bgpd -A 127.0.0.1 -M snmp
priority=5
@@ -86,7 +98,7 @@ startsecs=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=zsocket:exited
dependent_startup_wait_for=bgpd_wait_for_intf:exited

{% if DEVICE_METADATA.localhost.frr_mgmt_framework_config is defined and DEVICE_METADATA.localhost.frr_mgmt_framework_config == "true" %}
[program:ospfd]