Issue #266: WR node hangs in endless BOOTP loop, when WR PTP link is not established
dietrichb commented on Feb 2, 2021
In rare cases it might happen that a WR node is unable to establish a WR PTP link. In this case the following symptoms are commonly observed.
node issues a BOOTP request via the network ~once every second
(BOOTP server replies with IP)
when reading the relevant register of the WR core, the IP matches the one from the BOOTP server reply: it seems the reply from the BOOTP server has been received by the node
(I believe the console claims 'in training' for IP)
the node continues issuing BOOTP requests until the WR PTP link is successfully established
I am not sure whether this is an issue with the way the WR core is instantiated on Arria II devices or an issue of the WR core itself. Should we file this issue on OHWR?
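To confirm the pattern from a packet capture, a check along the following lines might help. This is only a sketch: it assumes the timestamps of the BOOTP requests from one node and of the first server reply have already been extracted, and the function name, input format and thresholds (`min_repeats`, `max_gap`) are hypothetical.

```python
def looks_like_bootp_loop(request_times, reply_time, min_repeats=5, max_gap=2.0):
    """Return True if a node keeps sending BOOTP requests after a reply.

    request_times: timestamps (seconds) of BOOTP requests from one node
    reply_time:    timestamp (seconds) of the first BOOTP reply to that node
    min_repeats:   hypothetical threshold, number of closely spaced requests
    max_gap:       hypothetical threshold, largest gap (seconds) that still
                   counts as the ~1 s request cadence described above
    """
    # Only requests sent after the reply are suspicious.
    later = sorted(t for t in request_times if t > reply_time)
    if not later:
        return False

    # Find the longest run of requests spaced no more than max_gap apart.
    run = 1
    longest = 1
    for prev, cur in zip(later, later[1:]):
        run = run + 1 if cur - prev <= max_gap else 1
        longest = max(longest, run)
    return longest >= min_repeats
```

A node that stops requesting after the reply, or that only retries sporadically, is not flagged.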
Issue #256: connection between White Rabbit node and switch unreliable after reboot of WRS
dietrichb commented on Feb 2, 2021
A variety of symptoms is observed when rebooting a WRS to which a WR node is connected. This can happen during maintenance, when the WRS is rebooted on purpose, or when recovering from a power cut.
1. no White Rabbit lock; occasionally; WRS port claims 'WA_MSG' (waiting for message); node is accessible via the network
2. no Ethernet link; rarely; WRS port claims 'link down'; node inaccessible
3. 'hang up'; WRS port claims 'WA_MSG' and node MAC is detected by the WRS; node inaccessible via the network
In all cases, power-cycling the WR node helps
In cases '1' and '2' it is usually possible to recover by 'eb-reset' of the node.
In case '1', forcing a sequence port up->down->up on the WRS helps in some cases
In case '2', forcing a sequence port up->down->up on the WRS does not help
In case '3', the node seems to be almost dead. Access to the node is possible neither from the timing network nor from the host system (no chance for eb-reset). Forcing a port up->down->up sequence on the WRS does not help. Autorecovery of the WR node via the 'watchdog' implemented on the SCU does not work. A power cycle helps.
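The observations above can be condensed into a small lookup table, for example for use in an operator checklist. This is only a sketch: the dictionary keys, the function name and the "least disruptive first" ordering are assumptions for illustration; the yes/usually/sometimes/no values are taken from the comment above.

```python
# Reported effectiveness of each recovery action per symptom case.
RECOVERY = {
    1: {"wrs_port_toggle": "sometimes", "eb_reset": "usually", "power_cycle_node": "yes"},
    2: {"wrs_port_toggle": "no", "eb_reset": "usually", "power_cycle_node": "yes"},
    3: {"wrs_port_toggle": "no", "eb_reset": "no", "scu_watchdog": "no",
        "power_cycle_node": "yes"},
}

# Assumed ordering: try the least disruptive action first.
ORDER = ["wrs_port_toggle", "eb_reset", "scu_watchdog", "power_cycle_node"]


def recovery_options(case):
    """Return the actions reported to help at least sometimes for a case."""
    actions = RECOVERY[case]
    return [name for name in ORDER if actions.get(name, "no") != "no"]
```

For case '3' the only remaining option is a power cycle of the node, which matches the description above.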
Issue #111: WR port not reachable after power cycle of WR switch
dietrichb commented on Dec 15, 2018
symptoms
WRS
port shows MAC and PTP state 6 (looks good)
node
eb-mon shows LINK_UP and TRACKING (looks good)
node not reachable via timing network (all EB requests time out)
when
after reboot or power cycle of the WRS
it may take a few power cycles of the WRS to trigger the bug
workaround
power cycle or restart FPGA using eb-reset
dietrichb commented on Aug 20, 2019
solved for Arria 5 based platforms
requires major work (PHY control update) for Arria II based devices (SCU and VETAR)
Issue #51: WR port of node remains down after power cycle of node AND WR switch
dietrichb commented on Oct 23, 2017
There is an annoying bug that seems to occur when a node (SCU) and a WRS are switched on simultaneously after a power cut.
The symptoms are the following:
PPS LED not blinking, activity LED not blinking, link LED off
eb-mon -v dev/wbm0 shows "LINK_DOWN" and "NO_SYNC"
eb-console dev/wbm0 causes freezing of the ssh shell
node fails to get an IP via BOOTP
(but the WRS shows both "link up" and "activity" LEDs)
node is not accessible via the WR network
resetting the FPGA of the node via its Reset controller is possible and cures the symptom.
Suspicion: The FPGA of the node "boots" much faster than the WRS. It somehow fails to detect "link up" after the WRS starts and remains trapped in the "link down" state.
This issue is a real annoyance in cases where major parts of the facility need to be recovered after a major power cut.
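For automated monitoring after a power cut, the reported states could be matched against the patterns described in these issues. The sketch below only looks for the status strings quoted in the reports ("LINK_DOWN", "NO_SYNC", "LINK_UP", "TRACKING"); the full layout of the eb-mon output is not shown here, so the function name, its arguments and the returned labels are hypothetical.

```python
def classify_node(eb_mon_output, reachable_via_wr):
    """Match captured eb-mon text against the symptom patterns reported here.

    eb_mon_output:    text captured from eb-mon (only substrings are checked)
    reachable_via_wr: True if the node answers on the WR network
    """
    if "LINK_DOWN" in eb_mon_output and "NO_SYNC" in eb_mon_output:
        # Pattern of issue #51: link never came up after a simultaneous power-on.
        return "link down, FPGA reset suggested (issue #51 pattern)"
    if ("LINK_UP" in eb_mon_output and "TRACKING" in eb_mon_output
            and not reachable_via_wr):
        # Pattern of issue #111: status looks good but the node is unreachable.
        return "looks good but unreachable (issue #111 pattern)"
    return "no known pattern"
```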
Maybe this is linked to another issue:
dietrichb commented on Aug 20, 2019
solved for Arria 5
not solved for Arria II (SCU and Vetar)
a fix for Arria II would require a major effort
dietrichb commented on Feb 2, 2021
update (January 2021): in rare cases this is also observed with fallout gateware
There is another issue. If a 'fallout' node locks to White Rabbit, it might lock at different 'positions' within a 4 ns window. Once locked, it will always remain locked at its initial position.
This issue becomes obvious if one compares two timing receivers (time stamping or digital output) using the same signal. The time difference remains identical as long as neither of the two timing receivers is restarted. But after a restart, the time difference might have a different value. This issue seems to be present for all form factors.
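The comparison described above could be automated roughly as follows. This is a sketch under assumptions: both receivers time-stamp the same pulses, the timestamps are available as matched lists in nanoseconds, and the tolerance value is a hypothetical placeholder that would have to be chosen according to the measurement noise.

```python
from statistics import mean


def mean_offset_ns(stamps_a, stamps_b):
    """Mean time difference (A minus B) over matched timestamp pairs, in ns."""
    return mean(a - b for a, b in zip(stamps_a, stamps_b))


def offset_changed(before, after, tolerance_ns=0.5):
    """Return True if the A-B offset differs between two measurement runs.

    before, after: (stamps_a, stamps_b) tuples recorded before and after
                   one of the receivers was restarted
    tolerance_ns:  hypothetical threshold for calling the offsets different
    """
    shift = mean_offset_ns(*after) - mean_offset_ns(*before)
    return abs(shift) > tolerance_ns
```

If the lock position of a receiver changed on restart, the offset between the two runs differs by more than the noise; if both receivers kept running, it should stay the same.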