-
Notifications
You must be signed in to change notification settings - Fork 243
Restructures LLDP service for safer state management #961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We were holding locks too long and doing more structure copying than optimal.
Emit more logging on errors Eliminated the redundant lldp.Start.
| IPv6Static *testIPv6StaticConfig `json:"ipv6_static,omitempty" required_if:"IPv6Mode=static"` | ||
|
|
||
| LLDPMode null.String `json:"lldp_mode,omitempty" one_of:"disabled,basic,all" default:"basic"` | ||
| LLDPMode null.String `json:"lldp_mode,omitempty" one_of:"disabled,enabled,rx_only,tx_only,rx_and_tx,basic,all" default:"rx_and_tx"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enabled seems to be what we default to in the original code as tested against a factory reset
|
|
||
| var defaultLogger = logging.GetSubsystemLogger("lldp") | ||
|
|
||
| type RunningState struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bundles up all the state for either Tx or Rx into a single object to ease ensuring proper running-state management.
| type RunningState struct { | ||
| Enabled bool | ||
| Running bool | ||
| Logger *zerolog.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing the Logger and TPacket as pointers to ensure we don't accidentally miss mutations.
| l: opts.Logger, | ||
| neighbors: make(map[neighborCacheKey]Neighbor), | ||
| onChange: opts.OnChange, | ||
| Rx: &RunningState{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care about the initial enabled state because we're going to call the same SetRxAndTx method as would be used when changing options. This gives a single path by which Rx/Tx is Start/Stop.
Also means we can eliminate the Start() method entirely
| return nil | ||
| } | ||
|
|
||
| if err := l.setUpCapture(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the setup inside startCapture()
| func (l *LLDP) SetAdvertiseOptions(opts *AdvertiseOptions) error { | ||
| l.mu.Lock() | ||
| txRunning := l.txRunning | ||
| logger := l.l |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this gets called because of change in the advertising options, we can't use either the Rx or Tx loggers, so use the top-level LLDP logger.
| l.mu.Lock() | ||
| l.enableRx = rx | ||
| l.enableTx = tx | ||
| l.Rx.Enabled = rx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now executed both when things change and one time at device startup
| } | ||
|
|
||
| func (l *LLDP) setUpCapture() error { | ||
| l.mu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is now executed while already holding the lock inside prepareCapture() and it has setup all the Rx logger and rx TPacket before we get here.
| } | ||
| logger.Info().Msg("created TPacketRx") | ||
|
|
||
| func (l *LLDP) setUpPacketSourceUnderLock(tPacket *afpacket.TPacket, logger *zerolog.Logger) (*gopacket.PacketSource, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now is only tasked with setting up the BPF filter and packet source.
| if l.pktSourceRx == nil || l.rxCtx == nil { | ||
| func (l *LLDP) doCapture() { | ||
| l.mu.Lock() | ||
| l.Rx.Running = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doCapture() now manages the Rx running flag like we did in Tx.
| }() | ||
|
|
||
| if l.pktSourceRx == nil || ctx == nil { | ||
| logger.Error().Msg("packet source or RX context not initialized") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this clause could ever happen, but left it in just in case.
| if err == io.EOF || err == io.ErrUnexpectedEOF || | ||
| err == io.ErrNoProgress || err == io.ErrClosedPipe || err == io.ErrShortBuffer || | ||
| err == syscall.EBADF || | ||
| if errors.Is(err, io.EOF) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is the preferred idiom since 1.3 because it can recognize wrapped errors
| } | ||
|
|
||
| func (l *LLDP) startCapture() error { | ||
| func (l *LLDP) prepareCapture() (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prepareCapture() does all the setup and returns true if the caller needs to call doCapture() (e.g. run the Rx loop).
| if l.rxRunning { | ||
| return nil // Already running | ||
| if l.Rx.Running { | ||
| l.l.Debug().Msg("LLDP receiver already running") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... we haven't created the Rx logger yet, use the LLDP level one.
| runningState := &RunningState{Enabled: true, Logger: &logger} | ||
| runningState.Ctx, runningState.Cancel = context.WithCancel(context.Background()) | ||
| runningState.TPacket = tPacket | ||
| l.Rx = runningState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
monotonic switch the new running Rx state...
| l.tPacketRx.Close() | ||
| l.tPacketRx = nil | ||
| } | ||
| l.Rx.Running = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eagerly clear the running flag because we're about to tear everything down in reverse order.
| ) | ||
|
|
||
| func (l *LLDP) toPayloadValues() []layers.LinkLayerDiscoveryValue { | ||
| func (l *LLDP) toPayloadValues(opts *AdvertiseOptions) []layers.LinkLayerDiscoveryValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing in the options we grabbed under read-lock in sendTxPackets()
| l.tPacketTx = tPacketTx | ||
|
|
||
| logger.Info().Msg("created TPacket instance for sending LLDP packets") | ||
| l.Tx = txState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
monotonically swap Tx running state
|
|
||
| if cancel != nil { | ||
| cancel() | ||
| if previousCancel != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed for clarification that this forces the shut-down of the previously running Tx loop
| txCancel := l.txCancel | ||
| l.txRunning = false | ||
| txCancel := l.Tx.Cancel | ||
| l.Tx.Running = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
techinically we should probably let the defer code in doSendPeriodically() clear this flag, but an early teardown indication isn't going to hurt us here.
| IPv6Static *IPv6StaticConfig `json:"ipv6_static,omitempty" required_if:"IPv6Mode=static"` | ||
|
|
||
| LLDPMode null.String `json:"lldp_mode,omitempty" one_of:"disabled,rx_only,tx_only,rx_and_tx,basic,all" default:"rx_and_tx"` | ||
| LLDPMode null.String `json:"lldp_mode,omitempty" one_of:"disabled,enabled,rx_only,tx_only,rx_and_tx,basic,all" default:"rx_and_tx"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the factory-reset defaults to enabled not any of the other options, so declare it as valid to avoid config validation errors.
| // add remaining expected default routes | ||
| for _, gateway := range expected { | ||
| nm.logger.Warn().Str("gateway", gateway.String()).Msg("adding default route") | ||
| nm.logger.Info().Str("gateway", gateway.String()).Msg("adding default route") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be .Debug() if we wanted less logging.
| loadedConfig.KeyboardLayout = "en-US" | ||
| } | ||
|
|
||
| // fixup old lldp modes enabled and all are now rx_and_tx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do a load-time migration of the configuration from the old legacy values to what we interpret them as now.
|
|
||
| func Main() { | ||
| logger.Log().Msg("JetKVM Starting Up") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setup a top-level panic handler to catch and log unhandled panics so we at least get a log message. We might want to add an reboot() call here to see if we can recover?
| networkLogger.Info().Interface("networkConfig", nc).Str("hostname", nc.Hostname.String).Str("domain", nc.Domain.String).Msg("initializing network manager") | ||
| _ = setHostname(nm, nc.Hostname.String, nc.Domain.String) | ||
|
|
||
| if err := setHostname(nm, nc.Hostname.String, nc.Domain.String); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems bad manners to ignore these errors, so emit them to the log at least.
| }) | ||
| if err := lldpService.Start(); err != nil { | ||
| networkLogger.Error().Err(err).Msg("failed to start LLDP service") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of a distinct Start() method, we can just leverage the SetRxAndTx() function to manage the Rx and Tx loops.
|
|
||
| if err := lldpService.SetRxAndTx(nc.ShouldEnableLLDPReceive(), nc.ShouldEnableLLDPTransmit()); err != nil { | ||
| networkLogger.Error().Err(err).Msg("failed to set LLDP RX and TX") | ||
| networkLogger.Error().Err(err).Msg("failed to update LLDP RX and TX") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disambiguates the "update" from the "initialize" path in network.go
|
I tested this with reboots in every setting (none/rx-only/tx-only/both) and flipping around through the options and it looks like it works correctly without any panics. |
|
@ym, I left this branch running on my personal JetKVM overnight and came back to a logo-only screen and unresponsive to the local IP. Not sure what happened, and in the process of trying to grab logs I broke off the HDMI connector (Doh!!) and now that guy is toast. Sorry. I have another so continuing on with coding/testing... but wanted to warn you that this version might have an issue I'm unaware of the cause. I did have the HDMI sleep enabled, so could be something there unrelated to this change... |
This targets your
feat/lldpbranch, and builds on PR #960 so merge that first or just this oneRefactors the LLDP (Link Layer Discovery Protocol) service to improve state management, resource handling, and lifecycle control for its receive and transmit functionalities.
Summary
RunningStatestruct to encapsulate the operational state, context, logger, and packet resources for both LLDP RX and TX components independently. This change simplifies management, enhances resource cleanup, and clarifies concurrency handling by ensuring locks are properly released.enabled,all, andbasicmodes to their modern equivalents in existing configurations usingrx_only,tx_only, andrx_and_tx.