Diagnostics report for Thread networks #88541

Merged
merged 13 commits into from
Feb 22, 2023

@Jc2k Jc2k commented Feb 21, 2023

Proposed change

To streamline user support involving Thread, this report helps spot or rule out the following:

  • Is the user's border router visible at all? Is one missing? Is there an extra one that the user didn't expect?
  • Is the border router still a valid neighbour, or is it failing IPv6 neighbour discovery?
  • Is the border router actually announcing any routes? The user could have a network boundary, such as
    VLANs or WiFi isolation, that is blocking the RA packets.
  • Alternatively, if the user isn't on HAOS, they could have accept_ra_rt_info_max_plen set incorrectly.
  • Are there any bogus routes that could be interfering? If routes don't expire they can build up. When you have 10 routes and only 2 border routers, something has gone wrong. (See "Support RFC 4191 Route Information (Option 24) for Thread", operating-system#2333 (comment), for the kind of thing we are dealing with.)
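
For the accept_ra_rt_info_max_plen point above, a minimal sketch of how the setting could be checked on a generic Linux host. The helper names and the `root` parameter are hypothetical and not part of this PR, and the threshold of 64 assumes the border routers advertise /64 routes.

```python
from pathlib import Path
from typing import Optional

# Assumption for this sketch: Thread border routers advertise /64 routes,
# so the kernel must accept Route Information Options up to that length.
REQUIRED_PLEN = 64


def read_rt_info_max_plen(interface: str, root: Path = Path("/proc/sys")) -> Optional[int]:
    """Return accept_ra_rt_info_max_plen for an interface, or None if unreadable."""
    path = root / "net" / "ipv6" / "conf" / interface / "accept_ra_rt_info_max_plen"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # The sysctl is missing (kernel built without route-info support),
        # the interface does not exist, or the content is not a number.
        return None


def accepts_thread_routes(plen: Optional[int]) -> bool:
    """True when the configured maximum prefix length admits /64 routes."""
    return plen is not None and plen >= REQUIRED_PLEN
```

Calling `accepts_thread_routes(read_rt_info_max_plen("eth0"))` would give a quick yes/no for one interface; the interface name here is only an example.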

Right now it is extremely hard to help HAOS users because getting the route and neighbour data involves SSH access at the OS level (the SSH addon is not nearly enough).

The report doesn't yet automatically diagnose any faults, but it does provide enough information to help me rule out a border router / network configuration issue, which is about a third of my triaging process.
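
As an illustration of the kind of check that could later be automated from the report data, here is a rough sketch that flags routes whose next hop is not a known border router (the "10 routes and only 2 border routers" case). The function name and the input shapes are assumptions made for this example; the PR itself does not perform this analysis.

```python
from ipaddress import IPv6Address


def unexplained_routes(routes, border_router_addrs):
    """Return routes whose next hop is not a known border router.

    routes: iterable of (destination_prefix, next_hop) string pairs.
    border_router_addrs: iterable of border router IPv6 address strings.
    Both shapes are hypothetical; the real report format may differ.
    """
    # Parse addresses so that differently formatted strings compare equal.
    known = {IPv6Address(addr) for addr in border_router_addrs}
    return [(dst, via) for dst, via in routes if IPv6Address(via) not in known]


routes = [
    ("fd00:1::/64", "fe80::1"),
    ("fd00:2::/64", "fe80::2"),
    ("fd00:3::/64", "fe80::dead"),
]
stale = unexplained_routes(routes, ["fe80::1", "FE80::2"])
```

In this example `stale` contains only the `fd00:3::/64` route, since its next hop matches neither border router.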

This does not do any connectivity checks to end devices. For example, we have cases where an Apple TV that's in a cupboard full of HDMI cables has probably fallen off the mesh, and evicting routes to it from the HA instance will magically fix things. We can't detect that case with this.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

To help with the load of incoming pull requests:

@home-assistant

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (thread) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of thread can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign thread Removes the current integration label and assignees on the issue, add the integration domain after the command.

@Jc2k Jc2k force-pushed the thread_diagnostics branch 2 times, most recently from a1f87ce to 294aa2d Compare February 22, 2023 11:34
@Jc2k Jc2k marked this pull request as ready for review February 22, 2023 13:26
@Jc2k Jc2k requested a review from a team as a code owner February 22, 2023 13:26
@Jc2k Jc2k force-pushed the thread_diagnostics branch from 9b66cdb to 6b84f0c Compare February 22, 2023 13:45
try:
    return value.decode()
except UnicodeDecodeError:
    return None
Member

Should this base64 encode the value prefixed with encode-error:, so we know it's not None?
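
A minimal sketch of what that suggestion could look like. The function name `decode_txt_value` is hypothetical, and this is not the behaviour merged in this PR, which keeps returning None.

```python
import base64
from typing import Optional


def decode_txt_value(value: Optional[bytes]) -> Optional[str]:
    """Decode a TXT record value, keeping undecodable bytes visible."""
    if value is None:
        return None
    try:
        return value.decode()
    except UnicodeDecodeError:
        # Not valid UTF-8: return the raw bytes as base64 behind a marker
        # so the result can be told apart from a genuinely missing value.
        return "encode-error:" + base64.b64encode(value).decode("ascii")
```

With this variant a None result would only ever mean the record had no value, at the cost of showing an opaque string wherever the same decoder is reused.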

Member Author

I only moved what was already there so I could re-use it, so I don't know if there's any reason for the current behaviour. I'm not opposed to changing it - @emontnemery?

Member

ok let's do it in another PR then.

Member Author

@Jc2k Jc2k Feb 22, 2023

I guess I would like that detail in the diagnostic report, but the same decoder is used by the thread panel and it would be weird there. Probably better to show nothing than show the user garbage? Maybe leave as is for now and we can revisit if we see it in practice? Can always include the raw TXT record (in the diagnostics report only) if we start seeing trash border routers?

@Jc2k Jc2k force-pushed the thread_diagnostics branch from 6b84f0c to 7b15281 Compare February 22, 2023 15:46
@balloob balloob merged commit f7bfdfe into home-assistant:dev Feb 22, 2023
@Jc2k Jc2k deleted the thread_diagnostics branch February 22, 2023 16:27
raman325 added a commit to raman325/home-assistant that referenced this pull request Feb 22, 2023
* dev: (60 commits)
  Update frontend to 20230222.0 (home-assistant#88615)
  Add controller support to `zwave_js/subscribe_firmware_update_status` (home-assistant#87348)
  Bump Freebox to 1.1.0 (home-assistant#88609)
  Always include platform in `config/entity_registry/list_for_display` (home-assistant#88601)
  Add dsk option to zwave_js/add_node WS command (home-assistant#87823)
  Update zwave_js FirmwareUploadView to support controller updates (home-assistant#87239)
  Add new zwave_js WS command to parse DSK from QR code (home-assistant#87237)
  Diagnostics report for Thread networks (home-assistant#88541)
  Set default for `hass_config_yaml` fixture to "" (home-assistant#88608)
  Bump reolink-aio to 0.5.0 (home-assistant#88594)
  Bump intents package version; hassil==1.0.5; home-assistant-intents==2023.2.22 (home-assistant#88605)
  Add Reolink update entity (home-assistant#87865)
  Fix cover template: optimistic mode is ignored (home-assistant#87925)
  Fix 500 error when getting calendar events (home-assistant#88276)
  Add clarifying comment about unit of elevation (home-assistant#88489)
  Add ZHA "consumer connected" binary sensor for Xiaomi EU plugs (home-assistant#88194)
  Bump Insteon dependencies (home-assistant#88514)
  Use load_json_object in ecobee (home-assistant#88584)
  Use load_json_object in html5 (home-assistant#88586)
  Improve type hint in homeassistant trigger (home-assistant#88596)
  ...
@github-actions github-actions bot locked and limited conversation to collaborators Feb 23, 2023