Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HassDict implementation #103844

Merged
merged 3 commits into from
May 7, 2024
Merged

Add HassDict implementation #103844

merged 3 commits into from
May 7, 2024

Conversation

cdce8p
Copy link
Member

@cdce8p cdce8p commented Nov 12, 2023

Proposed change

Alternative approach to typing hass.data.
For the previous attempt, see: #96419

Got the idea from aiohttp. For 3.9.0, they've added AppKey to solve a similar problem with typing the app dict.
https://docs.aiohttp.org/en/latest/web_advanced.html#application-s-config

One of the reasons against the initial change was that ideally integrations shouldn't access hass.data directly and the change was to involved for little value.

Even though this diff might look bigger at first, it's actually easier to apply. Most of it are just the correct overloads. If at some point a good design for getter / setter emerges, this change can be easily incorporated or even fully reverted if necessary.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

To help with the load of incoming pull requests:

@cdce8p cdce8p requested a review from a team as a code owner November 12, 2023 12:53
@cdce8p cdce8p requested a review from frenck as a code owner November 12, 2023 12:54
@cdce8p cdce8p marked this pull request as draft November 12, 2023 12:59
@@ -103,17 +104,17 @@ async def async_setup_entry(
async_add_entities: AddEntitiesCallback,
) -> None:
"""Set up AdGuard Home sensor based on a config entry."""
adguard = hass.data[DOMAIN][entry.entry_id][DATA_ADGUARD_CLIENT]
data = hass.data[ADGUARD_HASS_KEY][entry.entry_id]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside for me is that I can't just read the platform code anymore and know what the type is of data. As the type is different for each integration it's also not something you'll likely learn by heart compared to the core or helper interface functions. I'll have to jump around and check the type when reading the integration code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, this is what I'm comparing:

data: AdGuardData = hass.data[DOMAIN][entry.entry_id]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although I would argue it's manageable with good code completion. Also consider that the type (and any attributes) would get type checked and you're less likely to make an error as with type attributes on each individual platform. (Who says it's the correct type that's being assigned in sensor.py?)

Copy link

There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days.
If you are the author of this PR, please leave a comment if you want to keep it open. Also, please rebase your PR onto the latest dev branch to ensure that it's up to date with the latest changes.
Thank you for your contribution!

@cdce8p
Copy link
Member Author

cdce8p commented Jan 12, 2024

This is still active. It just needs a decision if this approach should be explored further.

@MartinHjelmare
Copy link
Member

I'm not a fan of this since it makes reading the code without extra tools than a browser a worse experience.

I'd like to be able to see what type the key in hass.data has stored at the place when accessing the item. Compare when using an object that has a generic type it's possible to replace the generic type with the actual type when the object is a parameter in a function and the actual type is known. Something like that would be much nicer.

@cdce8p
Copy link
Member Author

cdce8p commented Jan 12, 2024

I'd like to be able to see what type the key in hass.data has stored at the place when accessing the item. Compare when using an object that has a generic type it's possible to replace the generic type with the actual type when the object is a parameter in a function and the actual type is known. Something like that would be much nicer.

Do you mean something like #96419? I ultimately didn't pursue this further as it would have required changes for basically every file that wants to use it. Additionally it didn't add much type safety as the types would only be evaluated inside the function context and not across the whole integration.

Compare that to the current proposal where the type is set once per integration with HassKey / HassEntryKey and just the key is reused.

@MartinHjelmare
Copy link
Member

I wouldn't make hass generic. I'd expect it to be hass.data that should be generic. I don't know if it's feasable or not.

@cdce8p
Copy link
Member Author

cdce8p commented Jan 12, 2024

I wouldn't make hass generic. I'd expect it to be hass.data that should be generic. I don't know if it's feasable or not.

Unless I'm missing something, that's what I did there -> make hass.data generic. That just also means the hass becomes generic too then.

HassType was just a workaround to avoid running into issues with disallow_any_generics.

--
Could you give an example of how you would expect to type this? That would help me understand if other options are possible. Take this is as an example, or something else if you have a better one.

async def async_setup_entry(
    hass: HomeAssistant,
    entry: ConfigEntry,
    async_add_entities: AddEntitiesCallback,
) -> None:
    """Set up Elgato button based on a config entry."""
    coordinator: ElgatoDataUpdateCoordinator = hass.data[DOMAIN][entry.entry_id]
    ...

@MartinHjelmare
Copy link
Member

MartinHjelmare commented Jan 12, 2024

Something like this, maybe.
Edited

hass.data: hass.data[DOMAIN, ElgatoDataUpdateCoordinator]
coordinator = hass.data[DOMAIN][entry.entry_id]

As a reader I don't need the DOMAIN type, but maybe that is needed so that the type application can check that ElgatoDataUpdateCoordinator is a valid value. If there's a shorter syntax that would be nice.

@cdce8p
Copy link
Member Author

cdce8p commented Jan 13, 2024

Something like this, maybe.

hass.data: hass.data[DOMAIN, ElgatoDataUpdateCoordinator]
coordinator = hass.data[DOMAIN][entry.entry_id]

Unfortunately, it's not possible to add annotations to attribute assignments. So you'd need to assign hass.data to a temporary variable first. Something like this.

data: HassData[ElgatorDataUpdateCoordinator] = hass.data
coordinator = data[DOMAIN][entry.entry_id]

IMO, this doesn't add much value over the current solution with

coordinator: ElgatoDataUpdateCoordinator = hass.data[DOMAIN][entry.entry_id]

--
I'm mainly trying to address two issue here:

  1. Code duplication: That might be minor, but it's currently necessary to add the annotation coordinator: ElgatoDataUpdateCoordinator = ... for each platform setup.
  2. Improved type safety, especially for code changes. Say instead of the Coordinator, the integration is changed to store a TypedDict with different values. Type assignments, especially if the RHS is Any, aren't type checked. So Mypy wouldn't notice that all async_setup_entry functions are now broken.
    However, if the type is stored together with a HassKey like in this PR, you'd first get an error if you don't change the HassKey type due to an incompatible assignment, and later on arg-type errors as you no longer pass a coordinator instance but a TypedDict to the ElgatoEntity.__init__.

I do understand your readability concerns, but please consider that it's only easy in some cases to understand the type currently. If it's a TypedDict, you'd still have to look up the model to know all the attributes and types. Additionally there are also examples where the current form doesn't really work and is thus omitted, giving not type information at all. Finally, if a function has type annotations, and is therefore type-checked, Mypy would emit errors if an attribute doesn't exist or the types don't match.

Copy link

There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days.
If you are the author of this PR, please leave a comment if you want to keep it open. Also, please rebase your PR onto the latest dev branch to ensure that it's up to date with the latest changes.
Thank you for your contribution!

@github-actions github-actions bot added the stale label Mar 13, 2024
@cdce8p cdce8p added no-stale and removed stale labels Mar 13, 2024
@cdce8p cdce8p mentioned this pull request Mar 27, 2024
20 tasks
@cdce8p cdce8p marked this pull request as ready for review March 27, 2024 23:40
@cdce8p
Copy link
Member Author

cdce8p commented Mar 27, 2024

This idea is now feature complete and fully working as intended with both Mypy and pyright (Pylance). I believe it represents a worthwhile improvement to our codebase and will help to catch issues with little additional effort.

The downside of this approach is that the key itself is now an object, and so if we ever see garbage data in hass.data, we wouldn't know which object generated it.

I've added string compatibility for the HassKey types. I.e. for debugging we'd now get the nice dataclass __repr__ while preserving the backwards compatibility to regular keys.

I'm not a fan of this since it makes reading the code without extra tools than a browser a worse experience.

I'd like to be able to see what type the key in hass.data has stored at the place when accessing the item.

That is a valid concern and the only small downside IMO. As the data type is stored with the key itself, it would now be necessary to look it up first. This can be mitigated by code editors, e.g. for VSCode it's as easy as hovering over the variable to see its type. There is also an option to inlay the type itself: python.analysis.inlayHints.variableTypes: true.

IMO it's wrong it compare it to the current approach of data: AdGuardData = hass.data[DOMAIN][entry.entry_id]. Nobody is guaranteeing that this is actually the correct data type. At best it's a promise that this should be it, at worst it's a lie for the type checker. See my comments earlier for more details on this and related approaches which have similar issues.

It also neglects the fact that often times data would just be untyped and thus interpreted as Any. In the end the developer has to remember to add the annotation to each call and keep it in sync if the type changes.

The approach outlined in this PR has the big advantage that it's fully type checked across all uses. Not only for accessing a value but also for storing it. All the while only requiring one type annotation that needs to be updated if anything changes.

@cdce8p cdce8p removed the no-stale label Mar 27, 2024
@bdraco
Copy link
Member

bdraco commented Apr 7, 2024

It looks like VScode can work out the typing now 👍 That addresses my other concerns with this.

Screenshot 2024-04-06 at 2 17 12 PM Screenshot 2024-04-06 at 2 17 08 PM

If we can solve the run time performance issue, this makes sense to me.

@@ -738,7 +738,6 @@ async def test_integration_only_setup_entry(hass: HomeAssistant) -> None:
async def test_async_start_setup_running(hass: HomeAssistant) -> None:
"""Test setup started context manager does nothing when running."""
assert hass.state is CoreState.running
setup_started: dict[tuple[str, str | None], float]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do really like how all this redundant code disappears

Copy link
Member

@bdraco bdraco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good from my perspective now. There are still other reviewer's comments that are outstanding. Its likely going to need some developer docs explaining how to use it.

@MartinHjelmare
Copy link
Member

We've discussed this PR in the core team.

We think the general idea is ok, but we'd like a specific solution for config entries first that stores the data on the config entry. We want it first since we expect a big influx of migration PRs for integrations when this is merged.

The idea for config entries is to make the config entry generic and allow the integration to set what object type is stored.

@MartinHjelmare MartinHjelmare marked this pull request as draft April 15, 2024 15:09
@MartinHjelmare MartinHjelmare removed the second-opinion-wanted Add this label when a reviewer needs a second opinion from another member. label Apr 15, 2024
@cdce8p
Copy link
Member Author

cdce8p commented Apr 15, 2024

We've discussed this PR in the core team.

We think the general idea is ok, but we'd like a specific solution for config entries first that stores the data on the config entry. We want it first since we expect a big influx of migration PRs for integrations when this is merged.

The idea for config entries is to make the config entry generic and allow the integration to set what object type is stored.

Thanks for the feedback! I tried to implement an MVP in #115669 just to get a feel for it. Also opened #115668 to get the unrelated AdGuard changes out of the way. Will rebase both PRs afterwards.

I left some implementation comments on the other PR but overall I'm personally not fully convinced yet that it would be truly better to store the data inside the ConfigEntry. It somewhat seems to me like a solution in need of a problem.

@cdce8p
Copy link
Member Author

cdce8p commented Apr 23, 2024

I've been testing HassKey / HassEntryKey for some more integrations locally. Just to illustrate, this actually helped identify a bug after the data structure was changed during a refactor. The tests didn't catch it as the data itself was mocked.

Side note: The system health info isn't bound to an entry. So it seems we still need to keep hass.data around for some more cases.

@cdce8p
Copy link
Member Author

cdce8p commented Apr 29, 2024

Reverted the AdGuard changes here as we're going with the entry.runtime_data approach for that.

@zifiri99

This comment was marked as spam.

Copy link
Member

@frenck frenck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! With the addition to storing runtime data in the config entry, this one can continue now as well 👍

Thanks, @cdce8p 👍

../Frenck

@frenck frenck merged commit 3d700e2 into home-assistant:dev May 7, 2024
38 checks passed
@cdce8p cdce8p deleted the hass-key branch May 7, 2024 09:33
@github-actions github-actions bot locked and limited conversation to collaborators May 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants