-
-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Typing updates #2252
Typing updates #2252
Conversation
3800c18
to
e5f03c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fundamentally, I'm in favor of improvements that systematically ensure that our type annotations remain correct and consistent with reality.
However, I'm not immediately and obviously convinced what this PR is doing is the right approach. I'm willing to be convinced otherwise, and I'll openly admit that some of my hesitation may stem from not fully understanding why/how mypy is doing what it's doing.
- Some of the import oddities. "import Pack as Pack"... I'm not wild about the idea of requiring syntactically weird constructions to keep the type checker happy. I might be willing to live with this if it was
- As flagged inline, the need to test backends as well as dummy.
- The fact that we need to manually maintain a "type coverage test suite" without any guidance from automated tooling. Essentially there's no automated failure criteria triggered around adding a new API that isn't accompanied by a type test.
Part of my hesitation on item (3) is that I'm not clear what this typing suite is doing that the actual unit test suite isn't doing. The combination of the core test suite and the backend test suite is invoking every API endpoint, and every branch - that doesn't necessarily guarantee that every possible type input is being tested, but it should be pretty close. Why can't we use mypy on the test suite as compliance condition?
core/src/toga/style/__init__.py
Outdated
@@ -1,2 +1,2 @@ | |||
from toga.style.applicator import TogaApplicator # noqa: F401 | |||
from toga.style.pack import Pack # noqa: F401 | |||
from toga.style.pack import Pack as Pack # noqa: F401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What am I missing here? This reads like a no-op renaming...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh...i should have just left this out until later. This is just something stricter mypy
gets upset about because toga.style
is not the owner of this object. So, when other projects import Pack
from toga.style
, it is considered an implicit re-export. I imagine the concern is that since toga.style
doesn't own Pack
, it could be renamed and the symbol would disappear from this namespace potentially without really understanding the consequence (although, its a bit obvious in this example). For instance, someone could have even aliased the new name of the object to Pack
back over in toga.style.pack
but maybe they won't here. So, doing an explicit re-export via as Pack
helps explicitly avoid that situation.
This is one of mypy's stricter modes that is enabled via --no-implicit-reexport
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - Ok. That makes sense; and oddly, I'm actually more OK with it given that's the reason, as it requires us to beexplicit about thing we're re-exporting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The --no-implicit-reexport
docs say this can also be done using __all__
, which seems like a less confusing and more explicit solution.
core/tests/api_interface/images.py
Outdated
|
||
import toga | ||
|
||
assert_type(toga.Image().size, tuple[int, int]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this will be sufficient. In many cases, type checks on core will result in type checking the behavior of the dummy backend, and while the dummy should definitely do the right thing, it's not what most people will be touching in practice, so checks at the runtime level would seem necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if I understand correctly, this is only making assertions about the type hints themselves, but not whether those hints are actually being followed. It seems like a lot of work for no clear benefit to add hundreds of assertions that essentially say "an argument or return type hinted as X is hinted as X", but don't add any additional confidence that an X will actually be accepted or returned by the code when it's run.
Your time might be better spent moving towards full mypy compliance in some parts of the repository, though I admit I'm ignorant about exactly what that would involve. The fact that Toga is split into semi-independent core, backend and testbed projects might make it a bit more approachable than a monolithic project like Briefcase.
In particular, a quite common class of bugs is where the public API in the core returns an object it got from the backend, but the backend doesn't comply with the public hint (e.g. #1779). It would be nice if things like that could be caught by a static type checker, rather than needing an isinstance check next to every equality check in the testbed. This would require formalizing the core / backend interface, which is mostly duck-typed at present, with a set of type-hinted base classes, which is probably a good idea even if we don't end up running mypy at all.
These are good points. I would like to focus on the abstract goals, though, instead of my quick POC badly attempting some sort of an implementation. So, let's assume the type hints for the actual API (instead of dummy or something else) can be effectively evaluated for this discussion. GoalEnsure Toga provides a usable and accurate typing experience for users. Measuring achievementThis goal would be satisfied when a user application exercising arbitrary components of Toga allows mypy to run to success regardless of strictness settings (where possible and pragmatic - StrategyToga compliance with mypy
These points are obviously valid; my primary contention with them, though, is the burden it bestows on Toga development. Writing Python that's compliant with mypy (even without the strict settings) can be a frustrating experience. This is especially true if you consider some of the more polymorphic code in Toga. Additionally, if you want to encourage outside developers to contribute, typing has a particularly rough learning curve and contributions may be ultimately abandoned because of it. That said, a combination of mypy compliance in the source and in the tests could achieve the goal. And it's important that we consider running mypy against actual usage of Toga (by evaluating the tests as well) because we want to confirm the actual typing is not just internally consistent but is actually what the API should advertise. In many cases, an API's compliance with mypy with satisfy both of these....but it is also possible to satisfy mypy with typing that doesn't reflect desired usage. This is more likely in complicated typing involving Generics, for instance. Toga usage compliance with mypyThis is effectively what I'm proposing; that when Toga is used, the APIs satisfy mypy even if Toga doesn't itself. It is essentially a subset of the solution space of "run mypy over the tests" but it wouldn't require all of the tests to satisfy mypy.
This is a good point; in some of my examples in my POC implementation, the assertion is trivial and isn't likely to provide enough value to warrant implementing hundreds of them. (Although, I will note it provoked fixing However, it is also not necessarily the goal of this type checking to ensure the code complies with it. This is part of the balance I talk about in the next section. ImplementationI see levels, I suppose, of implementation options. They all come with their own trade offs (some mentioned above) and ultimately, this may come down to finding the right balance of achieving the fullest heights of the goal with the pragmatic situation of reality and existing tooling. Un-automated and ad-hoc type testingI would consider this the 80/20 solution. Basically, you write a quick Toga app implementing all (or at least the complicated) APIs and ensure it can pass mypy. This is, more or less, how I've come to proposing a solution at all; I will have a Toga app open in PyCharm and I start seeing squiggles under my variables because the typing isn't right. If nothing else, such an approach would head off many of the issues users would create. Moreover, though, a related concern is that users don't report them and instead, Toga just has a lower reputation for them. Automated type testing without known coverageIncorporate the idea above in to CI. I haven't seen any tooling that would allow us to know if we're actually testing everything. However, if that was the goal, then I think the recommendation would just be "run mypy on your code". So, I would consider the intention here to make sure sensitive or error-prone code is being exercised. Toga type checkingThis is the "true" solution of just getting Toga to pass mypy...which I've discussed. Other comments
Agreed. My approach here is more of an interim solution; however, it'd only be interim if someone actually does all this work to type Toga soup to nuts at some point. I'm mostly targeting the immediate user experience of trying to use Toga with mypy or even in a modern IDE. To that end, if you don't have Pylance enabled in VS Code, I would recommend turning it on so at least the type issues it can find are elevated at dev time. Final ThoughtsAt the end of the day, this is not a silver bullet or even a pretty stopgap. The best experience for typing with Toga would be facilitated by fully implementing typing within Toga and ensuring the code and its tests pass mypy. That is no small order, though. So, this is mostly trying to find a middle ground where at least some of the typing for the API's interface can be guaranteed and user ergonomics for implementing type checking in their projects using Toga is as painless as possible. |
e5f03c7
to
bf3e4b5
Compare
To be clear - dummy is a valid backend - it's just not a backend most people will use in practice. It should match the same typing interface as a "real" backend.
Agreed - although I'd qualify that slightly by saying that while typing can be a useful mechanism for providing IDE hints and identifying some classes of bugs, it's not a panacea. I'm interested in improving typing to the extent that it improves the lived experience of Toga developers (both those developing with Toga, and those developing Toga itself). I'm not especially interested in expending excessive effort just so we can metric of "type purity". Practicality beats purity.
+1. +2 to the "pragmatic" part :-)
+1
It might be a lack of imagination on my part, but I'm having difficulty working out what this would look like. If Toga's test suite (core + testbed) is exercising API and every code path, I'm not sure I see how the tests could satisfy mypy if Toga doesn't.
I can see how this would give us a quick win (or at least let us quickly identify problems that exist right now). My concern would be maintaining this sample app over time. I'm not wild about the idea of a new project that exists purely for testing purposes that isn't executed as part of automated testing except as part of mypy. However, I wonder if there's a two-birds-one-stone solution here: We have toga-demo, but it is... not a very good demo :-) A much better demo would be a "widget gallery" - a live demonstrator of every widget that Toga contains (or, at least, a placeholder that says "this widget doesn't exist". This app could be uploaded to app stores, or made available for download. If it also gives us an opportunity to do a practical type check... ? The reason we haven't historically done this is that we don't have a widget to use for the top-level navigation of such an app. We need the ability to display a list of widgets to demonstrate, which links to a page that demonstrates that widget.
I'm not sure I understand why these two are in this order in terms of complexity. I would have thought internal type completeness in Toga would have been easier, as the external type surface isn't being tested.
I'm definitely up for a "gradual improvement" approach. My concern is around ensuring that the approach we take is "self-sustaining" - that it doesn't involve adding something to a test suite that then starts to bitrot because there's no reason to ensure that it remains current and accurate. |
I think I should clarify my position here. I am saying 1) my stated goal can be achieved with only type completeness of Toga's API surface and 2) this is far more feasible than type completeness of Toga, at large. The primary piece of prior art for this strategy is Python's typeshed itself. It provides type completeness for the API surface of many packages and allows downstream developers to implement them with total typing compliance of their own code. This is, of course, while many of the packages represented by the typeshed are not typed themselves at all. Therefore, as made evident by typeshed, type completeness of an API surface is an abstract subset of type completeness for the entire API (and its supporting code base). When a developer runs mypy on their code that's using Toga, mypy doesn't recursively review the internals of all function calls to check if they are internally consistent about the types they are declaring at their API surface; mypy takes the declared types and assumes they are correct. Finally, retroactively achieving type completeness of a large code base long in to its existence is likely to be a perilous mission. When you write Python code without typing in mind from the get go, you're likely to end up with code that isn't compatible with type checking via static analysis. A good example is Briefcase; I never finished typing all of Briefcase because it strongly leverages composable classes via mix-ins, incompatible subclassing, and several of the important data structures are loosely defined. So, I would have had to create all of these awkward At any rate, I feel confident saying achieving type completeness of Toga, in general, would be a significant task...and also one I am unlikely to take on. To get a taste of this, just run mypy on select files in Toga, e.g.
Yeah; I understand this as a competing priority. I think the reality is, unfortunately, that there will be bitrot either way; if we don't verify or test the API surface types, they will bitrot...and if we do verify them with some sort of awkward pseudo-test app that won't have any coverage, it will bitrot. We're then full circle with "make Toga type complete" which I think is likely to be a task of dubious merit. I do like the idea of the demo app potentially serving a dual purpose to verify Toga's API typing. However, I don't think I'm keen to take that on as part of this. So, given all this, I can probably commit, at least, to a survey of Toga's current API surface types with the aim to:
|
This all makes sense. I hadn't considered Typeshed as a point of comparison, and yeah - I can now see how an external API surface can be type complete without the internals being complete.
And that's my general criticism of typing as an approach. There are times where it's very helpful. There are times when it is in direct conflict with the flexible-duck value proposition that is a core of Python's appeal.
That seems like the best way forward to me. There's definitely stuff we can fix right now, but we don't need to swallow the full mypy elephant in one sitting; and we have a goal we can work towards in terms of a demo app (and any other areas that might be identified as part of a once-off type audit) that might provide a long term way to make mypy palatable. |
These all seem like worthwhile goals. You've probably worked this out already, but the type hints currently in the API were written mainly for the purpose of generating the documentation rather than being logically rigorous. In fact, there were a couple of times where I deliberately made the hints looser because the "correct" form was unreadable in the docs. Type aliases are supposed to be the solution to this, but Sphinx doesn't seem to support them very well (e.g. #1984 (comment)). |
Thank you for bringing up docs; I hadn't properly included them in my calculus. I will definitely incorporate them as part of the pragmatism of this implementation. |
Want to mention this now that I've seen how fickle Sphinx is with rendering TypeAliases. It turns out that Sphinx is very sensitive to how the TypeAlias is defined. Notably, a TypeAlias that's used in an API signature cannot use forward references or mixed definitions. That is, "forward references" will not work: ActionHandler: TypeAlias = "ActionHandlerType1 | ActionHandlerType2"
class ActionHandlerType1(Protocol): ...
class ActionHandlerType2(Protocol): ... You need to ensure ordering to avoid the forward references: class ActionHandlerType1(Protocol): ...
class ActionHandlerType2(Protocol): ...
ActionHandler: TypeAlias = "ActionHandlerType1 | ActionHandlerType2" Furthermore, though, the TypeAlias definitions cannot be robust, if you will, if expressed as a string; for instance: T = TypeVar("T")
class ActionHandlerType1(Protocol[T]): ...
class ActionHandlerType2(Protocol[T]): ...
ActionHandler: TypeAlias = "ActionHandlerType1[T] | ActionHandlerType2[T]" I was only able to get this to work as: T = TypeVar("T")
class ActionHandlerType1(Protocol[T]): ...
class ActionHandlerType2(Protocol[T]): ...
ActionHandler: TypeAlias = Union[ActionHandlerType1[T], ActionHandlerType2[T]] Therefore, TypeAliases can be reliably rendered (ostensibly anyway) in the docs if avoid forward references and use explicit Unions. P.S. You also must build the docs using Python 3.10 or later....which we always do anyway via |
4df6fdf
to
cb569ba
Compare
tox.ini
Outdated
skip_install = True | ||
deps = ./core[dev] | ||
commands = pre-commit run --all-files --show-diff-on-failure --color=always | ||
|
||
# The leading comma generates the "py" environment. | ||
[testenv:py{,38,39,310,311,312}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: these tox envs can't be run in parallel...or at least not reliably. the simultaneous pip install
commands might be an issue but I know that coverage combine
is definitely an issue.
That specific exercise smells a bit like a bug to me. However, this is also getting to the point where I'm not sure we're actually gaining anything by adding strong type declarations. If Toga's types are useful as a way to pass edge cases upstream to the people working on typing to diagnose bugs, then that's great - but I'm not wild about the idea of putting typing's bleeding edge on our critical path. |
4ac5de4
to
3eae7c1
Compare
These are all Sphinx bugs; there are several logged issues I found related to TypeAliases, forward references, etc. However, if you avoid using all these things together and just stick to basic typing and assignments, then this all works as expected for documentation; given that, I wouldn't necessarily consider this bleeding edge, per se....perhaps just a bit rough around the edges.
Well, I think we are gaining something (other than typing purity, that is); we are getting useful and accurate typing for handlers...which was one of the primary impetuses for me to do this (i.e. #2192). I'd also argue this use case isn't really an edge case; right now, the typing for handlers doesn't support async (and, less importantly, generators). More ideologically, though, I would put forth that inaccurate typing is worse than no typing. Most IDEs are easily configured (some even out-of-the-box) to surface typing issues for Python to the developer. Therefore, if Toga is going to type its public API, I think it should be accurate. Now...I do not intend this to mean that every type has to be accurate to the extent of the true strictness of the underlying code...but it should still be accurate in its permissiveness. For instance, if we consider handlers, I've implemented an accurate (and complete) typing for them (that's also not overly burdensome, IMHO). However, if we think this goes too far with typing, then we should just type them as (meta: we're starting to enter the "pragmatism" debate of this PR :) so, with that, I would like to try to establish some guiding principles to help us balance Toga's typing. Hence, I'm putting forth that irrespective of how "true" a particular type is, its permissiveness should always be accurate.) |
+1
Agreed. Elaborating further on this, I'd give high level goals of:
Balancing the relative priorities of those will be a bit of a balancing act, but they're at least principles to start with (or, at least, principles to start a discussion about on the way to a final set :-) |
ff85208
to
1fad2cc
Compare
Note about the current state: You may notice changes that aren't particularly in line with the stated goal of "typing the API surface of Toga". I'm currently experimenting with different functionality of mypy but I will most likely target an end state that will not include functional changes to code or |
1fad2cc
to
beb7384
Compare
a326c0e
to
a178808
Compare
I spent some time with My goal was to add typing generics to these classes so static code analysis could know the arbitrary types of the data inside If we consider
If you try to union these types, the type of That said, if we ignore the "iterable of objects" case, then using A similar issue exists for Alternatively, I suppose, we could re-implement their designs.....and instead, perhaps, use specific constructors based on the type of input. These would obviously be pretty disruptive as well as a worse user experience... |
core/src/toga/sources/tree_source.py
Outdated
if TYPE_CHECKING: | ||
if sys.version_info < (3, 10): | ||
from typing_extensions import TypeAlias | ||
else: | ||
from typing import TypeAlias | ||
|
||
NodeDataT: TypeAlias = Union[object, Mapping[str, T], Iterable[T]] | ||
TreeSourceDataT: TypeAlias = Union[ | ||
object, | ||
Mapping[NodeDataT[T], "TreeSourceDataT[T]"], | ||
Iterable[Tuple[NodeDataT[T], "TreeSourceDataT[T]"]], | ||
] | ||
else: | ||
TreeSourceDataT = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a note, we end up with weird TYPE_CHECKING
blocks like this if we continue managing typing_extensions
the way. That is, by including it as a dev
requirement, we can't freely use TypeAlias
(or anything else we'll need from it) and as a consequence, we can't assume anything defined as a TypeAlias
is available elsewhere for importing.
Interestingly, I didn't know I had all these import issues until I actually tried running a real app with Python 3.9. Neither the unit tests or testbed exercise in CI caught it.
The alternative would be to add typing_extensions
as a full requirement....but I definitely understand the reluctance there.
Just as a further note, the design of typing_extensions
basically lets you use it exclusively. The devs simply alias the backported classes to the real deal once youre on a supported Python. Furthermore, though, they backport features of classes; so, TypeVar
, for instance, has features in 3.12 that do not exist in stdlib in 3.10. But if you use the TypeVar
from typing_extensions
, you get those features even on 3.10.
Finally, pyupgrade supports typing_extensions
and will automatically revert its use to typing
when only appropriate Pythons are supported.
Mostly just wanted to lay this out in one place.
FWIW: I'm not fundamentally opposed to some redesign here if it will help - however, I suspect this is just one of those areas where we see the hard edges of strong typing. We're deliberately exposing an interface that is flexible, which is the exact opposite of what strong typing wants to do. |
At this point, I'm thinking I'm just going to remove the |
e3c3adc
to
8e1170b
Compare
The typing updates to the code are complete at this point; I removed the attempt to make |
9d8152a
to
76da8fe
Compare
I reviewed all of the |
Looks like the RTD problem is because Microsoft's web servers are either being flaky, or they've blackholed a URL we were using.... |
76da8fe
to
9145b98
Compare
Just saw this in that last re-run for Windows...curious to see a stacktrace in the tests....
|
Hrm... that's a new one for me - haven't seen that before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My brain is bleeding after going through all this... but AFAICT, it's all fairly straightforward type updates (and occasional code cleanups, I'm guessing some of which triggered type checks).
I reiterate my original concern that these types will drift to being incorrect over time without automated checking, but I also acknowledge that automated checking will only serve to make the types even more incomprehensible, so I guess we'll just need to live with the occasional "type fixes" PR when someone notices a problem.
The two things of substance I noticed in my review:
- The documentation links for IconContentT and ImageContentT (et al) seem to be broken - the types are documented, but the hyperlinks aren't working
- The type annotation for SplitContainer.container is
tuple
... which doesn't seem right.
However, neither of these are new problems - the existing docs aren't hyperlinking IconContentT, and you've only corrected the types (and docs) for SplitContainer relative to what is there. Given how much effort it will be to keep this PR up to date, I'm going to merge it as is, and we can deal with those problems as follow ups in the future.
def _create_impl(self): | ||
return self.factory.App(interface=self) | ||
def _create_impl(self) -> None: | ||
self.factory.App(interface=self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All that's an interesting oddity... I've never noticed that we're not actually using the result of the factory call, relying on the factory to set the attribute on the interface.
Not something we need to fix here, but I'll remember to clean this up when I'm landing the #2244 changes.
As far as I think I've ultimately landed on that for a lot of this a custom sphinx extension will be necessary to really achieve the desired balance between accurate types and comprehensible documentation. |
Yeah - looking closer at the problem refreshed my memory; I think you're right on both counts. It's still worth logging as an issue in case someone is motivated to look into it - see #2638. |
Changes
ListSource
andTreeSource
are not genericDetailedList
,Table
, etc.Protocol
s will exist for the backend implementations for each widget, window, etc.Notes
Any
vsobject
KI
Original PR body
## Summary This is a proposal to introduce better typing ergonomics for users and ensure typing accuracy of the Toga API.Strategy
Introduce tests that specifically exercise the types for the API's interface.
These tests are a little different than what pytest does; they aren't runnable code - instead, they simulate user code and ensure that mypy thinks the API conforms to the expected types.
For instance, this test confirms
src
accepts astr
andsize
returns atuple
of twoint
s:Benefits
Considerations
src
fortoga.Image(src="...")
should accept aPath
....but running mypy over code usingtoga.Image
can verify thatThoughts, feedback, criticism welcome before I move forward
PR Checklist: