Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin extensions to enable robust checking for unawaited coroutines in Trio programs #5650

Open
njsmith opened this issue Sep 20, 2018 · 1 comment
Labels
feature priority-1-normal topic-plugins The plugin API and ideas for new plugins

Comments

@njsmith
Copy link

njsmith commented Sep 20, 2018

Forking this off from #2499, as requested by Jukka. The discussion is about how Trio programs can use mypy to detect when they've accidentally forgotten an await, which is a fairly common problem. (See also the corresponding Trio bug, which has examples: python-trio/trio#671)

What makes Trio special here is that by design, it's extremely simple for a static analysis tool to detect when an await is missing: when you see a function that returns an Awaitable, then you should always see an await immediately there to consume the Awaitable; if you don't, that's a mistake. This is a much simpler check than what you need to detect this mistake in asyncio programs (which is what #2499 is about), but it's also of less universal applicability, so Jukka suggested it should be a plugin.

This check needs to look at each call site, and get two pieces of information: (1) does this call return an Awaitable?, (2) if so, is it the direct child of an Await node?

The existing function/method hooks only let you define one hook per call site, and are used to determine the call return type in complicated cases (e.g. open). So logically our new check ought to run after those hooks, and certainly doesn't want to replace them.

Jukka suggested adding a new type of "check" hook that runs at that stage, and can report issues but doesn't mutate types.

I was originally thinking that this check might actually want to mutate types. Rationale: If someone writes a = some_async_func(); a.some_method(), then there are potentially two errors you can report: (1) missing await in front of some_async_func; (2) coroutine object has no attribute some_method. Obviously the first error is the real one; the second one is just shrapnel from the first mistake, and maybe you don't even want to print it, because it's just going to distract the user from the real mistake. So maybe to get the highest quality error messages, after reporting the missing await you might want to coerce the call expression's type to make it act like the await was actually there, so that follow-up errors are more useful. OTOH, I realized that if for some reason you really did want to skip an await (e.g., because you're running this check over trio's internals, which do have to create bare coroutine objects in a few places), then you need some way to do that, and # type: ignore can suppress an error message but it can't suppress the type coercion. So maybe we should just live with the shrapnel and a pure "check" hook is good enough? Throwing this out there in case you experts have any thoughts :-).

Anyway, if we go with the check hook, there are two possible approaches that come to mind:

  • we could have local/specialized check_function/check_method hooks that run on function/method calls, similar to the existing hooks, and give them a custom Context that includes information about whether the parent node is an Await.

  • we could have a very generic check_file hook that runs after the file has been fully type-annotated (so I guess after checker phase 2?), and can walk the tree and do whatever it wants with what it sees.

The latter has the advantage that you can imagine other plugins taking advantage of this API besides just this one, and that you wouldn't have to hard-code strange things like parent-node-is-an-Await, because the check hook could compute this as it walked the tree. It might also be useful if we want to get really fancy and add trio-asyncio support: with trio-asyncio you can have parts of the file that are trio code, and other parts that are asyncio code, and a checker could potentially figure out which is which and only complain about missing awaits in the trio code... so that all seems pretty promising.

But what I'm not as sure of is whether there's actually a good moment to run a hook like this. Does mypy ever actually store all the types, or just compute them on the fly? (It looks like the checker's type_map attribute may store them?) Would it cause any problems with all the fancy incremental/caching stuff that mypy does these days?

@ilevkivskyi
Copy link
Member

Something I wanted to add from myself: there is a common limitation on all current plugin hooks: they only get called for a given qualified name of some node. I understand this limitation is probably to avoid big performance losses if a plugin is called on every node. But I think we should think more about possible compromises here. This limitation makes it hard to implement plugins that need more information to act, another example (in addition to await plugin) is doing TypeInfo transformation depending on a dynamic base class.

Does mypy ever actually store all the types, or just compute them on the fly?

Yes, they are all in type_map.

Would it cause any problems with all the fancy incremental/caching stuff that mypy does these days?

This is a known issue. Fine-grained incremental mode doesn't work well with plugins. I will work on fixing as soon as I will have time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature priority-1-normal topic-plugins The plugin API and ideas for new plugins
Projects
None yet
Development

No branches or pull requests

3 participants