Skip to content

Make multiprocessing pipes generic #11137

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Oct 2, 2024

Conversation

Avasam
Copy link
Collaborator

@Avasam Avasam commented Dec 10, 2023

A while ago I was toying with the idea of using multiprocessing for a long-running task, and got annoyed that I could not specify a type for my interfacing pipes. So I hashed up a quick generic multiprocessing stub for my need.

I didn't end up going that route, but since I already did some typing work, figure I'd publish it rather than throw it away.
Idk if the TypeVar variance is correct, please advise.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Collaborator

@srittau srittau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm undecided on whether this is a good idea or not. I like the added type safety, but especially the Pipe case seems a bit cumbersome. Maybe this should wait for PEP 696?

But some comments below.

Comment on lines 14 to 15
_T1 = TypeVar("_T1")
_T2 = TypeVar("_T2")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling these _S and _R or even _SendT and _ReceiveT might be better for documentation purposes.


class Listener:
def __init__(
self, address: _Address | None = None, family: str | None = None, backlog: int = 1, authkey: bytes | None = None
) -> None: ...
def accept(self) -> Connection: ...
def accept(self) -> Connection[Incomplete, Incomplete]: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Incomplete instead of Any since you'd like to make Listener generic as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this could be improved further, rather than "too complex for the type system" or "you can truly send anything through the connection". Just not something I care to do in the scope of making Pipe generic.

Comment on lines 75 to 78
def Pipe(duplex: bool = True) -> tuple[Connection[_T1, _T2], Connection[_T2, _T1]]: ...

else:
def Pipe(duplex: bool = True) -> tuple[PipeConnection, PipeConnection]: ...
def Pipe(duplex: bool = True) -> tuple[PipeConnection[_T1, _T2], PipeConnection[_T2, _T1]]: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As seen in the primer hit (and the tests), this means that the return types of Pipe() need to be annotated now, which is a bit cumbersome.

Copy link
Member

@AlexWaygood AlexWaygood Dec 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's actually valid to use a TypeVar like this, where it's entirely scoped within a return annotation. IIRC type checkers special-case Callable so that you can do things like def x() -> Callable[[T], T], which comes up a lot. But in general, I don't think there's any supported way of spelling what you're trying to express here, unfortunately :(

Excluding methods that use typevars which are scoped to the class, if you want a TypeVar in a return annotation of a generic function to be solvable, you have to use the TypeVar at least once in the context of the function's parameters

Copy link
Collaborator Author

@Avasam Avasam Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping something like

_T1 = TypeVar("_T1", default=Any)
_T2 = TypeVar("_T2", default=Any)
def Pipe(duplex: bool = True) -> tuple[PipeConnection[_T1, _T2], PipeConnection[_T2, _T1]]: ...
# or
def Pipe[_T1, _T2](duplex: bool = True) -> tuple[PipeConnection[_T1, _T2], PipeConnection[_T2, _T1]]: ...

would eventually be supported syntax that still ensures the TypeVar is bound. But I don't see anything about it either in https://peps.python.org/pep-0695/ or https://peps.python.org/pep-0696/ (only class examples)

As shown in the tests. Mypy doesn't support this to restrict the generics between each other (but it also doesn't fail). Pyright supports it, but idk if intentionally or not.

Copy link
Member

@AlexWaygood AlexWaygood Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not do this for now. It's an underspecified area of the spec (I have no idea if pyright is deliberately supporting this or not), and we can see from the primer hits that mypy can't currently solve the TypeVars if we do things like this, causing false positives in user code

Copy link
Collaborator Author

@Avasam Avasam Dec 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the expectation of inverting the generic parameters between connections, from the type definition, to a comment.
It should be easy to revise it if that behaviour is eventually defined.

This comment has been minimized.

Avasam added a commit to Toufool/AutoSplit that referenced this pull request Dec 19, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the variance of the TypeVars correct here? It looks like maybe _SendT could be contravariant, and _RecvT covariant?

Copy link
Collaborator Author

@Avasam Avasam Dec 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk if the TypeVar variance is correct, please advise.

🤷 variance with Python's TypeVars is still something I get highly confused about and has not "clicked" at all for me. Like I understand the general concept of directionality (sometimes any base type can be used, sometimes any subtype can be used), and that you'd want a "wider" type for parameters, "narrower" type for return types. But I'm never sure which to use (if any) when it comes to TypeVars

Copy link
Member

@AlexWaygood AlexWaygood Dec 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think anybody who claims to be able to think about generics in Python without their head hurting sometimes is lying to you :)

In general, a good heuristic for a given TypeVar in a given class is that if the TypeVar only appears in return annotations in that class, it should be covariant; if it only appears in parameter annotations, it should be contravariant; if it appears in both, it has to be invariant

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted and bookmarked. It may not always apply, but heuristics I can follow.

Copy link
Collaborator

@Akuli Akuli Dec 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have explained my view of variance on typeshed a few times, and I'm too lazy to find a previous explanation, so here goes again :)

Most of the time you only need invariant (default) and covariant TypeVars. To distinguish them, think about lists and tuples with int and float items. A list[int] is not a valid list[float], because we don't want to allow this:

foo: list[int] = [1, 2, 3]
bar: list[float] = foo
bar.append(12.34)

We communicate this to type checkers by using an invariant TypeVar in list. It means that the item types must match exactly for lists to be compatible.

On the other hand, a tuple[int, ...] is also a valid tuple[float, ...], so the TypeVar in tuple is covariant. A tuple of a more specific type is also a valid tuple of a more general type.

Contravariance comes up less often, but it's basically the opposite of covariant: making T more general actually makes the generic type more specific. Consider Callable[[T], None] for example. A Callable[[float], object] (think time.sleep) is also a valid Callable[[int], object].

And yes, variance is hard :)

This comment has been minimized.

This comment has been minimized.

@Avasam
Copy link
Collaborator Author

Avasam commented Mar 7, 2024

Marking as deferred until #11422

@Avasam Avasam added the status: deferred Issue or PR deferred until some precondition is fixed label Mar 7, 2024

This comment has been minimized.

@srittau srittau removed the status: deferred Issue or PR deferred until some precondition is fixed label Mar 20, 2024
@srittau
Copy link
Collaborator

srittau commented Mar 20, 2024

TypeVar defaults are now available in typeshed.

@Avasam Avasam requested review from srittau and AlexWaygood March 30, 2024 18:51
Copy link
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

Comment on lines +22 to +24
# Any: Unused send and recv methods
_reader: Connection[Any, Any]
_writer: Connection[Any, Any]
Copy link
Member

@AlexWaygood AlexWaygood Mar 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe these should be Connection[object, object] or Connection[Unused, Unused]? Otherwise we might get complaints about false positives from the (very small) group of people who use mypy's --disallow-any-expr optional lint: https://mypy.readthedocs.io/en/stable/command_line.html#cmdoption-mypy-disallow-any-expr

Copy link
Collaborator Author

@Avasam Avasam Mar 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure for the first, _SendT, contravariant, used for params argument.
Idk about the second, _RecvT, covariant, used for return type.

reader1: Connection[int, object] = _ThreadWakeup()._reader  # OK
reader2: Connection[object, int] = _ThreadWakeup()._reader  # "Connection[object, object]" cannot be assigned to declared type "Connection[object, int]"

class MyThreadWakeup(_ThreadWakeup):
    _reader: Connection[int, int]  # OK

Arguably this is also a "private" name and I fail to come up with a real-world example

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to use the stricter non-Any types for now, and see if anybody complains about false positives... But I'm also fine with the PR being merged as-is; I don't have a strong opinion here :-)

@JelleZijlstra JelleZijlstra merged commit bdb5b52 into python:main Oct 2, 2024
@Avasam Avasam deleted the multiprocessing-generic-pipes branch October 2, 2024 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants