stdlib stubs are unnecessarily strict with file-like objects #4212
Comments
We're already moving in this direction, thanks mostly to @srittau's efforts. However, our approach has been to create ad-hoc protocols for individual cases, instead of one-size-fits-all protocols like you propose. The trouble is that while in theory the file-like object concept may be "well known", in practice it is ill-defined, and there are lots of variations in exactly what methods are expected to exist on file-like objects. |
I unfortunately experienced that first-hand with
Also adding methods on top of those core protocols is easy (e.g. extend the protocol) so there is probably no need to redefine those from scratch every place they are needed... In any case I am glad to hear that an effort is going on. So long as I can get my custom writers to type-check when I pass them to |
Yes, such PRs would be accepted. There's no centralized tracking, but we recently (#4161) added a |
One of the goals of using ad-hoc protocols for now is to determine which protocols are needed in practice and then move those to |
Once we converge on a set of protocols, wouldn't it be better to expose them publicly? Them being protocols, I can duplicate them in my code, but that still feels wrong. |
Maybe once we make typeshed modular we can also generate |
@remram44 this is a great proposal. I've been teaching Python for 20+ years and "a file-like object" has always been one of the best examples of the informal protocols that appear often in the standard library. I think we should look to Go for inspiration: it popularized static duck typing several years before PEP 544. Their philosophy is that interfaces should be narrow, often just a single method, and it works very well for them. Take a look at: And combinations like: I also very much like their naming convention of turning a verb into a noun by affixing "er". |
Some previous discussion in python/typing#564. |
I'll just copy my comment from python/typing#213 here:
def foo(file: HasRead & HasSeek) -> None:
pass |
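Python has no `&` operator for intersecting types, so the snippet above reads as proposed syntax rather than runnable code. A minimal sketch of how a similar constraint can be spelled with PEP 544 today, by declaring a protocol that inherits from two narrower ones. `HasRead` and `HasSeek` are the names used in the comment; their method signatures, `HasReadAndSeek`, and `first_bytes` are assumptions made up for this illustration.

```python
import io
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasRead(Protocol):
    # Assumed signature, modelled on binary file objects.
    def read(self, size: int = -1, /) -> bytes: ...


@runtime_checkable
class HasSeek(Protocol):
    # Assumed signature, modelled on io.IOBase.seek().
    def seek(self, offset: int, whence: int = 0, /) -> int: ...


# Hypothetical combined protocol. Protocol must be listed again explicitly,
# otherwise the subclass would be an ordinary class, not a protocol.
@runtime_checkable
class HasReadAndSeek(HasRead, HasSeek, Protocol):
    pass


def first_bytes(file: HasReadAndSeek, count: int) -> bytes:
    """Rewind the file and return its first `count` bytes."""
    file.seek(0)
    return file.read(count)


buffer = io.BytesIO(b"hello world")
buffer.read()  # move to the end so the rewind is observable
print(first_bytes(buffer, 5))
```

The cost compared with a real intersection type is that every combination needs its own named protocol, which is part of why the thread leans towards small ad-hoc protocols defined where they are needed.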
I don't think there's anything immediately actionable here for typeshed. We are already moving in the direction of small ad-hoc protocols in typeshed that can be used for this purpose. Larger protocols only make sense if they are currently used. |
Pandas 1.4 will use protocols for its IO functions pandas-dev/pandas#43951
This will be slightly relaxed in 1.4. |
Problem
Currently the IO situation is less than ideal. Not only are `IO[str]`/`TextIO` and `IO[bytes]`/`BinaryIO` a bit confusing (interchangeable in most cases), but the use of `IO` throughout the stdlib stubs is inconsistent, and doing things like passing an object with a `write()` method to `json.dump()` does not type-check.

This is because the `IO` object, while describing the actual objects returned by `open()` perfectly, is not suitable to represent the "file-like object" interface. This interface is well known, documented prominently in the standard library's documentation (glossary: "file object" and "file-like object") and a testament to duck-typing; however, it's not compatible with how typeshed is currently written (for the most part).

Proposal
I propose to introduce `Protocol`s (not abstract classes) to be used for parameters where a "file object" is expected, allowing one to correctly type their file-like objects without having to inherit one of the abstract base classes. Furthermore, I think we should have two protocols representing files that can be read from or written to.

This work can be done incrementally, and I am willing to spend time doing this if there is no veto to this ticket.
Pros
This would allow a file-like object to be passed to `json.dump()`, `zipfile.ZipFile`, and others (like it already can to `csv.writer()`).

Using `Protocol`s of this small scale would allow objects that already conform to be used in interfaces expecting a file-like object, without having to implement too many methods (or explicitly inherit from the base class, as is required now). This should lower the effort of bringing libraries to the typing world. Using two separate protocols is similar to how most languages do this, off the top of my head:

- Rust (`io::Read` and `io::Write` traits)
- Java (`InputStream` with `read()`, `OutputStream` with `write()` and `flush()`)
- C++ (`istream`, `ostream`)

It is interesting to note that the protocols I describe already exist in `typeshed`. Not wanting to put `IO` where the documentation called for `file-like object`, protocols have already been introduced:

- `shutil.copyfileobj()`: `_Reader` and `_Writer` protocols
- `csv.writer()`: `_csv._Writer` protocol

Introducing those protocols would also allow us to remove some of the `IO[str]`/`TextIO` complexity: while `TextIO` and `BinaryIO` are still needed for the native file objects (they have additional methods compared to `IO`), the protocols used for function parameters everywhere can be only `Read[str]` and `Read[bytes]`.

Cons
This is a sizeable change, and people are likely to use both the base class and those protocols for some time. However code using the base class should not break when passed to functions expecting the protocol.
Another caveat is that this might give a false sense of security: libraries in the wild do their own checks to determine whether an object conforms to the interface, and for example `pandas` will refuse to write to a file object that does not implement `__iter__`. Therefore objects conforming to the protocol might still not be accepted by (IMHO buggy) libraries, while inheriting the base class would make their objects look more like file objects (maybe too much, since it gives everything the attributes of both a readable and a writable file!).

Draft
Unfortunately this is where the typeshed-bikeshed starts, but this is my proposal:
Additional protocols can be added to provide `seek()`/`tell()` (similar to Rust's `io::Seek` trait).
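A minimal sketch of how narrow, generic protocols along these lines might look, assuming one read protocol, one write protocol, and a separate seek protocol. `Read` echoes the `Read[str]`/`Read[bytes]` spelling used above; `Write`, `Seek`, all method signatures, and the helper functions are assumptions for illustration, not the definitions used by typeshed.

```python
import io
from typing import Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class Read(Protocol[T_co]):
    # Parameterised by what read() returns: Read[str] or Read[bytes].
    def read(self, size: int = -1, /) -> T_co: ...


class Write(Protocol[T_contra]):
    # Parameterised by what write() accepts: Write[str] or Write[bytes].
    def write(self, data: T_contra, /) -> object: ...


class Seek(Protocol):
    def seek(self, offset: int, whence: int = 0, /) -> int: ...
    def tell(self) -> int: ...


def copy_text(src: Read[str], dst: Write[str]) -> None:
    """Copy everything from a readable text source to a writable target."""
    dst.write(src.read())


def size_of(file: Seek) -> int:
    """Return the total length, restoring the original position."""
    position = file.tell()
    end = file.seek(0, io.SEEK_END)
    file.seek(position)
    return end


class Collector:
    """Conforms to Write[str] structurally, with no base class."""

    def __init__(self) -> None:
        self.parts: list[str] = []

    def write(self, data: str, /) -> int:
        self.parts.append(data)
        return len(data)


collector = Collector()
copy_text(io.StringIO("hello"), collector)
print(collector.parts)
print(size_of(io.BytesIO(b"abcd")))
```

Native file objects and `io.StringIO`/`io.BytesIO` satisfy these protocols as well, so code that already passes `IO` objects should keep working when a parameter is narrowed to one of them.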