Explore building third party stubs as packages #2491
I haven't had time to think over the full proposal yet, but I did want to correct a misunderstanding and add some relevant information. First, a minor (but important!) nitpick: PEP 561 is about package names. The things you install from PyPI are distributions, which contain zero or more packages. The example I always give to explain this is that you Therefore, it would be possible to install, say Speaking of squatting names, there has already been some discussion of this topic in pypi/warehouse#4164. Donald Stufft is planning on namespacing packages in some form, so it is likely |
I like the proposal! Here are a few additional thoughts.
If we split (most) third party stubs into separate stub packages, type checkers could propose a relevant stub package to install if they can't find stubs for some package. This would be easy to do if third party stubs are managed centrally, but hard if they are spread out across numerous repositories.
I think that it would make sense to request permission proactively for the most popular packages on PyPI to be included in typeshed (I mentioned this in #2440). This would make it easier to contribute new stubs, as asking for permission is something many people are uncomfortable doing. I'm happy to try doing this if others think that this is a good idea.
I think that review workload is the biggest potential issue. I'm ready to volunteer to do more code reviews if we can pull some of the proposed changes off, in part because I expect they could well make code reviews easier. I have some ideas about making the code review workload manageable below.
First, I think that it would be a big help if we had tests for third party stubs. At the very least, we should have a
Second, if we have support for tests, we can require that any new third party stubs have decent test coverage. Also, most fixes to a stub should include a test. Code reviews will be simpler if we can trust that tests catch at least any egregious errors. Also, the presence of tests makes it more likely that the PR author has done a reasonable job, thus hopefully requiring fewer review iterations. We can ask the creator of a new stub to actually verify that the test file can be run against the library (not just checked). It's probably impractical to execute tests in typeshed CI, though.
Third, since third party stubs wouldn't be installed by default, errors in those stubs would be less serious than now, as it would be easy to uninstall the stubs if there are problems, and it would be easy to revert to an earlier stub version. Reporting bugs in stubs back to typeshed would perhaps also be simpler, as it would be easy to attribute blame to a certain typeshed stub package.
For these reasons, code reviews could be less strict for third party packages (i.e. no need to double check against the implementation if something is unclear), except maybe for the most popular packages. Here my rationale is that it will be much easier to encourage users to contribute small fixes to not-very-polished stubs than to create a new set of stubs from scratch. Automatically having the fixes available on PyPI immediately after the PR has been merged could be another big motivating factor.
Finally, maybe we should consider requiring the use of a code auto-formatter for stubs (at least new ones), if there is one that works for stubs. Or we could maybe introduce more aggressive linting for new stubs. These would improve the readability of contributed stubs. |
There are a few questions I wanted to clarify about the proposal:
|
I think that these could be decided on a case-by-case basis. If the stubs don't generate false positives without a plugin, maybe the stubs can be included in typeshed. We don't need to include all stubs in typeshed, but I think that it's a good place for the "long tail" of stubs where nobody may be motivated enough to set up a separate repository just for stubs.
I proposed that we could automatically generate a version number from the last modified date.
In my opinion |
https://github.com/ambv/black supports auto-formatting stubs. (I added the support.) |
It means that the version number would have no connection to the runtime package version and would be non-descriptive as a result. |
We at PyCharm ran into this problem while developing a stub package advertiser. There is no way to determine which stub package version fits the installed runtime package without installing them one by one, from the newest to the oldest. |
@sproshev Fair point. I'm not sure if there's anything we can do that would always work reliably. We could perhaps include the package version in typeshed as metadata. The stub package version could be derived from that (and incremented by 1 for each update). For example, the stub package versions for numpy 1.14.1 could be something like 1.14.1, 1.14.1+1, 1.14.1+2, etc. However, in practice it would be quite likely that stubs wouldn't fully conform to any package version. Maybe the version would only be a best-effort hint, and would roughly correspond to the latest package version that is known to have some support by the stub (i.e. works in a reasonable fashion, but is not necessarily complete). I don't think that we need anything perfect here. I'd be happy with an approach that works well for simple cases (most modules probably have pretty simple interfaces) and supports gradual refinement for more complex cases. The key would be making it easy for users to contribute. Crowdsourcing seems to be the only feasible approach for stubs. Other ideas for guidelines:
|
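The numbering idea from the comment above (1.14.1, 1.14.1+1, 1.14.1+2) could be sketched roughly as follows. This is only an illustration: the helper name `stub_version` is hypothetical, and the `+N` suffix simply mirrors the example given in the comment rather than any agreed scheme.

```python
def stub_version(package_version: str, update: int) -> str:
    """Derive a stub package version from the runtime package version.

    update=0 is the first stub release for that package version; each later
    stub update gets a "+N" suffix, as in 1.14.1, 1.14.1+1, 1.14.1+2.
    The function name and the suffix format are assumptions for illustration.
    """
    if update < 0:
        raise ValueError("update must be non-negative")
    if update == 0:
        return package_version
    return f"{package_version}+{update}"


# Three successive stub releases written against numpy 1.14.1.
versions = [stub_version("1.14.1", n) for n in range(3)]
print(versions)
```

The version here is treated as a best-effort hint, in line with the comment: it says which package version the stubs were written against, not that they are complete for it.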
Okay, so after thinking this over, I decided I really like this idea. I definitely think it would be a good idea to reach out to the TypeScript folks and hear about their experiences, since they have likely already hit problems we will encounter in executing this plan. As for versioning, this is actually covered in PEP 561. Essentially, the stub package declares which version(s) of the runtime package it supports in the install requires. Therefore we should be able to tell which packages fulfill the requirements based on the At worst, you have to download (but not install!) several wheels. I think this gets a lot easier if we work with the warehouse folks to make sure the metadata we need is available through their JSON API (some if not all of it already is). With regards to partial packages, I would say people can start with their own incomplete packages, and once they pass a minimum bar, they can be brought into typeshed. As for plugins, I don't really see an issue with including them, because they are not likely to cause issues with other type checkers, and they don't take that much disk space. Alternatively, we could do some packaging magic and make a separate plugin package that gets installed if you ask for that extra (e.g. Lastly, and probably most importantly, reviewing the stubs. I am also happy to step up and do more review. I also think pulling something like the mypy test suite out of mypy and making it more generic is a good idea, as is requiring new packages/improvements to be tested. Related to reviewing, I have been inspired by Marietta to hold "office hours". The idea is to help people with static typing and PEP 561 packaging for probably about an hour every week. I am hoping to start holding hours sometime this month. If there are others who want to do this too, perhaps we could coordinate. |
For reference, better tests are discussed in #754. |
+1 |
I have created a new branch |
I suggest we start adding Another question is how strict the version specifier is supposed to be. For example, if a package uses semantic versioning and the stub was written for version 1.4.1, what should we use? Possibilities are:
I would suggest using the current minor level as the lower bound, since using the stub with older versions will not catch some problems, such as using API additions from newer versions. On the other hand, using the next major version as the upper bound seems fine. This could give false positives when using API not yet supported in the stub, but you can use newer versions without using newer API, and it can be an encouragement to contribute to typeshed. So, the second example above. |
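As a rough sketch of the bounds suggested in the comment above (current minor level as lower bound, next major version as upper bound), assuming the package uses semantic versioning. The function name `version_specifier` and the exact string format are assumptions made for illustration; they are not taken from the thread.

```python
def version_specifier(stub_written_for: str) -> str:
    """Build a version range for the runtime package a stub supports.

    Lower bound: the minor release the stub was written for.
    Upper bound: the next major release (exclusive).
    """
    parts = stub_written_for.split(".")
    major = int(parts[0])
    # Treat a missing minor component as 0, e.g. "3" -> 3.0.
    minor = int(parts[1]) if len(parts) > 1 else 0
    return f">={major}.{minor},<{major + 1}"


# A stub written for version 1.4.1 of a semver package.
print(version_specifier("1.4.1"))  # >=1.4,<2
```

With this choice, a stub written for 1.4.1 is not offered for 1.3.x, where it could hide the use of API that does not exist yet, but it remains installable for any later 1.x release.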
An additional idea about the generated version number. Currently the build script uses
|
Some questions (mostly for @srittau's
I also have a comment regarding the structure of the repository once every third party stub is a package. The current structure is |
One thing I'd like to do is "METADATA merging". I think it makes sense to have a I also agree that moving all packages to the Regarding the suffix and squatting: This is something that we have to check with the warehouse maintainers. They might have opinions and ideas about that. |
This sounds good to me. Although I think outright duplicating the file in every package should also be considered. That would be more direct and easy to understand, and since the packages are managed centrally in a single repository, every mass change can simply be applied at once across all of them. And it also allows for gradual (intentionally non atomic) changes if that's ever needed.
Are there currently any type checkers that install typeshed dynamically rather than vendor or pin it? If they all vendor it, they only need to adapt when they update typeshed, no? |
Not a mypy expert, but from my understanding at least mypy is vendoring typeshed using a git submodule. Changing it now would break mypy's build process and would need quite a bit of work to fix, for what is a temporary situation. |
I have now reached out to the pypa team in pypi/warehouse#4967. |
@srittau Mypy uses typeshed as a submodule, but submodules are pinned to certain commits (which we manually update). The tests within typeshed would probably be quite broken since those use the master branch of typeshed instead of the submodule, but I expect changing the format of third_party would require corresponding lock-step changes in both pytype and mypy anyway. As for package versioning, I think there are two fundamental features we want from a version:
Therefore I propose just |
Support the new structure of typeshed (python/typeshed#2491) and only bundle stdlib stubs with mypy. Most stubs for third-party packages now need to be installed using pip (for example, `pip install types-requests`). Add stubs for `typing_extensions` and `mypy_extensions` as mypy dependencies, since these are needed for basic operation. Suggest a pip command to run if we encounter known missing stubs. Add `--install-types` option that installs all missing stub packages. This can be used as `mypy --install-types` to install missing stub packages from the previous mypy run (no need to provide a full mypy command line). This also replaces the typeshed git submodule with a partial copy of the typeshed that only includes stdlib stubs. Add a script to sync stubs from typeshed (`misc/sync-typeshed.py`). This is still incomplete since typeshed hasn't actually migrated to the new structure yet. Work towards #9971.
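The "suggest a pip command" behavior described in the commit message above can be pictured with a small sketch. This is not mypy's actual implementation: the mapping `KNOWN_STUB_PACKAGES` and the function `suggest_install` are hypothetical names, and only the `types-requests` entry comes from the text.

```python
from typing import Optional

# Hypothetical table mapping a top-level module name to the name of the
# stub distribution on PyPI. Only this one entry is taken from the thread.
KNOWN_STUB_PACKAGES = {
    "requests": "types-requests",
}


def suggest_install(missing_module: str) -> Optional[str]:
    """Return a pip command for a module with known stubs, else None."""
    # Stubs are published per distribution, so only the top-level
    # package name matters ("requests.adapters" -> "requests").
    top_level = missing_module.split(".")[0]
    distribution = KNOWN_STUB_PACKAGES.get(top_level)
    if distribution is None:
        return None
    return f"pip install {distribution}"


print(suggest_install("requests.adapters"))
print(suggest_install("some_unknown_module"))
```

A lookup like this is only practical when third-party stubs are managed centrally, which is the point made earlier in the thread about type checkers proposing a stub package to install.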
See discussion in #2491 Co-authored-by: Ivan Levkivskyi <ilevkivskyi@dropbox.com>
OK, I merged the big PR with directory re-structure. I will continue working on typeshed -> PyPI auto-upload later this evening. |
Oh, I forgot to update |
See #821 and its linked issue for context. The typeshed directory structure is changing significantly, so we need to update pytype accordingly. python/typeshed#2491 (comment) contains a nice diagram of the new structure. Note that I first developed this change on GitHub, then imported the PR. I'm asking for a review on the import (rather than the PR) because the import contains additional BUILD file changes (especially to third_party/py/toml - see the diffbase). PiperOrigin-RevId: 354138398
In the meantime, I have manually tested the PyPI upload GitHub Actions, and they seem to work. After @JukkaL double-checks the code tomorrow morning, I will switch the cron diff upload action from |
Thanks @rchen152 for updating the pytype test! Now we only need to fix |
Thanks @ilevkivskyi and @srittau for making this change. I just updated the import logic in pyright, and it now works with the new typeshed directory structure. |
See #2491 for previous discussion. Co-authored-by: Ivan Levkivskyi <ilevkivskyi@dropbox.com>
I think we are almost ready. Sorry for the slight delay: we were not able to enable the auto-upload today, so we will enable it Monday morning. In the meantime, I think it is already OK to start merging PRs again. |
First of all, thanks for working toward better type hinting. I look forward to all the type information packages coming out on PyPI. I have a question about the upcoming updates to the type information metadata for the libraries in this repository. How will the metadata file (METADATA.toml) record the supported runtime package versions? I saw a discussion about the versioning of stub packages, but I could not find a concrete description of how it is expressed in the metadata file. If I missed the part that answers this, apologies in advance. |
I finally switched the auto-upload task from |
Currently you can include |
@ilevkivskyi Thank you for the reply. So the stub packages will now be automatically uploaded to PyPI based on that version information? For example, the stub for paramiko is version "0.1", while the runtime package's latest version is currently "2.7.2". In this example, given that paramiko follows semver, a difference of two major versions means that large breaking interface changes happened in between, which means at least the stub version should be Can we assume that all stub packages are now ready to accept pull requests for such version number corrections? |
Yes, we'd accept PRs that update METADATA.toml to the oldest version of the package that the stubs reflect support for (#4981) |
When adding new stubs, the version should reflect the actual version of the package that the stubs were created for. Since this will be hard to determine reliably for existing stubs, it's okay for the version to refer to some older version with which they are still compatible. We probably also don't want the version to refer to a more recent version of the package, or a version that is poorly supported. If a stub is created for version 2.3 of a library but also contains a few 2.4 features (but we know/suspect that some important 2.4 features are missing), I think it's fine to continue to use 2.3 as the version of the stubs. |
@hauntsaninja @JukkaL Thank you for the reply. I understand that we need to avoid the confusion that version changes would bring, and that we can keep using the versions in the existing METADATA.toml files. However, that applies only when the stub version is in the same major version as the current package release, and not to the case I referred to (i.e. a difference of two major versions). Anyway, it's great that the discussion is already happening in #4981, and I'm happy to contribute to this part. |
FYI I just opened a PEP about package repository namespaces and I used typeshed as the first motivating example 🙂 https://discuss.python.org/t/pep-752-package-repository-namespaces/61227 |
This is an alternative to #2440 (disallowing third-party stubs). The idea is that typeshed remains/becomes a central repository for third-party stubs that are not bundled with the parent package, similar to DefinitelyTyped. In the future I expect type checkers will not want to bundle all third-party stubs for a variety of reasons, so third-party stubs would be distributed as separate PEP 561 stub-only packages, one per upstream package.
(I tried to integrate points raised there into this issue, especially those by @JukkaL in this comment.)
Advantages
Issues
Further Considerations
What should the generated packages be called? @ethanhs's PEP 561 actually requires stub-only packages to be named `<package>-stubs`. typeshed could squat these names and release them (and remove the stubs) on the request of upstream maintainers. Alternatively, typeshed could add a common prefix or suffix (`ts`, `typeshed`) in addition to or instead of the `-stubs` suffix. This would be in violation of PEP 561, so we'd need to get broader consensus to amend the PEP. My personal favorite would be `<package>-ts`.
To guarantee a fairly quick turnaround on stubs, to minimize work for publishing stubs, and to prevent all third-party stub packages from being updated whenever a new typeshed version is released, stubs for a specific third-party package should be published automatically when it changes.
Possible Implementation
`setup-tp.py` to typeshed that takes its package name from the directory it's in and uses the current date and time as version number.
`master` only so that after a successful test run, for every third-party package that was changed since the last successful run, the following is done automatically:
`setup-tp.py` into the third-party module directory as `setup.py`.
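The generic `setup-tp.py` described under Possible Implementation could look roughly like the sketch below: it derives the package name from the directory it sits in and uses the current date and time as the version number. The function names, the default `-stubs` suffix, and the timestamp format are all assumptions made for illustration; the naming question is left open in the proposal.

```python
from datetime import datetime, timezone
from pathlib import Path


def package_name(directory: Path, suffix: str = "-stubs") -> str:
    """Name the stub distribution after the directory that holds the stubs."""
    return directory.resolve().name + suffix


def timestamp_version(now: datetime) -> str:
    """Use the date and time as a single, always-increasing version number."""
    return now.strftime("%Y%m%d%H%M%S")


def setup_kwargs(directory: Path, now: datetime) -> dict:
    """Collect the arguments a generic setup script would hand to setup()."""
    return {
        "name": package_name(directory, "-stubs"),
        "version": timestamp_version(now),
    }


# Example: stubs living in a hypothetical third_party/requests directory.
kwargs = setup_kwargs(
    Path("third_party/requests"),
    datetime(2018, 9, 26, 12, 30, 5, tzinfo=timezone.utc),
)
print(kwargs)
```

In a real script, the resulting dictionary would be passed on as `setuptools.setup(**kwargs)`. A timestamp version keeps publishing fully automatic, at the cost raised elsewhere in the thread: the number says nothing about which runtime package version the stubs match.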