-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[red-knot] Add list of failing/slow ecosystem projects #17474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
I'm pretty strongly in favour of running (a lot) more mypy_primer projects in CI. Apart from anything else, we've already had several cases where the mypy_primer workflow has caught new panics for us, which seem very important for us to know about. It looks like the changes here mean that it now runs in six minutes, which is quite a lot slower than the two minutes it took previously. We could consider sharding the mypy_primer workflow between several CI jobs and joining the artifacts afterwards, which is what both mypy and typeshed do in their mypy_primer workflows. |
|
6 minutes is reasonable enough that I'm tempted to just land this as-is and move on; I don't want to spend a lot of time wrangling GitHub Actions. But maybe if it's easy to borrow the mypy/typeshed sharding setup, I could look at that. This does make mypy-primer the long pole in our CI (replacing windows and release-mode tests at 4 minutes.) |
|
I tried to do some work towards sharding in #17475, but I feel like I've reached my timebox on that. I have the sharded runs working fine and uploading artifacts, but the "comment" workflow needs significant changes in order to download all the diff artifacts and combine them, and our current comment workflow is quite different from the typeshed/mypy ones. Compounding the difficulty, it appears that a workflow that is triggered on completion of another workflow (like the mypy primer comment one) only runs if it exists on the main branch of the repo, so it seems like debugging it would have to occur on main branch. So at this point my proposal would be to merge this and accept that ecosystem checks take 6-8 minutes. Open to alternative proposals, including "put more work into sharding" or "reduce the project set." |
|
Hmm, scratch that; something caused the last run to take 20min and time out. Will need to dig into that before we can consider merging this. |
## Summary Takes the `good.txt` changes from #17474, and removes the following projects: - arrow (not part of mypy_primer upstream) - freqtrade, hydpy, ibis, pandera, xarray (saw panics locally, all related to try_metaclass cycles) Increases the mypy_primer CI run time to ~4 min. ## Test Plan Three successful CI runs.
b78112a to
028e225
Compare
sharkdp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carljm Thank you for this. After updating mypy_primer to run from upstream (which also includes a lot of recent changes to project dependencies), I re-analyzed all projects here. See the commit history for a log of the changes that I did. I moved five projects to bad.txt (all try_metaclass_ panics) which were recently part of good.txt (merged separately). I also moved four projects to good.txt. The hang on Tanjun is probably related to #17537, I reclassified it.
|
I don't know if performance is still a concern and if it is mainly red knot or mypy primer being slow. If it still is a concern and it's mainly red knot, then consider creating a new build profile which performs a release build but disables LTO (you could even experiment with less aggressive optimizations). LTO tends to be the slowest step and this could be a good trade between a faster red-knot without spending too much time on compiling. |
The red knot compilation used to be the bottleneck (when we ran on a few projects only), which is why I made it a debug build. Now it's the setup (clone + dependency installation) + the actual execution of red knot, which is the bottleneck. So we could consider switching back to a release build, with the disadvantage that we would have worse backtraces and no debug-assertions.
👍 |
We could enable debug assertions, similar to the |
* main: (37 commits) [red-knot] Add list of failing/slow ecosystem projects (#17474) [red-knot] mypy_primer: extend ecosystem checks (#17544) [red-knot] Move `InstanceType` to its own submodule (#17525) [red-knot] mypy_primer: capture backtraces (#17543) [red-knot] mypy_primer: Use upstream repo (#17500) [red-knot] `typing.dataclass_transform` (#17445) Update dependency react-resizable-panels to v2.1.8 (#17513) Update dependency smol-toml to v1.3.3 (#17505) Update dependency uuid to v11.1.0 (#17517) Update actions/setup-node action to v4.4.0 (#17514) [red-knot] Fix variable name (#17532) [red-knot] Add basic subtyping between class literal and callable (#17469) [`pyupgrade`] Add fix safety section to docs (`UP030`) (#17443) [`perflint`] Allow list function calls to be replaced with a comprehension (`PERF401`) (#17519) Update pre-commit dependencies (#17506) [red-knot] Simplify visibility constraint handling for `*`-import definitions (#17486) [red-knot] Detect (some) invalid protocols (#17488) [red-knot] Correctly identify protocol classes (#17487) Update dependency ruff to v0.11.6 (#17516) Update Rust crate shellexpand to v3.1.1 (#17512) ...
Summary
I ran red-knot on every project in mypy-primer. I moved every project where red-knot ran to completion (fast enough, and mypy-primer could handle its output) into
good.txt, so it will run in our CI.The remaining projects I left listed in
bad.txt, with a comment summarizing the failure mode (a few don't fail, they are just slow -- on a debug build, at least -- or output too many diagnostics for mypy-primer to handle.)We will now run CI on 109 projects; 34 are left in
bad.txt.The main question about this PR is whether running mypy-primer on 111 projects is too much for CI: does it take too long? does it make the mypy-primer output overwhelming? If it is too much, I can split
good.txtintorun-in-ci.txtandgood.txt, and we can pick some subset to run in CI.Test Plan
CI on this PR!