Non-determinism in pip_parse or pip_import #571
Comments
Funnily enough, I ran into this today and was wondering why the bazel-remote cache was not working as expected in our CI but was working correctly in local setups.
Hi. Is this addressed by #570 or is there additional work needed? Thanks!
There is still more work. At the very least, packages which compile C/C++ code are subject to non-deterministic targets. There's still one open question on that PR which may mitigate this just a bit more (#570 (comment)) but ultimately, the
That said, if the packages you depend on are pure-Python packages (ones that don't build any extra files when setting up the wheel), then I don't think you'll run into determinism issues. Though, if you're impacted by
How is this related to use cases where users are using binary wheels and even though we have
Are you sure the entire transitive closure consists of wheels? It's often the case that there is an sdist (aka source distribution) somewhere in the transitive closure, and this builds on every host machine. This can easily introduce non-determinism, since the building of dependencies in Python is essentially
I am not 100% sure and I would like to understand this better. How would I check this? We have been using Before doing that we needed to have
In my investigations, I've found packages which compile some C code that may produce different results per build. But I've also seen wheels with embedded
You can't test this with However, to check if your transitive closure has
That will begin to download wheels from your package index (or PyPI) and will fail if there are missing wheels.
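The exact command from the comment above did not survive in this copy of the thread. As a rough sketch of the same idea, the snippet below asks `pip download` to accept only wheels via its `--only-binary=:all:` option, so the download fails if any requirement in the closure is published only as an sdist. The helper names and the `requirements.txt` path are assumptions made for illustration, and the check only covers the platform and interpreter it runs on.

```python
import subprocess
import sys
import tempfile


def wheel_only_download_cmd(requirements_path, dest_dir):
    """Build a `pip download` command that refuses source distributions.

    Hypothetical helper: the name and argument layout are illustrative.
    """
    return [
        sys.executable, "-m", "pip", "download",
        "--only-binary=:all:",  # never fall back to building from an sdist
        "--requirement", requirements_path,
        "--dest", dest_dir,
    ]


def closure_is_wheel_only(requirements_path):
    """Return True if pip resolved every requirement to a wheel on this host."""
    with tempfile.TemporaryDirectory() as dest_dir:
        cmd = wheel_only_download_cmd(requirements_path, dest_dir)
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # pip's error output names the requirement that has no wheel.
            print(result.stderr, file=sys.stderr)
        return result.returncode == 0
```

Calling `closure_is_wheel_only("requirements.txt")` needs network access to the package index, so it is better suited to a one-off audit than to a build step.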
I would like to distinguish between non-determinism that has always been present in rules_python and the underlying ecosystem, and what was recently introduced by the pip_import rule. If the latter is just as deterministic as anything else, then we don't need to warn users away from it like in the 0.5.0 release notes.
I think there's been some confusion here. My comment (#551 (comment)) was saying that the rules gave no indication that they were non-deterministic, and determinism is (IMO) the expectation for any Bazel rule. So when I ran into non-deterministic behavior, I thought it warranted re-applying the "experimental" badge to the rules so users don't adopt them without question and later find tons of cache invalidation. To my knowledge there's no specific issue introduced in TLDR I agree with @alexeagle 😄
Thanks @UebelAndre, I think we can close this then. The repo description/tagline is "Experimental Bazel Python Rules", and in this case the non-determinism comes from external tools we spawn, so the proper fix belongs there (imagine We can and should do what we can to document it, and users need some determinism-checker tool to discover the problems more easily.
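No such determinism checker is named in this thread. As a minimal sketch of what one could look like, the snippet below hashes every file in two directory trees (for example, two copies of the same installed package produced on different machines) and lists the paths whose contents differ or that exist in only one tree. The function names are hypothetical, and this is not a tool provided by rules_python.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(root_a, root_b):
    """Return sorted relative paths that differ or exist in only one tree."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result from `diff_trees` means the two trees are byte-for-byte identical. Any path it does return (a compiled extension, say) is a candidate source of cache misses.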
Are there other issues for those? If not, I think they should be created and linked to this ticket.
I think #589 is what's needed to truly resolve this issue. It allows users to correct issues related to non-determinism in their dependencies on a case-by-case basis. I think this is the best that can be done unless Python itself is going to impose some determinism check.
Enabling remote execution requires passing `--config=cfc-remote`, which will allow us to roll out remote execution slowly -- and roll it back if needed.

Several targets and tests are incompatible with remote execution and have been tagged as such; these targets will be addressed in subsequent changes:

* `oci_image` doesn't work with remote execution, though a newer version may. bazel-contrib/rules_oci#477
* `pip_parse` builds source dependencies based on the host platform, which (a) introduces non-determinism and (b) can result in binaries that cannot run on the remote host (e.g., due to incompatible glibc versions). We may be able to use annotations to substitute in versions of dependencies built by bazel, not pip: bazelbuild/rules_python#571. But we also might be able to remove the Python targets...

Bug: 321291571
Change-Id: Id4b2ece75f7945736e0e3c6bba056c85842e9618
#551 (comment) reports that different machines running the pip operations produce different outputs, which causes cache busts in subsequent actions that depend on Python requirements in their inputs.