-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
bpo-43372: Use BUILDPYTHON for regen-frozen. #24714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The marshaled data for code objects has changed with bpo-42246, commit 877df85. Update the frozen code for __hello__ and the expected code sizes in ctypes test_frozentable.
Isn't this a circular dependency? Not quite sure, hence I figured I'd rather ask. |
No, I don't think so. regen-all is a top-level target and not a dependency of anything. |
In Fedora, we regen-all with the previous Python build before we build anything. Since regen-frozen is part of that, we would need to build Python twice. The more I think about this the more I belive that regen frozen should be kept separate from regen all similarly to regen stdlib modules. I'm a bit tired today for more discussion and typing from my cellphone, can provide more details tomorrow. |
What is the motivation to do that? Doing the regen with a potentially mismatched older version of Python seems strictly worse than using what is checked into git for those files. Maybe I'm missing something. Could you do the regen after the build and verify that generated files don't change, like the CI test does? Perhaps we should build a check like that into the build system. I can re-write regen-frozen so that it works like regen-importlab. I just need to modify If we separate out regen-frozen from regen-all, I think the instructions we need to give to people become more complicated. Rather than just running "make regen-all" they have to run multiple regen build steps. The extra complication is why I think we should run regen-stdlib-module-names as part of regen-all. The instruction generated by the github CI script is already inaccurate:
I think the correct instruction is to run "make regen-all" and then run "make regen-stdlib-module-names". I don't understand why the separation is needed. |
Here's a more detailed explanation. Note that the majority of this is just how I understood things and I might be wrong. Most of this I've learned from @vstinner, but I might have misunderstood something. The reason PYTHON_FOR_REGEN exists is so you could regen the files with an older version of Python before you have the new version available. The existing generator scripts are written in Python because it is convenient, not because they need access to interpreter internals (at least before regen-frozen that was the case). As long as the Python version si compatible with the code used, you can use it as PYTHON_FOR_REGEN. (The scripts could very well be written in Shell or Ruby, but Python developers prefer to write Python, for obvious reasons.) This is also the assumption of the CI setup. PYTHON_FOR_REGEN is "any Python 3" in one CI job, or even explicit Python 3.8 somewhere else. This all is possible with one purpose in mind: To be able to run regen-all before the build. E.g. in an environment, where pre-generated files are discouraged (such as Linux distro packaging) or when you want to to regen the files fast (e.g. on the CI) without a need to rebuild Python itself. Including make targets that require just-built Python in regen-all breaks this use case. As for Fedora, yes, we could not regenerate the files before the build, do it after instead and check if they are identical. I don't particularly like it, but we can live with that. But does this also break use cases of other distributors maybe? One regen job that I know of that would otherwise break this sequence is the mentioned freeze_importlib thing. And I believe this is the reason it is not written in Python. So adapting _freeze_importlib.c seems like the best solution to me. happy to discuss this more or to bring it to the mailing list / discuss. |
Thanks Miro, those details help. I was thinking |
Right, not often, but sometimes we do. E.g. we carried a patch for bootstrap importlib and including the diff of the generated header would be a nightmare. |
Closed as I think #24759 is a better fix. |
Using PYTHON_FOR_REGEN is not a good idea since it can be older and have an incompatible code object format. Use BUILDPYTHON instead.
The marshaled data for code objects has changed with bpo-42246, commit 877df85. Update the frozen code for
__hello__
and the expected code sizes in ctypes test_frozentable.This seems to be a cleaner fix than #24708.
https://bugs.python.org/issue43372