[feature] Obtain timestamp from unzip routines for SOURCE_DATE_EPOCH #14480
Comments
Hi @iskunk, thanks for the suggestion, it is interesting. It seems that Conan is not so much extracting zipped files itself as, most of the time, just calling the Python stdlib
I have been trying to find some official guidance about this, but I have found nothing. Some places say that all files should carry the latest timestamp for this to work properly, but that is up to the creator of the tarball. It would be good to have something a bit clearer. Taking everything into account, I would rather not make every decompression slower by iterating over all files just to extract this SOURCE_DATE_EPOCH, in case some users eventually need it. It hasn't been a common use case so far; this is the first time it has been mentioned. That seems like too much negative impact for a potential value that doesn't have strong evidence yet. Would iterating over all files after unzipping be possible too? If so, this would make sense as a separate tool, decoupled from the unzip, and it would be more generic across all downloads.
The best implementation approach remains to be seen. For example, I see that Iterating through all files after unpacking could work too, but that's a less integrated approach. By the same token, the user could run some external script/utility that returns the appropriate timestamp (though it would have to watch out for patched files with a current timestamp). It all comes down to how close to Conan's core mission is facilitating reproducible builds. I'm not aware of any formal guidelines for obtaining the
Not everyone may follow the same exact guidelines for reproducibility, but testing for reproducibility isn't hard. If the guidelines deliver, and don't get in the way of other goals, that's ultimately what matters.
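For reference, the "iterate through all files after unpacking" idea could look roughly like the sketch below. It uses only the Python stdlib; the helper name latest_mtime is hypothetical and is not an existing Conan API.

```python
import os


def latest_mtime(root):
    """Return the newest file mtime (whole seconds since the epoch) under root.

    Hypothetical helper for illustration only. Returns None if the tree
    contains no files.
    """
    newest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that a symlink reports its own timestamp, not its target's
            mtime = int(os.lstat(path).st_mtime)
            if newest is None or mtime > newest:
                newest = mtime
    return newest
```

This only gives a meaningful result if it runs right after unpacking and before any patches are applied, since patched files would carry the current time. It also assumes the extraction step preserved the timestamps stored in the archive.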
If this is possible, I think it would be the best approach:
from conan.tools.files import get, source_date_epoch
get(self, url, "myfile.zip", ...)
source_date = source_date_epoch(self, self.source_folder)
apply_conandata_patches(self) # this can be done later
# vs
# Yes, one liner instead of 2 lines, still uglier interface
from conan.tools.files import get
source_date = get(self, url, .... compute_source_date=True)
apply_conandata_patches(self) # this can be done later
Okay, so you'd prefer a new function using I suppose that is reasonable for now, when reproducible builds are still somewhat of a specialized use case. There will probably be pressure to speed things up in the future, especially if Conan Center decides to go that way. (Getting timestamps straight from the archive metadata would mean no I would say, give some low-priority thought cycles to how this might be handled via |
What is your suggestion?
This is related to #5152, but addresses one specific aspect of the issue.
Best practices for reproducible builds related to the use of SOURCE_DATE_EPOCH indicate that this should be set to a well-defined timestamp associated with the source code being compiled. For a source tarball, this could be the latest timestamp among its files; for a Git tree (or other VCS), this could be the commit timestamp. I am particularly concerned with the former case, but the latter might be considered in scope as well.
Since Conan takes care of unpacking source archives itself, rather than calling out to external programs, it is in a good position to grab timestamps of files as it iterates through them. It can then return a useful value from that information, e.g. the most recent timestamp across all of them.
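As a rough illustration of reading timestamps straight from archive metadata, here is a minimal sketch using the stdlib tarfile and zipfile modules. The function name latest_timestamp_in_archive is hypothetical, and this is not how Conan's unpacking code is structured. One assumption to flag: zip entries store a timestamp with no time zone, and the sketch interprets it as UTC.

```python
import calendar
import tarfile
import zipfile


def latest_timestamp_in_archive(path):
    """Return the newest member timestamp (seconds since the epoch) in a
    zip or tar archive, or None if it contains no regular files.

    Hypothetical helper for illustration only.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            # date_time is (year, month, day, hour, minute, second) with no
            # time zone; treating it as UTC is an assumption of this sketch.
            stamps = [
                calendar.timegm(info.date_time)
                for info in zf.infolist()
                if not info.is_dir()
            ]
    else:
        # Default mode "r" detects gzip/bz2/xz compression transparently.
        with tarfile.open(path) as tf:
            stamps = [int(m.mtime) for m in tf.getmembers() if m.isfile()]
    return max(stamps, default=None)
```

Because this reads only the member headers, it does not depend on how (or whether) the files were written to disk, and it is unaffected by patches applied later.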
Automatically setting SOURCE_DATE_EPOCH from that result is a potential user-configurable convenience, and perhaps might even be considered as a future default. For now, however, the focus of this issue is to add the logic necessary to obtain that timestamp in the first place, across the different source archive-unpacking implementations in the codebase. Once that information is available, how it is used will be a different story.
Have you read the CONTRIBUTING guide?