Write meta attribute as JSON into the derivation #256296
Aren't CA derivations still an experimental feature? We can't rely on them like this. Metadata changes should not cause rebuilds and should not be part of the output. That's not metadata, that's data. If the maintainers of glibc change we shouldn't rebuild the world. The proper fix here is for Nix to have some mechanism for attaching metadata to store objects that doesn't impact the hash algorithm.
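To make the rebuild concern concrete, here is a toy sketch. It is not Nix's real hashing scheme: `toy_drv_hash`, `dependent_hash`, and the `metaJSON` variable name are all hypothetical, and the only assumption carried over from Nix is that an input-addressed path depends on the full content of the derivation and of its inputs.

```python
import hashlib
import json


def toy_drv_hash(drv: dict) -> str:
    """Hash a derivation-like dict deterministically (toy model, not Nix's algorithm)."""
    canonical = json.dumps(drv, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_meta(drv: dict, meta: dict) -> dict:
    """Return a copy of the derivation with meta serialized into its environment."""
    env = dict(drv["env"])
    env["metaJSON"] = json.dumps(meta, sort_keys=True)  # hypothetical variable name
    return {**drv, "env": env}


def dependent_hash(name: str, input_hash: str) -> str:
    """Hash a dependent whose only distinguishing input is another derivation's hash."""
    return toy_drv_hash({"name": name, "inputs": [input_hash]})


base = {"name": "glibc", "builder": "/bin/sh", "env": {"version": "2.38"}}
before = with_meta(base, {"maintainers": ["alice"]})
after = with_meta(base, {"maintainers": ["alice", "bob"]})

# Only the maintainer list differs, yet the hash changes, and so does the
# hash of everything that takes this derivation as an input.
print(toy_drv_hash(before) != toy_drv_hash(after))
print(dependent_hash("hello", toy_drv_hash(before)) != dependent_hash("hello", toy_drv_hash(after)))
```

In this model, any change to the embedded meta propagates to every dependent, which is the "rebuild the world" effect described above.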
They are, of course. I meant that in the long term we have a solution for this issue, so it's less impactful than it may seem.
The way I see it, CA derivations are exactly the idea behind this: trivial changes shouldn't trigger recompilation of the whole tree.
There are two conflicting shoulds:
I do think (2) brings so many good things that we can delay the (1) should until CA derivations are stable.
If this feature is important, implement it properly in Nix. No need to wait for CA derivations.
I like to split improvements into three phases:
You're "making it work" by breaking the semantics of an existing working system for everyone. You're not building some prototype of some opt-in functionality. Again, why not do your step 1 by implementing a prototype of a metadata feature in Nix? In 30 seconds of thought here's an approach you could probably put together in a day or two:
I'm not sure creating another type of derivation is the right approach here, given that this is already a solved problem as an experimental feature in Nix for ~2 years. Note that your suggestion most likely breaks the semantics of
I'm not creating another type of derivation. I'm suggesting simply a wrapper primop that adds a new field to the Nix db.
Do CA derivations work? Are they ready to be enabled by default? Are they stable? If so, then they should be unexperimental. If not, they cannot be relied upon. Even if they were non-experimental, they would not be a proper solution here: nothing should rebuild if you change metadata.
No, no data is stored in the drv. Just in the db associated with the drv path.
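A rough sketch of what "a field in the db associated with the drv path" could look like. This is not the Nix database schema: the `DerivationMeta` table, its columns, and the helper names are assumptions made up for illustration, and the store path is a placeholder.

```python
import json
import sqlite3
from typing import Optional


def open_meta_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a sidecar database holding metadata keyed by .drv path (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS DerivationMeta ("
        " drvPath TEXT PRIMARY KEY,"
        " meta TEXT NOT NULL)"
    )
    return conn


def set_meta(conn: sqlite3.Connection, drv_path: str, meta: dict) -> None:
    """Attach or overwrite metadata for a derivation without touching the .drv file."""
    conn.execute(
        "INSERT OR REPLACE INTO DerivationMeta (drvPath, meta) VALUES (?, ?)",
        (drv_path, json.dumps(meta, sort_keys=True)),
    )
    conn.commit()


def get_meta(conn: sqlite3.Connection, drv_path: str) -> Optional[dict]:
    """Return the stored metadata for a derivation, or None if nothing is recorded."""
    row = conn.execute(
        "SELECT meta FROM DerivationMeta WHERE drvPath = ?", (drv_path,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_meta_db()
drv = "/nix/store/aaaaaaaa-hello.drv"  # placeholder path, not a real store hash
set_meta(conn, drv, {"license": "GPL-3.0-or-later"})
set_meta(conn, drv, {"license": "GPL-3.0-or-later", "maintainers": ["alice"]})
print(get_meta(conn, drv))
```

The point of the design is that the key (the drv path) never changes when the metadata does, so updating maintainers or vulnerability notes would not invalidate any build.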
I think that this pulls too much information into the build. While the license is a part of the sources, and a license change wouldn't happen without needing a rebuild anyway, other pieces of information like the current maintainer or the level of brokenness on different platforms shouldn't affect the package. Maybe we should tie the license to the sources of the package and store that in derivations instead? Or maybe even keep it in "source" derivations and not in "package" derivations. I'm not sure how this helps security vulnerability checks though. If we find a vulnerability in the package, we add it in meta, and currently that doesn't trigger a package rebuild, so you can compare store paths and check whether the package that you already have is now vulnerable. If you add this information into the derivation, without CA you'll get a new output path and you won't be able to check it on already deployed systems.
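The store-path comparison described here can be sketched as a simple lookup. The advisory format, the function name, and the store paths below are hypothetical placeholders; the sketch only assumes that advisories are keyed by output path.

```python
def vulnerable_deployed_paths(deployed: list[str], advisories: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each deployed store path that appears in the advisory list to its advisory IDs."""
    return {path: advisories[path] for path in deployed if path in advisories}


# Placeholder store paths and advisory IDs, for illustration only.
deployed = [
    "/nix/store/aaaaaaaa-openssl-3.0.0",
    "/nix/store/bbbbbbbb-hello-2.12",
]
advisories = {"/nix/store/aaaaaaaa-openssl-3.0.0": ["EXAMPLE-ADVISORY-1"]}

print(vulnerable_deployed_paths(deployed, advisories))
```

The lookup only works if the output path stays stable when the vulnerability is recorded. If recording it in meta changed the path, the deployed path would no longer match any entry.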
Given the package drv, we can almost always identify the src, unpack it, and scan the result for licenses. No new drv needed, probably gets you 95% coverage with a simple script. Seems like a perfect fit for @domenkozar's "make it work" stage.
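As a minimal sketch of the "simple script" idea, assuming the source has already been unpacked into a directory: the function names are hypothetical, and the scan only recognizes `SPDX-License-Identifier:` tags with simple `AND`/`OR`/`WITH` expressions, not parenthesized ones or untagged license texts.

```python
import re
from pathlib import Path

# Matches e.g. "SPDX-License-Identifier: MIT OR Apache-2.0" inside any comment style.
SPDX_TAG = re.compile(
    r"SPDX-License-Identifier:[ \t]*"
    r"([A-Za-z0-9.+\-]+(?: (?:AND|OR|WITH) [A-Za-z0-9.+\-]+)*)"
)


def spdx_ids(text: str) -> list[str]:
    """Return every SPDX expression tagged in the given text, in order of appearance."""
    return SPDX_TAG.findall(text)


def scan_licenses(src_root: Path) -> dict[str, set[str]]:
    """Walk an unpacked source tree and map each SPDX expression to the files declaring it."""
    found: dict[str, set[str]] = {}
    for path in sorted(src_root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the whole scan
        for expr in spdx_ids(text):
            found.setdefault(expr, set()).add(str(path.relative_to(src_root)))
    return found
```

Calling `scan_licenses(Path("unpacked-src"))` on a tree would give a per-expression file listing. Sources without SPDX tags are among the corner cases such a script would miss.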
I think this is too much effort with a bunch of corner cases, so “caching” the result of such an action is a good thing. I also understand the need for having this cached not just in code, but in derivations, as often you don’t have sources for the drvs that produced these outputs, but the drvs themselves are quite easy to keep around with keep-derivations.
That's exactly what IFD can already do today. Just because it doesn't fly with Hydra and its scheduler doesn't mean it's a bad idea - meta isn't recursed AFAIK anyways. I still would very much prefer if we specified this stuff manually, rather than blindly trusting a script to gather the right data.
This makes it possible to generate SBOMs and filter derivations based on license without relying on evaluation-time information. The downside is that changing meta attributes triggers a rebuild (which shouldn't have much impact with CA derivations). Another downside is disk space usage: for example, the pkgs.git closure of .drv files goes from 3.8MB to 4.4MB.
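A hedged sketch of how a consumer might use this, working only from derivation JSON and never evaluating Nixpkgs. The `metaJSON` environment variable name is an assumption, as is the exact shape of the derivation JSON (a mapping from drv path to an object with an `env` field). The `spdxId` field mirrors the license attribute sets in Nixpkgs.

```python
import json


def meta_from_drv(drv: dict) -> dict:
    """Decode the meta JSON embedded in a derivation's environment, if present."""
    raw = drv.get("env", {}).get("metaJSON")  # hypothetical variable name
    return json.loads(raw) if raw else {}


def license_ids(meta: dict) -> list[str]:
    """Collect SPDX identifiers from meta.license, which may be one entry or a list."""
    lic = meta.get("license")
    if lic is None:
        return []
    items = lic if isinstance(lic, list) else [lic]
    ids = []
    for item in items:
        if isinstance(item, str):
            ids.append(item)
        elif isinstance(item, dict) and item.get("spdxId"):
            ids.append(item["spdxId"])
    return ids


def filter_by_license(drvs: dict, allowed: set[str]) -> list[str]:
    """Return drv paths whose licenses are all known and all in the allowed set."""
    selected = []
    for path, drv in drvs.items():
        ids = license_ids(meta_from_drv(drv))
        if ids and set(ids) <= allowed:
            selected.append(path)
    return sorted(selected)


# Placeholder derivations, for illustration only.
drvs = {
    "/nix/store/aaaaaaaa-git.drv": {"env": {"metaJSON": json.dumps({"license": {"spdxId": "GPL-2.0-only"}})}},
    "/nix/store/bbbbbbbb-zlib.drv": {"env": {"metaJSON": json.dumps({"license": [{"spdxId": "Zlib"}]})}},
    "/nix/store/cccccccc-unknown.drv": {"env": {}},
}
print(filter_by_license(drvs, {"Zlib", "MIT"}))
```

Derivations with no recorded license are excluded rather than assumed acceptable, which seems like the safer default for an SBOM filter.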
8518fd8 to df32b8c
I've changed the implementation to be opt-in, so that we can develop the feature until it's ready to be on by default.
No, they are far from ready. At the moment you can't reliably convert an already built system output, because outputs that reference each other create deadlocks. In the past, multiple such deadlocks were already fixed, but I can easily imagine that there are more to be found and edge cases to be fixed.
This could be handled by improving the string context mechanism. That way we don't "pollute" the drv files.
After Domen posted this PR in #slsa:nixos.org, a quick exploratory discussion ensued between @RaitoBezarius and myself, the outcome of which I want to summarize here:
Using The reasons it seemed unsuitable even under this angle mostly stem from:
I hope I captured the outcomes appropriately and that our findings are well received.
NixOS/nix#8080 is somewhat related; the approach currently implemented there only applies metadata from build outputs, but I could see a future where tags can come from evaluation-time info that doesn't land in the derivations.
Can't we just have
I'll mark this as draft, as it's clear this shouldn't be merged until multiple concerns are resolved, and I want to prevent it from getting merged by someone by accident (I'm well aware there is currently a merge conflict that prevents it from being merged right now anyway).
I'm still thinking about this because it seems like such low-hanging fruit. Storing the The proposed alternatives are interesting but require patching Nix, changing the binary cache protocol, or deploying new infrastructure. These ideas require one or two orders of magnitude more effort to get out there, and so far, I haven't seen anybody decide to tackle them. This is the typical "perfect is the enemy of good" situation; everybody jumps in with their ideal opinions. Now, almost one year later, nothing has happened. We could have benefited from the hacky solution already.
Agreeing with you @zimbatm. I'd be fine with merging this in principle, but I'd like to see:
Please be aware that this is how bad design choices and complexity proliferate. Slippery slope and all that. That said, a non-mass-rebuild solution such as NixOS/nix#10780 is not available, and Nixpkgs can make its own choices. Let's make sure we mitigate the impact of this workaround and make sure we actually agree, specifically:
I have more urgent matters to attend to; it would be great if someone picked this up.
That's fine by me. There's a lot to improve about licenses.nix, but that work isn't exactly happening fast, so it's fine for it to go through staging. I wouldn't want to commit to the format of licenses in drv files being stable though, unless all that's written is an SPDX identifier.
This is part of a larger effort to bring security vulnerability notifications to https://cachix.org and anyone else that wants to have the automation.