Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check materialization in pure nix #2051

Closed
anka-213 opened this issue Sep 10, 2023 · 15 comments
Closed

Check materialization in pure nix #2051

anka-213 opened this issue Sep 10, 2023 · 15 comments
Labels
enhancement New feature or request preserved Keep stale bot away wontfix

Comments

@anka-213
Copy link

Is your feature request related to a problem? Please describe.
I want to avoid IFD as much as possible since it causes a bunch of problems, like dry-run not working, not being supported by flakehub etc. Materialization is a solution to this, but now the materialized expressions risk getting outdated. Enabling checkMaterialization is just a regression to using IFD again.

Describe the solution you'd like
My proposal is to within the materialized files write hashes of all the relevant files that affect the materialization, in particular files like <packagename>.cabal, stack.yaml and cabal.project should be sufficient unless I'm missing something. That way we could cheaply check if the materialized expressions might be outdated in pure nix code without needing to do the full reevaluation.

Describe alternatives you've considered
The documentation recommends enabling checkMaterialization in CI only, but I'm not sure how to do that and it will still needlessly slow down CI in cases where nothing has changed.

Potential issues
One issue I can think of if this were enabled by default is that it might trigger re-checking more often than necessary if the changes are minor and can't change the result. There is also a risk that sometimes causing IFD and sometimes not depending on which files have changed would be confusing behaviour for the users.

@anka-213 anka-213 added the enhancement New feature or request label Sep 10, 2023
@anka-213
Copy link
Author

anka-213 commented Sep 10, 2023

Here's a basic prototype I threw together. It checks if the hashes of files like the cabal file matches the hashes mentioned in the materialized directory:

let
checkMatStatus = { src, materialized, ... }@args:
          let
            hashFile = "${materialized}/hashes.json";
            checkFile = __pathExists hashFile;
            cachedHashes = __fromJSON (__readFile hashFile);
            hashAll = __mapAttrs
              (fnm: hsh: {
                old = hsh;
                new = __hashFile "sha256" "${src}/${fnm}";
              })
              cachedHashes;
            isValid = __all (x: x.old == x.new) (__attrValues hashAll);
          in
          if checkFile then
                if isValid then args
                else args // { checkMaterialization = true; }
          else args // { checkMaterialization = true; };
in
  haskell-nix.project (checkMatStatus { ... }

for the prototype I manually created a file like this in the materialized directory: hashes.json

{"hello.cabal":"bceadae934c33b11f4adc4b2567a4fb84a229bffe41e4ab7f9f79a11f16de233"}

A full implementation would also have to check that other parameters are the same as well, like index-state etc. and automatically generate this file in the materialization script.

@andreabedini
Copy link
Member

Thank you for thinking about this. I am not sure I understand though, bear with me.

AFAIU checking the materialisation of a plan consists in building a new build plan and checking that it is identical to the save (materialised) one.

Just off the top of my head, the build plan depends on: compiler, platform, repositories, project configuration, local packages, source-repository-packages and pkgconfig database. I might have missed something :)

Perhaps, it might make sense to have a "shallow" form of validation. Assuming that all the "impure" inputs are fixed by nix, we could isolate the inputs to the solver that might actually have changed?

But no, a change in any of those inputs does not mean there is a change in the plan, the plan would still be valid.

I don't think this is possible unless we can reproduce the solver in nix :)

@anka-213
Copy link
Author

anka-213 commented Oct 6, 2023

@andreabedini Yeah, my idea is indeed to have two levels of checks, one which is cheap to compute and says "the plan may have changed" and the other which is expensive and actually computes the full plan. By only doing the full plan calculation when the hashes of the inputs have changed, you can get the cheap checks most of the time, while still being guaranteed to not forget to update it.

I would personally prefer a workflow that is completely free from IFD, where you run a command to update the materialization whenever the hashes of the inputs have changed. Often the plan may not change, in which case only the hashes would update in the materialization, not the actual plan. Since you need to recalculate the full plan anyways when doing a full check, this should not have any extra cost compared to how it is currently done.

@andreabedini
Copy link
Member

I feel I still miss something. If you materialise the plan and avoid checking the materialisation, isn't that IFD free? Then you can run project.passthru.generateMaterialized to update it? Sorry if I keep misunderstanding.

@anka-213
Copy link
Author

anka-213 commented Oct 6, 2023

@andreabedini IFD is never free because you can't run it in limited environments and you can't use --dry-run on it.

Unfortunately, because we don't have NixOS/nix#4090 yet, calling project.passthru.generateMaterialized will force whatever IFD was used to define project to evaluate first, meaning that you e.g. can't --dry-run project.passthru.generateMaterialized, but instead it will first generate a plan and then evaluate the derivation for generateMaterialized and if you have checkMaterialization enabled it will throw an error and force you to copy paste a command from the error message instead of actually generating it using the command we just called.

@anka-213
Copy link
Author

anka-213 commented Oct 6, 2023

Oh, sorry, I misparsed your statement, but my second paragraph still holds.

@anka-213
Copy link
Author

anka-213 commented Oct 6, 2023

@andreabedini Anyways, the point is to be guaranteed to not forget to call project.passthru.generateMaterialized, without doing the expensive checkMaterialization on every evaluation.

@andreabedini
Copy link
Member

calling project.passthru.generateMaterialized will force whatever IFD was used to define project to evaluate first, meaning that you e.g. can't --dry-run project.passthru.generateMaterialized, but instead it will first generate a plan and then evaluate the derivation for generateMaterialized

the point is to be guaranteed to not forget to call project.passthru.generateMaterialized, without doing the expensive checkMaterialization on every evaluation.

Right, I think we got to the point. In principle, generating the materialised plan does not require any IFD, but perhaps the way generateMaterialized is written triggers it anyway?

Does it? I tried on one of my projects it looks like it does not.

ghc-shell-for-foliage-env ❯ nix build --print-out-paths --option allow-import-from-derivation false .#project.x86_64-linux.plan-nix.passthru.generateMaterialized
/nix/store/6h5sizq1081ccz9fg1hdmb4qjzjaadgy-generateMaterialized

I belive this would have failed to evaluate.

By the way, the generateMaterialized script is nothing else than a derivation that builds a script to copy the materialised plan into a folder. You can obtain the exact same result by copying output path of nix build .#project.x86_64-linux.plan-nix.

Maybe you CI check could be based upon

nix build --dry-run --option allow-import-from-derivation false .#project.x86_64-linux.plan-nix

?

@andreabedini andreabedini self-assigned this Oct 9, 2023
@anka-213
Copy link
Author

@andreabedini Did you already have it materialized when you tried that? If you don't have it materialized or if you enable checkMaterialization, you will get IFD even when calling .#project.x86_64-linux.plan-nix.passthru.generateMaterialized in my experience. That means that your first run you will always be forced to go the IFD route, or copy-paste a command from an error message.

I was also hoping to be able to automatically check the materialization without CI, but still having it be fast when the inputs haven't changed.

This also ties together with an issue I had (#2036), where even building the plan fails when you have cross-compilation enabled, since it for some reason seems to require the full cross-compilation toolchain (which failed on my platform) in order to be able to make a plan at all. So for that reason, I was completely unable to use it without materializing first, which meant copy-pasting an expression from an error-message, which I find to be bad ergonomics. And because of IFD I wasn't able to use --dry-run to even tell what it wanted to build.

@andreabedini
Copy link
Member

andreabedini commented Oct 11, 2023

@anka-213 AH! I tried in a scratch project and I was going to say that I was mistaken but looking at the stack trace provided by nix made me remember.

Are you using cabalProject? Try switching to cabalProject'. This second version is more lazy and should not trigger the IFD.

@anka-213
Copy link
Author

I was using hixProject, which was provided by the template nix flake init --template templates#haskell-nix --impure

@andreabedini
Copy link
Member

I was using hixProject, which was provided by the template nix flake init --template templates#haskell-nix --impure

🤔 hixProject calls project' which is the lazy version.

@hamishmack do you have any idea?

@hamishmack
Copy link
Collaborator

Try comparing the behaviour of:

nix build .#packages.x86_64-linux.default.project.plan-nix.passthru.generateMaterialized

with:

nix repl
:lf .
:b packages.x86_64-linux.default.project.plan-nix.passthru.generateMaterialized

I would have expected these to behave the same, but I've noticed that nix build seems to force more evaluation than I thought it needs to.

@hamishmack
Copy link
Collaborator

One potential problem is that if something on the path for generateMaterialized (like packages.x86_64-linux) includes keys based on the plan (attribute sets keys are not lazy and all of them must be known before you can look at anything in the attribute set), then even glancing at it will cause the IFD to be used.

In fact the template has a bug/feature that probably hides this problem:

      in flake // {
        legacyPackages = pkgs;

        packages.default = flake.packages."hello:exe:hello";

As well as setting packages.default, the last line also hides flake.packages (which would require IFD to look at without indirection). To include flake.packages without forcing the IFD to run we probably need to explicitly inherit each package (so nix can construct the attribute sets):

      in flake // {
        legacyPackages = pkgs;

        packages = {
          inherit (flake.packages) "hello:exe:hello"; # Tell nix what components are in the plan
          default = flake.packages."hello:exe:hello";
        };

@andreabedini andreabedini removed their assignment Nov 2, 2023
Copy link

stale bot commented Mar 1, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 1, 2024
@stale stale bot closed this as completed May 1, 2024
@hamishmack hamishmack added the preserved Keep stale bot away label May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request preserved Keep stale bot away wontfix
Projects
None yet
Development

No branches or pull requests

3 participants