Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support self-fetching parameters in flake.nix #5312

Open
nrdxp opened this issue Sep 30, 2021 · 20 comments
Open

Support self-fetching parameters in flake.nix #5312

nrdxp opened this issue Sep 30, 2021 · 20 comments
Labels

Comments

@nrdxp
Copy link
Contributor

nrdxp commented Sep 30, 2021

#5284 happened, so instead I'll open a proper issue to address what I feel would be the "correct" solution since I am incapable of implementing it atm.

I see no particular reason why self should be especially handicapped from the other inputs, but should be able to be easily handled as a regular git type input, or whichever active VCS Nix can detect.

you wouldn't even have to change the syntax, but simply allow inputs.self.submodules = true, for example.

Nix would simply have to know that inputs.self is the current self.

@edolstra
Copy link
Member

edolstra commented Oct 1, 2021

There is a very good reason why self is different: we have to fetch the self flake before we can read its flake.nix. So any attributes like submodules have to be known in advance.

@nrdxp
Copy link
Contributor Author

nrdxp commented Oct 3, 2021

There is no possible mechanism to read the flake inputs before self is fetched? Like maybe just pulling the lock file into the Nix store for reading before the rest of the flake?

@mstone
Copy link

mstone commented Oct 4, 2021

@nrdxp — I’ve been thinking about this a bit because it’s going to bite everyone who tries to flakeify upstream projects that use submodules, like, in my case, qemu.

So far the best idea I’ve come up with is to write the flake to use an auxiliary input, in addition to self, that fetches the same underlying repo just with submodules enabled.

Unfortunately, that still leaves several quality-of-life problems that I wonder if we can fix:

a) that the self input is fetched but is unused (other than for obtaining flake.nix/flake.lock)

b) that the self-with-submodules input is “locked” when we’d rather that it be “derived from” or “tied” to self — e.g. so that we see dirty working copy edits / whatever the flake-ref resolves to when being built, and don’t have to remember to push commits that bump flake.lock just to test changes.

(P.S. — if you’re interested in this approach, I wrote more at https://gist.github.com/mstone/4218e6fa9ef98e5153f543fb8e7b4743 with several variations in this idea, though I’m not really sure that any of them are ready for prime time. :-/)

@Kha
Copy link
Contributor

Kha commented Oct 4, 2021

There is a very good reason why self is different: we have to fetch the self flake before we can read its flake.nix. So any attributes like submodules have to be known in advance.

Would this still be the case with #3121? Afaiu, in that case we would evaluate flake.nix in the local checkout/git cache and only then copy self to the store.

@edolstra
Copy link
Member

edolstra commented Oct 4, 2021

#3121 only addresses the case of a local git tree.

The fundamental issue is that we do a fetchTree to get the top-level flake, which requires having all the attributes needed for fetching (like submodules). If the top-level flake.nix can then change those attributes, we may have to do another fetchTree. Also the semantics get pretty ugly, since you could do something like inputs.self.url = some-arbitrary-url.

@kero0
Copy link

kero0 commented Oct 11, 2021

Am I missing something? Or why can't the answer just be as simple as manipulate self with command line args? So have a --fetch-submodules arg and add more as they are requested.

@nrdxp
Copy link
Contributor Author

nrdxp commented Oct 11, 2021

Command line args would work in theory, but if I'm conceptualizing it correctly, it would also break purity since the end result would be vastily different depending on the args passed.

@mstone
Copy link

mstone commented Oct 11, 2021

While I like them for other reasons, I disagree that CLI improvements are sufficient here: — instead, I think the point here is that flake-authors need a way to specify all build instructions, including what code to build, as part of the flake definition — that way, end-users get something that “just works” without more detailed communication from the flake authors (e.g. of a special URL to build, or extra CLI arguments, etc.) whenever possible.

@kero0
Copy link

kero0 commented Oct 11, 2021

I think that it's inevitable that flake authors need to communicate with flake users.

As for purity or consistency, I don't see many scenarios in which someone would have a requirement for a flake and the flake would still build without it. A build that is missing files will pretty much always fail to build. If your building a top level flake you should be checking the docs released by the author and use the proper args.

@nrdxp
Copy link
Contributor Author

nrdxp commented Oct 11, 2021

IIRC, more general flake arguments passed from the cli to the flake builder were considered early on, similar to how one can pass arguments to nix expressions with the old nix-build, but were never implemented because of the same purity concerns. So, unless I misunderstood, I dont' see cli args being accepted as a solution.

@edolstra
Copy link
Member

edolstra commented Oct 11, 2021

In principle you should be able to use a flake URL like nix build git+file:///path?submodules=true, but it looks like we're not handling the submodules parameter in URLs yet...

Also this wouldn't be very user-friendly

@mstone
Copy link

mstone commented Oct 11, 2021

@edolstra, @nrdxp, @kero0 -- given the discussion above, I'd like to try to summarize what I think we've collectively concluded so far.

Please let me know if you think I've missed anything critical or if you think I've misstated/mischaracterized any of the following facts:

  1. most flakes work fine without automatically-fetched git submodules.
  2. fetching git submodules by default caused a bad performance regression, fixed by reverting that behavior via Revert "Merge pull request #4922 from nrdxp/default-submodules" #5284.
  3. fetching git submodules by default is unwanted by some flake authors in any event.
  4. nevertheless, a small but important set of flake use cases (specifically including flakeifying upstream projects that already use submodules) would benefit from giving flake authors an ergonomic way to tell flake-users' flake interpreters that the flake's source code requires submodule fetching.

Additionally, I think we seem to be orbiting the following "nascent UX principles" (though with some conflicting opinions about the latter two):

  1. "flake users should be able to use query-string parameters to control the data of the FlakeRef of the InstallableFlake they are manipulating -- what we have, in earlier parts of the conversation, referred to as the notional 'inputs.self fetchTree data'."

  2. (advocated by me:) "Flake authors should be able to ergonomically provide data and instructions to flake users via flake.nix in order to make the use cases supported by the nix CLI "just work". Today, this is done primarily for CLI commands like nix search, nix build, nix run, nix flake show, etc. via 'well-known attributes' like outputs, packages.<system>, defaultPackage."

  3. (or alternately, paraphrasing @kero0): "Flake authors need to communicate with flake users. If your building a top level flake you should be checking the docs released by the author and use the proper args."

Finally, I think we are juggling the following emergent design constraints / conclusions:

  1. The facts above prohibit us from fetching submodules by default for every git-based flake.

  2. Accepting (6) above should discourage us from requiring that flake-authors communicate submodule-fetching requirements to end-users "out-of-band", i.e., outside of nix-meaningful instructions in flake.nix/flake.lock.

  3. Accepting (7) above in combination with improving flake-ref query argument support likely would give us an answer for how flake author/user communication should work around coordinating submodule-fetching but "wouldn't be very user-friendly", per @edolstra.

  4. However, all the solutions that we know about based on accepting (6) seem to require some kind of "double-fetching" (or in build system terms, "second evaluation", or in architectural style terms, "code on demand") scheme in order to enable flake.nix fetching and evaluation to happen before the particular value representing the self argument [or possibly another argument, but with self-like treatment in flake.lock] is determined.

Have I got anything wrong here?

Thanks! -- Michael

@nrdxp
Copy link
Contributor Author

nrdxp commented Oct 11, 2021

for 11, couldn't we avoid a double fetch by generating lock entries for submodules?

@mstone
Copy link

mstone commented Oct 11, 2021

@nrdxp -- oooh, interesting, I'll need to think about that. I wonder if that might also make it possible to handle ?submodules=true instructions for github:...-style flake-refs...

@L-as
Copy link
Member

L-as commented Oct 20, 2021

My suggestion for how to fix this:

Flakes could have a metadata attribute set in flake.nix, as such:

{
metadata.fetch-git-submodules = true;
}

The metadata attributes would also have to be in normal-form, and would not allow any path literals.

This means that Nix can fetch the flake, check the metadata, then check whether to also fetch the submodules before evaluating any Nix code.
The metadata could also be in a separate JSON file if that is better.

With this you don't have the double-fetching problem, but it's restricted to Git. This isn't "clean" IMO, but IMHO the problem is Git itself not always fetching submodules. Git should have from the beginning enforced fetching submodules, and we have to work around that now.

The only downside with always fetching submodules in Nix is that it breaks backward compatibility in some subtle ways, and doesn't work well with git-crypt.

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-it-possible-to-make-a-flake-that-has-no-source-tree/16037/3

mhuesch added a commit to neighbour-hoods/social_sensemaker that referenced this issue Nov 27, 2021
`self` flake can't have submodules. see:

NixOS/nix#5312
NixOS/nix#4435

for now, it seems cleaner to just work through `Cargo.toml`, with
specified git revisions...
@marksisson
Copy link

marksisson commented Jul 22, 2022

Currently, I use nix build .?submodules=1 to build a flake that uses git submodules.

What if we added a nix configuration option, e.g. flake-reference-use-submodules, to set the flake reference submodule attribute to a desired default (if not specified, it could be false by default to maintain backwards compatibility). Then we would not have to pass it on the command line.

Also, since flakes have a nixConfig attribute, maybe this nix configuration option could also be specified in the flake.nix to override the option that is specified in nix.conf? or vice-versa?

@nrdxp
Copy link
Contributor Author

nrdxp commented Jul 22, 2022

Also, since flakes have a nixConfig attribute, maybe this nix configuration option could also be specified in the flake.nix to override the option that is specified in nix.conf? or vice-versa?

This sounds like the cleanest solution to me, assuming we can get approval for the new nix.conf setting.

@roberth
Copy link
Member

roberth commented Nov 12, 2022

nixConfig would also affect inputs. We need it to be more fine grained; per-flake. Something like @L-as' suggestion or fetchConfig would be more appropriate for this, or inputs.self could be special cased, so as to provide a consistent interface.

With this you don't have the double-fetching problem, but it's restricted to Git.

We have multiple solutions to prevent double fetching.

Caching:

  • For inputs, only double fetch when an input is first added. Save the fetch config to the lock. Reuse this info and double fetch only when the fetch config has changed.
  • For the cli, cache the fetch config after first fetch, just like we do in the lock.

Fetcher dependent solutions:

  • For git, which is most flakes, do a minimal fetch first and expand on that as required by the fetch config.
  • For github, gitlab, etc: use the API to fetch just flake.nix first. This is hardly a double fetch, but not as efficient as the previous solutions. Worth keeping in mind though.

The caching solutions introduce significant complexity. This may be worth embracing at some point, but until then, the fetcher dependent solutions don't impose complexity on the rest of the system, making them feasible to implement first, to solve 90% of the problem.


I'll rename this issue, as the current problem is only about fetch parameters. The remainder of original problem statement is to treat similar things similarly, and I think we can agree on that. If there's another issue related to self, a new issue with a new thread is more productive.

@roberth roberth added the flakes label Nov 12, 2022
@roberth

This comment was marked as resolved.

@nrdxp nrdxp changed the title flakes: don't discriminate self input Support self-fetching parameters in flake.nix Nov 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

10 participants