[RFC 0171] Default name of fetchFromGithub FOD to include revision #171

Closed
wants to merge 8 commits into from


jonringer
Contributor

@jonringer jonringer commented Mar 20, 2024

Make the fetchFromGitHub FOD name more meaningful. This avoids stale artifacts and makes Nix store paths more descriptive of their contents.

Rendered: https://github.com/jonringer/rfcs/blob/jringer/fetch-from-github/rfcs/0171-fetch-from-github.md

Items for further refinement:

  • Other version control fetchers
  • Prefix vs suffix of src label

Jonathan Ringer and others added 2 commits March 19, 2024 19:04
@jonringer jonringer force-pushed the jringer/fetch-from-github branch from 17a0a70 to 7534dd4 Compare March 20, 2024 02:06
@Aleksanaa
Member

Just one silly question: why only fetchFromGitHub but not including fetchFromGitLab and others, besides the reason that it is very common?

Although fetchFromGitHub is one of many fetchers, it is very common, and it generally has the user specify granular source information, which makes differentiating between sources easy.

@jonringer
Contributor Author

jonringer commented Mar 20, 2024

Just one silly question: why only fetchFromGitHub but not including fetchFromGitLab and others, besides the reason that it is very common?

I did reference this

- Similar treatment to similar fetchFromGitX helpers?

I mention fetchFromGitHub because it's about 20x as common:

$ rg -i fetchFromGitHub | wc -l
33178
$ rg -i fetchFromGitlab | wc -l
1478

I'm not opposed to doing a similar treatment to the other fetchFromX helpers. I think this RFC could be expanded to encompass them all eventually.

To your point, maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision"

@eclairevoyant

There's already a speculative impl as of 2 weeks ago: NixOS/nixpkgs#294068

Though the format is slightly different ("source-${owner}-${repo}-${rev}", essentially)

@ShamrockLee

ShamrockLee commented Mar 23, 2024

Though the format is slightly different ("source-${owner}-${repo}-${rev}", essentially)

I chose the format to make it easier to inspect which source FOD is going wrong when several packages get rebuilt at the same time.

Update: Just came across NixOS/nixpkgs#49862, which seems to work on the package names of fetchers.

Comment on lines 69 to 77
- "Interchangeability" with other fetchers is diminished as the derivation name is different
  - In practice, fetchFromGitHub is never used in this way. It is generally the only fetcher, so there is never another FOD to deduplicate.
- Out-of-tree repositories may get hash mismatch errors
  - If the cause of the mismatch is staleness, this is good and working as intended
  - If the cause is non-determinism, this is frustrating.
- Some derivations assume "source" to be the name of sourceRoot
  - This has been mitigated over two years within Nixpkgs
  - Out-of-tree code may break if they assume "source" is the name
    - Can be mitigated with release notes describing the issue
@ShamrockLee ShamrockLee Mar 23, 2024

These are caused by the assumption of src.name == "source" being broken.

However, we have already deprecated that assumption as early as Nixpkgs 21.11. The Nixpkgs Manual now requires sourceRoot to start with "${src.name}" instead of "source" when src is constructed by fetchgit-based fetchers, and tree-wide conversions (NixOS/nixpkgs#245388, NixOS/nixpkgs#247977, NixOS/nixpkgs#248528, NixOS/nixpkgs#248683, NixOS/nixpkgs#294334) have been merged.

fetchFromGitHub and other output-as-a-directory fetchers can still be used interchangeably if we stick to ${src.name} instead of "source".

In my opinion, we do not provide bug-level compatibility to packages failing to follow already-stabilized specifications, in-tree or out-of-tree.
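
A minimal sketch of the convention described above, assuming a hypothetical package whose build files live in a `subdir` directory of the fetched repository. The owner, repo, version, and subdirectory are placeholders, and `lib.fakeHash` stands in for a real hash; the `sourceRoot` line is the only part that matters here.

```nix
# Hypothetical package expression; every name below is a placeholder.
{ lib, stdenv, fetchFromGitHub }:

stdenv.mkDerivation (finalAttrs: {
  pname = "example";
  version = "1.0.0";

  src = fetchFromGitHub {
    owner = "example-owner";
    repo = "example-repo";
    rev = "v${finalAttrs.version}";
    hash = lib.fakeHash; # placeholder, not a real hash
  };

  # Derive the unpacked directory name from the fetcher's output name
  # instead of hard-coding "source", so a renamed FOD keeps working.
  sourceRoot = "${finalAttrs.src.name}/subdir";
})
```

Written this way, the expression should not care whether the fetcher names its output `source` or something that includes the repository and revision.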

@ShamrockLee

maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision"

IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. fetchcvs, fetchsvn, fetchgit, fetchhg, fetchbzr, etc.) and fetchers based on specific service providers (e.g. fetchFrom*).

@Ericson2314
Member

Hehe per #133 (comment), my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer!

@ShamrockLee

Hehe per #133 (comment), my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer!

Congratulations!

We still need to fix those fetchers at Nixpkgs level, though, since Nixpkgs often takes years to adopt new Nix features.

Comment on lines +95 to +101
- Full commit hashes could be truncated. This sacrifices a bit of simplicity for better-looking derivation names:
```
let
  version = builtins.replaceStrings [ "refs/tags/" ] [ "" ] rev;
  # Git commit hashes are 40 characters long, assume that very long versions are version-control labels
  ref = if (builtins.stringLength rev) > 15 then builtins.substring 0 8 version else version;
in lib.sanitizeDerivationName "${repo}-${ref}-src";
```
Member


I wouldn't recommend this. Short hashes are not guaranteed to be stable for long term storage.

@ShamrockLee ShamrockLee Mar 25, 2024

40 characters are not really that long for a machine-generated derivation.

People would most likely see the store hash and (hopefully) part of the name when it flashes through their terminal. During debugging, a path like that would occupy at most a bit more than one line inside the terminal.

Contributor Author

The current "source" isn't that stable either. 8 characters is still a fair amount of entropy, and likely to be different enough for most repositories.

Contributor Author

@jonringer jonringer Apr 2, 2024

Also remember that it doesn't have to be "unique across all time". It just needs to be different from what was there before, so that the combination of name + hash is different.

@infinisil infinisil changed the title Default name of fetchFromGithub FOD to include revision [RFC 0171] Default name of fetchFromGithub FOD to include revision Mar 28, 2024
@MMesch MMesch added the status: open for nominations Open for shepherding team nominations label Apr 2, 2024
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-04-02/42643/1

@ShamrockLee

ShamrockLee commented Apr 3, 2024

I'd like to nominate myself as a shepherd. I previously authored a related implementation, NixOS/nixpkgs#294068, which could inform the discussion around this RFC's design.

I'm excited about this RFC and look forward to working with the community to shepherd it through the process.

Co-authored-by: Yueh-Shun Li <shamrocklee@posteo.net>
@jonringer
Contributor Author

IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. fetchcvs, fetchsvn, fetchgit, fetchhg, fetchbzr, etc.) and fetchers based on specific service providers (e.g. fetchFrom*).

If we have a shepherds' meeting, we can refine the scope.

@oxij
Member

oxij commented Apr 4, 2024

Yes, NixOS/nixpkgs#49862 is closely related. In its latest incarnation, it lets you keep both the *-source names on Hydra and generate pretty *-<name>-<version>-<optional fetcher>-source names when the config.nameSourcesPrettily option is set. See https://github.com/NixOS/nixpkgs/pull/49862/files#diff-1977c7748af8b43b92093d2383e23e88c4df10a702578fbe885675a0956a2f1f

@ShamrockLee

To encode the fetcher name into the name attribute, the default name could also be ${repo}-${rev}-github-source.
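
For comparison, the naming schemes mentioned so far in this thread can be laid side by side. This is only an illustrative sketch: the owner, repo, and rev values are placeholders, the attribute names are made up for this example, and the second and third formats are written as they are quoted in the comments above rather than taken from any merged implementation.

```nix
# Evaluate with: nix-instantiate --eval --strict <this file>
let
  # Placeholder inputs, not a real repository.
  owner = "example-owner";
  repo = "example-repo";
  rev = "v1.2.3";
in
{
  # Today's default: the same name for every source.
  current = "source";

  # Format from the RFC draft snippet quoted above.
  rfcDraft = "${repo}-${rev}-src";

  # Format described for the speculative implementation.
  speculativeImpl = "source-${owner}-${repo}-${rev}";

  # Variant that also encodes the fetcher.
  withFetcher = "${repo}-${rev}-github-source";
}
```

A rev such as `refs/tags/v1.2.3` contains characters that are not valid in a store path name, which is why the RFC snippet passes its result through `lib.sanitizeDerivationName`.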

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfc-171-default-name-of-fetchfromx-to-include-version-information/43001/2

@jonringer
Contributor Author

@oxij I would like to nominate you as a shepherd, looks like you've thought about this longer than I've been involved in nixpkgs.

I like your addition of a lib function to make derivation names more consistent in NixOS/nixpkgs#49862

@oxij
Member

oxij commented Apr 29, 2024 via email

@risicle
Contributor

risicle commented May 6, 2024

I certainly think this conversation needs to be revived because of the security implications of

Nix reuses existing fixed-output outputs between different derivations without actually checking the derivation actually builds into an output with the given output hash

which leaves us open to a potential cache poisoning attack on cache.nixos.org, related to NixOS/ofborg#68 (comment) - using the same name for a large proportion of our FODs significantly exacerbates this.
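
The reuse behaviour quoted above can be sketched with two fetches that differ only in revision. This assumes a `<nixpkgs>` channel is available on `NIX_PATH`; the owner, repo, and revisions are placeholders, and `lib.fakeHash` stands in for a hash that was not updated after a version bump. Nothing is downloaded, since only the output paths are compared.

```nix
# Evaluate with: nix-instantiate --eval --strict <this file>
let
  pkgs = import <nixpkgs> { };

  # Hypothetical helper: same (stale) hash for every revision.
  fetch = extraArgs: rev:
    pkgs.fetchFromGitHub ({
      owner = "example-owner";
      repo = "example-repo";
      inherit rev;
      hash = pkgs.lib.fakeHash; # placeholder, not a real hash
    } // extraArgs);

  # Default name: both revisions share one name and one hash.
  oldDefault = fetch { } "v1.0.0";
  newDefault = fetch { } "v1.1.0";

  # Revision encoded in the name, in the spirit of this RFC.
  named = rev: fetch { name = "example-repo-${rev}-source"; } rev;
in
{
  defaultNameCollides = oldDefault.outPath == newDefault.outPath;
  revisionNameCollides = (named "v1.0.0").outPath == (named "v1.1.0").outPath;
}
```

Because a fixed-output store path is derived from the name and the declared hash, `defaultNameCollides` should come out true for as long as the default name is the fixed string `source`, while the revision-bearing names should yield distinct paths. In the colliding case, Nix would reuse whatever is already stored at that path instead of fetching the new revision.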

@oxij
Member

oxij commented May 6, 2024 via email

@risicle
Contributor

risicle commented May 6, 2024

You're close to the scenario detailed by that author to the security team.

Ultimately we can never completely stop a malicious change missing the attention of a reviewer (and this goes for all projects, hashed caching scheme or not), but we can make it so the submitter has to jump through some more hoops that will hopefully make the PR look weirder and provoke more attention. Hence small ("small") changes like this.

@ShamrockLee

ShamrockLee commented May 6, 2024

Eelco and others objecting to the <hash>-<name>-<version or revision>-source naming scheme (and to the original 2018 version of NixOS/nixpkgs#49862 which wanted to do exactly this, but for all fetchers, not just fetchFromGitHub, and, primarily, for a different reason of /nix/store discoverability) do have a point: the current most common <hash>-source naming scheme is awfully convenient for running a build server (like Hydra) and when you have to frequently switch between fetchers (e.g. when using out-of-Nixpkgs derivations and/or flakes).

The objection (NixOS/nixpkgs#49862 (comment) and NixOS/nixpkgs#59858 (comment)) is no longer relevant, as sourceRoot = "source" has been officially deprecated since Nixpkgs 23.11 when an FOD returned by a fetchzip-based build helper is assigned to src. At least five rounds of clean-up PRs have been merged to eliminate the sourceRoot = "source" assumption.

@oxij
Member

oxij commented May 6, 2024 via email

@ShamrockLee

ShamrockLee commented May 6, 2024

  • Changing most source derivation names no longer breaks most builds, yes. (Some do depend on customly named source derivations still, AFAIK. E.g. Apache Arrow, Datafusion, etc.)

A custom src name and sourceRoot are often required for tools that need a certain source directory name to work.

  • But moving away from *-source naming scheme will still make using flakes inconvenient.

Good point!

That would be an unfortunate cache miss. Nevertheless, isn't it "the Nixy way" to pursue dependency isolation even at the cost of rebuilds (e.g. changing the version of the compiler used to compile Bash triggers a world rebuild)?

@oxij
Member

oxij commented May 6, 2024 via email

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-05-14/45414/1

@kevincox
Contributor

RFCSC:

@oxij We notice that you didn’t accept the nomination, but keep in mind that you don’t need to agree with the RFC to be a shepherd, contrasting views can be very helpful. The role of the shepherd team is to decide whether the community as a whole would benefit from the RFC, having different views in the shepherd team can be helpful to ensure the entire community is heard. Please reconsider if you’d like to be a shepherd after all and let us know.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-05-28/46113/1

@oxij
Member

oxij commented Jun 2, 2024 via email

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-06-10/46817/1

@kevincox
Contributor

I don't see much point in merging an option if we don't know that we want to use it. Otherwise we need to consider taking it back out if it ends up that this isn't desirable. I would only recommend merging the implementation first if there are unknowns that would have a major impact on the RFC itself and that can not be properly tested without shipping to most users behind a config option.

I would recommend going forward with the RFC, then if it is accepted the mentioned PRs should be fairly easy to get in. (Code quality concerns would still have to be addressed of course, but the intention will be known to be desirable.)

@jonringer
Contributor Author

I don't see much point in merging an option if we don't know that we want to use it. Otherwise we need to consider taking it back out if it ends up that this isn't desirable.

Agreed, the main reason why I drafted the RFC is for correctness. I would like for it to be opt-out rather than opt-in.

Using the opt-in might be nice for PR stabilization, but I wouldn't want that for an end state.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-06-24/47589/1

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-07-08/48678/1

@philiptaron

philiptaron commented Jul 13, 2024

I'm going to resign as shepherd, since I don't know how to do the job.

I agree broadly with this RFC that making, using, and updating fixed-output derivations is too hard in Nix and Nixpkgs. There are a lot of good suggestions here about how to make it better. Without taking a strong position on any particular suggestion, it's a shame that making these sorts of improvements to the status quo is so difficult, and that the levers of action seem both far away and not able to be pulled by roughly anybody.

I do see some positive action in Lix-land: https://gerrit.lix.systems/c/lix/+/1536

@infinisil
Member

I'm going to resign as shepherd, since I don't know how to do the job.

Ah that's unfortunate, thanks for being explicit about stepping down. Shepherding has an (imo) decent explanation here:

The responsibility of the team is to guide the discussion as long as it is constructive, new points are brought up and the RFC is iterated on and from time to time summarise the current state of discussion. If this is the case no longer, then the Shepherd Team shall step in with a motion for FCP.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-08-05/50170/1

@asymmetric
Contributor

RFCSC: This RFC is being closed due to lack of interest. If enough shepherds are found, this issue can be reopened. If you don't have permission to reopen, please open an issue for the NixOS RFC Steering Committee linking to this PR.

@asymmetric asymmetric closed this Aug 19, 2024
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfcsc-meeting-2024-08-19/50831/1
