Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opam monorepo support #18

Merged
merged 1 commit into from
Sep 22, 2022
Merged

opam monorepo support #18

merged 1 commit into from
Sep 22, 2022

Conversation

RyanGibb
Copy link
Contributor

@RyanGibb RyanGibb commented Sep 7, 2022

This PR adds support for opam-monorepo.

To support building mirage unikernels for cross-compilation targets we need to build them with dune since mirage 4 (in the absence of ocaml/opam cross compilation support).

Dune supports cross compilation with 'build contexts' using a toolchain which will use a target compiler installed in <opam-switch>/default/<toolchain> (as opposed to the host compiler). But not all dependencies are build with the target compiler. Notably ppx requires building with the host compiler. This information in stored in dune files -- e.g. here. In order to statically know how to build dependencies, we would have to parse the dune files to try and find out (which is non-trivial). Potentially a better solution is to encode the dependency build contexts in opam files, but opam doesn't currently support cross-compilation.

Instead, opam-monorep is used to build mirage unikernels. This fetches all the dependencies (filtered with ?monorepo in the opam file) to a local directory duniverse. Dune then builds the whole project. Unfortunately we can't benefit from the dune cache in our nix derivation, so all the dependencies will be rebuilt on every nix build.

This is implemented in the opam-monorepo project. This PR adds support for this workflow in the for of the queryToScopeMonorepo function, along with the lower layer functions mkSrcsScope, deduplicateSrcs, and defsToSrcs. We refactor the source fetching logic into evaluator/default.nix to share it between mkSrc (a derivation fetching a package's source) and builder.

To see an example of this usage see https://github.com/RyanGibb/mirage-hello. A flake template is the works.

README.md Outdated Show resolved Hide resolved
src/opam.nix Outdated Show resolved Hide resolved
src/opam.nix Outdated Show resolved Hide resolved
src/opam.nix Outdated Show resolved Hide resolved
src/opam.nix Outdated
defToSrc = { version, ... }@pkgdef:
let
inherit (getUrl bootstrapPackages pkgdef) src;
name = if pkgdef ? dev-repo then urlToName pkgdef.dev-repo else "";
Copy link
Collaborator

@balsoft balsoft Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is having "" as the fallback name for the case when there's no dev-repo the same as what opam-monorepo does?

Copy link
Contributor Author

@RyanGibb RyanGibb Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opam-monorepo filters out 'virtual' packages (without a dev-repo or url) here: https://github.com/tarides/opam-monorepo/blob/9262e7f71d749520b7e046fbd90a4732a43866e9/lib/lockfile.ml#L104

I'm still thinking about the best way of doing this with opam-nix.

This isn't the only filtering opam-monorepo does: it also filters packages that aren't build with dune (which is the reason for opam-overlays). But AFAIK we can't do this with opam admin list. Maybe modifying the repo to remove any packages not built with dune is the best way to go.

I'm not sure if we need to filter packages built with dune if out of this holds true:

Package versions from earlier repositories take precedence
over package versions from later repositories.

But if this could take a higher version from a later repository (if the higher version doesn't have an overlay yet) it would cause trouble.

Copy link
Collaborator

@balsoft balsoft Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the higher version doesn't have an overlay yet

Hm, yes, good point. The filtering can probably be done by doing listRepo on the opam-overlays and removing those packages from opam-repository entirely, before passing the final repo to opam admin list. This can be quite slow though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented this here, but unfortunately it doesn't consider the case when the newer version does build with dune but an older version (which is overlayed) doesn't.

For example, bheap.1.0.0 doesn't build with dune and therefore has an overlay, but bheapversion.2.0.0 does build with dune so doesn't have an overlay.

When building mirage-hello using opamRepositoryModuloOverlay we get the error

       > [ERROR] No solution including optional dependencies for hello.dev:   * Missing dependency:
       >             - bheap >= 2.0.0
       >             no matching version

I think this is why opam-monorepo filters (if it depends on dune) by package version rather than package name.

Copy link
Contributor Author

@RyanGibb RyanGibb Sep 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option I can see is parsing every package in the repo and filtering out only those that don't depend on dune. But I imagine this would be quite expensive (there are ~24k packages in opam-repository[0]) even if it would only have to be done once.

Another option would be to use opam monorepo lock to resolve dependency versions.

[0]

opam-repository $ find packages -mindepth 2 -maxdepth 2 | wc -l
24767

@balsoft
Copy link
Collaborator

balsoft commented Sep 8, 2022

Overall a very neat implementation, seems to be minimal and mostly agree with upstream. Amazing job! The comments are mostly nitpicking, feel free to ignore them if you see fit. Please squash to one (or maybe a couple) commits before I merge this.

@RyanGibb
Copy link
Contributor Author

RyanGibb commented Sep 9, 2022

Thank you! The comments are very useful :-)

I'm actually working on a buildOpamMonorepo to pick up pin-depends (for example in mirage-www) so will convert this to a draft this and a few other tweaks are complete.

@RyanGibb RyanGibb marked this pull request as draft September 9, 2022 12:04
src/opam.nix Outdated
# filter out pkgs without dev-repos
if pkgdef ? dev-repo then { inherit name version src; } else { };
# remove filterPkgs
filteredDefs = builtins.removeAttrs defs filterPkgs;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed here? If callers of this function want to remove some packages, it's really easy for them to just call remoteAttrs themselves.

Copy link
Contributor Author

@RyanGibb RyanGibb Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was this allows packages to be removed based on their name (rather than their dev repo name). For example, the unikernel package we're building has a `dummy' dev-repo which we need to remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite useful when we want to exclude dependencies from the monorepo that are in the same source as the unikernel (e.g. https://github.com/mirage/mirage-www) in buildOpamMonorepo with:

filterPkgs = ... (attrNames (latestVersions // query)) ...;

src/opam.nix Outdated Show resolved Hide resolved
src/opam.nix Outdated
filter = path: type: type != "directory" ||
# O(n^2)...
# if there isn't a match for an package in opam-overlay
! elem true (map (overlayPkgPath: hasInfix overlayPkgPath path) overlayPkgPaths);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
! elem true (map (overlayPkgPath: hasInfix overlayPkgPath path) overlayPkgPaths);
! any (overlayPkgPath: hasInfix overlayPkgPath path) overlayPkgPaths;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, hasInfix might be a bit flaky here, and quite slow.

On a more general note, maybe there's a way to just make a custom joinRepos that would overwrite entire package's instead of just package versions. That would remove the slowdown, but will probably require modifying the symlinkJoin from nixpkgs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That just made me realise that joinRepos is not quite correct ATM, since it doesn't actually do the version overriding properly. I think I'll fix it and implement this at the same time.

Copy link
Contributor Author

@RyanGibb RyanGibb Sep 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea, but I should warn you I think it will suffer from the same issues I described in the other open comment.

Essentially there's two possibilities when there's a package in opam-overlay with a newer in opam-repository without an overlay. One, an overlay hasn't been written yet. Two, the new version now supports dune and an overlay isn't required. In the former we want to remove the new version from opam-repository, in the latter we don't.

I don't think there's a way to differentiate these cases without parsing the new version's opam file to check for a dune dependency which suggests (although I guess doesn't guarantee, but it seems to work fine for opam-monorepo) dune support.

I'm ironing out the bugs on a function to do this now. It's not fast, but I think it'll work. I'll be offline for the next couple of hours but will push what I have now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay it seems to be working! (Although it's still building... I might lose network connection soon so pushing preemptively)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, ok, sorry for not reading thoroughly. I still made the joinRepos a bit better (and quite a lot faster actually) by patching lndir (which comes from... Xorg of all places?)

#20

src/opam.nix Outdated Show resolved Hide resolved
src/opam.nix Outdated
@@ -562,6 +562,19 @@ in rec {

# remove non-dune dependant packages
opamRepositoryFiltered =
# TODO is there a way to do this outside a derivation?
let parseOpam = pkgPath:
Copy link
Collaborator

@balsoft balsoft Sep 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use importOpam? (maybe it's a good idea to rewrite it and other functions to use your pkgNameVer thing, though)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Purely aesthetics. It's hard to judge progress when building ~24k derivations with the same name 😄 I can add this to importOpam if you think it's nice!

src/opam.nix Outdated
@@ -562,6 +562,19 @@ in rec {

# remove non-dune dependant packages
opamRepositoryFiltered =
# TODO is there a way to do this outside a derivation?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there's not. IFD is the only way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(it might be a bit faster to do all the opam2json-ning in one derivation and then read the combined output, though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking the same thing. I'll give it a shot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this works, I'm wondering if materialization might be a way of speeding up deployments...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An aside: it would be nice if we could use just the version solving functionality of opam without pulling the whole program in. I might look into the feasibility of a refactor of opam to support this.

Copy link
Contributor Author

@RyanGibb RyanGibb Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've hacked something together that builds opamRepositoryFiltered in one derivation:

  # remove non-dune dependant packages
  opamRepositoryFiltered =
    let
      keepPackages = [ "ocaml" "dune" "ocaml-solo5" "ocaml-src" "solo5"
      "ocaml-config" "ocaml-system" "ocaml-variants" "base-bytes" ];
    in bootstrapPackages.stdenv.mkDerivation {
      name = "opamRepositoryFiltered";
      src = opamRepository;
      buildInputs = with bootstrapPackages; [ jq ];
      installPhase = ''
        # look at package version directories
        # (e.g. /nix/store/<hash>-opam-repository-modulo-overlay/packages/<name>/<name>.<version>)
        for f in $(find packages -mindepth 2 -maxdepth 2); do
          echo $f
          # special cases as we will filter them before creating monorepo
          # just needed so `opam admin list` doesn't complain
          # TODO parse queries opam files to a list of dependancies with 
          # build=false so only pick up monorepo deps and we remove this workarounds
          # TODO or just use opam monorepo solver...
          if $(echo "$f" | awk 'BEGIN {
              split("${concatStringsSep " " keepPackages}", parts);
              for (i in parts) dict[parts[i]]=""
            } { exit !($1 in dict) }'); then
            # do nothing
            echo
          # if package depends on dune
          # TODO properly recursively parse vals of depends
          elif $(${opam2json}/bin/opam2json "$f/opam" |\
            jq '[ .depends | flatten | .[] | if type=="string" then . else .val end ] |
              if index("dune") == null then halt_error(1) else halt end'); then
              mkdir -p "$out/$f"
              cp -r "$f" "$out/$f"
          fi
        done
      '';
    };

But it's also very slow (waiting on exact timings now). I think using opam monorepo's solver would be a better solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this creates a very IO bound process that takes ~half an hour to complete 😄 I'm going to look at using opam monorepo lock and parsing the lockfile to solver versions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable.

@RyanGibb
Copy link
Contributor Author

RyanGibb commented Sep 22, 2022

So trying use opam monorepo lock has more problems than I anticipated. First and foremost I can't get it to use local repositories.

However I don't think we actually need this. It's nice to have, but a work around for:

The case when the newer version does build with dune but an older version (which is overlayed) doesn't.

Is just to downgrade opam-repository.

This isn't ideal, but I'm happy to merge this as-is if you are and create a separate PR for this customer solver logic. Alternatively, I can carry on using the fork and let you know when I have more success.

@RyanGibb
Copy link
Contributor Author

RyanGibb commented Sep 22, 2022

An aside: another use case I found for opam monorepo is picking up or dependencies. A workaround for this is just manually adding them to the query.

@RyanGibb RyanGibb marked this pull request as ready for review September 22, 2022 12:50
@balsoft
Copy link
Collaborator

balsoft commented Sep 22, 2022

This isn't ideal, but I'm happy to merge this as-is if you are and create a separate PR for this customer solver logic

Yep, I think this is a good-enough solution as it is right now. opam monorepo lock can be added in a later PR.

@RyanGibb
Copy link
Contributor Author

Great :-)

@balsoft
Copy link
Collaborator

balsoft commented Sep 22, 2022

Could you clean up the commit history then?

@RyanGibb
Copy link
Contributor Author

Done!

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants