Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buildPackages should not know about the top-level targetPlatform #35543

Closed
shlevy opened this issue Feb 25, 2018 · 35 comments
Closed

buildPackages should not know about the top-level targetPlatform #35543

shlevy opened this issue Feb 25, 2018 · 35 comments
Assignees
Labels
0.kind: bug Something is broken 6.topic: cross-compilation Building packages on a different platform than they will be used on 6.topic: developer experience

Comments

@shlevy
Copy link
Member

shlevy commented Feb 25, 2018

Currently when cross-compiling, the buildPackages set sees the top-level targetPlatform and thus cross-compiling-aware toolchain components are built to target targetPlatform instead of buildPlatform. To get access to a buildPlatform-targeted version of the tool, you need to access buildPackages.buildPackages or, in the special case of cc, we have a hack to make buildPackages.stdenv.cc do the right thing.

Also, anything that checks targetPlatform != buildPlatform (which is now aliased to stdenv.isCross) will get the wrong answer inside buildPackages.

Semantically, it seems like buildPackages should be exactly the same package set you'd get if you weren't cross-compiling at all. Ideally we'd live in a world where all tools were like clang and didn't need specific advance target awareness, but for tools that do isn't this exactly what targetPackages is for? That is, shouldn't I take my build-native compiler from buildPackages, my cross-compiler from targetPackages, and my cross-compiled target-native compiler from hostPackages?

As an aside, those names are really unclear even though they are conventional. I'd expect targetPackages to be "packages that run on the target" and something like crossPackages to be the source of cross-compiling tools.

@shlevy shlevy added 0.kind: bug Something is broken 6.topic: cross-compilation Building packages on a different platform than they will be used on 6.topic: developer experience labels Feb 25, 2018
@dtzWill
Copy link
Member

dtzWill commented Feb 25, 2018

Erm, generally the check for "is cross-compiling" is `hostPlatform != buildPlatform'.

The Nixpkgs manual helps explain these, although even so they're a bit hard to follow 😇.

However, from the manual, nativeBuildInputs:

A list of dependencies whose host platform is the new derivation's build platform, and target platform is the new derivation's host platform.

So, for building perl that runs on RISC-V (host=riscv, build=x86_64), nativeBuildInputs will give you packages that are:
host == (new derivation's build) == x86_64 -- that is, it will run on the current builder
target == (new derivation's host) == RISC-V -- that is, the compiler will generate code for RISC-V.

So the fact that nativeBuildInputs = [ ...compiler-package ]; gives you a cross-compiler is not surprising.

What is surprising is that depsBuildBuild = [ ... ]; doesn't work., since according to docs depsBuildBuild is:

A list of dependencies whose host and target platforms are the new derivation's build platform. This means a -1 host and -1 target offset from the new derivation's platforms. They are programs/libraries used at build time that furthermore produce programs/libraries also used at build time. If the dependency doesn't care about the target platform (i.e. isn't a compiler or similar tool), put it in nativeBuildInputs instead.

Actually, would depsBuildBuild = [ gcc6 ] work? I think that's the intended usage?

Anyway I think this is clearly a source of confusion and we should try to sort this out. Hopefully @Ericson2314 will be by soon :).

@Ericson2314
Copy link
Member

Hi @shlevy. @dtzWill has it it right. Cross compilation adds not one but two extra stages: https://github.com/NixOS/nixpkgs/blob/master/pkgs/stdenv/cross/default.nix, giving us in total

  1. Vanilla bootstrap stages: build=native, host=native, target=native
  2. Vanilla final stage: build=native, host=native, target=native
  3. Cross tools stage: build=native, host=native, target=foreign
  4. Cross packages stage: build=native, host=target, target=target

You've a few times told me that I'm too trepid about extra rebuilds, but maybe this a good reason why :). By making as few packages as possible know about the target platform, I make stage 3 very small. (Vanilla final stage, crucially, is its own buildPackages, which allows the cross tools stage target-platform-agnostic packages to be identical since `f (fix f) = f``.)


What is surprising is that depsBuildBuild = [ ... ]; doesn't work.

Actually, would depsBuildBuild = [ gcc6 ] work? I think that's the intended usage?

Yeah it should work; if it doesn't that's a bug. One needs the buildPackages in depsBuildBuild = [ buildPackages.stdenv.cc ] only because stdenv is a one a few attributes on a blacklist not to be spliced.

@dtzWill
Copy link
Member

dtzWill commented Feb 26, 2018

In my testing depsBuildBuild = [ gcc6 ] did fix the perl problem, FWIW.

(before the patch for -fwrapv was tracked down)

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

@shlevy

in the special case of cc, we have a hack to make buildPackages.stdenv.cc do the right thing.

Elaborating on what I said above, this is actually not a hack. stdenv is a very weird because all the packages in it (except for stdenv.libc) are actually drawn from the previous stage. Pretending we only have gcc:

  1. stdenv.cc == buildPackages.cc
  2. buildPackages.stdenv.cc == buildPackages.buildPackages.cc

@Ericson2314
Copy link
Member

@dtzWill sounds like it is working, then?

@shlevy
Copy link
Member Author

shlevy commented Feb 26, 2018

OK, this makes a lot more sense now. I've also been reading from the 17.09 manual, not the unstable manual, which made things even more confusing 😀

So yes, things are all working as expected and this is fine. A few things that I think might be better than the status quo:

  1. We need better names for these stages, despite their ubiquity. Especially targetPackages not actually being "packages which target your target" is very confusing.
  2. depsBuildBuild is a really confusing name.
  3. If we're going to stick with the splicing approach (and I'm not convinced we shouldn't just require specifying the source package set more explicitly somehow), nativeBuildInputs should mean "target-unaware" stuff and something new, like crossBuildInputs, should be used if you need to explicitly include a cross-compiler whose host is the build-system.

Proposal: buildSystemInfo, systemInfo, and crossSystemInfo corresponding to buildPackages, pkgs, and crossPackages which are selected from nativeBuildInputs, buildInputs, and crossBuildInputs. Ideally I'd like to make the latter set buildInputs, inputs, and crossBuildInputs.

@Ericson2314
Copy link
Member

Mmmm there is some symmetry / other logic in the existing names I want to lay out first before thinking about improvements:

  1. depsBuildBuild is because there would be all combinations of deps{Build,Host,Target}{Build,Host,Target} satisfying the restriction that the second build/host/target is less than or equal the first, given the order Build < Host, Target. That gives us:
    • depsBuildBuild
    • depsBuildHost
    • depsBuildTarget
    • depsHostHost
    • depsHostTarget
    • depsTargetTarget
      Its unfortunate, but unavoidable that there really is two axes here.
  2. As per the manual, I didn't want to break back compat, so buildInputs == depsHostTarget and nativeBuildInputs == depsBuildHost. Only the legacy names are supports cause I need only one name in bash and it would be weird to rename.

@Ericson2314
Copy link
Member

Especially targetPackages not actually being "packages which target your target" is very confusing.

It's "packages that run on your target. buildPackages is packages that run on your build.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

If we're going to stick with the splicing approach (and I'm not convinced we shouldn't just require specifying the source package set more explicitly somehow)

Agreed, I hate splicing. I just didn't know an easier way to be backwards compatible.

But also, to be clear, we still need to distinguish the inputs for reasons other than splicing, namely env hooks and similar deps-processing-other-deps stuff. The only good thing about splicing besides back compat is it makes it sorta avoids using the wrong package set for the wrong dep since its (lamely) automatic.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

nativeBuildInputs should mean "target-unaware" stuff and something new

Ah so this would be like depsBuildWhatever, in my naming scheme? I thought about that. Indeed the vast majority of packages don't care about the second axis. But I didn't want to make even more deps. and target = host + 1 (of which depsBuildHost and depsHostTarget are examples) is the sanest default if one has to be chosen.

@Ericson2314
Copy link
Member

something new, like crossBuildInputs, should be used if you need to explicitly include a cross-compiler whose host is the build-system.

To be clear, depsBuildBuild depsBuildHost and depsBuildTarget all have the compiler's host as the current package's build. They differ in what the compiler's target is (the second axis).

@shlevy
Copy link
Member Author

shlevy commented Feb 26, 2018

  • depsBuildTarget
  • depsHostHost
  • depsTargetTarget

These seem like pathological cases to me. Can you show real examples? I'm fine with keeping support for them under the hood just in case, but it seems odd to change the normal interface based on their theoretical possibility.

So, in my nomenclature, depsBuildBuild would be nativeBuildInputs, depsBuildHost would crossBuildInputs, and depsHostTarget would be buildInputs (or, ideally, buildInputs, crossInputs, and inputs).

As per the manual, I didn't want to break back compat, so buildInputs == depsHostTarget and nativeBuildInputs == depsBuildHost.

I disagree that the old semantics was nativeBuildInputs == depsBuildHost. In most cases, there's no difference from depsBuildBuild. So what cases are left? If a tool needs a native version of itself, like guile, you want depsBuildBuild so you don't rebuild twice. If a tool needs a cross-compiler, that's almost always wrapped up in a mkDerivation variant. If some end-developer is asking for a compiler in nativeBuildInputs, they almost certainly want a build-system native one. Is there anywhere where nativeBuildInputs makes more sense as depsBuildHost?

It's "packages that run on your target."

Yes, but I still maintain the terminology is confusing. When you say "your target" in this context, I think "the place this build will run", not "the place the things this build will build will run".

I just didn't know an easier way to be backwards compatible.

I'm talking about working toward the future now. I think the system you've built so far has validated itself beyond imagination, you've literally revolutionized cross-compilation IMO. Now let's make a transition path and clean up.

Ah so this would be like depsBuildWhatever, in my naming scheme?

No, this seems more complicated than it needs to be. Someday in the future I'd like Nix to be able to do things like "satisfy this input with either a native or cross-compiled variant of this package, whichever is easier for you", and in that hypothetical world this might make sense, but I agree on choosing a default.

target = host + 1 (of which depsBuildHost and depsHostTarget are examples) is the sanest default if one has to be choosen.

I just disagree that this is the right one.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

Can you show real examples?

  • depsHostHost: it's pathalogical on its own, haha. But it can be induced through the way propagated dependencies work. I could complicate that algorithm, but it is already pretty crazy already.

  • depsBuildTarget depsTargetTarget: compilers with build systems we don't like :). depsTargetTarget is used by cc-wrapper and GHC propagated as depsTargetTargetPropagated. cc-wrapper uses it for libc, since libc is morally a buildInput downstream. GHC uses it for gmp; likewise, downstream->base->integer-gmp->gmp is morally a buildInputs chain.

Btw I used to __ prefix these to scare people a way from them hehe, but then Eelco wanted to just use that for actual Nix builtins which makes sense.

@Ericson2314
Copy link
Member

Is there anywhere where nativeBuildInputs makes more sense as depsBuildHost?

stdenv.cc works by making the C compiler a defaultNativeBuildInput. That C compiler runs on the build platform, and targets the host platform. Whenever we use a tool the normal way, as a nativeBuildInput, it's targeting our host platform.

@Ericson2314
Copy link
Member

If a tool needs a native version of itself, like guile, you want depsBuildBuild so you don't rebuild twice.

Err I dunno about guile, but to build a build != host == target GCC or GHC, you absolutely do need to build twice. You need the depsBuildBuild GCC for built-on-the-fly build tools, the nativeBuildInputs/stdenv.cc GCC for building the compiler and building the runtime. With a build != host != target GCC you'd need three different GCCs cause those pesky runtime libs.

@Ericson2314
Copy link
Member

Yes, but I still maintain the terminology is confusing. When you say "your target" in this context, I think "the place this build will run", not "the place the things this build will build will run".

I agree the autoconf names suck. I kept them because the the concept of a "target platform" also sucks, and I wanted to use just one name for this lousy concept. I guess what I'm getting at is lets figure out if the naming scheme you want is isomorphic to what we have and then bikeshed the exact identifiers :).

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

I'm talking about working toward the future now. I think the system you've built so far has validated itself beyond imagination, you've literally revolutionized cross-compilation IMO. Now let's make a transition path and clean up.

Thanks :). May the revolution continue then! The plan I originally had was change GCC, GHC, and other libraries so

  1. Shorter term: we never build run time libraries and the compiler itself in the same derivation
  2. Longer term: All targets are simultaneously supported.
  3. Drop targetPlatform.
  4. Use whatever terminology we want as we actually aren't dealing with the autoconf concepts anymore.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

Btw, depsHostHost would be used by a multi-target GHC for a hard-coded dependency on a binutils or GCC it would use at run-time for compiling TH or compiler plugins it would then itself use. So I hope someday its not contrived (But then again not whether or not its a good idea to hard-code things like that.)

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 26, 2018

I'll grant you since I only use depsTargetTarget for the library dependencies of language runtimes, the second axis is always irrelevant. I'll also granted you the vast majority of buildInputs are also libraries and so the second axis there is also irrelevant.

@shlevy
Copy link
Member Author

shlevy commented Feb 26, 2018

But it can be induced through the way propagated dependencies work.
depsTargetTarget is used by cc-wrapper and GHC propagated as depsTargetTargetPropagated. cc-wrapper uses it for libc, since libc is morally a buildInput downstream. GHC uses it for gmp; likewise, downstream->base->integer-gmp->gmp is morally a buildInputs chain.

IMO these are all "internal" cases that justify knowing more about the system. They don't really seem like things most people making their packages cross-compilable should think about. But yes, these are real, and these systems are horrifying 😀

stdenv.cc works by making the C compiler a defaultNativeBuildInput. That C compiler runs on the build platform, and targets the host platform. Whenever we use a tool the normal way, as a nativeBuildInput, it's targeting our host platform.

The stdenv.cc case is an implementation detail, it could just as well work by making it a defaultCrossBuildInput. I'm not sure what you mean by "use a tool the normal way".

Err I dunno about guile, but to build a build != host == target GCC or GHC, you absolutely do need to build twice.

Sorry, I was treating my baseline as the normal non-cross set. What I meant is, guile having itself as a nativeBuildInput means building guile three times: buildPackages.buildPackages.guile, buildPackages.guile, and finally the guile you want. Similarly for all the commits I've made lately adding buildPackages.buildPackages.foo to nativeBuildInputs (since I didn't realize that depsBuildBuildactually does work).

Putting it another way: As a maintainer of packages I want to cross-build, empirically speaking the vast majority of the time I want to think about depsBuildBuild and depsHostTarget. The few times I want to think about depsBuildHost, it's something that should go in a setup hook or a mkDerivation variant, not normal packages.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 27, 2018

Ok so I talked to @shlevy a bit on IRC, and here is me-written and perhaps jointly-editted (woo super perms!) summary:

  1. The hats we're wearing for this discussion are a nice division of labor where I am coming from overhauling all this core infra, and @shlevy is thinking about the ease of maintaining individual packages. We both agree its important the core infra handle the common and obscure cases alike, because all the language-specific infra on top shouldn't be reinventing this stuff. We also agree the common case mkDerivation or whatever should be easy and intuitive. If people, exposed to the full power, use the wrong type of dep, then we loose the composability that handling the obscure case so deeply was supposed to give us!

  2. We basically agree on what the core infra should support, e.g. the use of these two axes, as annoying as it is, to support compilers. So while we can still bikeshed the deps* naming scheme, something that is isomorphic would be the result.

  3. We don't yet agree on what the default handling of nativeBuildInputs should be. The "primary compiler" of a derivation (e.g. GCC for C package, GHC for Haskell package, etc) is build->host (depsBuildHost) in the current naming scheme. @shlevy feels strongly that additional compilers tend to be used for on-the-fly built, used, and discarded build tools; that would be depsBuildBuild. I am on the fence.

    This boils down however into thinking which is more important, the more often syntactically mentioned compiler (the secondary one, since some language glue code typically handles the primary one) or the more often excised code path, which is obviously the primary compiler. Naturally, I, with my infra hat on, am sympathetic to the latter, while @shlevy, wearing an individual package maintainer hat, is thinking about the former.

  4. The easily agreed upon point with that is that the core stdenv infra should probably not be using legacy names (like nativeBuildInputs). The fact that there is no depsBuildHost but the name must be interpolated makes teaching and reading the infra harder. Conversely, by not using legacy names, we can take the core infra, and me (!), out of this debate, letting the package maintains decide what existing uses of nativeBuildInputs best map to.

    The easiest way to do this is make like a stdenv.mkDerivationExplicit (to be bike-shed, of course), and then desugar buildInputs and nativeBuildInputs completely away in stdenv.mkDerivation. Now there are a number of bash uses of the legacy names that would need to be changed, but now that the infra has landed, I don't think that will be so bad to debug.

  5. The remaining bike-sheds are

  6. What is the right meaning of the now-just-sugar nativeBuildInputs, as discussed.

  7. Is there a better (probably-isomorphic) naming scheme for my internals than depsBuildBuild.

  8. What's a better name than mkDerivationExplicit.

  9. On a separate note, I am better appreciating @shlevy's point about the autoconf terminology needing to be replaced. As I wrote above, I knew it sucked, but thought perhaps sticking with existing names was still better than new ones. But I see everyone messing up the names, even people who clearly know what's going on by the 100s (!!!) of packages of they've fixed. And the names we could use, credit @eternaleye, are just so damn slick: "build", "run", "emit". Wow, so clear; autoconf you don't stand a chance.

@shlevy
Copy link
Member Author

shlevy commented Feb 27, 2018

Thanks for this summary! I think there are two action items here:

  1. Split out the core, language-agnostic, full cross-compilation-details aware bits of mkDerivation into as clean an interface as possible while still supporting mapping from legacy mkDerivation calls (it's OK if corner cases e.g. cc-wrapper need to be rewritten). This wouldn't necessarily need be limited to better cross support, if e.g. propagation needs to be fixed somehow or whatever this is a good time to do it.
  2. Design buildCPackage, buildHaskellPackage, etc. to be built on top of the new core design but with the per-package cross-compilation concerns more easily visible (and the handling of the cross-compiler itself abstracted away).

Thoughts? Do you have a clear vision for 1? I can take a look at 2.

@shlevy
Copy link
Member Author

shlevy commented Feb 27, 2018

I definitely think we should use the build/run/emit terms in the new infra, by the way.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 27, 2018

That sounds good. I have a clear vision for 1, and some WIP work for Haskell you may be interested. I'd also like to brainstorm about how to do ccWithPackages etc in conjunction with 2.

@shlevy
Copy link
Member Author

shlevy commented Feb 27, 2018

@Ericson2314 Can you outline your vision somewhere so we can start looking at 2 in parallel?

@Ericson2314
Copy link
Member

For 1:

  1. Make mkDerivation a wrapper that desugars the legacy input names as proposed above.

  2. Move away from autoconf terminology (isomorphic renaming).

  3. Make env hooks more coarse-grained, I see no use for the second axis for them.

  4. Revert 469fd89#diff-7a1764865f30c6cb4c22c2654ebe9a38, ripping off the bandaid that makes packages with miscategorized dependencies work for native.

  5. By default only allow references (immediate, not filtering to depsHost* and depsTarget*. I.e. nativeBuildInputs and friends will not be allowed to be referenced, unless those derivations are also listed as those other types of deps. This might require Nix modifications for per-output whitelists to accommodate dev outputs. (@vcunat clued me into that problem.)

@shlevy
Copy link
Member Author

shlevy commented Feb 28, 2018

@Ericson2314 Why do we need 4 if we're going to a new interface?

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 28, 2018

@shlevy Internally I've largely done the long-term implementation. 4 is sort of an anti-revert; I first rewrote stdenv/setup how it ought to be, more or less, and then added in some native-only hacks to avoid widespread breakage. 4 removes the stop gap hacks, bringing us closer to the ideal design.

Also important to know is the type of hacks in that commit I did can't readily be done with a legacy desugaring on top. If we rip the bandaid off I think we also need to fix a bunch of packages.

@shlevy
Copy link
Member Author

shlevy commented Feb 28, 2018

@Ericson2314 Can't the legacy wrapper just have a phase where it sets PATH etc.?

@Ericson2314
Copy link
Member

@shlevy Sure (though nix-shell would annoy people). But deciding what deps go on the PATH seems like an orthogonal question to when the PATH is built?

@shlevy
Copy link
Member Author

shlevy commented Feb 28, 2018

Right, but I mean the legacy wrapper can add extra things to PATH on top of what the core thing does. So we don't need to break things on the old setup.

@Ericson2314
Copy link
Member

@shlevy Well as I said above, I am a bit skeptical of being able to desugar. The PATH part is easy, for sure. The env hooks though.... I dunno. Not always sure if env hooks are effectively ideompotent if we end up running them on deps twice.

@shlevy
Copy link
Member Author

shlevy commented Feb 28, 2018

Ah, OK.

@Ericson2314
Copy link
Member

I've only seen the breakage in conjunction with other WIP cross stuff. Maybe if we set up a hydra job with just the rebuild it would be less bad than I fear. At the very least, it would be interesting to open a hydra job and then reevaluate the situation.

@shlevy
Copy link
Member Author

shlevy commented Mar 2, 2018

Broke out action items into #36217 #36218 and #36219

@shlevy shlevy closed this as completed Mar 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: bug Something is broken 6.topic: cross-compilation Building packages on a different platform than they will be used on 6.topic: developer experience
Projects
None yet
Development

No branches or pull requests

3 participants