Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nix: make xcrun visible in Nix sandbox for precompiling Metal shaders #6118

Merged
merged 3 commits into from
Mar 26, 2024

Conversation

josephst
Copy link
Contributor

Fixes #6117

Adds /usr/bin/ to $PATH prior to running Cmake when building with Nix. No change to building normally/ without Nix.

Comment on lines 161 to 163
preConfigure = lib.optionals useMetalKit ''
export PATH=/usr/bin:$PATH # make sure /usr/bin/xcrun is on PATH
'';
Copy link
Collaborator

@SomeoneSerge SomeoneSerge Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not particularly familiar with Darwin and I don't know what xcrun is. Could you attach a brief comment expounding this and explaining why it's needed for the configurePhase? What else is exposed in /usr/bin with the relaxed sandbox? Why do you give /usr/bin a higher priority than to the nix-provided PATH? If xcrun is the only thing you need from /usr/bin you could instead make a wrapper for it:

nativeBuildInputs = optionals hostPlatform.isDarwin [
  (writeShellScriptBin "xcrun" "/usr/bin/xcrun $@")
];

or better yet (not sure if that works since /usr/bin/xcrun` is not a store path, but with the relaxed sandbox - probably)

runCommand "xcrun" { nativeBuildInputs = [ makeWrapper ]; } ''
  makeWrapper "/usr/bin/xcrun" "$out" --suffix PATH : /usr/bin
''

A more general darwin question: can we explicitly request the relaxed sandbox using some kind of requiredSystemFeatures?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be necessary to taint the build with /usr/bin/xcrun if all we need is xcrun -- it can be found in the xcbuild and xcodebuild packages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@swdunlop not having any progress with xcbuild package -- I wonder if it's too old to know about the metal shader compiler. Looks like its Github project has been archived since mid-2021.

When attempting to run nix run nixpkgs#xcbuild and nix run nixpkgs#xcodebuild, get the following error:

warning: non-build action metal not implemented
error: unknown build action 'metal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SomeoneSerge xcrun is a MacOS/Xcode tool that finds other command line tools--for example, finding the metal shader compiler. I think it's used instead of a hard-coded path because different installations of Xcode can coexist, so xcrun will run whichever tool has been made the default by xcode-select.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man page for xcrun and Xcode-select here

@SomeoneSerge SomeoneSerge added the nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment label Mar 18, 2024
@SomeoneSerge SomeoneSerge changed the title add /usr/bin to path for building MetalKit with xcrun on MacOS nix: add /usr/bin to PATH for building MetalKit with xcrun on MacOS Mar 18, 2024
@josephst
Copy link
Contributor Author

Updated, based off macvim.nix from Nixpkgs, which also requires calling xcrun and cannot be built within a sandbox. It looks like setting __noChroot = true is the usual way to note that a package cannot be build within a sandboxed environment.

Now, /usr/bin/xcrun is symlinked and provided as a nativeBuildInput when building on Darwin system. No other changes to $PATH are made.

@@ -157,6 +163,8 @@ effectiveStdenv.mkDerivation (
substituteInPlace ./*.py --replace "/usr/bin/env python" "${llama-python}/bin/python"
'';

__noChroot = useMetalKit && effectiveStdenv.isDarwin;
Copy link
Collaborator

@SomeoneSerge SomeoneSerge Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an undocumented feature used in Nixpkgs, called __impureHostDeps which lets you demand specific paths from the host. That would be more fine-grained than disabling the sandbox. ImpureHostDeps also handles the missing paths gracefully

That said, Nixpkgs seems to expose its own xcrun as a passthru at xcbuild.xcrun: https://github.com/NixOS/nixpkgs/blob/2d5db19dffbb7fd86c8eba10a77deb8b8838ece3/pkgs/development/tools/xcbuild/wrapper.nix#L118. Are you saying it's broken? If it is we should report it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that xcrun shim will work here. Looking at source and testing on my system, the xcrun shim from xcbuild expects whatever tool it is calling to be on $PATH so that the tool can be found with exec "$@". This works for some binaries in /usr/bin, such as cc and clang (which are Apple-provided shims for the full binaries in /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/.

However, the metal compiler (and the directory it's in, /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/metal) are not added to $PATH because there may be multiple installs of Xcode on a system. xcrun handles running the metal binary from whichever Xcode installation is the default or provided via an environment variable.

While investigating, it looks like the xcbuild project from Facebook includes a more robust implementation of xcrun which can detect the install location of Xcode and run binaries from this location. From a brief glance at the code for that implementation, it looks through environment variables, the user's $HOME dir, and the standard install locations.

That said, I don't think this ultimately solves the problem of needing an Xcode-specific binary to build llama.cpp on MacOS. Even with a shim for xcrun, the metal shader compiler from Xcode is still necessary. And if we're going to be calling Apple-provided binaries, I don't see any reason to avoid calling the normal xcrun, particularly as the shell implementation only works for binaries already on $PATH, and the xcbuild implementation from Facebook is archived/ no longer maintained.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josephst did you try the shim? Xcrun here is only used for the build, inside the sandbox, and it'll use the toolchain provisioned by nix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even when using the xcrun binary provided by Xcode/Apple, the rest of the build is using the toolchain provided by Nix and the final ./result/bin/llama binary confirms that it is built with clang 16.0.6, provided by Nix toolchain (rather than clang 15.0.0, which is the system default installed with Xcode).

The problem I'm having with the xcrun shim is that it cannot locate the metal shader compiler. metal is unfortunately proprietary so there's no nix equivalent (unlike ie clang; xcrun clang will look for clang on $PATH and find it as part of the nix toolchain)

To illustrate the problem:

With the "full" xcrun provided by MacOS/Xcode:

$ /usr/bin/xcrun -sdk macosx metal --version
Apple metal version 32023.155 (metalfe-32023.155)
Target: air64-apple-darwin23.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/metal/macos/bin

With the shim xcrun:

$ nix run nixpkgs#xcbuild.xcrun -- -sdk macosx metal --version
/nix/store/d0hnw5gy6mji9hqxcs564xx6zvba5m4s-xcrun/bin/xcrun: line 42: exec: metal: not found

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just realized we haven't got an aarch64-darwin builder on github.

Could you please share the build logs? If your xcode/metal is from the host system and you had the sandbox enabled, how can the build access it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm looks like I didn't have the sandbox properly enabled earlier. Just set sandbox = true and getting some new build errors. Lots of output, so added both options (__impureHostDeps vs noChroot) to a Gist.

As I work on this more, I'm starting to think that __noChroot = true is the better option. Trying to add /usr/bin/xcrun to __impureHostDeps with sandbox=true generates several new errors about how /usr/bin/xcrun isn't one of the allowedImpureHostPrefixes.

With sandbox = true in my nix.conf, the only way to get this to build is to go back to using __noChroot = effectiveStdenv.isDarwin && useMetalKit;, then building with nix build .# --rebuild --option sandbox relaxed. Sandboxing on macOS with Nix defaults to false so builds should work on macOS with sandbox = off or sandbox = "relaxed".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for the research! The last bit I fail to understand is what has changed that it's now required to use the metal compiler from xcode: the aarch64-darwin always linked MetalKit and it worked at the time of #4605 (comment)

CC @philiptaron

Copy link
Contributor Author

@josephst josephst Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like previously the metal compiler was only called if LLAMA_METAL_SHADER_DEBUG was true.

Now, it's always called and LLAMA_METAL_SHADER_DEBUG affects which flags it's called with. Looks like the change was about a week ago

@josephst
Copy link
Contributor Author

Just pushed a new version of this PR - still using Apple-provided xcrun, with /usr/bin/xcrun added to __impureHostDeps when creating the symlink. It's building and running properly on my system, including with sandboxing on (nix build .# --option sandbox=true)

@josephst josephst force-pushed the nix-darwin-xcrun branch 2 times, most recently from d119d47 to 2aaea82 Compare March 19, 2024 23:05
@philiptaron philiptaron self-requested a review March 19, 2024 23:18
@@ -87,6 +88,11 @@ let
]
);

darwinSymlinks = runCommand "darwin-build-symlinks" {} ''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to xcrunHost/hostsXcrun. I also think that makeWrapper was a better solution but it's ok if you keep this as is

@josephst
Copy link
Contributor Author

josephst commented Mar 21, 2024

Updated to incorporate latest changes. Thanks for the review, @SomeoneSerge!

Note that an alternative to all this is setting Cmake flag to -DLLAMA_METAL_EMBED_LIBRARY=ON, which will avoid compiling the shader. It looks like instead, the shader is converted to a string and embedded in the binary; when the program is run, the shader source code is loaded from this string and compiled using newLibraryWithSource. My first impression is that compiling the shader each time the app is run is going to affect performance somewhat, although I haven't benchmarked to see how drastic the change is. Tried a quick Google search to see if there's any comparison between newLibraryWithURL (using the already-compiled shader) and newLibraryWithSource (using a pre-compiled shader) but didn't see any comparisons.

@philiptaron
Copy link
Collaborator

I'll be able to run this either tonight (PDT) or tomorrow. Thanks for your patience.

@SomeoneSerge
Copy link
Collaborator

It looks like instead, the shader is converted to a string and embedded in the binary

This sounds like opengl/opencl too. Do you know if apple does any caching for kernels built on the fly?

I'd suggest we provide both options. E.g. make a precompileMetalShaders ? false flag for callPackage, disable chroot when true, otherwise set DLLAMA_METAL_EMBED_LIBRARY? I'd recommend making false the default because it's less likely to break. We can expose both as flake outputs.

@josephst
Copy link
Contributor Author

@philiptaron actually, hold off on testing. It's building but as I run it I'm getting some unusual issues related to CMake. I think the final CMake install steps aren't copying default.metallib (that was just compiled) to the $out/bin directory. Need to investigate further.

@josephst
Copy link
Contributor Author

Should be ready for testing now - will work on adding additional commit with option for embedding (and avoiding Xcode/xcrun dependency) with precompileMetalShaders ? false or something similar, but now builds on my system with sandbox = relaxed and runs well with Mistral 0.2 7B.

This may also address #6020.

@swdunlop
Copy link

The following worked perfectly for me on MacOS Sonoma 14.3.1 / M2:

llama.cpp on  nix-darwin-xcrun via C v15.0.0-clang via △ via 🐍 v3.9.6 via 🐦 v5.9.2 via ↯ took 1m11s
$ nix run .#llama-server -- -m nomic-embed-text.gguf --embeddings

Thank you for chasing this the whole way through.

@josephst
Copy link
Contributor Author

@philiptaron ready for review - ultimately broke this up into 3 commits (one for fixing xcrun, one for moving default.metallib into the bin/ dir so that it can be loaded, one for enable/disable precompilation of the .metal files).

@philiptaron
Copy link
Collaborator

I intend to merge once CI finishes with success.

@philiptaron
Copy link
Collaborator

Sorry, I merged something else and now we have some conflicts.

@@ -216,7 +235,10 @@ effectiveStdenv.mkDerivation (
# Should likely use `rocmPackages.clr.gpuTargets`.
"-DAMDGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx940;gfx941;gfx942;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102"
]
++ optionals useMetalKit [ (lib.cmakeFeature "CMAKE_C_FLAGS" "-D__ARM_FEATURE_DOTPROD=1") ]
++ optionals useMetalKit [
(lib.cmakeFeature "CMAKE_C_FLAGS" "-D__ARM_FEATURE_DOTPROD=1")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to mention this before and now that there's a merge conflict I have a reason to: multiple -DCMAKE_C_FLAGS options aren't going to compose. I think we should move those to NIX_CFLAGS_COMPILE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other uses of CMAKE_C_FLAGS currently? Also, could you elaborate on moving this option into NIX_CFLAGS_COMPILE?

Right now, thinking of something like

let
  cmakeCFlags = lib.optionalString useMetalKit [ "-D__ARM_FEATURE_DOTPROD=1" ];
# ...
in {
# ...

cmakeFlags = # ...rest of section
  ++ lib.cmakeFeature "CMAKE_C_FLAGS" cmakeCFlags;
# ...
}

Which would allow other backends to configure CMAKE_C_FLAGS via appending to cmakeCFlags

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIX_CFLAGS_COMPILE is an environment variable interpreted by nixpkgs' cc-wrapper. You can just pass it to mkDerivation: NIX_CFLAGS_COMPILE = [ "-DA=1" "-DB=2" "-D__ARM_FEATURE_DOTPROD=1" ]

Are there any other uses of CMAKE_C_FLAGS currently?

I thought that was the merge conflict?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was confused, it must've been just a change in formatting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, good to know about that variable. Merge conflict was from the following line (blas support) being deleted in master, likely a merge conflict line since I split the preceding line onto two lines

is usable during build (used for compiling Metal shaders)

Fixes ggerganov#6117
When metal files are compiled to default.metallib, Cmake needs to add this to the install directory so that it's visible to llama-cpp

Also, update package.nix to use absolute path for default.metallib (it's not finding the bundle)
…ompilation of metal shader

Precompilation requires Xcode to be installed and requires disable sandbox on nix-darwin
@josephst
Copy link
Contributor Author

Just rebased onto master so should be able to be merged

@josephst josephst changed the title nix: add /usr/bin to PATH for building MetalKit with xcrun on MacOS nix: make xcrun visible in Nix sandbox for precompiling Metal shaders Mar 25, 2024
@philiptaron
Copy link
Collaborator

@SomeoneSerge you want to click the button this time? I think I don't need to wait for CI to pass given I saw it pass prior to the merge conflict.

@philiptaron philiptaron merged commit e190f1f into ggerganov:master Mar 26, 2024
55 of 57 checks passed
josephst added a commit to josephst/nixpkgs that referenced this pull request Mar 26, 2024
port of ggerganov/llama.cpp#6118, although compiling shaders with XCode disabled as it requires disabling sandbox (and only works on MacOS anyways)
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
…rs (ggerganov#6118)

* Symlink to /usr/bin/xcrun so that `xcrun` binary
is usable during build (used for compiling Metal shaders)

Fixes ggerganov#6117

* cmake - copy default.metallib to install directory

When metal files are compiled to default.metallib, Cmake needs to add this to the install directory so that it's visible to llama-cpp

Also, update package.nix to use absolute path for default.metallib (it's not finding the bundle)

* add `precompileMetalShaders` flag (defaults to false) to disable precompilation of metal shader

Precompilation requires Xcode to be installed and requires disable sandbox on nix-darwin
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 3, 2024
…rs (ggerganov#6118)

* Symlink to /usr/bin/xcrun so that `xcrun` binary
is usable during build (used for compiling Metal shaders)

Fixes ggerganov#6117

* cmake - copy default.metallib to install directory

When metal files are compiled to default.metallib, Cmake needs to add this to the install directory so that it's visible to llama-cpp

Also, update package.nix to use absolute path for default.metallib (it's not finding the bundle)

* add `precompileMetalShaders` flag (defaults to false) to disable precompilation of metal shader

Precompilation requires Xcode to be installed and requires disable sandbox on nix-darwin
tybalex pushed a commit to rubra-ai/tools.cpp that referenced this pull request Apr 17, 2024
…rs (ggerganov#6118)

* Symlink to /usr/bin/xcrun so that `xcrun` binary
is usable during build (used for compiling Metal shaders)

Fixes ggerganov#6117

* cmake - copy default.metallib to install directory

When metal files are compiled to default.metallib, Cmake needs to add this to the install directory so that it's visible to llama-cpp

Also, update package.nix to use absolute path for default.metallib (it's not finding the bundle)

* add `precompileMetalShaders` flag (defaults to false) to disable precompilation of metal shader

Precompilation requires Xcode to be installed and requires disable sandbox on nix-darwin
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot find xcrun when building with Nix flakes on Darwin/MacOS
4 participants