
Nix flake #40

Merged
merged 8 commits on Mar 17, 2023

Conversation

niklaskorz
Contributor

@niklaskorz commented Mar 12, 2023

This pull request adds a simple Nix Flake for building and distributing the binaries of this repository in a combined package.

The main binary can be executed as follows (assuming experimental-features = nix-command flakes is enabled):

$ nix shell github:niklaskorz/llama.cpp/nix -c llama --help
usage: llama-cpp [options]

options:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -n N, --n_predict N   number of tokens to predict (default: 128)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.8)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: models/llama-7B/ggml-model.bin)

When merged into the main branch of this repository, nix shell github:ggerganov/llama.cpp would be used instead.

The binary names in the Nix package are:

  • main => llama
  • quantize => quantize
  • convert-pth-to-ggml.py => convert-pth-to-ggml

Note that convert-pth-to-ggml directly executes in a Python 3.10 environment with the required Python packages.

@Dietr1ch

Can you add the .envrc to load the flake right away?

A single-line .envrc should do:

use_flake

@Dietr1ch

Do you plan to add support for the C++ part?

BTW, I ended up using this trivial shell.nix because I haven't learnt how to use flakes yet:

{ pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {} }:

pkgs.mkShell {
  buildInputs = with pkgs; [
    gnumake
    clang

    black
    python310
    python310Packages.torch-bin
    python310Packages.numpy
    python310Packages.sentencepiece
  ];
}

The differences so far are GNU Make, Clang, and black.

@niklaskorz
Contributor Author

niklaskorz commented Mar 13, 2023

> Do you plan to add support for the C++ part?
>
> BTW, I ended up using this trivial shell.nix because I haven't learnt how to use flakes yet,
>
> { pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {} }:
>
> pkgs.mkShell {
>   buildInputs = with pkgs; [
>     gnumake
>     clang
>
>     black
>     python
>     python310Packages.torch-bin
>     python310Packages.numpy
>     python310Packages.sentencepiece
>   ];
> }
>
> Differences so far are, GNU Make, clang, and black.

No need to include these: the C/C++ compiler and make (and even CMake, I think) are already included automatically by Nix's stdenv. I am hesitant to include black unless @ggerganov wants the Python script to explicitly be black-formatted in the future.

Edit: CMake is not pre-included in stdenv, but GCC/G++ and GNU make are.
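
The stdenv point above can be illustrated with a pared-down shell.nix. This is only a sketch (the nixpkgs pin and the withPackages combination are illustrative assumptions, not a file from this PR): the compiler and GNU make come from stdenv for free, so only the Python toolchain needs listing.

```nix
# Sketch only: mkShell inherits stdenv, which already provides a C/C++
# compiler and GNU make, so neither needs to appear in buildInputs.
{ pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {} }:

pkgs.mkShell {
  buildInputs = [
    # One Python 3.10 environment carrying all required packages.
    (pkgs.python310.withPackages (ps: with ps; [ torch-bin numpy sentencepiece ]))
  ];
}
```

Using withPackages also avoids mixing a bare interpreter with packages from a different python3xxPackages set.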

@niklaskorz
Contributor Author

niklaskorz commented Mar 13, 2023

@Dietr1ch I have added shell.nix and default.nix for backwards compatibility with non-flake Nix, as well as the direnv config and a development shell.

@niklaskorz marked this pull request as ready for review on March 13, 2023, 09:52
@Xe left a comment

NixOS expert here: I have tested this locally and it works perfectly.

@prusnak
Collaborator

prusnak commented Mar 14, 2023

> NixOS expert here: I have tested this locally and it works perfectly.

I don't see flake-compat being used anywhere; why is it imported?

@niklaskorz
Contributor Author

>> NixOS expert here: I have tested this locally and it works perfectly.
>
> I don't see flake-compat being used anywhere, why is it imported?

flake-compat has to be referenced by the flake in order to be included in the lockfile; shell.nix and default.nix then read the lockfile to load the pinned version of flake-compat.
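
For reference, the usual flake-compat shim looks roughly like this; a sketch of the common pattern, not necessarily this PR's exact default.nix. It reads flake.lock to fetch the pinned flake-compat revision, then exposes the flake's outputs to plain Nix:

```nix
# default.nix (sketch): fetch flake-compat at the revision pinned in
# flake.lock, then re-expose this flake's default outputs.
(import
  (let lock = builtins.fromJSON (builtins.readFile ./flake.lock); in
    fetchTarball {
      url = "https://github.com/edolstra/flake-compat/archive/${lock.nodes.flake-compat.locked.rev}.tar.gz";
      sha256 = lock.nodes.flake-compat.locked.narHash;
    })
  { src = ./.; }
).defaultNix
```

A shell.nix would do the same but end in .shellNix instead of .defaultNix.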

Member

@ggerganov left a comment

Is it possible to move these files into a subfolder?
I want the root to be as clean as possible.

@niklaskorz
Contributor Author

> Is it possible to move these files in a subfolder? I want root to be clean as possible

I don't think that is possible: Nix expects flake.nix and flake.lock to be at the root of the repository, and the same goes for direnv with .envrc. Personally, I am fine with removing default.nix and shell.nix if those are too much, as they are only needed for backwards compatibility.

@prusnak
Collaborator

prusnak commented Mar 14, 2023

Yeah, let's drop shell/default.nix. I think we can also drop direnv; it's not necessary for running llama as a Nix flake.

@niklaskorz
Contributor Author

> Yeah, let's drop shell/default.nix. I think we can also drop direnv - it's not necessary for running llama as nix flake.

Done, and I added .envrc to .gitignore, so anyone who wants this can still configure it locally.

@niklaskorz
Contributor Author

niklaskorz commented Mar 14, 2023

Rebased onto the latest master commit; now the build fails in Nix:

ggml.c:1364:25: warning: implicit declaration of function 'vdotq_s32' is invalid in C99 [-Wimplicit-function-declaration]
        int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
                        ^
ggml.c:1364:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
        int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
                  ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1365:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
        int32x4_t p_1 = vdotq_s32(vdupq_n_s32(0), v0_1ls, v1_1ls);
                  ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1367:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
        p_0 = vdotq_s32(p_0, v0_0hs, v1_0hs);
            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1368:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
        p_1 = vdotq_s32(p_1, v0_1hs, v1_1hs);
            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning and 4 errors generated.

Interestingly, this does not happen when building in my normal macOS environment without Nix.

@prusnak
Collaborator

prusnak commented Mar 14, 2023

> Interestingly, this does not happen when building in my normal macOS environment without Nix.

My guess is that this happens because stdenv in nixpkgs for macOS is still stuck at Clang 11.x, while recent macOS ships Clang 14.x.

The fix would be to override the stdenv for macOS to use Clang 14.x.

@niklaskorz
Contributor Author

niklaskorz commented Mar 14, 2023

>> Interestingly, this does not happen when building in my normal macOS environment without Nix.
>
> My guess is because stdenv in nixpkgs for macos is still stuck at Clang 11.x while recent macOS ships Clang 14.x.
>
> The fix would be to override the stdenv for macos to use Clang 14.x.

I tried that; it doesn't change anything. The problem appears to be the macOS SDK, not the compiler. Unfortunately, nixpkgs does not include any SDK newer than macOS 12. I suspect that vdotq_s32 has only been added in macOS 13.

Edit: it's defined in /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/14.0.0/include/arm_neon.h, so I'll have a look at how this arm_neon.h differs from the header included in nixpkgs' llvmPackages_14.

Edit 2: the function is defined both in Nix's standard Clang 11 <arm_neon.h> and in nixpkgs' llvmPackages_14... so now to find out what is preventing it from being included.

Edit 3: Thanks to this reddit thread, I have found that __ARM_FEATURE_DOTPROD is only defined in Xcode 13 and newer, and apparently not in any Clang version shipped by Nix for macOS. Adding -D__ARM_FEATURE_DOTPROD as a compiler flag makes it compile (and link) without any issues. I know that 47857e5 has since made this code path optional, which would also fix this error, but I think we could manually add the define for the flake only (possibly by passing it to CMake?).

@niklaskorz
Contributor Author

The flake package is now built with CMake instead of Make and passes -DCMAKE_C_FLAGS=-D__ARM_FEATURE_DOTPROD=1 only for macOS on ARM (i.e., aarch64-darwin). So everything is working again!

Also, the flake package is now using the same binary names as in CMakeLists (llama and quantize).
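
A sketch of how a package derivation could pass that define conditionally; the attribute names and version string here are illustrative assumptions, not the flake's exact contents:

```nix
# Sketch: add the dot-product define only on ARM macOS builds.
{ lib, stdenv, cmake }:

stdenv.mkDerivation {
  pname = "llama-cpp";
  version = "unstable";
  src = ./.;

  nativeBuildInputs = [ cmake ];

  # Nix's Clang for macOS does not define __ARM_FEATURE_DOTPROD itself,
  # so set it explicitly, and only on aarch64-darwin.
  cmakeFlags = lib.optionals (stdenv.isDarwin && stdenv.isAarch64) [
    "-DCMAKE_C_FLAGS=-D__ARM_FEATURE_DOTPROD=1"
  ];
}
```

lib.optionals keeps cmakeFlags an empty list on every other platform, so the flag never leaks into Linux or x86 builds.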

@ggerganov
Member

ggerganov commented Mar 15, 2023

So I am thinking that these kinds of deployments (together with docker #132) should probably live in a separate repo or repos.
What I don't like is that we are cluttering the root source tree. Another problem is that we have to keep the build scripts in sync with every change.

The alternative is to create, for example, llama.nix or llama.docker repositories that have llama.cpp as a submodule. All necessary build files would live there and be synced to a specific submodule version. It could also be one common repo for all of these deployment systems: llama.deploy.

For now I think we will merge this to make it easily available during the initial interest, but later we should probably do this.
Any comments?

@ingenieroariel

As a person interested in a flake.nix: a good future point to remove it (to unclutter the root) is when the package is available in NixOS/nixpkgs, so that people can do nix run nixpkgs#llama.

For now, having a flake.nix so that we can do nix run github:ggerganov/llama.cpp is a great step forward.

@ggerganov
Member

@prusnak Merge if you approve
