
Nix flake #40

Merged
merged 8 commits on Mar 17, 2023

Conversation

niklaskorz
Contributor

@niklaskorz commented Mar 12, 2023

This pull request adds a simple Nix Flake for building and distributing the binaries of this repository in a combined package.

The main binary can be executed as follows (assuming experimental-features = nix-command flakes is enabled):

$ nix shell github:niklaskorz/llama.cpp/nix -c llama --help
usage: llama-cpp [options]

options:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -n N, --n_predict N   number of tokens to predict (default: 128)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.8)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: models/llama-7B/ggml-model.bin)

When merged into the main branch of this repository, nix shell github:ggerganov/llama.cpp would be used instead.

The binary names in the Nix package are:

  • main => llama
  • quantize => quantize
  • convert-pth-to-ggml.py => convert-pth-to-ggml

Note that convert-pth-to-ggml directly executes in a Python 3.10 environment with the required Python packages.

@Dietr1ch

Can you add the .envrc to load the flake right away?

A single-line .envrc should do:

use_flake

@Dietr1ch

Do you plan to add support for the C++ part?

BTW, I ended up using this trivial shell.nix because I haven't learnt how to use flakes yet:

{ pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {} }:

pkgs.mkShell {
  buildInputs = with pkgs; [
    gnumake
    clang

    black
    python310
    python310Packages.torch-bin
    python310Packages.numpy
    python310Packages.sentencepiece
  ];
}

The differences so far are GNU Make, Clang, and black.

@niklaskorz
Contributor Author

niklaskorz commented Mar 13, 2023

> Do you plan to add support for the C++ part?
>
> BTW, I ended up using this trivial shell.nix because I haven't learnt how to use flakes yet,
>
> { pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {} }:
>
> pkgs.mkShell {
>   buildInputs = with pkgs; [
>     gnumake
>     clang
>
>     black
>     python
>     python310Packages.torch-bin
>     python310Packages.numpy
>     python310Packages.sentencepiece
>   ];
> }
>
> Differences so far are, GNU Make, clang, and black.

No need to include these: the C/C++ compiler and make (and even CMake, I think) are already included automatically by Nix's stdenv. I am hesitant to include black unless @ggerganov wants the Python script to explicitly be black-formatted in the future.

Edit: CMake is not pre-included in stdenv, but GCC/G++ and GNU make are.
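
The stdenv point above can be illustrated with a pared-down shell.nix. This is only a sketch (the nixpkgs pin and the withPackages combination are illustrative assumptions, not a file from this PR): the compiler and GNU make come from stdenv for free, so only the Python toolchain needs listing.

```nix
# Sketch only: mkShell inherits stdenv, which already provides a C/C++
# compiler and GNU make, so neither needs to appear in buildInputs.
{ pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {} }:

pkgs.mkShell {
  buildInputs = [
    # One Python 3.10 environment carrying all required packages.
    (pkgs.python310.withPackages (ps: with ps; [ torch-bin numpy sentencepiece ]))
  ];
}
```

Using withPackages also avoids mixing a bare interpreter with packages from a different python3xxPackages set.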

@niklaskorz
Contributor Author

niklaskorz commented Mar 13, 2023

@Dietr1ch I have added shell.nix and default.nix for backwards compatibility with non-flake Nix, as well as the direnv config and a development shell.

@niklaskorz marked this pull request as ready for review on March 13, 2023, 09:52
@Xe left a comment

NixOS expert here: I have tested this locally and it works perfectly.

@prusnak
Collaborator

prusnak commented Mar 14, 2023

> NixOS expert here: I have tested this locally and it works perfectly.

I don't see flake-compat being used anywhere; why is it imported?

@niklaskorz
Contributor Author

>> NixOS expert here: I have tested this locally and it works perfectly.
>
> I don't see flake-compat being used anywhere, why is it imported?

flake-compat has to be referenced by the flake in order to be included in the lockfile; shell.nix and default.nix then read the lockfile to load the pinned version of flake-compat.
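
For reference, the usual flake-compat shim looks roughly like this; a sketch of the common pattern, not necessarily this PR's exact default.nix. It reads flake.lock to fetch the pinned flake-compat revision, then exposes the flake's outputs to plain Nix:

```nix
# default.nix (sketch): fetch flake-compat at the revision pinned in
# flake.lock, then re-expose this flake's default outputs.
(import
  (let lock = builtins.fromJSON (builtins.readFile ./flake.lock); in
    fetchTarball {
      url = "https://github.com/edolstra/flake-compat/archive/${lock.nodes.flake-compat.locked.rev}.tar.gz";
      sha256 = lock.nodes.flake-compat.locked.narHash;
    })
  { src = ./.; }
).defaultNix
```

A shell.nix would do the same but end in .shellNix instead of .defaultNix.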

Member

@ggerganov left a comment

Is it possible to move these files into a subfolder?
I want the root to be as clean as possible.

@niklaskorz
Contributor Author

> Is it possible to move these files in a subfolder? I want root to be clean as possible

I don't think that is possible: Nix expects flake.nix and flake.lock to be at the root of the repository, and the same goes for direnv with .envrc. Personally, I am fine with removing default.nix and shell.nix if those are too much, as they are only needed for backwards compatibility.

@prusnak
Collaborator

prusnak commented Mar 14, 2023

Yeah, let's drop shell/default.nix. I think we can also drop direnv; it's not necessary for running llama as a Nix flake.

@niklaskorz
Contributor Author

> Yeah, let's drop shell/default.nix. I think we can also drop direnv - it's not necessary for running llama as nix flake.

Done, and I added .envrc to .gitignore, so anyone who wants this can still configure it locally.

@niklaskorz
Contributor Author

niklaskorz commented Mar 14, 2023

Rebased onto the latest master commit; now the build fails in Nix:

ggml.c:1364:25: warning: implicit declaration of function 'vdotq_s32' is invalid in C99 [-Wimplicit-function-declaration]
        int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
                        ^
ggml.c:1364:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
        int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
                  ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1365:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values) with an expression of incompatible type 'int'
        int32x4_t p_1 = vdotq_s32(vdupq_n_s32(0), v0_1ls, v1_1ls);
                  ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1367:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
        p_0 = vdotq_s32(p_0, v0_0hs, v1_0hs);
            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1368:13: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
        p_1 = vdotq_s32(p_1, v0_1hs, v1_1hs);
            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning and 4 errors generated.

Interestingly, this does not happen when building in my normal macOS environment without Nix.

@prusnak
Collaborator

prusnak commented Mar 14, 2023

> Interestingly, this does not happen when building in my normal macOS environment without Nix.

My guess is that this happens because stdenv in nixpkgs for macOS is still stuck at Clang 11.x, while recent macOS ships Clang 14.x.

The fix would be to override the stdenv for macOS to use Clang 14.x.

@niklaskorz
Contributor Author

niklaskorz commented Mar 14, 2023

>> Interestingly, this does not happen when building in my normal macOS environment without Nix.
>
> My guess is because stdenv in nixpkgs for macos is still stuck at Clang 11.x while recent macOS ships Clang 14.x.
>
> The fix would be to override the stdenv for macos to use Clang 14.x.

I tried that; it doesn't change anything. The problem appears to be the macOS SDK, not the compiler. Unfortunately, nixpkgs does not include any SDK newer than macOS 12. I suspect that vdotq_s32 has only been added in macOS 13.

Edit: it's defined in /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/14.0.0/include/arm_neon.h, so I'll have a look at how this arm_neon.h differs from the header included in nixpkgs' llvmPackages_14.

Edit 2: the function is defined both in Nix's standard Clang 11 <arm_neon.h> and in nixpkgs' llvmPackages_14... so now to find out what is preventing it from being included.

Edit 3: Thanks to this reddit thread, I have found that __ARM_FEATURE_DOTPROD is only defined in Xcode 13 and newer, and apparently not in any Clang version shipped by Nix for macOS. Adding -D__ARM_FEATURE_DOTPROD as a compiler flag makes it compile (and link) without any issues. I know that 47857e5 has since made this code path optional, which would also fix this error, but I think we could manually add the define for the flake only (possibly by passing it to CMake?).

@niklaskorz
Contributor Author

The flake package is now built with CMake instead of Make and passes -DCMAKE_C_FLAGS=-D__ARM_FEATURE_DOTPROD=1 only for macOS on ARM (i.e., aarch64-darwin). So everything is working again!

Also, the flake package is now using the same binary names as in CMakeLists (llama and quantize).
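
A sketch of how a package derivation could pass that define conditionally; the attribute names and version string here are illustrative assumptions, not the flake's exact contents:

```nix
# Sketch: add the dot-product define only on ARM macOS builds.
{ lib, stdenv, cmake }:

stdenv.mkDerivation {
  pname = "llama-cpp";
  version = "unstable";
  src = ./.;

  nativeBuildInputs = [ cmake ];

  # Nix's Clang for macOS does not define __ARM_FEATURE_DOTPROD itself,
  # so set it explicitly, and only on aarch64-darwin.
  cmakeFlags = lib.optionals (stdenv.isDarwin && stdenv.isAarch64) [
    "-DCMAKE_C_FLAGS=-D__ARM_FEATURE_DOTPROD=1"
  ];
}
```

lib.optionals keeps cmakeFlags an empty list on every other platform, so the flag never leaks into Linux or x86 builds.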

@ggerganov
Member

ggerganov commented Mar 15, 2023

So I am thinking that these kinds of deployments (together with docker #132) should probably live in a separate repo or repos.
What I don't like is that we are cluttering the root source tree. Another problem is that we have to keep the build scripts in sync with every change.

The alternative is to create, for example, llama.nix or llama.docker repositories that have llama.cpp as a submodule. All necessary build files would live there and be synced to a specific submodule version. It could also be one common repo for all of these deployment systems: llama.deploy.

For now I think we will merge this to make it easily available during the initial interest, but later we should probably do this.
Any comments?

@ingenieroariel

As a person interested in a flake.nix: a good future point to remove it (to unclutter the root) is when the package is available in NixOS/nixpkgs, so that people can do nix run nixpkgs#llama.

For now, having a flake.nix so that we can do nix run github:ggerganov/llama.cpp is a great step forward.

@ggerganov
Member

@prusnak Merge if you approve
