
Package request: ollama and ollama-webui as NixOS modules #273556

Closed
malteneuss opened this issue Dec 11, 2023 · 16 comments
Labels
0.kind: packaging request Request for a new package to be added

Comments

@malteneuss
Contributor

malteneuss commented Dec 11, 2023

Project description

To make open-source large language models (LLMs) accessible, there are projects like Ollama that make it almost trivial to download and run them locally on a consumer computer.
We already have Ollama in Nixpkgs, but it can only be used conveniently from a terminal (and doesn't store previous chats). What's missing is a web UI, e.g. Ollama-WebUI, which mimics ChatGPT's frontend and integrates nicely with Ollama.

My request is to add Ollama-WebUI (the only satisfactory web UI I could find) to Nixpkgs, and then to create NixOS modules so that both can be deployed as convenient services.
I've managed to piece together a working config, with some rough edges, to start with:

# Download LLMs per api
# curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
{ self, pkgs, config, lib, ... }:

with lib;

let
  cfg = config.services.ollama;

  ollama-webui-static = pkgs.buildNpmPackage rec {
    pname = "ollama-webui";
    version = "0.0.1";

    src = pkgs.fetchFromGitHub {
      owner = "ollama-webui";
      repo = "ollama-webui";
      rev = "970a71354b5dabc48862731dae2a0bfef733bfa9";
      sha256 = "5cHxHh2rHk/WGw8XyroKqCBG1Do+GnTvkgnX711bPEQ=";
      # hash = "sha256-noidea2er";
    };
    npmDepsHash = "sha256-N+wyvyKqsDfUqv3TQbxjuf8DF0uEJ7OBrwdCnX+IMZ4=";

    PUBLIC_API_BASE_URL = "http://localhost:11434/api";

    # Ollama URL for the backend to connect
    # The path '/ollama/api' will be redirected to the specified backend URL
    OLLAMA_API_BASE_URL = "http://localhost:11434/api";
    # npm run build creates a static "build" folder.
    installPhase = ''
      cp -R ./build $out
    '';
    # meta = with pkgs.stdenv.lib; {
    #   homepage = "https://github.com/my-username/my-repo";
    #   description = "ChatGPT-Style Web Interface for Ollama";
    #   license = licenses.mit;
    #   # maintainers = [ maintainers.john ];
    # };
  };
  ollama-webui = pkgs.writeShellScriptBin "ollama-webui" ''
    # CORS: allow the browser to make requests to ollama on a different port than the website
    ${pkgs.nodePackages.http-server}/bin/http-server ${ollama-webui-static} --cors='*' --port 8080
  '';
in {

  options = {
    services.ollama = {
      enable = mkOption {
        type = types.bool;
        description = "Enable Ollama service.";
        default = true;
      };

      host = mkOption rec {
        type = types.str;
        default = "127.0.0.1";
        example = default;
        description = "The host/domain name";
      };

      port = mkOption {
        type = types.port;
        default = 11434;
        description = ''
          The port to serve ollama over.
        '';
      };

    };
  };

  config = mkIf cfg.enable {

    # create a Linux user that will run ollama
    # and has access rights to store LLM files.
    users.users.ollama = {
      name = "ollama";
      group = "ollama";
      description = "Ollama user";
      isSystemUser = true;
    };
    # suggested by Nix build, no idea why
    users.groups.ollama = { };

    systemd.services.ollama = {
      description = "Ollama Service";
      wantedBy = [ "multi-user.target" ];
      after = [ "network.target" ];

      environment = {
        # make ollama accessible outside of localhost
        OLLAMA_HOST = "0.0.0.0:${toString cfg.port}";
        # allow access from web UIs under these URLs, otherwise ollama returns 403 Forbidden due to CORS.
        OLLAMA_ORIGINS = "http://localhost:8080,http://10.0.0.10:*";
        # need to create this folder manually for user ollama for now.
        HOME = "/var/lib/ollama";
      };

      serviceConfig = {
        ExecStart = "${pkgs.unstable.ollama}/bin/ollama serve";
        # DynamicUser = "true";
        User = "ollama";
        Type = "simple";
        Restart = "always";
        # RestartSec = 3;
        KillMode = "process";
      };
    };
    systemd.services.ollama-webui = {
      description = "Ollama WebUI Service";
      wantedBy = [ "multi-user.target" ];
      after = [ "network.target" ];

      serviceConfig = {
        ExecStart = "${ollama-webui}/bin/ollama-webui";
        # DynamicUser = "true";
        User = "ollama";
        Type = "simple";
        Restart = "always";
        # RestartSec = 3;
        # KillMode = "process";
      };
    };
  };
}
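
For completeness, a minimal sketch of how this module could be used from a system configuration, assuming the file above is saved as ./ollama.nix (the file name and the importing configuration are assumptions, not part of the module itself):

# configuration.nix (sketch; ./ollama.nix is the module above)
{ ... }: {
  imports = [ ./ollama.nix ];

  services.ollama = {
    enable = true;
    port = 11434;
  };
}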

@elohmeier @dit7ya Would that be a good idea, since you were involved in adding Ollama to nixpkgs? (thanks a lot btw)


@malteneuss added the "0.kind: packaging request" label on Dec 11, 2023
@mrjones2014

mrjones2014 commented Feb 20, 2024

Hey, I've been using this derivation and it's worked pretty great. However, I tried updating the versions and hashes, and now I'm getting the error below. I think it means there is an issue with the package-lock.json, but I'm not sure. Any idea how I can work around it?

npm ERR! code 1
npm ERR! path /build/source/node_modules/esbuild
npm ERR! command failed
npm ERR! command sh -c node install.js
npm ERR! [esbuild] Failed to find package "@esbuild/linux-x64" on the file system
npm ERR!
npm ERR! This can happen if you use the "--no-optional" flag. The "optionalDependencies"
npm ERR! package.json feature is used by esbuild to install the correct binary executable
npm ERR! for your current platform. This install script will now attempt to work around
npm ERR! this. If that fails, you need to remove the "--no-optional" flag to use esbuild.
npm ERR!
npm ERR! [esbuild] Trying to install package "@esbuild/linux-x64" using npm
npm ERR! [esbuild] Failed to install package "@esbuild/linux-x64" using npm: Command failed: npm install --loglevel=error --prefer-offline --no-audit --progress=false @esbuild/linux-x64@0.18.20
npm ERR! npm ERR! code ENOTCACHED
npm ERR! npm ERR! request to https://registry.npmjs.org/@esbuild%2flinux-x64 failed: cache mode is 'only-if-cached' but no cached response is available.
npm ERR!
npm ERR! npm ERR! Log files were not written due to an error writing to the directory: /nix/store/1vijb0vqh4b0xkr1810cx9w3q39nvnvx-open-webui-0.0.1-npm-deps/_logs
npm ERR! npm ERR! You can rerun the command with `--loglevel=verbose` to see the logs in your terminal
npm ERR!
npm ERR! [esbuild] Trying to download "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.18.20.tgz"
npm ERR! [esbuild] Failed to download "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.18.20.tgz": getaddrinfo EAI_AGAIN registry.npmjs.org
npm ERR! /build/source/node_modules/esbuild/install.js:275
npm ERR!         throw new Error(`Failed to install package "${pkg}"`);
npm ERR!               ^
npm ERR!
npm ERR! Error: Failed to install package "@esbuild/linux-x64"
npm ERR!     at checkAndPreparePackage (/build/source/node_modules/esbuild/install.js:275:15)
npm ERR!     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
npm ERR!
npm ERR! Node.js v20.11.1

npm ERR! Log files were not written due to an error writing to the directory: /nix/store/1vijb0vqh4b0xkr1810cx9w3q39nvnvx-open-webui-0.0.1-npm-deps/_logs
npm ERR! You can rerun the command with `--loglevel=verbose` to see the logs in your terminal

EDIT: It's an issue with their package-lock.json. I applied a patch to the package.json and package-lock.json that adds the missing dependency, and the build works. However, it seems they've added a backend component, so this static deploy doesn't work anymore.
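
For anyone hitting the same esbuild error, a rough sketch of how such a patch could be applied to the derivation above; the patch file name is made up, and both hashes have to be recomputed after patching, since the lockfile changes:

# Sketch only: apply a local patch (hypothetical file) that adds
# "@esbuild/linux-x64" to package.json and package-lock.json.
ollama-webui-static = pkgs.buildNpmPackage {
  pname = "ollama-webui";
  version = "0.0.1";

  src = pkgs.fetchFromGitHub {
    owner = "ollama-webui";
    repo = "ollama-webui";
    rev = "970a71354b5dabc48862731dae2a0bfef733bfa9";
    hash = pkgs.lib.fakeHash; # replace with the real hash reported on the first build
  };

  patches = [ ./add-esbuild-linux-x64.patch ]; # hypothetical local patch file
  npmDepsHash = pkgs.lib.fakeHash;             # recompute after patching the lockfile

  installPhase = ''
    cp -R ./build $out
  '';
};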

@mrjones2014

For those wanting a temporary solution, this is how I've got it running via podman:

ollama.nix

{ pkgs, ... }: {
  imports = [ ./open-webui.nix ];
  services.ollama.enable = true;
  environment.systemPackages = [ pkgs.oterm ];
}

open-webui.nix

# Download LLMs per api
# curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
{ config, ... }: {
  virtualisation = {
    podman = {
      enable = true;
      dockerCompat = true;
      defaultNetwork.settings.dns_enabled = true;
    };
  };
  virtualisation.oci-containers.backend = "podman";
  virtualisation.oci-containers.containers.open-webui = {
    autoStart = true;
    image = "ghcr.io/open-webui/open-webui";
    ports = [ "3000:8080" ];
    # TODO figure out how to create the data directory declaratively
    volumes = [ "${config.users.users.mat.home}/open-webui:/app/backend/data" ];
    extraOptions =
      [ "--network=host" "--add-host=host.containers.internal:host-gateway" ];
    environment = { OLLAMA_API_BASE_URL = "http://127.0.0.1:11434/api"; };
  };
  networking.firewall = { allowedTCPPorts = [ 80 443 8080 11434 3000 ]; };
}
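
One way to address the TODO above and create the data directory declaratively would be a tmpfiles rule in the same module. A sketch, assuming the path from the volume mount and ownership by the user mat (the group "users" is a guess):

  # Sketch: create the bind-mounted data directory declaratively.
  systemd.tmpfiles.rules = [
    "d ${config.users.users.mat.home}/open-webui 0750 mat users -"
  ];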

@mschwaig
Member

mschwaig commented Mar 4, 2024

ollama-webui has been renamed to open-webui.

Are people still interested in having this in nixpkgs?
Is anybody running this in a container instead?

@mrjones2014

Are people still interested in having this in nixpkgs?

I am very interested. I plan to set it up on my home server with NixOS (just haven't gotten around to fixing my server and installing NixOS on it yet).

Is anybody running this in a container instead?

Currently I'm running it in a Podman container. It works; however, there's no GPU support for the containerized version.

@Geezus42

Geezus42 commented Mar 4, 2024

I am running it in a container. I do not have a GPU, but there is GPU support, though people have reported a lot of issues getting it working. It seems to be improving; I wish I could remember if it was on GitHub or in the Discord, but I'm pretty sure there was a fix for using a GPU with Podman. I would recommend asking in the Discord; there are several helpful members.

For GPU Support

Installing with Podman

Might be useful...
https://blog.machinezoo.com/Local_LLMs_on_linux_with_ollama

@siph
Member

siph commented Mar 4, 2024

Ollama is up to date in nixpkgs, and you can enable GPU support with the acceleration value:

# one of `[ null "rocm" "cuda" ]`
, acceleration ? null

For example, you can install ollama with rocm support using home-manager like this:

home.packages = with pkgs; [
  (ollama.override { acceleration = "rocm"; })
];

There is also a nice flake here: https://github.com/abysssol/ollama-flake/

I'm using an old revision of that flake that supports my old GPU in combination with the web-ui running in a container.

@mrjones2014

If you're using the NixOS options it would be:

services.ollama = {
  enable = true;
  acceleration = "cuda"; # or "rocm"
};

@abysssol
Contributor

Does anyone here know why the host and port options ended up being merged into one option, listenAddress?
Is there any reason to keep it like this? I feel like keeping them separate is more modular/clean.
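
For reference, my understanding of the merged option as it currently looks when used (a sketch; the exact default may differ):

services.ollama = {
  enable = true;
  listenAddress = "127.0.0.1:11434"; # host and port combined into one string
};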

Also, @siph, I'm considering archiving ollama-flake since ollama is being kept up to date in nixpkgs.
Do you have any reason not to? Is there some use case where a flake is preferable to using nixpkgs?

@siph
Member

siph commented Mar 12, 2024

@abysssol I'm not updated to the latest version because my old hardware (RX 580) doesn't get recognized by rocm when I do. I was already using your flake and I'm just pinning to the latest version that works for me. I would change to nixpkgs if I wanted the current version.

I agree on separating host and port.

@mrjones2014

Has anyone gotten open-webui, the latest version with the backend, building and running natively instead of through podman?

@mschwaig
Member

As far as I remember, the branch that I linked here runs fine:
#275448 (comment)

It's just far away from a state where it could land in nixpkgs, and I'm still too busy with ROCm and a few other things right now to move it forward.

@taoi11

taoi11 commented Mar 22, 2024

If you're using the NixOS options it would be:

services.ollama = {
  enable = true;
  acceleration = "cuda"; # or "rocm"
};

If I have an Nvidia and an AMD GPU in the same system, can Nix work with that?
Would I be able to use cuda and rocm at the same time?

@abysssol
Contributor

@taoi11 not currently, though it could be changed to support that (again). It used to, but I changed that with the expectation that the nix package would supply ollama with a single version of llama-cpp built with nix. I think llama-cpp only supports a single acceleration backend at a time, such as cuda, rocm, or vulkan. Ollama only supports multiple simultaneously because it spins up multiple separate llama-cpp instances compiled with different acceleration support, if I understand correctly.

Do you think it should be possible to enable cuda, rocm, and vulkan simultaneously? What would a use case be for that? And if not, how would the options look to enable ((cuda or rocm) xor vulkan)? Should I just add an acceleration option "cuda+rocm", or change the option layout entirely?
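
To make the question concrete, a purely hypothetical sketch of the last variant (this is not an existing nixpkgs option definition):

  acceleration = lib.mkOption {
    # Hypothetical layout: add a combined "cuda+rocm" value alongside the existing ones.
    type = lib.types.nullOr (lib.types.enum [ "rocm" "cuda" "cuda+rocm" "vulkan" ]);
    default = null;
    description = "Acceleration backend(s) to build ollama with; null means CPU only.";
  };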

@mschwaig
Member

@taoi11 you would want to use the Vulkan backend of llama.cpp for that or another cross-platform backend, but that's not currently supported in upstream ollama, and it is also not supported in nixpkgs.

@abysssol the best thing we can do in nixpkgs is allow passing different versions of llama-cpp into the ollama build again, so that we can pass in a version that is compiled with Vulkan support. It might also be possible to hack together a Vulkan acceleration option before ollama officially supports it by doing something like this: ollama/ollama#2396 (comment), but 'puppeteering' their embedded llama.cpp build to do different things does not sound like fun to me.
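
If passing llama-cpp back in were possible, a hypothetical sketch of what that might look like (the llama-cpp argument on ollama does not exist today, and the vulkanSupport flag on the nixpkgs llama-cpp package is an assumption):

environment.systemPackages = [
  # Hypothetical: relies on ollama accepting a llama-cpp argument again.
  (pkgs.ollama.override {
    llama-cpp = pkgs.llama-cpp.override { vulkanSupport = true; };
  })
];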


What follows is a longer technical explanation.

As far as I know, ollama supports producing an executable with acceleration through both the cuda and ROCm backends by linking two versions of llama-cpp at the same time. This means it can run with either one backend or the other, not both at once.

It would be possible, with some (probably considerable) effort, for llama-cpp to support something like this itself: compiling one binary that contains all of the selected acceleration backends and then choosing, or having it choose, the appropriate backend at runtime.

But as soon as you talk about a system with two different GPUs, you are probably talking about using them (1) at the same time and (2) together, and that's something completely different.

Using two acceleration backends like that would be basically impossible to implement in ollama and terribly difficult in llama-cpp: if you accelerate some part of the computation on one card, you now need some way to pass the intermediary result to the other card. That is a terribly annoying problem to solve when every backend uses a different API, because you would have to make those otherwise unrelated acceleration backends work similarly enough that they can share these intermediary results through some internal API.

The way to go for mixed-GPU setups is one backend that uses an API which supports different underlying hardware, because then all of the GPUs are programmed with the same API and the intermediary results are compatible.

This is possible with the Vulkan backend, and it is actually implemented in llama.cpp.

The downside of using the Vulkan backend is that, last time I checked, it was significantly less performant than the CUDA or ROCm backends, and running across multiple GPUs is probably also less performant than having the same compute on a single GPU. So you have to read up, or measure and check for yourself, to see whether it is even worth it to use both cards, and to make sure it's not broken, because that's still a more exotic setup.

@taoi11

taoi11 commented Mar 23, 2024

@abysssol @mschwaig
This is all very new to me. Thank you both for explaining this to me so simply.

The idea I had was to figure out some way to Frankenstein two of my extra (older) GPUs together.
I will go down the Vulkan rabbit hole and see what comes of it.

Many thanks!

@malteneuss
Contributor Author

Closing as ollama was added to nixpkgs. Ollama-webui (renamed to open-webui) may be added as part of another issue. @mschwaig has brought his branch quite far for that: https://github.com/mschwaig/nixpkgs/tree/open-webui
