
Package request: ollama and ollama-webui as NixOS modules #273556

Closed
malteneuss opened this issue Dec 11, 2023 · 16 comments
Labels
0.kind: packaging request Request for a new package to be added

Comments

@malteneuss
Contributor

malteneuss commented Dec 11, 2023

Project description

To make open-source large language models (LLMs) accessible, there are projects like Ollama that make it almost trivial to download and run them locally on a consumer computer.
We already have Ollama in Nixpkgs, but it can only be used conveniently from a terminal (and doesn't store previous chats). What's missing is a web UI, e.g. Ollama-WebUI, which mimics ChatGPT's frontend and integrates nicely with Ollama.

My request is to add Ollama-WebUI (the only satisfactory web UI I could find) to Nixpkgs, and then to create NixOS modules so that both can be deployed as convenient services.
I've managed to piece together a working config, with some rough edges, to start with:

# Download LLMs per api
# curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
{ self, pkgs, config, lib, ... }:

with lib;

let
  cfg = config.services.ollama;

  ollama-webui-static = pkgs.buildNpmPackage rec {
    pname = "ollama-webui";
    version = "0.0.1";

    src = pkgs.fetchFromGitHub {
      owner = "ollama-webui";
      repo = "ollama-webui";
      rev = "970a71354b5dabc48862731dae2a0bfef733bfa9";
      sha256 = "5cHxHh2rHk/WGw8XyroKqCBG1Do+GnTvkgnX711bPEQ=";
      # hash = "sha256-noidea2er";
    };
    npmDepsHash = "sha256-N+wyvyKqsDfUqv3TQbxjuf8DF0uEJ7OBrwdCnX+IMZ4=";

    PUBLIC_API_BASE_URL = "http://localhost:11434/api";

    # Ollama URL for the backend to connect
    # The path '/ollama/api' will be redirected to the specified backend URL
    OLLAMA_API_BASE_URL = "http://localhost:11434/api";
    # npm run build creates a static "build" folder.
    installPhase = ''
      cp -R ./build $out
    '';
    # meta = with pkgs.stdenv.lib; {
    #   homepage = "https://github.com/my-username/my-repo";
    #   description = "ChatGPT-Style Web Interface for Ollama";
    #   license = licenses.mit;
    #   # maintainers = [ maintainers.john ];
    # };
  };
  ollama-webui = pkgs.writeShellScriptBin "ollama-webui" ''
    # CORS: allow the browser to make requests to ollama on a different port than the website
    ${pkgs.nodePackages.http-server}/bin/http-server ${ollama-webui-static} --cors='*' --port 8080
  '';
in {

  options = {
    services.ollama = {
      enable = mkOption {
        type = types.bool;
        description = "Enable Ollama service.";
        default = true;
      };

      host = mkOption rec {
        type = types.str;
        default = "127.0.0.1";
        example = default;
        description = "The host/domain name";
      };

      port = mkOption {
        type = types.port;
        default = 11434;
        description = ''
          The port to serve ollama over.
        '';
      };

    };
  };

  config = mkIf cfg.enable {

    # create a Linux user that will run ollama
    # and has access rights to store LLM files.
    users.users.ollama = {
      name = "ollama";
      group = "ollama";
      description = "Ollama user";
      isSystemUser = true;
    };
    # suggested by Nix build, no idea why
    users.groups.ollama = { };

    systemd.services.ollama = {
      description = "Ollama Service";
      wantedBy = [ "multi-user.target" ];
      after = [ "network.target" ];

      environment = {
        # make ollama accessible outside of localhost
        OLLAMA_HOST = "0.0.0.0:${toString cfg.port}";
        # allow access from web UIs under these URLs, otherwise ollama returns 403 Forbidden due to CORS.
        OLLAMA_ORIGINS = "http://localhost:8080,http://10.0.0.10:*";
        # need to create this folder manually for user ollama for now.
        HOME = "/var/lib/ollama";
      };

      serviceConfig = {
        ExecStart = "${pkgs.unstable.ollama}/bin/ollama serve";
        # DynamicUser = "true";
        User = "ollama";
        Type = "simple";
        Restart = "always";
        # RestartSec = 3;
        KillMode = "process";
      };
    };
    systemd.services.ollama-webui = {
      description = "Ollama WebUI Service";
      wantedBy = [ "multi-user.target" ];
      after = [ "network.target" ];

      serviceConfig = {
        ExecStart = "${ollama-webui}/bin/ollama-webui";
        # DynamicUser = "true";
        User = "ollama";
        Type = "simple";
        Restart = "always";
        # RestartSec = 3;
        # KillMode = "process";
      };
    };
  };
}
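
For completeness, a minimal sketch of how this module could be used from a system configuration, assuming the file above is saved as ./ollama.nix (the file name and the importing configuration are assumptions, not part of the module itself):

# configuration.nix (sketch; ./ollama.nix is the module above)
{ ... }: {
  imports = [ ./ollama.nix ];

  services.ollama = {
    enable = true;
    port = 11434;
  };
}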

@elohmeier @dit7ya Would that be a good idea, since you were involved in adding Ollama to nixpkgs? (thanks a lot btw)


@malteneuss added the "0.kind: packaging request" label on Dec 11, 2023
@mrjones2014

mrjones2014 commented Feb 20, 2024

Hey, I've been using this derivation and it's worked pretty great. However, I tried updating the versions and hashes, and now I'm getting the error below. I think it means there is an issue with the package-lock.json, but I'm not sure. Any idea how I can work around it?

npm ERR! code 1
npm ERR! path /build/source/node_modules/esbuild
npm ERR! command failed
npm ERR! command sh -c node install.js
npm ERR! [esbuild] Failed to find package "@esbuild/linux-x64" on the file system
npm ERR!
npm ERR! This can happen if you use the "--no-optional" flag. The "optionalDependencies"
npm ERR! package.json feature is used by esbuild to install the correct binary executable
npm ERR! for your current platform. This install script will now attempt to work around
npm ERR! this. If that fails, you need to remove the "--no-optional" flag to use esbuild.
npm ERR!
npm ERR! [esbuild] Trying to install package "@esbuild/linux-x64" using npm
npm ERR! [esbuild] Failed to install package "@esbuild/linux-x64" using npm: Command failed: npm install --loglevel=error --prefer-offline --no-audit --progress=false @esbuild/linux-x64@0.18.20
npm ERR! npm ERR! code ENOTCACHED
npm ERR! npm ERR! request to https://registry.npmjs.org/@esbuild%2flinux-x64 failed: cache mode is 'only-if-cached' but no cached response is available.
npm ERR!
npm ERR! npm ERR! Log files were not written due to an error writing to the directory: /nix/store/1vijb0vqh4b0xkr1810cx9w3q39nvnvx-open-webui-0.0.1-npm-deps/_logs
npm ERR! npm ERR! You can rerun the command with `--loglevel=verbose` to see the logs in your terminal
npm ERR!
npm ERR! [esbuild] Trying to download "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.18.20.tgz"
npm ERR! [esbuild] Failed to download "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.18.20.tgz": getaddrinfo EAI_AGAIN registry.npmjs.org
npm ERR! /build/source/node_modules/esbuild/install.js:275
npm ERR!         throw new Error(`Failed to install package "${pkg}"`);
npm ERR!               ^
npm ERR!
npm ERR! Error: Failed to install package "@esbuild/linux-x64"
npm ERR!     at checkAndPreparePackage (/build/source/node_modules/esbuild/install.js:275:15)
npm ERR!     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
npm ERR!
npm ERR! Node.js v20.11.1

npm ERR! Log files were not written due to an error writing to the directory: /nix/store/1vijb0vqh4b0xkr1810cx9w3q39nvnvx-open-webui-0.0.1-npm-deps/_logs
npm ERR! You can rerun the command with `--loglevel=verbose` to see the logs in your terminal

EDIT: It's an issue with their package-lock.json. I applied a patch to the package.json and package-lock.json that adds the missing dependency, and the build works. However, it seems they've added a backend component, so this static deploy doesn't work anymore.
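
For anyone hitting the same esbuild error, a rough sketch of how such a patch could be applied to the derivation above; the patch file name is made up, and both hashes have to be recomputed after patching, since the lockfile changes:

# Sketch only: apply a local patch (hypothetical file) that adds
# "@esbuild/linux-x64" to package.json and package-lock.json.
ollama-webui-static = pkgs.buildNpmPackage {
  pname = "ollama-webui";
  version = "0.0.1";

  src = pkgs.fetchFromGitHub {
    owner = "ollama-webui";
    repo = "ollama-webui";
    rev = "970a71354b5dabc48862731dae2a0bfef733bfa9";
    hash = pkgs.lib.fakeHash; # replace with the real hash reported on the first build
  };

  patches = [ ./add-esbuild-linux-x64.patch ]; # hypothetical local patch file
  npmDepsHash = pkgs.lib.fakeHash;             # recompute after patching the lockfile

  installPhase = ''
    cp -R ./build $out
  '';
};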

@mrjones2014

For those wanting a temporary solution, this is how I've got it running via podman:

ollama.nix

{ pkgs, ... }: {
  imports = [ ./open-webui.nix ];
  services.ollama.enable = true;
  environment.systemPackages = [ pkgs.oterm ];
}

open-webui.nix

# Download LLMs per api
# curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
{ config, ... }: {
  virtualisation = {
    podman = {
      enable = true;
      dockerCompat = true;
      defaultNetwork.settings.dns_enabled = true;
    };
  };
  virtualisation.oci-containers.backend = "podman";
  virtualisation.oci-containers.containers.open-webui = {
    autoStart = true;
    image = "ghcr.io/open-webui/open-webui";
    ports = [ "3000:8080" ];
    # TODO figure out how to create the data directory declaratively
    volumes = [ "${config.users.users.mat.home}/open-webui:/app/backend/data" ];
    extraOptions =
      [ "--network=host" "--add-host=host.containers.internal:host-gateway" ];
    environment = { OLLAMA_API_BASE_URL = "http://127.0.0.1:11434/api"; };
  };
  networking.firewall = { allowedTCPPorts = [ 80 443 8080 11434 3000 ]; };
}
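
One way to address the TODO above and create the data directory declaratively would be a tmpfiles rule in the same module. A sketch, assuming the path from the volume mount and ownership by the user mat (the group "users" is a guess):

  # Sketch: create the bind-mounted data directory declaratively.
  systemd.tmpfiles.rules = [
    "d ${config.users.users.mat.home}/open-webui 0750 mat users -"
  ];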

@mschwaig
Member

mschwaig commented Mar 4, 2024

ollama-webui has been renamed to open-webui.

Are people still interested in having this in nixpkgs?
Is anybody running this in a container instead?

@mrjones2014

Are people still interested in having this in nixpkgs?

I am very interested. I plan to set it up on my home server with NixOS (just haven't gotten around to fixing my server and installing NixOS on it yet).

Is anybody running this in a container instead?

Currently I'm running it in a Podman container. It works; however, there's no GPU support for the containerized version.

@Geezus42

Geezus42 commented Mar 4, 2024

I am running it in a container. I do not have a GPU, but there is GPU support, though people have reported a lot of issues getting it working. It seems to be improving; I wish I could remember if it was on GitHub or in the Discord, but I'm pretty sure there was a fix for using a GPU with Podman. I would recommend asking in the Discord; there are several helpful members.

For GPU Support

Installing with Podman

Might be useful...
https://blog.machinezoo.com/Local_LLMs_on_linux_with_ollama

@siph
Member

siph commented Mar 4, 2024

Ollama is up to date in nixpkgs, and you can enable GPU support with the acceleration value:

# one of `[ null "rocm" "cuda" ]`
, acceleration ? null

For example, you can install ollama with rocm support using home-manager like this:

home.packages = with pkgs; [
  (ollama.override { acceleration = "rocm"; })
];

There is also a nice flake here: https://github.com/abysssol/ollama-flake/

I'm using an old revision of that flake that supports my old GPU in combination with the web-ui running in a container.

@mrjones2014

If you're using the NixOS options it would be:

services.ollama = {
  enable = true;
  acceleration = "cuda"; # or "rocm"
};

@abysssol
Contributor

Does anyone here know why the host and port options ended up being merged into one option, listenAddress?
Is there any reason to keep it like this? I feel like keeping them separate is more modular/clean.
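
For reference, my understanding of the merged option as it currently looks when used (a sketch; the exact default may differ):

services.ollama = {
  enable = true;
  listenAddress = "127.0.0.1:11434"; # host and port combined into one string
};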

Also, @siph, I'm considering archiving ollama-flake since ollama is being kept up to date in nixpkgs.
Do you have any reason not to? Is there some use case where a flake is preferable to using nixpkgs?

@siph
Member

siph commented Mar 12, 2024

@abysssol I'm not updated to the latest version because my old hardware (RX 580) doesn't get recognized by rocm when I do. I was already using your flake and I'm just pinning to the latest version that works for me. I would change to nixpkgs if I wanted the current version.

I agree on separating host and port.

@mrjones2014

Has anyone gotten open-webui, the latest version with the backend, building and running natively instead of through podman?

@mschwaig
Member

As far as I remember, the branch that I linked here runs fine:
#275448 (comment)

It's just far away from a state where it could land in nixpkgs, and I'm still too busy with ROCm and a few other things right now to move it forward.

@taoi11

taoi11 commented Mar 22, 2024

If you're using the NixOS options it would be:

services.ollama = {
  enable = true;
  acceleration = "cuda"; # or "rocm"
};

If I have an Nvidia and an AMD GPU in the same system, can Nix work with that?
Would I be able to use cuda and rocm at the same time?

@abysssol
Contributor

@taoi11 not currently, though it could be changed to support that (again). It used to, but I changed that with the expectation that the nix package would supply ollama with a single version of llama-cpp built with nix. I think llama-cpp only supports a single acceleration backend at a time, such as cuda, rocm, or vulkan. Ollama only supports multiple simultaneously because it spins up multiple separate llama-cpp instances compiled with different acceleration support, if I understand correctly.

Do you think it should be possible to enable cuda, rocm, and vulkan simultaneously? What would a use case be for that? And if not, how would the options look to enable ((cuda or rocm) xor vulkan)? Should I just add an acceleration option "cuda+rocm", or change the option layout entirely?
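
To make the question concrete, a purely hypothetical sketch of the last variant (this is not an existing nixpkgs option definition):

  acceleration = lib.mkOption {
    # Hypothetical layout: add a combined "cuda+rocm" value alongside the existing ones.
    type = lib.types.nullOr (lib.types.enum [ "rocm" "cuda" "cuda+rocm" "vulkan" ]);
    default = null;
    description = "Acceleration backend(s) to build ollama with; null means CPU only.";
  };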

@mschwaig
Member

@taoi11 you would want to use the Vulkan backend of llama.cpp for that or another cross-platform backend, but that's not currently supported in upstream ollama, and it is also not supported in nixpkgs.

@abysssol the best thing we can do in nixpkgs is allow passing different versions of llama-cpp into the ollama build again, so that we can pass in a version that is compiled with Vulkan support. It might also be possible to hack together a Vulkan acceleration option before ollama officially supports it by doing something like this: ollama/ollama#2396 (comment), but 'puppeteering' their embedded llama.cpp build to do different things does not sound like fun to me.
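
If passing llama-cpp back in were possible, a hypothetical sketch of what that might look like (the llama-cpp argument on ollama does not exist today, and the vulkanSupport flag on the nixpkgs llama-cpp package is an assumption):

environment.systemPackages = [
  # Hypothetical: relies on ollama accepting a llama-cpp argument again.
  (pkgs.ollama.override {
    llama-cpp = pkgs.llama-cpp.override { vulkanSupport = true; };
  })
];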


What follows is a longer technical explanation.

As far as I know, ollama supports producing an executable with acceleration through both the cuda and ROCm backends by linking two versions of llama-cpp at the same time. This means it can run with either one backend or the other, not both at once.

It would be possible, with some (probably considerable) effort, for llama-cpp to support something like this itself: compiling one binary that contains all of the selected acceleration backends and then choosing, or having it choose, the appropriate backend at runtime.

But as soon as you talk about a system with two different GPUs, you are probably talking about using them (1) at the same time and (2) together, and that's something completely different.

Using two acceleration backends like that would be basically impossible to implement in ollama and terribly difficult in llama-cpp: if you accelerate some part of the computation on one card, you now need some way to pass the intermediary result to the other card. That is a terribly annoying problem to solve when every backend uses a different API, because you would have to make those otherwise unrelated acceleration backends work similarly enough that they can share these intermediary results through some internal API.

The way to go for mixed-GPU setups is one backend that uses an API which supports different underlying hardware, because then all of the GPUs are programmed with the same API and the intermediary results are compatible.

This is possible with the Vulkan backend, and it is actually implemented in llama.cpp.

The downside of using the Vulkan backend is that, last time I checked, it was significantly less performant than the CUDA or ROCm backends, and running across multiple GPUs is probably also less performant than having the same compute on a single GPU. So you have to read up, or measure and check for yourself, to see whether it is even worth it to use both cards, and to make sure it's not broken, because that's still a more exotic setup.

@taoi11

taoi11 commented Mar 23, 2024

@abysssol @mschwaig
This is all very new to me. Thank you both for explaining this to me so simply.

The idea I had was to figure out some way to Frankenstein two of my extra (older) GPUs together.
I will go down the Vulkan rabbit hole and see what comes of it.

Many thanks!

@malteneuss
Contributor Author

Closing as ollama was added to nixpkgs. Ollama-webui (renamed to open-webui) may be added as part of another issue. @mschwaig has brought his branch quite far for that: https://github.com/mschwaig/nixpkgs/tree/open-webui
