Package request: ollama and ollama-webui as NixOS modules #273556
Hey, I've been using this derivation and it's worked pretty great. However I tried updating the versions and hashes, and now I'm getting this error. I think it means there is an issue with the package-lock.json, but I'm not sure. Any idea how I can work around it?
EDIT: It's an issue with their package-lock.json. I applied a patch to the package.json and package-lock.json adding the missing dependency and it works. However, it seems they've added a backend component and this static deploy doesn't work anymore.
For those wanting a temporary solution, this is how I've got it running via podman:
```nix
# configuration.nix
{ pkgs, ... }: {
  imports = [ ./open-webui.nix ];

  services.ollama.enable = true;
  environment.systemPackages = [ pkgs.oterm ];
}

# Download LLMs via the API:
# curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
```

```nix
# open-webui.nix
{ config, ... }: {
  virtualisation = {
    podman = {
      enable = true;
      dockerCompat = true;
      defaultNetwork.settings.dns_enabled = true;
    };
  };

  virtualisation.oci-containers.backend = "podman";
  virtualisation.oci-containers.containers.open-webui = {
    autoStart = true;
    image = "ghcr.io/open-webui/open-webui";
    ports = [ "3000:8080" ];
    # TODO figure out how to create the data directory declaratively
    # (see the tmpfiles sketch after this block)
    volumes = [ "${config.users.users.mat.home}/open-webui:/app/backend/data" ];
    extraOptions =
      [ "--network=host" "--add-host=host.containers.internal:host-gateway" ];
    environment = { OLLAMA_API_BASE_URL = "http://127.0.0.1:11434/api"; };
  };

  networking.firewall = { allowedTCPPorts = [ 80 443 8080 11434 3000 ]; };
}
```
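For the TODO about creating the data directory declaratively, one option would be `systemd.tmpfiles.rules`. A minimal sketch, assuming the same path as above and that the `mat` user should own the directory (mode and group are guesses, adjust as needed):

```nix
# Sketch only: create the bind-mounted data directory declaratively.
# Ownership and mode are assumptions; adjust to what the container expects.
{ config, ... }: {
  systemd.tmpfiles.rules = [
    "d ${config.users.users.mat.home}/open-webui 0750 mat users -"
  ];
}
```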
Are people still interested in having this in nixpkgs?
I am very interested. I plan to set it up on my home server with NixOS (just haven't gotten around to fixing my server and installing NixOS on it yet).
Currently I'm running it in a Podman container. It works; however, there's no GPU support for the containerized version.
I am running it in a container. I do not have a GPU, but there is GPU support, though people have reported a lot of issues getting it working. It seems to be improving. I wish I could remember whether it was on GitHub or in the Discord, but I'm pretty sure there was a fix for using a GPU with Podman. I would recommend asking in the Discord; there are several helpful members there. Might be useful...
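For anyone experimenting with GPU passthrough to a Podman container, a rough sketch of the usual approach is below. Everything here is an assumption rather than the fix mentioned above: the image name, port, and which `--device` flags apply depend entirely on your GPU and driver setup.

```nix
# Rough sketch, not a verified fix. /dev/kfd and /dev/dri are the usual
# devices for ROCm; NVIDIA setups typically go through CDI instead.
{ ... }: {
  virtualisation.oci-containers.containers.ollama = {
    image = "docker.io/ollama/ollama";
    ports = [ "11434:11434" ];
    extraOptions = [
      "--device=/dev/kfd"
      "--device=/dev/dri"
      # NVIDIA via CDI (needs nvidia-container-toolkit): "--device=nvidia.com/gpu=all"
    ];
  };
}
```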
Ollama is up-to-date in nixpkgs, and you can enable GPU support with the `acceleration` override argument (see nixpkgs/pkgs/tools/misc/ollama/default.nix, lines 21 to 22 at c7f550a).
For example, you can install it with:

```nix
home.packages = with pkgs; [
  (ollama.override { acceleration = "rocm"; })
];
```

There is also a nice flake here: https://github.com/abysssol/ollama-flake/. I'm using an old revision of that flake that supports my old GPU, in combination with the web UI running in a container.
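A minimal sketch of pinning that flake to an older revision in a flake-based configuration. The revision placeholder and the output handling are assumptions; check the flake's README for how it expects to be consumed.

```nix
# flake.nix fragment, sketch only. "<rev>" is a placeholder commit, and the
# flake's package outputs are not shown because their names are not verified.
{
  inputs.ollama-flake.url = "github:abysssol/ollama-flake/<rev>";

  outputs = { self, ollama-flake, ... }: {
    # reference ollama-flake's packages here
  };
}
```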
If you're using the NixOS options it would be:

```nix
services.ollama = {
  enable = true;
  acceleration = "cuda"; # or "rocm"
};
```
Does anyone here know why the […]? Also, @siph, I'm considering archiving […].
@abysssol I haven't updated to the latest version because my old hardware (RX 580) doesn't get recognized by rocm when I do. I was already using your flake and I'm just pinning to the latest version that works for me. I would change to […]. I agree on separating […].
Has anyone gotten […]?
As far as I remember, the branch that I linked here runs fine. It's just far away from a state where it could land in nixpkgs, and I'm still too busy with ROCm and a few other things right now to move it forward.
If I have an Nvidia and an AMD GPU in the same system, can I use both?
@taoi11 not currently, though it could be changed to support that (again). It used to, but I changed that with the expectation that the nix package would supply ollama with a single version of llama-cpp built with nix.

I think llama-cpp only supports a single acceleration backend at a time, such as cuda, rocm, or vulkan. Ollama only supports multiple simultaneously because it spins up multiple separate llama-cpp instances compiled with different acceleration support, if I understand correctly.

Do you think it should be possible to enable cuda, rocm, and vulkan simultaneously? What would a use case be for that? And if not, how would the options look to enable ((cuda or rocm) xor vulkan)? Should I just add an […]?
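For illustration only, one possible shape for such an option is a single enum that makes every backend mutually exclusive, which side-steps the xor question entirely. The names and values below are assumptions, not the actual nixpkgs interface:

```nix
# Illustrative sketch: model "pick exactly one backend" as one enum option.
# Option path, type, and values are assumptions, not real nixpkgs code.
{ lib, ... }: {
  options.services.ollama.acceleration = lib.mkOption {
    type = lib.types.nullOr (lib.types.enum [ "cuda" "rocm" "vulkan" ]);
    default = null;
    description = "Acceleration backend to build ollama's llama-cpp with.";
  };
}
```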
@taoi11 you would want to use the Vulkan backend of llama.cpp for that, or another cross-platform backend, but that's not currently supported in upstream ollama, and it is also not supported in nixpkgs.

@abysssol the best thing we can do in nixpkgs is allow passing different versions of […].

What follows is a longer technical explanation.

Ollama supports producing an executable that supports acceleration through both the […]. It would be possible, with some, probably considerable, effort for llama-cpp to support something like this itself: making it possible to compile one binary that contains all of the selected acceleration backends, and then choosing, or having it choose, the appropriate backend at runtime.

But as soon as you talk about a system with two different GPUs, you are probably talking about using them (1) at the same time and (2) together, and that's something completely different. Using two acceleration backends like that would be basically impossible to implement in ollama and terribly difficult in llama-cpp, because if you accelerate some part of the computation on one card, you now need some way to pass the intermediary result to the other card. That is a terribly annoying problem to solve when every backend uses a different API, because you would have to make those otherwise unrelated acceleration backends work similarly enough that they can share these intermediary results through some internal API.

The way to go for mixed GPU setups is one backend that uses an API which supports different underlying hardware, because when all of the GPUs are programmed with the same API, the intermediary results are compatible. This is possible with the Vulkan backend, and it is actually implemented in llama.cpp.

The downside of the Vulkan backend is that, last time I checked, it was significantly less performant than the CUDA or ROCm backends, and running across multiple GPUs is probably also less performant than having the same compute on a single GPU. So you have to read up, or measure and check for yourself, to see whether it is even worth it to use both cards, and to make sure it's not broken, because that's still a more exotic setup.
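For reference, a minimal sketch of what a Vulkan-enabled llama-cpp from nixpkgs might look like. The `vulkanSupport` flag name is an assumption about the package's override arguments, so check the actual llama-cpp derivation before relying on it:

```nix
# Sketch only: the override flag name is an assumption and was not verified
# against the llama-cpp package in nixpkgs at the time of this thread.
{ pkgs, ... }: {
  environment.systemPackages = [
    (pkgs.llama-cpp.override { vulkanSupport = true; })
  ];
}
```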
Closing as ollama was added to nixpkgs. Ollama-webui (renamed to open-webui) may be added as part of another issue. @mschwaig has brought his branch quite far for that: https://github.com/mschwaig/nixpkgs/tree/open-webui
Project description
To make open-source large language models (LLMs) accessible, there are projects like Ollama that make it almost trivial to download and run them locally on a consumer computer.
We already have Ollama in Nixpkgs, but it can only be run conveniently in a terminal (and doesn't store previous chats). What's missing is a web UI, e.g. Ollama-WebUI, which mimics ChatGPT's frontend and integrates nicely with Ollama.
My request is to add Ollama-WebUI (the only satisfactory web UI I could find) to Nixpkgs, and then to create NixOS modules to have them both as convenient services to deploy.
I've managed to piece together some working config with some rough edges to start with:
@elohmeier @dit7ya Would that be a good idea, since you were involved in adding Ollama to nixpkgs? (thanks a lot btw)