Can't deploy on multi-node cluster #5

Open
mausch opened this issue Apr 26, 2024 · 7 comments
Labels: bug (Something isn't working)

Comments

mausch commented Apr 26, 2024

When deploying on a multi-node cluster (EKS in my case, but I guess it could be any other), there's a PVC clash between the model store and the model pod.
The model pod gets this error, so it cannot start:

Multi-Attach error for volume "pvc-63d894e9-1945-4ec7-988f-0fc6a08adc1a" Volume is already used by pod(s) ollama-models-store-0-x-ollama-operator-system-x-vcl-4497a69570
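
For context, this is the usual symptom of a ReadWriteOnce claim being mounted from two different nodes at once; roughly speaking, the situation looks like this (a hypothetical claim for illustration, not the one the operator actually creates):

```yaml
# Illustration only: a ReadWriteOnce claim can be attached to one node at a time,
# so a pod scheduled onto another node hits the Multi-Attach error above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-models            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce              # would need ReadWriteMany (and a storage class that
  resources:                     # supports it, e.g. EFS/CephFS/NFS) to be shared across nodes
    requests:
      storage: 100Gi
```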
@nekomeowww nekomeowww self-assigned this Apr 26, 2024
@nekomeowww nekomeowww added the bug Something isn't working label Apr 26, 2024

aep commented Jun 28, 2024

As far as I understand, the shared storage is required because one pod downloads the models and the other runs them.

RWX storage is commonly NFS, which is slow and buggy.

A quick and easy solution might be to make the model storage a DaemonSet and have the model pod contact the node-local one.
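
A rough sketch of what that could look like, purely as an illustration (the names, image, and hostPort wiring here are assumptions, not something the operator ships):

```yaml
# Sketch: one node-local model store per node, backed by hostPath instead of a shared PVC.
# Model pods on the same node would then reach it over the node's address (assumption).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ollama-models-store          # hypothetical name
spec:
  selector:
    matchLabels:
      app: ollama-models-store
  template:
    metadata:
      labels:
        app: ollama-models-store
    spec:
      containers:
        - name: ollama
          image: ollama/ollama
          ports:
            - containerPort: 11434
              hostPort: 11434        # exposes the store on each node
          volumeMounts:
            - name: models
              mountPath: /root/.ollama
      volumes:
        - name: models
          hostPath:
            path: /var/lib/ollama-models   # node-local storage, no RWX needed
            type: DirectoryOrCreate
```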

@ilyapaff

Same problem here.

Either the use of RWO should be prohibited here, or the documentation should state that this only works within a single node.

Using a shared RWO volume is a mistake to begin with, since a Kubernetes cluster usually consists of several nodes.

One solution may be to fetch the model from the storage over the network (without a shared disk).

Another solution is to store the model in the Model workload itself, without deploying a separate repository.
Each Model would have its own PVC, download the model into it on first launch, and keep the PVC after the Model CR is deleted (or not keep it; it's unclear why we need the cached model if we deleted it).
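
A minimal sketch of that second variant, assuming each Model is backed by its own StatefulSet (all names here are hypothetical):

```yaml
# Sketch: per-Model storage via a volumeClaimTemplate, so no volume is shared across nodes.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: model-phi                    # hypothetical per-Model workload
spec:
  serviceName: model-phi
  replicas: 1
  selector:
    matchLabels:
      app: model-phi
  template:
    metadata:
      labels:
        app: model-phi
    spec:
      containers:
        - name: ollama
          image: ollama/ollama
          ports:
            - containerPort: 11434
          volumeMounts:
            - name: models
              mountPath: /root/.ollama   # model is pulled here on first launch
  volumeClaimTemplates:
    - metadata:
        name: models
      spec:
        accessModes:
          - ReadWriteOnce              # fine here: only this pod ever mounts the claim
        resources:
          requests:
            storage: 50Gi
```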

@nekomeowww (Owner)

Actually, we have deployed to two clusters, one with 5 worker nodes (my cloud) and one with 3 worker nodes (my company team), and we never encountered this issue in our scenarios.

TBH, I apologize for the delayed reply; I haven't had an opportunity to deploy to any multi-node environment such as AWS with AWS EBS.

If any of you have access to a cluster with multiple nodes and a more advanced filesystem and storage class, perhaps we can work together to test this out.

@nekomeowww (Owner)

A quick and easy solution might be to make the model storage a DaemonSet and have the model pod contact the node-local one.

A DaemonSet was an idea I considered before. I remember there were some problems with that approach, but I can't recall exactly why; I will reply in this thread if it comes back to me.

@nekomeowww (Owner)

One solution may be to fetch the model from the storage over the network (without a shared disk).

The problem is that models are always big, so we cannot treat them like images, where each node uses its own storage to store and manage them. In most cases, users (or tenants, admins, operators, orgs) expect a universal storage cluster that holds all the models together to reduce cost. That said, Kubernetes workloads are namespaced (namespace level), while storage is expected to be treated as clustered (cluster level); from a fundamental perspective, these two concepts are at odds.

Each Model would have its own PVC, download the model into it on first launch, and keep the PVC after the Model CR is deleted (or not keep it; it's unclear why we need the cached model if we deleted it).

Tweaking models, using ollama build, and experimenting with different prompts and default configuration parameters are the most common use cases. One use case on my side is running multiple test and eval instances together, so they can get the shared cached models from the StatefulSet instead of downloading them all over again. If we go with the per-Model PVC approach, then building will multiply the storage cost across hundreds of layers.

@nekomeowww (Owner)

After several months of experimenting with model serving and production server deployments, and a lot of architectural thinking:

I want to propose a new parameter to specify the storage mode, so you folks can try it out and find which way suits you best (a hypothetical sketch of the resulting spec follows the list):

  • A new parameter called cache will accept different enum values: Node, Namespace, None (and maybe Cluster if other filesystems and methods get integrated, but that requires advanced setup), where:
    • Node will create a DaemonSet at the node level, so each node gets its own PVC to cache and store the needed data; for Ollama Operator, I will try to calculate the needed routes so that user-namespace Pods can load the models from them.
    • Namespace is the current behavior: a StatefulSet with its own PVC, shared within a namespace.
    • None will not create any cache workloads; each time a model deploys, it downloads on its own, with a sidecar container as the server.
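
A hypothetical sketch of what a Model spec could look like with this parameter; the field name, enum values, and even the apiVersion here are assumptions for discussion, not a shipped API:

```yaml
# Hypothetical only: illustrating the proposed cache parameter on a Model resource.
apiVersion: ollama.ayaka.io/v1       # assumed CRD group/version
kind: Model
metadata:
  name: phi
spec:
  image: phi
  cache: Node                        # proposed enum: Node | Namespace | None (maybe Cluster later)
```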

@ezequielfalcon

Hello. I am having the same issue: I get a multi-attach error, which is expected because the PVC is being created with ReadWriteOnce. If two different pods have to access a PVC (in this case the StatefulSet and the actual model deployment), the PVC access mode should be ReadWriteMany. But even if I try to deploy the model with that access mode, the StatefulSet still creates a ReadWriteOnce PVC.
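
For anyone who wants to experiment on EKS, a ReadWriteMany-capable storage class would look roughly like the sketch below (assuming the AWS EFS CSI driver is installed and fs-xxxxxxxx is replaced with a real filesystem ID); the remaining gap is that the operator's StatefulSet would also have to request ReadWriteMany for this to help:

```yaml
# Sketch: an RWX-capable StorageClass on EKS via the AWS EFS CSI driver (assumes the driver is installed).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-rwx
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-xxxxxxxx          # replace with a real EFS filesystem ID
  directoryPerms: "700"
---
# A claim against it can then legitimately request ReadWriteMany.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-models-shared         # hypothetical name
spec:
  storageClassName: efs-rwx
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
```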
