Not all rooms are recognized #175
Comments
For now, you can use something like the following:
I have no idea how to extract the aliases without modifying the code, so you can also append something like this after the first code block:
It might also help to set something like this at the beginning of the configuration to prevent the:
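(The snippets referenced above are not preserved in this extract. Below is a minimal sketch of the kind of prompt template being described, assuming the integration renders the prompt with Home Assistant's template engine and injects an `exposed_entities` list; the `prompt:` option name and the entity attributes are assumptions, while `area_name()` is a standard Home Assistant template function.)

```yaml
# Hypothetical sketch: list every exposed entity together with its area so the
# model can tell rooms apart. Aliases apparently cannot be pulled from the
# template without code changes, which is why the advice above is to append
# them manually after this block.
prompt: |
  You are a voice assistant for Home Assistant.
  Devices, grouped by area:
  {% for entity in exposed_entities %}
  - {{ entity.entity_id }} ("{{ entity.name }}") in area "{{ area_name(entity.entity_id) }}"
  {% endfor %}
  Only act on devices located in the area the user names; if the area is
  unclear, ask which room is meant.
```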
Thank you. Things are going a little better with your tips, but still not good. I think I'll wait a little longer; with version 2024.7, Ollama may be better integrated.
@WW1983 I am also using
Thank you. I use Ollama on a Minisforum MS01 with an Nvidia Tesla P4 GPU, so it should work. Is there also a model for Ollama?
I haven't really played with Ollama, but if it supports GGUF models, my guess is that you can use this one (literally the first link on Google): https://huggingface.co/mudler/LocalAI-Llama3-8b-Function-Call-v0.2 (or this one, in safetensors: https://huggingface.co/mzbac/llama-3-8B-Instruct-function-calling-v0.2). On second thought, it might not work; you can always spin up a container with LocalAI while waiting for better Ollama support.
I'll try it. But unfortunately I have the problem that I can't pass the GPU through to the Docker container.
@WW1983 I won't go into the details of how to configure CUDA and the NVIDIA Container Toolkit (there are plenty of tutorials), but I will say a few things:
```yaml
services:
  # LocalAI with CUDA 12 GPU acceleration
  localai:
    container_name: localai
    image: localai/localai:latest-gpu-nvidia-cuda-12
    restart: unless-stopped
    environment:
      TZ: Asia/Dubai
    volumes:
      - ./data/localai:/build/models:cached
    ports:
      - 8181:8080
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  # Wyoming Whisper speech-to-text, reusing the host's cuDNN/cuBLAS libraries so it can run on CUDA
  whisper:
    container_name: whisper
    image: rhasspy/wyoming-whisper
    restart: unless-stopped
    command: --model medium-int8 --language en --device cuda
    environment:
      TZ: Asia/Dubai
    volumes:
      - ./data/whisper:/data
      - /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8:/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8:ro
      - /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8:/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8:ro
      - /usr/lib/x86_64-linux-gnu/libcublasLt.so.12:/usr/lib/x86_64-linux-gnu/libcublasLt.so.12:ro
      - /usr/lib/x86_64-linux-gnu/libcublas.so.12:/usr/lib/x86_64-linux-gnu/libcublas.so.12:ro
    ports:
      - 10300:10300
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Thing is, that way you can still use the GPU for other tasks. In my case, this VM is dedicated to the AI stuff only, so here is what I run on it, with a single GPU shared across all services:
And one more thing: if you want it to behave nicely, you need to enable
Thank you. I tried it now with Ubuntu 24.04 and it works. How do I connect it with Home Assistant? What selection do I have to make? "Ollama AI"?
Backend: Generic OpenAI
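(For reference, here is a hedged sketch of the values such a backend typically needs to reach the LocalAI container from the compose file above. The field names are assumptions, as the actual labels depend on the integration's config flow; only the published port 8181 and LocalAI's OpenAI-compatible `/v1` endpoint come from the setup discussed earlier. If the server is Ollama instead, its OpenAI-compatible endpoint is normally `http://<host>:11434/v1`.)

```yaml
# Hypothetical field names for an OpenAI-compatible backend configuration.
backend: Generic OpenAI                       # OpenAI-compatible API mode
base_url: http://<docker-host-ip>:8181/v1     # LocalAI port published in the compose file above
api_key: sk-dummy                             # LocalAI does not require a real key by default
model: LocalAI-Llama3-8b-Function-Call-v0.2   # e.g. the function-calling model linked earlier
```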
You can do that in: Settings -> Voice Assistants -> YOUR_PIPELINE -> 3 dots -> Debug
The P4 is a little slow compared to a 3090 Ti; if you have 175 USD to spare, you might as well buy a P40 (saw it on Amazon). But given that many libraries are now starting to support RT and Tensor cores, you could look at the A2000 (about 40% faster, but it only has 6 GB of memory, which is bad), or the A4000, which is roughly 3 times faster than the P4, but the price is a bit too high. Or even go the eBay route and gamble on an RTX 3090. Yes, the RTX 4060 Ti is technically about the same as the 3090, BUT memory is the main reason I still haven't sold mine and use it for the AI VM. I tried to use a 3080 Ti, but the memory is not enough, so I just use it for the VR VM. Sadly, you won't be able to find a cheap GPU with that amount of memory. Basically, any RTX (2nd gen+) or A2000/A4000 is the only way to go now if you want fast responses from LLMs. I was planning to buy an A6000, but it is way cheaper to buy a new 3090 Ti and have two of them to get the same 48 GB of memory.

P.S. Funny how the 4090 is not much faster than the 3090 Ti.

P.P.S. If you do decide to buy a brand-new 3090, you might as well spend 250 USD extra and go for the 4090.
Thank you for all your tips, but I think that's a bit too much for me. I just wanted to build something small to experiment with; that should be enough at the beginning.
I still have a problem, despite updates.
Hello,
Not all of my rooms are recognized. If I ask a question about a device in a certain room, it switches to another room. In this case, I asked whether the window in the dressing room was open, and it checked the window in the bedroom instead.
I have a Linux server with Ollama. Model: llama3:8b
Does anyone have an idea why the rooms are not recognized?