
Help: How many workers can I handle with a 48GB VRAM RTX A6000 (single GPU) using a 72B Llama 3.3 model? #349

Open
SaiAkhil066 opened this issue Dec 12, 2024 · 1 comment
Labels
investigating: Bugs that are still being investigated to determine whether they are valid

Comments

@SaiAkhil066

How many workers can I handle if I have a 48GB VRAM RTX A6000 (single GPU) and am using a 72B Llama 3.3 model? Also, can I use my CPU? It has 32 threads, so could we run 32 workers doing inference in parallel when using the Verba project on a LAN network? Please help me with this: tell me clearly what to do, and how to switch from GPU to CPU if I want to.

@thomashacker
Collaborator

Hey, thanks for the issue! Can you share more information about what you're trying to achieve?

I think it would make sense to direct your question to the Ollama GitHub (https://github.com/ollama/ollama) since this will be the most computationally expensive part of using Verba.
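
For reference, a minimal sketch of how CPU-only inference could be requested through Ollama's HTTP API, in case that is the part you want to control. This is not part of Verba itself; the host URL and model tag below are assumptions, so adjust them to your own setup. Concurrency is likewise handled on the Ollama side (the OLLAMA_NUM_PARALLEL environment variable controls how many requests a loaded model serves at once).

```python
# Rough sketch (not an official Verba or Ollama recommendation) of forcing
# CPU-only inference via Ollama's /api/generate endpoint.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint (assumed)
MODEL = "llama3.3:70b"  # assumed model tag; use whichever tag you actually pulled

payload = {
    "model": MODEL,
    "prompt": "Hello, world!",
    "stream": False,
    "options": {
        "num_gpu": 0,      # offload 0 layers to the GPU, i.e. run fully on the CPU
        "num_thread": 32,  # number of CPU threads to use for inference
    },
}

response = requests.post(OLLAMA_URL, json=payload, timeout=600)
response.raise_for_status()
print(response.json()["response"])
```

Whether 32 parallel workers are actually usable will depend on memory and throughput of the serving backend, so the Ollama maintainers are the right people to ask about sizing.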

@thomashacker added the investigating label Dec 14, 2024