Help: how many workers can I run with a 72B Llama 3.3 model on a single 48 GB RTX A6000? #349
Labels
investigating
Bugs that are still being investigated to determine whether they are valid
How many workers can I handle if I have a single RTX A6000 with 48 GB of VRAM and I am using a 72B Llama 3.3 model? Also, can I use my CPU instead? It has 32 threads, so does that mean I can run 32 workers doing inference in parallel when using the Verba project on a LAN? Please tell me clearly what to do, and how to switch from GPU to CPU if I want to.
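For scale, here is a back-of-the-envelope VRAM estimate (standard arithmetic, not from this thread; the layer/head values are assumptions matching a Llama-3-70B-class architecture, and real usage depends on the runtime and quantization):

```python
# Back-of-the-envelope VRAM estimate for serving a 72B model.
# All numbers are approximations; real usage depends on the runtime
# (vLLM, llama.cpp, Transformers, ...) and its overhead.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GB."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb_per_request(context_len: int,
                            layers: int = 80,     # assumed, Llama-3-70B-class
                            kv_heads: int = 8,    # assumed (GQA)
                            head_dim: int = 128,  # assumed
                            bytes_per_elem: int = 2) -> float:
    """KV cache for one request at a given context length, in GB."""
    # factor 2 = one tensor each for keys and values
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 1e9

for bits in (16, 8, 4):
    print(f"{bits:2d}-bit weights: {weight_memory_gb(72, bits):6.0f} GB")
# 16-bit: ~144 GB -> does not fit in 48 GB
#  8-bit:  ~72 GB -> does not fit
#  4-bit:  ~36 GB -> fits, leaving ~12 GB for KV cache and overhead

per_req = kv_cache_gb_per_request(context_len=8192)
print(f"KV cache per 8k-token request: {per_req:.2f} GB")
print(f"Concurrent requests in ~12 GB of headroom: {int(12 / per_req)}")
```

In other words, at 16-bit precision the weights alone need roughly 144 GB, so a 72B model fits on a single 48 GB card only at around 4-bit quantization, and the number of parallel workers is then bounded by KV-cache headroom, not CPU threads: 32 CPU threads do not translate into 32 parallel inference workers. If you do want to force CPU inference, here is a minimal sketch assuming the Hugging Face `transformers` backend (Verba itself may route inference through a different backend, e.g. Ollama, which has its own device settings):

```python
# Minimal sketch: pin a Transformers model to CPU instead of GPU.
# Assumes the Hugging Face `transformers` (and `accelerate`) packages;
# the model id below is just an example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cpu",            # force CPU; "auto" prefers GPU when available
    torch_dtype=torch.bfloat16,  # halves RAM vs. float32
)
```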