
Arc Installation - OSError: [WinError 126] The specified module could not be found. Error loading "backend_with_compiler.dll" or one of its dependencies. #5123

Closed
ghost opened this issue Dec 30, 2023 · 15 comments
Labels: bug (Something isn't working), stale

Comments

ghost commented Dec 30, 2023

Describe the bug

I receive the below error after installing OobaBooga using the default Arc install option on Windows. The install seemed to go well but running it results in the below DLL load error. Other threads that mentioned this loading error suggested it might be a PATH issue. I tried adding a few paths to the OS environment but couldn't resolve it.

It's an Arc A770 on Windows 10, with Intel® Graphics Driver 31.0.101.5081/31.0.101.5122 (WHQL Certified). I also tried rolling back to driver 4676 and doing a clean install, with the same results. Some of the paths I added were those listed here. I'm also not seeing any of the DLLs listed at that link in those directories. Instead, I have intel-ext-pt-gpu.dll and intel-ext-pt-python.dll in "%PYTHON_ENV_DIR%\lib\site-packages\intel_extension_for_pytorch\bin" and none of the listed DLLs in "%PYTHON_ENV_DIR%\lib\site-packages\torch\lib" (backend_with_compiler.dll itself does exist at that location).
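One possibly relevant detail for the PATH theory: on Windows builds of Python 3.8+, directories containing dependent DLLs must be registered with os.add_dll_directory(); PATH alone is no longer consulted when resolving a DLL's dependencies. A minimal sketch of registering the candidate directories before importing torch (the directory names below are assumptions based on the paths in this report):

```python
import os

def register_dll_dirs(dirs):
    """Prepend candidate DLL directories to the search path.

    On Windows with Python 3.8+, dependent DLLs are resolved via
    os.add_dll_directory(); PATH alone is no longer consulted.
    Elsewhere, fall back to prepending PATH.
    """
    registered = []
    for d in dirs:
        if not os.path.isdir(d):
            continue  # skip directories that don't exist
        if hasattr(os, "add_dll_directory"):  # Windows, Python >= 3.8
            os.add_dll_directory(d)
        else:
            os.environ["PATH"] = d + os.pathsep + os.environ.get("PATH", "")
        registered.append(d)
    return registered

# Hypothetical install locations -- adjust to your own setup:
candidates = [
    r"C:\text-generation-webui\installer_files\env\Lib\site-packages\torch\lib",
    r"C:\text-generation-webui\installer_files\env\Lib\site-packages"
    r"\intel_extension_for_pytorch\bin",
]
register_dll_dirs(candidates)
# import torch  # import only after the directories are registered
```

This doesn't help if the dependency DLLs are genuinely absent from the machine, but it rules out search-path problems.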

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

  1. Put an Arc A770 into your computer.
  2. Install the drivers.
  3. Do a clean install of OobaBooga using the latest snapshot.
    3a) Select "D" for the Arc install.
  4. Run start_windows.bat

Screenshot

No response

Logs

┌─ Traceback (most recent call last) ─────────────────────────────────────────┐
│ C:\text-generation-webui\server.py:6 in <module>                                                                 │
│                                                                                                                     │
│     5                                                                                                               │
│ >   6 import accelerate  # This early import makes Intel GPUs happy                                                 │
│     7                                                                                                               │
│                                                                                                                     │
│ C:\text-generation-webui\installer_files\env\Lib\site-packages\accelerate\__init__.py:3 in <module>              │
│                                                                                                                     │
│    2                                                                                                                │
│ >  3 from .accelerator import Accelerator                                                                           │
│    4 from .big_modeling import (                                                                                    │
│                                                                                                                     │
│ C:\text-generation-webui\installer_files\env\Lib\site-packages\accelerate\accelerator.py:32 in <module>          │
│                                                                                                                     │
│     31                                                                                                              │
│ >   32 import torch                                                                                                 │
│     33 import torch.utils.hooks as hooks                                                                            │
│                                                                                                                     │
│ C:\text-generation-webui\installer_files\env\Lib\site-packages\torch\__init__.py:139 in <module>                 │
│                                                                                                                     │
│    138                 err.strerror += f' Error loading "{dll}" or one of its dependencies.'                        │
│ >  139                 raise err                                                                                    │
│    140                                                                                                              │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
OSError: [WinError 126] The specified module could not be found. Error loading
"C:\text-generation-webui\installer_files\env\Lib\site-packages\torch\lib\backend_with_compiler.dll" or one of its
dependencies.
Press any key to continue . . .

System Info

Windows 10 Pro.
Arc a770 (shows in Computer Management/Task Manager)
Not using WSL
ghost added the bug (Something isn't working) label on Dec 30, 2023
@TinJon06

Same issue here, using A750 instead.


soda79 commented Dec 31, 2023

Same issue, using Arc A770 LE.


ghost commented Dec 31, 2023

Some progress, maybe... Installing the OneAPI Base ToolKit and downloading libuv from here then placing the .lib and .dll files in \env\lib\site-packages\torch\lib seemed to resolve the aforementioned missing dll error. Ooba now loads but with a "No CUDA runtime is found" message, resulting in the use of the CPU.


TinJon06 commented Jan 1, 2024

> Some progress, maybe... Installing the OneAPI Base ToolKit and downloading libuv from here then placing the .lib and .dll files in \env\lib\site-packages\torch\lib seemed to resolve the aforementioned missing dll error. Ooba now loads but with a "No CUDA runtime is found" message, resulting in the use of the CPU.

@discreteness I have the Intel OneAPI Base Toolkit installed and I placed the .lib and .dll files in the path you mentioned, but I still get a Traceback. Am I doing something wrong? Which .lib and .dll files should I place there, i.e., what are their names?


ghost commented Jan 2, 2024

@TinJon06 uv.dll, uv.lib, uv_a.lib

You might need to add the OneAPI directory that contains the Intel Extension for PyTorch DLLs listed here to your Windows PATH. Or, directly place the files (sycl7.dll, pi_level_zero.dll, pi_win_proxy_loader.dll, mkl_core.2.dll, mkl_sycl_blas.4.dll, mkl_sycl_lapack.4.dll, mkl_sycl_dft.4.dll, mkl_tbb_thread.2.dll, libmmd.dll, svml_dispmd.dll) in env/lib/site-packages/intel_extension_for_pytorch/bin.

Also, try installing these (run cmd_windows.bat first to get into your Ooba environment shell):
pip install mkl==2024.0 dpcpp-cpp-rt

And, while you're at it, try downloading and installing the Intel Distribution for Python 3 and adding it to your Windows PATH.

Note: Running "conda install -c intel intelpython3_full" didn't work for me (gave errors).

If you figure out how to get it to use the GPU afterwards, please let us know. I'm still stuck there. :)


TinJon06 commented Jan 2, 2024

A bit of an update, I can get it to run on the Arc GPU by using the Transformer loader, but with huge caveats.
Here's a video recording:

2024-01-02.21-07-20.mp4

The Caveats are the following:

  1. It cannot use auto-devices; that causes a Traceback, specifically "AssertionError: Torch not compiled with CUDA enabled." Without auto-devices, you cannot split a big model across devices, which causes out-of-memory errors.
  2. It cannot use load-in-8bit; that also causes a Traceback, specifically "RuntimeError: No GPU found. A GPU is needed for quantization." Which is weird, since I do have a GPU, an Arc A750. Without loading in 8-bit, big models may run out of memory.

In the recording, I used Pygmalion 2.7B, which loads and generates without problems. I want to load the 6B variant, but without auto-devices or load-in-8bit I cannot test it. Also, I haven't tested other loaders as of this writing.


ghost commented Jan 2, 2024

@TinJon06
I had the same issue. I was able to load a 7B model and get ~3.5 tokens/sec, but loading in 4/8-bit failed, as did trying to split it across GPU + CPU memory. I examined the code in modeling_utils.py and tried commenting out some of the CUDA conditions, only to get more errors in bitsandbytes. So much of the code is set up around CUDA that supporting Arc GPUs will require specialized instructions. This comment talks a bit about why that is. AMD basically set up their GPUs to interface cleanly with NVIDIA's CUDA, so with ROCm the two were basically interchangeable code-wise. Intel went their own route, and a lot of apps haven't caught up yet.
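To illustrate the CUDA-centric assumption: code that only checks torch.cuda never sees an Arc card, because IPEX exposes Arc as a separate torch.xpu backend. A hedged sketch of the device-selection pattern such code would need; the torch module is passed in as a parameter purely so the sketch can be exercised without a GPU:

```python
def pick_device(torch_mod):
    """Prefer CUDA, then Intel XPU (added by intel_extension_for_pytorch
    as torch.xpu), then CPU.

    `torch_mod` is the imported torch module; e.g. pick_device(torch).
    """
    if torch_mod.cuda.is_available():
        return "cuda"
    xpu = getattr(torch_mod, "xpu", None)  # absent without IPEX
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"
```

Projects like modeling_utils.py and bitsandbytes hard-code the first branch, which is why commenting out the CUDA checks just moves the failure further down.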


ghost commented Jan 4, 2024

Llama.cpp is in the process of adding Arc support, so there may be some relief with GGUF models and the like. And it looks like bitsandbytes is adding support for Arc as well.


TinJon06 commented Jan 6, 2024

> Llama.cpp is in the process of adding Arc support, so there may be some relief with GGUF models and the like. And it looks like bitsandbytes is adding support for Arc as well.

I hope those all go well, and I'm hoping that support will be added here in the webui soon.


ghost commented Jan 6, 2024

I was able to compile llama.cpp from source with CLBlast (instructions here) and get it running on my Arc. And I was also able to compile and install llama_cpp_python into Ooba; however, there appears to be a problem loading the resulting llama.dll and/or its dependencies, similar to the issues others experienced here. So I'm stuck there.
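For anyone trying the same route, a sketch of a typical CLBlast build from that era (the LLAMA_CLBLAST flag and repo URL are assumptions based on llama.cpp as of early 2024; its build options change frequently, so the linked instructions are authoritative):

```shell
# Build llama.cpp with CLBlast (OpenCL) acceleration -- flags as of early 2024
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake .. -DLLAMA_CLBLAST=ON
cmake --build . --config Release
```

The llama_cpp_python wheel has to be rebuilt with the same backend flag (via CMAKE_ARGS) from inside the Ooba environment shell, or the resulting llama.dll won't match.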

Nuullll (Contributor) commented Jan 8, 2024

Retry with the latest dev branch. #5191 should have fixed this.


idelacio commented Jan 11, 2024

> Retry with the latest dev branch. #5191 should have fixed this.

It builds now but on starting I get the attached errors
SadArcLogs.txt

The Nvidia version runs just fine (same version, rebuilt; both builds were tested from the C drive; the logs are from the D-drive build but show the same errors).

Running Windows Server 2019
Dual-card setup:
Arc 770 16GB in primary PCIE slot
3060 12GB in secondary

@idelacio

It now builds and the interface loads from the main-branch version.

Not sure how to run models from the card, though; AWQ and GPTQ don't work at all (they error out), and GGUF just runs on the CPU.


This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

@Magentatl

Hi all, I have the same problem but don't have the skills to solve it myself. May I ask you for help? I used llm-ipex and the Intel guides too, but with no significant success.


5 participants