
Does matmulfreellm support Windows 10? #22

Open
fangkuoyu opened this issue Jun 17, 2024 · 6 comments

Comments

@fangkuoyu

I have installed matmulfreellm with Triton for Windows via triton-2.0.0-cp310-cp310-win_amd64.whl, which lets the 'configuration' example run but makes the 'generate' example fail. Running the 'generate' example with 'ridger/MMfreeLM-370M' hits a problem at model.generate. An excerpt of the error messages:

File "C:\Users\hp\Anaconda3\envs\mat_env\lib\subprocess.py", line 1457, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError:

Printing `executable` and `args` gives:

None whereis libcuda.so
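For context: the printed `args` show that this Triton version locates libcuda by shelling out to the Unix `whereis` utility, which does not exist on Windows, so `_winapi.CreateProcess` raises `FileNotFoundError`. A minimal sketch of the same failure mode, using a deliberately nonexistent command so it reproduces on any platform:

```python
import subprocess

# Triton's libcuda lookup runs `whereis libcuda.so` as a subprocess.
# On Windows there is no `whereis` executable, so CreateProcess fails.
# Any missing executable triggers the same FileNotFoundError:
try:
    subprocess.run(["no-such-command-anywhere"], capture_output=True)
except FileNotFoundError:
    print("FileNotFoundError: executable not found")
```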

Does matmulfreellm support Windows 10?

Some libraries are listed as follows.

torch              2.3.1+cu118
torchaudio         2.3.1+cu118
torchvision        0.18.1+cu118
transformers       4.41.2
triton             2.0.0 
@ridgerchu
Owner

Hi, we have not tested Windows 10 yet, since the servers we use are all Linux... You can try upgrading Triton to 2.2.0 and CUDA to 12.2 or above.

@fangkuoyu
Author

@ridgerchu Thanks for your comments. I have upgraded to Triton 2.1.0 for Windows (triton-2.1.0-cp310-cp310-win_amd64.whl). However, the system still doesn't work. I haven't found Triton 2.2.0 for Windows yet, so I will suspend my testing for now. Do you have any plans to support Windows officially?

@ridgerchu
Owner

It depends on when Triton 2.2.0 for Windows is released, I think; without 2.2.0 this repo cannot work well.

@fangkuoyu
Author

@ridgerchu Got it. Thanks!

@aoguai

aoguai commented Jul 24, 2024

I successfully ran the project on Windows, and I am documenting my process here in hopes of helping others who need it:

Steps to Run the Project on Windows

  1. Clone the Project and Install Dependencies
    First, you need to clone the project and install the required dependencies as per the README:

    git clone https://github.com/ridgerchu/matmulfreellm.git
    pip install einops transformers ninja cmake wheel
  2. Install PyTorch with GPU Support
    To use the GPU, I recommend manually selecting the appropriate version of PyTorch from the PyTorch website. You will also need to have CUDA and Visual Studio installed.

  3. Install Triton
    Installing Triton is a bit tricky because it officially supports only Linux. However, you can find Windows-compatible builds on GitHub. For example, you can use builds from this GitHub action or Triton Windows Builds.

    For my setup, I installed Triton using the following command:

    pip install triton-2.1.0-cp310-cp310-win_amd64.whl
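One note on picking a wheel: the tag `cp310-cp310-win_amd64` must match your interpreter (CPython 3.10, 64-bit Windows), or pip will refuse to install it. A quick sketch that prints the tags for the current interpreter so you can choose the right build:

```python
import sys
import sysconfig

# The interpreter tag (e.g. cp310) and platform (e.g. win-amd64) must
# match the wheel filename before pip will accept the wheel.
tag = "cp%d%d" % (sys.version_info.major, sys.version_info.minor)
print("interpreter tag:", tag)
print("platform:", sysconfig.get_platform())
```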

My Environment

  • Python: 3.10
  • Torch: 2.4.0+cu118

Verifying NVIDIA and CUDA Installation

Run nvidia-smi to check the GPU and driver information:

nvidia-smi
Wed Jul 24 22:17:02 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.86                 Driver Version: 551.86         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   44C    P8              6W /   74W |     630MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------------------------------------------------------+

Run nvcc --version to check the CUDA compiler version:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:36:15_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
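Note that the two tools report different numbers: nvidia-smi's "CUDA Version: 12.4" is the maximum runtime the driver supports, while nvcc reports the installed toolkit (12.1 here). A small sketch that pulls the toolkit release out of the nvcc banner; the `sample` string is copied from the output above:

```python
import re

# nvidia-smi's "CUDA Version" is the driver's maximum supported
# runtime; nvcc reports the toolkit actually installed. Parse the
# toolkit release from the nvcc banner:
def nvcc_release(banner: str):
    match = re.search(r"release (\d+\.\d+)", banner)
    return match.group(1) if match else None

sample = "Cuda compilation tools, release 12.1, V12.1.105"
print(nvcc_release(sample))  # 12.1
```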

Setting Up Environment Variables

You need to set the appropriate environment variables. The easiest way is to use the matching Visual Studio command prompt (x64 Native Tools Command Prompt for VS XXXX), which configures them automatically.

Alternatively, you can manually add the environment variables before starting your script:

import os

# CUDA Compiler Path
os.environ['CC'] = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc.exe"

# Add Visual Studio Compiler Path to PATH
os.environ['PATH'] = r"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\bin\Hostx64\x64;" + os.environ['PATH']

# Add CUDA and Other Include Paths to INCLUDE
os.environ['INCLUDE'] = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include;" + \
                        r"C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt;" + \
                        r"C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um;" + \
                        r"C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared;" + \
                        r"D:\yy\Python\Python310\include"  # your Python path

# Add CUDA and Other Library Paths to LIB
os.environ['LIB'] = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\lib\x64;" + \
                    r"C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um\x64;" + \
                    r"C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\x64;" + \
                    r"D:\yy\Python\Python310\libs"  # your Python path
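A typo in any of the version-specific directories above (the MSVC build number, the Windows Kits SDK version, the CUDA version) tends to surface later as confusing compiler errors, so it can help to fail fast. A hedged helper, not part of the project, that reports any configured directory that does not exist:

```python
import os

# Report the entries of a ;-separated path list that are not existing
# directories, so typos in the version numbers above fail fast.
def missing_dirs(path_list: str):
    return [p for p in path_list.split(";") if p and not os.path.isdir(p)]

# Example: one real directory and one deliberately bogus entry.
check = os.getcwd() + ";" + r"C:\No\Such\Dir"
print(missing_dirs(check))
```

Running `missing_dirs` on each of `INCLUDE`, `LIB`, and the MSVC `PATH` entry before launching the script catches most setup mistakes.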

Running the Script

Finally, to run the script:

python generate.py

You should see output similar to the following:

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
ptxas info    : 0 bytes gmem
ptxas info    : Compiling entry function 'fused_recurrent_hgrn_fwd_kernel_0d1d2d3d4d' for 'sm_86'
ptxas info    : Function properties for fused_recurrent_hgrn_fwd_kernel_0d1d2d3d4d
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 18 registers, 392 bytes cmem[0]
In a shocking finding, scientists discovered a herd of unicorns living in a remote, 200-year-old forest.

@vv-get-ti


It is so effective.
