update the nvjitlink bindings test #228

Merged: 2 commits (Nov 28, 2024)
10 changes: 5 additions & 5 deletions README.md
@@ -1,4 +1,4 @@
# CUDA-Python
# cuda-python

CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It consists of multiple components:

@@ -7,21 +7,21 @@ CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It c
* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): Pythonic exposure of CUB cooperative algorithms
* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): Pythonic exposure of Thrust parallel algorithms

For access to NVIDIA Math Libraries, please refer to [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest).
For access to NVIDIA CPU & GPU Math Libraries, please refer to [nvmath-python](https://docs.nvidia.com/cuda/nvmath-python/latest).

CUDA Python is currently undergoing an overhaul to improve existing components and bring up new ones. All of the previously available functionality from the cuda-python package will continue to be available; please refer to the [cuda.bindings](https://nvidia.github.io/cuda-python/cuda-bindings/latest) documentation for the installation guide and further details.

## CUDA-Python as a metapackage
## cuda-python as a metapackage

CUDA-Python is structured to become a metapackage that contains a collection of subpackages. Each subpackage is versioned independently, allowing installation of each component as needed.
`cuda-python` is being re-structured to become a metapackage that contains a collection of subpackages. Each subpackage is versioned independently, allowing installation of each component as needed.

### Subpackage: `cuda.core`

The `cuda.core` package offers idiomatic, pythonic access to CUDA Runtime and other functionalities.

The goals are to

1. Provide **idiomatic (pythonic)** access to CUDA Driver/Runtime
1. Provide **idiomatic ("pythonic")** access to CUDA Driver, Runtime, and JIT compiler toolchain
2. Focus on **developer productivity** by ensuring end-to-end CUDA development can be performed quickly and entirely in Python
3. **Avoid homegrown** Python abstractions for CUDA for new Python GPU libraries starting from scratch
4. **Ease** developer **burden of maintaining** and catching up with latest CUDA features
32 changes: 29 additions & 3 deletions cuda_bindings/README.md
@@ -1,13 +1,15 @@
# `cuda.bindings`: Low-level CUDA interfaces

CUDA Python is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Checkout the [Overview](https://nvidia.github.io/cuda-python/cuda-bindings/latest/overview.html) for the workflow and performance results.
`cuda.bindings` is a standard set of low-level interfaces, providing full coverage of and access to the CUDA host APIs from Python. Check out the [Overview](https://nvidia.github.io/cuda-python/cuda-bindings/latest/overview.html) for the workflow and performance results.

`cuda.bindings` is a subpackage of `cuda-python`.

## Installing

CUDA Python can be installed from:

* PYPI
* Conda (nvidia channel)
* PyPI
* Conda (conda-forge/nvidia channels)
* Source builds

Differences between these options are described in [Installation](https://nvidia.github.io/cuda-python/cuda-bindings/latest/install.html) documentation. Each package guarantees minor version compatibility.
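As a sketch, the three options above map to commands along these lines (exact package and channel names may differ per release; see the Installation docs linked above for the authoritative instructions):

```shell
# From PyPI (pre-built wheels)
pip install cuda-python

# From conda, using the conda-forge or nvidia channel
conda install -c conda-forge cuda-python

# From source, building against your local CUDA Toolkit
git clone https://github.com/NVIDIA/cuda-python
cd cuda-python/cuda_bindings
pip install -v .
```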
@@ -31,6 +33,30 @@ Source builds work for multiple Python versions, however pre-build PyPI and Cond

* Python 3.9 to 3.12

## Developing

We use `pre-commit` to manage various tools that help development and ensure consistency.
```shell
pip install pre-commit
```

### Code linting

Run this command before checking in your code changes
```shell
pre-commit run -a --show-diff-on-failure
```
to ensure the code formatting is in line with the requirements (as listed in [`pyproject.toml`](./pyproject.toml)).

### Code signing

This repository implements a security check to prevent the CI system from running untrusted code. A part of the
security check consists of checking if the git commits are signed. See
[here](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/faqs/#why-did-i-receive-a-comment-that-my-pull-request-requires-additional-validation)
and
[here](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
for more details, including how to sign your commits.
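For reference, a minimal sketch of enabling commit signing locally (the key ID is a placeholder, and the choice of GPG vs. SSH signing is yours; see the links above for details):

```shell
# One-time setup: tell git which key to sign with (placeholder ID shown)
git config --global user.signingkey YOUR_KEY_ID
# Sign every commit by default...
git config --global commit.gpgsign true
# ...or sign a single commit explicitly
git commit -S -m "your commit message"
```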

## Testing

Latest dependencies can be found in [requirements.txt](https://github.com/NVIDIA/cuda-python/blob/main/cuda_bindings/requirements.txt).
112 changes: 76 additions & 36 deletions cuda_bindings/tests/test_nvjitlink.py
@@ -4,13 +4,22 @@

import pytest

from cuda.bindings import nvjitlink
from cuda.bindings import nvjitlink, nvrtc

ptx_kernel = """
.version 8.5
.target sm_90
# Establish a handful of compatible architectures and PTX versions to test with
ARCHITECTURES = ["sm_60", "sm_75", "sm_80", "sm_90"]
PTX_VERSIONS = ["5.0", "6.4", "7.0", "8.5"]


def ptx_header(version, arch):
    return f"""
.version {version}
.target {arch}
.address_size 64
"""


ptx_kernel = """
.visible .entry _Z6kernelPi(
.param .u64 _Z6kernelPi_param_0
)
@@ -28,18 +37,40 @@
"""

minimal_ptx_kernel = """
.version 8.5
.target sm_90
.address_size 64

.func _MinimalKernel()
{
ret;
}
"""

ptx_kernel_bytes = ptx_kernel.encode("utf-8")
minimal_ptx_kernel_bytes = minimal_ptx_kernel.encode("utf-8")
ptx_kernel_bytes = [
    (ptx_header(version, arch) + ptx_kernel).encode("utf-8") for version, arch in zip(PTX_VERSIONS, ARCHITECTURES)
]
minimal_ptx_kernel_bytes = [
    (ptx_header(version, arch) + minimal_ptx_kernel).encode("utf-8")
    for version, arch in zip(PTX_VERSIONS, ARCHITECTURES)
]


# create a valid LTOIR input for testing
@pytest.fixture
def get_dummy_ltoir():
    def CHECK_NVRTC(err):
        if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
            raise RuntimeError(f"Nvrtc Error: {err}")

    empty_cplusplus_kernel = "__global__ void A() {}"
    err, program_handle = nvrtc.nvrtcCreateProgram(empty_cplusplus_kernel.encode(), b"", 0, [], [])
    CHECK_NVRTC(err)
    (err,) = nvrtc.nvrtcCompileProgram(program_handle, 1, [b"-dlto"])
    CHECK_NVRTC(err)
    err, size = nvrtc.nvrtcGetLTOIRSize(program_handle)
    CHECK_NVRTC(err)
    empty_kernel_ltoir = b" " * size
    (err,) = nvrtc.nvrtcGetLTOIR(program_handle, empty_kernel_ltoir)
    CHECK_NVRTC(err)
    (err,) = nvrtc.nvrtcDestroyProgram(program_handle)
    CHECK_NVRTC(err)
    return empty_kernel_ltoir


def test_unrecognized_option_error():
@@ -52,39 +83,41 @@ def test_invalid_arch_error():
nvjitlink.create(1, ["-arch=sm_XX"])


def test_create_and_destroy():
    handle = nvjitlink.create(1, ["-arch=sm_53"])
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_create_and_destroy(option):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    assert handle != 0
    nvjitlink.destroy(handle)


def test_complete_empty():
    handle = nvjitlink.create(1, ["-arch=sm_90"])
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_complete_empty(option):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    nvjitlink.complete(handle)
    nvjitlink.destroy(handle)


def test_add_data():
    handle = nvjitlink.create(1, ["-arch=sm_90"])
    nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_kernel_bytes, len(ptx_kernel_bytes), "test_data")
    nvjitlink.add_data(
        handle, nvjitlink.InputType.ANY, minimal_ptx_kernel_bytes, len(minimal_ptx_kernel_bytes), "minimal_test_data"
    )
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_add_data(option, ptx_bytes):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_bytes, len(ptx_bytes), "test_data")
    nvjitlink.complete(handle)
    nvjitlink.destroy(handle)


def test_add_file(tmp_path):
    handle = nvjitlink.create(1, ["-arch=sm_90"])
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_add_file(option, ptx_bytes, tmp_path):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    file_path = tmp_path / "test_file.cubin"
    file_path.write_bytes(ptx_kernel_bytes)
    file_path.write_bytes(ptx_bytes)
    nvjitlink.add_file(handle, nvjitlink.InputType.ANY, str(file_path))
    nvjitlink.complete(handle)
    nvjitlink.destroy(handle)


def test_get_error_log():
    handle = nvjitlink.create(1, ["-arch=sm_90"])
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_get_error_log(option):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    nvjitlink.complete(handle)
    log_size = nvjitlink.get_error_log_size(handle)
    log = bytearray(log_size)
@@ -93,9 +126,10 @@ def test_get_error_log():
nvjitlink.destroy(handle)


def test_get_info_log():
    handle = nvjitlink.create(1, ["-arch=sm_90"])
    nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_kernel_bytes, len(ptx_kernel_bytes), "test_data")
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_get_info_log(option, ptx_bytes):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_bytes, len(ptx_bytes), "test_data")
    nvjitlink.complete(handle)
    log_size = nvjitlink.get_info_log_size(handle)
    log = bytearray(log_size)
@@ -104,9 +138,10 @@ def test_get_info_log():
nvjitlink.destroy(handle)


def test_get_linked_cubin():
    handle = nvjitlink.create(1, ["-arch=sm_90"])
    nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_kernel_bytes, len(ptx_kernel_bytes), "test_data")
@pytest.mark.parametrize("option, ptx_bytes", zip(ARCHITECTURES, ptx_kernel_bytes))
def test_get_linked_cubin(option, ptx_bytes):
    handle = nvjitlink.create(1, [f"-arch={option}"])
    nvjitlink.add_data(handle, nvjitlink.InputType.ANY, ptx_bytes, len(ptx_bytes), "test_data")
    nvjitlink.complete(handle)
    cubin_size = nvjitlink.get_linked_cubin_size(handle)
    cubin = bytearray(cubin_size)
@@ -115,11 +150,16 @@ def test_get_linked_cubin():
nvjitlink.destroy(handle)


def test_get_linked_ptx():
    # TODO improve this test to call get_linked_ptx without this error
    handle = nvjitlink.create(2, ["-arch=sm_90", "-lto"])
    with pytest.raises(nvjitlink.nvJitLinkError, match="ERROR_NVVM_COMPILE"):
        nvjitlink.complete(handle)
@pytest.mark.parametrize("option", ARCHITECTURES)
def test_get_linked_ptx(option, get_dummy_ltoir):
    handle = nvjitlink.create(3, [f"-arch={option}", "-lto", "-ptx"])
    nvjitlink.add_data(handle, nvjitlink.InputType.LTOIR, get_dummy_ltoir, len(get_dummy_ltoir), "test_data")
    nvjitlink.complete(handle)
    ptx_size = nvjitlink.get_linked_ptx_size(handle)
    ptx = bytearray(ptx_size)
    nvjitlink.get_linked_ptx(handle, ptx)
    assert len(ptx) == ptx_size
    nvjitlink.destroy(handle)


def test_package_version():
36 changes: 35 additions & 1 deletion cuda_core/README.md
@@ -1,9 +1,43 @@
# `cuda.core`: (experimental) pythonic CUDA module

Currently under active development. To build from source, just do:
Currently under active development; see [the documentation](https://nvidia.github.io/cuda-python/cuda-core/latest/) for more details.

## Installing

To build from source, just do:
```shell
$ git clone https://github.com/NVIDIA/cuda-python
$ cd cuda-python/cuda_core  # move to the directory where this README is located
$ pip install .
```
For now `cuda-python` is a required dependency.

## Developing

We use `pre-commit` to manage various tools that help development and ensure consistency.
```shell
pip install pre-commit
```

### Code linting

Run this command before checking in your code changes
```shell
pre-commit run -a --show-diff-on-failure
```
to ensure the code formatting is in line with the requirements (as listed in [`pyproject.toml`](./pyproject.toml)).

### Code signing

This repository implements a security check to prevent the CI system from running untrusted code. A part of the
security check consists of checking if the git commits are signed. See
[here](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/faqs/#why-did-i-receive-a-comment-that-my-pull-request-requires-additional-validation)
and
[here](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
for more details, including how to sign your commits.

## Testing

To run these tests:
* `python -m pytest tests/` against local builds
* `pytest tests/` against installed packages
4 changes: 1 addition & 3 deletions cuda_core/cuda/core/experimental/_stream.py
@@ -211,9 +211,7 @@ def wait(self, event_or_stream: Union[Event, Stream]):
        try:
            stream = Stream._init(event_or_stream)
        except Exception as e:
            raise ValueError(
                "only an Event, Stream, or object supporting __cuda_stream__ can be waited"
            ) from e
            raise ValueError("only an Event, Stream, or object supporting __cuda_stream__ can be waited") from e
        else:
            stream = event_or_stream
        event = handle_return(cuda.cuEventCreate(cuda.CUevent_flags.CU_EVENT_DISABLE_TIMING))
2 changes: 1 addition & 1 deletion cuda_python/docs/source/index.rst
@@ -9,7 +9,7 @@ multiple components:
- `cuda.cooperative`_: Pythonic exposure of CUB cooperative algorithms
- `cuda.parallel`_: Pythonic exposure of Thrust parallel algorithms

For access to NVIDIA Math Libraries, please refer to `nvmath-python`_.
For access to NVIDIA CPU & GPU Math Libraries, please refer to `nvmath-python`_.

.. _nvmath-python: https://docs.nvidia.com/cuda/nvmath-python/latest
