[vulkan phase2] Vulkan Runtime (#6924)
* Import Vulkan runtime changes from personal branch

* Fix build to work with latest changes in main

* Hookup Vulkan into Target, DeviceInterface and OffloadGPULoops

* Add Vulkan runtime to Makefile

* Add Vulkan target to Python bindings

* Add runtime linker support to target Vulkan CodeGen

* Add Vulkan windows decorator to runtime targets

* Wrap debug messages for internal runtime classes with DEBUG_INTERNAL
Error on failed string termination

* Silence clang-tidy warnings for redundant expressions on Vulkan enum values

* Clang tidy & format pass

* Fix formatting for single line statements

* Move Vulkan option to top-level CMakeLists.txt and enable SPIR-V as needed

* Fix Vulkan & SPIRV dependencies for makefile

* Add Halide version info to Makefile
Add HALIDE_VERSION compiler definitions to compilation

* Add HL_VERSION_FLAGS to RUNTIME_CXX_FLAGS

* Finish refactoring of Vulkan CodeGen to use SpirV-IR.
Added splitmix64 based hashing scheme for types and constants.
Numerous fixes to instruction packing.
Added debug symbols to all variables.

* Clang tidy/format pass.

* Fix formatting

* Remove leftover ifdef

* Fix build error for clang OSX for mismatched type comparison

* Refactor loops and conditionals to use blocks

* Clang tidy/format pass

* Add detailed comments for acquire context parameters

* Add comments describing loader method exports and dynamically resolved function pointers
Other minor cleanups

* Change aborts to debug asserts for context parameters.
Add error handling to acquire context.

* Cache Vulkan descriptor sets and other shader module objects in compilation cache for reuse

* Replace platform specific strncpy for grabbing Extension strings with StringUtils::copy_upto

* Enable device features for selected device

* Fix alignment constraints to match Vulkan buffer memory requirements.
Add env vars to control Vulkan Memory Allocator config.

* Add Vulkan to list of supported APIs in README.md
Add Vulkan specific README_vulkan.md

* Clang tidy/format pass

* Fix conform_alignment to handle zero values

* Fix declaration of custom_allocation_callbacks to be static.
Change to constexpr for invalid values

* Whitespace change to trigger build.

* Handle Vulkan kernels that don't require storage buffers.
Updated test status. Fixes 7 test cases.

* Add src/mini_vulkan.h Apache 2.0 license requirements to License file

* Add descriptor set binding info as pre-amble to SPIR-V code module
Fix shared memory allocation to use global variables in workgroup storage space
Add extern calls for spirv and glsl builtins
Add memory fence call to gpu thread barrier
Add missing visitors to Vulkan CodeGen
Add scalar index & vector index methods for load/store

* Clang tidy & format pass

* Update test results for Vulkan docs. Passing: 326 Failing: 39

* Fix formatting

* Remove extraneous parentheses for is_array_type()

* Add Vulkan library to linkage for Halide generator helpers

* Add SPIR-V formatted output (for debugging)

* Only declare SIMT intrinsics that are actually used.
Cleanup & refactor add_kernel method.

* Add Vulkan handler to test targets

* Clang format/tidy pass

* Add doc-strings to SPIR-V interface

* Adjust runtime array to widest vector width based on alignment and dense vector loads/stores
Fix scalar and vector load/stores
Fix casts for vectors
Add missing nan, inf, neg_inf, is_finite builtins

* Add missing bitwise and logical and methods.
Cleanups.

* Add comments about necessary packages on Ubuntu v22.04 vs earlier versions

* Clang tidy & format pass.

* Update Vulkan test results. Pass: 329 Fail: 36

* Remove unused Produce/Consume visitor method

* Fix MoltenVK initialization to work with v1.3+ loader
Add support for direct casts for same-size types
Add missing mux, mix, lerp, sinh, tanh, etc intrinsics
Add explicit storage access for variables
Add a macro to enable debug messages in Vulkan Memory Allocator

* Disable dynamic shared memory portion of test for Vulkan (since it's not supported yet)

* Disable uncached portion of test for Vulkan (since it may OOM)

* Disable float64 support in Type::supports_type() for Vulkan target since it's not widely supported

* Fix Shuffle to handle all known cases
Hookup VulkanMemoryAllocator to gpu allocation cache.
Fix if_then_else to allow calls and statements to be used
Fix loop counter comparison, and don't allow dynamic loops to be unrolled.
Fix scalarize to use CompositeInsert instead of VectorInsertDynamic
Fix FMod to use FRem (cause SPIR-V's FMod doesn't do what you'd expect ... but FRem does?!)
Use exact same semantics for barriers as GLSL Compute ... still not passing everything
Fix SPIR-V block termination checks, keys for null constants, and other cleanups

* Clang tidy & format pass

* Update correctness test results.  PASS: 338, FAIL: 27

* Move counter inside debug #define to fix build

* Relax tolerance for newton's method to match other GPU APIs
Skip gpu dynamic shared test for Vulkan (since dynamic shared allocations aren't supported yet)
Update correctness test status. PASS: 340, FAIL: 25

* Clang format/tidy pass

* Skip Vulkan for float64 for correctness test round (since f64 is optional)

* Skip Vulkan for tests that rely upon device crop, and slice.

* Only test small vector widths for Vulkan (since widths >=8 are optional)

* Canonicalize gpu vars for Vulkan

* Fix loop initialization, and increments
Add all explicit types, and fix constant declarations
Add missing fast intrinsics
Convert results of logical ops into expected types (instead of bools)

* Add SpvInstruction::add_operands(), add_immediates() and template based
append()
Make integer logical operations explicit.
Better handling of constant data.

* Clang format & tidy pass

* Fix windows build ... refactor convert_to_bool to use std::vectors
rather than dynamic fixed sized arrays

* Skip async_device_copy, device_buffer_copy, device_crop, and device_slice
tests for Vulkan (for now).

* Don't test large vector widths for Vulkan (since they are optionally
supported)

* Clear Vulkan buffer allocations prior to use (tbd if this is necessary)

* Skip Vulkan for async copy chain test

* Skip Vulkan for interpreter test

* Clang tidy/format pass

* Fix formatting

* Fix build ... use error messages for errors

* Separate shared memory resources by element type for Vulkan.

* Add Vulkan to conditional for fusing gpu loops

* Reorder reset method to match declaration ordering.

* Cleanup debug log messages for Vulkan resources

* Assert alignment is power of two

* Only split regions that have already been freed.
Add more debug messages to log

* Explicitly cleanup Vulkan command buffers after they are used
Avoid recreating descriptor sets
Tidy up Vulkan debug messages

* Fix Div, Mod, and div_round_to_zero for integer cases
Cleanup reset method

* Skip Vulkan for async_copy_chain

* Skip 64-bit values on Vulkan since they are optionally supported

* Skip interleave_rgb for Vulkan (which doesn't support cropping)

* Skip interpreter for Vulkan (which doesn't support dynamic allocation of
shared mem).

* Clang Tidy/Format pass

* Handle calls to pow with negative values for Vulkan
Add integer and float constant helpers to SPIRV

* Only test real numbers for pow with Vulkan

* Clang tidy/format pass

* Fix logic so a region request of an entire block matches if exactly the same size as an empty block

* Create a zero size buffer to check for alignment
Return null handles after freeing

* Add more verbose debug output for malloc

* Fix UConvert logic to avoid narrowing an integer type less than 8 bits
Remove optimization path for division which seems to fail worse than DIV
Cleanup DIV and MOD operators

* Clang format/tidy pass

* Fix SConvert & UConvert ops

* Add retain semantics to block allocator interface
Update test to validate retain/release/reclaim functionality

* Implement device_crop, device_slice and release_crop for Vulkan.
Re-enable device_crop, device_slice and interleave_rgb tests.

* Clang format/tidy pass

* Implement device copy for Vulkan.
Enable device copy test.

* Clang format/tidy pass

* Fix signed mod operator and use euclidean identity (just like glsl)

* Clang format/tidy pass

* Fix to handle Mod on vectors (use vector constant for bitwise and)

* Fix pow operator for Vulkan, and re-enable math test to full range.

* Add error checking for return types for conditionals
Use bool types for ops that require them, and adapt to expected return
types

* Handle deallocation for existing regions prior to coalescing.
Cleanup region allocator logic for availability.
Augment block_allocator test to cover allocation reuse.

* Clang tidy/format pass

* Fix reserved accounting for regions

* Add more details to Windows specific Vulkan build config

* Update SPIR-V headers to v1.6

* Add support for dynamic shared memory allocations for Vulkan
Add dynamic workgroup dispatching to Vulkan
Add optional feature flags for Vulkan capabilities
Add Vulkan API version flags for target features
Enable v1.3 path if requested
Re-enable tests for added features
Update Vulkan docs with status updates and feature flags

* Enable Vulkan async_device_copy test.

* Disable Vulkan performance test for async gpu (for now).

* Disable Vulkan from python AOT tests and tutorials (since it requires linkage against the vulkan loader system library).

* Update Vulkan readme with latest status.  Everything works!  More or less. =)

* Clang format pass

* Cleanup formatting for Halide version info in Makefile

* Fix typos and address review comments for Vulkan readme

* Change value casts to match Halide conventions

* Fix typos in comments

* Add static_assert to rotl to make compilation errors clearer (instead of using enable_if)
Fix debug(3) formatting to avoid super long messages
Use lookup table for SPIR-V op code names

* Fix typos and logic for Vulkan capabilities

* Remove leftover debug ifdef

* Fix typo in comments

* Rename copy_upto(...) method to be copy_up_to(...)

* Handle error case for uninitialized buffer allocation (rather than abort)
Fix typos in comments

* Support an arbitrary number of devices and queues for context creation
Fix typos in comments

* Add get/set alloc_config methods and API hooks for configuring the VulkanMemoryAllocator

* Remove leftover debug ifdef

* Hookup API methods for get/set alloc_config when initializing the VulkanMemoryAllocator

* Remove empty lines in main

* Add required capability flags for 8-bit and 16-bit uniform and storage buffer access
Handle casts for GLSL ops (spec requires all args to be the same type as the return type)

* Add VkPhysicalDevice8BitStorageFeaturesKHR and related constants

* Query for 8-bit and 16-bit uniform and storage access support.
Enable these as part of the device feature query chain.

* Use VK_WHOLE_SIZE for setting buffer (to pass validation ... otherwise size has to be a multiple of alignment)
Remove useless debug asserts for static variables
Fix debug logging messages for allocations of scalars (which may not have a dim array)

* Query for device limits to enforce min alignment constraints for storage and uniform buffers

* Fix shutdown sequence to iterate over descriptor sets
Avoid bug in validation layer by reordering destruction sequence

* Clang format & tidy pass

* Fix logic for locating entry point shader binding ... assume exact match for entry point name
Cleanup entry point binding variables and clarify usage

* Remove accidentally uncommented debug statements

* Cleanup debug output for buffer related updates

* Fix split and allocate methods in region allocator to fix issues with alignment constraints
- discovered a hang if requested size couldn't be fulfilled after adjusting to aligned sizes
- cause was incorrect splitting of existing regions
Cleanup region allocator iteration, cleanup and shutdown
Added maximum_pool_size configuration option to Vulkan Memory Allocator to restrict pool sizes

* Added notes about TARGET_VULKAN=ON being the default now
Added links to LunarG MoltenVK SDK installer, and brew packages

* Fix markdown formatting

* Fix error code handling in Vulkan runtime and internal datastructures.
Refactor all (well nearly all) return values to use halide error codes.
Reduce the usage of abort_if() for recoverable errors.

* Fix typo in error message

* Fix typo in readme

* Skip GPU allocation cache test on MacOSX since MoltenVK only supports 30
buffers to be allocated

* Skip widening reduction test on Vulkan for Mac OSX/IOS since MoltenVK
fails to translate calls with vector types for builtins like min/max, etc.

* Skip doubles in vector cast test on Vulkan for Mac OSX/IOS since Molten
doesn't support them

* Skip gpu_dynamic_shared and gpu_specialize test for Vulkan on Mac
OSX/IOS since MoltenVK doesn't support the dynamic shared memory
allocation or dynamic grid size.

* Clang format / tidy pass

* Resolve conflicts for mini_webgpu.h ... revert to main

* Use unique intrinsic var names for each kernel
Cleanup constant value declarations with template helper methods
Add comments on workgroup size usage

* Wrap debug output under ifdef DEBUG_RUNTIME_INTERNAL macro guard
Add nearest_multiple constraint to block/region allocator

* Add vk_clear_device_buffer utility method
Add nearest_multiple constraint to Vulkan memory allocator
+ fixes correctness/multiple_outputs test
Add vkCreateBuffer/vkDestroyBuffer debug output
+ for gpu_object_lifetime_tracker
Cleanup shutdown for shader_module destruction

* Add note about nearest_multiple constraint for vulkan memory allocator

* Hookup gpu_object_lifetime_tracker with Vulkan debug statements

* Skip dynamic shared memory portion of test for Vulkan on iOS/OSX.

* Fix stale comment for float type support.
Fix incorrect lowering for intrinsic.
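
Two entries in the commit message above concern remainder semantics: FMod was switched to FRem, and the signed mod operator was changed to use the Euclidean identity. The C++ sketch below is only an illustration of the three behaviours involved, not the code from this commit. All function names are hypothetical. It relies on the SPIR-V rule that OpFRem takes the sign of a non-zero result from the dividend while OpFMod takes it from the divisor.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: remainder whose sign follows the dividend.
// This is what C's fmod computes, and what SPIR-V specifies for OpFRem.
inline double rem_sign_of_dividend(double a, double b) {
    return std::fmod(a, b);
}

// Hypothetical helper: floored modulo, whose sign follows the divisor.
// This is the behaviour SPIR-V specifies for OpFMod.
inline double mod_sign_of_divisor(double a, double b) {
    return a - b * std::floor(a / b);
}

// Hypothetical helper: Euclidean integer modulo, result in [0, |b|).
// Assumption: b is neither 0 nor INT_MIN (both are left unhandled here).
inline int euclidean_mod(int a, int b) {
    const int r = a % b;  // C++ truncates toward zero, so r has the sign of a
    return r < 0 ? r + std::abs(b) : r;
}
```

With a negative dividend the two floating-point variants disagree, which is one plausible reading of the "doesn't do what you'd expect" remark in the commit message.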

---------

Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com>
Co-authored-by: Steven Johnson <srj@google.com>
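
Several entries in the commit message deal with allocation sizing: handling zero in conform_alignment, asserting that alignments are powers of two, and adding a nearest_multiple constraint. As a rough sketch of what such helpers can look like, the following C++ uses hypothetical names (is_power_of_two, align_up, round_up_to_multiple). Treating a zero alignment or multiple as "no constraint" is an assumption made for the example, not something the commit message states.

```cpp
#include <cassert>
#include <cstddef>

// True only for 1, 2, 4, 8, ...; zero is not a power of two.
constexpr bool is_power_of_two(std::size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// Round size up to a power-of-two alignment.
// Assumption: an alignment of 0 means "no constraint".
inline std::size_t align_up(std::size_t size, std::size_t alignment) {
    if (alignment == 0) {
        return size;
    }
    assert(is_power_of_two(alignment));
    return (size + alignment - 1) & ~(alignment - 1);
}

// Round size up to an arbitrary multiple (need not be a power of two).
// Assumption: a multiple of 0 means "no constraint".
inline std::size_t round_up_to_multiple(std::size_t size, std::size_t multiple) {
    if (multiple == 0) {
        return size;
    }
    return ((size + multiple - 1) / multiple) * multiple;
}
```

The bit-mask form in align_up is only valid for power-of-two alignments, which is why the assertion comes first; the division-based form works for any non-zero multiple.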
3 people committed Apr 25, 2023
1 parent fcddcf8 commit 4d86539
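
The commit message mentions a splitmix64 based hashing scheme for SPIR-V types and constants. For readers unfamiliar with it, here is a minimal sketch of the widely published splitmix64 mixing step, together with a hypothetical hash_words helper that folds several 64-bit words into one key. How the commit actually combines fields into keys is not shown on this page, so hash_words is an assumption for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// The standard splitmix64 step: add the golden-ratio increment,
// then apply two xor-shift-multiply rounds and a final xor-shift.
inline uint64_t splitmix64(uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Hypothetical helper: fold a sequence of words into a single key.
// Each word is xor-ed into the running hash before mixing, so the
// result depends on the order of the words.
inline uint64_t hash_words(std::initializer_list<uint64_t> words) {
    uint64_t h = 0;
    for (uint64_t w : words) {
        h = splitmix64(h ^ w);
    }
    return h;
}
```

A key built this way could, for example, combine a type's opcode, bit width and lane count so that structurally identical types map to the same cache entry.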
Showing 69 changed files with 21,799 additions and 1,770 deletions.
4 changes: 4 additions & 0 deletions CMakeLists.txt
@@ -83,6 +83,10 @@ endif ()

# Enable the SPIR-V target if requested (must declare before processing dependencies)
option(TARGET_SPIRV "Include SPIR-V target" OFF)
option(TARGET_VULKAN "Include Vulkan target" ON)
if (TARGET_VULKAN)
set(TARGET_SPIRV ON) # required
endif()

##
# Import dependencies
17 changes: 17 additions & 0 deletions LICENSE.txt
@@ -195,6 +195,23 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM,OUT OF OR IN CONNECTION WITH THE MATERIALS OR THE USE OR OTHER DEALINGS
IN THE MATERIALS.


----

src/mini_vulkan.h is Copyright (c) 2014-2017 The Khronos Group Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

----

apps/linear_algebra/include/cblas.h is licensed under the BLAS license.
50 changes: 49 additions & 1 deletion Makefile
@@ -9,6 +9,12 @@
# For correctness and performance tests this include halide build time and run time. For
# the tests in test/generator/ this times only the halide build time.

# Halide project version
HALIDE_VERSION_MAJOR ?= 15
HALIDE_VERSION_MINOR ?= 0
HALIDE_VERSION_PATCH ?= 0
HALIDE_VERSION=$(HALIDE_VERSION_MAJOR).$(HALIDE_VERSION_MINOR).$(HALIDE_VERSION_PATCH)

# Disable built-in makefile rules for all apps to avoid pointless file-system
# scanning and general weirdness resulting from implicit rules.
MAKEFLAGS += --no-builtin-rules
@@ -124,6 +130,8 @@ WITH_OPENCL ?= not-empty
WITH_METAL ?= not-empty
WITH_OPENGLCOMPUTE ?= not-empty
WITH_D3D12 ?= not-empty
WITH_VULKAN ?= not-empty
WITH_SPIRV ?= not-empty
WITH_WEBGPU ?= not-empty
WITH_INTROSPECTION ?= not-empty
WITH_EXCEPTIONS ?=
@@ -134,6 +142,12 @@ WITH_LLVM_INSIDE_SHARED_LIBHALIDE ?= not-empty
HL_TARGET ?= host
HL_JIT_TARGET ?= host

HL_VERSION_FLAGS = \
-DHALIDE_VERSION="$(HALIDE_VERSION)" \
-DHALIDE_VERSION_MAJOR=$(HALIDE_VERSION_MAJOR) \
-DHALIDE_VERSION_MINOR=$(HALIDE_VERSION_MINOR) \
-DHALIDE_VERSION_PATCH=$(HALIDE_VERSION_PATCH)

X86_CXX_FLAGS=$(if $(WITH_X86), -DWITH_X86, )
X86_LLVM_CONFIG_LIB=$(if $(WITH_X86), x86, )
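
The HL_VERSION_FLAGS definitions above reach C++ code as preprocessor macros. The sketch below shows one way such macros could be consumed; it is not taken from the Halide sources. The fallback values mirror the Makefile defaults so the snippet compiles without the -D flags, and the helper names (SKETCH_STRINGIFY, halide_version_string, halide_version_at_least) are hypothetical. Depending on how the shell treats the double quotes, HALIDE_VERSION itself may arrive as a bare token rather than a string literal, so the sketch builds its string from the numeric components instead.

```cpp
#include <cassert>
#include <cstring>

// Fallbacks for building this sketch standalone (values from the Makefile hunk).
#ifndef HALIDE_VERSION_MAJOR
#define HALIDE_VERSION_MAJOR 15
#endif
#ifndef HALIDE_VERSION_MINOR
#define HALIDE_VERSION_MINOR 0
#endif
#ifndef HALIDE_VERSION_PATCH
#define HALIDE_VERSION_PATCH 0
#endif

// Two-level stringification so the macro argument is expanded first.
#define SKETCH_STRINGIFY_(x) #x
#define SKETCH_STRINGIFY(x) SKETCH_STRINGIFY_(x)

// Hypothetical helper: "MAJOR.MINOR.PATCH" assembled at compile time.
inline const char *halide_version_string() {
    return SKETCH_STRINGIFY(HALIDE_VERSION_MAJOR) "."
           SKETCH_STRINGIFY(HALIDE_VERSION_MINOR) "."
           SKETCH_STRINGIFY(HALIDE_VERSION_PATCH);
}

// Hypothetical helper: true if the compiled-in version is >= the given one.
constexpr bool halide_version_at_least(int major, int minor, int patch) {
    return (HALIDE_VERSION_MAJOR > major) ||
           (HALIDE_VERSION_MAJOR == major && HALIDE_VERSION_MINOR > minor) ||
           (HALIDE_VERSION_MAJOR == major && HALIDE_VERSION_MINOR == minor &&
            HALIDE_VERSION_PATCH >= patch);
}
```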

@@ -176,6 +190,12 @@ EXCEPTIONS_CXX_FLAGS=$(if $(WITH_EXCEPTIONS), -DHALIDE_WITH_EXCEPTIONS -fexcepti
HEXAGON_CXX_FLAGS=$(if $(WITH_HEXAGON), -DWITH_HEXAGON, )
HEXAGON_LLVM_CONFIG_LIB=$(if $(WITH_HEXAGON), hexagon, )

SPIRV_CXX_FLAGS=$(if $(WITH_SPIRV), -DWITH_SPIRV -isystem $(ROOT_DIR)/dependencies/spirv/include, )
SPIRV_LLVM_CONFIG_LIB=$(if $(WITH_SPIRV), , )

VULKAN_CXX_FLAGS=$(if $(WITH_VULKAN), -DWITH_VULKAN, )
VULKAN_LLVM_CONFIG_LIB=$(if $(WITH_VULKAN), , )

WEBASSEMBLY_CXX_FLAGS=$(if $(WITH_WEBASSEMBLY), -DWITH_WEBASSEMBLY, )
WEBASSEMBLY_LLVM_CONFIG_LIB=$(if $(WITH_WEBASSEMBLY), webassembly, )

@@ -198,7 +218,7 @@ LLVM_CXX_FLAGS_LIBCPP := $(findstring -stdlib=libc++, $(LLVM_CXX_FLAGS))
endif

CXX_FLAGS = $(CXXFLAGS) $(CXX_WARNING_FLAGS) $(RTTI_CXX_FLAGS) -Woverloaded-virtual $(FPIC) $(OPTIMIZE) -fno-omit-frame-pointer -DCOMPILING_HALIDE

CXX_FLAGS += $(HL_VERSION_FLAGS)
CXX_FLAGS += $(LLVM_CXX_FLAGS)
CXX_FLAGS += $(PTX_CXX_FLAGS)
CXX_FLAGS += $(ARM_CXX_FLAGS)
@@ -215,6 +235,8 @@ CXX_FLAGS += $(INTROSPECTION_CXX_FLAGS)
CXX_FLAGS += $(EXCEPTIONS_CXX_FLAGS)
CXX_FLAGS += $(AMDGPU_CXX_FLAGS)
CXX_FLAGS += $(RISCV_CXX_FLAGS)
CXX_FLAGS += $(SPIRV_CXX_FLAGS)
CXX_FLAGS += $(VULKAN_CXX_FLAGS)
CXX_FLAGS += $(WEBASSEMBLY_CXX_FLAGS)

# This is required on some hosts like powerpc64le-linux-gnu because we may build
@@ -241,6 +263,8 @@ LLVM_STATIC_LIBFILES = \
$(POWERPC_LLVM_CONFIG_LIB) \
$(HEXAGON_LLVM_CONFIG_LIB) \
$(AMDGPU_LLVM_CONFIG_LIB) \
$(SPIRV_LLVM_CONFIG_LIB) \
$(VULKAN_LLVM_CONFIG_LIB) \
$(WEBASSEMBLY_LLVM_CONFIG_LIB) \
$(RISCV_LLVM_CONFIG_LIB)

@@ -265,6 +289,7 @@ TEST_LD_FLAGS = -L$(BIN_DIR) -lHalide $(COMMON_LD_FLAGS)

# In the tests, some of our expectations change depending on the llvm version
TEST_CXX_FLAGS += -DLLVM_VERSION=$(LLVM_VERSION_TIMES_10)
TEST_CXX_FLAGS += $(HL_VERSION_FLAGS)

# In the tests, default to exporting no symbols that aren't explicitly exported
TEST_CXX_FLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
@@ -305,13 +330,22 @@ TEST_METAL = 1
endif
endif

ifneq ($(WITH_VULKAN), )
ifneq (,$(findstring vulkan,$(HL_TARGET)))
TEST_VULKAN = 1
endif
endif

ifeq ($(UNAME), Linux)
ifneq ($(TEST_CUDA), )
CUDA_LD_FLAGS ?= -L/usr/lib/nvidia-current -lcuda
endif
ifneq ($(TEST_OPENCL), )
OPENCL_LD_FLAGS ?= -lOpenCL
endif
ifneq ($(TEST_VULKAN), )
VULKAN_LD_FLAGS ?= -lvulkan
endif
OPENGL_LD_FLAGS ?= -lGL
HOST_OS=linux
endif
@@ -324,6 +358,10 @@ endif
ifneq ($(TEST_OPENCL), )
OPENCL_LD_FLAGS ?= -framework OpenCL
endif
ifneq ($(TEST_VULKAN), )
# The Vulkan loader is distributed as a dylib on OSX (not a framework)
VULKAN_LD_FLAGS ?= -lvulkan
endif
ifneq ($(TEST_METAL), )
METAL_LD_FLAGS ?= -framework Metal -framework Foundation
endif
@@ -335,6 +373,10 @@ ifneq ($(TEST_OPENCL), )
TEST_CXX_FLAGS += -DTEST_OPENCL
endif

ifneq ($(TEST_VULKAN), )
TEST_CXX_FLAGS += -DTEST_VULKAN
endif

ifneq ($(TEST_METAL), )
# Using Metal APIs requires writing Objective-C++ (or Swift). Add ObjC++
# to allow tests to create and destroy Metal contexts, etc. This requires
@@ -433,6 +475,7 @@ SOURCE_FILES = \
CodeGen_LLVM.cpp \
CodeGen_Metal_Dev.cpp \
CodeGen_OpenCL_Dev.cpp \
CodeGen_Vulkan_Dev.cpp \
CodeGen_OpenGLCompute_Dev.cpp \
CodeGen_Posix.cpp \
CodeGen_PowerPC.cpp \
@@ -623,6 +666,7 @@ HEADER_FILES = \
CodeGen_LLVM.h \
CodeGen_Metal_Dev.h \
CodeGen_OpenCL_Dev.h \
CodeGen_Vulkan_Dev.h \
CodeGen_OpenGLCompute_Dev.h \
CodeGen_Posix.h \
CodeGen_PTX_Dev.h \
@@ -853,8 +897,10 @@ RUNTIME_CPP_COMPONENTS = \
windows_profiler \
windows_threads \
windows_threads_tsan \
windows_vulkan \
windows_yield \
write_debug_image \
vulkan \
x86_cpu_features \

RUNTIME_LL_COMPONENTS = \
@@ -883,6 +929,7 @@ RUNTIME_EXPORTED_INCLUDES = $(INCLUDE_DIR)/HalideRuntime.h \
$(INCLUDE_DIR)/HalideRuntimeOpenGLCompute.h \
$(INCLUDE_DIR)/HalideRuntimeMetal.h \
$(INCLUDE_DIR)/HalideRuntimeQurt.h \
$(INCLUDE_DIR)/HalideRuntimeVulkan.h \
$(INCLUDE_DIR)/HalideRuntimeWebGPU.h \
$(INCLUDE_DIR)/HalideBuffer.h \
$(INCLUDE_DIR)/HalidePyTorchHelpers.h \
@@ -1049,6 +1096,7 @@ RUNTIME_CXX_FLAGS = \
-Wno-unused-function \
-Wvla \
-Wsign-compare
RUNTIME_CXX_FLAGS += $(HL_VERSION_FLAGS)

$(BUILD_DIR)/initmod.windows_%_x86_32.ll: $(SRC_DIR)/runtime/windows_%_x86.cpp $(BUILD_DIR)/clang_ok
@mkdir -p $(@D)
2 changes: 1 addition & 1 deletion README.md
@@ -7,7 +7,7 @@ currently targets:
- CPU architectures: X86, ARM, Hexagon, PowerPC, RISC-V
- Operating systems: Linux, Windows, macOS, Android, iOS, Qualcomm QuRT
- GPU Compute APIs: CUDA, OpenCL, OpenGL Compute Shaders, Apple Metal, Microsoft
-Direct X 12
+Direct X 12, Vulkan

Rather than being a standalone programming language, Halide is embedded in C++.
This means you write C++ code that builds an in-memory representation of a