Expected Behavior

Cross-platform binary and code generation, with the best schedule for the computational graph applied by Halide or the Halide autoschedulers, and reduced memory usage from scheduling every network model that is used ahead of execution.
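To make the expected workflow concrete, here is a minimal sketch of ahead-of-time cross-compilation with Halide. The pipeline, the llama_node name, and the target strings are illustrative assumptions, not existing llama.cpp code:

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    // Hypothetical single node of the model graph: y(i) = sum_k W(k, i) * x(k).
    ImageParam W(Float(32), 2, "W"), x(Float(32), 1, "x");
    Var i("i");
    RDom k(0, 4096);
    Func y("y");
    y(i) = sum(W(k, i) * x(k));
    y.parallel(i);

    // One pipeline definition, many targets: Halide cross-compiles the same
    // graph to object code per platform, replacing per-platform #ifdef paths.
    // (GPU targets such as *-opencl would additionally need a gpu_tile schedule.)
    for (std::string t : {"x86-64-linux-avx2", "arm-64-android"}) {
        y.compile_to_static_library("llama_node_" + t, {W, x}, "llama_node", Target(t));
    }
    return 0;
}
```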
Current Behavior
Multiple conditional defines for different platforms and instruction-set-bloated code, and no GPU support (e.g. OpenCL, OpenGL Compute, CUDA). The current computational graph has great parallelism but very poor locality.
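On the locality point, Halide's split between algorithm and schedule is the relevant lever: the same graph can be re-scheduled so a producer is computed per tile of its consumer instead of being materialized in full. A minimal sketch, with made-up ops and tile sizes rather than anything taken from ggml:

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    ImageParam in(Float(32), 2, "in");
    Var x("x"), y("y"), xo("xo"), yo("yo"), xi("xi"), yi("yi");

    // Two chained element-wise nodes, like consecutive ops in the graph.
    Func scaled("scaled"), relu("relu");
    scaled(x, y) = in(x, y) * 0.125f;
    relu(x, y) = max(scaled(x, y), 0.f);

    // Node-at-a-time execution materializes `scaled` completely before
    // `relu` starts: parallel, but every intermediate round-trips to RAM.
    // Computing the producer inside each consumer tile keeps it in cache.
    relu.tile(x, y, xo, yo, xi, yi, 64, 64)
        .parallel(yo)
        .vectorize(xi, 8);
    scaled.compute_at(relu, xo)
          .vectorize(x, 8);

    relu.compile_jit();
    return 0;
}
```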
Despite the Apr 1 date and the slightly offensive and outdated "Current Behavior" section, it should be pretty straightforward to translate the llama.cpp graph for some model to Halide and experiment with how much performance improvement is achievable; a rough sketch of one translated node follows.
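As a starting point, a hypothetical translation of one ggml-style node, a matmul, into a Halide function with an explicit schedule (none of this exists in llama.cpp; names and tile sizes are placeholders):

```cpp
#include "Halide.h"
using namespace Halide;

// C(i, j) = sum_k A(k, i) * B(k, j), i.e. rows of A dotted with rows of B,
// which mirrors how ggml_mul_mat reads both operands along their first dim.
Func mul_mat(ImageParam A, ImageParam B, int K) {
    Var i("i"), j("j");
    RDom k(0, K);
    Func C("C");
    C(i, j) = 0.f;                  // initialization stage (default schedule)
    C(i, j) += A(k, i) * B(k, j);   // reduction stage

    // Schedule only the reduction: tile the output, vectorize, parallelize.
    Var io("io"), jo("jo"), ii("ii"), ji("ji");
    C.update()
     .tile(i, j, io, jo, ii, ji, 32, 32)
     .vectorize(ii, 8)
     .parallel(jo);
    return C;
}
```

From there, swapping a Halide autoscheduler in for the hand-written tile/vectorize/parallel lines is what would let the experiment cover many platforms cheaply.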
Prerequisites
Examples of working with neural networks in Halide:
https://github.com/halide/Halide/blob/main/apps/resnet_50/Resnet50Generator.cpp
https://github.com/halide/Halide/tree/main/apps/hannk
https://github.com/halide/Halide/blob/main/apps/onnx/model.cpp
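In the same spirit as Resnet50Generator.cpp above, a translated model (or a single layer of it) could be wrapped in a Halide generator so that manual schedules and autoscheduler estimates live side by side. A skeleton where every name and size is a placeholder:

```cpp
#include "Halide.h"

class LlamaLayerGenerator : public Halide::Generator<LlamaLayerGenerator> {
public:
    Input<Buffer<float>> weights{"weights", 2};
    Input<Buffer<float>> activations{"activations", 1};
    Output<Buffer<float>> out{"out", 1};

    void generate() {
        Halide::Var i("i");
        Halide::RDom k(0, weights.dim(0).extent());
        // One graph node: out(i) = sum_k weights(k, i) * activations(k).
        out(i) = Halide::sum(weights(k, i) * activations(k));
    }

    void schedule() {
        // Size estimates let an autoscheduler (e.g. Adams2019) derive a
        // schedule; 4096 is a placeholder hidden dimension.
        weights.set_estimates({{0, 4096}, {0, 4096}});
        activations.set_estimates({{0, 4096}});
        out.set_estimates({{0, 4096}});
    }
};

HALIDE_REGISTER_GENERATOR(LlamaLayerGenerator, llama_layer)
```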