Major revamp to Halide 16.0 with Anderson2021 GPU autoscheduler #67

Open
6 of 7 tasks
antonysigma opened this issue Apr 21, 2023 · 0 comments
antonysigma (Collaborator) commented Apr 21, 2023

(Adding the task dependencies for my own reminder.)

  1. Wait for the Halide 16.0 release.
  2. Refactor the Halide::BoundaryConditions calls to use the new APIs.
  3. Similarly, refactor the Generator::* related code to use the Halide 16.0 APIs.
  4. In algorithms/ladmm.py, ensure all NumPy matrices are in Fortran order by default; this avoids the frequent C-order to F-order conversion overhead in the (L-)ADMM iterations.
  5. Similarly, ensure the Halide-accelerated linear operators, e.g. A_mask.cpython.so, write to the output buffers in F-order, and not to some orphan buffers that are immediately destroyed. This should solve the convergence failure bugs whenever implem='Halide' is set.
  6. Wait until the Anderson2021 autoscheduler is ready for production; ASAN currently reports an out-of-bounds read error in anderson2021_test_apps_autoscheduler (halide/Halide#7606).
  7. (Optional) Compile the Halide generators with C++20; this should cut the compile time in half thanks to the new C++ Concepts feature.
  8. (Optional) Reduce the code bloat of ladmm-iter-gen.cpp with the broadcast operator Halide::_.
  9. Replace the Li2018 autoscheduler with Anderson2021; the latter utilizes the GPU cache and the shared memory in the SM far better.
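
For items 4 and 5, a minimal NumPy-only sketch of the intended buffer discipline. The helper names (`ensure_fortran`, `apply_mask_inplace`) are hypothetical and only stand in for the Halide-backed operators; nothing here is the actual code in algorithms/ladmm.py. It assumes the operator receives a caller-allocated, F-contiguous output buffer and writes into it in place.

```python
import numpy as np


def ensure_fortran(x):
    """Return x as an F-contiguous array; copies only if x is not already F-ordered."""
    return np.asfortranarray(x)


def apply_mask_inplace(x, mask, out):
    """Hypothetical stand-in for a Halide-backed operator such as A_mask.

    Writes the result into the caller-provided `out` buffer instead of
    allocating a temporary array that is discarded afterwards.
    """
    if not out.flags.f_contiguous:
        raise ValueError("out must be F-contiguous")
    np.multiply(x, mask, out=out)  # element-wise product, stored directly in `out`
    return out


# A C-ordered input gets converted once, up front, not on every iteration.
x_c = np.arange(12, dtype=np.float32).reshape(3, 4)
x = ensure_fortran(x_c)

mask = np.ones((3, 4), dtype=np.float32, order="F")
mask[:, 0] = 0.0  # zero out the first column

# Allocate the output buffer once in F-order and reuse it across iterations.
out = np.empty((3, 4), dtype=np.float32, order="F")
result = apply_mask_inplace(x, mask, out)
```

If the arrays are allocated with order="F" from the start, `ensure_fortran` becomes a no-op and the per-iteration layout conversion disappears; checking `out.flags.f_contiguous` at the boundary should make a silently orphaned buffer fail loudly instead.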

References:
halide/Halide#6856
halide/Halide#7459
