Releases: JuliaGPU/AcceleratedKernels.jl

v0.2.1

01 Dec 04:24
99247e6

AcceleratedKernels v0.2.1

Diff since v0.2.0

Merged pull requests:

  • Add Buildkite CI for CUDA (#9) (@jpsamaroo)
  • added foreach + tests. Started updating indices within kernels to use… local types without int64 promotions - about 25% faster in sort for example. Set default block_size to 256 (#11) (@anicusan)

Closed issues:

  • Support for a :serial scheduler (#7)

v0.2.0

08 Nov 23:13
5068a6d
Compare
Choose a tag to compare

AcceleratedKernels v0.2.0

  • N-dimensional reduce and mapreduce
  • map
  • docs and tests for each of the above.
  • In-place functions now also return the modified argument as in Base.
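The "as in Base" convention in the last item can be illustrated with Base Julia alone. This is a minimal sketch that deliberately uses only `Base.sort!` and `Base.map!`, not AcceleratedKernels calls; the variable names are hypothetical, and it assumes the AcceleratedKernels in-place functions mirror this return behaviour, as the note above states.

```julia
# Base-only illustration: in-place functions return the array they modified.
# No AcceleratedKernels functions are called here.
v = [3, 1, 2]
dst = similar(v)

sorted = sort!(v)                 # sorts v in place and returns v itself
mapped = map!(x -> 2x, dst, v)    # writes into dst and returns dst

@assert sorted === v              # same object, not a copy
@assert mapped === dst

# Because the modified argument is returned, in-place calls can be chained.
total = sum(map!(abs2, dst, sort!(v)))
```

Returning the modified argument means an in-place call can be used directly inside a larger expression without a separate statement.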

Diff since v0.1.0

Merged pull requests:

v0.1.0

16 Oct 16:24
1adfc5f

AcceleratedKernels v0.1.0

Merged pull requests:

  • Remove unnecessary Const (#3) (@pxl-th)
  • Remove redundant Const and inbounds macros (#4) (@pxl-th)

Closed issues:

  • Use GPUArraysCore? (#1)
  • Invalid IR for sortperm! on AMDGPU (#2)

0.1.0

25 Sep 22:57
8cb6bf3

First release of AcceleratedKernels.jl, archived in support of the paper "AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase".