Accumulate / Prefix Sum / Scan
Compute accumulated running totals along a sequence by applying a binary operator to all elements up to the current one; often used in GPU programming as a first step in finding / extracting subsets of data.
accumulate! (in-place), accumulate (allocating); inclusive or exclusive.
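The difference between inclusive and exclusive scans, and the way a scan helps extract a subset of data, can be sketched on the CPU with Base.accumulate. This is an illustration only, not the library's GPU implementation; the sample data, the threshold, and the variable names are assumptions made for the example.

```julia
# CPU-only illustration using Base.accumulate; no GPU package is needed here.
v = [3, 1, 4, 1, 5]

# Inclusive scan: element i is the sum of v[1:i].
inclusive = accumulate(+, v)

# Exclusive scan: element i is init plus the sum of v[1:i-1].
init = 0
exclusive = vcat(init, accumulate(+, v[1:end-1]; init=init))

# Stream compaction sketch: keep the elements greater than 2.
flags = Int.(v .> 2)                               # 1 where the element is kept
offsets = vcat(0, accumulate(+, flags[1:end-1]))   # exclusive scan of the flags
kept = Vector{Int}(undef, sum(flags))
for i in eachindex(v)
    if flags[i] == 1
        kept[offsets[i] + 1] = v[i]                # offsets are 0-based, slots 1-based
    end
end

println(inclusive)   # [3, 4, 8, 9, 14]
println(exclusive)   # [0, 3, 4, 8, 9]
println(kept)        # [3, 4, 5]
```

The exclusive scan of the flags gives each kept element its output position, which is why a scan is typically the first step of a parallel filter: every element can then be written independently.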
@@ -8,11 +8,11 @@
Function signature:
accumulate!(op, v::AbstractGPUVector; init, inclusive::Bool=true,
- block_size::Int=128,
+ block_size::Int=256,
temp::Union{Nothing, AbstractGPUVector}=nothing,
temp_flags::Union{Nothing, AbstractGPUVector}=nothing)
accumulate(op, v::AbstractGPUVector; init, inclusive::Bool=true,
- block_size::Int=128,
+ block_size::Int=256,
temp::Union{Nothing, AbstractGPUVector}=nothing,
temp_flags::Union{Nothing, AbstractGPUVector}=nothing)
Example computing an inclusive prefix sum (the typical GPU "scan"):
@@ -22,4 +22,4 @@
v = oneAPI.ones(Int32, 100_000)
AK.accumulate!(+, v, init=0)
The temporaries temp and temp_flags should both have at least (length(v) + 2 * block_size - 1) ÷ (2 * block_size) elements. temp must have the same element type as v, i.e. eltype(v) === eltype(temp); the elements of temp_flags can be of any integer type, but Int8 is used by default to reduce memory usage.
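As a quick check of the size requirement, the formula can be evaluated in plain Julia. The helper name min_temp_length is hypothetical (it is not part of the library); it only restates the expression above, which is a ceiling division of length(v) by 2 * block_size.

```julia
# Hypothetical helper, not part of the library: minimum number of elements
# that temp and temp_flags need for a vector of length n.
min_temp_length(n::Integer, block_size::Integer) =
    (n + 2 * block_size - 1) ÷ (2 * block_size)

n = 100_000

len_256 = min_temp_length(n, 256)   # block_size = 256
len_128 = min_temp_length(n, 128)   # block_size = 128

println(len_256)   # 196
println(len_128)   # 391

# The expression is equivalent to ceiling division by 2 * block_size.
println(len_256 == cld(n, 2 * 256))   # true
```

Buffers of this length could then presumably be allocated once, for example with similar(v, len_256) for temp and similar(v, Int8, len_256) for temp_flags, and passed through the temp and temp_flags keywords to avoid reallocating on repeated calls. Note that the required length depends on block_size, so buffers sized for one block size may be too small for a smaller one.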
This document was generated with Documenter.jl version 1.7.0 on Tuesday 12 November 2024. Using Julia version 1.11.1.