produce C {
  for (i0.outer, 0, 3) {
    for (i0.inner.outer, 0, 2) {
      for (i0.inner.inner, 0, 32) {
        if (likely(((i0.outer*48) < ((100 - i0.inner.inner) - (i0.inner.outer*32))))) {
          C[(((i0.outer*48) + (i0.inner.outer*32)) + i0.inner.inner)] = A[(((i0.outer*48) + (i0.inner.outer*32)) + i0.inner.inner)]
        }
      }
    }
  }
}
The generated code is functionally correct, but it is inefficient because some points of the iteration space are visited more than once. In particular, when i0.outer is 0, the assignment executes for points [0-63]; when i0.outer is 1, for points [48-99]; and when i0.outer is 2, for points [96-99]. Points [48-63] and [96-99] are therefore each written twice.
Adding a predicate that relates i0.inner.outer and i0.inner.inner (i.e., i0.inner.outer*32 + i0.inner.inner < 48) would solve the problem.
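To make the overlap concrete, here is a small plain-Python sketch that replays the loop nest from the IR above and counts how often each index is assigned, with and without the proposed predicate. It does not use TVM or Halide; the names `visits`, `N`, `OUTER`, and `INNER` are only illustrative, and the extent of 100 is read off the guard in the IR.

```python
from collections import Counter

N = 100      # extent of the original loop, as it appears in the IR guard
OUTER = 48   # first split factor
INNER = 32   # second split factor


def visits(extra_predicate: bool) -> Counter:
    """Count how often each index is assigned by the loop nest."""
    counts = Counter()
    for outer in range(3):                    # i0.outer
        for inner_outer in range(2):          # i0.inner.outer
            for inner_inner in range(INNER):  # i0.inner.inner
                # Guard emitted in the IR above.
                if not (outer * OUTER < (N - inner_inner) - inner_outer * INNER):
                    continue
                # Proposed additional guard: stay inside the 48-wide tile.
                if extra_predicate and not (inner_outer * INNER + inner_inner < OUTER):
                    continue
                counts[outer * OUTER + inner_outer * INNER + inner_inner] += 1
    return counts


current = visits(extra_predicate=False)
fixed = visits(extra_predicate=True)

duplicated = sorted(i for i, c in current.items() if c > 1)
print("assignments without extra predicate:", sum(current.values()))
print("points written more than once:", duplicated)
print("assignments with extra predicate:", sum(fixed.values()))
```

Under these assumptions the current guard performs 120 assignments for 100 points, while the extra predicate brings it down to exactly one assignment per point.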
Consider the following program in which we split a loop by a factor of 48 and then we split the inner loop (which is of size 48) by a factor of 32.
The generated Halide IR is: