-
Hi, I hope to use Halide to simulate a three-level cache architecture for a CPU, where the memory at each level is divided into blocks sized for the level above it. For example, I use this C++ code:
Func l1, l2, l3, l2_out, l3_out;
l3.store_in(MemoryType::L3);
l2.store_in(MemoryType::L2);
l1.store_in(MemoryType::L1);
l3_out.store_in(MemoryType::L3);
l2_out.store_in(MemoryType::L2);
auto l2_size = 16*256, l1_size = 4*256;
for (auto i = 0; i < 16; i++) { // 16 times l3->l2
    RDom r_l2(0, l2_size, "l2_reduce");
    l2(x2) = l3(x2);
    l2(r_l2) = l3(i * l2_size + r_l2);
    for (auto j = 0; j < 4; j++) { // 4 times l1->l2
        RDom r_l1(0, l1_size, "l1_reduce");
        l1(x1) = l2(j * l1_size + r_l1);
        l2_out(j*l1_size+r_l1) = l1(r_l1);
    }
    l3_out(i*l2_size + r_l2) = l2_out(r_l2);
}
It seems I can't use a reduction in a pure function definition. Is there any way to run this?
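To make the intent concrete, the data movement the snippet above is meant to model could be sketched in plain C++ (no Halide) roughly as follows. The buffer names mirror the Funcs in the question; the `stage_through_cache` helper, the `k...` constants, and the use of one `std::vector` per cache level are assumptions made only for illustration, since real CPU caches are not addressed explicitly like this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kL1Size = 4 * 256;              // elements per L1 block
constexpr std::size_t kL2Size = 16 * 256;             // elements per L2 block
constexpr std::size_t kL2Blocks = 16;                 // number of L3 -> L2 transfers
constexpr std::size_t kL1Blocks = kL2Size / kL1Size;  // 4 L2 -> L1 transfers per L2 block

// Hypothetical helper: copies l3 to l3_out, staging every element
// through an L2-sized and an L1-sized scratch buffer on the way.
std::vector<int> stage_through_cache(const std::vector<int>& l3) {
    assert(l3.size() == kL2Blocks * kL2Size);
    std::vector<int> l3_out(l3.size());
    std::vector<int> l2(kL2Size), l2_out(kL2Size), l1(kL1Size);

    for (std::size_t i = 0; i < kL2Blocks; ++i) {
        const auto l2_off = static_cast<std::ptrdiff_t>(i * kL2Size);
        // L3 -> L2: load one L2-sized block.
        std::copy_n(l3.begin() + l2_off, kL2Size, l2.begin());

        for (std::size_t j = 0; j < kL1Blocks; ++j) {
            const auto l1_off = static_cast<std::ptrdiff_t>(j * kL1Size);
            // L2 -> L1: load one L1-sized block.
            std::copy_n(l2.begin() + l1_off, kL1Size, l1.begin());
            // L1 -> L2: write the block back to the L2 output buffer.
            std::copy_n(l1.begin(), kL1Size, l2_out.begin() + l1_off);
        }

        // L2 -> L3: write the finished L2 block back.
        std::copy_n(l2_out.begin(), kL2Size, l3_out.begin() + l2_off);
    }
    return l3_out;
}
```

Here the loops run when the program runs and really do move data, which is the behaviour the Halide snippet is trying to express.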
-
Those for loops make me worry you're misunderstanding the metaprogramming aspect of Halide. Each loop iteration you have adds new definitions for those Funcs, rather than actually moving any data around. I'd suggest starting with the tutorials here: https://halide-lang.org/tutorials/tutorial_introduction.html
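The distinction between defining and running can be illustrated with a toy model in plain C++. This is not Halide's API: `ToyFunc`, `define`, `realize`, and `build_pipeline` are hypothetical names invented for this sketch, and the only assumption carried over from Halide is the general idea that C++ loops run while the pipeline is being built, not while it executes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a pipeline stage: it only records update recipes.
struct ToyFunc {
    using Update = std::function<void(std::vector<int>&)>;
    std::vector<Update> updates;

    // "Defining" an update stores a recipe; no data is touched here.
    void define(Update u) { updates.push_back(std::move(u)); }

    // Only realize() runs the recorded recipes, in definition order.
    std::vector<int> realize(std::size_t n) const {
        std::vector<int> out(n, 0);
        for (const auto& u : updates) u(out);
        return out;
    }
};

// The C++ loop runs at pipeline-construction time. Each iteration
// appends one more definition; nothing is computed yet.
ToyFunc build_pipeline() {
    ToyFunc f;
    for (int i = 0; i < 16; i++) {
        f.define([i](std::vector<int>& buf) {
            assert(static_cast<std::size_t>(i) < buf.size());
            buf[static_cast<std::size_t>(i)] = i * i;
        });
    }
    return f;
}
```

After `build_pipeline()` returns, the toy object holds 16 recorded definitions and no output exists until `realize` is called. In the same spirit, the loops in the question grow the set of definitions on the Funcs rather than performing 16 block transfers.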
-
I finally understand the metaprogramming aspect, so I need to add the DMA instructions through the scheduling functions. Thanks.