for (int i = 0; i < M; i++)
  for (int j = 0; j < N; j++)
    po[i+j*M] = pa[j+i*N];
than this (1D-blocking)
// No min function in C ...
#define min(a,b) (((a)<(b))?(a):(b))
const int blck = 32;
for (int i = 0; i < M; i += blck)
  for (int j = 0; j < N; ++j)
    for (int ii = i; ii < min(i+blck, M); ++ii)
      po[ii+j*M] = pa[j+ii*N];
or 4x slower than this (2D blocking)
#define min(a,b) (((a)<(b))?(a):(b))
const int blck = 32;
for (int j = 0; j < N; j += blck)
  for (int i = 0; i < M; i += blck)
    for (int jj = j; jj < min(j+blck, N); ++jj)
      for (int ii = i; ii < min(i+blck, M); ++ii)
        po[ii+jj*M] = pa[jj+ii*N];
As discussed today, #6 will be merged and the review items will be logged so that they can be revisited in 3~4 weeks to improve the code.
Some helpful background to understand what those advice and instance columns mean:
Original idea;
Tasks:
Document the ConstraintCluster purpose
halo2/halo2_proofs/src/plonk/evaluation.rs
Lines 181 to 197 in 9eaccbb
Explain where the 5 comes from, potentially replacing it with its source instead of a hard-coded constant that may rot after refactoring
halo2/halo2_proofs/src/plonk/evaluation.rs
Lines 313 to 315 in 9eaccbb
halo2/halo2_proofs/src/plonk/evaluation.rs
Lines 767 to 777 in 9eaccbb
I suspect it's linked to the number of lookup columns?
Break down the evaluate_h function
halo2/halo2_proofs/src/plonk/evaluation.rs
Lines 393 to 976 in 9eaccbb
Optimize transposition
halo2/halo2_proofs/src/poly/domain.rs
Lines 188 to 212 in 9eaccbb
The transposition could be done without the intermediary step and the flatten at the end.
Also, if this is a bottleneck, transposition can be made 4x faster with cache blocking, even in serial code.
See my benchmarks of transposition algorithms at: https://github.com/mratsim/laser/blob/e23b5d63f58441968188fb95e16862d1498bb845/benchmarks/transpose/transpose_bench.nim#L558-L674
The change in algorithm is simple: the naive double loop is 3x slower than the 1D-blocked version and 4x slower than the 2D-blocked version.
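For reference, here is a minimal self-contained sketch of the naive and 2D-blocked variants as functions, so the two can be checked against each other before porting the idea to domain.rs. The function names, the `double` element type and the row-major layout are assumptions made for illustration only (the halo2 code works on field elements, not doubles), and the block size of 32 is simply the value used in the snippets in this issue, not a tuned constant.

```c
#include <assert.h>
#include <stddef.h>

/* Block edge length. 32 mirrors the snippets in this issue; the best value
 * depends on element size and cache size, so treat it as a starting point. */
enum { BLCK = 32 };

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Naive out-of-place transpose (hypothetical helper name).
 * pa is an M x N row-major matrix, po receives the N x M row-major result. */
void transpose_naive(const double *pa, double *po, size_t M, size_t N) {
    assert(pa != po); /* out-of-place only: buffers must not alias */
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < N; j++)
            po[i + j * M] = pa[j + i * N];
}

/* 2D cache-blocked out-of-place transpose (hypothetical helper name).
 * Same contract as transpose_naive, but the matrix is walked in
 * BLCK x BLCK tiles so that reads and writes both stay within a small
 * working set. Edge tiles are clipped with min_size. */
void transpose_blocked_2d(const double *pa, double *po, size_t M, size_t N) {
    assert(pa != po); /* out-of-place only: buffers must not alias */
    for (size_t j = 0; j < N; j += BLCK)
        for (size_t i = 0; i < M; i += BLCK)
            for (size_t jj = j; jj < min_size(j + BLCK, N); ++jj)
                for (size_t ii = i; ii < min_size(i + BLCK, M); ++ii)
                    po[ii + jj * M] = pa[jj + ii * N];
}
```

Both functions write straight into the flat output buffer, which is the same shape of change as dropping the intermediary step and the final flatten. Comparing their outputs on a matrix whose dimensions are not multiples of 32 is a quick way to exercise the clipped edge tiles.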