Skip to content

Commit

Permalink
feat: passes copy_cycles by const reference to avoid copying (#8051)
Browse files Browse the repository at this point in the history
As part of my memory deep-dive, I discovered some unnecessary memory
copying of the copy_cycles vector.

In particular, this vector has length variables.size(), which can be
huge. In the case of ultra_honk_bench on a mock 2^20 sized circuit which
has approximately 2^20 variables as well, this copy_cycles vector takes
up around 24MB (for just the 2^20 empty vectors) plus 32MB of memory for
the actual data (it has full capacity of 36MB, since it's made up of a
bunch of small vectors of lengths 2, 3, and 4, and the length 3 vectors
have the same capacity as a length 4 vector).

When this copy_cycles vector is passed into
`compute_permutation_argument_polynomials` and
`compute_permutation_mapping`, it was actually passed by value, which
meant that 24+32=56MB of memory was getting copied twice. This had the
effect of bumping up peak memory during proving key construction by an
extra 112MB.

More context on why I think memory during proving key construction is
important:
Proving key construction memory is not the memory bottleneck in the case
of ultra_honk_bench (ever since my recent memory fix removing
unnecessary polynomial initialization in the TraceData constructor)
since sumcheck takes up way more memory. However, we care a lot about
peak memory during proving key construction because we expect this to be
the bottleneck during ClientIVC when we need to construct the proving
key of one instance while holding all the memory of the accumulator as
well.

The graph below shows the 112MB allocation under the first half of
compute_permutation_argument_polynomials.
![Tracy
graph](https://github.com/user-attachments/assets/bc4df740-edf2-42e4-8304-9f24a19d9c76)

Effect on timing:
Not sure why but it seems to make the benchmark 150ms slower?
Before: `construct_proof_ultrahonk_power_of_2/20 4858 ms 4423 ms 1`
After: `construct_proof_ultrahonk_power_of_2/20 5016 ms 4579 ms 1`
  • Loading branch information
lucasxia01 authored Aug 16, 2024
1 parent ec5a5fb commit 495d363
Showing 1 changed file with 2 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ template <typename Flavor, bool generalized>
PermutationMapping<Flavor::NUM_WIRES, generalized> compute_permutation_mapping(
const typename Flavor::CircuitBuilder& circuit_constructor,
typename Flavor::ProvingKey* proving_key,
std::vector<CyclicPermutation> wire_copy_cycles)
const std::vector<CyclicPermutation>& wire_copy_cycles)
{

// Initialize the table of permutations so that every element points to itself
Expand Down Expand Up @@ -368,7 +368,7 @@ inline std::tuple<LegacyPolynomial<FF>, LegacyPolynomial<FF>> compute_first_and_
template <typename Flavor>
void compute_permutation_argument_polynomials(const typename Flavor::CircuitBuilder& circuit,
typename Flavor::ProvingKey* key,
std::vector<CyclicPermutation> copy_cycles)
const std::vector<CyclicPermutation>& copy_cycles)
{
constexpr bool generalized = IsUltraPlonkFlavor<Flavor> || IsUltraFlavor<Flavor>;
auto mapping = compute_permutation_mapping<Flavor, generalized>(circuit, key, copy_cycles);
Expand Down

0 comments on commit 495d363

Please sign in to comment.