tari-project · CjS77 · Jul 28, 2023 · Jul 20, 2023 · Jul 22, 2023 · Jul 23, 2023
diff --git a/src/RFC-0181_BulletproofsPlus.md b/src/RFC-0181_BulletproofsPlus.md
@@ -76,9 +76,20 @@ An approach to mask and value recovery was [used by Grin](https://github.com/mim
 
 To reduce confusion in our description and more closely match implementation libraries, we use additive notation and uppercase letters for group elements, and otherwise assume notation from the preprints.
 Denote the commitment value generator by $G\_c$ and the commitment mask generator vector by $\vec{H}\_c$.
-We note that the terms of the vector $\vec{d}$ can be succinctly expressed by noting that for $0 \leq i < n$ and $0 \leq j < m$ we have $d\_{jn+i} = z^{2(j+1)} 2^i$, which can be efficiently defined iteratively.
 Because the preprint uses the notation $A$ differently in the weighted inner product and range proving protocols, we rename it to $A'$ in the weighted inner product protocol.
 
+One definition of note is that of the vector $\vec{d}$ introduced in the preprint.
+This vector is defined as
+\\[
+ \vec{d} = \sum\_{j=0}^{m-1} z^{2(j+1)} \vec{d}\_j \tag{1}
+\\]
+where each
+\\[
+ \vec{d}\_j = (\underbrace{0,\ldots,0}\_{jn}, \vec{2}^n, \underbrace{0,\ldots,0}\_{(m-j-1)n}) \tag{2}
+\\]
+contains only powers of two.
+In particular, this means we can express individual elements of $\vec{d}$ as $d\_{jn+i} = z^{2(j+1)} 2^i$ for $0 \leq i < n$ and $0 \leq j < m$.
+
 Finally, we note one additional unfortunate notation change that applies to the implementation.
 Both the Bulletproofs+ and Zarcanum preprints use $G$ as the commitment value generator, and either $H$ or $\vec{H}\_c$ (in our notation) for masking.
 However, in the Tari protocol (as in other similar protocols), this notation is switched!
@@ -114,7 +125,7 @@ Here
 \\[
 \begin{align*}
 x &= \langle \vec{1}^{mn}, \overrightarrow{y}^{mn} \rangle z - \langle \vec{1}^{mn}, \vec{d} \rangle y^{mn+1}z - \langle \vec{1}^{mn}, \overrightarrow{y}^{mn} \rangle z^2 \\\\
-&= z\sum\_{i=1}^{mn} y^i - y^{mn+1}z\sum\_{i=0}^{mn-1}d\_i - z^2\sum_{i=1}^{mn} y^i
+&= z\sum\_{i=1}^{mn} y^i - y^{mn+1}z\sum\_{i=0}^{mn-1}d\_i - z^2\sum_{i=1}^{mn} y^i \tag{3}
 \end{align*}
 \\]
 is a scalar defined entirely in terms of constants and challenge values from the proof.
@@ -171,149 +182,38 @@ The recovered masks must then be checked against the extended commitment once th
 Otherwise, if the verifier uses a different nonce seed than the prover did (or if the prover otherwise did not derive the nonces using a nonce seed at all), it will recover incorrect masks.
 If the verifier is able to construct the extended commitment from the value and recovered masks, the recovery succeeds; otherwise, the recovery fails.
 
-## Comparative performance
-
-As we moved from Bulletproofs [1] to Bulletproofs+ [2] in our
-[blockchain project](https://github.com/tari-project), the natural benchmark comparison is with the experimental results
-in [2] and Dalek's Bulletproofs [3]. Compared with Dalek's Bulletproofs, our average proof creation is 30% slower,
-while verification is on par. Compared with the experimental results in [2], we could not recreate the 16% reduction in
-prover time; however, our 1% increase in verification time is on par with their 3%. Immediate benefits are evident
-when employing batch verification; exponential gains range from 37% to 79% for batch sizes from 2 to 256 proofs.
-
-Extended commitments add virtually no overhead in single or aggregated range proof creation or verification time,
-neither in batched verification time nor when compared to regular Pedersen commitments.
-
-Mask/blinding factor recovery adds moderate (5% for single proof-verification with extension degree zero) to significant
-(22% for 256 single batched proofs verification with extension degree two) overhead to verification performance;
-comparisons below were made without activating the recovery feature. Deactivating proof verification for a
-"mask-recovery-only" mode of operation is possible and omits the expensive multi-exponentiation multiplication,
-resulting in linear performance (as opposed to exponential gains/cost). Batched "mask-recovery-only" is approximately
-10% more costly on average when compared to non-batched recovery.
-
-**Note:** The test results listed here are relative; the numbers are not absolute. The tests were run on an Intel(R)
-Core(TM) i7-7820HQ CPU laptop without using the `simd_backend` feature.
-
-### Aggregated 64-bit range proof creation
-
-_**Notes:**_
-- Median values are used for comparison.
-- In the headings and legends:
- - `ed_0` means extension degree zero
- - `ed_1` means extension degree one
- - `ed_2` means extension degree two
-
-#### BP vs. BP+ (creation)
-
-BP+ creation is 30% slower than BP.
-
-| Size | BP Med (ms) | BP+ Med (ms) | Diff Med (%) |
-|------|-------------|--------------|--------------|
-| 1 | 16.29 | 21.24 | 130% |
-| 2 | 31.63 | 41.08 | 130% |
-| 4 | 60.47 | 80.46 | 133% |
-| 8 | 119.18 | 156.56 | 131% |
-| 16 | 240.18 | 306.03 | 127% |
-| 32 | 460.67 | 598.57 | 130% |
-| | | Average | 130% |
-
-<p align="center"><img src="assets/img_bp_vs_bp_plus_creation.png" width="550" /></p>
-
-#### BP+ extension degrees (creation)
-
-Extended commitments add virtually no overhead to creation time.
-
-| Size | BP+ Med ed_0 (ms) | BP+ Med ed_1 (ms) | BP+ Med ed_2 (ms) | Diff Med ed_0-1 (%) | Diff Med ed_0-2 (%) |
-|------|-------------------|-------------------|-------------------|---------------------|---------------------|
-| 1 | 21.24 | 21.48 | 22.467 | 101.12% | 105.77% |
-| 2 | 41.08 | 41.45 | 42.074 | 100.91% | 102.43% |
-| 4 | 80.46 | 80.70 | 80.76 | 100.31% | 100.38% |
-| 8 | 156.56 | 157.07 | 157.06 | 100.33% | 100.32% |
-| 16 | 306.03 | 306.28 | 305.49 | 100.08% | 99.82% |
-| 32 | 598.57 | 598.47 | 598.01 | 99.98% | 99.91% |
-| | | | Average | 100% | 101% |
-
-<p align="center"><img src="assets/img_bp_plus_creation_extension_degrees.png" width="550" /></p>
+## Sum optimization
 
-### Aggregated 64-bit range proof verification
+From Equation (3), the verifier must compute $\sum\_i d\_i$.
+Because the vector $\vec{d}$ contains $mn$ elements by Equations (1) and (2), computing the sum naively means building and adding all $mn$ terms, which is slow.
+The implementation instead expresses this sum in terms of a partial sum of a geometric series, which is much cheaper to compute; we describe this here.
 
-#### BP vs. BP+ (verification)
-
-BP+ verification showed gains for smaller aggregation sizes compared to BP, but is slower for larger aggregation sizes.
-
-| Size | BP Med (ms) | BP+ Med (ms) | Diff Med (%) |
-|------|-------------|--------------|--------------|
-| 1 | 2.34 | 2.17 | 93% |
-| 2 | 3.76 | 3.71 | 99% |
-| 4 | 6.44 | 6.18 | 96% |
-| 8 | 11.10 | 10.96 | 99% |
-| 16 | 17.57 | 19.52 | 111% |
-| 32 | 33.69 | 36.97 | 110% |
-| | | Average | 101% |
-
-<p align="center"><img src="assets/img_bp_vs_bpplus_verification.png" width="550" /></p>
-
-#### BP+ extension degrees (verification)
-
-Extended commitments add virtually no overhead to verification time.
-
-| Size | BP+ Med ed_0 (ms) | BP+ Med ed_1 (ms) | BP+ Med ed_2 (ms) | Diff Med ed_0-1 (%) | Diff Med ed_0-2 (%) |
-|------|-------------------|-------------------|-------------------|---------------------|---------------------|
-| 1 | 2.17 | 2.20 | 2.20 | 102% | 102% |
-| 2 | 3.71 | 3.74 | 3.76 | 101% | 101% |
-| 4 | 6.18 | 6.26 | 6.28 | 101% | 102% |
-| 8 | 10.96 | 11.05 | 10.97 | 101% | 100% |
-| 16 | 19.52 | 19.66 | 19.51 | 101% | 100% |
-| 32 | 36.97 | 36.99 | 36.87 | 100% | 100% |
-| | | | Average | 101% | 101% |
-
-
-<p align="center"><img src="assets/img_bp_plus_verification_extension_degrees.png" width="550" /></p>
-
-### Batched 64-bit single range proof verification
-
-Batched verification shows significant gains when compared to linear verification.
-
-| Batch size | BP+ linear (ms) | BP+ ext_deg 0 (ms) | BP+ ext_deg 1 (ms) | BP+ ext_deg 2 (ms) | Diff (%) | Gains (%) |
-|------------|-----------------|--------------------|--------------------|--------------------|----------|-----------|
-| 1 | 2.17 | 2.17 | 2.18 | 2.20 | 100% | 0% |
-| 2 | 4.34 | 2.73 | 2.73 | 2.76 | 63% | 37% |
-| 4 | 8.68 | 3.82 | 3.82 | 3.80 | 44% | 56% |
-| 8 | 17.36 | 5.74 | 5.76 | 5.75 | 33% | 67% |
-| 16 | 34.72 | 9.57 | 9.60 | 9.74 | 28% | 72% |
-| 32 | 69.44 | 17.10 | 17.06 | 17.05 | 25% | 75% |
-| 64 | 138.89 | 32.04 | 32.06 | 31.85 | 23% | 77% |
-| 128 | 277.77 | 60.56 | 60.75 | 60.71 | 22% | 78% |
-| 256 | 555.55 | 118.55 | 118.69 | 119.15 | 21% | 79% |
-
-<p align="center"><img src="assets/img_bp_plus_batched_zoomed.png" width="550" /></p>
-
-<p align="center"><img src="assets/img_bp_plus_batched.png" width="550" /></p>
+We first recall the following [identity](https://mathworld.wolfram.com/GeometricSeries.html) for the partial sum of a geometric series for $r \neq 1$:
+\\[
+ \sum\_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r}
+\\]
 
-### Batched 64-bit single range proof mask recovery
+Next, we note that from Equation (2), we have
+\\[
+ \sum\_{i=0}^{mn-1} (d\_j)\_i = \sum\_{k=0}^{n-1} 2^k
+\\]
+for all $0 \leq j < m$.
 
-Mask-recovery-only mode is linear and does not benefit from batched operations; batched recovery is suboptimal.
+Given these facts, we can express the required sum of the elements of $\vec{d}$ as follows:
+\\[
+\begin{align*}
+\langle \vec{1}^{mn}, \vec{d} \rangle &= \sum\_{i=0}^{mn-1} d\_i \\\\
+&= \sum\_{i=0}^{mn-1} \left\( \sum\_{j=0}^{m-1} z^{2(j+1)} (d\_j)\_i \right) \\\\
+&= \sum\_{j=0}^{m-1} z^{2(j+1)} \left\( \sum\_{i=0}^{mn-1} (d\_j)\_i \right) \\\\
+&= \sum\_{j=0}^{m-1} z^{2(j+1)} \left\( \sum\_{k=0}^{n-1} 2^k \right) \\\\
+&= (2^n - 1) \sum\_{j=0}^{m-1} z^{2(j+1)}
+\end{align*}
+\\]
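+The final step applies the geometric series identity above with $r = 2$, which gives $\sum\_{k=0}^{n-1} 2^k = 2^n - 1$.
+The result requires a sum of only $m$ even powers of $z$, which can be computed iteratively.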
 
-| Batch size | No mask ed_0 (ms) | Mask only ed_0 (ms) | Linear mask only (ms) | Linear vs. mask only (%) | No mask vs. mask only (%) |
-|------------|-------------------|---------------------|-----------------------|--------------------------|---------------------------|
-| 1 | 2.10 | 0.22 | 0.22 | 100.0% | 10.4% |
-| 2 | 2.59 | 0.43 | 0.44 | 102.2% | 16.5% |
-| 4 | 3.59 | 0.90 | 0.87 | 97.0% | 25.0% |
-| 8 | 5.46 | 1.87 | 1.74 | 93.4% | 34.2% |
-| 16 | 9.31 | 3.78 | 3.49 | 92.3% | 40.6% |
-| 32 | 16.54 | 9.31 | 6.97 | 74.9% | 56.3% |
-| 64 | 30.39 | 15.27 | 13.95 | 91.3% | 50.3% |
-| 128 | 58.25 | 30.60 | 27.90 | 91.2% | 52.5% |
-| 256 | 113.18 | 61.36 | 55.80 | 90.9% | 54.2% |
-
-<p align="center"><img src="assets/img_bp_plus_batched_masks.png" width="550" /></p>
-
-| Date | Change | Author |
-|:------------|:------------------|:-------|
-| 7 Dec 2022 | First draft | Aaron |
-| 13 Jan 2022 | Performan updates | brianp |
 
-## References
-- [1] [Bulletproofs: Short Proofs for Confidential Transactions and More](https://eprint.iacr.org/2017/1066/20220414:014622)
-- [2] [Bulletproofs+: Shorter Proofs for Privacy-Enhanced Distributed Ledger](https://eprint.iacr.org/2020/735/20200618:154806)
-- [3] [Dalek's Bulletproofs docs](https://doc-internal.dalek.rs/bulletproofs/index.html) 
-- [4] [Bulletproofs+ Audit report](https://suyash67.github.io/homepage/assets/pdfs/bulletproofs_plus_audit_report_v1.1.pdf)
+| Date | Change | Author |
+|:------------|:--------------------|:-------|
+| 7 Dec 2022 | First draft | Aaron |
+| 13 Jan 2022 | Performance updates | brianp |
+| 20 Jul 2023 | Sum optimization | Aaron |
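
The closed form derived in the "Sum optimization" section added by this diff can be sanity-checked numerically. The following is a minimal Python sketch, not the implementation's code: it assumes a small prime modulus `P` in place of the real scalar field, and the helper names `d_vector`, `naive_sum`, and `optimized_sum` are hypothetical. It builds $\vec{d}$ element by element using $d\_{jn+i} = z^{2(j+1)} 2^i$, sums it directly, and compares the result against $(2^n - 1) \sum\_{j=0}^{m-1} z^{2(j+1)}$.

```python
# Toy prime modulus standing in for the scalar field order
# (an assumption for illustration only).
P = 2**61 - 1


def d_vector(z, m, n):
    """Build d with d[j*n + i] = z^(2(j+1)) * 2^i (mod P)."""
    d = []
    z_sq = z * z % P
    z_pow = z_sq  # z^(2(j+1)) for j = 0
    for _ in range(m):
        two_pow = 1  # 2^i for i = 0
        for _ in range(n):
            d.append(z_pow * two_pow % P)
            two_pow = two_pow * 2 % P
        z_pow = z_pow * z_sq % P
    return d


def naive_sum(z, m, n):
    """Sum all m*n elements of d directly."""
    return sum(d_vector(z, m, n)) % P


def optimized_sum(z, m, n):
    """Compute (2^n - 1) * sum_{j=0}^{m-1} z^(2(j+1)) (mod P)."""
    z_sq = z * z % P
    z_pow = z_sq
    acc = 0
    for _ in range(m):
        acc = (acc + z_pow) % P
        z_pow = z_pow * z_sq % P
    return (pow(2, n, P) - 1) * acc % P


if __name__ == "__main__":
    # Example parameters: m = 4 aggregated proofs of n = 64 bits each.
    z, m, n = 12345, 4, 64
    print(naive_sum(z, m, n) == optimized_sum(z, m, n))
```

The optimized form loops over only $m$ powers of $z$ rather than all $mn$ elements of $\vec{d}$, which is where the saving comes from.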