- Jade Xie jade@secbit.io
- Yu Guo yu.guo@secbit.io
This article is mainly inspired by the blog post STIR: Reed–Solomon Proximity Testing with Fewer Queries by the authors of the STIR paper and the presentation ZK11: STIR: Reed–Solomon Proximity Testing with Fewer Queries - Gal Arnon & Giacomo Fenzi, introducing the STIR protocol.
Like FRI, STIR also solves the Reed-Solomon Proximity Testing problem, but compared to FRI, it has lower query complexity, which reduces the size of the argument and the hash complexity of the Verifier. So how does STIR achieve this? The answer is in the name itself, STIR stands for Shift To Improve Rate, and the core idea of STIR is to improve the rate by shifting the evaluation domain in each round. Intuitively, the rate actually characterizes the proportion of true information contained in the codeword. As the rate decreases, the true information decreases, corresponding to an increase in redundancy in the codeword, making it easier for the Verifier to test the proximity of a received message to the encoding space. In other words, the Verifier's testing ability becomes stronger. This means that the Verifier only needs fewer queries to achieve the target security. Let's look at how STIR reduces the rate by comparing FRI and STIR.
For a finite field
The goal of the protocol is to solve the Reed-Solomon Proximity Testing problem, where the Verifier can obtain a function
-
$f$ is a Reed-Solomon codeword, i.e.,$f \in \mathrm{RS}[\mathbb{F},\mathcal{L},d]$ ; -
$f$ is$\delta$ -far from all codewords in the Reed-Solomon encoding space$\mathrm{RS}[\mathbb{F},\mathcal{L},d]$ in relative Hamming distance, i.e.,$\Delta(f, \mathrm{RS}[\mathbb{F},\mathcal{L},d]) > \delta$ .
We consider the above Reed-Solomon Proximity Testing problem under the IOPP (Interactive Oracle Proofs of Proximity) model, where the Verifier can interact with the Prover and obtain the Prover's messages through an oracle, as shown in the following figure.
After a series of interactions with the Prover, the Verifier has two situations:
- If
$f \in \mathrm{RS}[\mathbb{F},\mathcal{L},d]$ , the Verifier accepts :) - If
$\Delta(f, \mathrm{RS}[\mathbb{F},\mathcal{L},d]) > \delta$ , the Verifier rejects with high probability :(
We compare the FRI protocol and the STIR protocol in the case of
In the FRI protocol, assume that
We can see that in each round, the rate
In the STIR protocol, note that
If
When we compile the above IOPP into a SNARK, we need to use the BCS transformation ([BCS16], BCS transformation), which consists of two steps:
- Merkle commit the Prover's messages, and when the Verifier wants to query, open these commitments. This step transforms the IOPP into a succinct interactive argument.
- Use the Fiat-Shamir transform to convert the succinct interactive argument obtained in the first step into a non-interactive one.
In the BCS transformation, the IOPP needs to have a strong soundness property called round-by-round soundness, which requires the IOPP to have a relatively small soundness error in each round, which is a stronger requirement than requiring the entire IOPP to have a relatively small soundness error. Let's assume that the bound for the round-by-round soundness error is
-
The query complexity of FRI is:
$$ O \left( \lambda \cdot \frac{\log d}{- \log \sqrt{\rho}} \right) $$
-
The query complexity of STIR is:
$$ O \left( \lambda \cdot \log \left( \frac{\log d}{- \log \sqrt{\rho}} \right) + \log d \right) $$
In the query complexity of STIR,
Figure 2 in Section 6.4 of the paper [ACFY24] gives the experimental results comparing FRI and STIR. We can see that the reduction in query complexity of STIR results in better performance in terms of argument size and the number of hashes computed by the Verifier compared to FRI. This is understandable, as fewer query complexity means:
- A reduction in the overall argument size is obvious.
- With fewer queries, the Verifier needs to open fewer Merkle commitments, resulting in fewer hash computations.
Here we first introduce several powerful tools for RS encoding, which can help us understand the specific STIR protocol construction.
For a function
The first property is distance preservation.
- If the function
$f$ before folding is in$\mathrm{RS}[\mathbb{F}, \mathcal{L}, d]$ , then for any randomly chosen$r \in \mathbb{F}$ , the folded function is still an RS code, i.e.,$f_r \in \mathrm{RS}[\mathbb{F}, \mathcal{L}^k, d/k]$ . - For
$\delta \in (0, 1 - \sqrt{\rho})$ , if$f$ is$\delta$ -far from$\mathrm{RS}[\mathbb{F}, \mathcal{L}, d]$ , then with probability at least$1 - \mathrm{poly}(|\mathcal{L}|)/\mathbb{F}$ over the choice of random$r$ ,$f_r$ is$\delta$ -far from$\mathrm{RS}[\mathbb{F}, \mathcal{L}^k, d/k]$ .
This property ensures that we can boldly perform folding. If the Prover cheats and provides a function that is
The second property is called Local, which means that to obtain the value of the folded function at any point, we only need to query the values of
For functions
where
An important property of this function is Consistency. Assuming
- If
$f \in \mathrm{RS}[\mathbb{F}, \mathcal{L}, d]$ , it is an evaluation of a polynomial of degree less than$d$ on$\mathcal{L}$ , and this polynomial is consistent with$p$ on$S$ , then$\mathrm{Quotient}(f, S, p) \in \mathrm{RS}[\mathbb{F}, \mathcal{L}, d - |S|]$ . - If for any polynomial
$\hat{u}$ of degree less than$d$ that is$\delta$ -close to$f$ ,$\hat{u}$ is not completely consistent with$p$ on$S$ , i.e., for some$a \in S$ ,$\hat{u}(a) \neq p(a)$ , then$\mathrm{Quotient}(f, S, p)$ is$\delta$ -far from$\mathrm{RS}[\mathbb{F}, \mathcal{L}, d - |S|]$ .
Regarding point 2 above, for codewords
Note that here we require that for any
The
Then we just need to prove
The
Out of Domain Sampling is a powerful tool that can help us limit the number of codewords within
Generally, for a function
This can be explained using the fundamental theorem of algebra. We only need to prove that the probability of two different codewords
First, fix two different codewords
Suppose
How to restrict that the
In this section, we will apply the three tools mentioned earlier to delve into one iteration of the STIR protocol.
Objective:
- Initially given a function
$f$ , we want to prove that it is in$\mathrm{RS}[\mathbb{F},\mathcal{L},d]$ , where$\mathcal{L} =\langle \omega \rangle$ . - After one iteration, prove that function
$f' \in \mathrm{RS}[\mathbb{F},\mathcal{L}',d/k]$ , where$\mathcal{L}' = \omega \cdot \langle \omega^2 \rangle$ .
That is, function
Regarding the evaluation domains
The
The protocol flow for one iteration is shown in the following figure:
- Sample folding randomness: The Verifier first randomly selects a number
$r^{\mathrm{fold}}$ from$\mathbb{F}$ , which will be used to fold function$f$ . - Send folded function: The Prover sends the folded function
$g: \mathcal{L}' \rightarrow \mathbb{F}$ . If the Prover is honest, then function$g$ is the evaluation of polynomial$\hat{g}$ on$\mathcal{L}'$ . Here, evaluation means that$g$ is completely consistent with$\hat{g}$ on$\mathcal{L}'$ , and polynomial$\hat{g}$ is obtained through$\mathrm{Fold}(f, r^{\mathrm{fold}})$ . First, use the random number$r^{\mathrm{fold}}$ to perform$k$ -fold on function$f$ , obtaining$\mathrm{Fold}(f, r^{\mathrm{fold}}) : \mathcal{L}^k \rightarrow \mathbb{F}$ . At this time, the range of the folded function is$\mathcal{L}^k$ , but we want it to take values on$\mathcal{L}'$ . We just need to extend the domain of$\mathrm{Fold}(f, r^{\mathrm{fold}})$ to$\mathcal{L}'$ , obtaining polynomial$\hat{g}: \mathcal{L}' \rightarrow \mathbb{F}$ , which has degree less than$d/k$ . - Out-of-domain sample: The Verifier takes a random number
$r^{\mathrm{out}}$ from$\mathbb{F}\backslash \mathcal{L}'$ and sends it to the Prover. - Out-of-domain reply: The Prover replies with
$\beta \in \mathbb{F}$ . If the Prover is honest, then$\beta := \hat{g}(r^{\mathrm{out}})$ .
📝 Notes The purpose of steps 3 and 4 here is to use Out of Domain Sampling to convert list decoding to unique decoding, that is, the Verifier selects a random number
$r^{\mathrm{out}}$ from$\mathbb{F} \backslash \mathcal{L}$ and requires the Prover to reply with$\beta$ .
- Shift queries: The Verifier selects
$t$ random numbers from$\langle \omega^k \rangle$ , i.e.,$\forall i \in [t],r_i^{\mathrm{shift}} \leftarrow \langle \omega^k \rangle$ . According to the Local property of the folding function, the Verifier can calculate$y_i := f_{\mathrm{fold}}(r_i^{\mathrm{shift}})$ by querying$f$ , where$f_{\mathrm{fold}} :=\mathrm{Fold}(f, r^{\mathrm{fold}})$ .
In step 2, the Prover sent
In steps 3 and 4, the Out-of-domain Sampling method is first used to limit the number of codewords within
Let's form a set
Define the next function
Due to the Local property of the Quotient function, to calculate the values of
At this point, we just need to test whether
Looking closely at the formula for
In this section, we will perform a soundness analysis for one iteration, that is, if the Prover cheats and
Proposition 1 [ACFY24, Lemma 1] If
Proof idea:
- According to the distance-preserving property of the folding function, the function
$f_{r^{\mathrm{fold}}} := \mathrm{Fold}(f, r^{\mathrm{fold}})$ obtained after folding$f$ with the random number$r^{\mathrm{fold}}$ is$\delta$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L}^k,d/k]$ with probability greater than$1 - \mathrm{poly}(|\mathcal{L}|/|\mathbb{F}|)$ . - According to the property of Out-of-domain Sampling, the probability that
$g$ has at most one codeword$\hat{u}$ within$1 - \sqrt{\rho'}$ range satisfying$\hat{u}(r^{\mathrm{out}}) = \beta$ is greater than$1 - \mathrm{poly}(|\mathcal{L}|)/|\mathbb{F}|$ .
Now let's analyze point 2. The function
Therefore, the probability that
If both item 1 and item 2 hold, this probability is greater than
Let's discuss two cases:
-
If there is no codeword satisfying the requirement in item 2, that is, there is no codeword satisfying
$\hat{u}(r^{\mathrm{out}}) = \beta$ within$1 - \sqrt{\rho'}$ range of$g$ , and according to the construction of the protocol,$p(r^{\mathrm{out}}) = \beta$ . Therefore, for any codeword within$1 - \sqrt{\rho'}$ range of$g$ , we have$\hat{u}(r^{\mathrm{out}}) \neq p(r^{\mathrm{out}})$ . Since$$ f' := \mathrm{Quotient}(g, \mathcal{G}, p) = \frac{g(x) - \hat{p}(x)}{\prod_{a \in \mathcal{G}}(X - a)}. $$
According to the consistency of the Quotient function, at this time
$\hat{u}$ and$p$ are not completely consistent on$\mathcal{G}$ , so$f' = \mathrm{Quotient}(f, \mathcal{G}, p)$ is$(1 - \sqrt{\rho'})$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L}',d/k- |\mathcal{G}|]$ . -
If there exists a codeword
$\hat{u}$ satisfying the requirement in item 2, there is already a codeword satisfying$\hat{u}(r^{\mathrm{out}}) = \beta$ within$1 - \sqrt{\rho'}$ range of$g$ . According to$$ f' := \mathrm{Quotient}(g, \mathcal{G}, p) = \frac{g(x) - \hat{p}(x)}{\prod_{a \in \mathcal{G}}(X - a)}. $$
Now
$\hat{u}(r^{\mathrm{out}}) = \beta = p(r^{\mathrm{out}})$ is already satisfied. If for all$i \in [t]$ , we have$\hat{u}(r_i^{\mathrm{shift}}) = y_i = p(r_i^{\mathrm{shift}})$ , then$f' = \mathrm{Quotient}(f, \mathcal{G}, p)$ is not more than$(1 - \sqrt{\rho'})$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L}',d/k- |\mathcal{G}|]$ . Otherwise, according to the consistency of the Quotient function, as long as for some$i$ we have$\hat{u}(r_i^{\mathrm{shift}}) \neq y_i$ , at this time$\hat{u}(r_i^{\mathrm{shift}}) \neq p(r_i^{\mathrm{shift}})$ , it will cause$f'$ to be$(1 - \sqrt{\rho'})$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L}',d/k- |\mathcal{G}|]$ .Since item 1 holds, we have
$\Delta(f_{r^{\mathrm{fold}}}, \mathrm{RS}[\mathbb{F}, \mathcal{L}^k, d/k]) \ge \delta$ for the folded function, so$$ \begin{aligned} \Pr \left[\forall i \in [t], \hat{u}(r_i^{\mathrm{shift}}) = y_i \right] & = \Pr \left[\forall i \in [t], \hat{u}(r_i^{\mathrm{shift}}) = f_{r^{\mathrm{fold}}}(r_i^{\mathrm{shift}}) \right] \ & \le (1 - \delta)^t. \end{aligned} $$
Therefore, the probability that
$f'$ is (approximately)$(1 - \sqrt{\rho'})$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L}',d/k- |\mathcal{G}|]$ is at least$1 - (1 - \delta)^t$ .
Thus, Proposition 1 is proved.
In fact, the round-by-round soundness error of the protocol is approximately
Now there's a small problem left to solve. According to the definition of function
We can see that, strictly speaking, this converts the test of
Generally, let's assume that the function we want to perform degree correction on is
- If
$f \in \mathrm{RS}[\mathbb{F},\mathcal{L},d/k]$ , then$f^* \in \mathrm{RS}[\mathbb{F},\mathcal{L},d/k]$ . - If
$f$ is$\delta$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L},d/k]$ , then with high probability,$f^*$ is also$\delta$ -far from$\mathrm{RS}[\mathbb{F},\mathcal{L},d/k]$ . - Queries to
$f^*$ can be efficiently computed through queries to$f$ .
The STIR paper ([ACFY24], Section 2.3) proposes a method that not only satisfies the above three conditions but also uses the method of summing geometric series to make the calculation in item 3 more efficient.
The method is to randomly sample an element
where
According to the construction of
For
Next, let's analyze item 3. From equation (2), we can see that to calculate the value of
Using the geometric series sum formula for
For the more complex
This method can be extended to multiple functions of different degrees. For
Similar to the degree correction of a single function above, for $\delta < \min { 1 - \sqrt{\rho}, 1 - (1 + 1/d^) \cdot \rho}$, if any $f_i$ is $\delta$-far from $\mathrm{RS}[\mathbb{F},\mathcal{L},d_i]$, then $f^$ is
STIR changes the evaluation domain of the function in each round, changing the original
In the construction of the STIR protocol, several powerful tools for RS encoding are used, making the entire protocol efficient and secure.
- First, consistent with the FRI protocol, the function
$f$ is$k$ -folded, but the resulting function needs to extend its evaluation domain from$\mathcal{L}^k$ to$\mathcal{L}'$ . According to the distance-preserving property of the folding function, we can confidently perform this folding. - Then, to reduce the Verifier's work, the Out of Domain Sampling method is used to convert list decoding to unique decoding. This is where the Verifier selects a random number
$r^{\mathrm{out}}$ from$\mathbb{F} \backslash \mathcal{L}$ and requires the Prover to reply with$\beta$ in the protocol. - At this point, after changing the evaluation domain to
$\mathcal{L}'$ , the problem faced is that the Verifier can only query the values of the$k$ -folded function $\mathrm{f}{r^{\mathrm{fold}}}$ on $\mathcal{L}^k$. Fortunately, the powerful Quotient tool can be used to constrain the function sent by the Prover to be consistent with the folded function on $\mathcal{L}^k$. At this time, the Verifier selects $t$ random numbers $r{i}^{\mathrm{shift}}$ from$\mathcal{L}^k$ for querying. - Finally, combine
$r^{\mathrm{out}}$ and$r_{i}^{\mathrm{shift}}$ , and use the Quotient tool to constrain the values sent by the Prover at these points to be correct.
Combining these tools, a soundness analysis of one iteration of the STIR protocol was performed. In fact, we can obtain that the round-by-round soundness error of STIR is
Finally, to raise the degree of
- [ACFY24] Gal Arnon, Alessandro Chiesa, Giacomo Fenzi, and Eylon Yogev. "STIR: Reed-Solomon proximity testing with fewer queries." In Annual International Cryptology Conference, pp. 380-413. Cham: Springer Nature Switzerland, 2024.
- [BCIKS20] Eli Ben-Sasson, Dan Carmon, Yuval Ishai, Swastik Kopparty, and Shubhangi Saraf. Proximity Gaps for Reed–Solomon Codes. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science, pages 900–909, 2020.
- [BCS16] Eli Ben-Sasson, Alessandro Chiesa, and Nicholas Spooner. "Interactive Oracle Proofs". In: Proceedings of the 14th Theory of Cryptography Conference. TCC '16-B. 2016, pp. 31–60.
- [BGKS20] Eli Ben-Sasson, Lior Goldberg, Swastik Kopparty, and Shubhangi Saraf. "DEEP-FRI: Sampling Outside the Box Improves Soundness". In: Proceedings of the 11th Innovations in Theoretical Computer Science Conference. ITCS '20. 2020, 5:1–5:32.
- STIR: Reed–Solomon Proximity Testing with Fewer Queries
- Video: ZK11: STIR: Reed–Solomon Proximity Testing with Fewer Queries - Gal Arnon & Giacomo Fenzi