Skip to content

The Additive NTT

Binary Reed–Solomon codes, and an efficient algorithm to encode them

The most important code we come across in SNARKs is the Reed–Solomon code. This is a simple—and yet extremely powerful and effective—code, and appears not just in Binius but also in DEEP-ALI-based SNARKs like Plonky3 and Stwo.

As we've suggested, error-correcting codes have two desirable properties—distance and rate—which are at odds with each other in general. Indeed, a mathematical result called the Singleton bound formalizes the precise way in which these desiderata are in tension. Reed–Solomon codes meet the Singleton bound, in the sense that the inequality becomes an equality; they thus negotiate the tension between these attributes as favorably as it is possible to do.

Reed–Solomon codes are not just coding-theoretically nice. They also support an extremely rich set of auxiliary protocols, designed to achieve such ends as proximity testing (determining whether a supplied word is close to a codeword or not, by just reading from it at a few places). The crucial protocol here is FRI (short for fast Reed–Solomon interactive oracle proof of proximity), due to Ben-Sasson, Bentov, Horesh and Riabzev. We will use FRI in our evaluation protocol.

Review of Reed–Solomon Codes

Mathematically, each Reed–Solomon code depends on a field F\mathbb{F}, a message length kk and a block length nn. We first need to pick a domain SFS \subset \mathbb{F} of size S=n|S| = n (as a side effect, we need Fn| \mathbb{F} | \geq n for this to make sense).

Our message space will be the set of univariate polynomials P(X)=a0+a1X++ak1Xk1P(X) = a_0 + a_1 \cdot X + \cdots + a_{k - 1} \cdot X^{k - 1} of degree less than kk with coefficients in F\mathbb{F}. The word space will be the set of all nn-tuples of F\mathbb{F}-elements (b0,,bn1)(b_0, \ldots , b_{n - 1}) (here, we can think of associating these with the elements of SS). The codewords will be those nn-tuples which arise as the set of evaluations of some polynomial P(X)P(X) as above on SS: that is, those of the form (P(x0),,P(xn1))(P(x_0), \ldots , P(x_{n - 1})). Here, (x0,,xn1)(x_0, \ldots , x_{n - 1}) is an enumeration of SS's elements.

The Binary Case

In Binius, Reed–Solomon codes also play a central role. On the other hand, since we work with binary fields, things get significantly trickier. The first question that arises has to do with encoding: the actual algorithm which takes as input a message and outputs its corresponding codeword. As should be clear from the above, this is a problem of polynomial evaluation: how do we take the kk coefficients of some polynomial, and output the nn values that that polynomial takes on some domain? We can trivially do this in Θ(kn)\Theta(k \cdot n) steps; the goal is to do better than that.

In the classical, prime-field setting, encoding is very standard: the key algorithm is the NTT (number-theoretic transform), a close relative of the fast Fourier transform. The NTT runs in Θ(log2(k)n)\Theta(\log_2(k) \cdot n) steps. Here, the set SFS \subset \mathbb{F} is a multiplicative subgroup SFS \subset \mathbb{F}^* of power-of-two order nn (the field F\mathbb{F} needs to be chosen in such a way that this kind of subgroup exists).

In binary fields, primitive multiplicative roots of unity of power-of-two order don't exist, and things get significantly trickier. Indeed, it was not known until 2014—i.e., until the arrival of a key work by Lin, Chung and Han [LCH14]—how one might encode in the binary setting in such a way as to recover the Θ(log2(k)n)\Theta(\log_2(k) \cdot n) time complexity of the classical algorithm.

Even the Lin–Chung–Han algorithm has a catch. While classical Reed–Solomon encoding algorithms understand their inputs (a0,,ak1)(a_0, \ldots , a_{k - 1}) as the coefficients in the standard monomial basis of the polynomial being evaluated, Lin, Chung and Han actually need to introduce a special polynomial basis: the so-called novel polynomial basis. These are specially constructed polynomials 1,X1(X),,Xk1(X)1, X_1(X), \ldots, X_{k - 1}(X) unequal to the simple monomials 1,X,,Xk11, X, \ldots, X^{k - 1}. The polynomial we actually end up evaluating is not a0+a1X++ak1Xk1a_0 + a_1 \cdot X + \cdots + a_{k - 1} \cdot X^{k - 1}, but rather a0+a1X1(X)++ak1Xk1(X)a_0 + a_1 \cdot X_1(X) + \cdots + a_{k - 1} \cdot X_{k - 1}(X).

Finally, the set SFS \subset \mathbb{F} we evaluate on is not a multiplicative subgroup, but rather an F2\mathbb{F}_2-subspace of F\mathbb{F} (recall that when the characteristic of F\mathbb{F} is 2, we can view it as a vector space over its own subfield F2\mathbb{F}_2).

The Algorithm

Here, we sketch in slightly more technically precise terms what Lin, Chung and Han's algorithm [LCH14] achieves. We fix parameters \ell and R\mathcal{R}. Ultimately, the code we spin up is going to be Reed–Solomon, with message length k=2k = 2^\ell and block length n=2+Rn = 2^{\ell + \mathcal{R}}; we will write RSS,Tτ[2,2+R]\mathsf{RS}_{S, \mathcal{T}_\tau}[2^\ell, 2^{\ell + \mathcal{R}}] for this code. We fix a binary tower field Tτ\mathcal{T}_\tau, and write β0,,β2τ1\beta_0, \ldots , \beta_{2^\tau - 1} for an F2\mathbb{F}_2-basis of it. From this F2\mathbb{F}_2-basis of Tτ\mathcal{T}_\tau, [LCH14] tells us how to construct a further basis of polynomials—called the novel polynomial basis—which we will write as X0(X),,X21(X)X_0(X), \ldots, X_{2^\ell - 1}(X). Each Xi(X)X_i(X) will be a polynomial over Tτ\mathcal{T}_\tau of degree ii. In fact, these polynomials will themselves constitute a Tτ\mathcal{T}_\tau-basis of the 22^\ell-dimensional vector space Tτ[X]2\mathcal{T}_\tau[X]^{\prec 2^\ell} (i.e., of the set of polynomials over Tτ\mathcal{T}_\tau of degree less than 22^\ell).

Finally, we write STτS \subset \mathcal{T}_\tau for the F2\mathbb{F}_2-span of the basis prefix β0,,β+R1\beta_0, \ldots , \beta_{\ell + \mathcal{R} - 1}.

The additive NTT is an algorithm that takes as input a list (a0,,a21)(a_0, \ldots , a_{2^\ell - 1}) and returns a list (b0,,b2+R1)(b_0, \ldots , b_{2^{\ell + \mathcal{R}} - 1}) defined in the following way. We write P(X)=a0+a1X1(X)++X21(X)P(X) = a_0 + a_1 \cdot X_1(X) + \cdots + X_{2^\ell - 1}(X) for the polynomial whose coefficients in the novel basis are the given input (a0,,a21)(a_0, \ldots , a_{2^\ell - 1}). What we return is the list (P(x0),,P(x2+R1))(P(x_0), \ldots , P(x_{2^{\ell + \mathcal{R}} - 1})) of PP's evaluations on SS. Here, (x0,,x2+R1)(x_0, \ldots , x_{2^{\ell + \mathcal{R}} - 1}) is again an enumeration of SS. In practice, we will get this by taking the lexicographically ordered basis combinations of β0,,β+R1\beta_0, \ldots , \beta_{\ell + \mathcal{R} - 1}.

We refrain here from describing how the additive NTT actually works (i.e., how it achieves the above task). In short, Lin, Chung and Han [LCH14] rely on a totally new sort of polynomial decomposition, specific to the novel basis polynomials. In [DP24, § 3], we show that Lin–Chung–Han's NTT is can actually be re-characterized in terms more familiarly Fourier-theoretic.