Overview of Binius

Some introductory content and remarks

Binius is a state-of-the-art SNARK. SNARKs are complicated—and a huge amount has been changing! In fact, much of what you think you already know about SNARKs might have to be revisited. Binius does things rather differently than other SNARKs do—and everything from building with us to our cryptography is likely to prove fresh and new. In this site area, we drop a crash course on Binius, targeting those seeking to learn the basics and start building.

What is a SNARK?

Roughly, a SNARK (a succinct, noninteractive argument of knowledge) can be thought of as a "proof of a computation". Let's say you happen to be interested in learning the output of some computation—a costly, long-running one let's say, or one which consumes a large amount of "auxiliary advice data". Instead of forcing you to run the computation yourself, I, the prover, can instead just declare the output to you—and moreover hand to you a short, checkable "receipt" or a "proof" alongside my claimed output. This proof is something like an event ticket—you can "check" it, something like running a barcode scanner on it. When you do this, you either get a "good result" (one beep) or a "bad" one (two beeps).

If this system is designed well, it should satisfy two properties:

Succinctness. The receipt, or "barcode", should be much smaller than the full size of the input data would have been. In this way, I can save in how much data I have to transmit to you. It should also be much more efficient to check it—i.e., to run the "barcode scanner" on it—than the full-blown program would have been to run.
Soundness. It should also be a good proxy for my honesty. That is, imagine that I try to trick you by pawning off a wrong or false output to you. If the system is done right, then it should become impossible for me in this case to create a proof (barcode) which causes your checker to accept (beep once). Put differently, if your proof does accept, then you should be as good as convinced that I was honest.

Taking these two attributes together, we see that SNARKs are a great tradeoff for you, the verifier. The amount of work you have to do is much less: you just check my small barcode, instead of running the full computation. On the other hand, the assuredness you get at the end of the day is essentially as good as if you had run the computation yourself.

Now, we haven't said anything yet about how to actually construct this barcode, what it actually consists in, how your scanner is supposed to work, or why it is a good proxy for my honesty. In fact, this is where the whole science begins, and leads to a massive rabbit hole of elegance and complexity.

For now, in this site area, we will explain how to use Binius to construct barcode systems for computations we care about—like Ethereum block proving, for example.

Binary Fields

As we've already mentioned, Binius introduces many new concepts and techniques to the SNARK landscape. If we had to explain these differences in one phrase, it'd be our use of binary fields. As we explain later in this site, our fundamental datatypes are binary strings of power-of-2 width. This makes Binius actually a lot more like standard physical computers, and a lot less like the SNARKs you've worked with before. Indeed, fixed-width binary strings—like single bits, 32-bit words, and 8-bit bytes—have been the fundamental units of computers since their early days.

Why did SNARKs ever use anything else? The short answer is that, for technical reasons, we have to work with a mathematical object called a field—essentially a number system where we can add, multiply and divide. The standard unsigned integer datatypes are actually not fields. We can of course add with no problem—we all know that if I add 0xFFFFFFFF plus 0x00000001, I'll get 0x00000000, with an overflow modulo $2^{32}$ . On the other hand, I can't always divide. Why not? To see this, note that there does not exist an integer $x$ such that $x \cdot 2 \equiv 1 \pmod{2^{32}}$ holds. (Whatever I multiply 2 by, the result will be even, and will remain so even after we reduce modulo $2^{32}$ .)

For this reason, SNARKs have largely used prime fields, the next-simplest thing. Here, we take a prime number $p$ , like the Baby Bear prime $2^{31} - 2^{27} + 1$ . The set of integers $\{0, \ldots , p - 1\}$ is a field, provided that we understand all addition and multiplication with wraparound modulo $p$ . It is a fact of basic algebra that sets of this form are fields. All production-oriented SNARKs in use today, like Plonky3 and Stwo, use prime fields of roughly 32 bits.

In Binius, we use binary fields, fields that look a lot more like classical fixed-width binary registers. The tricky part is that when we multiply two 32-bit elements, we can't just do integer multiplication modulo $2^{32}$ —we have to do something more complicated. On the other hand, these fields give us huge efficiency dividends.

To see why, we need to talk about arithmetization. As we will explain at length shortly, arithmetization is how we capture real-life statements in SNARK design. Let's start with a computational task—like "prove that this Ethereum block is valid". We can't feed this task directly into a SNARK—whether binary or otherwise. Rather, we need to express it in the language of the "frontend" that the SNARK exposes.

In prime-field SNARKs, these front-ends all work modulo $p$ . This is like putting a round peg into a square hole: while the computation we care about (say, the Keccak-256 hash function) has nothing to do with prime fields, we must express it in a language which only recognizes prime fields. This entails a sort of efficiency loss.

As it happens, our arithmetization doesn't win just because of its binary datatypes. It also introduces an new, declarative computational paradigm, which we call M3 (multi-multiset matching). We describe M3 at length in this site area.

Multilinear Polynomials

Why don't other SNARKs use binary fields? Some in the past have. On the other hand, binary fields don't work compatibly with the technique central to those production-oriented SNARKs in common use today: the DEEP-ALI paradigm. In short, this paradigm only works with fields whose multiplicative groups have high two-adicity. Binary fields have none at all! Thus, univariate techniques won't work for us.

Instead, we revisit a much-older stream of research: based on the multivariate sumcheck protocol of Lund, Fortnow, Karloff, and Nisan [LFKN92]. To make this work, we have to introduce a lot of new techniques, including for the commitment of multilinears over binary fields. In the process, we lean on a number of existing works, including Ligero [AHIV23], HyperPlonk [CBBZ23] and BaseFold [ZCF24]. We describe the techniques of our cryptographic backend in the blueprint area of this site. Academic expositions of our techniques appear in the works [DP23] and [DP24].