Lasso Lookup
Lookup protocols are essential in modern SNARKs. At their core, they allow a sort of unstructured access. While a typical polynomial constraint might decree, for example, that one column be the element-wise product of two other columns—an algebraic relationship—a lookup allows us to prove that a column's values are merely present in some other column—and in an unstructured way, no less, in the sense that we don't care where in that latter column the first column's values appear.
As a brief technical explanation, a lookup argument pertains to a table column and a looked-up column . The statement is that all of 's values appear in (somewhere). Here, can be thought of as the global lookup table, which tells us which values are "allowed".
As it turns out, lookups allow us to "bootstrap" many flexible sorts of operations. For example, they let us achieve efficient unsigned integer multplication even in Binius's tower setting; this fact is explained at length in [DP23, § 5.3].
Here, we explain how to express Lasso naturally as an M3 instance.
Background
Binius achieves lookups in the binary tower setting using Lasso, a technique developed by Setty, Thaler and Wahby [STW23]. Under the hood, the essential idea here is offline memory checking, a technique which originates with Blum et al. [Blu+91]. In [DP23, § 4.4], we describe how to adapt Lasso's offline-memory-checking argument to the binary setting. Our treatment here follows that latter work.
We note that that work's Lasso rendition doesn't explicitly mention M3, tables, channels, or anything like that, but was nonetheless essentially those things. I.e., it's not much work to recast that treatment as an M3 instance.
The Instance Specification
We begin by describing the instance specification. For background, we fix a ground field and a length parameter . We are going to treat tables and looked-up columns both with values in , and both of length . (The requirement whereby these columns' lengths are equal can be relaxed.)
For technical reasons, we are moreover going to choose a further tower field ; we are going to choose the tower height (this is explained in [DP23, § 4.4]). Finally, we fix an element which generates the multiplicative group of units .
Channels
We will create just a single channel, of type signature . That is, the things we're allowed to flush into it (i.e., whether pushing or pulling) will be tuples, consisting of a -element and -element. In our setting here, these two items will have the semantic roles of an element and a timestamp, respectively.
Tables
Our first table consists simply of the statement . It is of height , and of type signature . This table is already fixed as of the beginning of the protocol.
We're going to create a single further table, of type signature . That is, it will have two columns, both valued in ; we will call these (for read timestamps) and (for final timestamps). Its height will be , the same as and .
We will allow the prover to populate the second table arbitrarily during its execution. This table will have only the constraint whereby be identically nonzero. Though this is technically not an algebraic constraint, we have a way in Binius of dealing with this (essentially, we run a grand product, and check that the product of its elements is nonzero). We thus suppress this technicality for now.
Flushing rules
We are going to have 4 main flushing rules.
- Pull . Note that this typechecks (it matches the type signature of the channel).
- Push . Here, we write for the trivial, transparent virtual column which simply takes the value identically everywhere.
- Pull . Again, we are pulling two committed things, and the types are right.
- Push . Here, we define to be the virtual column . This is a simple scaling operation, and is definitely a legal way to come up with a virtual column.
Boundary conditions
We're not going to have any boundary conditions in this M3 instance.
The Claim
The claim is that if satisfies the lookup table relation (that is, if every element of is contained somewhere in ) then there will be an easy way to populate the table in such a way as to make the attempt succeed.
Conversely, if does not satisfy the relation (i.e., some element of appears nowhere in ), then this will be unsatisfiable, in that no possible population of the table will cause the channel to balance.
Proving this rigorously turns out to be fairly tricky; this gets done in [DP23, Thm. 4.28]. Below, we are going to give some intuition for why this is true.
Worked Example
Here, we will work through an explicit example, and get some intuitions for why this works. Using the table and looked-up column from the boxed example above, we see that the total list of pushes and pulls will lead to a state like:
/¯\¯¯¯¯¯¯¯¯¯¯¯\
PUSH : : : PULL
----------> | | channel | ---------->
(5, 0) : : : (5, F₀)
(6, 0) \_/___________/ (6, F₁)
(7, 0) (7, F₂)
(8, 0) (8, F₃)
(8, R₀ + 1) (8, R₀)
(6, R₁ + 1) (6, R₁)
(6, R₂ + 1) (6, R₂)
(7, R₃ + 1) (7, R₃)
Here, we cheat a bit notationally, by "taking discrete logarithms" of the columns and with respect to . I.e., instead of writing , we instead write 0
, and more generally write i
instead of , for each . Note that we required that these columns be nonzero, so this makes sense.
Populating the tables
We have "dummied out" the values of and , since we (as the prover) don't know them yet. Adopting the prover's point of view, we now must ask: how should the prover populate the as-yet empty table in order to cause the above channel to balance? Indeed, we can perfectly anticipate which pushes and pulls the verifier is going to do, and what the picture will look like (it's exactly the one above).
Let's walk our way down the table , row by row. First, we hit the 8. We know that our only hope of cancelling out the verifier's initial push (8, 0)
(way on the top left) hinges on setting . On the other hand, as we do this, we simultaneously induce a push—or a "liability"—of (8, 1)
in the bottom left quadrant. We mentally remark this for later, since we will need to cancel it out (this is where will come in!).
Next up is 6. Here, we have a similar story: in order to cancel out the preemptive push (6, 0)
, we have to set . This has the effect of pulling (i.e., counterbalancing) (6, 0)
, but also of pushing the outstanding liability (6, 1)
. More on this later.
Now, on the next row of , we hit 6 again. Now, the relevant outstanding liability is not (6, 0)
(which already got cancelled above!) but rather (6, 1)
, the one we just pushed in the last step. We thus set . This has the effect of cancelling off (6, 1)
, but pushing (6, 2)
.
In the final row, we hit 7; here, we again need to set , since (7, 0)
is still outstanding.
At the very end of doing this, we have stricken off some (though not all) of the initial pushes in the top left, and replaced these with new pushes on the bottom left. At this final stage, we will use our final counts to counterbalance these stray pushes. It's not hard to check that by setting to be the total number of times got read throughout , we get exactly what we want. Thus we will set , , , and . At the very end, we get the channel state:
/¯\¯¯¯¯¯¯¯¯¯¯¯\
PUSH : : : PULL
----------> | | channel | ---------->
(5, 0) : : : (5, 0)
(6, 0) \_/___________/ (6, 2)
(7, 0) (7, 1)
(8, 0) (8, 1)
(8, 1) (8, 0)
(6, 1) (6, 0)
(6, 2) (6, 1)
(7, 1) (7, 0)
which is totally balanced, as desired.
Soundness
The above discussion shows that the prover can succeed if he is honest. What if he's dishonest? Why should the above game become impossible to win?
Let's assume for the sake of argument that 's final element were , instead of . Note that is not in the table . In this case, on the very bottom row of the above diagram, we would have the pair of elements on the push side and on the pull side. Since doesn't appear in , so there are no tuples whatsoever containing as their first component in the upper half of the diagram.
It seems intuitively clear that whatever the prover sets to, the pair of different tuples and will stick out like a sore thumb, with nothing to balance them. In fact, a problem like this is going to persist even of had many s in it (or other bad values, etc.): even after setting the relevant entries of cleverly, there is nothing the prover is going to be able to do to lead to a balanced channel overall.
The key subtlety is that if the prover can trigger an overflow in the "counter space", then it can get away with this after all. That is why we required above that be "sufficiently large" relative to (as well as that everything be nonzero!).