Skip to content

Lasso Lookup

Lookups within the M3 formalism

Lookup protocols are essential in modern SNARKs. At their core, they allow a sort of unstructured access. While a typical polynomial constraint might decree, for example, that one column be the element-wise product of two other columns—an algebraic relationship—a lookup allows us to prove that a column's values are merely present in some other column—and in an unstructured way, no less, in the sense that we don't care where in that latter column the first column's values appear.

As a brief technical explanation, a lookup argument pertains to a table column TT and a looked-up column UU. The statement is that all of UU's values appear in TT (somewhere). Here, TT can be thought of as the global lookup table, which tells us which values are "allowed".

As it turns out, lookups allow us to "bootstrap" many flexible sorts of operations. For example, they let us achieve efficient unsigned integer multplication even in Binius's tower setting; this fact is explained at length in [DP23, § 5.3].

Here, we explain how to express Lasso naturally as an M3 instance.

Background

Binius achieves lookups in the binary tower setting using Lasso, a technique developed by Setty, Thaler and Wahby [STW23]. Under the hood, the essential idea here is offline memory checking, a technique which originates with Blum et al. [Blu+91]. In [DP23, § 4.4], we describe how to adapt Lasso's offline-memory-checking argument to the binary setting. Our treatment here follows that latter work.

We note that that work's Lasso rendition doesn't explicitly mention M3, tables, channels, or anything like that, but was nonetheless essentially those things. I.e., it's not much work to recast that treatment as an M3 instance.

The Instance Specification

We begin by describing the instance specification. For background, we fix a ground field Tι\mathcal{T}_\iota and a length parameter \ell. We are going to treat tables TT and looked-up columns UU both with values in Tι\mathcal{T}_\iota, and both of length 22^\ell. (The requirement whereby these columns' lengths are equal can be relaxed.)

For technical reasons, we are moreover going to choose a further tower field Tζ\mathcal{T}_\zeta; we are going to choose the tower height ζlog(+1)\zeta \coloneqq \lceil \text{log}(\ell + 1) \rceil (this is explained in [DP23, § 4.4]). Finally, we fix an element αTζ\alpha \in \mathcal{T}_\zeta which generates the multiplicative group of units Tζ\mathcal{T}_\zeta^*.

Channels

We will create just a single channel, of type signature (Tι,Tζ)(\mathcal{T}_\iota, \mathcal{T}_\zeta). That is, the things we're allowed to flush into it (i.e., whether pushing or pulling) will be tuples, consisting of a Tι\mathcal{T}_\iota-element and Tζ\mathcal{T}_\zeta-element. In our setting here, these two items will have the semantic roles of an element and a timestamp, respectively.

Tables

Our first table consists simply of the statement (T,U)(T, U). It is of height 22^\ell, and of type signature (Tι,Tι)(\mathcal{T}_\iota, \mathcal{T}_\iota). This table is already fixed as of the beginning of the protocol.

We're going to create a single further table, of type signature (Tζ,Tζ)(\mathcal{T}_\zeta, \mathcal{T}_\zeta). That is, it will have two columns, both valued in Tζ\mathcal{T}_\zeta; we will call these RR (for read timestamps) and FF (for final timestamps). Its height will be 22^\ell, the same as TT and UU.

We will allow the prover to populate the second table (R,F)(R, F) arbitrarily during its execution. This table will have only the constraint whereby RR be identically nonzero. Though this is technically not an algebraic constraint, we have a way in Binius of dealing with this (essentially, we run a grand product, and check that the product of its elements is nonzero). We thus suppress this technicality for now.

Flushing rules

We are going to have 4 main flushing rules.

  • Pull (T,F)(T, F). Note that this typechecks (it matches the type signature of the channel).
  • Push (T,O)(T, O). Here, we write OO for the trivial, transparent virtual column which simply takes the value 11 identically everywhere.
  • Pull (U,R)(U, R). Again, we are pulling two committed things, and the types are right.
  • Push (U,W)(U, W). Here, we define WW to be the virtual column WαRW \coloneqq \alpha \cdot R. This is a simple scaling operation, and is definitely a legal way to come up with a virtual column.

Boundary conditions

We're not going to have any boundary conditions in this M3 instance.

The Claim

The claim is that if (T,U)(T, U) satisfies the lookup table relation (that is, if every element of UU is contained somewhere in TT) then there will be an easy way to populate the table (R,F)(R, F) in such a way as to make the attempt succeed.

Conversely, if (T,U)(T, U) does not satisfy the relation (i.e., some element of UU appears nowhere in TT), then this will be unsatisfiable, in that no possible population of the table (R,F)(R, F) will cause the channel to balance.

Proving this rigorously turns out to be fairly tricky; this gets done in [DP23, Thm. 4.28]. Below, we are going to give some intuition for why this is true.

Worked Example

Here, we will work through an explicit example, and get some intuitions for why this works. Using the table TT and looked-up column UU from the boxed example above, we see that the total list of pushes and pulls will lead to a state like:

               /¯\¯¯¯¯¯¯¯¯¯¯¯\
    PUSH      :   :           :    PULL
 ---------->  |   |  channel  | ---------->
   (5, 0)     :   :           :  (5, F₀)
   (6, 0)      \_/___________/   (6, F₁)
   (7, 0)                        (7, F₂)
   (8, 0)                        (8, F₃)
  
 (8, R₀ + 1)                     (8, R₀)
 (6, R₁ + 1)                     (6, R₁)
 (6, R₂ + 1)                     (6, R₂)
 (7, R₃ + 1)                     (7, R₃)

Here, we cheat a bit notationally, by "taking discrete logarithms" of the columns RR and FF with respect to α\alpha. I.e., instead of writing 1=α01 = \alpha^0, we instead write 0, and more generally write i instead of αi\alpha^i, for each i{0,,Tζ2}i \in \{0, \ldots , |\mathcal{T}_\zeta| - 2\}. Note that we required that these columns be nonzero, so this makes sense.

Populating the tables

We have "dummied out" the values of FF and RR, since we (as the prover) don't know them yet. Adopting the prover's point of view, we now must ask: how should the prover populate the as-yet empty table (R,F)(R, F) in order to cause the above channel to balance? Indeed, we can perfectly anticipate which pushes and pulls the verifier is going to do, and what the picture will look like (it's exactly the one above).

Let's walk our way down the table UU, row by row. First, we hit the 8. We know that our only hope of cancelling out the verifier's initial push (8, 0) (way on the top left) hinges on setting R00R_0 \coloneqq 0. On the other hand, as we do this, we simultaneously induce a push—or a "liability"—of (8, 1) in the bottom left quadrant. We mentally remark this for later, since we will need to cancel it out (this is where FF will come in!).

Next up is 6. Here, we have a similar story: in order to cancel out the preemptive push (6, 0), we have to set R10R_1 \coloneqq 0. This has the effect of pulling (i.e., counterbalancing) (6, 0), but also of pushing the outstanding liability (6, 1). More on this later.

Now, on the next row of UU, we hit 6 again. Now, the relevant outstanding liability is not (6, 0) (which already got cancelled above!) but rather (6, 1), the one we just pushed in the last step. We thus set R21R_2 \coloneqq 1. This has the effect of cancelling off (6, 1), but pushing (6, 2).

In the final row, we hit 7; here, we again need to set R30R_3 \coloneqq 0, since (7, 0) is still outstanding.

At the very end of doing this, we have stricken off some (though not all) of the initial pushes in the top left, and replaced these with new pushes on the bottom left. At this final stage, we will use our final counts to counterbalance these stray pushes. It's not hard to check that by setting F(v)F(v) to be the total number of times T(v)T(v) got read throughout UU, we get exactly what we want. Thus we will set F00F_0 \coloneqq 0, F12F_1 \coloneqq 2, F21F_2 \coloneqq 1, and F31F_3 \coloneqq 1. At the very end, we get the channel state:

               /¯\¯¯¯¯¯¯¯¯¯¯¯\
    PUSH      :   :           :    PULL
 ---------->  |   |  channel  | ---------->
   (5, 0)     :   :           :  (5, 0)
   (6, 0)      \_/___________/   (6, 2)
   (7, 0)                        (7, 1)
   (8, 0)                        (8, 1)
  
   (8, 1)                        (8, 0)
   (6, 1)                        (6, 0)
   (6, 2)                        (6, 1)
   (7, 1)                        (7, 0)

which is totally balanced, as desired.

Soundness

The above discussion shows that the prover can succeed if he is honest. What if he's dishonest? Why should the above game become impossible to win?

Let's assume for the sake of argument that UU's final element were 44, instead of 77. Note that 44 is not in the table TT. In this case, on the very bottom row of the above diagram, we would have the pair of elements (4,R3+1)(4, R_3 + 1) on the push side and (4,R3)(4, R_3) on the pull side. Since 44 doesn't appear in TT, so there are no tuples whatsoever containing 44 as their first component in the upper half of the diagram.

It seems intuitively clear that whatever the prover sets R3R_3 to, the pair of different tuples (4,R3+1)(4, R_3 + 1) and (4,R3)(4, R_3) will stick out like a sore thumb, with nothing to balance them. In fact, a problem like this is going to persist even of UU had many 44s in it (or other bad values, etc.): even after setting the relevant entries of RR cleverly, there is nothing the prover is going to be able to do to lead to a balanced channel overall.

The key subtlety is that if the prover can trigger an overflow in the "counter space", then it can get away with this after all. That is why we required above that ζ\zeta be "sufficiently large" relative to \ell (as well as that everything be nonzero!).