Our MPT Construction
Here, we begin our description in earnest. For the sake of this page, we omit state forking, a conceptually tricky, M3-based deduplication technique. Below, in various places, we moreover "fudge" by allowing natural number-valued datatypes. In our practical implementation, we actually work multiplicatively; that is, we use multiplicative group of units of an appropriately chosen tower field. We used a similar expository device in our Lasso explainer.
A Few Datatypes
For our purposes below, it will be convenient now to extract and state separately a few datatypes, which we will often want to "bundle together".
- Nibble objects. We've already discussed this convenient abstraction. A nibble pointer contains:
- An integer-valued , representing a location in memory.
- A boolean-valued , representing one among the two nibbles within the byte.
- State objects. As we've already discussed, we use the term state to refer to a bundle containing:
- An integer-valued , which will point to a position in the prover's memory block.
- A nibble-valued , which will point to a nibble within a key in the prover's memory.
- Memory tuples. As should be understandable in view of our Lasso arithmetization, we will often want to bundle together "memory tuples", containing:
- A byte-valued .
- An integer-valued .
Channels
We begin by recording our instance's channels.
The state channel
This channel accepts tuples containing
- A state value (see above).
The memory channel
This channel accepts tuples containing
- An integer-valued memory address .
- A memory-tuple (see above).
Each tuple here represents an address–value pair , read at ; we refer to our Lasso explainer.
The absorb-block channel
This channel facilitates claims about Keccak-256. It accepts tuples containing:
- A 200-byte .
- An integer-valued .
- An integer-valued .
Each tuple in this channel represents a claim whereby, after absorbing some (nondeterministically supplied) amount of memory content beginning at into the sponge state , one obtains the digest beginning in memory at the location .
The get-child channel
This channel accepts tuples of type:
- An integer-valued .
- An integer-valued child index .
- An integer-valued .
Each tuple in this channel represents the claim whereby, if points to the beginning the th-indexed child of some branch node, let's say, then points to the th-indexed child of the same branch node. That is, if we start at and take steps, then we land at .
The skip-list-header channel
This channel deals with skipping the "header" of RLP-encoded lists. It accepts tuples containing:
- An integer-valued .
- An integer-valued .
This channel represents claims whereby points to the beginning of an RLP-encoded list, and points to the RLP-encoded first element of that list. The point is to "skip the list's header".
The check-nibble channel
This channel handles asserting the equality of substrings of nibbles. It accepts tuples of containing:
- A nibble-valued .
- A nibble-valued .
- A nibble-valued .
- A nibble-valued .
Each tuple in this channel represents a claim whereby two "stretches" of nibbles are equal to each other. Essentially, it says that if you start with two nibble pointers, pointing at and , respectively, and advance both pointers nibble-by-nibble in lockstep until reaches , then, over the course of that time, will have reached ; and, moreover, both stretches of nibbles will have been identically equal to each other all the while.
The Keccak-f channel
This channel handles Keccak- permutation claims. It accepts tuples containing:
- A 200-byte .
- A 200-byte .
Tuples in this channel represent valid input–output pairs of the Keccak- permutation.
Tables, Constraints, and Flushing Rules
We go through our tables, one by one. In each, we describe the datatypes, the constraints, and the flushing rules.
The hash-transition table
In this table, we allow the prover to replace a state pointing to a hash with a state pointing to the preimage of that hash, without advancing his state's key pointer.
The table has the following columns:
- A state .
- A memory-tuple .
- An integer-valued .
Its constraints are:
- Assert that equals .
Here, represents a byte supposedly read from the memory location (see the flushing rules below). Here, we check that this byte in question takes the value 0xA0
, which, in RLP language, is the prefix attached to a 32-byte string. In context, this winds up assuring the verifier that we're starting in a nonempty child of a branch node, as opposed to an empty one (whose prefix would be 0x80
instead).
Its flushing rules are:
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the state from the state channel.
- Push the state to the state channel.
- Push the tuple to the absorb-block channel.
The first two bullet items above are a Lasso-style ROM read. We will see this pattern a lot. Essentially, they have the effect of making sure that the prover's nondeterministically supplied is the same value as that sitting in memory at the location .
The second two bullet items above allow the prover to advance his state. That is, they allow the prover to replace the state pointing to with a new state pointing to —without changing .
The cost of this move, though, is that we push a new outstanding liability to the absorb-block channel. That is, we push the liability whereby, upon beginning with the empty sponge state and beginning absorption at , one ends with the digest sitting at . This is nothing other than asserting that the Keccak-256 of something beginning at equals the digest beginning at . Thus, we make sure that the prover did a legal hash unwinding.
The branch-transition table
Here, we allow the prover to replace a state pointing to the beginning of a branch node with a state pointing to a child of the branch node. Moreover, we allow the prover to consume one nibble of the relevant key in the process. Finally, we require that these match—i.e., that the index of the final child should be the same as the nibble that got consumed.
Here are the table's columns:
- A state .
- An integer-valued .
- A memory-tuple .
- A memory-tuple .
- An integer-valued .
- A state .
Its constraints are:
- Assert that is either or .
- Assert that 's th nibble equals .
- Assert that the nibble pointer is exactly one nibble ahead of . (Concretely, this amounts to flipping its , and moreover incrementing its if the parity was high to begin with.)
We explain these roughly in the following way. The first constraint above checks that the is either 0x80
or 0xA0
. Conditioned on pointing to the beginning of an RLP-encoded list (we can assume this by induction), pointing to that list's first child (we will ensure this below by pushing to the skip-list-header channel), and being the value in memory at (we will ensure this below using a memory read), this latter check thus ensures that we are in a branch node, and not in an extension or a leaf node.
The second constraint above ensures that the prover's nondeterministically supplied relates in the right way to , which will itself be checked using a memory read against (see below). The final constraint ensures tht the prover has advanced his state's key pointer by exactly one nibble.
Here are the branch transition table's flushing rules:
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull from the state channel.
- Push to the state channel.
- Push the tuple to the skip-list-header channel.
- Push the tuple to the get-child channel.
The first two bullets are a Lasso read. Here, we check that the prover's nondeterministically supplied actually matches what's sitting in memory at the location . Similarly, in the second two bullets, we check that the prover's nondeterministically supplied actually matches what's in memory at the location . Coupled with the second constraint above, this check guarantees that is meaningful (i.e., it's actually the "current" nibble of the state's key).
In the next two bullet points, we allow the prover to completely update his state. In virtue of the third constraint above, we know that and relate to each other in the right way (they differ by exactly one nibble). In the final two bulletpoints above, we push new liabilities that guarantee that and relate to each other in the right way. In the first, we handle the "stepping stone" from to —what we achieve here is showing that the latter thing is exactly the result of starting at the former and skipping the list header. In the second step, we assure that, upon starting at , and marching for children, we wind up exactly at .
Thus, the effect has been for the prover to start at the beginning of a branch node, read one nibble from key memory, and end at the appropriate child of that branch node.
The extension–leaf-transition table
This table serves an analogous role to that above, but for extension and leaf nodes. Here, our goal is to allow the prover to step from a state pointing to the beginning of an extension node or a leaf node to a state pointing to that node's "payload". By definition, extension and leaf nodes look like two-element lists, whose first element is a prefixed nibble path and whose second element is a payload. In each case, we want to allow the prover to consume the appropriate number of nibbles of his key—i.e., exactly as many as are in the raw (non-prefixed) nibble path—and moreover to step his state's RLP pointer from the beginning of the node to the payload.
Here are the table's columns:
- A state .
- An integer-valued .
- A memory-tuple .
- An integer-valued .
- A memory-tuple .
- An integer-valued .
- A boolean flag .
- A nibble-valued .
- A nibble-valued .
Here are its constraints:
- Assert that is neither nor . This checks that we're not in a branch node.
- Assert that points to the beginning of the node's prefixed nibble path and points to its end. The details here are extremely technical, and come down to how RLP encoding works; we take it easy here. Essentially, there are two cases, depending on whether the prefixed nibble path is one byte or longer (equivalently, whether the raw nibble path has more than 1 nibble or not).
- If it's just one byte, then we require that and both hold.
- If it's more than one byte, we require that and .
- The above two cases can be distinguished from each other by checking the single-bit flag .
- Assert that the boolean value equals the boolean flag . Here, will be equal the the value in memory at the location (see flushing rules below). Thus, we're simply checking whether the 0th-indexed nibble of the prefixed nibble path beginning at in memory is odd or not (see explanation here).
- Assert that points to the beginning of the raw nibble path. This amounts to comparing it to and using the flag. If is true, we need to advance by just one nibble; otherwise, we need to advance it by two nibbles (see again here).
Here are the table's flushing rules.
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull from the state channel.
- Push the state to the state channel.
- Push the tuple to the skip-list-header channel.
- Push the tuple to the check-nibble channel.
The first two items above are again a memory read; they ensure that equals what's in memory at the location . Similarly, the second two ensure that equals what's in memory at .
The next pair of items is a state update; here, the prover pulls the supplied state, and moreover pushes an updated state—whose has been updated to and its key pointer to .
Of course, we need to constrain those latter things, by pushing new liabilities. To this end, in the second-to-last item, we ensure that and relate to each other by the skipping of a list header. In the final flushing rule above, we ensure that the prover's nondeterministically supplied correctly points to the result that one would get after advancing and in lockstep until the former arrives at .
The absorb-block tables
Here, we drop our first table which doesn't directly interact with the state channel at all. Rather, this table's job is "second-tier"—its job is merely to allow the prover to absorb certain sub-liabilities spawned during in the flushing rules of the hash-transition table. That is, it allows the prover to pull from the absorb-block channel, to which the hash-transition table pushed.
This table is also our first example of a "recursive call stack" pattern. Each row represents a frame in a recursive call, which spawns a subcall on a smaller problem. That is, each row flushes a pull to the absorb-block channel, and also pushes a smaller problem. As a technicality, we actually have two tables here: a recursive table, which spawns a subcall, as well as a base table, which lets the prover pull the final base case of the recursion.
Absorb-block recursive.
Here are the recursive table's columns:
- A 200-byte .
- A 200-byte .
- A 200-byte .
- An integer-valued .
- An array consisting of 136 memory-tuples, called .
- An integer-valued .
Here are its constraints:
- For each :
- Assert .
- For each :
- Assert .
These two constraints are super-easy to explain: they literally just amount to how sponge absorption works. To get the thing we should subject to Keccak-, we should take the sponge state, and absorb in 136 bytes from memory (the rate). The remaining bytes (the capacity) should be left alone. Of course, we have no idea—yet—whether the prover's s have anything to do with what's actually in memory beginning at (since they are supplied by the prover nondeterministically). As usual, we will handle this below using a memory read.
Finally, we have flushing rules, which check the memory reads, and spawn a subcall:
- For each ,
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the tuple from the absorb-block channel.
- Push the tuple to the absorb-block channel.
- Push the tuple to the Keccak- channel.
The first two rules (inside the loop) are easy to explain. Here, by doing a bunch of Lasso reads, we check that, for each the byte nondeterministically supplied by the prover actually represents what's in memory at the location . With the aid of our constraints above, these reads settle the question of the sponge absorption—that is, they guarantee that is actually what the prover should feed into the Keccak- permutation in order to obtain the next sponge state, provided we begin reading at .
In the next two items, we allow the prover to replace the absorb-block triple he started with with a new one, in which the sponge state has been updated and the has been advanced by 136 bytes. As a side effect, though, we spawn a new liability, which controls whether the prover's Keccak- permutation was done correctly. That is, we need to spawn a liability pertaining to whether and actually relate to each other by a Keccak- or not. This will itself be absorbed later.
Absorb-block base.
We need a way for the prover to cap off this recursion. To do this, we also allow it to populate a base variant of this table. That table essentially checks the finalized sponge state against the target location in memory.
The absorb-block base table has the columns:
- A 200-byte .
- An integer-valued .
- An integer-valued .
- An array consisting of 32 integers, called .
This table has no constraints.
Finally, its flushing rules are:
- For each ,
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
The effect here is to read 32 bytes out of memory starting at the location (we have to skip the single prefix byte 0xA0
), and checking whether they equal the first 32 bytes of . Note that we are doing something a bit different: we are using to get the values, and just letting the prover supply the timestamps nondeterministically.
The get-child tables
Here, we allow the prover to absorb the get-child liabilities emitted by the branch-transition table's flushing rules. That is, this table lets the prover pull from the get-child channel, to which that table pushed. This is another pair of tables, consisting of a recursive table and a base table. In the recursive table, we allow the prover to advance his pointer by one child, while decreasing the outstanding child index by 1. We also allow the prover to cap the recursion using the base table.
Get-child recursive.
The recursive table's columns are:
- An integer-valued .
- A memory tuple .
- An integer-valued .
- An integer-valued .
- An integer-valued .
The constraints are:
- Assert that is greater than either by 1 if or by 33 if .
- These two cases can be distinguished by checking the single-bit flag .
The idea here is that we know by induction that the initial pointer points to the beginning of some child in a branch node. That child is either empty, in which case it looks like the RLP-encoded empty string 0x80
, or else it's nonempty, in which case it looks like 0xA0
followed by a 32-byte hash digest. Above, we check that we skipped the right amount—conditioned on being meaningful, which we check below.
Here are the flushing rules:
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the tuple to the get-child channel.
- Push the tuple from the get-child channel.
The idea is straightforward. We check that actually represents what's in memory at the location , using a Lasso read. We moreover allow a pull and a push, in which the pointer has been advanced and the index decremented.
Get-child base.
The get-child base table has just a single column:
- An integer-valued .
It has no constraints.
It has just a single flushing rule:
- Pull the tuple from the get-child channel.
Note that we have enforced that and , in a de facto way by what we pulled.
The check-nibble tables
Here, we present another second-tier table. This table allows the prover to absorb the check-nibble liabilities emitted by the extension–leaf transition table above; i.e., it lets the prover pull from the check-nibble channel.
Check-nibble recursive.
We begin with the recursive table. It has the following columns:
- A nibble-valued .
- A nibble-valued .
- A nibble-valued .
- A nibble-valued .
- A memory tuple .
- A memory tuple .
- A nibble-valued .
- A nibble-valued .
While there are a lot of fields here, the logic should be pretty clear; we refer to our explanation of the check-nibble channel. In each row of the table, we are going to allow the prover to advance both nibbles a single step. We are going moreover going to do a few memory reads, to check that the relevant nibbles are equal. Finally, we will spawn a recursive sub-liability.
Here are the constraints:
- Assert that , advanced by exactly one nibble, yields (this can be done by flipping its and conditionally advancing its ).
- Assert that , advanced by exactly one nibble, yields (similar).
- Assert that the th nibble of equals the th nibble of . This can be done with a bit of bit twiddling and shifting.
Finally, we have the following flushing rules:
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the tuple from the check-nibble channel.
- Push the tuple to the check-nibble channel.
These are straightforward. In the first four items, we check that the value in memory at equals and that the value in memory at equals . In the last two, we pull the liability we started with, and push the sub-liability—with both nibble pointers advanced by exactly one step (see the constraints above).
Check-nibble base.
The base table has just two columns:
- A nibble-valued .
- A nibble-valued .
It has no constraints.
It has the flushing rule:
- Pull the tuple from the check-nibble channel.
Note that here, we are forcing and . That is, we only allow the prover to pull a claim in which the nibble pointer has reached its target, and in which the key and return pointers have equated.
The skip-list-header table
Here, we present another auxiliary table, designed to allow the prover to absorb certain liabilities spawned during the flushing rules above. The point is to advance a pointer from the beginning of an RLP-encoded list to the beginning of that list's first child.
Here are the table's columns:
- An integer-valued .
- A memory-tuple .
- An integer-valued .
Here are its constraints:
- Assert that contains the "magic" RLP prefix byte corresponding to lists. Technically, this can be checked using the boolean test .
- Assert that is ahead of by the right amount of bytes. This is another RLP technicality. There are two cases, depending on whether we're at a "long" list or not.
- In the case of a short list, we need to assert .
- In the case of a long list, we need to assert .
- These cases can be distinguished using the bitwise test .
Finally, its flushing rules:
- Pull the triple from the memory channel.
- Push the triple to the memory channel.
- Pull the tuple from the skip-list-header channel.
Notice that we haven't spawned any new liabilities.
The Keccak-f table
Here, we get to our very final table. Its job is to safely let the prover absorb the Keccak- liabilities spawned by the absorb-block recursive table.
Actually, the Keccak- table is quite sophisticated, and of course can be found in production code in Binius. For now, we just mention the subset of the columns that is relevant to us:
- A 200-byte .
- A 200-byte .
It has constraints designed to ensure that, for each row, . It does this using a bunch of complicated constraints, as well as further, intermediate columns, which we don't mention here.
It has just a single flushing rule:
- Pull the tuple from the Keccak- channel.
Boundary Conditions
Our verifier has two main sorts of boundary conditions. In the first, it pre-seeds the state channel with statement-dependent data. It also has to initialize the memory channel using the prover's memory block and his final read counts. This latter thing is standard from Lasso; we ignore it for now.
Thus, we explain the verifier's state boundary conditions, already sketched in our informal explanation. We write for a pointer, sent by the prover, to the 0xA0
-prefixed global state trie root in the prover's memory block. Moreover, for each key–value pair in the statement, we write for a pointer to the th key in memory and for a pointer to the th value in memory. By making small reads into the prover's memory block, the verifier can explicitly check that all of these pointers are valid and pointing to the right thing.
- For each for each key–value pair contained in the overall statement, the verifier performs the following boundary operations:
- Push the initial tuple to the state channel.
- Pull the terminal tuple from the state channel.
You can convince yourself that this works! For each key–value pair , the prover makes a path from the relevant push to the relevant pull by steadily consuming nibble after nibble of the relevant key, and stepping the pointer from the root down the appropriate Merkle–Patricia path.