Gadget Design
On this page we design a lookup gadget, introducing in the process the concepts of "multiplicities" and "components". A central sub-gadget will be get_components
. We'll use in gadget in a blackbox fashion on this page, and construct it in the next page.
Our lookup gadget will take the signature
fn plain_lookup<U, F, FS, const LOG_MAX_MULTIPLICITY: usize>(
builder: &mut ConstraintSystemBuilder<U, F>,
table: OracleId,
table_count: usize,
balancer_value: FS,
lookup_values: OracleId,
lookup_values_count: usize,
) -> Result<Boundary<F>, anyhow::Error>
where
U: PackScalar<F> + PackScalar<FS> + PackScalar<BinaryField1b> + Pod,
F: TowerField + ExtensionField<FS>,
FS: TowerField + Pod,
where we call this a "plain" lookup to distinguish it from Lasso lookups, explained here and demonstrated in many forms here. While the mechanics of "timestamps" like in Lasso are not needed here, other concepts will need explanation. Looking at this signature we have
table
: an oracle overFS
holding the table of valid lookup values.table_count
: only the firsttable_count
values oftable
are considered valid lookup values.balancer_value
: any valid table value, needed for balancing the channel.lookup_values
: an oracle overFS
holding the values to be looked up.lookup_values_count
: only the firstlookup_values_count
values inlookup_values
will be looked up.LOG_MAX_MULTIPLICITY
: no value inlookup_values
can be looked up1 << LOG_MAX_MULTIPLICITY
times or more.
The last point regarding LOG_MAX_MULTIPLICITY
is an inconvenience. Either the prover and verifier must agree on a strict upper bound for the number of times any table value will be looked up, or the prover can supply the verifier with non-deterministic advice for the upper bound, which in practice will be much smaller than a worst-case upper bound.
Another limitation is that we've restricted the table and lookup values to be over subfield FS
. But many cases can be adapted for this. For example, when we apply this lookup for u8
multiplication in our example application, we will pack the two inputs and the output into one BinaryField32b
element and thus FS
as BinaryField32b
suffices.
First thing we'll do in this gadget is a sanity check that the size of the table is no smaller than table_count
.
let n_vars = builder.log_rows([table])?;
debug_assert!(table_count <= 1 << n_vars);
Next we'll make a channel.
let channel = builder.add_channel();
The idea behind this lookup scheme is that the prover can push any values it wishes into this channel. Those values will come from the oracle lookup_values
. The verifier will allow only valid lookup values to be pulled out of the channel. Those values will come from the oracle table
. Soundness, the property that the prover cannot cheat in this simple scheme, is evident. Completeness, the property that an honest prover can be successful, can be achieved if the prover tells the verifier how many of each table value should be pulled from the channel. We call these numbers "multiplicities", and an honest prover would choose multiplicity k
for a table value if the prover has exactly k
instances of that value in lookup_values
.
But before we work with multiplicities, we can already push into the channel the first lookup_values_count
values of lookup_values
.
builder.send(channel, lookup_values_count, [lookup_values]);
What remains is to pull table values from the channel, pulling the right number of each table value such that the channel balances.
Multiplicities
We create an Option<Vec<usize>>
object to hold the multiplicities, with the internal Vec
of size table_count
such that there is one multiplicity for each valid table value. The reason we wrap the multiplicities in an Option
is that both the verifier and prover will call plain_lookup
, but only the prover will know the multiplicities. Thus we declare multiplicities as None
, but then have a prover-only block following in which we read the table
values and the lookup_values
, counting the multiplicities, and updating multiplicities
to a Some
variant.
let mut multiplicities = None;
// have prover compute and fill the multiplicities
if let Some(witness) = builder.witness() {
let table_slice = witness.get::<FS>(table)?.as_slice::<FS>();
let values_slice = witness.get::<FS>(lookup_values)?.as_slice::<FS>();
multiplicities = Some(count_multiplicities(
&table_slice[0..table_count],
&values_slice[0..lookup_values_count],
true,
)?);
}
We write the function count_multiplicities
, which doesn't depend on an Binius API, as
fn count_multiplicities<T: Eq + std::hash::Hash + Clone + std::fmt::Debug>(
table: &[T],
values: &[T],
check_inclusion: bool,
) -> Result<Vec<usize>, anyhow::Error> {
use std::collections::{HashMap, HashSet};
if check_inclusion {
let table_set: HashSet<_> = table.iter().cloned().collect();
if let Some(invalid_value) = values.iter().find(|value| !table_set.contains(value)) {
return Err(anyhow::anyhow!("value {:?} not in table", invalid_value));
}
}
let counts: HashMap<_, usize> = values.iter().fold(HashMap::new(), |mut acc, value| {
*acc.entry(value).or_insert(0) += 1;
acc
});
let multiplicities = table
.iter()
.map(|item| counts.get(item).copied().unwrap_or(0))
.collect();
Ok(multiplicities)
}
Components
Now that we've computed the multiplicities, we must use them to pull from the channel exactly the right multiplicity of each table value.
Without yet explaining why this works, to make this happen we will bit decompose the multiplicities into columns we'll call bits
, create related columns we'll call components
, and then flush these components
(pulling them from the channel). When we flush we will leverage flush_with_multiplicity
so that instead of flushing all values in a component once, we flush all values in a component an appropriate number of times for the channel to balance.
After sanity checking each multiplicity to be below 1 << LOG_MAX_MULTIPLICITY
, we bit-decompose each multiplicity into LOG_MAX_MULTIPLICITY
bits. We will create an array of LOG_MAX_MULTIPLICITY
oracles over BinaryField1b
called bits
that represent these bits. We will also create an array of LOG_MAX_MULTIPLICITY
oracles over FS
called components
that represent the bits
oracles determining whether a component value is the table value (if the bit is one) or the balancer_value
(if the bit is zero). In pseudocode, for component column j
and bit column j
, at index i
in the range [0..table_count]
we will have
components_j[i] := bits_j[i] * table[i] + (1 - bits_j[i]) * balancer_value
Given the correctly formed component columns, we'll pull the i
th component oracle from the channel with multiplicity 1 << i
.
The idea is that by flushing the i
th component with multiplicity 1 << i
we will flush a table value 1 << i
times if the i
th bit of the multiplicity for that table value is 1, and otherwise flush the table value 0
times. Aggregating the flushes across all components we will end up with the flush count of each table value matching the corresponding multiplicity.
Our channel balancing strategy isn't finalized, because when a bit is zero we pull the balancer_value
from the channel rather than a looked up value. To counter this, we will need to push enough balancer_value
s into the channel, but we discuss that later in channel balancing. For now assume there is a gadget get_components
that creates the bits
oracles and the components
oracles, checks the components
oracles against the table
oracle and the bits
oracles, and then returns the components
oracles. We'll invoke this gadget as
let components: [OracleId; LOG_MAX_MULTIPLICITY] =
get_components::<_, _, FS, LOG_MAX_MULTIPLICITY>(
builder,
table,
table_count,
multiplicities,
)?;
Then as said we'll flush the i
th component with multiplicity 1 << i
.
components
.into_iter()
.enumerate()
.for_each(|(i, component)| {
builder.flush_with_multiplicity(
FlushDirection::Pull,
channel,
table_count,
[component],
1 << i,
);
});
Note that since we pushed the lookup values into the channel, we now pull the table values from the channel.
Channel Balancing
We don't yet hold the Boundary
value we promised to return from the plain_lookup
gadget. Even if we did, we wouldn't be done yet. The reason is even after having both pushed to the channel and pulled from the channel, the channel isn't balanced. We discuss how flushing a boundary value (made from balancer_value
) to the channel with a certain multiplicity rectifies the situation, balancing the channel for both completeness and soundness.
Each value in lookup_values
got pushed into the channel exactly once. There are lookup_values_count
such values. If the channel is to balance, we must pull exactly that many values from the channel. If we count how many values got pulled from the channel, we'll see the numbers do not match.
For each index of the i
th component, we pulled 1 << i
things from the channel. If the corresponding bit was set, we pulled a looked up value with multiplicity 1 << i
. If the corresponding bit was not set, we pulled balancer_value
with multiplicity 1 << i
. If we consider an index across all LOG_MAX_MULTIPLICITY
components, we can conclude that a total of
(0..LOG_MAX_MULTIPLICITY).map(|i| 1 << i).sum()
things were pulled from the channel at that index, some of them the looked up value for that index, and some of them balancer_value
.
As a geometric series this expression equals
(1 << LOG_MAX_MULTIPLICITY) - 1
Therefore, the total number of things pulled from the channel by flushing the components is
((1 << LOG_MAX_MULTIPLICITY) - 1) * table_count
Since we pushed exactly lookup_values_count
lookup values into the channel, this means we pulled an superfluous
((1 << LOG_MAX_MULTIPLICITY) - 1) * table_count - lookup_values_count
balancer_value
s out of channel. We assign this expression the name balancer_value_multiplicity
. Naturally then, we must push this many balancer_value
values into the channel. We do so with the boundary value
let boundary = Boundary {
values: vec![balancer_value.into()],
channel_id: channel,
direction: FlushDirection::Push,
multiplicity: balancer_value_multiplicity,
};
which we then return from the gadget as
Ok(boundary)
Assuming the boundary value is fed to the verifier, our channel now balances. Therefore we have taken care of completeness, but what about soundness? Since the balancer_value
is a valid table value by assumption, the verifier is still only allowing valid table values to be pulled from the channel. Therefore if the channel balances, it means the prover pushed only valid table values into the channel.