Multiset-Check

Reducing multiset equality to grand-product claims

As we've mentioned already, the equality of multisets turns out to be our way of mathematically capturing the balancing of M3 channels. Here, we sketch how we process multiset-equality claims in Binius (specifically, how we reduce them in turn to grand-product claims).

The first thing that makes this tricky is that channels in general accept tuple types. Thus we fix a channel and write $\mu$ for its tuple arity (i.e., the number of components in each tuple pushed to or pulled from that channel). The channel's two internal multisets, then, will look something like $(T_0, \ldots , T_{\mu - 1})$ and $(U_0, \ldots , U_{\mu - 1})$ , respectively ( $T$ s for its pushes and $U$ s for its pulls, let's say). Each of the individual columns $T_i$ and $U_i$ will be an $\ell$ -variate multilinear, let's say, and will itself be virtual.

Indeed, remember that each channel, in general, will receive pushes and pulls at the hands of multiple different flushing rules. When two or more different flushing rules push or pull (as the case may be) to a single channel, the resulting multiset of pushed or pulled (respectively) tuples will itself be virtual, constructed as a merge. That is, each individual $T_i$ or $U_i$ will itself be a merge, with its halves (etc.) coming from disparate flushing rules.

In any case, we ignore for now how these $T_i$ s and $U_i$ s are constructed, and just note for now that they're virtual. The equality we're interested in checking is:

\begin{equation*}\left\{ (T_0(v), \ldots , T_{\mu - 1}(v)) \mid v \in \mathcal{B}_\ell \right\} \stackrel{?}= \left\{ (U_0(v), \ldots , U_{\mu - 1}(v)) \mid v \in \mathcal{B}_\ell \right\},\end{equation*}

where both sides of the above equality are understood as multisets (i.e., with multiplicity counted).

The Fingerprinting Step

The first step will be to "de-tuplify" the two multisets at hand. Here, we are going to reduce the problem of matching multisets whose elements are tuples (i.e., with $\mu$ components each) to the problem of matching multisets whose elements are just singleton field elements.

Indeed, it is shown in [DP23, Prot. 4.20] that, to check whether the above multiset equality holds, it suffices for the verifier (up to a small soundness error) to sample a random fingerprinting challenge $s \gets \mathcal{T}_\tau$ , and to check instead whether the two virtual polynomials $\sum_{i = 0}^{\mu - 1} s^i \cdot T_i$ and $\sum_{i = 0}^{\mu - 1} s^i \cdot U_i$ yield equal multisets over $\mathcal{B}_\ell$ . Note that we have replaced the tuples $(T_0, \ldots , T_{\mu - 1})$ and $(U_0, \ldots , U_{\mu - 1})$ of virtual columns with the single virtual columns $\sum_{i = 0}^{\mu - 1} s^i \cdot T_i$ and $\sum_{i = 0}^{\mu - 1} s^i \cdot U_i$ . These things are linear combination virtual polynomials, something we're allowed to do.

Unrolling this, we see that our claim is now the equality of $1$ -ary multisets:

\begin{equation*}\left\{ \sum_{i = 0}^{\mu - 1} s^i \cdot T_i(v) \mid v \in \mathcal{B}_\ell \right\} \stackrel{?}= \left\{ \sum_{i = 0}^{\mu - 1} s^i \cdot U_i(v) \mid v \in \mathcal{B}_\ell \right\}.\end{equation*}

The point here is that we've made progress, since these are multisets of singletons, instead of multisets of tuples. We can abbreviate $T^* \coloneqq \sum_{i = 0}^{\mu - 1} s^i \cdot T_i$ and $U^* \coloneqq \sum_{i = 0}^{\mu - 1} s^i \cdot U_i$ .

The Permutation Step

It thus remains to check whether two single columns $T$ and $U$ yield equal multisets. Here, we follow [DP23, Prot. 4.17]. To reiterate, we're interested in checking whether

\begin{equation*}\left\{ T^*(v) \mid v \in \mathcal{B}_\ell \right\} \stackrel{?}= \left\{ U^*(v) \mid v \in \mathcal{B}_\ell \right\}.\end{equation*}

as multisets.

The key mathematical fact is that it's secure (again, up to a small error) for the verifier to sample a random challenge $r \gets \mathcal{T}_\tau$ , and to check instead whether the two products

\begin{equation*}\prod_{v \in \mathcal{B}_\ell} r + T^*(v) \stackrel{?}= \prod_{v \in \mathcal{B}_\ell} r + U^*(v)\end{equation*}

are equal. Here, on the two sides of this product identity, we have more linear combination virtual polynomials: now, the much-simpler ones $r + T^*$ and $r + U^*$ (these are valid linear combination virtual polynomials nonetheless).

Product claims of this form are exactly what our grand product argument are capable of treating. More precisely, that argument serves to treat claims of the form

\begin{equation*}\prod_{v \in \mathcal{B}_\ell} T(v) \stackrel{?}= a,\end{equation*}

where $T$ is virtual and $a$ is supplied nondeterministically to the verifier. Here, we want to imagine

\begin{equation*}T = r + T^* = r + \sum_{i = 0}^{\mu - 1} s^i \cdot T_i.\end{equation*}

Miscellaneous Considerations

In practice, there are a few power-of-2 issues that make our picture more complicated. For one, the heights of the M3 tables we're interested in flushing might not be powers of 2. That issue creates a significant—though surmountable—technical challenge; we gloss over it for now.

Separately, the combined quantity of all tuples pushed or pulled (as the case may be) to any given channel might itself not be a power of 2. One fact that makes this easier to deal with is that product identities of the form given above—i.e., of the form $\prod_{v \in \mathcal{B}_\ell} T(v) \stackrel{?}= a$ —can be broken up. If $T$ is itself a concatenation of multiple lists, which isn't itself of power-of-two size—or, for that matter, if its constituent lists are not even of equal sizes—then it's enough to separately learn the grand products of each of its constituent sublists. These latter grand products can then be multiplied in the clear by the verifier.