Cryptographic Specification
Some of the terminology, symbols, and language has been borrowed from and inspired by the Halo2 Book and the MACI 1.0 Audit Specification.
Notation
- Accumulator: an elliptic curve point which is a commitment to \(t\) field elements.
- \(t\): the maximum capacity of the accumulator.
- Zero value: the nothing-up-my-sleeve value (see 2).
- Elliptic curve multiplication: in this specification, we use the dot operator \(\cdot\) to denote scalar multiplication of an elliptic curve point.
Cryptographic primitives
1. The BN254 curve
The current implementation of Semacaulk uses the BN254 curve which Ethereum supports in its elliptic curve addition, scalar multiplication, and pairing-check precompiles as defined in EIP-196 and EIP-197.
1.1. The BN254 scalar field
The BN254 scalar field \(\mathbb{F}_r\) is:
21888242871839275222246405745257275088548364400416034343698204186575808495617
1.2. The BN254 scalar field
The BN254 prime field \(\mathbb{F}_q\) is:
21888242871839275222246405745257275088696311157297823662689037894645226208583
1.3. The \(\mathbb{G}_1\) group
The group \(\mathbb{G}_1\) defined on BN254 has the generator point \(g_1 = (1, 2)\).
1.4. The \(\mathbb{G}_2\) point
The group \(\mathbb{G}_2\) defined on BN254 has the generator point \(g_2 = (x_0 * i + x_1, y_0 * i + y_1)\) where:
- \(x_0\) equals
11559732032986387107991004021392285783925812861821192530917403151452391805634
- \(x_1\) equals
10857046999023057135944570762232829481370756359578518086990519993285655852781
- \(y_0\) equals
4082367875863433681332203403145435568316851327593401208105741076214120093531
- \(y_1\) equals
8495653923123431417604973247489272438418190587263600148770280649306958101930
2. The nothing-up-my-sleeve value
The nothing-up-my-sleeve (NUMS) value is:
14233191614411629788649003849761857673160358990904722769695641636673172216357
It is the Keccak256 hash of the bytestring Semacaulk
, modulo
\(\mathbb{F}_r\). To compute it, run the following in a NodeJS console where
e
is an instance of Ethers.js 5.0:
(
BigInt(e.utils.solidityKeccak256(['string'], ['Semacaulk'])) %
BigInt('21888242871839275222246405745257275088548364400416034343698204186575808495617')
).toString(10)
Due to the second-image resistance property of the Keccak256 hash function, anyone can be assured that no-one knows any other preimage to the NUMS value. It follows that no-one knows the MiMC7 preimage to the NUMS value.
3. The structured reference string (SRS)
Semacaulk's structured reference string (SRS) consists of an ordered list of \(2^n + 1\) \(\mathbb{G}_1\) points and \(2^n\) \(\mathbb{G}_2\) points, where the maximum capacity of the accumulator is \(2^n = t\).
We assume the existence of a secret and unknown value \(\tau\) which can be generated using a securely run trusted setup.
These points are defined as such:
- \(\mathsf{srs\_g1}\): \([g_1, g_1 \cdot {\tau}, \ldots, g_1 \cdot {\tau \cdot {t + 1}}]\)
- \(\mathsf{srs\_g2}\): \([g_2, g_2 \cdot {\tau}, \ldots, g_2 \cdot {\tau \cdot {t + 1}}]\)
Where \(g_1\) is defined in 1.3 and \(g_2\) is defined in 1.4.
4. The MiMC7 hash function
Semacaulk currently uses the MiMC7 hash function to compute identity commitments and nullifier hashes. While other possibly more secure hash functions like Poseidon are possible, we chose MiMC7 only because of its simplicity of implementation for our purposes of delivering a proof-of-concept.
The MiMC7 has function is defined here.
Our instantiation of the MiMC7 hash function for the BN254 curve uses the following constants:
- \(n = 91\)
- \(\mathsf{MIMC\_SEED} =\)
mimc
(the hexidecimal array[0x6d, 0x69, 0x6d, 0x63]
)
4.1. The MiMC7 round constants
Given the BN254 scalar field \(\mathbb{F}_r\), we first define 91 round
constants (denoted as \(\mathsf{cts}\)) using the algorithm implemented in
circomlibjs/src/mimc7.js
and
semacaulk/src/mimc7.rs
.
The algorithm is as such:
- The first round constant is \(0\).
- The next round constant is the Keccak256 hash of \(\mathsf{MIMC\_SEED} =\), modulo the field order of \(\mathbb{F}_r\).
- Each subsequent round constant is the Keccak256 hash of the previous one, modulo the field order of \(\mathbb{F}_r\).
4.2. The MiMC7 hash
algorithm
To hash a single field element \(x\), we use the hash()
algorithm. The inputs to
hash()
are \(x\) and a key \(k\).
- Compute the first round digest \(\mathsf{rd[0]} = (x + k) ^ 7\).
- Compute the next \(n - 1\) round digests such that \(\mathsf{rd}[i] = (\mathsf{rd}[i - 1] + \mathsf{cts}[i] + k) ^ 7\)
- Return \(\mathsf{rd}[n - 1] + k\).
4.3. The MiMC7 multi_hash
algorithm
To hash multiple field elements \(x_0, \ldots, x_n\), we use the multi_hash()
algorithm. The inputs to multi_hash()
are the array of said field elements
and a key \(k\).
-
Initialise \(r\) to equal \(k\).
-
For each \(x_i\):
a. Set \(h_i = \mathsf{hash}(x_i, r)\).
b. Set \(r = x_i + h_i\).
-
Return \(r\).
4.3.1 multi_hash
with two field elements
It is useful to describe the multi_hash
algorithm for two input elements in
individual steps because the Semacaulk circuit construction (see The Circuit
and Gates) makes use of its intermediate states.
Given inputs \(a\) and \(b\):
- Set \(r\) as 0.
- Set \(h_0 = \mathsf{hash}(a, r)\).
- Set \(r = r + a + h_0\).
- Set \(h_1 = \mathsf{hash}(b, r)\).
- Return \(r + b + h_1\).
Note that in step 4, the key is \(a + h_0 = \mathsf{hash}(a, 0)\). This fact is crucial to understanding how the circuit construction works.
5. KZG commitments
Semcaulk uses the KZG commitment scheme described in KZG10.
Given an array of \(t\) values \([m_0, \ldots, m_t]\), we interpolate to derive the polynomial \(\phi\) such that \(\phi(i) = m_i\).
Knowing \(\phi\) with \(l\) coefficients \([\phi_0, \ldots, \phi_{l - 1}]\), one can use \(\mathsf{srs\_g1}\) to produce a commitment in the form of a \(\mathbb{G}_1\) point, or \(\mathsf{srs\_g2}\) to produce a commitment in the form of a \(\mathbb{G}_2\) point.
\(\mathsf{commit}(\phi, \mathsf{srs}) = \sum_{i=1}^{l} \mathsf{srs}[i] \cdot \phi_i \)
An alternative notation for \(\mathsf{commit}\) is:
- \([\phi]_1\) where the commitment is a \(\mathbb{G}_1\) point or
- \([\phi]_2\) where the commitment is a \(\mathbb{G}_2\) point.
A KZG opening proof is denoted as:
\(\mathsf{open}(\mathsf{srs}, [\phi], \phi(i))\)
6. Lagrange basis polynomials
Lagrange basis polynomials are an important concept and are used in several parts of the protocol. To understand them, we must first define roots of unity of a finite field.
6.1. Roots of unity of a finite field
The \(n\)th roots of unity of a finite field \(\mathbb{F}_p\) with prime order \(p\) are field elements where for each element \(x\), \(x^n = 1\).
For example, the 4th roots of unity of the BN254 scalar field (see 1.1) are:
0x0000000000000000000000000000000000000000000000000000000000000001
0x30644E72E131A029048B6E193FD841045CEA24F6FD736BEC231204708F703636
0x30644E72E131A029B85045B68181585D2833E84879B9709143E1F593F0000000
0x0000000000000000B3C4D79D41A91758CB49C3517C4604A520CFF123608FC9CB
Another name for the \(n\) roots of unity is the evaluation domain of size \(n\) for a given finite field. They are commonly denoted as \(\{1, \omega, \ldots, \omega^{n-1}\}\).
6.2. Lagrange basis polynomials
Given an evaluation domain of size \(n\), the Lagrange basis polynomials of this domain are the \(n\) polynomials \([L_0, \ldots, L_n]\) such that \(L_i(\omega^{i - 1} = 1)\) and \(L_i(\omega^{j} = 0)\) for all \(j \neq i - 1\). For example:
- the Lagrange basis polynomial \(L_0\) evaluates to 1 given the input \(\omega^0\).
- the Lagrange basis polynomial \(L_0\) evaluates to 0 given the input \(\omega^1\).
- the Lagrange basis polynomial \(L_1\) evaluates to 0 given the input \(\omega^0\).
- the Lagrange basis polynomial \(L_1\) evaluates to 1 given the input \(\omega^1\).
6.3. Efficient generation of commitments to Lagrange basis polynomials
To support \(t\) insertions, Semacaulk requires the KZG commitments to the Lagrange basis polynomials over the evaluation domain of size \(t\). These KZG commitments are efficiently generated using an implementation of the Feist-Khovratovich technique.
7. The accumulator
The accumulator is a single \(\mathbb{G}_1\) point that is a commitment to a vector of \(t\) \(\mathbb{F}_r\) elements where \(t\) is the maximum capacity of the instance of Semacaulk in question. These elements are ordered with the users' identity commitments followed by nothing-up-my-sleeve values.
An empty accumulator is simply a commitment to \(t\) nothing-up-my-sleeve values.
Given the vector of values \([v_0, \ldots, v_t]\), the accumulator \(C\) is computed as such:
\(\sum_{i=1}^{t} \mathsf{commit}(L_i, \mathsf{srs\_g1}) \cdot v_i\)
7.1 Updating the accumulator
To replace a value at \(w_i\) with \(v_i\) at index \(i\):
\(C_{\mathsf{new}} = C - L \cdot w_i + L \cdot v_i\)
\(= C + L \cdot (v_i - w_i)\)
where \(L = \mathsf{commit}(L_i, \mathsf{srs\_g1})\)
This can be done on-chain at a low cost because the only expensive operations required are an elliptic curve scalar multiplication and a elliptic curve addition.
8. The Keccak256 hash
The Keccak256 hash function is defined in The Keccak SHA-3 submission by Bertoni et al with the output length of 256 bits. We rely on implementations from the following sources:
- The EVM's
KECCAK256
opcode denoted as0x20
. - The Javascript Ethers.js library's
ethers.utils.solidityKeccak256
function. - The Rust
tiny-keccak
library'sv256
function.
9. Evaluation domains
9.1. The subgroup domain
An evaluation domain with size 128. We denote this constant as \(\mathsf{SUBGROUP\_SIZE}\).
9.2. The extended coset domain
An evaluation domain with size 8 * 128. We denote this as \(\mathsf{EXTENDED\_DOMAIN\_FACTOR} * \mathsf{SUBGROUP\_SIZE} = 1024\).
9.3. The table domain
An evaluation domain with a size of at least 1024. This is the upper limit on the number of elements that the accumulator can hold.