ZK-VM Design Tradeoffs

How do different design choices in a zk-VM affect outcomes? In particular, how do zk-VM design choices affect the metrics of quality defined in Evaluating zk-VMs? Those metrics are correctness, security, zero knowledge, trust assumptions, other assumptions, efficiency, speed, and succinctness. This section offers some generalizations about how zk-VM design choices affect these outcomes.

Here are some of the design choices to consider when building a zk-VM:

  • What instruction set architecture should the zk-VM use?

  • What programming languages should the zk-VM support?

  • Should the zk-VM be modular or monolithic?

  • What arithmetization strategy should the zk-VM use?

  • What field (or, more generally, what ring) should the zk-VM use for its arithmetization?

  • What polynomial commitment scheme should the zk-VM use?

  • Should recursive proving be supported, and if so, by what strategy?

  • Should incrementally verifiable computation be supported, and if so, by what strategy?

Instruction set architectures

An instruction set architecture (ISA) is a model of computation which can be implemented by a physical computer or a virtual machine (VM). An ISA answers questions such as:

  • What is the memory model? How many address spaces are there, how big are those address spaces, and what can be done with each of these address spaces? For example:

    • In a von Neumann architecture, there is a single address space which can be read, written to, and executed.

    • In a Harvard architecture, there are two address spaces: the data address space (readable and writable but not executable), and the code address space (executable but not readable or writable).

    • In a 32-bit ISA, the addresses are 32 bits long, allowing for 2^32 distinct addresses to be represented. (Typically, one byte, which is 8 bits, can be stored at each address.)

    • In a 64-bit ISA, the addresses are 64 bits long, allowing for 2^64 distinct addresses to be represented.

  • What is the instruction set? I.e., what are the primitive, atomic operations which the model of computation defines, out of which all programs for this ISA are composed? (A toy sketch of a small instruction set, together with a register file and program counter, follows this list.)

  • What registers are there? A register is a storage location which can be addressed as an implicit or explicit operand of instructions. An ISA defines how many registers there are, what kinds of data they can hold, and what purposes they can be used for.

  • What is the calling convention? In other words, what are functions required to do in order to call each other? How do they pass arguments and return values? Who is responsible for saving / restoring the values in registers? Etc.
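
To make several of these concepts concrete, here is a minimal sketch of a toy register machine: a few general-purpose registers, a small invented instruction set, and a program counter. The instruction names and register count are made up for illustration; this is not the ISA of any real zk-VM.

```python
# A toy register machine, illustrating what an ISA specifies: a register file,
# a small instruction set, and how programs are executed. The instruction names
# and the register count are invented; this is not the ISA of any real zk-VM.

NUM_REGISTERS = 4  # hypothetical: four general-purpose registers

def run(program, max_steps=1000):
    regs = [0] * NUM_REGISTERS
    pc = 0  # program counter
    for _ in range(max_steps):
        op, *args = program[pc]
        if op == "imm":        # regs[a] = constant c
            a, c = args
            regs[a] = c
        elif op == "add":      # regs[a] = regs[b] + regs[c]
            a, b, c = args
            regs[a] = regs[b] + regs[c]
        elif op == "jnz":      # if regs[a] != 0, jump to instruction `target`
            a, target = args
            if regs[a] != 0:
                pc = target
                continue
        elif op == "halt":
            return regs
        pc += 1
    raise RuntimeError("step limit exceeded")

# Example program: sum the integers 5 + 4 + 3 + 2 + 1 into register 0.
program = [
    ("imm", 1, 5),       # r1 = 5 (loop counter)
    ("imm", 2, -1),      # r2 = -1 (decrement)
    ("add", 0, 0, 1),    # r0 = r0 + r1
    ("add", 1, 1, 2),    # r1 = r1 - 1
    ("jnz", 1, 2),       # loop back to instruction 2 while r1 != 0
    ("halt",),
]
print(run(program)[0])   # prints 15
```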

ISAs used for zk-VMs can be broadly classified as general-purpose ISAs, which were not designed specifically for zero knowledge proving, and zk-optimized ISAs, which were designed specifically for efficient zero knowledge proving. Examples of general-purpose ISAs include the variants of ARM, RISC-V, and x86. Examples of zk-optimized ISAs include the variants of Valida, TinyRAM, and the Cairo VM.

The choice of ISA affects efficiency, speed, and succinctness of the resulting zk-VM. ISAs which support more instructions, or instructions which do more work, will tend to result in a higher per-instruction complexity for proving and verification, but also a lower number of instructions required to solve a problem.

The net effects of ISA complexity on proving and verification costs are not straightforward to understand. The overall costs (by any metric) of proving and verification for a particular computation can be quantified as cost per instruction * number of instructions. Since increasing ISA complexity tends to increase cost per instruction while reducing the number of instructions required to solve a problem, adding or removing complexity can be either beneficial or detrimental to any metric of cost, depending on the case.
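
To make this cost model concrete, here is a small sketch with invented numbers (neither "ISA" below corresponds to a real design): the product of per-instruction cost and instruction count, rather than either factor alone, determines which design is cheaper to prove.

```python
# Hypothetical per-instruction costs and instruction counts, purely for
# illustration: total proving cost is (cost per instruction) * (number of
# instructions), so a "richer" ISA can win or lose depending on how much it
# shrinks the instruction count.

def total_cost(cost_per_instruction, num_instructions):
    return cost_per_instruction * num_instructions

print(total_cost(1.0, 1_000_000))  # simpler ISA:  1,000,000 cost units
print(total_cost(3.0,   250_000))  # richer ISA:     750,000 units -> richer ISA wins
print(total_cost(3.0,   400_000))  # richer ISA:   1,200,000 units -> richer ISA loses
```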

A zk-optimized ISA may allow for more efficiency, speed, and succinctness, because its instruction set and other features are chosen specifically to achieve that outcome. Using a zk-optimized ISA does not guarantee more efficiency, speed, and succinctness than otherwise, but the opportunity to achieve that outcome is certainly there.

An advantage of using a popular general purpose ISA, instead of a zk-optimized ISA, is that it allows one to leverage pre-existing compiler technology, including sophisticated optimization passes for that ISA which would be difficult to replicate for a new ISA.

Programming language support

Broadly speaking, programming languages can be classified as:

  • High level or low level. A low level language presents a low abstraction model of computation to the programmer; it requires the programmer to explain, with high specificity, how a computation is to be performed, relative to the primitives supported by the target computing environment. A high level language presents a high abstraction model of computation to the programmer; it allows the programmer to specify what result should be obtained without specifying exactly how the computation is to be performed.

  • General purpose or domain specific. A general purpose language is not designed for any particular application domain, whereas a domain specific language is designed for use in a particular application domain.

A zk-VM supports a language when programs written in that language can readily be compiled or otherwise prepared to run on that zk-VM. Deciding which high level languages to support is an important decision for a zk-VM project, because without any high level language support, it would be hard to solve complex problems using the zk-VM, and gaining adoption would be an uphill battle.

Some zk-VM projects are tied to a domain specific language which is the only supported high level language for programming that zk-VM. Examples of this type of zk-VM include the Cairo zk-VM (tied to the Cairo language), the Lurk zk-VM (tied to the Lurk language), and the Polygon zk-EVM (tied to Solidity). Advantages of this approach include:

  • Relative ease of implementing language support, because there is only one language to support and it has the minimal feature set needed to satisfy the use case

  • Relative ease of zk-VM implementation, because the features of the programming language can be tailored to what is easy to implement on the zk-VM, rather than tailoring the zk-VM feature set to what the high level language(s) require

  • The opportunity to tailor the high level language feature set to whatever is most efficient to implement on the zk-VM

Some zk-VM projects offer support for one or more general purpose high level languages, instead of tying themselves to a domain specific language tailored to that zk-VM. Examples of this type of zk-VM include Valida, RISC Zero, and Succinct SP1. Advantages of this approach include:

  • Application developers get to use more feature-rich languages, which in many cases they are already familiar with

  • The zk-VM project can leverage pre-existing compiler technology, leading to savings on development costs, and access to sophisticated compiler optimizations which do not need to be re-implemented

Modular vs monolithic zk-VMs

A modular zk-VM is:

  • built partially out of pluggable components,

  • extensible via a potentially infinite number of pluggable components which could be created, and

  • readily able to be instantiated in a potentially infinite number of different configurations.

In contrast to a modular zk-VM, a monolithic zk-VM is a closed system which cannot be further extended by the user. In a modular zk-VM, the instruction set is open-ended. In a monolithic zk-VM, the instruction set is closed-ended, being updated in a centralized fashion by the zk-VM project itself. A monolithic zk-VM also typically lacks the clean separation of concerns in its code architecture that would make it economical for a downstream project to customize the ISA for its use case.

The advantage of a modular zk-VM is primarily that it can be extended with instructions which, while not generally useful, are particularly useful for the given use case, resulting in fewer instructions and less work for the prover.

Arithmetization

"Arithmetization" refers to any process of expressing a set of facts A as a combination of:

  1. a set B which is the union of the solution sets of the elements of a set S of systems of polynomial equations (or constraint systems of another type, depending on the protocol): B = ⋃_{s ∈ S} { x : x is a solution of s }.

  2. a function f : A -> B which transforms a fact in A to a solution of an element of S.

The systems of polynomial equations in S are over some ring R. Typically, R is a finite field.
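
As a toy illustration of this first sense of arithmetization (the concrete encoding below is invented for this sketch): the set of facts "b is a bit" corresponds one-to-one with the solution set of the single constraint b * (b - 1) = 0 over a small prime field.

```python
# Toy arithmetization over the prime field F_97 (the field choice is arbitrary).
# Set of facts A: "b is a bit", for b in {0, 1}.
# Constraint system s: the single polynomial equation b * (b - 1) = 0 over F_97.
# The map f sends each fact to the corresponding solution of s.

P = 97  # a small prime, used only for illustration

def satisfies_constraint(b):
    return (b * (b - 1)) % P == 0

facts = [0, 1]                      # f maps "the bit is 0" -> 0, "the bit is 1" -> 1
assert all(satisfies_constraint(b) for b in facts)

assert not satisfies_constraint(2)  # non-facts do not land in the solution set
print("the facts map bijectively onto the solutions of b*(b-1) = 0 over F_97")
```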

Equally, "arithmetization" also refers to any process of proving, using an interactive oracle proof — often a polynomial interactive oracle proof — the existence of a solution to an element of a set of systems of polynomial equations over a ring R.

A polynomial interactive oracle proof is a process by which a prover convinces a verifier, through a series of interactions, that some fact is true, using a trusted third party called the oracle, to whom the prover sends polynomials and to whom the verifier sends queries for evaluations of those polynomials at specific points, which the oracle answers honestly.

More generally, in an interactive oracle proof, the objects sent to the oracle may be of any type (not just polynomials), and the queries to the oracle by the verifier may be of any type (not just single-point evaluations of polynomials).

By applying the former kind of arithmetization (expressing a set of facts as a mapping onto a solution set of a constraint system), then the latter kind of arithmetization (turning facts so expressed into an interactive oracle proof), then a commitment scheme (typically a polynomial commitment scheme), and finally the Fiat-Shamir heuristic, it is possible to construct SNARKs (i.e., succinct non-interactive proofs) and zk-SNARKs (i.e., SNARKs which are also zero knowledge).
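
As a minimal sketch of the Fiat-Shamir heuristic just mentioned (the transcript encoding and hash choice below are illustrative assumptions, not any production design): the verifier's random challenges are replaced by hashes of the transcript of prover messages so far, which is what makes the resulting proof non-interactive.

```python
# Minimal Fiat-Shamir sketch: derive "verifier challenges" by hashing the
# transcript of prover messages, instead of having a live verifier pick them.
# The message encoding and the challenge field are invented for illustration.

import hashlib

P = 2**61 - 1  # a Mersenne prime, used here only as an example challenge field

def challenge(transcript: bytes) -> int:
    digest = hashlib.sha256(transcript).digest()
    return int.from_bytes(digest, "big") % P

transcript = b""
transcript += b"commitment:" + b"\x01\x02\x03"   # stand-in for a prover commitment
c1 = challenge(transcript)                        # first challenge

transcript += b"response:" + c1.to_bytes(8, "big")
c2 = challenge(transcript)                        # second challenge depends on the first

print(c1, c2)
```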

Strategies for arithmetization are numerous and diverse. One thing they tend to share with each other is relatively steep computational costs, including asymptotically super-linear time complexity in many if not most cases.

Arithmetization strategies are expected to satisfy complex sets of requirements. They are required to be efficient, fast, and succinct enough for particular use cases, while also being secure and correct, including being sound, complete, and in many cases, zero knowledge.

In the case of a zk-VM, the set of facts which are required to be provable by arithmetization is a set of results of executions of programs running on some virtual machine. A zk-VM proof proves, for a given program, a given result, and given initial conditions, that there is some input which causes the program to produce the given result when executed from the given initial conditions. Typically, the cost of actually executing the program, even in an emulated virtual machine environment, is pretty negligible in comparison to the cost of computing the SNARK which proves the same execution.

Arithmetization strategies differ in what kind of intermediate representation they use for the facts being represented. They differ in how they transform the initial representation of facts into the intermediate representation. They also differ in how they transform the intermediate representation of facts into an interactive oracle proof (IOP). They also differ in what kind of IOP they result in, such as:

  • what ring its coefficients are drawn from,

  • what kind of polynomials it uses (e.g., univariate, multilinear, or multivariate), and

  • what kinds of queries the verifier is allowed to make to the oracle (e.g., point queries or tensor queries).

The choices of intermediate representation for arithmetization range from the more rudimentary, like R1CS as in Groth16, to the more sophisticated, like PLONKish Arithmetization as in halo2. Also of interest is AIR, as in Plonky2 and Plonky3. All of these choices of intermediate representation share in common the representation of a set of facts as the set of solutions to a constraint system. To represent a set of facts using an intermediate representation, one chooses a constraint system and a one-to-one correspondence (i.e., a bijection) between the set of facts and the solution set of the constraint system. Different intermediate representations for arithmetization differ in what forms those constraint systems and their solutions may take.
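
For concreteness, the sketch below spells out a one-constraint R1CS instance (the witness layout is a common convention, used here purely for illustration): each R1CS constraint requires (A·z) * (B·z) = (C·z) entry-wise, and here a single constraint captures the fact x * y = out.

```python
# A one-constraint R1CS instance over F_97 encoding the fact x * y = out.
# The witness vector layout z = (1, x, y, out) is a common convention,
# used here purely for illustration.

P = 97

A = [[0, 1, 0, 0]]   # A . z picks out x
B = [[0, 0, 1, 0]]   # B . z picks out y
C = [[0, 0, 0, 1]]   # C . z picks out "out"

def dot(row, z):
    return sum(r * v for r, v in zip(row, z)) % P

def r1cs_satisfied(A, B, C, z):
    return all(dot(a, z) * dot(b, z) % P == dot(c, z)
               for a, b, c in zip(A, B, C))

z_good = [1, 6, 7, 42]   # 6 * 7 = 42: a valid witness
z_bad  = [1, 6, 7, 41]   # not a solution

print(r1cs_satisfied(A, B, C, z_good))  # True
print(r1cs_satisfied(A, B, C, z_bad))   # False
```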

The choice of intermediate representation affects how readily the ISA's semantics may be encoded as a constraint system, and how much complexity is present in a constraint system which encodes the ISA. The choice of intermediate representation also affects what strategies may be used to turn the intermediate representations of facts into polynomial IOPs. In general, using a more powerful intermediate representation reduces the complexity of the intermediate representations of facts, while increasing the complexity of transforming those intermediate representations into polynomial IOPs. As such, the overall effect of the choice of intermediate representation on proving complexity is not straightforward to deduce and needs to be tested.

There are various choices of methods for transforming an intermediate representation of facts into an interactive oracle proof. Broadly, these can be classified as:

  • General purpose or special purpose. General purpose protocols can be applied regardless of the content and nature of the set of facts to be proven, as long as those facts can be checked efficiently by a computer algorithm. Special purpose protocols can only be applied to proving sets of facts with very particular characteristics. Any protocol capable of being used by itself to implement a zk-VM is general purpose, but special purpose protocols can still be relevant to zk-VM design. When used in conjunction with general purpose protocols, special purpose protocols may accelerate particular aspects of proving. For example, GKR style protocols, which are special purpose, may be used to prove lookup arguments as part of a broader zk-VM design strategy.

  • Univariate or multilinear. A univariate polynomial IOP sends univariate polynomials (i.e., polynomials in one variable) to the oracle. A multilinear polynomial IOP sends multilinear polynomials (i.e., polynomials which are linear in each variable). Multilinear protocols may be more efficient than univariate protocols in the presence of high degree polynomial constraints. This is particularly true given recent advances in multilinear protocols, such as Basefold and HyperPlonk. Univariate protocols are compatible with a wider variety of choices of polynomial commitment scheme. Unlike all known multivariate polynomial commitment schemes, the univariate KZG commitment scheme offers constant sized proofs, independent of the complexity of the facts being proven.

  • Divisibility argument based or sumcheck based. Divisibility argument based protocols work by proving that a linear combination of constraint polynomials vanishes on all elements of an evaluation domain; they prove this by showing that the linear combination of constraints is perfectly divisible by the zerofier polynomial of the evaluation domain, that is, the lowest degree nonzero monic polynomial which is zero on all points of the evaluation domain. Sumcheck based protocols are based on proving that zero is equal to the sum of evaluations of a multivariate polynomial over the vertices of a Boolean hypercube. With both divisibility argument based protocols and sumcheck based protocols, the argument is supposed to imply that with high probability, the input data satisfies the constraints. Examples of divisibility argument based protocols include PLONK, PLONKish arithmetization, and STARKs. Examples of sumcheck based protocols include Spartan and HyperPlonk. Divisibility argument based protocols are commonly univariate. They commonly rely on a discrete Fourier transform (DFT) in order to transform a polynomial represented in coefficient form (as a vector of its coefficients) to a polynomial represented in point-evaluation form (as a vector of its evaluations at a fixed set of points). They also commonly rely on the inverse of this, an inverse DFT, in order to transform a point-evaluation form of a polynomial to a coefficient form. Because the DFT and inverse DFT have super-linear computational complexity, they become the dominant cost center in divisibility argument based protocols at some problem size, if the rest of the protocol has linear computational complexity. Dividing large polynomials is another significant cost center for divisibility argument based protocols. (A minimal divisibility check over a toy field is sketched after this list.) Sumcheck based protocols, on the other hand, tend to be linear time. This means in practice that sumcheck based protocols can scale to larger problem sizes while remaining relatively fast and efficient. Although divisibility argument based protocols are more common in practice today, as of this writing, there is a lot of interest in sumcheck based protocols due to their greater potential for scalability.

  • Lookup based. This classification refers to an emerging class of techniques for arithmetization which rely on lookups into large structured lookup tables as an alternative to expressing constraints as systems of polynomials. Lasso and Jolt are representative examples of this type of approach. It is perhaps too early to say what the tradeoffs are for lookup based arithmetization strategies vs. other strategies. Regardless, Lita is watching the development of these techniques with interest.
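
Here is a minimal sketch of the divisibility argument idea described in the list above, with a tiny field and evaluation domain chosen only for illustration: a constraint polynomial vanishes on every point of an evaluation domain exactly when it is divisible by the domain's zerofier.

```python
# Divisibility check over F_97 (tiny parameters, for illustration only).
# A polynomial c(x) vanishes on every point of a domain H exactly when the
# zerofier Z_H(x) = prod_{h in H} (x - h) divides c(x) with zero remainder.

P = 97

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def poly_divmod(num, den):
    # Long division of polynomials over F_P; coefficients are lowest degree first.
    num = num[:]
    inv_lead = pow(den[-1], P - 2, P)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        coeff = num[i + len(den) - 1] * inv_lead % P
        quot[i] = coeff
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - coeff * d) % P
    return quot, num[:len(den) - 1]

H = [1, 2, 3]                      # evaluation domain
zerofier = [1]
for h in H:
    zerofier = poly_mul(zerofier, [(-h) % P, 1])   # multiply by (x - h)

# c(x) = (x - 1)(x - 2)(x - 3)(x - 5): vanishes on H, so Z_H divides it.
c = poly_mul(zerofier, [(-5) % P, 1])
quotient, remainder = poly_divmod(c, zerofier)
print(remainder)   # all zeros: c is divisible by the zerofier

# c2(x) = c(x) + 1 does not vanish on H, so division leaves a remainder.
c2 = c[:]
c2[0] = (c2[0] + 1) % P
_, remainder2 = poly_divmod(c2, zerofier)
print(remainder2)  # nonzero remainder
```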

The classifications just given are not intended to cover all known arithmetization strategies. Some, such as BCG20 and Orion, do not fit cleanly into the classifications above; they are neither divisibility argument based, nor are they straightforwardly sumcheck based.

It is difficult to generalize about how different choices of arithmetization strategy drive outcomes in zk-VM designs, partly because arithmetization strategies themselves are so diverse and offer so many degrees of freedom in their design.

The most useful generalization is probably that everything comes at a cost: making a protocol better on some metric tends to make it worse on another metric. For example, more succinctness tends to come at the cost of more proving complexity; small, secure, easy to verify proofs are hard to generate. Proofs which are easy to generate tend to be larger, and/or harder to verify, and/or less secure. It seems reasonable to expect that there will never be an arithmetization strategy which is the best regardless of how you weigh the different metrics of success; instead, use cases and priorities will continue to drive the design choices for arithmetization strategies.

Fields and rings

A ring is a type of algebraic structure which features addition and multiplication operations that satisfy some basic laws of algebra. In a ring, addition is required to be commutative, associative, and invertible, with an identity element (zero). Multiplication in a ring is required to be associative, unital, and distributive over addition. A field, by definition, is a commutative ring where multiplication is invertible, meaning that every non-zero element of a field has a multiplicative inverse. A finite field is a field where there are only finitely many distinct values.
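
As a small sketch of these definitions (the modulus below is an arbitrary choice for illustration): in the finite field of integers modulo a prime, addition and multiplication are performed modulo the prime, and every nonzero element has a multiplicative inverse, computable via Fermat's little theorem.

```python
# Arithmetic in the finite field F_97 (the modulus 97 is an arbitrary small prime).
# Addition and multiplication are modular; every nonzero element is invertible.

P = 97

def add(a, b): return (a + b) % P
def mul(a, b): return (a * b) % P
def inv(a):
    assert a % P != 0, "zero has no multiplicative inverse"
    return pow(a, P - 2, P)   # Fermat's little theorem: a^(P-2) = a^(-1) mod P

a, b = 13, 58
assert mul(a, inv(a)) == 1                              # multiplicative inverses exist
assert add(a, (P - a) % P) == 0                         # additive inverses exist
assert mul(a, add(b, 5)) == add(mul(a, b), mul(a, 5))   # distributivity
print("field axioms hold for these sample values")
```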

In arithmetization, facts are represented as collections of elements of modules (vector spaces, when the ring is a field) over some ring, typically a finite field. For example, facts are represented as collections of polynomials in coefficient form, or collections of polynomials in point-evaluation form.

Designing a ZK proving protocol typically entails choosing one or more rings or fields: typically, at least, a field of coefficients for the polynomials used to describe the constraints, and often also one or more extension fields of the field of coefficients.

For elliptic curve based ZK proving protocols, the choice of field of coefficients is heavily constrained relative to the choice of elliptic curve. Each cryptographic elliptic curve is the solution set of a polynomial equation over some finite field of coefficients. Each such solution set defines a group, called an elliptic curve group. Out of this elliptic curve group, the ZK proving protocol chooses a point, and this point generates a cyclic subgroup of the elliptic curve group. Typically, this cyclic subgroup is of prime order p, which means that it is isomorphic to a finite field Fp. Therefore it is possible to define a homomorphism from the additive group of Fp into the elliptic curve group, by mapping the unit (i.e., the multiplicative identity, like the number one) of Fp to the chosen generator of the cyclic subgroup. This mapping of the unit to the generator can be extended to a homomorphism in exactly one way. This type of setup can be used to implement zero knowledge protocols with arithmetization using polynomials with coefficients drawn from Fp.
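
The sketch below illustrates this setup on a deliberately tiny curve (y^2 = x^3 + 7 over F_17, with base point G = (1, 5)); the parameters have no cryptographic value and are chosen only so the arithmetic is easy to follow. Mapping a scalar n to n·G is the homomorphism described above: (a + b)·G equals a·G + b·G.

```python
# Toy elliptic curve y^2 = x^3 + 7 over F_17 with base point G = (1, 5).
# These parameters are far too small to be secure; they only illustrate that
# scalar multiplication n -> n*G is a homomorphism from addition of scalars
# to the elliptic curve group operation.

P = 17
A_COEFF, B_COEFF = 0, 7
G = (1, 5)
INF = None                      # the point at infinity (group identity)
assert (G[1]**2 - (G[0]**3 + A_COEFF * G[0] + B_COEFF)) % P == 0  # G is on the curve

def inv(x): return pow(x, P - 2, P)

def ec_add(p1, p2):
    if p1 is INF: return p2
    if p2 is INF: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF                                        # p2 = -p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A_COEFF) * inv(2 * y1) % P   # tangent slope
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P                # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mul(n, point):
    result, addend = INF, point
    while n > 0:                 # double-and-add
        if n & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        n >>= 1
    return result

a, b = 6, 11
lhs = scalar_mul(a + b, G)
rhs = ec_add(scalar_mul(a, G), scalar_mul(b, G))
print(lhs == rhs)   # True: (a + b)*G == a*G + b*G
```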

Typically, elliptic curve based ZK proving protocols need to use a fairly large field of coefficients, such as a 255-bit field, and an elliptic curve whose defining equation is over a similarly large field of coefficients, in order for the security arguments of the protocols to go through and result in an acceptable level of security. Working with large fields introduces a constant factor of overhead relative to working with small fields, when only small numbers are being encoded. When large numbers are being encoded, then working with large fields may be beneficial, because working with large numbers in a small field which is too small to represent those numbers requires that the large numbers be broken down into limbs and a system of constraints used to validate arithmetic operations on them.
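
As a sketch of the limb decomposition idea (the limb width and carry handling below are simplified for illustration): a 64-bit addition can be checked using 32-bit limbs by making the carries explicit and constraining each limb equation, which is roughly the shape such constraints take inside a small field.

```python
# Checking a 64-bit addition using 32-bit limbs, as a constraint system would:
# split a and b into (low, high) limbs, make the carries explicit, and check
# each limb equation. Simplified for illustration.

MASK32 = 2**32 - 1

def to_limbs(x):
    return x & MASK32, (x >> 32) & MASK32

def check_add_64(a, b, c):
    a_lo, a_hi = to_limbs(a)
    b_lo, b_hi = to_limbs(b)
    c_lo, c_hi = to_limbs(c)
    carry_lo = (a_lo + b_lo) >> 32             # witness value; constrained to {0, 1}
    carry_hi = (a_hi + b_hi + carry_lo) >> 32
    ok_lo = a_lo + b_lo == c_lo + (carry_lo << 32)
    ok_hi = a_hi + b_hi + carry_lo == c_hi + (carry_hi << 32)
    return ok_lo and ok_hi and carry_lo in (0, 1) and carry_hi in (0, 1)

a, b = 0xFFFF_FFFF_0000_1234, 0x0000_0001_FFFF_0001
c = (a + b) & (2**64 - 1)                      # 64-bit wrapping addition
print(check_add_64(a, b, c))                   # True
print(check_add_64(a, b, (c + 1) & (2**64 - 1)))  # False
```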

When ZK proving protocols work over small fields (e.g., 32-bit or 64-bit fields), it is often necessary to use an extension field in certain steps of proving, in order for the security arguments of the protocols to go through and result in an acceptable level of security. Plonky2 and Plonky3, for example, take this approach; they work on polynomials over small finite fields of prime order, and they use field extensions in subsequent stages of the argument in order to achieve the required security levels.

Recently, some ZK proving protocols have been developed which are able to work over binary fields (e.g., Binius). Binary fields are fields of size 2^n for some positive integer n. Also, some ZK proving protocols have been developed which are able to work over arbitrary finite fields (e.g., Brakedown), or over classes of rings which are not necessarily fields (e.g., Rinocchio). These advances seem potentially useful, in that there is often a particular choice of ring which would be the most natural for a particular use case, and if the ZK proving protocol is flexible in the choice of ring, then it is more likely that the ideal choice of field for the use case is compatible with the ZK proving protocol. However, the most naturally useful rings for zk-VMs, which are the rings of integers modulo 2^32 and 2^64 (i.e., 32-bit and 64-bit machine arithmetic), are not amenable to this kind of treatment within the scope of the current research.

Polynomial commitment schemes

A commitment scheme is a protocol which allows a participant (called the prover) to create a fingerprint of some data X, which is called a commitment to X, and later prove facts about X without revealing X, using the commitment to X. A polynomial commitment scheme is a commitment scheme where the prover commits to polynomials and proves the evaluations of those polynomials at particular points.

Polynomial commitment schemes are a key ingredient in most strategies for creating SNARKs. Typically, a ZK proving protocol calls for the prover to encode, in the form of polynomials, the information evidencing the facts to be proven. The prover commits to those polynomials and, typically, some other derived polynomials. For instance, a divisibility argument based SNARK will typically commit to the quotient, by a zerofier polynomial, of a linear combination of constraint polynomials, after substituting variables in those constraint polynomials with the polynomials representing the evidence. The commitments are sent to the verifier, along with enough evaluations and proofs of those evaluations that the verifier can be convinced of the fact being proven, without the prover needing to transmit the whole polynomials or all of the evidence of the fact being proven.
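
The sketch below illustrates the commit / open / verify shape of a hash based polynomial commitment in a deliberately insecure toy form (no Reed-Solomon encoding and no low degree test, and the leaf format is invented): the prover commits to a polynomial's evaluations with a Merkle root and later opens one evaluation with an authentication path.

```python
# Toy "commit to evaluations, open one of them" scheme. It has the shape of a
# hash based polynomial commitment (Merkle root as commitment, Merkle path as
# opening proof) but omits the Reed-Solomon encoding and low degree testing
# that a real scheme such as a FRI-based PCS relies on for security.

import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_layers(leaves):
    layers = [[h(leaf) for leaf in leaves]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers

def open_at(layers, index):
    path = []
    for layer in layers[:-1]:
        path.append(layer[index ^ 1])   # sibling node at this layer
        index //= 2
    return path

def verify(root, leaf, index, path):
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

P = 97
coeffs = [3, 1, 4, 1]    # p(x) = 3 + x + 4x^2 + x^3 over F_97
evals = [sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P for x in range(8)]
leaves = [e.to_bytes(4, "big") for e in evals]

layers = merkle_layers(leaves)
root = layers[-1][0]                       # the "commitment"

i = 5
proof = open_at(layers, i)
print(verify(root, leaves[i], i, proof))   # True: the claimed evaluation opens correctly
print(verify(root, ((evals[i] + 1) % P).to_bytes(4, "big"), i, proof))  # False
```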

Polynomial commitment schemes (PCSs) can be broadly classified as follows (although these classifications are not intended to be exhaustive):

  • According to their coefficient rings. Most PCSs will only work for polynomials whose coefficients are drawn from a particular ring, or a ring satisfying particular constraints. Typically, the ring of coefficients has to be a finite field, and not just any finite field will work, but only certain finite fields will work as the ring of coefficients for a PCS. This is a very important consideration about a PCS, since the choice of coefficient ring has a large impact on the complexity of polynomial constraint systems capturing particular sets of facts. As mentioned previously, larger fields are more complex to compute with, but may result in simpler constraints when dealing with large numbers. Also, oftentimes there is a specific choice of coefficient ring which would result in by far the simplest possible constraint systems for a particular zk-VM. In the case of Cairo, for example, the source language and ISA used for the zk-VM are designed so that it is most natural to arithmetize the zk-VM using the coefficient ring which it is most natural for the PCS to use. When this is not done, then typically the most natural choice of coefficient ring would be the ring of 32-bit or 64-bit integers under modular arithmetic, but typically, this choice of coefficient ring is not available for a PCS.

  • Univariate vs multilinear. A univariate PCS can commit to univariate polynomials. A multilinear PCS can commit to multilinear polynomials. Since the coefficient (or evaluation) vector of a univariate polynomial can be re-encoded as the evaluation table of a multilinear polynomial, multilinear PCSs can generally be used in univariate protocols. On the other hand, a univariate PCS cannot be used in a multilinear protocol. An advantage of multilinear protocols over univariate protocols is that they may provide for a more efficient encoding of facts as polynomial constraints and polynomials satisfying those polynomial constraints. Multilinear protocols are more apt to have linear proving complexity, whereas univariate protocols are more apt to have super-linear proving complexity. An advantage of univariate protocols over multilinear protocols is that univariate PCSs offer the option of constant sized proofs of point evaluations. The univariate KZG commitment scheme offers constant sized proofs, and Lita is not aware of any multilinear PCSs which offer constant sized proofs.

  • Singular vs batch. A batch PCS can use a single commitment to commit to several polynomials, and a single proof to prove many point evaluations of many polynomials. This is much more useful than a PCS which can only be used to commit to single polynomials in a commitment and prove single point evaluations in a proof. For this reason, PCSs used in complex SNARK protocols are typically batch PCSs.

  • Elliptic curve based vs hash based. In a typical elliptic curve based PCS, the operation of committing to a polynomial is a homomorphism from the additive group of a vector space over the ring of coefficients into an elliptic curve group. The KZG commitment scheme and the inner product argument are two examples of elliptic curve based PCSs. In a typical hash based PCS, the operation of committing to a polynomial is an application of hashing. For example, in a FRI-based PCS, the operation of committing to a polynomial consists of first computing its Reed-Solomon codeword and then computing the Merkle hash tree of the Reed-Solomon codeword. The root of the Merkle hash tree of a polynomial or a collection of polynomials is the polynomial commitment in a FRI-based PCS. Elliptic curve based PCSs tend to offer smaller proofs than hash based PCSs. Hash based PCSs tend to offer less proving complexity, because they can work securely with smaller field sizes, and they use fewer finite field operations due to not working with elliptic curves. Some elliptic curve PCSs have easier to verify proofs, as compared to hash based PCSs, but this is not true across the board. As a broad generalization, KZG PCSs have less verification complexity than FRI-based PCSs, and FRI-based PCSs have less verification complexity than the inner product argument. Typically, elliptic curve based PCSs are not secure in theory against a sufficiently powerful quantum computer, because they assume that the discrete logarithm problem is hard in a certain subgroup of a certain elliptic curve group. Shor's algorithm offers a refutation in theory of the proposition that these elliptic curve PCSs are secure against a sufficiently powerful quantum computer. On the other hand, hash based PCSs typically are plausibly secure in theory against a sufficiently powerful quantum computer.

  • Linear time prover vs super-linear time prover. In a linear time prover, the computational complexity of proving scales asymptotically linearly with the size of the inputs. For example, in a linear time prover, processing an input which is twice as long will take roughly twice as much time, for sufficiently large inputs, all else being equal. In a super-linear time prover, the computational complexity of proving scales asymptotically faster than linearly with the size of the inputs. For example, in a super-linear time prover, proving time may scale log-linearly with the size of the inputs, written in big O notation as O(n * log(n)). FRI-based PCSs, for example, have O(n * log(n)) proving time complexity (their proof sizes and verification times are on the order of O((log(n))^2)). The difference between a linear time prover and a super-linear time prover has to do with asymptotic complexity, as opposed to the concrete computational complexity at any specific problem size. A linear time prover is more efficient than a super-linear time prover at sufficiently large problem sizes, but not necessarily at any particular problem size. In principle, asymptotic complexity need not matter, because there is always an upper limit in practice on how big the problems are. In practice, it does matter: practically relevant problems get big enough that linear time provers tend to end up dominating super-linear time provers at practically relevant problem sizes. (A small crossover calculation is sketched after this list.)

  • Constant vs sub-linear vs linear proof size and verifier complexity. It is useful to think of succinctness as three separate yet related properties: proof size, verifier time complexity, and verifier space complexity. Of these, the most discussed are probably proof size and verifier time complexity. All three metrics are particularly important when it comes to verifying SNARK proofs in a smart contract on a layer 1 blockchain. Aside. Layer 1 blockchains typically have fairly stringent resource usage constraints. In a layer 1 blockchain, every computation is highly replicated; typically, every node runs the computations for every transaction in each block. The memory and compute capacity of each node can be thought of as a globally shared resource; the whole world has to share them. Also, the layer 1 network as a whole has only as much computational throughput as the least powerful node in the network. Therefore, in order to keep the hardware requirements for running a node reasonable, and to some extent by necessity of the design, layer 1 blockchains tend to place strict limits on the size of transactions and the time and space complexity of smart contract computations. To contend with this, smart contracts which want to verify SNARK proofs can limit the proof size, and the time and space complexity of verification, so that the proof can be posted and verified in a single transaction. Alternatively, they can try to break up the SNARK proof and the process of verifying it into multiple transactions, as many as are required to fit each transaction within the resource constraints of the layer 1 blockchain. End aside. Each metric of succinctness (proof size, verifier time complexity, and verifier space complexity) can, for a PCS, be asymptotically constant, sub-linear, or linear. For example, KZG PCSs have constant proof size, constant verifier time complexity, and constant verifier space complexity. For example, the inner product argument has sub-linear proof size, linear verifier time complexity, and sub-linear verifier space complexity. The ideal for succinctness is, of course, constant proof size and constant verifier time and space complexity. This ideal is achieved by KZG PCSs, but KZG PCSs require a trusted setup, so as usual, there are no ideal solutions, only tradeoffs. Also worth noting is that PCSs with less proving complexity than KZG tend to produce larger proofs which are harder to verify than KZG proofs; FRI-based PCSs are an example, trading larger and harder to verify proofs for less proving complexity.
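
To make the asymptotics point from the list above concrete (the constants below are invented; real provers differ): a super-linear prover with a small constant can beat a linear prover with a large constant at small sizes, and the ordering flips past some crossover size.

```python
# Invented cost models illustrating why asymptotics matter only past a crossover:
# a linear time prover with a large constant vs an n*log(n) prover with a small one.

import math

def linear_prover_cost(n):      return 50 * n            # hypothetical constants
def superlinear_prover_cost(n): return 2 * n * math.log2(n)

for n in [2**10, 2**20, 2**28, 2**30]:
    lin, sup = linear_prover_cost(n), superlinear_prover_cost(n)
    winner = "linear" if lin < sup else "super-linear"
    print(f"n = 2^{int(math.log2(n))}: {winner} prover is cheaper")
# The crossover here is at n = 2^25, where 2*log2(n) = 50; beyond that size,
# the linear time prover wins despite its larger constant.
```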

Overall, these considerations suggest that a multilinear hash-based PCS would currently be the ideal way to go if prioritizing low proving complexity, lack of a trusted setup process, and proofs that are not too large and not too hard to verify. These considerations also suggest that a KZG PCS is currently the way to go if the overriding concern is minimizing proof size and verification complexity, and it's okay to have a trusted setup process. Presumably, the situation will continue to change as new research continues to emerge, and these evaluations should be expected to change accordingly.

Recursive proving and incrementally verifiable computation

The facts that people want to prove grow ever more complex. In particular, there are cases where one wants to prove a set of facts which is the concatenation of some new facts onto a pre-existing set of facts, and which grows over time without shrinking. Proving the validity of a ledger is an example of such a case.

Proving time complexity for a SNARK cannot be any better than linear in the input size. Therefore, at some problem size, it becomes necessary to distribute the computation of a proof across multiple machines and/or multiple executions of the prover algorithm. Some SNARK protocols, such as DIZK, natively support distributed proving; most do not. For SNARK protocols which do not natively support distributed proving, recursive proving and incrementally verifiable computation can serve as means of supporting distributed proving.

Sometimes the need for incremental proving arises, not because of the scale and complexity of the facts being proven, but because the facts are arriving at different points in time. When a set of facts should be proven incrementally as the facts come in, with a low marginal cost of adding to the set of facts proven, then recursive proving and incrementally verifiable computation can apply.

Recursive proving is when a SNARK is used to verify another SNARK. In the most basic case, a SNARK proves that another SNARK exists which proves a given fact; this is proving that same fact with extra steps. This can be useful for compressing a proof of a fact into a more succinct proof. In more advanced cases, a SNARK proves the existence of one or more other SNARKs, possibly along with some other facts; this can be useful for incrementally proving sets of facts.

For a zk-VM, typically the simplest approach for implementing recursive proving is to compile the SNARK verifier to run on the zk-VM. This approach is not always efficient, however. Another way to solve the problem is to create a specialized constraint system for verifying SNARK proofs; this tends to be efficient but far more effort to implement. A compromise approach is to compile the SNARK verifier to run on the zk-VM, but to tailor the zk-VM for this use case by using an extended ISA which includes opcodes that do critical pieces of work for the verifier. This type of compromise approach may offer the best set of tradeoffs in terms of proving performance, ease of implementation, and ease of verifying the solution.

Recursive proving can have bounded or unbounded depth of recursion. Recursive proving with unbounded depth of recursion means that within one system, one can prove a fact A, then prove the existence of a SNARK proving A, then prove the existence of a SNARK proving the existence of a SNARK proving A, and so on for a potentially infinite number of recursive proofs.

Recursive proving, with unbounded depth of recursion, is not always possible. When recursive proving with unbounded depth is possible, then there is a size of computation or constraint system which is big enough to represent verification of a proof of a computation or constraint system of the same size. The minimal such size, if it exists, is called a threshold of recursion. When a SNARK proving system has no threshold of recursion, then it is not possible to construct recursive proofs of unbounded depth for that system, in that system.

Another issue to consider with recursive proving is the potential loss of soundness. Soundness refers to the probability that the verifier will accept a proof of something false; lower probability means better soundness. Loss of soundness with recursive proving refers to the fact that soundness error (i.e., the probability of the verifier accepting something false) may compound with increasing recursion depth. For example, suppose a SNARK protocol has at most a 10% probability of the verifier accepting a proof of something false; this is an unacceptably high probability, but let's go with it for the sake of ease of calculation. Then we can say there's at most a 10% probability that a SNARK proving a fact A exists despite A being false, and there's at most a 10% probability that a SNARK proving the existence of a SNARK proving A exists despite there being no SNARK proving A. This yields a probability of at most 20% that there is a SNARK proving the existence of a SNARK proving A, despite A being false. A better rate of compounding soundness error may be realizable in practice, since this calculation only purports to give an upper bound on the soundness error.
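
The arithmetic in the example above is a union bound, and it extends to any recursion depth; the sketch below tabulates it (the 10% per-layer soundness error is, as noted, unrealistically large and is used only to match the example).

```python
# Union-bound accumulation of soundness error with recursion depth:
# if each layer has soundness error at most eps, then a depth-d recursive proof
# has soundness error at most d * eps (capped at 1). The per-layer error of 10%
# is unrealistically large; it is used only to match the example in the text.

eps = 0.10
for depth in [1, 2, 4, 8]:
    bound = min(depth * eps, 1.0)
    print(f"depth {depth}: soundness error <= {bound:.0%}")
# depth 2 reproduces the "at most 20%" figure from the example above.
```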

There is an emerging area of research on quantifying the compounding of soundness error with increasing recursion depth. To date, the best results Lita is aware of in this area are from Chiesa, Guan, Samocha, and Yogev (2023). This research found that some practically realizable recursive SNARKs have a soundness error which does not compound with increasing recursion depth. This research is fairly new, and not published in a peer reviewed format. For these reasons, Lita is taking these findings with a grain of salt, while remaining hopeful about a future holding unbounded SNARK recursion with known levels of security.

Incrementally verifiable computation is neither the same thing as recursive proving, nor a more general category containing recursive proving, nor a subset of recursive proving. Rather, the two categories overlap, with neither subsuming the other.

Incrementally verifiable computation (IVC) is the process of succinctly and non-interactively proving and verifying a computation in time-slice increments. For example, given a SNARK proof of the result of computation A up to time t1, and a SNARK proof of the result of computation A from time t1 to time t2, an incrementally verifiable computation scheme provides a SNARK proof of the result of computation A up to time t2. The SNARK proofs which are folded together in this manner typically need to be specially structured in order to allow for incrementally verifiable computation to happen efficiently.

Incrementally verifiable computation could also conceivably take other forms besides folding together proofs of consecutive time slices of execution. For example, if the goal of a computation is to verify that every element of a finite set S satisfies a property P, then an IVC scheme could conceivably exist which would fold together two proofs, each proving that it was computed that P is satisfied for all elements of some subset of S, to prove that each element of the union of those two sets satisfies P. For another example, IVC schemes could conceivably exist which would fold together two computations that occur in parallel with each other, operating on shared memory.

Recursive proving provides a means of implementing IVC. IVC can also be implemented without recursive proving, for example, using folding schemes. Folding schemes, generically, provide a means of folding together two statements of the same type as each other, and two proofs or two witnesses of the same type as each other, to get a new statement and a new proof or witness, once again belonging to the same types as the inputs. Folding schemes include accumulation schemes.

In folding schemes with selective verification, a folding scheme is paired with a means of proving just one of the statements folded together. Folding schemes with selective verification may be relevant, for example, to implementing ZK light clients for decentralized ledgers, where a bunch of clients want to know facts about the same ledger, but they all want to know different facts.

Although not all cases of IVC are cases of recursive proving, recursive proving can nonetheless be used to implement IVC in full generality: everything that can be done with IVC can be done with recursive proving, but not everything that can be done with recursive proving is IVC. However, IVC by means other than recursive proving, such as IVC via folding schemes, may end up being more efficient than IVC via recursive proving. For this reason, IVC is of interest independently of recursive proving, and IVC by means other than recursive proving should be considered, as applicable, in cases where IVC is required.

The earliest known folding schemes use additively homomorphic polynomial commitment schemes. This means that these folding schemes require that committing to a polynomial is an operation which commutes with addition. If f is the operation of committing to a polynomial in an additively homomorphic PCS, then for all polynomials p and q of the relevant type, f(p + q) = f(p) + f(q). The addition (+) sign on the left hand side of the equation denotes addition of polynomials, whereas the addition sign on the right hand side of the equation denotes whatever is the relevant addition operation on the polynomial commitments, most often the addition operation of an elliptic curve group. Nova is a representative example of this homomorphic approach to IVC via folding schemes.
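
As a toy sketch of an additively homomorphic commitment (a Pedersen-style multi-exponentiation with tiny, insecure, arbitrarily chosen parameters): adding two coefficient vectors corresponds to combining their commitments with the group operation, which is the "relevant addition operation" referred to above, and is what lets a folding scheme fold a random linear combination of two witnesses without re-committing from scratch.

```python
# Toy additively homomorphic commitment: C(m) = prod_i g_i^(m_i) mod Q.
# The modulus and generators are tiny and arbitrary (not a secure setup); the
# point is only that C(p + q) == C(p) * C(q), i.e. addition of coefficient
# vectors maps to the group operation on commitments, which folding schemes exploit.

Q = 2**61 - 1                 # a Mersenne prime; commitments live in Z_Q^*
GENERATORS = [3, 5, 7, 11]    # arbitrarily chosen group elements, one per coefficient

def commit(coeffs):
    c = 1
    for g, m in zip(GENERATORS, coeffs):
        c = c * pow(g, m, Q) % Q
    return c

p = [4, 0, 2, 9]              # coefficient vectors of two polynomials
q = [1, 7, 3, 5]

p_plus_q = [a + b for a, b in zip(p, q)]
assert commit(p_plus_q) == commit(p) * commit(q) % Q     # additive homomorphism

# Folding flavor: a random linear combination of witnesses can be committed to
# by combining the existing commitments, without re-committing from scratch.
r = 12345
folded = [a + r * b for a, b in zip(p, q)]
assert commit(folded) == commit(p) * pow(commit(q), r, Q) % Q
print("commitment is additively homomorphic")
```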

There are also emerging techniques for doing folding schemes without using homomorphic PCSs. However, these techniques still have significant limitations, such as only allowing bounded depth of folding.
