Valida Design Considerations
This section outlines the key strategic choices driving the design of Valida, the considerations behind those choices, and possible future directions for Valida along with the considerations behind them.
The overall goal motivating Valida is correct, fast, efficient, succinct, and zero knowledge proving, with the minimum engineering effort required of the application developer. Given this goal, plausible options for achieving it include:
A circuit compiler, which turns a description of a set of facts (in some high level language) into a constraint system which can be used to construct (zk-)SNARK proofs of facts in that set.
A zk-VM, and associated tooling, which provide a way to compile programs and succinctly (and in some cases, in zero knowledge) prove executions of those programs.
Both of these seem like plausible options for achieving the goal. A potential advantage of a circuit compiler is that it could produce a more nearly optimal arithmetization of the facts, and thus faster, more efficient proving, compared to a zk-VM. This makes sense in theory, because a zk-VM is something of a one-size-fits-all solution: a single circuit that proves executions of many programs cannot be optimal for each of those programs individually. However, actually achieving this outcome, a circuit compiler which results in more efficient proving than the state-of-the-art zk-VMs, might require extensive research and development, since the theory and practice of circuit compilers is relatively new.
In contrast, the theory and practice of compilers for physical and virtual machines is one of the most mature areas of computer science and software engineering. Optimizing zk-VMs is also relatively new territory, but it appears more tractable than optimizing circuit compilers, within the current state of the art.
Of these two options, the zk-VM approach seems the more practical route to achieving good results on a reasonably short planning horizon: the relevant techniques are currently better understood for zk-VMs than for circuit compilers.
Another advantage of the zk-VM approach is that it leverages a paradigm all programmers are already familiar with: computation. Zero knowledge proving is fundamentally about proving facts, not about proving executions of programs. However, every fact that can be proven using zk-SNARKs can be expressed as a fact about the execution of a program. Therefore, without loss of generality, we can think of zk-SNARK proving as succinctly, and sometimes in zero knowledge, proving the results of executing programs.
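As a minimal sketch of this reframing (the function, names, and values here are illustrative, not taken from Valida): any zk-SNARK-provable fact can be phrased as "this program, run on some (possibly private) input, returns true", and a proof of that execution, keeping the input private, proves the fact.

```rust
// Hypothetical fact: "the prover knows a number x with x * x + 1 == y".
// A zk-VM proof of this function's execution, with `x` as a private
// witness and `y` as a public input, proves the fact without revealing x.
fn statement(x: u64, y: u64) -> bool {
    x.wrapping_mul(x).wrapping_add(1) == y
}

fn main() {
    // The prover's private witness and the public claim.
    let (x, y) = (6u64, 37u64);
    // A successful (proven) execution returning true establishes the fact.
    assert!(statement(x, y));
}
```

The point is not this particular toy statement, but that the program-execution framing lets developers express facts in the same way they already express computations.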
We could instead think about zk-SNARK proving in a completely different way; we could, for instance, write sets of facts in predicate logic and come up with a circuit compiler that goes directly from that representation to a constraint system for arithmetization, as in OSL. However, the paradigm of computation is convenient because it leverages the existing knowledge of programmers about this paradigm. The circuit compiler approach can leverage the paradigm of computation, and the zk-VM approach by its nature does so.
It is convenient for the application developer if the zk-VM and associated tooling leverage pre-existing, popular programming languages, rather than a novel domain-specific language specialized for zk-SNARK proving. In order to support popular programming languages, it makes sense to leverage existing compiler technology for those languages wherever applicable.
Designing a zk-VM involves choosing an ISA, an arithmetization strategy, and a polynomial commitment scheme. All of these are complex decisions with very considerable effects on performance outcomes. Also, these decisions can be expected to change over time as lessons are learned about the effectiveness of different strategies.
A basic principle in Valida is that these design decisions should be decoupled as much as reasonably possible. In Valida's architecture, the ISA, the arithmetization strategy, and the polynomial commitment scheme can all vary more or less independently. The implementation of the ISA is coupled to the arithmetization strategy only via the intermediate representation (currently, AIR); changing that intermediate representation would require changing both the ISA implementation and the arithmetization strategy.
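A hedged sketch of this decoupling, with all trait and type names illustrative rather than Valida's actual interfaces: the ISA side emits an AIR-style intermediate representation (an execution trace plus constraints), and the prover couples it to a polynomial commitment scheme only through trait boundaries, so either side can be swapped without touching the other.

```rust
/// What the ISA implementation emits: a description of an execution
/// trace (its constraints are elided in this sketch).
trait Air {
    fn trace_width(&self) -> usize;
}

/// A polynomial commitment scheme, independent of the AIR.
trait Pcs {
    fn commit(&self, column: &[u64]) -> u64; // stand-in for a real commitment
}

/// The prover sees both components only through the trait boundaries,
/// so the ISA/arithmetization side and the PCS can vary independently.
fn prove<A: Air, P: Pcs>(air: &A, pcs: &P, trace: &[Vec<u64>]) -> Vec<u64> {
    assert_eq!(trace.len(), air.trace_width());
    trace.iter().map(|col| pcs.commit(col)).collect()
}

struct ToyAir;
impl Air for ToyAir {
    fn trace_width(&self) -> usize { 2 }
}

struct SumPcs; // toy "commitment": sum of the column (not secure!)
impl Pcs for SumPcs {
    fn commit(&self, column: &[u64]) -> u64 { column.iter().sum() }
}

fn main() {
    let trace = vec![vec![1, 2, 3], vec![4, 5, 6]];
    let commitments = prove(&ToyAir, &SumPcs, &trace);
    println!("{:?}", commitments); // one toy commitment per column
}
```

Swapping in a different `Pcs` implementation, or a different `Air` produced by a different ISA implementation, leaves the other component untouched, which is the architectural property described above.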
The ISA is probably the most difficult design choice to change in a zk-VM project, because it is coupled with both the zk-VM implementation and the compiler toolchain. In the case of Valida, the ISA is a novel ISA designed specifically for use in a zk-VM.
The thinking behind this choice is that an ISA designed specifically for use in a zk-VM can potentially be implemented more efficiently in a zk-VM than one that was not. The SNARK proving context is substantially different from the hardware execution environment in terms of how ISA design choices impact performance outcomes.
For example, the hardware context involves memory locality concerns, because information in the system travels no faster than the speed of light, and so pieces of silicon that are closer together can talk to each other with less latency than pieces of silicon that are farther apart from each other. It's because of memory locality concerns that ISAs designed for hardware include general-purpose registers. General-purpose registers are temporary storage locations which are physically close to the CPU's arithmetic logic unit (ALU), making them efficient to access for the ALU. The general-purpose registers play a special role in the ISA, being able to be used as operands to all of the operations the ALU can perform.
In the SNARK proving context, there are no memory locality concerns, and accessing RAM may be no more costly than accessing a general-purpose register. In this context, it makes sense to do away with general-purpose registers: they are not needed, and so, for the SNARK proving context, they constitute unnecessary bloat.
Being simplified and optimized for the SNARK proving setting, Valida is substantially different from most other ISAs. This means that existing compiler technology cannot be used off the shelf for Valida. A new compiler backend needs to be created, which can generate code for Valida. Since the decision is to compile popular high-level languages to Valida, instead of creating a new language, existing compiler frontend technology can be leveraged.
Lita is using LLVM as the compiler frontend and as the framework for creating the compiler backend for Valida. This makes sense because LLVM is probably the most mature open source compiler framework. LLVM supports many different source languages, compiling all of them to a target-agnostic intermediate representation (LLVM IR). LLVM also provides a framework for creating compiler backends, which compile LLVM IR into machine code. Leveraging LLVM saves a lot of effort in the complex task of creating compilers for popular programming languages like C and Rust.
Overall, the design choices behind the Valida zk-VM and the Valida LLVM compiler toolchain can be expected to eventually yield a highly usable and efficient zero knowledge proving system, with a performance advantage that will be difficult for other projects to replicate, since achieving it required developing a novel ISA and novel compiler backend technology. The cost of this approach mirrors its benefit: developing such compiler backend technology from scratch takes a lot of time and effort. Whereas other zk-VM projects that use a pre-existing, popular ISA do not have to develop any compiler technology, and can focus solely on zk-VM development, Lita needs to pursue both zk-VM development and compiler backend development in order to carry the strategy outlined here through to the goal.