SHA-256 (vs. SP1)
As of May 2024
This note compares the cost of proving execution of SHA-256 on a 32-byte input on Valida and on SP1, with SP1 using its precompile for SHA-256. In this benchmark, single-core Valida proving was estimated to be about 304 times more efficient than multi-core SP1 proving, measured by user time, which reflects the total CPU time and hence the energy expended on computing the proof. Multi-core Valida proving was estimated to be about 53.5 times faster than multi-core SP1 proving in wall clock time on this example. These are not the best results that could be obtained on this problem with Valida, since much work remains to be done on optimizing the prover and the code generated by the compiler.
The raw data is available. The following table records the means, medians, and standard deviations of the various sample groups. All measurements are given in seconds.
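As an illustration of how such summary statistics can be derived from raw timings, here is a minimal Python sketch. The function name `summarize` and the sample values are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption; the note does not say which was used.

```python
import statistics

def summarize(samples):
    """Return mean, median, and sample standard deviation of a list of timings (seconds)."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "stdev": statistics.stdev(samples),
    }

# Hypothetical timings in seconds, for illustration only (not the study's raw data).
example = [1.0, 2.0, 3.0]
print(summarize(example))
```

Applying `summarize` to each sample group (one list per prover and measure) would give one table row per group.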
The following table displays the ratios between the mean measurements for SP1 and Valida of the user time and the wall clock time, computed as the SP1 mean divided by the Valida mean. A number greater than 1 indicates that Valida is faster; a number less than 1 indicates that SP1 is faster.
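The ratios can be recomputed from the mean values reported in this note. The sketch below is a minimal Python illustration; the dictionary layout and the function name `valida_advantage` are assumptions made for the example. Because it starts from the rounded means, the results may differ slightly from the reported ratios (for instance, it gives roughly 53.52 for parallel wall clock time, where the note reports 53.47, presumably computed from unrounded data).

```python
# Mean times in seconds, copied from the measurements reported in this note.
MEANS = {
    "user": {"valida_serial": 1.1237, "valida_parallel": 2.7835, "sp1": 341.757},
    "wall": {"valida_serial": 1.151, "valida_parallel": 0.249, "sp1": 13.326},
}

def valida_advantage(measure, condition):
    """Ratio of the SP1 mean to the Valida mean; a value above 1 means Valida is faster."""
    row = MEANS[measure]
    return row["sp1"] / row[f"valida_{condition}"]

for measure in MEANS:
    for condition in ("serial", "parallel"):
        print(measure, condition, round(valida_advantage(measure, condition), 2))
```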
The following zk-VM versions were used:
The following version of the Valida LLVM compiler was used: 45bce621680189d5d006f88cbadbe9cbef403b89
The C implementation of SHA-256 which was compiled to run on Valida is a modified version of the reference implementation of SHA-256 by Brad Conte. That modified version is available here: https://github.com/lita-xyz/valida-c-examples/blob/main/sha256_32byte_in.c
The Rust implementation of SHA-256 which was compiled to run on SP1 is available in this repository. The SHA-256 library points to SP1's patched crate, which calls their precompiles under the hood.
The inputs for both Valida and SP1 are the same and are a 32-byte array of 5s.
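To cross-check the digest printed by either program, the expected output can be computed independently on the host. The following Python sketch uses the standard `hashlib` module. It assumes that "an array of 5s" means 32 bytes each holding the numeric value 5 (0x05) rather than the ASCII character '5'; the helper name `sha256_hex` is hypothetical.

```python
import hashlib

# Assumption: the input is 32 bytes, each with numeric value 5 (0x05),
# not the ASCII character '5' (0x35).
INPUT = bytes([5] * 32)

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()

print(sha256_hex(INPUT))
```

If the benchmark programs use the ASCII interpretation instead, replacing `INPUT` with `b"5" * 32` gives the corresponding reference digest.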
The following commands were run to execute the benchmarks:
For Valida:
The above command shows multi-threaded execution; for single-threaded execution, 32 is replaced with 1.
Note that Valida currently fails to verify this program, but the output is checked to be correct by examining the log time. We are working on fixing this problem, but the verification time should not have a meaningful impact on the results.
For SP1, first we built the program in the program directory with the following command:
Then we ran the following command in the script directory:
To measure the run time of a program, this study used GNU time. The test system has an AMD Ryzen 9 7950X 16-Core Processor, with 32 threads, and 124.9 GB of DDR5 RAM. During the tests, no other programs were running on the system apart from the tests themselves. Tests were performed sequentially, one after another.
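For readers who want to collect comparable numbers without GNU time, the following Python sketch measures the user CPU time and the wall clock time of a child process. It is not the tooling used in this study. It assumes a Unix-like system (the `resource` module is unavailable on Windows), and the function name `measure` and the example command are hypothetical.

```python
import resource
import subprocess
import sys
import time

def measure(cmd):
    """Run cmd once and return (user_seconds, wall_seconds) for the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # waits for the child to exit
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_utime accumulates user CPU time over all waited-for children,
    # so the difference isolates the command that was just run.
    return after.ru_utime - before.ru_utime, wall

if __name__ == "__main__":
    # Placeholder workload; substitute the actual prover command.
    user, wall = measure([sys.executable, "-c", "sum(range(10**6))"])
    print(f"user {user:.3f}s, wall {wall:.3f}s")
```

For a multi-threaded prover, user time can exceed wall clock time because it sums the CPU time across all threads, which matches the pattern seen in the parallel measurements.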
For some unknown reason, running cargo run --release on the test system caused the host program to be recompiled every time, which is not supposed to happen. This made it a little harder to measure the execution time of the SP1 prover. The build time listed in the output was therefore subtracted from the recorded time, so that only the execution, proving, and verifying times are counted.
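Means, medians, and standard deviations (seconds):

| Prover | Measure | Mean | Median | Standard deviation |
| --- | --- | --- | --- | --- |
| Valida serial | User time | 1.1237 | 1.1235 | 0.01787 |
| Valida parallel | User time | 2.7835 | 2.7865 | 0.06926 |
| SP1 | User time | 341.757 | 322.061 | 63.214 |
| Valida serial | Wall clock time | 1.151 | 1.1505 | 0.0219 |
| Valida parallel | Wall clock time | 0.249 | 0.2485 | 0.0051 |
| SP1 | Wall clock time | 13.326 | 13.328 | 0.0597 |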
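Ratios of mean measurements (SP1 mean divided by Valida mean):

| Measure | Valida condition | Valida advantage |
| --- | --- | --- |
| User time | Serial | 304.135 |
| User time | Parallel | 122.78 |
| Wall clock time | Serial | 11.578 |
| Wall clock time | Parallel | 53.47 |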