What is a SNARK?
A SNARK stands for a succinct proof that a certain statement is true. Succinct here is meaning that the proof is "short". For example, I have a statement:
- I know an such that .
In SNARK, the proof should be short and fast to verify. A trivial proof of the above statement is to simply send to the verifier. However, that proof is not short; it is as big as . Verification is also not fast, as the verifier has to hash the entire message to actually see the proof.
A SNARK can have a proof size of few KBs and verification should take at most seconds.
zk-SNARK
In the case of a zk-SNARK, the proof reveals nothing about . zk-SNARKs have many applications:
- Private transactions: Tornado Cash, ZCash, IronFish, Aleo (private dApps).
- Compliance: private proofs of solvency & compliance, zero-knowledge taxes
- Scalability: Rollup systems with validity proofs
To understand zk-SNARKs, we need quite a bit of cryptography:
- Arithmetic Circuits
- Argument Systems
Arithmetic Circuits
Fix a finite field for some prime . A finite field is just a set of numbers where we can do addition and multiplication in modulo .
An arithmetic circuit is a DAG (directed acyclic graph) where internal nodes are labeled and inputs are labeled . The circuit defines an -variate polynomial with an evaluation recipe.
Here is an example:
flowchart LR 1((1)) --> - x2((x2)) --> - 1((1)) --> + x1((x1)) --> + x2((x2)) --> + + --> x - --> x x1 --> x x --> r(( ))
This circuit defines the operation .
For convenience, the size of the circuit refers to the number of gates, and is denoted as . In the example above, .
A theorem states that all polynomial time algorithms can be captured by polynomial sized arithmetic circuits!
For example:
- You can implement a circuit that does . This outputs 0 if is the preimage of using SHA256, and something other than 0 otherwise. This circuit uses around 20K gates, which is not bad!
- You can have a that outputs 0 if is a valid ECDSA signature on with respect to .
Argument Systems
Consider a public arithmetic circuit where
- is a public statement in
- is a secret witness in
There will be a Prover with access to and a Verifier with access to . Prover's goal is to convince a Verifier that s.t. .
sequenceDiagram actor P as Prover actor V as Verifier note over P: knows x, w note over V: knows x loop interactions P ->> V: send commitments and stuff V ->> P: send queries and stuff end note over V: accept or reject
The above process is interactive, prover and verifier interact with each other.
We also have non-interactive preprocessing argument systems. In this case, there is a preprocessing (setup) phase that generate two public parameters, one for prover and one for the verifier.
sequenceDiagram actor P as Prover P(S_p, x, w) actor V as Verifier V(S_v, x) P ->> V: proof π note over V: accept or reject
As we can see, this is non-interactive; Verifier does not talk back to Prover!
More formally, a preprocessing argument system is a triple :
- takes an arithmetic circuit and outputs public parameters for the prover and verifier respectively.
- outputs a proof .
- accepts or rejects a given proof.
An argument system must formally have the following properties:
- Completeness: , it must hold that for the honest provers.
- Knowledge Soundness: If the Verifier accepts the proof by a Prover, then the Prover must definitely know some such that . Furthermore, a Prover that does not know any such can only provide a proof that a Verifier can accept with at most negligible probability.
- Zero-Knowledge: An extra property is that should reveal nothing about .
For a preprocessing argument system to be succinct, it needs to have the following to constraints:
- meaning that length of the proof can only be logarithmic in the size of the circuit (number of gates). It can be linear in the security parameter too.
- meaning that the time to verify should be logarithmic in the size of circuit, and linear with the size of the statement.
- here is the security parameter (e.g. 128 for 128-bit security). It is mostly omitted from the complexity notation, or something like is used.
Note that with these constraints, the verifier does not have enough time to read itself, as it can't be done in time .
So in short, a zk-SNARK has all 4 properties above: Complete, Knowledge Sound, Zero-Knowledge, Succinct. We can go a bit more formal for the knowledge-soundness and zero-knowledge properties.
Knowledge Soundness
Formally, for an argument system is knowledge-sound for some circuit , if for every polynomial time adversary such that:
- for some non-negligible
there is an efficient extractor that uses as a black box (oracle) such that:
- for some negligible .
In other words, the probability that you can convince the verifier for some witness must be at most negligibly different than the probability that this witness is a valid witness for the circuit .
Zero-Knowledge
Formally (simplified), for an argument system is zero-knowledge if for every statement the proof reveals nothing about , other than its existence. By that, we mean that the Verifier is capable of generating the same proof without the knowledge of . Formally, there must exist an efficient simulator such that s.t. the distribution:
- where
is indistinguishable from the distribution:
- where
Types of Preprocessing Setup
We said that a preprocessing setup is done for a circuit . Things are actually a bit more detailed than this, there are 3 types of setups:
- Trusted Setup per Circuit: is a randomized algorithm. The random is calculated per circuit, and must be kept secret from the prover; if a prover can learn than they can prove false statements!
- Trusted Setup & Universal (Updatable): a random is only chosen once, and the setup phase is split in two parts: .
- is done a single time.
- is done for each circuit, and nothing here is secret!
- Transparent: does not use any secret data, meaning that a trusted setup is not required.
These setups are sorted in ascending order with respect to how good they are, so Transparent is kind of the best.
A SNARK Software System
flowchart LR D[DSL] -- compiler --> S[SNARK-friendly format] S -- Sp, Sv --> B[SNARK backend prover] X[x, witness] --> B B --> proof
A SNARK software system has the above format:
- A Domain-Specific Language is used to write the circuit, there are lots of languages (Circom, ZoKrates, Leo, Zinc, Cairo, Noir, …) and there is even a framework called CirC that can help you write your own DSL.
- The SNARK-friendly format also has options, such as R1CS, AIR, Plonk-CG, PIL, …
- A backend will run the heavy computation of generating the proof. Note that this is in time linear of for a circuit .
- Finally, a generated proof!