# What is a SNARK?

A SNARK stands for a *succinct* proof that a certain statement is true. Succinct here is meaning that the proof is "short". For example, I have a statement:

- I know an $m$ such that $SHA256(m)=0$.

In SNARK, the proof should be short and fast to verify. A trivial proof of the above statement is to simply send $m$ to the verifier. However, that proof is not short; it is as big as $m$. Verification is also not fast, as the verifier has to hash the entire message to actually see the proof.

A SNARK can have a proof size of few KBs and verification should take at most seconds.

## zk-SNARK

In the case of a zk-SNARK, the proof reveals nothing about $m$. zk-SNARKs have many applications:

**Private transactions**: Tornado Cash, ZCash, IronFish, Aleo (private dApps).**Compliance**: private proofs of solvency & compliance, zero-knowledge taxes**Scalability**: Rollup systems with validity proofs

To understand zk-SNARKs, we need quite a bit of cryptography:

- Arithmetic Circuits
- Argument Systems

# Arithmetic Circuits

Fix a finite field $F={0,…,p−1}$ for some prime $p>2$. A finite field is just a set of numbers where we can do addition and multiplication in modulo $p$.

An arithmetic circuit is a DAG (directed acyclic graph) $C:F_{n}→F$ where internal nodes are labeled $+,−,×$ and inputs are labeled $1,x_{1},…,x_{n}$. The circuit defines an $n$-variate polynomial with an evaluation recipe.

Here is an example:

flowchart LR 1((1)) --> - x2((x2)) --> - 1((1)) --> + x1((x1)) --> + x2((x2)) --> + + --> x - --> x x1 --> x x --> r(( ))

This circuit defines the operation $x_{1}(x_{1}+x_{2}+1)(x_{2}−1)$.

For convenience, the size of the circuit refers to the number of gates, and is denoted as $∣C∣$. In the example above, $∣C∣=3$.

A theorem states that all polynomial time algorithms can be captured by polynomial sized arithmetic circuits!

For example:

- You can implement a circuit that does $C_{hash}(h,m)=(h−SHA256(m))$. This outputs 0 if $m$ is the preimage of $h$ using SHA256, and something other than 0 otherwise. This circuit uses around 20K gates, which is not bad!
- You can have a $C_{sig}(pk,m,σ)$ that outputs 0 if $σ$ is a valid ECDSA signature on $m$ with respect to $pk$.

# Argument Systems

Consider a public arithmetic circuit $C(x,w)→F$ where

- $x$ is a public statement in $F_{n}$
- $w$ is a secret witness in $F_{m}$

There will be a Prover with access to $x,w$ and a Verifier with access to $x$. Prover's goal is to convince a Verifier that $∃w$ s.t. $C(x,w)=0$.

sequenceDiagram actor P as Prover actor V as Verifier note over P: knows x, w note over V: knows x loop interactions P ->> V: send commitments and stuff V ->> P: send queries and stuff end note over V: accept or reject

The above process is interactive, prover and verifier interact with each other.

We also have non-interactive preprocessing argument systems. In this case, there is a preprocessing (setup) phase $S(C)→(S_{p},S_{v})$ that generate two public parameters, one for prover and one for the verifier.

sequenceDiagram actor P as Prover P(S_p, x, w) actor V as Verifier V(S_v, x) P ->> V: proof π note over V: accept or reject

As we can see, this is non-interactive; Verifier does not talk back to Prover!

More formally, a preprocessing argument system is a triple $(S,P,V)$:

- $S(C)$ takes an arithmetic circuit $C$ and outputs $(S_{p},S_{v})$ public parameters for the prover and verifier respectively.
- $P(S_{p},x,w)$ outputs a proof $π$.
- $V(S_{v},x,π)$ accepts or rejects a given proof.

An argument system must formally have the following properties:

**Completeness**: $∀x,w:C(x,w)=0$, it must hold that $Pr[V(S_{v},x,P(S_{p},x,w))=accept]=1$ for the honest provers.**Knowledge Soundness**: If the Verifier accepts the proof by a Prover, then the Prover must definitely know some $w$ such that $C(x,w)=0$. Furthermore, a Prover that does not know any such $w$ can only provide a proof that a Verifier can accept with at most negligible probability.**Zero-Knowledge**: An extra property is that $(C,S_{p},S_{v},x,π)$ should reveal nothing about $w$.

For a preprocessing argument system to be **succinct**, it needs to have the following to constraints:

- $∣π∣=O(g(∣C∣),λ)$ meaning that length of the proof can only be logarithmic in the size of the circuit (number of gates). It can be linear in the security parameter $λ$ too.
- $time(V)=O(∣x∣,g(∣C∣),λ)$ meaning that the time to verify should be logarithmic in the size of circuit, and linear with the size of the statement.
- $λ$ here is the security parameter (e.g. 128 for 128-bit security). It is mostly omitted from the complexity notation, or something like $O_{λ}(g(∣C∣))$ is used.

Note that with these constraints, the verifier does not have enough time to read $C$ itself, as it can't be done in time $g(∣C∣)$.

So in short, a zk-SNARK has all 4 properties above: **Complete**, **Knowledge Sound**, **Zero-Knowledge**, **Succinct**. We can go a bit more formal for the knowledge-soundness and zero-knowledge properties.

## Knowledge Soundness

Formally, for an argument system $(S,P,V)$ is knowledge-sound for some circuit $C$, if for every polynomial time adversary $A=(A_{0},A_{1})$ such that:

- $S(C)→(S_{p},S_{v})$
- $(x,state)←A_{0}(S_{p})$
- $π←A_{1}(S_{p},x,state)$
- $Pr[V(S_{v},x,π)=accept]>β$ for some non-negligible $β$

there is an efficient **extractor** $E$ that uses $A_{1}$ as a black box (oracle) such that:

- $S(C)→(S_{p},S_{v})$
- $(x,state)←A_{0}(S_{p})$
- $w←E_{A_{1}(S_{p},x,state)}(S_{p},x)$
- $Pr[C(x,w)=0]>β−ϵ$ for some negligible $ϵ$.

In other words, the probability that you can convince the verifier for some witness $w$ must be at most negligibly different than the probability that this witness $w$ is a valid witness for the circuit $C$.

## Zero-Knowledge

Formally (simplified), for an argument system $(S,P,V)$ is zero-knowledge if for every statement $x∈F_{n}$ the proof $π$ reveals nothing about $w$, other than its existence. By that, we mean that the Verifier is capable of generating the same proof $π$ without the knowledge of $w$. Formally, there must exist an efficient simulator $Sim$ such that $∀x∈F_{n}$ s.t. $∃w:C(x,w)=0$ the distribution:

- $(C,S_{p},S_{v},x,π):$ where $(S_{p},S_{v})←S(C),π←P(S_{p},x,w)$

is indistinguishable from the distribution:

- $(C,S_{p},S_{v},x,π):$ where $(S_{p},S_{v},π)←Sim(C,x)$

## Types of Preprocessing Setup

We said that a preprocessing setup $S(C)$ is done for a circuit $C$. Things are actually a bit more detailed than this, there are 3 types of setups:

**Trusted Setup per Circuit**: $S(C;r)$ is a randomized algorithm. The random $r$ is calculated per circuit, and must be kept secret from the prover; if a prover can learn $r$ than they can prove false statements!**Trusted Setup & Universal (Updatable)**: a random $r$ is only chosen once, and the setup phase is split in two parts: $S=(S_{init},S_{index})$.- $S_{init}(λ;r)→pp$ is done a single time.
- $S_{index}(pp,C)→(S_{p},S_{v})$ is done for each circuit, and nothing here is secret!

**Transparent**: $S(C)$ does not use any secret data, meaning that a trusted setup is not required.

These setups are sorted in ascending order with respect to how good they are, so Transparent is kind of the best.

# A SNARK Software System

flowchart LR D[DSL] -- compiler --> S[SNARK-friendly format] S -- Sp, Sv --> B[SNARK backend prover] X[x, witness] --> B B --> proof

A SNARK software system has the above format:

- A Domain-Specific Language is used to write the circuit, there are lots of languages (Circom, ZoKrates, Leo, Zinc, Cairo, Noir, …) and there is even a framework called CirC that can help you write your own DSL.
- The SNARK-friendly format also has options, such as R1CS, AIR, Plonk-CG, PIL, …
- A backend will run the heavy computation of generating the proof. Note that this is in time linear of $∣C∣$ for a circuit $C$.
- Finally, a generated proof!