Probability
 Random Variable: A variable that takes on (discrete) values with certain probabilities.
 Probability Distribution for a Random Variable: The probabilities with which the variable takes on each possible value.
 Each probability must be $∈[0,1]$.
 The sum of the probabilities must be equal to $1$.
 Event: A particular occurrence in some experiment. $Pr[E]$: probability of event $E$.
 Conditional Probability: The probability that one event occurs, assuming some other event occurred. For example, the probability that $A$ occurs assuming $B$ occurred is: $Pr[A∣B]=\frac{Pr[A∧B]}{Pr[B]}$
 Independent Random Variables: Two random variables $X,Y$ are independent if $∀x,y:Pr[X=x∣Y=y]=Pr[X=x]$.
 Law of Total Probability: Say $E_{1},E_{2},…,E_{n}$ are a partition of all possibilities (they cover every outcome, and no pair can occur at the same time). Then for any other event $A$: $Pr[A]=\sum_{i}Pr[A∧E_{i}]=\sum_{i}(Pr[A∣E_{i}]×Pr[E_{i}])$
 Bayes' Theorem: A useful trick to swap the order of conditioning. $Pr[A∣B]=\frac{Pr[B∣A]×Pr[A]}{Pr[B]}$
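The definitions above can be checked on a toy example. The sketch below (illustrative names, using exact `Fraction` arithmetic) models two fair coin flips and verifies that computing $Pr[A∣B]$ directly agrees with Bayes' theorem:

```python
from fractions import Fraction

# Toy universe: two fair coin flips, each outcome has probability 1/4.
U = ["HH", "HT", "TH", "TT"]
P = {u: Fraction(1, 4) for u in U}

def pr(event):
    """Probability of an event, given as the set of outcomes it contains."""
    return sum(P[u] for u in event)

A = {u for u in U if u[0] == "H"}        # first flip is heads
B = {u for u in U if u.count("H") >= 1}  # at least one heads

# Conditional probability: Pr[A|B] = Pr[A ∧ B] / Pr[B]
pr_A_given_B = pr(A & B) / pr(B)

# Bayes' theorem: Pr[A|B] = Pr[B|A] * Pr[A] / Pr[B]
pr_B_given_A = pr(A & B) / pr(A)
bayes = pr_B_given_A * pr(A) / pr(B)

assert pr_A_given_B == bayes == Fraction(2, 3)
```

Both routes give $Pr[A∣B]=2/3$, as expected from the definitions.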
Discrete Probability

Everything here is defined over a universe $U$, such as $\{0,1\}^{n}$. The universe must be a finite set. As an example: $\{0,1\}^{2}=\{00,01,10,11\}$. Also note that $Pr[U]=1$.

A probability distribution $P$ over $U$ is a function $P:U→[0,1]$ such that $∑_{x∈U}P(x)=1$.
 A uniform distribution is when $∀x∈U:P(x)=1/∣U∣$.
 A point distribution is when $P(x_{0})=1∧∀x≠x_{0}:P(x)=0$.
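Both distributions are easy to construct and validate programmatically. A quick sketch (illustrative, over the hypothetical universe $\{0,1\}^{3}$) checks that each is a valid distribution:

```python
from fractions import Fraction
from itertools import product

# Universe {0,1}^3, written as bit strings.
U = ["".join(bits) for bits in product("01", repeat=3)]

# Uniform distribution: every element gets probability 1/|U|.
uniform = {x: Fraction(1, len(U)) for x in U}

# Point distribution at x0: all mass on one element.
x0 = "101"
point = {x: Fraction(1) if x == x0 else Fraction(0) for x in U}

# Both are valid distributions: each value in [0,1], total mass 1.
for dist in (uniform, point):
    assert all(0 <= p <= 1 for p in dist.values())
    assert sum(dist.values()) == 1
```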

An event $A$ is a set $A⊆U$ and: $Pr[A]=\sum_{x∈A}P(x)∈[0,1]$

The union bound for $A_{1}$ and $A_{2}$ is: $Pr[A_{1}∪A_{2}]≤Pr[A_{1}]+Pr[A_{2}]$

If $A_{1}$ and $A_{2}$ are disjoint sets, then $Pr[A_{1}∪A_{2}]=Pr[A_{1}]+Pr[A_{2}]$.
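Both facts can be verified exhaustively on a small universe. The sketch below (illustrative, using a fair six-sided die) shows the bound is loose for overlapping events and tight for disjoint ones:

```python
from fractions import Fraction

# Universe of 6 outcomes (a fair die), uniform probabilities.
U = set(range(1, 7))
P = {u: Fraction(1, 6) for u in U}

def pr(event):
    return sum(P[u] for u in event)

A1 = {u for u in U if u % 2 == 0}  # even: {2, 4, 6}
A2 = {u for u in U if u >= 4}      # {4, 5, 6}

# Union bound: Pr[A1 ∪ A2] <= Pr[A1] + Pr[A2] (strict here, since they overlap).
assert pr(A1 | A2) < pr(A1) + pr(A2)

# Disjoint events: the bound holds with equality.
D1, D2 = {1}, {2, 3}
assert pr(D1 | D2) == pr(D1) + pr(D2)
```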

A random variable $X$ is a function $X:U→V$ where $U$ is the universe and $V$ is the value set.
EXAMPLE: Let $X:\{0,1\}^{n}→\{0,1\}$ where $X(a)=lsb(a)∈\{0,1\}$ and $lsb$ returns the least significant bit of its argument.
 Let $U$ be some set such as $\{0,1\}^{n}$. We write $r \xleftarrow{R} U$ to denote a uniform random variable over $U$, in other words: $∀a∈U:Pr[r=a]=1/∣U∣$.
A deterministic algorithm always produces the same output $y←F(x)$ on a given input $x$, but a randomized algorithm is written as $y←F(x;r)$ where $r \xleftarrow{R} \{0,1\}^{n}$. The output is a random variable $y \xleftarrow{R} F(x)$ which defines a distribution over all possible outputs of $F$ given $x$. For example, $Enc(k,m;r)$ is a randomized encryption!
 Two events $A$ and $B$ are independent events if $Pr[A∧B]=Pr[A]×Pr[B]$.
 Two random variables $X,Y$ taking values over $V$ are independent random variables if $∀a,b∈V:Pr[X=a∧Y=b]=Pr[X=a]×Pr[Y=b]$.
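Independence of random variables can be checked by brute force over a small universe. A sketch (illustrative, over the uniform universe $\{0,1\}^{2}$, with $X$ the least significant bit and $Y$ the most significant bit) verifies the definition for every pair of values:

```python
from fractions import Fraction
from itertools import product

# Uniform universe U = {0,1}^2; X = least significant bit, Y = most significant bit.
U = ["".join(b) for b in product("01", repeat=2)]
P = {u: Fraction(1, 4) for u in U}

def X(u):
    return int(u[-1])  # lsb

def Y(u):
    return int(u[0])   # msb

def pr(pred):
    return sum(P[u] for u in U if pred(u))

# X and Y are independent: Pr[X=a ∧ Y=b] = Pr[X=a] * Pr[Y=b] for all a, b.
for a, b in product([0, 1], repeat=2):
    joint = pr(lambda u: X(u) == a and Y(u) == b)
    assert joint == pr(lambda u: X(u) == a) * pr(lambda u: Y(u) == b)
```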
XOR Operation
XOR is a very important function that will keep appearing throughout cryptography. It is a logical operation $⊕:{0,1}×{0,1}→{0,1}$:
| $a$ | $b$ | $a⊕b$ |
|:---:|:---:|:-----:|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
Theorem: For a r.v. $X$ over $\{0,1\}^{n}$ and an independent uniform random r.v. $Y$ over $\{0,1\}^{n}$, $Z:=X⊕Y$ is a uniform random variable over $\{0,1\}^{n}$. This is a very important theorem!
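The theorem can be verified exhaustively for a small $n$. The sketch below (illustrative; the skewed distribution for $X$ is an arbitrary choice) computes the exact distribution of $Z=X⊕Y$ and confirms it is uniform:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

n = 3
U = list(product([0, 1], repeat=n))

# An arbitrary non-uniform distribution for X over {0,1}^3,
# and the uniform distribution for Y.
PX = {u: Fraction(0) for u in U}
PX[(0, 0, 0)] = Fraction(1, 2)
PX[(1, 1, 0)] = Fraction(1, 4)
PX[(0, 1, 1)] = Fraction(1, 4)
PY = {u: Fraction(1, len(U)) for u in U}

# Distribution of Z = X ⊕ Y (bitwise), with X and Y independent.
PZ = Counter()
for x in U:
    for y in U:
        z = tuple(a ^ b for a, b in zip(x, y))
        PZ[z] += PX[x] * PY[y]

# Z is uniform over {0,1}^3, even though X was far from uniform.
assert all(PZ[z] == Fraction(1, len(U)) for z in U)
```

This is exactly why the one-time pad works: XORing any message distribution with a uniform key yields a uniform ciphertext.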
Birthday Paradox
This is a quite surprising probabilistic result, and it will come in handy in some proofs.
Theorem: Let $r_{1},r_{2},…,r_{n}∈U$ be independent identically distributed random variables. The theorem states that when $n=1.2×∣U∣^{1/2}$, then:
$Pr[∃i≠j:r_{i}=r_{j}]≥\frac{1}{2}$
Proof: We will first write this probability in terms of its complement:
$Pr[∃i≠j:r_{i}=r_{j}]=1−Pr[∀i≠j:r_{i}≠r_{j}]$
Let $B=∣U∣$; we see that:
$1−Pr[∀i≠j:r_{i}≠r_{j}]=1−(\frac{B−1}{B})(\frac{B−2}{B})⋯(\frac{B−(n−1)}{B})$
$=1−\prod_{i=1}^{n−1}(1−\frac{i}{B})$
We will use the inequality $1−x≤e^{−x}$ to obtain:
$1−\prod_{i=1}^{n−1}(1−\frac{i}{B})≥1−\prod_{i=1}^{n−1}e^{−i/B}=1−e^{−\frac{1}{B}\sum_{i=1}^{n−1}i}$
Since $\sum_{i=1}^{n−1}i=\frac{n(n−1)}{2}≈\frac{n^{2}}{2}$ for large $n$, this is approximately:
$1−e^{−\frac{1}{B}\sum_{i=1}^{n−1}i}≈1−e^{−\frac{n^{2}}{2B}}$
When we plug in $n=1.2×B^{1/2}$ here we get $1−e^{−0.72}≈0.51$. Surely, $0.51>1/2$, QED!
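The product in the proof can be evaluated exactly. A sketch (illustrative, using the classic $B=365$ birthday setting) computes the collision probability at $n=\lceil 1.2\sqrt{B}\rceil$ and confirms it exceeds $1/2$:

```python
import math

def collision_prob(n, B):
    """Exact Pr[some two of n i.i.d. uniform samples from a set of size B collide]."""
    no_collision = 1.0
    for i in range(1, n):
        no_collision *= (1 - i / B)  # the product from the proof above
    return 1 - no_collision

B = 365  # classic birthday setting
n = math.ceil(1.2 * math.sqrt(B))  # n = 23
assert collision_prob(n, B) > 0.5
```

With $B=365$ this gives $n=23$ people sufficing for a better-than-even chance of a shared birthday, matching the classic result.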
Entropy
In the context of discrete probability, entropy can be thought of as a measure of randomness. Suppose you have a r.v. $X$ with outcomes $x_{1},x_{2},...,x_{n}$ with probabilities $p(x_{1}),p(x_{2}),...,p(x_{n})$.

Entropy is defined as:
$H(X)=−\sum_{i=1}^{n}p(x_{i})\log_{2}p(x_{i})$

Min-entropy measures worst-case randomness: a r.v. $X$ has min-entropy at least $n$ if $Pr[X=x]≤2^{−n}$ for every outcome $x$.

Mutual Information measures how much one random variable tells us about another. Suppose you have r.v. $X$ and $Y$. Denote $P_{(X,Y)}(x,y)=Pr[X=x∧Y=y]$ as the joint probability mass function of $X$ and $Y$, and $P_{X}(x)=Pr[X=x]$ and $P_{Y}(y)=Pr[Y=y]$ as the marginal probability mass functions of $X$ and $Y$ respectively. The mutual information of $X$ and $Y$ is then defined as:
$I(X;Y)=\sum_{x,y}P_{(X,Y)}(x,y)\log_{2}\frac{P_{(X,Y)}(x,y)}{P_{X}(x)P_{Y}(y)}$
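These formulas translate directly into code. A sketch (illustrative helper names; entropy in bits via $\log_{2}$) computes both quantities and checks two sanity cases: a copy of $X$ shares all its information, while an independent $Y$ shares none:

```python
import math

def entropy(p):
    """Shannon entropy in bits of a distribution given as a list of probabilities."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]            # marginal of X
    py = [sum(col) for col in zip(*joint)]      # marginal of Y
    total = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:
                total += pxy * math.log2(pxy / (px[x] * py[y]))
    return total

# A fair coin has exactly 1 bit of entropy.
assert entropy([0.5, 0.5]) == 1.0

# If Y is a copy of a fair coin X, I(X;Y) = H(X) = 1 bit;
# if Y is independent of X, I(X;Y) = 0.
copy = [[0.5, 0.0], [0.0, 0.5]]
indep = [[0.25, 0.25], [0.25, 0.25]]
assert abs(mutual_information(copy) - 1.0) < 1e-12
assert abs(mutual_information(indep)) < 1e-12
```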
Asymptotics
There are some special properties of functions that relate how much they grow with respect to one another.
 Negligible functions are literally negligible; so small that you should not worry much about them (especially if the function represents the probability of failure). Formally, a positive function $μ(n)$ is negligible if for any positive polynomial $p(n)$ there exists a constant $n_{0}$ such that $∀n≥n_{0}$ we have $μ(n)<1/p(n)$.
 Noticeable functions are kind of like the opposite of negligible ones. A positive function $f$ is noticeable if there exists a positive polynomial $p$ and a number $n_{0}$ such that $f(n)≥1/p(n)$ for all $n≥n_{0}$.
 Non-negligible functions are functions that are not negligible. This does not necessarily mean they are noticeable!
 Overwhelming functions are such that if $f$ is overwhelming, then $1−f$ is negligible.
Note that space complexity should be upper bounded by time complexity. You cannot use more space than time, because you must also spend time accessing/modifying whatever data you store.
An equivalent definition of negligible functions is given by the Big-O notation: a positive function $μ(n)$ is negligible if and only if $μ(n)∈O(1/p(n))$ for all positive polynomials $p$. The following functions are negligible:
 $1/2^{n}$
 $1/n!$
 $1/n^{\log n}$
A polynomial times a polynomial is a polynomial; however, a polynomial times a negligible function is negligible.
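The closure property can be illustrated numerically. A sketch (illustrative choices: the negligible $μ(n)=2^{−n}$ against the polynomial $p(n)=n^{10}$) shows the product still vanishes:

```python
# Illustrative: even against a large polynomial p(n) = n^10,
# the negligible function mu(n) = 2^-n dominates eventually.
def mu(n):
    return 2.0 ** (-n)

def p(n):
    return float(n ** 10)

# The product p(n) * mu(n) tends to 0: by n = 100 it is already tiny...
assert p(100) * mu(100) < 1e-9
# ...and mu(n) itself is far below 1/q(n) for, say, q(n) = n^2.
assert mu(100) < 1 / 100 ** 2
```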
Negligible vs Noticeable
 A positive function $f(n)$ is negligible if and only if for every positive polynomial $p(n)$ the product $p(n)f(n)$ converges to $0$.
 A positive function $f(n)$ is noticeable if and only if there exists a positive polynomial $p(n)$ such that $p(n)f(n)$ goes to $∞$.
Note that for negligible we use "for every", but for noticeable we use "there exists".