The notion of "High Dimensional" is playing an increasingly important role in Probability Theory and Theoretical Computer Science. High-dimensional objects are unexpectedly interesting because they often contradict our intuition derived from low-dimensional cases. In this blog post, I will discuss some intriguing properties of polytopes in high-dimensional spaces. (Most of the content is from the amazing book “High Dimensional Probability and Applications in Data Science”[Ver18] by Roman Vershynin and the corresponding Lecture Videos).
Defintion 1(Convex Combination). A convex combination of points $z_1, . . . , z_m ∈ \R^n$ is a linear combination with coefficients that are non-negative and sum to $1$, i.e. it is a sum of the form
$$ \sum_{i=1}^m\lambda_iz_i\quad\mathrm{where}\quad\lambda_i\geq0\quad\mathrm{and}\quad\sum_{i=1}^m\lambda_i=1.\tag{1} $$
Definition 2(Convex Hull). The convex hull of a set $T ⊂ \R^n$ is the set of all convex combinations of all finite collections of points in $T$:
$$ \mathrm{conv}(T):=\left\{\text{convex combinations of }z_1,\ldots,z_m\in T\mathrm{~for~}m\in\mathbb{N}\right\}. $$
Theorem 1(Caratheodory’s theorem). Every point in the convex hull of a set $T ⊂ \R^n$ can be expressed as a convex combination of at most $n + 1$ points from $T$.
This theorem indicates that any given point in the convex hull can be represented by $n+1$ points from the same hull. When considering an approximation case, could we use fewer points? This leads us to the following theorem.
Theorem 2(Approximate Caratheodory’s theorem). Consider a set $T ⊂ \R^n$ whose diameter is bounded by $1$. Then, for every point $x ∈ \text{conv}(T)$ and every integer $k$, one can find points $x_1, . . . , x_k ∈ T$ such that
$$ \left\|x-\frac1k\sum_{j=1}^kx_j\right\|_2\leq\frac1{\sqrt{k}}. $$
There are two reasons why this result is surprising. First, the number of points $k$ in convex combinations does not depend on the dimension $n$. Second, the coefficients of convex combinations can be made all equal. (Note however that repetitions among the points $x_i$ are allowed.)
Proof. Our proof is based on the empirical method of B. Maurey.
Firstly, we may assume w.l.o.g. that not only the diameter but also the radius of $T$ is bounded by $1$, i.e.
$$ \|t\|\le 1\text{ for all }t\in T $$
Fix a point $x ∈ \text{conv}(T)$ and express it as a convex combination of some vectors $z_1, . . . , z_m ∈ T$ as in (1). Now, interpret the definition of convex combination in (1) probabilistically, with $λ_i$ taking the roles of probabilities. Specifically, we can define a random vector $Z$ that takes values $z_i$ with probabilities $λ_i$:
$$ \mathbb{P}\left\{Z=z_i\right\}=\lambda_i,\quad i=1,\ldots,m. $$
Our next step is to bound $\mathbb{E}\left\|x-\frac1k\sum_{j=1}^kZ_j\right\|2^2=\frac1{k^2}\mathbb{E}\left\|\sum{j=1}^k(Z_j-x)\right\|2^2=\frac1{k^2}\sum{j=1}^k\mathbb{E}\left\|Z_j-x\right\|2^2$ . Since if we could prove $\mathbb{E}\left\|x-\frac1k\sum{j=1}^kZ_j\right\|2^2$ is less than $\frac 1k$, then there must exists $k$ realizations $Z_1,Z_2,…,Z_k$ of $Z$ such that $\left\|x-\frac1k\sum{j=1}^kZ_j\right\|_2^2\leq\frac1k.$