Compactness and Heine-Borel

Theorem (Heine-Borel)
A subset \(E\) of \({\mathbb R}^n\) or \({\mathbb C}^n\) is compact if and only if it is closed and bounded.
We already have seen that any compact subset of a metric space must be closed and bounded. We also know that closed subsets of compact sets are necessarily compact in any metric space. To prove the Heine-Borel Theorem in \({\mathbb R}^n\) (or \({\mathbb C}^n\), which is topologically identical to \({\mathbb R}^{2n}\)), it therefore suffices to show that Cartesian products of closed intervals in \({\mathbb R}^n\) are compact. We can prove this by induction on dimension by showing that \([a,b] \times V\) is compact when \(V\) is compact.

Fix an open cover \(\mathcal U\) of \([a,b] \times V\) for some compact set \(V \subset {\mathbb R}^{n-1}\). Let \(t_*\) be the supremum of all \(t \in [a,b]\) for which \([a,t] \times V\) can be covered by a finite subset of \(\mathcal U\). Here we let \([a,a] := \{a\}\). This \(t_*\) is well-defined because \([a,a] \times V\) is coverable simply because the restriction of each \(U \in \mathcal U\) to the slice \(\{x \in {\mathbb R}^n \ : \ x_1 = a\}\) equals \(\{a\} \times U'\) for some open \((n-1)\)-dimensional set \(U'\) and those \(U'\) necessarily cover \(V\) because the \(U\)'s cover \(\{a\} \times V\). By essentially the same argument, there are finitely many \(U_1,\ldots,U_N\) in \(U\) which cover \(\{t_*\} \times V\).
Lemma
If \(V \subset {\mathbb R}^{n-1}\) is compact and an open set \(U \subset {\mathbb R}^n\) contains \(\{t_*\} \times V\), then there is a positive \(\epsilon\) such that \(U\) contains \((t_*-\epsilon,t_*+\epsilon) \times V\).
Proof
Every point \(x \in V\) admits some \(\epsilon\) such that \((t_*-\epsilon,t_* + \epsilon) \times B_\epsilon(x) \subset U\). Cover \(V\) by the corresponding balls and let \(\{B_{\epsilon_i}(x_i)\}_{i=1}^N\) be a finite subcover. Then let \(\epsilon = \min_{i=1,\ldots,N} \epsilon_i\). Then \((t_* - \epsilon,t_* + \epsilon) \times B_{\epsilon_i}(x_i) \subset U\) for each \(i\), and taking the union over \(i\) gives that \(\{(t_*-\epsilon,t_*+\epsilon)\} \times V \subset U\) also.
By the lemma, this means that the set \(S\) of points \(t\) such that \([a,t] \times V\) can be covered by finitely many \(U\) in \(\mathcal U\) is relatively open in \([a,b]\): take \(U_1,\ldots,U_N\) covering \([a,t] \times V\) and \(U_{N+1},\ldots,U_{N+M}\) covering \((t-\epsilon,t+\epsilon) \times V\), and their union covers \([a,t'] \times V\) for all \(t' \in [a,b]\) at distance less than \(\epsilon\) to \(t\). But it is also true that the supremum \(t_*\) of \(S\) must belong to \(S\). This is because the finite collection \(U_1,\ldots,U_N\) that cover \(\{t_*\} \times V\) also cover \((t_* - \epsilon,t_* + \epsilon) \times V\) for some positive \(\epsilon\), and \([a,t_* -\epsilon'] \times V\) can be covered by finitely many \(U_j\)'s as well for some \(\epsilon' < \epsilon\) (because \(t_* > a\) by openness), so the union of those finite covers covers at least \([a,t_*] \times V\). This forces \(t_* = b\) because otherwise relative openness of \(S\) in \([a,b]\) would imply that it contains elements strictly larger than \(t_*\), which isn't possible.

1. Compactness and Sequential Compactness in Metric Spaces

Lemma
A metric space \((X,d)\) is compact if and only if for any function \(r : X \rightarrow {\mathbb R}_{>0}\), there are finitely many points \(x_1,\ldots,x_N\) such that
\[ X = \bigcup_{n=1}^N B_{r(x_i)}(x_i). \]
In other words, the space is compact if and only if every covering of the form \(\{B_{r(x)}(x)\}_{x \in X}\) has a finite subcovering.
Proof
Meta (Main Idea)
If \((X,d)\) is compact, existence of the desired subcover is automatic. In the reverse direction, given any covering \(\mathcal{U}\) of \(X\), for each \(x \in X\), let \(r(x) := \frac{1}{n}\) for the smallest positive integer \(n\) such that \(B_{1/n}(x)\) is entirely contained in some set \(U\) in the cover \(\mathcal{U}\). Because each \(x\) must belong to some \(U\) and each \(U\) is open, there is always such an \(n\). Now if \(\{B_{r(x_i)}(x_i)\}_{i=1}^N\) covers \(X\), then replace each such ball by an element of \(\mathcal U\) that contains it and one must be left with a finite subcover of \(\mathcal U\).
Lemma
In a sequentially-compact metric space \((X,d)\), for any \(\delta > 0\), there do not exist any infinite sets \(E\) such that \(d(x,x') > \delta\) for all pairs \(x,x' \in E\) with \(x \neq x'\) (such a set is called \(\delta\)-separated).
Lemma
In a sequentially-compact metric space \((X,d)\), given \(r : X \rightarrow {\mathbb R}_{> 0}\), there exists some \(\epsilon_0>0\) such that every point \(x\) with \(r(x) < \epsilon_0\) has a corresponding point \(x'\) with \(r(x') \geq \epsilon_0\) such that \(B_{r(x)}(x) \subset B_{r(x')}(x')\) (i.e., all balls in the collection below the threshold size are contained in a corresponding ball at or above the threshold).
Proof
Suppose not. Then for each natural number \(n\), \(2^{-n}\) is not a valid threshold, meaning that there is a point \(x_n\) such that \(r(x_n) < 2^{-n}\) but \(B_{r(x_n)}(x_n)\) is not contained in \(B_{r(x')}(x')\) for any \(x'\) with \(r(x') \geq 2^{-n}\). Now \(\{x_n\}_{n=1}^\infty\) has a convergent subsequence \(\{x_{n_j}\}_{j=1}^\infty\) tending to \(x_*\). For sufficiently large \(j\), \(d(x_*,x_{n_j}) < r(x_*)/2\) and \(2^{-n_j} < r(x_*)/2\), which means that \(B_{r(x_{n_j})}(x_{n_j}) \subset B_{r(x_*)}(x_*)\), which is a contradiction by virtue of the fact that the radius \(r(x_*)\) exceeds the threshold \(2^{-n_j}\) for which \(B_{r(x_{n_j})}(x_{n_j})\) is known not to belong to any such balls with radius that large or larger.
Theorem (Equivalence of Compactness and Sequential Compactness for Metric Spaces)
A metric space \((X,d)\) is compact if and only if it is sequentially compact.
Proof
Compactness Implies Sequential Compactness Let \(\{x_n\}_{n=1}^\infty\) be a sequence in a compact metric space \((X,d)\). Let \(E \subset X\) be the set of values of the sequence, i.e., \(E := \{y \in X \ : y = x_n \text{ for some } n\}\). If \(z\) is any point in \(X\), we may assume that there is some radius \(r > 0\) such that \(B_r(z)\setminus\{z\}\) contains no points of \(E\) (otherwise we may build a subsequence \(\{x_{n_k}\}_{k=1}^\infty\) such that \(d(x_{n_k},z) < d(x_{n_{k-1}},z)/2\) for each \(k\), which will clearly converge because the distance to \(z\) decays at least exponentially in \(k\); note also that if we were forced to take \(n_k < n_{k-1}\), then \(B_r(z) \setminus \{z\}\) would not intersect \(E\) when \(r < \frac{1}{2} \min \{d(z,x_1),\ldots,d(z,x_{n_{k-1}})\}\)). We can also assume that \(E\) is infinite, since finiteness of \(E\) implies that there is a constant (and therefore convergent) subsequence of \(\{x_n\}_{n=1}^\infty\). But we can now see that \(X\) must be covered by balls \(B_r(z)\) which contain either zero or one points of \(E\) (depending on whether \(z \not \in E\) or \(z \in E\)). The existence of a finite subcover implies that \(E\) must be finite, which is a contradiction.

Sequential Compactness Implies Compactness Let \(\tilde r(x) := 1/n\) for the smallest \(n\) such that \(1/n \leq r(x)/2\) and let \(\tilde \epsilon_0\) be the threshold such that every ball \(B_{\tilde r(x)}(x)\) of radius less than \(\tilde \epsilon_0\) is contained a ball \(B_{\tilde r(z)}(z)\) of radius at least \(\tilde \epsilon_0\).

Let \({\mathcal{C}}_1\) be a maximal collection of \(1\)-separated points \(x\) for which \(\tilde r(x) = 1\). Next let \({\mathcal{C}}_2\) be a maximal collection of \(1/2\)-separated points \(x\) such that \(\tilde r(x) = 1/2\) and \({\mathcal {C}}_1 \cup {\mathcal {C}}_2\) is also \(1/2\)-separated. Proceed similarly: Let \({\mathcal{C}}_n\) be a maximal collection of \(1/n\)-separated points \(x\) such that \(\tilde r(x) = 1/n\) and \(\bigcup_{j=1}^n {\mathcal{C}}_j\) is \(1/n\)-separated. Continue in this way for all \(n \leq N_0\), where \(N_0\) is the smallest integer such that \(1/N_0 < \epsilon_0\).

For any \(x\), if \(\tilde r(x) < \tilde \epsilon_0\), then \(B_{\tilde r(x)}(x)\) is contained in some \(B_{\tilde r(x')}(x')\) for \(r(x') \geq \tilde \epsilon_0 \geq 1/N_0\). Moreover, if any ball \(B_{\tilde r(x')}(x')\) with \(r(x') \geq 1/N_0\) does not have \(x' \in {\mathcal{C}}_j\) for the appropriate \(j\), it must be because there is a point \(x''\) which does belong to one of those \({\mathcal{C}}_j\) which is too close to \(x'\), i.e., \(r(x'') \geq r(x')\), \(d(x'',x') < r(x')\) and \(x'' \in C_j\) for the relevant \(j\). Any point \(y \in B_{\tilde r(x')}(x')\) satisfies
\[{}d(y,x'') \leq d(y,x') + d(x',x''){}\]
\[{}< \tilde r(x') + \tilde r(x'){}\]
\[{}< 2 \tilde r(x'').{}\]
In other words, \(B_{\tilde r(x')}(x') \subset B_{2 \tilde r(x'')} (x'')\). Because \(2 \tilde r(x'') \leq r(x)\), it follows that \(B_{\tilde r(x')}(x') \subset B_{r(x'')}(x'')\) for some \(x''\) belonging to one of the finite sets \({\mathcal{C}}_1,\ldots,{\mathcal{C}}_{N_0}\). Thus the finite subcollection
\[ \left\{ B_{r(x'')}(x'') \ : \ x'' \in \bigcup_{j=1}^{N_0} {\mathcal{C}}_j \right\}\]
must cover every ball \(B_{\tilde r(x)}(x)\), and therefore must cover \(X\) itself.
A set \(E\) in a metric space \((X,d)\) will be called close to finite when for every \(\epsilon\), there exists a finite subset \(\{x_1,\ldots,x_N\} \subset X\) such that \(E \subset \bigcup_{n=1}^N B_\epsilon(x_n)\).
Theorem
A metric space \((X,d)\) is compact if and only if it is complete and close to finite.
Proof
\([\Rightarrow]\) Suppose that \((X,d)\) is compact but not complete. By the notes on completeness, there must be a Cauchy sequence \(\{x_n\}_{n=1}^\infty\) in \(X\) for which every \(x \in X\) has a ball \(B_\epsilon(x)\) containing \(x_n\) for only finitely many values of \(n\). By compactness, \(X\) may be covered by finitely many such balls, but then only finitely many values of \(n\) could have \(x_n\) belonging to the union of these balls (which would be all of \(X\) and consequently contain \(x_n\) for all values of \(n\)).

To see that it must be close to finite, fix \(\epsilon > 0\) and cover \(X\) by balls \(B_\epsilon(x)\) for each \(x \in X\). Compactness implies the existence of a finite subcover satisfying exactly the required properties.

\([\Leftarrow]\) Let \(\{x_n\}_{n=1}^\infty\) be a sequence in \(X\). We will show that it must have a convergent subsequence. For each positive integer \(j\), let \(S_j\) be a finite subset of \(X\) such that every element of \(X\) is within distance strictly less than \(2^{-j}\) to some point in \(S_j\).

Because \(S_1\) is finite, there must be \(y_1 \in S_1\) such that \(d(x_n,y_1) < 1/2\) for infinitely many values of \(n\). Let \(n_1\) be the smallest such value of \(n\). By induction, suppose that there are points \(y_1,\ldots,y_j\) in \(S_1,\ldots,S_j\) such that infinitely many values of \(n\) (call the set of such \(n\)'s \(I_j\)) have \(d(x_n,y_k) < 2^{-k}\) for each \(k = 1,\ldots,j\). Let \(n_j\) be the smallest such value of \(n \in I_j\) which is greater than \(n_{j-1}\). Because \(S_{j+1}\) is finite and \(I_j\) is infinite, there must be some \(y_{j+1} \in S_{j+1}\) such that infinitely many \(n\) in \(I_j\) have \(d(x_n,y_{j+1}) < 2^{-j-1}\). In particular, there is some \(n_{j+1}\) strictly greater than \(n_j\) which belongs to this desired set \(I_{j+1}\) of indices.

The subsequence \(\{x_{n_j}\}_{j=1}^\infty\) must be Cauchy, because
\[{}d(x_{n_j},x_{n_k}){}\]
\[{}\leq d(x_{n_j},y_j) + d(y_j,x_{n_k}){}\]
\[{}\leq 2^{-j+1}{}\]
when \(k \geq j\). In particular, for any \(\epsilon > 0\), if \(2^{-N+1} < \epsilon\) and \(j,k \geq \epsilon\), then \(d(x_{n_j},x_{n_k}) < \epsilon\). Because \(X\) is complete, the subsequence \(\{x_{n_j}\}_{j=1}^\infty\) must converge to some point of \(X\), which is exactly what is required to see that \(X\) is sequentially compact.

2. Compactness in Euclidean Space

Because \({\mathbb R}^n\) is complete, a subset of \({\mathbb R}^n\) is complete if and only if it is closed. Likewise, because every bounded subset of \({\mathbb R}^n\) can be covered by finitely many balls of radius \(\epsilon\), a set \(E \subset {\mathbb R}^n\) is close to finite if and only if it is bounded. So the characterization of compact sets in metric spaces proved above reduces exactly to the Heine-Borel Theorem when \(X = {\mathbb R}^n\).

3. Compactness in Hilbert Spaces

Key Definition: Close to Finite Dimensional Let us call a subset \(E\) of a Hilbert space \(H\) close to finite-dimensional when for every \(\epsilon > 0\), there exists an orthogonal projection \(P\) onto a finite-dimensional subspace \(M\) of \(H\) such that \(||x - Px|| < \epsilon\) for all \(x \in E\).
Theorem
A subset of a Hilbert space is compact if and only if it is closed, bounded, and close to finite-dimensional.
Proof
\([\Rightarrow]\) Fix any \(\epsilon > 0\). From general metric space theory, we know that \(E\) must be closed and bounded. The set \(U_P := \{ u \ : \ ||u - P u|| < \epsilon \}\) is open for any orthogonal projection \(P\). The zero vector belongs to each \(U_P\), and for any \(x \in H\) which is nonzero, it belongs to \(U_P\) when \(P\) is the one-dimensional projection onto the vector subspace spanned by \(x\). If \(E \subset H\) is compact, then, it must be contained in a finite union of sets \(U_{P_1},\ldots,U_{P_N}\) for orthogonal projections \(P_j\) onto 1D subspaces spanned by vectors \(v_1,\ldots,v_N\). Let \(P\) be orthogonal projection onto the span of \(\{v_1,\ldots,v_N\}\). Since \(P x\) is the vector \(z\) in the span which minimizes \(||x - z||\), it follows that \(||x - P x|| \leq ||x - P_j x||\) for each \(j = 1,\ldots,N\) (because \(P_j x\) also belongs to the span). Now every \(x \in E\) has \(||x - P_j x|| < \epsilon\) for some \(j \in \{1,\ldots,N\}\), so \(||x - Px|| < \epsilon\) for all \(x \in E\).

\([\Leftarrow]\) Since \(E\) is closed and \(H\) is complete, \(E\) is also complete. Let \(P\) be orthogonal projection onto some finite-dimensional subspace \(M\) of \(H\) such that \(||x - Px|| < \epsilon/4\) for each \(x \in E\). Because \(||Px|| \leq ||x||\) and \(E\) is bounded, \(PE\) is also a bounded subset of \(M\). Let \(e_1,\ldots,e_N\) be an orthonormal basis of \(M\). Every \(y \in P E\) equals \(\sum_{j=1}^N c_j e_j\) for some coefficients \(c_j\) with \(||y||^2 = \sum_{j=1}^N |c_j|^2\). The map \(y \mapsto (c_1,\ldots,c_N)\) is in particular an isometry between \(M\) and \({\mathbb R}^N\) or \({\mathbb C}^N\) (depending on the field used). In \({\mathbb R}^N\) and \({\mathbb C}^N\), every bounded set may be covered by finitely many balls of radius \(\epsilon/4\) (because every bounded set is contained in a compact set). This means that there are finitely many points \(y_1,\ldots,y_{N'}\) in \(M\) such that every \(x \in E\) has some \(k\) for which \(||P x - y_k|| < \epsilon/4\). Without loss of generality (reducing the number of \(y_k\)'s if necessary), for each \(k=1,\ldots,N'\), it may be assumed that there is some \(x_k\) such that \(||P x_k - y_k|| < \epsilon/4\). Then any other \(x\) with \(||P x-y_k|| < \epsilon/4\) has
\[{}||x_k - x||{}\]
\[{}\leq ||P x_k - P x||{}\]
\[{}+ ||P x_k - x_k|| + ||P x - x||{}\]
\[{}< ||P x_k - P x|| + \epsilon/2{}\]
\[{}< ||P x_k - y_k||{}\]
\[{}+ ||y_k - Px|| + \epsilon/2{}\]
\[{}< \epsilon.{}\]
Thus \(E\) is contained in the union of \(B_\epsilon(x_1) \cup \cdots \cup B_\epsilon(x_{N'})\), meaning that \(E\) is close to finite.

4. Other Examples

Theorem (Arzelà-Ascoli Revisited)
Suppose that \((X,d_X)\) and \((Y,d_Y)\) are metric spaces and \(X\) is compact. Let \(\mathcal F\) be a subset of the continuous functions from \(X\) to \(Y\) such that
  • For each \(x \in X\), the closure of \(\{f(x) \ : \ f \in \mathcal F\}\) is compact in \(Y\).
  • For each \(\epsilon > 0\), there exists \(\delta > 0\) such that every \(x,x' \in X\) with \(d_X(x,x') < \delta\) satisfy \(d_Y(f(x),f(x')) < \epsilon\). Then the closure of \(\mathcal F\) is compact in the uniform topology on \(C(X,Y)\).
Proof
Let \(\{f_n\}_{n=1}^\infty\) be a uniform Cauchy sequence of elements in the uniform closure of \(\mathcal F\). Without loss of generality, for each \(n\), there is some \(g_n \in \mathcal F\) such that \(d_Y(f_n(x),g_n(x)) < 2^{-n}\) for all \(x \in X\). This forces \(\{g_n\}_{n=1}^\infty\) to be a uniform Cauchy sequence as well. Because the closure of \(\{ f(x) \ : \ f \in \mathcal F\}\) is compact in \(Y\) for each \(X\), it is complete and \(g_n(x)\) must have a pointwise limit at each \(x \in X\). By the usual \(\epsilon/3\) argument, the limit must be continuous. If the limit is called \(g\), then
\[{}d_Y(f_n(x),g(x)){}\]
\[{}\leq d_Y(f_n(x),g_n(x)){}\]
\[{}+ d_Y(g_n(x),g(x)){}\]
and the right-hand side tends to zero uniformly in \(X\) as \(n \rightarrow \infty\). Thus the Cauchy sequence \(\{f_n\}_{n=1}^\infty\) converges uniformly in \(C(X,Y)\), and the limit function \(g\) must belong to the closure of \(\mathcal F\) because it's closed.

It suffices to show that the closure of \(\mathcal F\) is close to finite. Let \(\delta\) be such that \(d_X(x,x') < \delta\) implies \(d_Y(f(x),f(x')) < \epsilon/4\) for each \(f \in \mathcal F\), and let \(x_1,\ldots,x_N\) be points in \(X\) such that \(X = \bigcup_{i=1}^N B_{\delta}(x_i)\) (which exist by compactness). Since the Cartesian products of the closures of the sets \(\{f(x_i) \ : f \in \mathcal F\}\) for \(i=1,\ldots,N\) is compact, there must exist finitely many tuples \((y_1,\ldots,y_N)\) in \(Y^N\) such that every \(f\) admits some such tuple \((y_1,\ldots,y_N)\) such that \(d_Y (f(x_i),y_i) < \epsilon/4\) for each \(i\). As above, it may be assumed that for a given tuple \((y_1,\ldots,y_N)\), there is at least one \(f \in \mathcal F\). If \(g\) is any other such element of \(\mathcal F\), then every \(x \in X\) admits some \(x_i\) for which
\[{}d_Y(f(x),g(x)){}\]
\[{}\leq d_Y (f(x),f(x_i)) + d_Y(f(x_i),g(x_i)){}\]
\[{}+ d_Y(g(x_i),g(x)){}\]
\[{}< \frac{\epsilon}{4} + d_Y(f(x_i),g(x_i)){}\]
\[{}+ \frac{\epsilon}{4}{}\]
\[{}< \frac{\epsilon}{2} + d_Y(f(x_i),y_i) + d_Y(y_i,g(x_i)){}\]
\[{}< \epsilon.{}\]
This yields a finite list of functions \(f \in \mathcal F\) such that every \(g\) in \(\mathcal F\) is uniformly less than distance \(\epsilon\) to one of the \(f\)'s. After taking the closure, no element \(\tilde g\) in the closure of \(\mathcal F\) can be strictly greater than distance \(\epsilon\) to each of the \(f\)'s. Aside from the fact that the distance inequality is not strict (which can be fixed in the usual way by replacing \(\epsilon\) by a small factor times \(\epsilon\) at the beginning of the argument), this is exactly what is necessary to see that the closure of \(\mathcal F\) is close to finite.
Exercises
  1. Suppose that \(\{K_n\}_{n=1}^\infty\) is a sequence of nonempty compact sets in some metric space \((X,d)\) such that the diameter of \(K_n\) tends to \(0\) as \(n \rightarrow \infty\). Prove that the union \(\bigcup_{n=1}^\infty K_n\) is compact if and only if there is a compact set \(E \subset \bigcup_{n=1}^\infty K_n\) such that \(E \cap K_n \neq \emptyset\) for each \(n\). Use this result to show that the set depicted below (consisting of a countable union of circles with radii decreasing by a factor of \(1/3\) at each step, all tangent to one another at the same point) is compact.
    Figure. Countably Many Nested Tangent Circles
  2. The set depicted below is a countable union of circles defined by recursively taking each circle \(C\) and placing two additional circles of one third the radius which are internally tangent to \(C\) at the points of \(C\) with angle \(-\pi/4\) and \(-3 \pi/4\) from the horizontal. Prove or disprove that the set is compact.
    Figure. More Elaborate Tangent Circles