The Implicit Function Theorem

1. Statement

Theorem (Implicit Function Theorem)

Suppose \(U \subset {\mathbb R}^n\) is open and \(\Phi : U \rightarrow {\mathbb R}^m\) is \(C^1\), where \(m < n\). Suppose also that the matrix

\[ \left[ \frac{\partial \Phi_i}{\partial x_j} \right]_{i,j=1,\ldots,m}\]

is invertible at \(p \in U\). Then there exist a neighborhood \(U' \subset U\) of \(p\), a neighborhood \(V_0\) of \(\Phi(p)\), and an open set \(V_1 \subset {\mathbb R}^{n-m}\) such that for each \(a \in V_0\), there is a \(C^1\) function \(f_a : V_1 \rightarrow U'\) parametrizing

\[ \left\{ x \in U' \ : \ \Phi(x) = a \right\}, \]

which means that

\(f_a\) is injective on \(V_1\) for each \(a\),
\(\Phi( f_a(t)) = a\) for all \(t \in V_1\),
All solutions of \(\Phi(x) = a\) in \(U'\) belong to the image of \(f_a\),
The derivative \(D f_a\) has full rank (\(n-m\)) at each point in \(V_1\).
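
Example (Circle)

As an illustrative sketch (the function, the point, and the value of \(\delta\) below are choices made here for illustration, not part of the theorem), take \(n = 2\), \(m = 1\), \(\Phi(x_1,x_2) = x_1^2 + x_2^2\), and \(p = (1,0)\). The \(1 \times 1\) matrix \([\partial_{x_1} \Phi] = [2 x_1]\) equals \([2]\) at \(p\), so it is invertible. One admissible choice of sets is \(\delta = 1/2\),

\[ U' = \left\{ x \in {\mathbb R}^2 \ : \ x_1 > 0, \ |x_1^2 + x_2^2 - 1| < \delta, \ |x_2| < \delta \right\}, \quad V_0 = (1-\delta,1+\delta), \quad V_1 = (-\delta,\delta), \]

with

\[ f_a(t) = \left( \sqrt{a - t^2}, \, t \right). \]

Here \(a - t^2 > 1/4\) on \(V_0 \times V_1\), so \(f_a\) is \(C^1\) and maps \(V_1\) into \(U'\). The four properties can be checked directly: \(f_a\) is injective because its second component is \(t\); \(\Phi(f_a(t)) = (a - t^2) + t^2 = a\); any solution of \(\Phi(x) = a\) in \(U'\) has \(x_1 = \sqrt{a - x_2^2}\) and so equals \(f_a(x_2)\); and the second component of \(f_a'(t)\) is \(1\), so \(D f_a\) has rank \(1 = n - m\).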

2. Proof Strategy

Meta (Strategy)

Reduce to the Inverse Function Theorem.

Since \(\Phi\) is \(C^1\) on \(U\), the augmented map

\[ \tilde \Phi(x) := \left( \Phi_1(x),\ldots,\Phi_m(x), x_{m+1},\ldots,x_n \right) \]

is a \(C^1\) function from \(U\) into \({\mathbb R}^n\).

We can compute the total derivative explicitly because \(\tilde \Phi \in C^1\): \(D \tilde \Phi\) has the form

\[ \begin{bmatrix} A & B \\ 0 & I \end{bmatrix}\]

where \(A\) is the \(m \times m\) matrix

\[ \begin{bmatrix} \partial_{x_1} \Phi_1 & \cdots & \partial_{x_m} \Phi_1 \\ \vdots & \ddots & \vdots \\ \partial_{x_1} \Phi_m & \cdots & \partial_{x_m} \Phi_m \end{bmatrix}, \]

\(B\) is the \(m \times (n-m)\) matrix

\[ \begin{bmatrix} \partial_{x_{m+1}} \Phi_1 & \cdots & \partial_{x_n} \Phi_1 \\ \vdots & \ddots & \vdots \\ \partial_{x_{m+1}} \Phi_m & \cdots & \partial_{x_n} \Phi_m \end{bmatrix}, \]

\(0\) represents an \((n-m) \times m\) block of all zeros, and \(I\) is an \((n-m) \times (n-m)\) identity matrix block.
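
The block form also makes invertibility concrete. As a quick check (a standard linear algebra fact, included here as a sketch rather than as part of the original argument): whenever \(A\) is invertible,

\[ \begin{bmatrix} A & B \\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & -A^{-1} B \\ 0 & I \end{bmatrix} = \begin{bmatrix} A A^{-1} & -A A^{-1} B + B \\ 0 & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}, \]

so the second factor is the inverse of the first (for square matrices a one-sided inverse is two-sided).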

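The matrix \(D \tilde \Phi(p)\) is invertible because it is block upper triangular, so its determinant equals \(\det A(p)\), and \(A(p) = \left[ \frac{\partial \Phi_i}{\partial x_j}(p) \right]_{i,j=1,\ldots,m}\) is invertible by hypothesis (think row reduction).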
Apply the Inverse Function Theorem to \(\tilde \Phi\) at \(p\). There exist a neighborhood \(U' \subset U\) of \(p\) and an open set \(V' \subset {\mathbb R}^n\) containing \(\tilde \Phi(p) = (\Phi(p), p_{m+1},\ldots,p_n)\) such that \(\tilde \Phi\) is a bijection from \(U'\) to \(V'\) with \(C^1\) inverse.
By choosing a smaller \(V'\) as needed (and correspondingly shrinking \(U'\)), we may assume that \(V'\) has the form
\[ B_{\delta}(\Phi(p)) \times (p_{m+1}-\delta,p_{m+1}+\delta) \times \cdots \times (p_{n} - \delta, p_n+\delta) \]
for some \(\delta > 0\).
Let \(V_0 := B_\delta(\Phi(p))\) and
\[ V_1 := (p_{m+1}-\delta,p_{m+1}+\delta) \times \cdots \times (p_{n} - \delta, p_n+\delta), \]
and let
\[ f_a(t) = \tilde \Phi^{-1} (a,t) \]
for \(a \in V_0\) and \(t \in V_1\).
We have
\[ \tilde \Phi( f_a(t)) = (a,t) \]
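for all \((a,t) \in V_0 \times V_1\). Reading off the first \(m\) components gives \(\Phi(f_a(t)) = a\). If \(f_a(t) = f_a(t')\), then applying \(\tilde \Phi\) gives \((a,t) = (a,t')\), so \(f_a\) is injective for each \(a\).
Conversely, if \(x \in U'\) satisfies \(\Phi(x) = a\), then \(\tilde \Phi(x) = (a, x_{m+1},\ldots,x_n) \in V'\), so \(x = f_a(t)\) with \(t = (x_{m+1},\ldots,x_n) \in V_1\). Thus every solution of \(\Phi(x) = a\) in \(U'\) belongs to the image of \(f_a\).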
Because \(\tilde \Phi^{-1}\) is \(C^1\) by the Inverse Function Theorem, \(f_a\) must also be \(C^1\).
Look more closely at the formula \(\tilde \Phi ( f_a(t)) = (a,t)\). If \(P\) denotes projection onto the last \(n-m\) variables, then \(P \tilde \Phi(x) = (x_{m+1},\ldots,x_n) = P x\), so \(P f_a(t) = t\). Take the total derivative with respect to \(t\), regarding \(a\) as fixed. This gives
\[ D (P f_a) = I_{(n-m) \times (n-m)}. \]
Because \(P\) is linear, \(D (P f_a) = P \, D f_a\), so the bottom \(n-m\) rows of \(Df_a\) form the identity matrix. Hence \(Df_a\) has rank \(n-m\), which is full rank.
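
The same differentiation yields all of \(Df_a\), not only its bottom rows. What follows is a sketch in the notation above, with \(A\) and \(B\) evaluated at the point \(f_a(t)\); \(A\) is invertible there because \(\tilde \Phi\) has a \(C^1\) inverse on \(V'\), which forces \(D \tilde \Phi\) to be invertible at every point of \(U'\). Differentiating \(\tilde \Phi(f_a(t)) = (a,t)\) in \(t\) by the chain rule gives

\[ \begin{bmatrix} A & B \\ 0 & I \end{bmatrix} Df_a(t) = \begin{bmatrix} 0 \\ I \end{bmatrix}, \]

and solving for \(Df_a(t)\) gives

\[ Df_a(t) = \begin{bmatrix} -A^{-1} B \\ I \end{bmatrix}. \]

For instance, with the illustrative choice \(\Phi(x_1,x_2) = x_1^2 + x_2^2\) near \(p = (1,0)\) (an example assumed here, not taken from the proof), one has \(A = [2x_1]\), \(B = [2x_2]\), and \(f_a(t) = (\sqrt{a - t^2}, t)\). The formula predicts the top entry \(-x_2/x_1 = -t/\sqrt{a-t^2}\), which agrees with differentiating \(\sqrt{a - t^2}\) directly.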