The Implicit Function Theorem

1. Statement

Theorem (Implicit Function Theorem)
Suppose \(U \subset {\mathbb R}^n\) is open and \(\Phi : U \rightarrow {\mathbb R}^m\) is \(C^1\), where \(m < n\). Suppose also that the matrix
\[ \left[ \frac{\partial \Phi_i}{\partial x_j} \right]_{i,j=1,\ldots,m}\]
is invertible at \(p \in U\). Then there exist a neighborhood \(U' \subset U\) of \(p\), a neighborhood \(V_0\) of \(\Phi(p)\), and an open set \(V_1 \subset {\mathbb R}^{n-m}\) such that for each \(a \in V_0\), there is a \(C^1\) function \(f_a : V_1 \rightarrow U'\) parametrizing
\[ \left\{ x \in U' \ : \ \Phi(x) = a \right\}, \]
which means that
  • \(f_a\) is injective on \(V_1\) for each \(a\),
  • \(\Phi( f_a(t)) = a\) for all \(t \in V_1\),
  • All solutions of \(\Phi(x) = a\) in \(U'\) belong to the image of \(f_a\),
  • The derivative \(D f_a\) has full rank (\(n-m\)) at each point in \(V_1\).
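
Example (Unit circle)
As an illustrative sketch (the map, the point, and the particular neighborhoods below are assumptions chosen for concreteness, not part of the statement above), take \(n = 2\), \(m = 1\), \(\Phi(x_1,x_2) = x_1^2 + x_2^2\), and \(p = (1,0)\). The \(1 \times 1\) matrix \(\left[ \partial \Phi / \partial x_1 \right]\) equals \(2\) at \(p\), so the hypothesis holds. One admissible choice of sets is
\[ U' = \left\{ x \ : \ x_1 > 0, \ |x_2| < \tfrac{1}{2}, \ \tfrac{3}{4} < x_1^2 + x_2^2 < \tfrac{5}{4} \right\}, \qquad V_0 = \left( \tfrac{3}{4}, \tfrac{5}{4} \right), \qquad V_1 = \left( -\tfrac{1}{2}, \tfrac{1}{2} \right), \]
with parametrization
\[ f_a(t) = \left( \sqrt{a - t^2}, \ t \right), \qquad D f_a(t) = \begin{bmatrix} -t/\sqrt{a - t^2} \\ 1 \end{bmatrix}. \]
Since \(a - t^2 > \tfrac{1}{2}\) on \(V_0 \times V_1\), each \(f_a\) is \(C^1\). It is injective because its second component is \(t\), it satisfies \(\Phi(f_a(t)) = a\), every solution of \(\Phi(x) = a\) in \(U'\) has \(x_1 > 0\) and therefore equals \(f_a(x_2)\), and \(D f_a\) has rank \(1 = n - m\). At the point \((0,1)\), by contrast, \(\partial \Phi / \partial x_1\) vanishes, so the hypothesis fails there for this ordering of the variables.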

2. Proof Strategy

Meta (Strategy)
Reduce to the Inverse Function Theorem.

Since \(\Phi\) is \(C^1\) on \(U\), the augmented map
\[ \tilde \Phi(x) := \left( \Phi_1(x),\ldots,\Phi_m(x), x_{m+1},\ldots,x_n \right) \]
is a \(C^1\) function from \(U\) into \({\mathbb R}^n\).

We can compute the total derivative explicitly because \(\tilde \Phi \in C^1\): \(D \tilde \Phi\) has the form
\[ \begin{bmatrix} A & B \\ 0 & I \end{bmatrix}\]
where \(A\) is the \(m \times m\) matrix
\[ \begin{bmatrix} \partial_{x_1} \Phi_1 & \cdots & \partial_{x_m} \Phi_1 \\ \vdots & \ddots & \vdots \\ \partial_{x_1} \Phi_m & \cdots & \partial_{x_m} \Phi_m \end{bmatrix}, \]
\(B\) is the \(m \times (n-m)\) matrix
\[ \begin{bmatrix} \partial_{x_{m+1}} \Phi_1 & \cdots & \partial_{x_n} \Phi_1 \\ \vdots & \ddots & \vdots \\ \partial_{x_{m+1}} \Phi_m & \cdots & \partial_{x_n} \Phi_m \end{bmatrix}, \]
\(0\) represents an \((n-m) \times m\) block of all zeros, and \(I\) is an \((n-m) \times (n-m)\) identity matrix block.
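
As a supplementary check (a standard block-matrix identity, not spelled out in the original argument), wherever \(A\) is invertible the inverse of this block matrix can be written down explicitly:
\[ \begin{bmatrix} A & B \\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & -A^{-1} B \\ 0 & I \end{bmatrix} = \begin{bmatrix} A A^{-1} & -A A^{-1} B + B \\ 0 & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \]
In particular, the bottom \(n-m\) rows of the inverse are again \(\begin{bmatrix} 0 & I \end{bmatrix}\).
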
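  • The matrix \(D \tilde \Phi\) is invertible at \(p\) because \(A = \left[ \frac{\partial \Phi_i}{\partial x_j} \right]_{i,j=1,\ldots,m}\) is invertible there: \(D \tilde \Phi\) is block upper triangular, so \(\det D \tilde \Phi = \det A \cdot \det I = \det A \neq 0\) at \(p\) (think row reduction).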
  • Apply the Inverse Function Theorem to \(\tilde \Phi\) on a neighborhood of \(p\). We know that there exists a neighborhood \(U'\) of \(p\) and an open set \(V' \subset {\mathbb R}^n\) such that \(\tilde \Phi\) is a bijection from \(U'\) to \(V'\).
  • By choosing a smaller \(V'\) as needed (and correspondingly shrinking \(U'\)), we may assume that \(V'\) has the form
    \[ B_{\delta}(\Phi(p)) \times (p_{m+1}-\delta,p_{m+1}+\delta) \times \cdots \times (p_{n} - \delta, p_n+\delta) \]
    for some \(\delta > 0\).
  • Let \(V_0 := B_\delta(\Phi(p))\) and
    \[ V_1 := (p_{m+1}-\delta,p_{m+1}+\delta) \times \cdots \times (p_{n} - \delta, p_n+\delta), \]
    and let
    \[ f_a(t) = \tilde \Phi^{-1} (a,t) \]
    for \(a \in V_0\) and \(t \in V_1\).
  • We have
    \[ \tilde \Phi( f_a(t)) = (a,t) \]
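    for all \((a,t) \in V_0 \times V_1\). Since \(t\) can be recovered from \(f_a(t)\) by applying \(\tilde \Phi\), the map \(f_a\) is injective for each \(a\). Reading off the first \(m\) components gives \(\Phi(f_a(t)) = a\). Conversely, if \(x \in U'\) satisfies \(\Phi(x) = a\), then \(\tilde \Phi(x) = (a,t)\) with \(t = (x_{m+1},\ldots,x_n) \in V_1\), so \(x = \tilde \Phi^{-1}(a,t) = f_a(t)\); that is, every solution in \(U'\) lies in the image of \(f_a\).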
  • Because \(\tilde \Phi^{-1}\) is \(C^1\) by the Inverse Function Theorem, \(f_a\) must also be \(C^1\).
  • Look closer at the formula \(\tilde \Phi ( f_a(t)) = (a,t)\). If \(P\) represents projection onto the last \(n-m\) variables, then \(P \tilde \Phi(x) = (x_{m+1},\ldots,x_n) = P x\), so \(P f_a(t) = P \tilde \Phi(f_a(t)) = P(a,t) = t\). Take total derivatives with respect to the variable \(t\), regarding \(a\) as fixed. This gives
    \[ D P f_a = I_{(n-m) \times (n-m)}. \]
  • Since \(P\) is linear, projecting onto the last variables commutes with \(D\), that is, \(D (P f_a) = P \, D f_a\). So the bottom \(n-m\) rows of \(Df_a\) form the identity matrix, and therefore \(Df_a\) has rank \(n-m\), which is full rank.
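
Example (Sphere)
The steps of the construction can be traced on a concrete case. As a sketch under assumed data (the map, the point, and the value of \(\delta\) below are illustrative choices, not taken from the proof), let \(n = 3\), \(m = 1\), \(\Phi(x) = x_1^2 + x_2^2 + x_3^2\), and \(p = (1,0,0)\). Then
\[ \tilde \Phi(x) = \left( x_1^2 + x_2^2 + x_3^2, \ x_2, \ x_3 \right), \qquad D \tilde \Phi(x) = \begin{bmatrix} 2x_1 & 2x_2 & 2x_3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]
so \(A = \begin{bmatrix} 2x_1 \end{bmatrix}\), \(B = \begin{bmatrix} 2x_2 & 2x_3 \end{bmatrix}\), and \(\det D \tilde \Phi(p) = 2 \neq 0\). On the region \(x_1 > 0\) the inverse is explicit:
\[ \tilde \Phi^{-1}(a,t_1,t_2) = \left( \sqrt{a - t_1^2 - t_2^2}, \ t_1, \ t_2 \right). \]
Taking \(\delta = \tfrac{1}{4}\) gives \(V_0 = \left( \tfrac{3}{4}, \tfrac{5}{4} \right)\) and \(V_1 = \left( -\tfrac{1}{4}, \tfrac{1}{4} \right) \times \left( -\tfrac{1}{4}, \tfrac{1}{4} \right)\), on which \(a - t_1^2 - t_2^2 > \tfrac{5}{8}\). With \(f_a(t) = \tilde \Phi^{-1}(a,t)\) and \(s = \sqrt{a - t_1^2 - t_2^2}\),
\[ D f_a(t) = \begin{bmatrix} -t_1/s & -t_2/s \\ 1 & 0 \\ 0 & 1 \end{bmatrix}. \]
The bottom two rows form the \(2 \times 2\) identity, so \(D f_a\) has rank \(2 = n - m\), as the general argument predicts.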