Inverse Functions

Video. Inverse Functions

Some Topics Covered: The Inverse Function Theorem for single variable functions; Intermediate Value Property of derivatives
Important (Errata)
  • On slide 3 (at the 17:30 mark in the video), I should have said \(f(x_0) = y\) rather than \(f(x_0) = I\).
  • At the 23:19 mark in the video, I should have written \(\tilde \epsilon = \frac{1}{2} f'(x_0)\) rather than \(\frac{1}{2} f(x_0)\).

1. Statement of the One Variable Inverse Function Theorem

Theorem
Suppose \(f\) is a differentiable function on some interval \(I := (a,b)\) and \(f'(x) \neq 0\) on \(I\). Then \(f\) is strictly monotone, maps \(I\) onto some open interval \(U\), and has an inverse function \(f^{-1} : U \rightarrow I\) which is differentiable and satisfies
\[ \frac{d}{dy} f^{-1}(y) = \frac{1}{f'(f^{-1}(y))} \text{ for all } y \in U. \]
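As a quick numerical sanity check of the derivative formula, here is a minimal sketch. The example function \(f(x) = x^3\) on \(I = (1,2)\), the sample points, and the finite-difference step are all assumptions chosen for illustration; they do not come from the lecture.

```python
def f(x):
    # Example function (an assumption for illustration): f(x) = x**3 on I = (1, 2).
    return x ** 3

def f_prime(x):
    # Derivative of the example function; nonzero on I.
    return 3.0 * x ** 2

def f_inverse(y):
    # Explicit inverse on U = f(I) = (1, 8).
    return y ** (1.0 / 3.0)

def inverse_derivative_numeric(y, step=1e-6):
    # Central difference approximation to d/dy f^{-1}(y).
    return (f_inverse(y + step) - f_inverse(y - step)) / (2.0 * step)

def inverse_derivative_formula(y):
    # Right-hand side of the theorem: 1 / f'(f^{-1}(y)).
    return 1.0 / f_prime(f_inverse(y))

for y in [1.5, 2.0, 4.0, 7.5]:
    numeric = inverse_derivative_numeric(y)
    formula = inverse_derivative_formula(y)
    print(f"y = {y}: numeric = {numeric:.8f}, formula = {formula:.8f}")
```

The two columns should agree to several decimal places, which is consistent with the theorem but of course proves nothing.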

2. Step 0: Simplifying Assumptions and Analysis of Differentiability

Assume for the moment that \(f'(x) > 0\) on all of \(I\). We immediately know by the Mean Value Theorem that \(f\) must be strictly increasing on \(I\); since it is \(1-1\) on \(I\), \(f\) is a bijection between \(I\) and its image \(f(I)\) and has a formal inverse \(f^{-1}\). We also know that \(f(I)\) has to be an open interval \(U\). It is an interval because \(f\) is differentiable, hence continuous, so the Intermediate Value Theorem applies. It is open because it can have no largest or smallest element (if \(z \in f(I)\), then \(z = f(c)\) for some \(c \in I\) and consequently \(z' := f(c')\) will be less than \(z\) when \(c' < c\) and will be greater than \(z\) when \(c' > c\)).

Suppose \(f\) is known to be differentiable at \(x \in I\) and have \(f'(x) = m\). The definition of the derivative implies that
\[ \left| \frac{f(x') - f(x)}{x'-x} - m \right| < \epsilon\]
for all \(x'\) satisfying \(0 < |x'-x| < \delta\), where \(\epsilon > 0\) is arbitrary and \(\delta > 0\) depends on \(\epsilon\). Manipulating the absolute values gives that
\[ m-\epsilon < \frac{f(x') - f(x)}{x'-x} < m + \epsilon\]
for \(0 < |x'-x| < \delta\), and multiplying by \(x'-x\) on both sides gives that \(f(x') - f(x)\) must lie between \((m+\epsilon)(x'-x)\) and \((m-\epsilon)(x'-x)\) whenever \(|x'-x| < \delta\) (the case \(x'=x\) also makes sense to consider because all the expressions are just zero here). Thus \(f(x')\) lies between \((m-\epsilon)(x'-x) + f(x)\) and \((m+\epsilon)(x'-x) + f(x)\) for all \(x'\) in the interval \(|x'-x| < \delta\). The expressions \((m-\epsilon)(x'-x) + f(x)\) and \((m+\epsilon)(x'-x) + f(x)\), as a function of \(x'\), describe lines through the point \((x,f(x))\) with slopes \(m-\epsilon,m+\epsilon\). Thus differentiability at \(x\) is the same as saying that near the point \(x\), the graph of \(f\) lies inside cones with vertex at \((x,f(x))\) and sides with slopes \(m-\epsilon,m+\epsilon\) (which one should regard as being very close to \(m\)).
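The cone picture can be tested numerically. The sketch below assumes the example \(f(x) = x^2\) at \(x = 1\) (so \(m = 2\)) with \(\epsilon = 0.1\). For this particular \(f\) the difference quotient equals \(x' + 1\), so \(\delta = \epsilon\) works. The function name `in_cone` is hypothetical.

```python
def f(x):
    # Example function (an assumption for illustration): f(x) = x**2.
    return x * x

def in_cone(x, x_prime, m, eps):
    # True when f(x') lies between the two lines through (x, f(x))
    # with slopes m - eps and m + eps.
    lower_line = (m - eps) * (x_prime - x) + f(x)
    upper_line = (m + eps) * (x_prime - x) + f(x)
    lo, hi = min(lower_line, upper_line), max(lower_line, upper_line)
    return lo <= f(x_prime) <= hi

x, m = 1.0, 2.0   # base point and slope f'(1) = 2
eps = 0.1
delta = eps       # valid for this f because the difference quotient is x' + 1

# Sample points x' with |x' - x| < delta, including x' = x itself.
samples = [x + delta * k / 100.0 for k in range(-99, 100)]
print(all(in_cone(x, xp, m, eps) for xp in samples))
```

Taking the minimum and maximum of the two lines handles both sides of \(x\) at once, since the lines swap roles when \(x' - x\) changes sign.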

3. Step 1: Writing a Useful Formula for \(f^{-1}\).

Proposition
For \(f\) as described previously,
\[ f^{-1}(y) = \sup \{ x \in I \ : f(x) \leq y \}\]
for all \(y \in U\).
Proof
Since \(y \in U = f(I)\), \(y = f(x_0)\) for some \(x_0 \in I\). Since \(f\) is strictly increasing, \(f(x) \leq y\) if and only if \(x \leq x_0\). So the set of points \(x \in I\) for which \(f(x) \leq y\) is exactly the set of points \(x \in I\) with \(x \leq x_0\), whose supremum is \(x_0\).
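The supremum formula is essentially what bisection computes. The following is a minimal sketch under assumptions: the example function \(f(x) = x^3 + x\) on \(I = (0,2)\), the helper name `inverse_via_sup`, and the iteration count are all chosen for illustration.

```python
def f(x):
    # Example function (an assumption for illustration): increasing on I = (0, 2)
    # because f'(x) = 3x^2 + 1 > 0.
    return x ** 3 + x

def inverse_via_sup(y, a=0.0, b=2.0, iterations=60):
    # Approximates sup { x in (a, b) : f(x) <= y } by bisection.
    # Assumes y lies in U = (f(a), f(b)), so that f(lo) <= y < f(hi) throughout.
    lo, hi = a, b
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) <= y:
            lo = mid   # mid belongs to the set, so the sup is at least mid
        else:
            hi = mid   # mid is an upper bound for the set
    return lo

x0 = 1.25
y = f(x0)
print(inverse_via_sup(y), x0)
```

The two printed values should agree up to the bisection tolerance, matching the claim that the supremum is \(x_0\).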

From here, we would like to establish continuity of \(f^{-1}\). In practice, this means that for any fixed \(y_0 \in U\) and any \(\epsilon > 0\), there is some \(h > 0\) such that \(y' \in U\) and \(|y'-y_0| < h\) implies
\[ \left| \sup \{ x \in I \ : f(x) \leq y' \} - \sup \{ x \in I \ : f(x) \leq y_0 \} \right| < \epsilon. \]
Write \(x_0 := f^{-1}(y_0)\) and \(m := f'(x_0) > 0\). We know (taking the slope tolerance to be \(m/2\) when trapping the graph of \(f\) between two lines) that there is some \(\eta > 0\) such that \(f(x') - f(x_0)\) lies between \(\frac{1}{2} m (x'-x_0)\) and \(\frac{3}{2} m(x'-x_0)\) when \(x' \in I\) has \(|x'-x_0| < \eta\).

Now take \(x_+\) and \(x_{-}\) both to lie in the interval \(I \cap \{ x \ : \ |x - x_0| < \eta \}\) such that \(x_+ > x_0\) and \(x_{-} < x_0\) and let
\[ \delta := \min \{f(x_+)-f(x_0), f(x_0)-f(x_-) \}\]
which we know is strictly positive because \(f\) is increasing. If \(y_1 \in N_\delta(f(x_0))\), then \(y_1\) is necessarily between \(f(x_-)\) and \(f(x_+)\), and then the Intermediate Value Theorem guarantees the existence of some \(x_1\) between \(x_-\) and \(x_+\) such that \(y_1 = f(x_1)\). Using our inequality above,
\[ \frac{m}{2} |x_1-x_0| \leq |y_1 - y_0| \leq \frac{3m}{2} |x_1-x_0| \]
where \(y_0 := f(x_0)\) and \(y_1 = f(x_1)\).
In summary:
Proposition
For \(f\) as above, if \(x_0 \in I\) and \(y_0 = f(x_0)\), then there exists some \(\delta > 0\) such that for all \(h\) with \(|h| < \delta\), there is a solution \(x_1\) of the equation
\[ f(x_1) = f(x_0) + h \]
such that \(|x_1 - x_0| \leq \frac{2|h|}{f'(x_0)}\).
(Compare to the 28:30 mark in the video.)
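To see the bound \(|x_1 - x_0| \leq 2|h|/f'(x_0)\) in a concrete case, here is a small sketch. The example \(f(x) = e^x\) at \(x_0 = 0\) is an assumption. The value \(\delta = 0.5\) is also assumed rather than derived from the construction above; the check merely confirms that it works for this \(f\).

```python
import math

def f(x):
    # Example function (an assumption for illustration): f(x) = exp(x), f' > 0.
    return math.exp(x)

x0 = 0.0
m = math.exp(x0)   # f'(x0) = 1
delta = 0.5        # assumed value of delta; not derived, only checked below

def solve(h):
    # Exact solution x1 of f(x1) = f(x0) + h for this particular f.
    return math.log(f(x0) + h)

def bound_holds(h):
    # Checks |x1 - x0| <= 2|h| / f'(x0).
    x1 = solve(h)
    return abs(x1 - x0) <= 2.0 * abs(h) / m

# Sample values of h with |h| < delta, including h = 0.
hs = [delta * k / 50.0 for k in range(-49, 50)]
print(all(bound_holds(h) for h in hs))
```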

As a corollary, \(f^{-1}\) must be continuous at \(y_0\): For any \(\epsilon > 0\), fix \(\delta\) to be the minimum of the \(\delta\) from the proposition above and the quantity \(f'(x_0) \epsilon/2\). Then for any \(y_1\) with \(|y_1 - y_0| < \delta\), \(f^{-1}(y_1)\) must satisfy \(|f^{-1}(y_1) - x_0| \leq 2 |y_1 - y_0| / f'(x_0) < \epsilon\) (because \(f^{-1}(y_1)\) is the unique solution of the equation \(f(x_1) = y_1\) on the interval \(I\)).

4. Step 2: Differentiability of \(f^{-1}\)

We have established already that when \(f' > 0\) on its domain \(I\), its inverse function \(f^{-1}\) exists and must be continuous. Now we establish differentiability. The proof is only a minor variation of the argument above. As before, let \(x_0 \in I\), \(y_0 := f(x_0)\), and \(m := f'(x_0) > 0\), and let \(\rho \in (0,m)\) be fixed. There is some \(\eta > 0\) such that \(f(x') - f(x_0)\) lies between \((m - \rho) (x'-x_0)\) and \((m+\rho)(x'-x_0)\) when \(x' \in I\) has \(|x'-x_0| < \eta\).

Now take \(x_+\) and \(x_{-}\) both to lie in the interval \(I \cap \{ x \ : \ |x - x_0| < \eta \}\) such that \(x_+ > x_0\) and \(x_{-} < x_0\) and let
\[ \delta := \min \{f(x_+)-f(x_0), f(x_0)-f(x_-) \}\]
which we know is strictly positive because \(f\) is increasing. If \(y_1 \in N_\delta(f(x_0))\), then \(y_1\) is necessarily between \(f(x_-)\) and \(f(x_+)\), and then the Intermediate Value Theorem guarantees that \(f^{-1}(y_1)\) lies between \(x_-\) and \(x_+\), which then means that \(y_1 - y_0\) lies between \((m-\rho)(f^{-1}(y_1) - f^{-1}(y_0))\) and \((m+\rho)(f^{-1}(y_1) - f^{-1}(y_0))\). Thus
\[ \frac{1}{m+\rho} \leq \frac{f^{-1}(y_1) - f^{-1}(y_0)}{y_1 - y_0} \leq \frac{1}{m-\rho}\]
when \(0 < |y_1 - y_0| < \delta\). For any fixed \(\epsilon > 0\), the quantity \(\rho\) can be chosen so that \((m+\rho)^{-1} > m^{-1} - \epsilon\) and \((m-\rho)^{-1} < m^{-1} + \epsilon\), which then implies that when \(0 < |y_1 - y_0| < \delta\),
\[ \frac{1}{m} - \epsilon < \frac{f^{-1}(y_1) - f^{-1}(y_0)}{y_1 - y_0} < \frac{1}{m} + \epsilon. \]
This exactly implies that \(f^{-1}\) is differentiable at \(y_0\) with derivative \(m^{-1} = 1/f'(f^{-1}(y_0))\) as required by the Inverse Function Theorem.
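The step "the quantity \(\rho\) can be chosen" can be made explicit. Solving \((m-\rho)^{-1} < m^{-1} + \epsilon\) for \(\rho\) gives \(\rho < \epsilon m^2/(1+\epsilon m)\), and any such \(\rho\) also satisfies the other inequality. The sketch below takes half of that threshold; this specific choice and the function names `choose_rho` and `rho_works` are assumptions for illustration, not the choice made in the lecture.

```python
def choose_rho(m, eps):
    # One admissible choice (an assumption for illustration): half of the
    # threshold eps * m**2 / (1 + eps * m), which always lies in (0, m).
    return 0.5 * eps * m * m / (1.0 + eps * m)

def rho_works(m, eps):
    # Verifies rho in (0, m) together with both inequalities from the text.
    rho = choose_rho(m, eps)
    lower_ok = 1.0 / (m + rho) > 1.0 / m - eps
    upper_ok = 1.0 / (m - rho) < 1.0 / m + eps
    return 0.0 < rho < m and lower_ok and upper_ok

for m in [0.1, 1.0, 7.0]:
    for eps in [1e-3, 0.5, 10.0]:
        print(m, eps, choose_rho(m, eps), rho_works(m, eps))
```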

5. Step 3: Cleaning Up Remaining Cases

We have assumed that \(f'(x) > 0\) on \(I\); if instead \(f'(x) < 0\) on \(I\), one can define \(g(x) = f(-x)\) on the interval \(-I := (-b,-a)\) and establish that \(g^{-1}\) exists, is differentiable, and has \((g^{-1})'(y) = 1/g'(g^{-1}(y))\) (because \(g' > 0\) now). However, \(g (g^{-1}(y)) = y\) implies \(f(-g^{-1}(y)) = y\), meaning that \(-g^{-1}(y)\) is necessarily the inverse function of \(f\). Thus \(-f^{-1}\) is differentiable and
\[ - \frac{d}{dy} f^{-1}(y) = \frac{1}{g'(g^{-1}(y))} = \frac{1}{-f'(-g^{-1}(y))} = - \frac{1}{f'(f^{-1}(y))}. \]
Finally, assuming merely that \(f' \neq 0\) on the interval \(I\), the Intermediate Value Property of Derivatives (see 46:45 mark in the video for an alternate proof) implies that \(f'\) must either be positive everywhere or negative everywhere, so every possible case of the theorem has already been established.
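The reflection trick \(g(x) = f(-x)\) for the case \(f' < 0\) can also be checked numerically. This is a minimal sketch assuming the example \(f(x) = e^{-x}\) on \(I = (0,1)\), so that \(g(x) = e^{x}\) on \(-I = (-1,0)\); the sample points and step size are likewise assumptions.

```python
import math

def f(x):
    # Example function (an assumption for illustration): f(x) = exp(-x),
    # so f'(x) = -exp(-x) < 0 on I = (0, 1).
    return math.exp(-x)

def f_prime(x):
    return -math.exp(-x)

def g(x):
    # Reflected function on -I = (-1, 0); here g(x) = exp(x) and g' > 0.
    return f(-x)

def g_inverse(y):
    # Inverse of g on U = (exp(-1), 1).
    return math.log(y)

def f_inverse(y):
    # The identity from the text: f^{-1}(y) = -g^{-1}(y).
    return -g_inverse(y)

def numeric_derivative(func, y, step=1e-6):
    # Central difference approximation to func'(y).
    return (func(y + step) - func(y - step)) / (2.0 * step)

for y in [0.4, 0.6, 0.9]:
    print(y, numeric_derivative(f_inverse, y), 1.0 / f_prime(f_inverse(y)))
```

The last two columns should agree closely, and both are negative, as expected when \(f\) is decreasing.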