Mean Value Theorems

Video. Mean Value Theorems

1. Main Takeaway: Think 1D

Meta (Mean Value Theorems are 1D)
Several of the most obvious ways that one might generalize the Mean Value Theorem to higher dimensions are simply false:
  • The real-valued function \(f(x,y) = x-y\) has \(f(1,1) - f(0,0) = 0\), but the total derivative \(D f\) and the coordinate partial derivatives are never zero. It is, however, trivially true that some directional derivative of \(f\) (here, the one in the direction \((1,1)\)) is zero at every point.
  • The \({\mathbb R}^2\)-valued function \(f(t) = (\cos t, \sin t)\) has \(f(2\pi) - f(0) = (0,0)\), but its derivative is never zero.
Most often the MVT is adapted to higher dimensions by comparing the values of the function at points \(a\) and \(b\) after connecting them by a path, typically a line segment. In this case, one thinks of the 1D interval domain of \(f\) as generalizing to a convex set \(K\).
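The second counterexample above can be checked numerically. The following is only a minimal sketch: the helper names `f` and `f_prime` and the 1001-point sampling grid are illustrative choices, not part of the notes.

```python
import math

def f(t):
    """The curve f(t) = (cos t, sin t) from the second counterexample."""
    return (math.cos(t), math.sin(t))

def f_prime(t):
    """Its derivative f'(t) = (-sin t, cos t)."""
    return (-math.sin(t), math.cos(t))

# The endpoint values coincide (up to floating-point rounding) ...
diff = tuple(p - q for p, q in zip(f(2 * math.pi), f(0.0)))

# ... yet the derivative has Euclidean length 1 at every sampled point,
# so no xi can satisfy f(2*pi) - f(0) = f'(xi) * (2*pi - 0).
speeds = [math.hypot(*f_prime(2 * math.pi * k / 1000)) for k in range(1001)]

print("f(2*pi) - f(0) =", diff)
print("min |f'(t)| on the grid =", min(speeds))
```

Since \(||f'(t)|| = 1\) identically, the grid is not really needed here; it is included only to show how one might probe a less transparent example.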

2. Common Multivariate Variations on the Mean Value Theorem

Throughout this section, it will be assumed that \(f\) is some continuous function (real-valued or \({\mathbb R}^m\)-valued depending on context) defined on some compact, convex set \(K \subset {\mathbb R}^n\) and differentiable on the interior of \(K\). The interior of \(K\) will be called \(U\). Points \(a,b \in {\mathbb R}^n\) are assumed to belong to \(K\) and have the property that the line segment \(L(a,b)\) joining \(a\) and \(b\) intersects the boundary \(\partial K\) at most at the endpoints \(a\) and \(b\) (possibly at only one of them, or at neither). The points \(a\) and \(b\) do not technically need to be distinct (as all theorems below will be trivially true in this case).

2.1. MVT for (Mostly) Scalar Functions

Theorem
For \(f\) and \(a,b\) as above, if \(f\) is real-valued, then there exists \(\xi \in U\) such that
\[ f(b) - f(a) = (D_\xi f) (b-a). \]
If \(f\) is \(C^1\) on \(U\), then it is also true that
\[ f(b) - f(a) = \int_0^1 (D_{t b + (1-t)a} f)(b-a) \, dt. \]
Proof
Let \(g(t) := f(t b + (1-t) a)\) for \(t \in [0,1]\). By the Chain Rule, \(g'(t) = (D_{tb + (1-t)a} f) (b-a)\) for all \(t \in (0,1)\), where the point \(tb + (1-t)a\) lies in \(U\) (this holds even if \(a=b\), since \(g\) is then constant). In the first case, apply the one-dimensional Mean Value Theorem to \(g\) on the interval \([0,1]\). In the second case, apply the Fundamental Theorem of Calculus to say that \(g(1) - g(0) = \int_0^1 g'(t) dt\).
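The reduction to \(g\) in the proof can be traced numerically. The sketch below assumes the sample function \(f(x,y) = x e^y\) and the endpoints \(a = (0,0)\), \(b = (1,1)\); none of these come from the notes. It locates \(\xi\) by bisection (which works here only because \(g'\) happens to be increasing) and approximates the integral by the midpoint rule.

```python
import math

def f(x, y):
    """Sample scalar function (an assumption for illustration)."""
    return x * math.exp(y)

def grad_f(x, y):
    """Gradient (df/dx, df/dy) of the sample function."""
    return (math.exp(y), x * math.exp(y))

a = (0.0, 0.0)
b = (1.0, 1.0)
d = (b[0] - a[0], b[1] - a[1])  # the vector b - a

def g_prime(t):
    """g'(t) = (D_{tb+(1-t)a} f)(b - a), computed via the Chain Rule."""
    x, y = a[0] + t * d[0], a[1] + t * d[1]
    gx, gy = grad_f(x, y)
    return gx * d[0] + gy * d[1]

target = f(*b) - f(*a)  # f(b) - f(a)

# Bisection for g'(t) = f(b) - f(a); valid here because g' is increasing
# with g'(0) < target < g'(1).
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if g_prime(mid) < target:
        lo = mid
    else:
        hi = mid
t_star = 0.5 * (lo + hi)
xi = (a[0] + t_star * d[0], a[1] + t_star * d[1])

# Midpoint-rule approximation of the integral of g' over [0, 1].
n = 10000
integral = sum(g_prime((k + 0.5) / n) for k in range(n)) / n

print("f(b) - f(a)        =", target)
print("(D_xi f)(b - a)    =", g_prime(t_star), "at xi =", xi)
print("integral of g'     =", integral)
```

For a general \(f\) the derivative \(g'\) need not be monotone, so a different root-finding strategy would be needed to locate \(\xi\); the theorem itself only guarantees existence.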
Corollary
For \(f\) and \(a,b\) as above, if \(f\) is \({\mathbb R}^m\)-valued, then for any \(\ell \in {\mathbb R}^m\) there exists \(\xi \in U\) such that
\[ \ell \cdot (f(b) - f(a)) = \ell \cdot (D_\xi f) (b-a). \]
If \(f\) is \(C^1\) on \(U\), then it is also true that
\[ \ell \cdot (f(b) - f(a)) = \int_0^1 \ell\cdot(D_{t b + (1-t)a} f)(b-a) \, dt. \]
Proof
Apply the scalar-valued Mean Value Theorem to \(\ell \cdot f(x)\).
Corollary
If \(U \subset {\mathbb R}^n\) is open and connected and if \(f : U \rightarrow {\mathbb R}^m\) is differentiable and satisfies \(D_x f = 0\) for all \(x \in U\), then \(f\) is constant on \(U\).
Proof
Fix \(a,b \in U\) such that the line segment \(L(a,b)\) is contained in \(U\). Fixing \(\ell := f(b) - f(a)\) and applying the corollary above gives that there is some \(\xi \in U\) such that
\[ ||f(b) - f(a)||^2 = \ell \cdot (f(b) - f(a)) = \ell \cdot (D_\xi f)(b-a) = 0 \]
because \(D_\xi f = 0\). Thus \(f\) must be constant on any convex open subset of \(U\). Now pick any \(p \in U\) and consider the sets \(\{ x \in U \ : \ f(x) = f(p) \}\) and \(\{ x \in U \ : \ f(x) \neq f(p) \}\). The former set is open because, for any \(x\) in that set, \(f\) must be constant on all sufficiently small balls centered at \(x\), meaning that some small ball centered at \(x\) is contained in the set. On the other hand, the second set is open because it is the inverse image of \({\mathbb R}^m \setminus \{f(p)\}\) (an open set) via \(f\) (a continuous function). Thus these two disjoint open sets would separate \(U\) if neither were empty, which is impossible because \(U\) is connected. Because \(p\) belongs to the set \(\{ x \in U \ : \ f(x) = f(p) \}\), the set \(\{ x \in U \ : \ f(x) \neq f(p) \}\) must be the empty one.

2.2. MVT for Vector Functions

In this section, we use the properties of convex functions established in the Convex Function Supplement to prove another version of the Mean Value Theorem which is valid for vector-valued functions.
Theorem
For \(f\) and \(a,b\) as above, if \(f\) is \({\mathbb R}^m\)-valued and \(\varphi\) is any convex function defined on all of \({\mathbb R}^m\), then there exists \(\xi \in U\) such that
\[ \varphi(f(b) - f(a)) \leq \varphi((D_\xi f)(b-a)). \]
If \(f\) is \(C^1\) on \(U\), then it is also true that
\[ \varphi(f(b) - f(a)) \leq \int_0^1 \varphi((D_{t b + (1-t)a} f)(b-a)) \, dt. \]
Proof
Meta (Main Idea)
Set \(y_0 := f(b) - f(a)\). Because \(\varphi\) is convex, we know that there must be some \(\ell' \in {\mathbb R}^m\) and some constant \(c\) such that \(\varphi(y) \geq c + \ell' \cdot y\) for all \(y \in {\mathbb R}^m\), with equality at \(y = y_0\). Apply the earlier Mean Value Theorem to the function \(x \mapsto \ell' \cdot f(x)\) to get
\[ \ell' \cdot y_0 = \ell' \cdot ((D_\xi f)(b-a)) \]
and
\[ \ell' \cdot y_0 = \int_0^1 \ell' \cdot ((D_{tb + (1-t)a}f)(b-a)) dt. \]
Add \(c\) to both sides of both identities and, in the case of the latter identity, bring the \(c\) inside the integral (which is allowed because \(\int_0^1 dt = 1\)). Since \(\varphi(y_0) = c + \ell' \cdot y_0\), this gives
\[ \varphi(y_0) = c + \ell' \cdot ((D_\xi f)(b-a)) \]
and
\[ \varphi(y_0) = \int_0^1 \left[ c + \ell' \cdot ((D_{tb + (1-t)a}f)(b-a)) \right] dt. \]
To finish, apply the inequality \(c + \ell' \cdot y \leq \varphi(y)\) on the right-hand side of both identities.
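As a sanity check on the integral form of the inequality, one can test it numerically. The sketch below is an illustration under assumptions that are not in the notes: the curve \(f(t) = (\cos t, \sin t)\) on \([0, \pi/2]\) (so \(n = 1\), \(m = 2\)) and two sample convex functions, the squared Euclidean norm and log-sum-exp. The integral is approximated by the midpoint rule.

```python
import math

def f(t):
    """Sample curve f(t) = (cos t, sin t), an assumption for illustration."""
    return (math.cos(t), math.sin(t))

def f_prime(t):
    """Derivative of the sample curve."""
    return (-math.sin(t), math.cos(t))

def sq_norm(y):
    """Convex function phi(y) = y_1^2 + y_2^2."""
    return y[0] ** 2 + y[1] ** 2

def log_sum_exp(y):
    """Convex function phi(y) = log(exp(y_1) + exp(y_2))."""
    return math.log(math.exp(y[0]) + math.exp(y[1]))

a, b = 0.0, math.pi / 2

def check(phi, n=10000):
    """Return (phi(f(b) - f(a)), midpoint-rule value of the integral)."""
    y0 = tuple(p - q for p, q in zip(f(b), f(a)))
    lhs = phi(y0)
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        x = s * b + (1 - s) * a                       # point on the segment
        v = tuple((b - a) * c for c in f_prime(x))    # (D_x f)(b - a)
        total += phi(v)
    return lhs, total / n

results = {phi.__name__: check(phi) for phi in (sq_norm, log_sum_exp)}
for name, (lhs, rhs) in results.items():
    print(name, ": phi(f(b)-f(a)) =", lhs, " integral =", rhs)
```

For the squared norm the two sides can be computed by hand: the left side is \(||(-1,1)||^2 = 2\) and the integrand is constantly \((\pi/2)^2\).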
Corollary
For \(f\) and \(a,b\) as above, if \(f\) is \({\mathbb R}^m\)-valued and \(||\cdot||\) denotes fixed norms on \({\mathbb R}^n\) and on \({\mathbb R}^m\), then there exists \(\xi \in U\) such that
\[ ||f(b) - f(a)|| \leq ||| D_\xi f||| \cdot ||b-a||. \]
If \(f\) is \(C^1\) on \(U\), then it is also true that
\[ ||f(b) - f(a)|| \leq \int_0^1 |||D_{t b + (1-t)a} f||| \cdot ||b-a|| \, dt. \]
In both cases, \(|||D_x f|||\) is the operator norm, i.e., \(|||D_x f||| := \sup_{||v|| \leq 1} || (D_x f)( v)||\).
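The norm inequality can be illustrated on a concrete map. The sketch below assumes the sample map \(f(x,y) = (x^2 - y^2, 2xy)\), the Euclidean norm on both \({\mathbb R}^2\) factors, and the endpoints \(a = (1,0)\), \(b = (0,2)\); these are illustrative choices, not taken from the notes. The helper `op_norm_2x2` is a hypothetical name for the standard closed-form largest singular value of a \(2 \times 2\) matrix.

```python
import math

def f(x, y):
    """Sample map (x, y) -> (x^2 - y^2, 2xy), an assumption for illustration."""
    return (x * x - y * y, 2 * x * y)

def jacobian(x, y):
    """Matrix of D_{(x,y)} f, as a tuple of rows."""
    return ((2 * x, -2 * y), (2 * y, 2 * x))

def op_norm_2x2(m):
    """Euclidean operator norm (largest singular value) of a 2x2 matrix."""
    (p, q), (r, s) = m
    frob2 = p * p + q * q + r * r + s * s          # trace of M^T M
    det = p * s - q * r                            # det(M^T M) = det^2
    disc = max(0.0, frob2 * frob2 - 4 * det * det) # guard against rounding
    return math.sqrt((frob2 + math.sqrt(disc)) / 2)

a = (1.0, 0.0)
b = (0.0, 2.0)
step = math.hypot(b[0] - a[0], b[1] - a[1])        # ||b - a||

fa, fb = f(*a), f(*b)
lhs = math.hypot(fb[0] - fa[0], fb[1] - fa[1])     # ||f(b) - f(a)||

# Sample the operator norm along the segment at midpoints of n subintervals.
n = 10000
norms = []
for k in range(n):
    t = (k + 0.5) / n
    x = a[0] + t * (b[0] - a[0])
    y = a[1] + t * (b[1] - a[1])
    norms.append(op_norm_2x2(jacobian(x, y)))

integral_bound = step * sum(norms) / n   # midpoint rule for the integral bound
sup_bound = step * max(norms)            # cruder bound from the largest sample

print("||f(b) - f(a)||        =", lhs)
print("integral bound         =", integral_bound)
print("sup-over-samples bound =", sup_bound)
```

The integral bound is never worse than the bound obtained from the largest operator norm along the segment, which is the form in which the estimate is often applied.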
Note
In the vector version of the Mean Value Theorem, one does not need \(\varphi\) to be defined on all of \({\mathbb R}^m\); it suffices to assume that its domain contains a convex open neighborhood of all points of the form \((D_\xi f)(b-a)\) for \(\xi\) ranging over the line segment joining \(a\) to \(b\). Let that open set be called \(O\). Using the Separating Hyperplane Theorem, we can show that the closure of \(O\) must be the intersection of half spaces \(\{x \ : \ \ell \cdot x \geq c\}\) over some suitable (possibly uncountable) collection of \((\ell,c) \in {\mathbb R}^m \times {\mathbb R}\). For any such pair \((\ell,c)\), the Mean Value Theorem applied to \(\ell \cdot f\) implies that \(\ell \cdot (f(b) - f(a)) = \ell \cdot (D_\xi f) (b-a)\) for some \(\xi\) on the line segment joining \(a\) to \(b\). In particular, \(\ell \cdot (f(b) - f(a)) > c\) since \((D_\xi f)(b-a)\) can never be on the boundary of \(\overline{O}\). This implies that \(f(b) - f(a)\) belongs to \(\overline{O}\) and is not on the boundary (since equality is always attained at every boundary point for some \((\ell,c)\)). Thus \(f(b) - f(a)\) belongs to \(O\). This means that \(\varphi(f(b) - f(a))\) is well-defined as well as \(\varphi((D_\xi f)(b-a))\) for every relevant \(\xi\). Moreover \(\varphi\) is continuous on \(O\), so \(\varphi((D_{tb + (1-t)a} f)(b-a))\) is continuous (and therefore Riemann integrable) for \(t \in [0,1]\).