Partial Derivatives

Video. Partial Derivatives

1. Definition and Intuition

Suppose \(f\) is a function on some open subset \(U \subset {\mathbb R}^n\). For each \(i \in \{1,\ldots,n\}\), let \(e_i\) be the standard basis vector in direction \(i\). We define the partial derivative with respect to \(x_i\) to equal
\[ \frac{\partial f}{\partial x_i} (x) := \lim_{h \rightarrow 0} \frac{f(x + h e_i) - f(x)}{h} = \left. \frac{d}{dh} f(x + h e_i) \right|_{h=0} \]
whenever the limit exists.
Meta (Intuition)
The partial derivative in direction \(x_i\) is exactly the quantity that one obtains by computing the standard, one dimensional derivative of the function \(f\) along a line that points in the \(i\)-th coordinate direction. We may think of it as fixing all coordinates of \(x\) except the \(i\)-th one and studying the variation of \(f\) in that specific direction.
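The "fix every coordinate but one" description can be tried numerically. The following is only a sketch: the helper name `partial_derivative`, the step size, and the test function are illustrative choices, not part of the notes. It uses a central difference rather than the one-sided quotient in the definition, and it indexes coordinates from 0 instead of 1.

```python
def partial_derivative(f, x, i, h=1e-6):
    """Approximate the partial derivative of f at x in coordinate i.

    Only coordinate i is perturbed; all other coordinates stay fixed.
    A central difference replaces the limit in the definition.
    """
    x_plus = list(x)
    x_minus = list(x)
    x_plus[i] += h
    x_minus[i] -= h
    return (f(x_plus) - f(x_minus)) / (2.0 * h)


def f(x):
    # Illustrative function f(x, y) = x^2 y + y^3.
    return x[0] ** 2 * x[1] + x[1] ** 3


point = [1.0, 2.0]
df_dx = partial_derivative(f, point, 0)  # exact value is 2xy = 4
df_dy = partial_derivative(f, point, 1)  # exact value is x^2 + 3y^2 = 13
print(df_dx, df_dy)
```

The printed values should agree with the exact partials 4 and 13 up to discretization and rounding error.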

2. Clairaut's Theorem

Theorem (Clairaut's Theorem)
Suppose \(f\) is a real-valued function on some open subset \(U\) of \({\mathbb R}^n\). If both functions
\[ \frac{\partial}{\partial x_i} \left[ \frac{\partial f}{\partial x_j} \right] \text{ and } \frac{\partial}{\partial x_j} \left[ \frac{\partial f}{\partial x_i} \right]\]
exist everywhere in \(U\) and are continuous throughout \(U\), then they are equal. Similarly, if one of the two second partial derivatives exists and is continuous, and if all first partial derivatives exist, then the second partial derivative taken in the opposite order also exists and equals the first.
Proof
Meta (Main Idea)
Let \(e_i\) be the unit vector pointing in the \(i\)-th coordinate direction and likewise for \(e_j\). Consider the expression
\[ Q(x,h,k) := f( x + h e_i + k e_j) - f(x + h e_i) - f(x + k e_j) + f(x). \]
By the Mean Value Theorem applied twice (first in the \(e_i\) direction, then in the \(e_j\) direction),
\[\begin{aligned}
Q(x,h,k) &= \big( f(x + h e_i + k e_j) - f(x + h e_i) \big) - \big( f(x + k e_j) - f(x) \big) \\
&= h \left[ \frac{\partial f}{\partial x_i} (x + \xi_{x,h,k} e_i + k e_j) - \frac{\partial f}{\partial x_i} (x + \xi_{x,h,k} e_i) \right] \\
&= hk \frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i} (x + \xi_{x,h,k} e_i + \eta_{x,h,k} e_j)
\end{aligned}\]
for some \(\xi_{x,h,k}\) between \(0\) and \(h\) and some \(\eta_{x,h,k}\) between \(0\) and \(k\).
Figure. Illustration of the Proof of Clairaut's Theorem
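Set \(h = k\) and let \(h \rightarrow 0\). Since \(\xi_{x,h,h}\) and \(\eta_{x,h,h}\) both tend to \(0\), continuity of the mixed partial derivative at \(x\) implies that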
\[\lim_{h \rightarrow 0} h^{-2} Q(x,h,h) = \frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}(x).\]
However, \(Q(x,h,h)\) is symmetric in \(i\) and \(j\), so the same argument with the roles of \(i\) and \(j\) exchanged gives
\[\lim_{h \rightarrow 0} h^{-2} Q(x,h,h) = \frac{\partial}{\partial x_i} \frac{\partial f}{\partial x_j}(x).\]

Uniqueness of limits establishes the theorem.
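The limit used in the proof can be observed numerically. The sketch below is an illustration under stated assumptions: the function \(f(x,y) = \sin(x) e^{2y}\), the base point, and the helper name `second_difference` are all chosen here and do not come from the notes. It evaluates \(h^{-2} Q(x,h,h)\) for shrinking \(h\), once for each ordering of the two directions, and compares against the exact mixed partial \(2 \cos(x) e^{2y}\).

```python
import math


def second_difference(f, x, i, j, h, k):
    """Compute Q(x, h, k) = f(x + h e_i + k e_j) - f(x + h e_i)
    - f(x + k e_j) + f(x), with 0-based coordinate indices i and j."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)
    return shifted(h, k) - shifted(h, 0.0) - shifted(0.0, k) + shifted(0.0, 0.0)


def f(x):
    # Illustrative smooth function f(x, y) = sin(x) * exp(2y).
    return math.sin(x[0]) * math.exp(2.0 * x[1])


point = [0.5, 0.3]
exact = 2.0 * math.cos(point[0]) * math.exp(2.0 * point[1])

quotients = []
for h in (1e-1, 1e-2, 1e-3):
    q_ij = second_difference(f, point, 0, 1, h, h) / h ** 2
    q_ji = second_difference(f, point, 1, 0, h, h) / h ** 2
    quotients.append((h, q_ij, q_ji))
    print(h, q_ij, q_ji, exact)
```

Both quotients should approach the same value as \(h\) shrinks, which is the symmetry the proof exploits. Taking \(h\) much smaller than this eventually lets floating-point cancellation dominate, so the experiment stops at \(10^{-3}\).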
Note
If we merely assume that \(\frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}(x)\) is continuous and that all first-order partials exist, then \(\lim_{k \rightarrow 0} k^{-1} Q(x,h,k)\) must exist and equal
\[ \frac{\partial f}{\partial x_j}(x + h e_i) - \frac{\partial f}{\partial x_j}(x). \]
By taking an appropriate subsequence of \(k_m\) so that \(\xi_{x,h,k_m}\) converges as \(k_m \rightarrow 0\) (which exists because the \(\xi_{x,h,k}\) are bounded), continuity of \(\frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}(x)\) alone would imply as above that
\[ \frac{\partial f}{\partial x_j}(x + h e_i) - \frac{\partial f}{\partial x_j}(x) = h \frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}(x + \xi_{x,h} e_i) \]
for some \(\xi_{x,h}\) between \(0\) and \(h\) (inclusive). Dividing by \(h\) and letting \(h \rightarrow 0\) implies that \(\frac{\partial}{\partial x_i} \frac{\partial f}{\partial x_j}(x)\) exists and equals \(\frac{\partial}{\partial x_j} \frac{\partial f}{\partial x_i}(x)\). So we really only need to assume that one of the orderings exists and is continuous.
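To see that some continuity hypothesis is needed, one can experiment with a standard textbook example that is not part of these notes: \(g(x,y) = xy(x^2 - y^2)/(x^2 + y^2)\) with \(g(0,0) = 0\). Both mixed second partials of \(g\) exist at the origin but differ, equalling \(-1\) and \(+1\), and neither is continuous there. The sketch below approximates them with nested central differences; the helper names and step sizes are illustrative assumptions.

```python
def g(x, y):
    # Standard counterexample; defined to be 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def d_dx(func, x, y, h):
    """Central-difference approximation of the x-partial of func at (x, y)."""
    return (func(x + h, y) - func(x - h, y)) / (2.0 * h)


def d_dy(func, x, y, h):
    """Central-difference approximation of the y-partial of func at (x, y)."""
    return (func(x, y + h) - func(x, y - h)) / (2.0 * h)


def g_x(x, y):
    # First partial in x, using a much smaller inner step than the outer one.
    return d_dx(g, x, y, 1e-7)


def g_y(x, y):
    # First partial in y, same inner step.
    return d_dy(g, x, y, 1e-7)


# Differentiate in x first, then in y (and vice versa) at the origin.
dy_of_gx = d_dy(g_x, 0.0, 0.0, 1e-3)
dx_of_gy = d_dx(g_y, 0.0, 0.0, 1e-3)
print(dy_of_gx, dx_of_gy)
```

The two printed numbers should come out close to \(-1\) and \(+1\) respectively. This does not contradict the theorem, because its continuity hypothesis fails for \(g\) at the origin.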
Corollary
Fix \(k \geq 1\). Suppose that \(f\) is a function on an open subset of \({\mathbb R}^n\) with the property that any sequence \((d_1,\ldots,d_\ell) \in \{1,\ldots,n\}^{\ell}\) of length \(\ell \in \{1,\ldots,k\}\) can be reordered in some way \((d_1',\ldots,d_\ell')\) such that
\[ \frac{\partial}{\partial x_{d'_\ell}} \cdots \frac{\partial}{\partial x_{d'_1}} f(x) \]
exists and is continuous. Then all mixed partial derivatives of order at most \(k\) exist, are continuous, and are independent of the ordering of the partial derivatives.
Proof
Meta (Main Idea)
Given a particular ordering of \((d_1,\ldots,d_\ell)\) that is known to yield a mixed partial that exists and is continuous, Clairaut's Theorem and induction on \(k\) guarantee that any two adjacent elements of the sequence may be transposed. The corollary follows from the fact that all permutations may be achieved via a finite number of transpositions of adjacent elements.
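The combinatorial fact at the end of the proof, that any reordering can be reached by swapping adjacent entries, can be made concrete with a bubble-sort style procedure. The sketch below is illustrative only; the function names `adjacent_transpositions` and `apply_swaps` are hypothetical, and the sequences stand for lists of differentiation directions \((d_1,\ldots,d_\ell)\).

```python
def adjacent_transpositions(source, target):
    """Return positions p such that swapping entries p and p+1, in order,
    turns `source` into `target`.

    Assumes `target` is a rearrangement of `source`; repeated entries
    (differentiating twice in the same direction) are allowed.
    """
    current = list(source)
    swaps = []
    for pos in range(len(target)):
        # Locate the entry that belongs at `pos`, then bubble it leftward.
        idx = current.index(target[pos], pos)
        while idx > pos:
            current[idx - 1], current[idx] = current[idx], current[idx - 1]
            swaps.append(idx - 1)
            idx -= 1
    return swaps


def apply_swaps(sequence, swaps):
    """Apply the adjacent swaps to a copy of `sequence` and return it."""
    result = list(sequence)
    for p in swaps:
        result[p], result[p + 1] = result[p + 1], result[p]
    return result


source = (1, 2, 1, 3)   # one order of differentiation directions
target = (3, 1, 1, 2)   # another order of the same directions
swaps = adjacent_transpositions(source, target)
print(swaps, apply_swaps(source, swaps))
```

Each recorded swap corresponds to one application of Clairaut's Theorem to a pair of neighbouring derivatives in the proof.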
The Class \(C^k\)
We define the class \(C^{k}(U)\) to consist of all those functions \(f\) on \(U\) which are continuous and have continuous mixed partial derivatives of all orders less than or equal to \(k\). By the corollary, one need only check a single ordering for each possible mixed derivative to verify membership in the class \(C^k(U)\).

3. Notation: Multiindices

If \(\alpha := (\alpha_1,\ldots,\alpha_n)\) is an \(n\)-tuple of nonnegative integers, we call it a multiindex of dimension \(n\). We will use the notation \(\partial^{\alpha}\) to refer to the partial derivative
\[ \left( \frac{\partial}{\partial x_1} \right)^{\alpha_1} \cdots \left( \frac{\partial}{\partial x_n} \right)^{\alpha_n}\]
So for example, in \({\mathbb R}^3\), if the coordinate directions are named \(x\), \(y\), and \(z\), respectively, then
\[ \partial^{(2,3,1)} := \left( \frac{\partial}{\partial x} \right)^2 \left( \frac{\partial}{\partial y} \right)^3 \frac{\partial}{\partial z} = \frac{\partial^6}{\partial x^2 \partial y^3 \partial z}\]
(where this latter notation makes sense only because, for sufficiently smooth \(f\), e.g. \(f \in C^6\), the ordering of the partial derivatives does not matter).

We define the magnitude, length, or order of \(\alpha\) to be \(|\alpha| := \alpha_1 + \cdots + \alpha_n\). If \(\alpha\) has dimension \(n\) and \(x \in {\mathbb R}^n\), we define \(x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}\). We also define the factorial of \(\alpha\) to be \(\alpha! := \alpha_1 ! \cdots \alpha_n !\). This gives us a convenient notation to write many expressions involving several variables. For example:
Theorem (Multinomial Theorem)
For any positive integer \(k\),
\[ (x_1 + \cdots + x_n)^k = \sum_{|\alpha| = k} \frac{k!}{\alpha!} x^\alpha\]
where \(\alpha\) ranges over all \(n\)-dimensional multiindices of order \(k\).
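The multiindex notation translates directly into code, which gives a quick sanity check of the Multinomial Theorem on a concrete example. This is a sketch with illustrative names (`multiindices`, `multi_factorial`, `monomial`, `multinomial_sum`) and an arbitrarily chosen integer point, so that the comparison is exact.

```python
from math import factorial, prod


def multiindices(n, k):
    """Yield every n-dimensional multiindex alpha with |alpha| = k."""
    if n == 1:
        yield (k,)
        return
    for first in range(k + 1):
        for rest in multiindices(n - 1, k - first):
            yield (first,) + rest


def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!"""
    return prod(factorial(a) for a in alpha)


def monomial(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n"""
    return prod(xi ** ai for xi, ai in zip(x, alpha))


def multinomial_sum(x, k):
    """Right-hand side of the Multinomial Theorem."""
    return sum(
        factorial(k) // multi_factorial(alpha) * monomial(x, alpha)
        for alpha in multiindices(len(x), k)
    )


x = (2, -1, 3)
k = 4
lhs = sum(x) ** k            # (2 - 1 + 3)^4
rhs = multinomial_sum(x, k)
print(lhs, rhs)
```

Integer inputs keep both sides exact, so the two printed values can be compared with `==` rather than a tolerance.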