Section 6.1 Definition of the derivative
As I mentioned in the introduction of Chapter 3, the goal here will be to pretend a given function is linear.
Let's step a little into what linearity ought to mean. We'll deal with a normed space \(W\text{,}\) a normed space \(V\) and a function \(f:W\to V\text{.}\)
We're used to the equation of a line:
\begin{equation*}
y=mx+b\text{;}
\end{equation*}
notice that a line has two parameters, its slope \(m\) and its intercept \(b\text{.}\) We want to model \(f\) by a function that looks like this, say
\begin{equation*}
f(x)\approx Ax+B
\end{equation*}
for some \(A\) and \(B\text{.}\) But what sort of thing should \(A\) and \(B\) be in order for this to make sense? \(B\) is the same sort of thing as the outputs of \(f\text{;}\) namely, \(B\in V\text{.}\) Similarly, we need \(Ax\in V\text{.}\) Since \(x\in W\text{,}\) this means \(A\) must be the sort of thing that takes elements of \(W\) and yields elements of \(V\) by multiplication. In other words, \(A\) should be a linear transformation \(W\to V\text{.}\)
Definition 6.1.1.
Given vector spaces \(V_1, V_2\text{,}\) we write \(L(V_1,V_2)\) for the set of all linear transformations \(V_1 \to V_2\text{.}\)
Proposition 6.1.2.
We can equip \(L(V_1,V_2)\) with pointwise addition and scaling coming from \(V_2\text{;}\) this gives a vector space structure on \(L(V_1,V_2)\text{.}\)
Definition 6.1.3.
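Let \(V,W\) be normed spaces. The operator norm of a linear transformation \(L\in L(W,V)\) is given by
\begin{equation*}
\lVert L\rVert_{op}=\sup\left\{\frac{\lVert Lw\rVert_V}{\lVert w\rVert_W}\middle\vert w\in W,\ w\neq 0\right\}\text{.}
\end{equation*}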
Proposition 6.1.4.
The set \(B(W,V)=\left\{L\in L(W,V)\middle\vert \lVert L\rVert_{op}\lt \infty\right\}\) of bounded linear operators is a normed space.
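To get a feel for the operator norm, here is a minimal Python sketch. It assumes the usual definition of \(\lVert L\rVert_{op}\) as the largest stretch factor \(\lVert Lw\rVert/\lVert w\rVert\) over nonzero \(w\text{,}\) and it works with a hypothetical example map \(\mathbb{R}^2\to\mathbb{R}^2\) (the matrix `stretch`, which is not from the text) under the Euclidean norm. The names `apply`, `euclidean_norm`, and `estimate_operator_norm` are made up for illustration, and sampling only approximates the supremum from below.

```python
import math

def apply(matrix, vector):
    """Apply a matrix (given as a list of rows) to a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def euclidean_norm(vector):
    return math.sqrt(sum(x * x for x in vector))

def estimate_operator_norm(matrix, samples=3600):
    """Approximate sup ||Lw|| / ||w|| by sampling unit vectors w in R^2."""
    best = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        w = [math.cos(angle), math.sin(angle)]  # a unit vector, so ||w|| = 1
        best = max(best, euclidean_norm(apply(matrix, w)))
    return best

# Hypothetical example: stretch by 3 along the first axis, by 1 along the second.
stretch = [[3.0, 0.0], [0.0, 1.0]]
estimate = estimate_operator_norm(stretch)
print(estimate)
```

For this particular map the largest stretch happens along the first axis, so the estimate should sit at (or extremely close to) \(3\text{.}\)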
Checkpoint 6.1.5.
Definition 6.1.1, Proposition 6.1.2, Definition 6.1.3, and Proposition 6.1.4 might seem a little daunting. But if we take \(W=V=\mathbb{R}\text{,}\) then we know that a linear map is just multiplication by a number \(m\in \mathbb{R}\text{.}\) Work out what Definition 6.1.1, Proposition 6.1.2, Definition 6.1.3, and Proposition 6.1.4 say in this case.
Checkpoint 6.1.6.
Similarly to Checkpoint 6.1.5, if we take \(W=\mathbb{R}^2\) and \(V=\mathbb{R}\text{,}\) then we're looking at linear transformations \(\mathbb{R}^2\to\mathbb{R}\text{,}\) which are given by sending \((x_1,x_2)\mapsto (a_1,a_2)\cdot (x_1,x_2)\text{.}\) Work out what Definition 6.1.1, Proposition 6.1.2, Definition 6.1.3, and Proposition 6.1.4 say in this case.
Whew! That's a doozy. Now let's define the derivative. Remember, we're looking to make our function \(f\) look as linear as possible.
Definition 6.1.7. Carathéodory Differentiability.
If \(A\subseteq W\) is a subset of a normed space, \(f:A\to V\) is a function from \(A\) to a normed space, and \(c\in A\) is a limit point for \(A\text{,}\) we say \(f\) is differentiable at \(c\) according to Carathéodory if there is a function \(\Phi:A\to B(W,V)\) which is continuous at \(c\) and
\begin{equation*}
f(x)=f(c)+\Phi(x)(x-c)\quad\text{for all } x\in A\text{.}
\end{equation*}
We call \(\Phi(c)\in B(W,V)\) the derivative of \(f\) at \(c\) and write \(Df(c)=\Phi(c)\text{.}\)
Let's come down out of the clouds for a moment.
Checkpoint 6.1.8.
Let \(A=W=V=\mathbb{R}\text{,}\) \(c=3\text{,}\) and consider the squaring function \(f:x\mapsto x^2\text{.}\) Here \(f(c)=9\text{,}\) so our goal is to write
\begin{equation*}
x^2=9+\Phi(x)(x-3)
\end{equation*}
for some continuous function \(\Phi\text{.}\) What's the \(\Phi\) which makes this work?
What's \(Df(3)=\Phi(3)\text{?}\)
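To see the Carathéodory condition \(f(x)=f(c)+\Phi(x)(x-c)\) in action without giving away the checkpoint, here is a small Python sketch for a different, hypothetical example: \(f(x)=x^3\) at \(c=2\text{.}\) The candidate `phi` comes from the factorization \(x^3-c^3=(x^2+cx+c^2)(x-c)\text{;}\) the sample points are arbitrary, and a numerical check like this illustrates the identity rather than proving it.

```python
def f(x):
    return x ** 3

c = 2.0

def phi(x):
    # Candidate slope function, from x^3 - c^3 = (x^2 + c*x + c^2) * (x - c).
    return x * x + c * x + c * c

# Check f(x) = f(c) + phi(x) * (x - c) at a few arbitrary sample points.
for x in [0.0, 1.5, 2.0, 2.5, 10.0]:
    assert abs(f(x) - (f(c) + phi(x) * (x - c))) < 1e-9

# phi is a polynomial, hence continuous at c, so phi(c) plays the role of Df(c).
derivative_at_c = phi(c)
print(derivative_at_c)  # 12.0
```

The value \(12\) matches the familiar \(3c^2\) at \(c=2\text{.}\)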
Definition 6.1.9. Fréchet Differentiability.
If \(A\subseteq W\) is a subset of a normed space, \(f:A\to V\) is a function from \(A\) to a normed space, and \(c\in A\) is a limit point for \(A\text{,}\) we say \(f\) is differentiable at \(c\) according to Fréchet if there is \(D\in B(W,V)\) so that
\begin{equation*}
\lim_{x\to c}\frac{\lVert f(x)-f(c)-D(x-c)\rVert_V}{\lVert x-c\rVert_W}=0\text{.}
\end{equation*}
In this case, we call \(D\) the derivative of \(f\) at \(c\) and write \(Df(c)=D\text{.}\)
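Here is a rough numerical illustration of the Fréchet condition in a case where \(W\neq V\text{.}\) It assumes the condition reads "the error \(\lVert f(x)-f(c)-D(x-c)\rVert\) shrinks faster than \(\lVert x-c\rVert\)", and it uses a hypothetical example that is not from the text: \(f(x,y)=xy\) on \(\mathbb{R}^2\) at \(c=(1,2)\text{,}\) with candidate derivative \(D(h_1,h_2)=2h_1+h_2\text{.}\) The helper name `frechet_ratio` is invented for this sketch.

```python
import math

def f(point):
    x, y = point
    return x * y

c = (1.0, 2.0)

def D(h):
    # Candidate derivative at c: the linear map h -> 2*h1 + 1*h2.
    h1, h2 = h
    return c[1] * h1 + c[0] * h2

def frechet_ratio(h):
    """|f(c + h) - f(c) - D(h)| / ||h|| for a nonzero displacement h."""
    shifted = (c[0] + h[0], c[1] + h[1])
    remainder = f(shifted) - f(c) - D(h)
    return abs(remainder) / math.hypot(h[0], h[1])

# Shrink the displacement h = (t, -t) toward zero and watch the ratio.
ratios = [frechet_ratio((10.0 ** -k, -(10.0 ** -k))) for k in range(1, 7)]
print(ratios)
```

In this example the remainder is exactly \(h_1h_2\text{,}\) so the ratio is at most \(\lVert h\rVert/2\) and the printed values should shrink roughly tenfold at each step. Checking one direction of approach is only a sanity check; the definition requires the limit over all ways of approaching \(c\text{.}\)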
When we're working on \(W=V=\mathbb{R}\) (the case we'll mainly be concerned with), where division makes sense, the two definitions work out to:
Definition 6.1.10. Carathéodory and Fréchet Differentiability in \(\mathbb{R}\).
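Carathéodory's sense of differentiability requires a \(\Phi:\mathbb{R}\to\mathbb{R}\text{,}\) continuous at \(c\text{,}\) so that
\begin{equation*}
f(x)=f(c)+\Phi(x)(x-c)\text{;}
\end{equation*}
then \(Df(c)=\Phi(c)\text{.}\)
Fréchet's sense of differentiability requires that
\begin{equation*}
\lim_{x\to c}\frac{f(x)-f(c)}{x-c}
\end{equation*}
exists; then \(Df(c)\) is the value of this limit.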
We usually write \(f'(c)\) in place of \(Df(c)\) in the case of functions \(\mathbb{R}\to\mathbb{R}\text{.}\)
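The limit version on \(\mathbb{R}\) is easy to experiment with. The following Python sketch computes difference quotients \(\frac{f(c+h)-f(c)}{h}\) for shrinking \(h\text{,}\) again using the hypothetical example \(f(x)=x^3\) at \(c=2\) (not the function from the checkpoints). Floating-point round-off limits how small \(h\) can usefully get, so this suggests the value of the limit rather than establishing it.

```python
def f(x):
    return x ** 3

c = 2.0

def difference_quotient(h):
    """(f(c + h) - f(c)) / h for a nonzero step h."""
    return (f(c + h) - f(c)) / h

# Steps h = 0.1, 0.01, ..., 0.000001.
quotients = [difference_quotient(10.0 ** -k) for k in range(1, 7)]
print(quotients)
```

Algebraically the quotient here equals \(12+6h+h^2\text{,}\) so the printed values should approach \(12\) as \(h\) shrinks.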
Checkpoint 6.1.11.
For \(A=V=W=\mathbb{R}\text{,}\) \(c=3\text{,}\) \(f:x\mapsto x^2\text{,}\) use the Fréchet version to compute \(f'(3)\text{.}\)
Theorem 6.1.12. Carathéodory's Criterion.
A function is differentiable-according-to-Fréchet if and only if it is differentiable-according-to-Carathéodory.
Theorem 6.1.12 is useful mainly because it gives us two ways to deal with derivatives: either by taking a limit (Fréchet) or by rearranging \(f\) (Carathéodory).
Theorem 6.1.13.
If \(f\) is differentiable at \(c\text{,}\) then \(f\) is continuous at \(c\text{.}\)
Proof.
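We know that \(f(x)=f(c)+\Phi(x)(x-c)\text{,}\) where \(\Phi\) is continuous at \(c\text{.}\) Since multiplication and addition are continuous, \(f\) is built from pieces that are continuous at \(c\text{;}\) hence \(f\) is continuous at \(c\text{.}\) Explicitly, \(\lVert f(x)-f(c)\rVert\leq\lVert\Phi(x)\rVert_{op}\lVert x-c\rVert\text{,}\) and the right-hand side tends to \(0\) as \(x\to c\) because continuity of \(\Phi\) at \(c\) keeps \(\lVert\Phi(x)\rVert_{op}\) bounded near \(c\text{.}\)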