Definition of the derivative

Section 6.1 Definition of the derivative

As I mentioned in the introduction of Chapter 3, the goal here will be to pretend a given function is linear.

Let’s step a little into what linearity ought to mean. We’ll deal with a normed space \(W\text{,}\) a normed space \(V\) and a function \(f:W\to V\text{.}\)

We’re used to the equation of a line:

\begin{equation*} L(x)=y=mx+b\ \ ; \end{equation*}

notice that a line has two parameters, its slope \(m\) and its intercept \(b\text{.}\) We want to model \(f\) by a function that looks like this, say

\begin{equation*} L(x)=Ax+B \end{equation*}

for some \(A\) and \(B\text{.}\) But what sort of thing should \(A\) and \(B\) be in order for this to make sense? \(B\) is the same sort of thing as the outputs of \(f\text{;}\) namely, \(B\in V\text{.}\) Similarly, we need \(Ax\in V\text{.}\) Since \(x\in W\text{,}\) this means \(A\) must be the sort of thing that takes elements of \(W\) and yields elements of \(V\) by multiplication. In other words, \(A\) should be a linear transformation \(W\to V\text{.}\)

Definition 6.1.1.

Given vector spaces \(V_1, V_2\text{,}\) we write \(L(V_1,V_2)\) for the set of all linear transformations \(V_1 \to V_2\text{.}\)

Proposition 6.1.2.

We can equip \(L(V_1,V_2)\) with pointwise addition and scaling coming from \(V_2\text{;}\) this gives a vector space structure on \(L(V_1,V_2)\text{.}\)

Definition 6.1.3.

Let \(V,W\) be normed spaces. The operator norm of a linear transformation \(L\in L(W,V)\) is given by

\begin{equation*} \lVert L\rVert_{op}=\sup\left\{\lVert Lx\rVert \middle\vert \lVert x\rVert=1\right\}\text{.} \end{equation*}
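To make the supremum concrete, here is a minimal numerical sketch, assuming \(W=V=\mathbb{R}^2\) with the Euclidean norm and a linear map given by a \(2\times 2\) matrix. The names `apply`, `estimate_operator_norm`, and the example matrix `stretch` are hypothetical, chosen only for illustration. Sampling unit vectors only ever gives a lower bound for the supremum, so treat the result as an estimate.

```python
import math

def apply(matrix, x):
    """Multiply a 2x2 matrix (given as a list of rows) by a vector in R^2."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

def euclidean_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))

def estimate_operator_norm(matrix, samples=3600):
    """Largest ||Lx|| found over sampled unit vectors x = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        x = [math.cos(t), math.sin(t)]  # a unit vector, so ||x|| = 1
        best = max(best, euclidean_norm(apply(matrix, x)))
    return best

# Hypothetical example: stretch by 3 along the first axis, fix the second.
stretch = [[3.0, 0.0], [0.0, 1.0]]
norm_estimate = estimate_operator_norm(stretch)
print(norm_estimate)
```

For this particular matrix the largest stretch happens along the first axis, so the estimate should come out very close to 3.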

Proposition 6.1.4.

The set \(B(W,V)=\left\{L\in L(W,V)\middle\vert \lVert L\rVert_{op}\lt \infty\right\}\) of bounded linear operators is a normed space.

Checkpoint 6.1.5.

Definition 6.1.1, Proposition 6.1.2, Definition 6.1.3, and Proposition 6.1.4 might seem a little daunting. But if we take \(W=V=\mathbb{R}\text{,}\) then we know that a linear map is just multiplication by a number \(m\in \mathbb{R}\text{.}\) Work out what Definition 6.1.1, Proposition 6.1.2, Definition 6.1.3, and Proposition 6.1.4 say in this case.

Checkpoint 6.1.6.

Similarly to Checkpoint 6.1.5, if we take \(W=\mathbb{R}^2\) and \(V=\mathbb{R}\text{,}\) then we’re looking at linear transformations \(\mathbb{R}^2\to\mathbb{R}\text{,}\) which are given by sending \((x_1,x_2)\mapsto (a_1,a_2)\cdot (x_1,x_2)\text{.}\) Work out what Definition 6.1.1, Proposition 6.1.2, Definition 6.1.3, and Proposition 6.1.4 say in this case.

Whew! That’s a doozy. Now let’s define the derivative. Remember, we’re looking to make our function \(f\) look as linear as possible.

Definition 6.1.7. Carathéodory Differentiability.

If \(A\subseteq W\) is a subset of a normed space, \(f:A\to V\) is a function from \(A\) to a normed space, and \(c\in A\) is a limit point of \(A\text{,}\) we say \(f\) is differentiable at \(c\) according to Carathéodory if there is a function \(\Phi:A\to B(W,V)\) which is continuous at \(c\) and satisfies, for every \(x\in A\text{,}\)

\begin{equation*} f(x)=f(c)+\Phi(x)(x-c)\ \ . \end{equation*}

We call \(\Phi(c)\in B(W,V)\) the derivative of \(f\) at \(c\) and write \(Df(c)=\Phi(c)\text{.}\)
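As a quick sanity check of what the definition asks for, here is a small sketch, assuming \(A=W=V=\mathbb{R}\) so that an element of \(B(\mathbb{R},\mathbb{R})\) is just multiplication by a number. The function \(x\mapsto x^3\text{,}\) the point \(c=2\text{,}\) and the candidate `phi` (read off from the factorization \(x^3-8=(x-2)(x^2+2x+4)\)) are illustrative choices, not taken from the text.

```python
def f(x):
    """Illustrative function: cubing."""
    return x ** 3

c = 2.0

def phi(x):
    """Candidate Caratheodory function, from x^3 - 8 = (x - 2)(x^2 + 2x + 4)."""
    return x * x + 2.0 * x + 4.0

# Check f(x) = f(c) + phi(x) * (x - c) at a handful of sample points.
sample_points = [-1.0, 0.0, 1.5, 2.0, 2.5, 10.0]
residuals = [f(x) - (f(c) + phi(x) * (x - c)) for x in sample_points]

# phi is a polynomial, hence continuous at c, so phi(c) plays the role of Df(c).
derivative_at_c = phi(c)

print(residuals)
print(derivative_at_c)
```

The residuals should all vanish up to rounding, and `phi(c)` evaluates to \(4+4+4=12\text{.}\)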

Let’s come down out of the clouds for a moment.

Checkpoint 6.1.8.

Let \(A=W=V=\mathbb{R}\text{,}\) \(c=3\text{,}\) and consider the squaring function \(f:x\mapsto x^2\text{.}\) Here \(f(c)=9\text{,}\) so our goal is to write

\begin{equation*} x^2=9+\Phi(x)(x-3) \end{equation*}

for some continuous function \(\Phi\text{.}\) What’s the \(\Phi\) which makes this work?

What’s \(Df(3)=\Phi(3)\text{?}\)

Definition 6.1.9. Fréchet Differentiability.

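With \(A\subseteq W\text{,}\) \(f:A\to V\text{,}\) and \(c\in A\) a limit point of \(A\) as in Definition 6.1.7, we say \(f\) is differentiable at \(c\) according to Fréchet if there is a bounded linear operator \(D\in B(W,V)\) so that

\begin{equation*} \displaystyle\lim_{x\to c}\frac{\lVert f(x)-f(c)-D(x-c)\rVert}{\lVert x-c\rVert}=0\text{.} \end{equation*}

In this case, we call \(D\) the derivative of \(f\) at \(c\) and write \(Df(c)=D\text{.}\)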
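The quotient in this definition can be watched numerically. The sketch below assumes \(W=\mathbb{R}^2\) and \(V=\mathbb{R}\) with the Euclidean norm; the function \(f(x,y)=xy\text{,}\) the point \(c=(1,2)\text{,}\) the candidate linear map `D`, and the approach direction are all hypothetical choices made for illustration. It only samples one direction, so it illustrates the limit rather than proving it.

```python
import math

def f(p):
    """Illustrative function R^2 -> R: the product of the coordinates."""
    x, y = p
    return x * y

c = (1.0, 2.0)

def D(h):
    """Candidate derivative at c = (1, 2): the linear map h -> 2*h1 + 1*h2."""
    return 2.0 * h[0] + 1.0 * h[1]

def frechet_quotient(h):
    """|f(c + h) - f(c) - D(h)| / ||h|| for a nonzero displacement h."""
    x = (c[0] + h[0], c[1] + h[1])
    numerator = abs(f(x) - f(c) - D(h))
    return numerator / math.hypot(h[0], h[1])

# Approach c along one fixed unit direction with shrinking step sizes.
direction = (0.6, 0.8)
quotients = []
for k in range(1, 7):
    t = 10.0 ** (-k)
    h = (t * direction[0], t * direction[1])
    quotients.append(frechet_quotient(h))

print(quotients)
```

Here \(f(c+h)-f(c)-D(h)=h_1h_2\text{,}\) so the quotient is at most \(\lVert h\rVert/2\) and the printed values should shrink roughly in proportion to the step size.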

When we’re working on \(W=V=\mathbb{R}\) (the case we’ll mainly be concerned with), where division makes sense, the two definitions work out to:

Definition 6.1.10. Carathéodory and Fréchet Differentiability in \(\mathbb{R}\).

Carathéodory’s sense of differentiability requires a function \(\Phi:\mathbb{R}\to\mathbb{R}\text{,}\) continuous at \(c\text{,}\) so that

\begin{equation*} f(x)=f(c)+\Phi(x)(x-c)\ \ ; \end{equation*}

then \(Df(c)=\Phi(c)\text{.}\)

Fréchet’s sense of differentiability requires that

\begin{equation*} \displaystyle\lim_{x\to c}\frac{f(x)-f(c)}{x-c} \end{equation*}

exists; then \(Df(c)\) is the value of this limit.

We usually write \(f'(c)\) in place of \(Df(c)\) in the case of functions \(\mathbb{R}\to\mathbb{R}\text{.}\)
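The limit in the Fréchet version is easy to probe numerically. The following sketch uses an illustrative function and point not taken from the text (\(x\mapsto x^3\) at \(c=2\)) and evaluates the difference quotient at points approaching \(c\) from both sides. A table of values like this suggests the limit but does not establish it.

```python
def f(x):
    """Illustrative function: cubing."""
    return x ** 3

c = 2.0

def difference_quotient(x):
    """(f(x) - f(c)) / (x - c) for x different from c."""
    return (f(x) - f(c)) / (x - c)

# Step sizes 0.1, 0.01, ..., 0.000001 on either side of c.
steps = [10.0 ** (-k) for k in range(1, 7)]
from_right = [difference_quotient(c + h) for h in steps]
from_left = [difference_quotient(c - h) for h in steps]

for h, right, left in zip(steps, from_right, from_left):
    print(h, right, left)
```

Algebraically the quotient at \(x=c+h\) is \(12+6h+h^2\text{,}\) so both columns should settle toward 12.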

Checkpoint 6.1.11.

For \(A=V=W=\mathbb{R}\text{,}\) \(c=3\text{,}\) \(f:x\mapsto x^2\text{,}\) use the Fréchet version to compute \(f'(3)\text{.}\)

Theorem 6.1.12. Carathéodory’s Criterion.

A function is differentiable-according-to-Fréchet if and only if it is differentiable-according-to-Carathéodory.

Theorem 6.1.12 is useful mainly because it gives us two ways to deal with derivatives: either by taking a limit (Fréchet) or by rearranging \(f\) (Carathéodory).
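To see the two routes side by side, here is a worked example; the function \(f(x)=1/x\) on \(A=\mathbb{R}\setminus\{0\}\) and the point \(c\neq 0\) are chosen only for illustration. Rearranging in the style of Carathéodory,

\begin{equation*} \frac{1}{x}=\frac{1}{c}+\left(-\frac{1}{cx}\right)(x-c)\text{,} \end{equation*}

so \(\Phi(x)=-\frac{1}{cx}\text{,}\) which is continuous at \(c\text{,}\) and \(f'(c)=\Phi(c)=-\frac{1}{c^2}\text{.}\) Taking the limit in the style of Fréchet,

\begin{equation*} \lim_{x\to c}\frac{\frac{1}{x}-\frac{1}{c}}{x-c}=\lim_{x\to c}\frac{c-x}{cx(x-c)}=\lim_{x\to c}\left(-\frac{1}{cx}\right)=-\frac{1}{c^2}\text{,} \end{equation*}

which is the same answer.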

Theorem 6.1.13.

If \(f\) is differentiable at \(c\text{,}\) then \(f\) is continuous at \(c\text{.}\)

Proof.

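We know that \(f(x)=f(c)+\Phi(x)(x-c)\text{,}\) where \(\Phi\) is continuous at \(c\text{.}\) Since multiplication and addition are continuous, this shows we built \(f\) from continuous-at-\(c\) pieces; hence \(f\) is continuous at \(c\text{.}\) Concretely, \(\lVert\Phi(x)\rVert_{op}\) stays bounded near \(c\) because \(\Phi\) is continuous there, so \(\lVert f(x)-f(c)\rVert=\lVert\Phi(x)(x-c)\rVert\leq\lVert\Phi(x)\rVert_{op}\lVert x-c\rVert\to 0\) as \(x\to c\text{.}\)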
