The Stone-Weierstraß Theorem

Section 8.2 The Stone-Weierstraß Theorem

The goal of this section is to prove a rather remarkable result: any continuous function on a closed, bounded interval, even one with quite bad behavior otherwise, can be uniformly approximated by a sequence of polynomials.

Theorem 8.2.1. Stone-Weierstraß Theorem.

Let \(f:[a,b]\to \mathbb{R}\) be a continuous function. Then for any \(\epsilon\gt 0\text{,}\) there is a polynomial \(p\) so that for all \(x\in[a,b]\text{,}\) \(\lvert f(x)-p(x)\rvert\lt \epsilon\text{.}\)

Proof.

We'll show the equivalent version: there is a sequence \(p_n\) of polynomials so that \(p_n\rightrightarrows f\) on \([a,b]\text{.}\)

Without loss of generality, take \(a=0,b=1\text{.}\)

Without loss of generality, take \(f(0)=f(1)=0\text{.}\)

We can then extend \(f\) continuously to \(g:\mathbb{R}\to\mathbb{R}\) by

\begin{equation*} g(x)=\begin{cases}f(x)&\text{ if }x\in[0,1]\\0&\text{ otherwise}\end{cases}\ \ \ . \end{equation*}

Observe that \(g\) so defined is continuous (this is where \(f(0)=f(1)=0\) is used) and vanishes outside \([0,1]\text{,}\) so it is uniformly continuous on \(\mathbb{R}\text{.}\)

For each \(n\in \mathbb{N}\text{,}\) set

\begin{equation*} \displaystyle c_n=\left[\int_{[-1,1]}(1-t^2)^n\ dt \right]^{-1} \end{equation*}

and

\begin{equation*} q_n(t)=c_n(1-t^2)^n\ \ \ . \end{equation*}

(We pick the constant \(c_n\) so that \(\displaystyle\int_{[-1,1]}q_n(t)\ dt=1\text{.}\))

Lemma 8.2.2.

\(c_n\leq 2\sqrt{n}\text{.}\)
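As a sanity check on Lemma 8.2.2 (not a proof), here is a minimal Python sketch that computes \(c_n\) exactly for a few values of \(n\) and compares it with \(2\sqrt{n}\text{.}\) The function name `c` and the choice of sample values are assumptions made for illustration; the integral is evaluated by expanding \((1-t^2)^n\) with the binomial theorem and integrating term by term.

```python
from fractions import Fraction
from math import comb


def c(n):
    """Exact c_n = 1 / (integral of (1 - t^2)^n over [-1, 1]).

    Binomial expansion plus term-by-term integration gives
    sum_k C(n, k) * (-1)^k * 2 / (2k + 1) for the integral.
    """
    integral = sum(Fraction(2 * (-1) ** k * comb(n, k), 2 * k + 1)
                   for k in range(n + 1))
    return 1 / integral


for n in (1, 2, 5, 10, 50):
    # c_n > 0, so c_n <= 2*sqrt(n) is equivalent to c_n^2 <= 4n;
    # comparing squares keeps the check in exact rational arithmetic.
    print(n, float(c(n)), c(n) ** 2 <= 4 * n)
```

A finite check like this only supports the lemma for the values tried; the inequality for all \(n\) still needs an argument.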

Now define, for each \(n\in\mathbb{N}\text{,}\)

\begin{equation*} \displaystyle p_n(x)=\int_{[-1,1]}g(x+t)\ q_n(t)\ dt\ \ \ . \end{equation*}

Lemma 8.2.3.

\(p_n\) is a polynomial.

Proof.

The idea is to consider the change of variables \(s=x+t\text{.}\) I'll leave the details to you.
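
One way the details might go (a sketch, not necessarily the intended solution; the coefficient functions \(a_k\) are notation introduced here): for \(x\in[0,1]\text{,}\) the substitution gives

\begin{equation*} p_n(x)=\int_{[x-1,x+1]}g(s)\ q_n(s-x)\ ds=\int_{[0,1]}f(s)\ q_n(s-x)\ ds\ \ \ , \end{equation*}

because \([0,1]\subseteq[x-1,x+1]\) and \(g\) vanishes outside \([0,1]\text{.}\) Expanding \(q_n(s-x)=c_n(1-(s-x)^2)^n\) in powers of \(x\) gives \(\sum_{k=0}^{2n}a_k(s)\,x^k\) with each \(a_k\) a polynomial in \(s\text{,}\) so

\begin{equation*} p_n(x)=\sum_{k=0}^{2n}\left[\int_{[0,1]}f(s)\ a_k(s)\ ds\right]x^k\ \ \ . \end{equation*}

This formula is only claimed for \(x\in[0,1]\text{,}\) which is all the theorem needs.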

Now we'll show that \(p_n\rightrightarrows f\text{.}\)

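Given \(\epsilon\gt 0\text{,}\) use the uniform continuity of \(g\) to choose \(\delta\in(0,1)\) so that for all \(x,y\in\mathbb{R}\text{,}\) \(\lvert x-y\rvert\lt \delta\Rightarrow \lvert g(x)-g(y)\rvert\lt \frac{\epsilon}{2}\text{.}\) Let \(M=\sup\lvert g\rvert=\sup\lvert f\rvert\text{.}\) For \(x\in[0,1]\text{,}\) where \(g(x)=f(x)\text{,}\) compute: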

\begin{align*} \left\lvert p_n(x)-f(x)\right\rvert&=\left\lvert \int_{[-1,1]}g(x+t) q_n(t)\ dt - f(x)\int_{[-1,1]}q_n(t)\ dt\right\rvert\\ &=\left\lvert \int_{[-1,1]}(g(x+t)-f(x))\ q_n(t)\ dt\right\rvert\\ &\leq \int_{[-1,1]}\left\lvert g(x+t)-f(x)\right\rvert \ q_n(t)\ dt\\ &=\int_{[-1,-\delta]}\left\lvert g(x+t)-f(x)\right\rvert\ q_n(t)\ dt + \int_{[-\delta,\delta]}\left\lvert g(x+t)-f(x)\right\rvert\ q_n(t)\ dt+\int_{[\delta,1]}\left\lvert g(x+t)-f(x)\right\rvert\ q_n(t)\ dt\\ &\leq 2Mq_n(-\delta)(1-\delta)+\frac{\epsilon}{2}\int_{[-\delta,\delta]}q_n(t)\ dt+2Mq_n(\delta)(1-\delta)\\ &\leq 4M(1-\delta^2)^n c_n+\frac{\epsilon}{2} \end{align*}

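Since \(c_n\leq 2\sqrt{n}\) by Lemma 8.2.2, it will suffice to show that for any \(\delta\) with \(0\lt\delta\lt 1\) (shrinking \(\delta\) if necessary),

\begin{equation*} (1-\delta^2)^n\sqrt{n}\to 0\quad\text{as }n\to\infty\text{,} \end{equation*}

and this you can do. The first term of the final bound is then less than \(\frac{\epsilon}{2}\) for all sufficiently large \(n\text{,}\) independently of \(x\text{.}\)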
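
To see the convergence numerically, here is a minimal Python sketch under stated assumptions: the sample function `f` (a tent with \(f(0)=f(1)=0\)), the midpoint-rule helper `midpoint`, and the grid sizes are all choices made for illustration, not part of the proof. It uses the substitution \(s=x+t\) from the proof of Lemma 8.2.3 to write \(p_n(x)\) as an integral of \(f(s)\,q_n(s-x)\) over \([0,1]\text{,}\) then estimates \(\sup\lvert p_n-f\rvert\) on a grid.

```python
def midpoint(h, lo, hi, m=2000):
    """Midpoint-rule approximation to the integral of h over [lo, hi]."""
    w = (hi - lo) / m
    return w * sum(h(lo + (i + 0.5) * w) for i in range(m))


def f(x):
    """Sample function (an assumption): a tent with f(0) = f(1) = 0."""
    return 0.5 - abs(x - 0.5)


def sup_error(n, grid=101):
    """Largest |p_n(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    # c_n normalizes (1 - t^2)^n to have integral 1 over [-1, 1].
    c_n = 1.0 / midpoint(lambda t: (1 - t * t) ** n, -1.0, 1.0)
    worst = 0.0
    for i in range(grid):
        x = i / (grid - 1)
        # For x in [0, 1]: p_n(x) = integral over [0, 1] of f(s) q_n(s - x) ds.
        p = c_n * midpoint(lambda s: f(s) * (1 - (s - x) ** 2) ** n, 0.0, 1.0)
        worst = max(worst, abs(p - f(x)))
    return worst


errors = [sup_error(n) for n in (10, 40, 160)]
print(errors)  # the estimated sup-norm errors should shrink as n grows
```

The errors shrink slowly for this `f`, because of its corner at \(x=\frac{1}{2}\text{;}\) the theorem promises uniform convergence but says nothing about the rate.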

Checkpoint 8.2.4.

Rephrase Theorem 8.2.1 as a statement about how the set \(\mathcal{P}\) of polynomials sits inside \(\mathcal{C}^0([a,b])\text{.}\)

Checkpoint 8.2.5.

At the beginning of the proof of Theorem 8.2.1, we made two simplifying assumptions: first, we decided to work on the interval \([0,1]\text{;}\) then, we assumed that \(f(0)=f(1)=0\text{.}\)

Justify the first of these assumptions: given continuous \(f:[a,b]\to\mathbb{R}\text{,}\) define the function \(\tilde{f}:[0,1]\to\mathbb{R}\) by

\begin{equation*} \tilde{f}(x)=f(a+x(b-a))\ \ \ . \end{equation*}

The special case proved applies to find a sequence of polynomials \(\tilde{p}_n\) with \(\tilde{p}_n\rightrightarrows\tilde{f}\text{.}\) Explain how to find a sequence of polynomials approximating \(f\) using the \(\tilde{p}_n\text{.}\)

Checkpoint 8.2.6.

Following up on Checkpoint 8.2.5: justify the other simplification. Here's an approach: suppose we could polynomially approximate \(g(x)=f(x)+P(x)\) for some fixed polynomial \(P(x)\text{.}\) Then we would be able to approximate \(f\) by polynomials, too.

Once you have explained the claim I just made, show how to pick a polynomial \(P(x)\) so that \(g(0)=g(1)=0\text{,}\) so that the proof given above applies to \(g\text{.}\)

Checkpoint 8.2.7.

Use a computer to plot the functions \(q_n\) for several values of \(n\text{.}\) Does this remind you of anything?
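
If no plotting package is at hand, a dependency-free Python sketch along the following lines draws a crude sideways bar chart of \(q_n\text{.}\) The function name `q`, the midpoint-rule normalization, and the sample values of \(n\) are assumptions for illustration; any real plotting library will give a better picture.

```python
def q(n, t, m=4000):
    """q_n(t) = c_n (1 - t^2)^n, with c_n from a midpoint-rule integral."""
    w = 2.0 / m
    integral = w * sum((1 - (-1 + (i + 0.5) * w) ** 2) ** n for i in range(m))
    return (1 - t * t) ** n / integral


ts = [k / 10 for k in range(-10, 11)]  # sample points -1.0, -0.9, ..., 1.0
for n in (1, 5, 25):
    print(f"n = {n}")
    for t in ts:
        # One '#' per 0.1 units of height: each row is the bar for q_n(t).
        print(f"{t:+.1f} " + "#" * round(10 * q(n, t)))
```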

Remark 8.2.8.

The method used in the proof of Theorem 8.2.1 is known as mollification. The idea is to approximate a possibly-not-so-nice function \(f\) by integrating it against a known-to-be-nice function \(q\text{.}\) We call this integrated product

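\begin{equation*} \displaystyle\tilde{f}(x)=\int f(x+t)\ q(t)\ dt \end{equation*}

the convolution of \(f\) and \(q\text{.}\) (The more common convention writes \(f(x-t)\) in place of \(f(x+t)\text{;}\) the two agree whenever \(q\) is even, as each \(q_n\) is.)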

Mollification is used all over functional analysis, as well as in the theory of partial differential equations.

Convolution is also how graphic equalizers work, but that's really a different class altogether.