Weierstrass Approximation Theorem

Theorem
For any continuous, real-valued function \(f\) on the interval \([a,b]\) and any \(\epsilon > 0\), there exists some polynomial \(p\) such that
\[ \sup_{x \in [a,b]} |f(x) - p(x)| < \epsilon. \]
In other words, any continuous function may be approximated uniformly by a polynomial which differs at every point by less than \(\epsilon\).
Meta (Main Idea)
The theorem is most easily established in two steps.
  1. Show that any continuous function can be uniformly approximated by piecewise-linear functions.
  2. Show that piecewise linear functions can be uniformly approximated by polynomials.
    A critical point in the second step is to show that the absolute value function \(|x|\) can itself be uniformly approximated by polynomials on any desired interval.

1. Continuous Functions Uniformly Approximated by Piecewise Linear

Proposition
Suppose \(f\) is a continuous function on \([a,b]\). Then for every \(\epsilon > 0\), there is a piecewise-linear function \(g\) such that
\[ \sup_{x \in [a,b]} |f(x) - p(x)| < \epsilon. \]
Proof
Since \([a,b]\) is compact, \(f\) is uniformly continuous on \([a,b]\). Let \(\delta > 0\) be such that any two points \(x,y \in [a,b]\) with \(|x-y| < \delta\) satisfy \(|f(x) - f(y)| < \epsilon\). Let \(N\geq 1\) satisfy \((b-a)N^{-1} < \delta\) and subdivide \([a,b]\) into \(N\) evenly spaced subintervals \([a,x_2],[x_2,x_3],\ldots,[x_{N-1},b]\). Let \(g\) be the piecewise linear function which agrees with \(f\) at the points \(a,x_2,x_3,\ldots,x_{N-1},b\). By construction, any two adjacent points in this list are distance less than \(\delta\) apart.

Now let \(x\) be any point in \([a,b]\). This point \(x\) must belong to one of the subintervals just constructed (or two if it happens to be an endpoint). If the subinterval in question has endpoints \(a'\) and \(b'\), then since the length of the interval is less than \(\delta\), we have \(|f(x) - f(a')| < \epsilon\) and \(|f(x) - f(b')| < \epsilon\). Because \(g\) is linear on \([a',b']\), there is some \(\theta \in [0,1]\) such that \(g(x) = \theta f(a') + (1-\theta) f(b')\), by which it follows (via the triangle inequality) that
\[{}|f(x) - g(x)|{}\]
\[{}\leq \theta |f(x) - f(a')|{}\]
\[{}+ (1-\theta) |f(x) - f(b')|{}\]
\[{}< \epsilon.{}\]

2. Piecewise Linear Uniformly Approximated by Polynomials

Lemma
There exists a sequence of polynomials \(\{p_n(x)\}_{n=1}^\infty\) which converge uniformly to \(|x|\) on the interval \([-1,1]\). As a consequence, the polynomials \(\{ R p_n(R^{-1}x) \}_{n=1}^\infty\) necessarily converge uniformly to \(|x|\) on \([-R,R]\) for any fixed, positive \(R\).
Proof
Meta (Main Ideas)
Let \(\phi(x) := \frac{x}{2}(3-x^2)\). This function is increasing on \([-1,1]\), maps \([-1,1]\) into itself, and has \(\phi(t) = t\) exactly when \(t = 0, -1, 1\) (increasing because \(\phi'(x) = \frac{3}{2} - \frac{3}{2} x^2\); maps \([-1,1]\) to itself because it's monotone and fixes \(-1,0,1\)). Let \(q_1(x) := x\) and \(q_n(x) := \phi(q_{n-1}(x))\) for \(n > 1\). The plot below shows \(q_1,\ldots,q_5\).
Figure. Plot of \(q_1,\ldots,q_5\)

Each \(q_n\) is necessarily polynomial, increasing on \([-1,1]\), and takes values between \(-1\) and \(1\) on \([-1,1]\). Because \(\phi(x) \geq x\) for \(x \in [0,1]\), \(\{q_n(x)\}_{n=1}^\infty\) is pointwise increasing for each fixed positive \(x \in [0,1]\) and pointwise decreasing for each fixed negative \(x \in [-1,0]\). By symmetry,
\[ | \operatorname{sgn}(x) - q_n(x)| \leq | \operatorname{sgn}(x_0) - q_n(x_0)| \]
for all \(|x_0| \leq |x| \leq 1\). Thus the \(q_n\) converge uniformly to the signum function (\(+1\) on the positive reals and \(-1\) on the negative reals) on the set \(\epsilon \leq |x| \leq 1\) as \(n \rightarrow \infty\) because (for example) for each fixed \(x > 0\) \(\{q_n(x)\}\) is a bounded increasing sequence and thus must have a limit which equals a fixed point of \(\phi\), namely, \(1\).

Now let \(p_n(x) := x q_n(x)\). A plot of \(p_1,\ldots,p_5\) is shown below along with a plot of \(|x|\) itself.
Figure. Plot of \(p_1,\ldots,p_5\)

We can show that \(p_n(x)\) converges uniformly to \(|x|\) on \([-1,1]\) by fixing \(\epsilon\) and breaking the interval \([-1,1]\) into the small piece \([-\epsilon/4,\epsilon/4]\) (where \(p_n(x) - |x|\) is smaller than \(\epsilon/2\) because each of \(p_n(x)\) and \(|x|\) is less than or equal to \(\epsilon/4\) at every point for every value of \(n\)) and the complement, on which \(|p_n(x) - |x|| = |x| |q_n(x) - \operatorname{sgn}(x)| < \epsilon/2\) uniformly on \(\epsilon/4 \leq |x| \leq 1\) for all sufficiently large \(n\).
Claim
Every piecewise-linear function \(f\) on an interval \([a,b]\) is either linear or can be expressed as a sum
\[ f(x) = cx + d + \sum_{i = 2}^{n-1} c_i |x-x_i| \]
for some points \(x_2,\ldots,x_{n-1} \in (a,b)\) and constants \(c_i\), \(c\), and \(d\).
Proof
If \(f\) is a piecewise-linear function on some interval \([a,b]\), then by definition there are points Given points \(a = x_1 < \cdots < x_n = b \in {\mathbb R}\) and values \(y_1,\ldots,y_n \in {\mathbb R}\) such that \(f\) is a linear function on \([x_i,x_{i+1}]\) for each \(i=1,\ldots,n-1\) such that \(f(x_i) = y_i\) and \(f(x_{i+1}) = y_{i+1}\). A simple exercise gives that
\[ f(x) = \frac{y_{i+1}-y_i}{x_{i+1}-x_i} (x - x_i) + y_i \]
for all \(x \in [x_i,x_{i+1}]\). For simplicity, let \(m_i\) denote the slope of this line. Now consider the function
\[ g(x) := \sum_{j=2}^{n-1} \frac{m_{j+1}-m_{j}}{2} |x - x_j|. \]
On any interval \([x_{i-1},x_{i+1}]\) for \(i=2,\ldots,n-1\), each term in the sum defining \(g(x)\) is linear with the exception of the term \(j=i\). This particular term is piecewise-linear there with slope
\[ - \frac{m_{i+1} - m_{i}}{2}\]
to the left of \(x_i\) and slope
\[ \frac{m_{i+1} - m_{i}}{2}\]
to the right. Thus if \({\tilde m}_i\) denotes the slope of \(g(x)\) on \([x_{i-1},x_i]\) (where \(g\) itself must also be linear), it follows that
\[ {\tilde m}_{i+1} - {\tilde m}_i = m_{i+1}-m_i \]
for each \(i=2,\ldots,n-1\). As a consequence, the difference \(f(x) - g(x)\) is a continuous function on \([a,b]\) and the difference of slopes on \([x_{i},x_{i+1}]\) and \([x_{i-1},x_i]\) is zero. In other words, \(f(x) - g(x)\) has the same slope on all segments and therefore must be linear.
Corollary
Every piecewise-linear function \(f\) on \([a,b]\) can be uniformly approximated by polynomials.
Proof
Without loss of generality, \(f\) is not linear (since otherwise it can be approximated by itself with no error). Suppose \(f\) is expressed as above. Let \(n-2\) denote the total number of terms appearing in the sum of shifted absolute value terms and let \(c_{max} = \max\{c_2,\ldots,c_{n-1}\}\). Fix any \(\epsilon > 0\), let \(R := 2\max\{|a|,|b|\}\), and let \(p\) be
a polynomial such that \(|p(x) - |x|| < \epsilon/(nc_{max}+1)\). On \([a,b]\), \(|x - x_i| \leq |x| + |x_i| \leq R\), therefore
\[ |p(x-x_i) - |x-x_i|| < \frac{\epsilon}{nc_{max}+1}\]
for each \(i=2,\ldots,n-1\) in the sum. Thus
\[ \left| \sum_{i=2}^{n-1} c_i p(x-x_i) - \sum_{i=2}^{n-1} c_i |x-x_i| \right| < \epsilon. \]
Therefore the polynomial
\[ cx + d + \sum_{i=2}^{n-1} c_i p(x - x_i) \]
is uniformly close to \(f\) on \([a,b]\) with difference everywhere less than \(\epsilon\).

3. Approximating Continuous Functions by Polynomials

To finish the full Weierstrass Approximation Theorem, we use a simple transitivity argument: let \(f\) be continuous on \([a,b]\); there exists a piecewise-linear function \(g\) on \([a,b]\) such that
\[ \sup_{x \in [a,b]} |f(x) - g(x)| < \frac{\epsilon}{2}. \]
Now let \(h\) be a polynomial such that
\[ \sup_{x \in [a,b]} |g(x) - h(x)| < \frac{\epsilon}{2}. \]
For any \(x \in [a,b]\),
\[{}|f(x) - h(x)|{}\]
\[{}\leq |f(x) - g(x)| + |g(x) - h(x)|{}\]
\[{}\leq \sup_{x \in [a,b]} |f(x) - g(x)|{}\]
\[{}+ \sup_{x \in [a,b]} |g(x) - h(x)|{}\]
\[{}< \epsilon.{}\]