Riemann versus Darboux Integration

Technically the formulation of integration we have used (in terms of upper and lower sums) is known as the Darboux integral. It's more likely that in your first exposure to integration you were told about Riemann's formulation:
Definition (Riemann Integral)
A function \(f\) on an interval \([a,b]\) is said to have Riemann integral \(L\) when for all \(\epsilon > 0\), there is a positive real number \(\delta\) such that for every partition \(\mathcal P\) of \([a,b]\) with norm \(||\mathcal{P}||\) less than \(\delta\) (meaning that all intervals in \(\mathcal P\) have length less than \(\delta\)), it must be the case that
\[ \left| \left( \sum_{I \in \mathcal P} |I| f(x_I) \right) - L \right| < \epsilon, \]
where for each \(I \in \mathcal P\), \(x_I\) is any point in \(I\).

Functions \(f\) as just described will also be called integrable in the Riemann sense.
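As a rough numerical illustration of the definition (a sketch only, not part of the formal development), the following Python snippet evaluates Riemann sums with arbitrarily chosen points \(x_I\). The helper names `riemann_sum`, `uniform_partition`, and `random_tags`, and the choice of integrand \(f(x) = x^2\) on \([0,1]\), are assumptions made purely for this example.

```python
import random

def riemann_sum(f, points, tags):
    """Riemann sum for the partition with sorted endpoints `points`.

    `tags` holds one point x_I per interval I; it is assumed (not checked)
    that each tag lies in its interval.
    """
    total = 0.0
    for left, right, tag in zip(points, points[1:], tags):
        total += (right - left) * f(tag)
    return total

def uniform_partition(a, b, n):
    """Endpoints a = t_0 < t_1 < ... < t_n = b of n equal intervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def random_tags(points, rng):
    """One arbitrarily chosen tag in each interval of the partition."""
    return [rng.uniform(left, right) for left, right in zip(points, points[1:])]

rng = random.Random(0)
f = lambda x: x * x  # illustrative integrand; its integral on [0, 1] is 1/3

for n in (10, 100, 1000):
    pts = uniform_partition(0.0, 1.0, n)
    s = riemann_sum(f, pts, random_tags(pts, rng))
    # The distance to 1/3 shrinks as the norm 1/n of the partition shrinks.
    print(n, s, abs(s - 1 / 3))
```

Whatever tags are drawn, the printed distances should shrink with the norm of the partition, which is the behavior the definition demands of every choice of points \(x_I\).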
We will prove that the two notions of integration are equivalent: a function \(f\) on \([a,b]\) is Riemann integrable if and only if it is bounded and Darboux integrable (i.e., the integral exists in the manner we have already defined elsewhere). In either case, the integrals must also be equal.
Note
There exist other elementary theories of integration which are similar to but not equivalent to Riemann integration. One example is the regulated integral; see Bourbaki's book “Functions of a Real Variable”.
Lemma
If \(f\) has Riemann integral \(L\) in the sense just defined above, then \(f\) must be bounded.
Proof
If \(f\) is unbounded on \([a,b]\), then for any partition \(\mathcal P\), there must be some \(I \in \mathcal P\) on which \(f\) is unbounded. This forces the collection of Riemann sums
\[ \sum_{I \in \mathcal P} |I| f(x_I) \]
for the partition \(\mathcal P\) to be unbounded (leave all but one of the terms in the sum fixed and then choose \(x_I\) to belong to some sequence such that \(f(x_I)\) is unbounded along that sequence). This means that no partition \(\mathcal P\) could have all of its Riemann sums belonging to the range \((L - \epsilon,L+\epsilon)\) for any \(L\), so an unbounded \(f\) cannot have a Riemann integral.
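The mechanism in this proof can be seen numerically. The sketch below uses an assumed example, \(f(x) = 1/x\) for \(x > 0\) with \(f(0) = 0\) on \([0,1]\), and a fixed partition into ten equal intervals. All tags are held fixed except the one in the first interval, which is moved toward \(0\).

```python
def f(x):
    # Illustrative unbounded function on [0, 1]; the value at 0 is arbitrary.
    return 0.0 if x == 0 else 1.0 / x

def riemann_sum(f, points, tags):
    """Riemann sum for sorted endpoints `points` and one tag per interval."""
    return sum((r - l) * f(t) for l, r, t in zip(points, points[1:], tags))

points = [k / 10 for k in range(11)]   # fixed partition of [0, 1]
fixed_tags = points[1:]                # right endpoints as the default tags

sums = []
for k in range(1, 6):
    # Only the tag in the first interval [0, 0.1] changes: x_I = 10^(-k).
    tags = [10.0 ** (-k)] + fixed_tags[1:]
    sums.append(riemann_sum(f, points, tags))

# The sums grow without bound even though the partition never changes.
print(sums)
```

Since the partition is fixed while the sums grow, no single interval \((L - \epsilon, L + \epsilon)\) can contain all Riemann sums for this partition.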
Theorem 1 (Riemann Implies Darboux)
If \(f\) on \([a,b]\) has Riemann integral \(L\) as defined above, then for every \(\epsilon > 0\), there is a partition \(\mathcal P\) such that
\[ L - \epsilon \leq L(f,\mathcal{P}) \leq U(f,\mathcal{P}) \leq L + \epsilon. \]
In particular, this means that \(f\) must be integrable in the Darboux sense and the Riemann and Darboux integrals must be equal.
Proof
Fix \(\epsilon > 0\). By the previous lemma, \(f\) is bounded, so the supremum and infimum of \(f\) on each interval are finite. Let \(\delta\) be chosen so that all Riemann sums on partitions with norm less than \(\delta\) lie in \((L - \frac{\epsilon}{2},L + \frac{\epsilon}{2})\). Let \(N\) be a positive integer such that \((b-a) N^{-1} < \delta\) and let \(\mathcal P\) be the subdivision of \([a,b]\) into \(N\) intervals of equal length. For each \(I \in \mathcal P\), choose \(x_I \in I\) such that \(f(x_I) > - \frac{\epsilon}{2 (b-a)} + \sup_{x \in I} f(x)\). Multiplying by \(|I|\) and summing over \(I\), it follows that
\[ \begin{aligned}
U(f,\mathcal P) & < \frac{\epsilon}{2(b-a)} \sum_{I \in \mathcal P} |I| + \sum_{I \in \mathcal P} |I| f(x_I) \\
& < \frac{\epsilon}{2} + L + \frac{\epsilon}{2} \\
& = L + \epsilon.
\end{aligned} \]
Likewise, choosing \(y_I \in I\) such that \(f(y_I) < \frac{\epsilon}{2(b-a)} + \inf_{y \in I} f(y)\) gives
\[ \begin{aligned}
L(f,\mathcal P) & > -\frac{\epsilon}{2(b-a)} \sum_{I \in \mathcal P} |I| + \sum_{I \in \mathcal P} |I| f(y_I) \\
& > L - \epsilon.
\end{aligned} \]
Thus, for this partition \(\mathcal P\),
\[ L - \epsilon < L(f,\mathcal P) \leq U(f,\mathcal P) < L + \epsilon. \]
Because \(\epsilon\) is an arbitrary positive number, the upper and lower integrals of \(f\) must be equal and must equal \(L\).
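To make the conclusion concrete, here is a small Python sketch that searches for an equal-length partition whose lower and upper sums both lie within \(\epsilon\) of \(L\). It assumes the integrand is nondecreasing (here \(f(x) = x^2\) on \([0,1]\), with \(L = 1/3\)), so that the infimum and supremum on each interval are attained at the left and right endpoints. The function name `darboux_sums_increasing` and the doubling search are choices made for this illustration only.

```python
def darboux_sums_increasing(f, a, b, n):
    """Lower and upper sums on n equal intervals.

    Valid only when f is nondecreasing on [a, b]: then inf f on each
    interval is f(left endpoint) and sup f is f(right endpoint).
    """
    h = (b - a) / n
    lower = sum(h * f(a + k * h) for k in range(n))
    upper = sum(h * f(a + (k + 1) * h) for k in range(n))
    return lower, upper

L, eps = 1 / 3, 0.01   # known integral of x^2 on [0, 1] and a sample tolerance

n = 1
while True:
    lower, upper = darboux_sums_increasing(lambda x: x * x, 0.0, 1.0, n)
    if L - eps <= lower and upper <= L + eps:
        break
    n *= 2             # refine by doubling the number of intervals

print(n, lower, upper)
```

The loop terminates because the gap between the upper and lower sums of this particular integrand on \(n\) equal intervals is exactly \(1/n\).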
Lemma
Suppose that \(\mathcal{P}_1\) and \(\mathcal{P}_2\) are partitions of an interval \([a,b]\), and let \(\mathcal{P}'\) be the minimal common refinement of \(\mathcal{P}_1\) and \(\mathcal{P}_2\), i.e., \(\mathcal{P}'\) is the common refinement with the property that every endpoint of every interval \(I \in \mathcal{P}'\) is an endpoint of an interval in \(\mathcal{P}_1\) or in \(\mathcal{P}_2\). If \(\# \mathcal{P}\) denotes the number of intervals in a partition \(\mathcal{P}\), then there are at least \(\# \mathcal{P}_2 - \# \mathcal{P}_1 + 1\) intervals in \(\mathcal{P}_2\) which also belong to \(\mathcal{P}'\).
Proof
The only way that an interval \(I \in \mathcal{P}_2\) can fail to belong to \(\mathcal{P}'\) is if there is some interval \(J \in \mathcal{P}_1\) which has an endpoint falling in the interior of \(I\). The total number of endpoints of intervals \(J \in \mathcal{P}_1\) in \((a,b)\) is \(\# \mathcal{P}_1 - 1\), and each such endpoint lies in the interior of at most one interval of \(\mathcal{P}_2\). So there can be at most \(\# \mathcal{P}_1 - 1\) intervals \(I \in \mathcal{P}_2\) which fail to belong to \(\mathcal{P}'\), and hence at least \(\# \mathcal{P}_2 - \# \mathcal{P}_1 + 1\) which do.
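The counting argument can be checked on a small example. The sketch below represents a partition by its sorted list of endpoints (an assumption of this illustration), builds the minimal common refinement as the sorted union of the two endpoint sets, and counts the intervals of \(\mathcal{P}_2\) that survive. Exact rational arithmetic via `fractions.Fraction` avoids floating-point comparisons of endpoints.

```python
from fractions import Fraction

def intervals(points):
    """Set of intervals (left, right) of the partition with sorted endpoints."""
    return set(zip(points, points[1:]))

def minimal_common_refinement(p1, p2):
    """Endpoints of the minimal common refinement: the union of both endpoint sets."""
    return sorted(set(p1) | set(p2))

p1 = [Fraction(k, 3) for k in range(4)]     # 3 equal intervals on [0, 1]
p2 = [Fraction(k, 10) for k in range(11)]   # 10 equal intervals on [0, 1]

refined = minimal_common_refinement(p1, p2)
surviving = intervals(p2) & intervals(refined)   # intervals of P_2 also in P'

bound = len(intervals(p2)) - len(intervals(p1)) + 1
print(len(surviving), ">=", bound)
```

In this example the interior endpoints \(1/3\) and \(2/3\) of \(\mathcal{P}_1\) each cut exactly one interval of \(\mathcal{P}_2\), so the bound in the lemma is attained with equality.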
Theorem 2 (Darboux Implies Riemann)

If \(f\) on \([a,b]\) is a bounded function which is integrable in the Darboux sense (i.e., the infimum of the upper sums equals the supremum of the lower sums), then \(f\) is integrable in the Riemann sense and the Riemann and Darboux integrals are equal.
Proof
Since \(f\) is bounded, there is a constant \(M > 0\) such that \(|f(x)| \leq M\) on \([a,b]\). Let \(L\) denote the Darboux integral of \(f\) and fix \(\epsilon > 0\). Let \(\mathcal{P}_1\) be any partition of \([a,b]\) such that
\[ L - \frac{\epsilon}{2} < L(f,\mathcal{P}_1) \leq U(f,\mathcal{P}_1) < L + \frac{\epsilon}{2}. \]
Given any partition \(\mathcal{P}_2\) of \([a,b]\), let \(\mathcal{P}'\) denote the minimal common refinement of \(\mathcal{P}_1\) and \(\mathcal{P}_2\). By monotonicity of upper and lower sums with respect to refinements,
\[ L - \frac{\epsilon}{2} < L(f,\mathcal{P}') \leq U(f,\mathcal{P}') < L + \frac{\epsilon}{2}. \]
Let \(\mathcal G\) denote the set of intervals which belong to both \(\mathcal{P}_2\) and \(\mathcal{P}'\). It is clearly the case that
\[ \sum_{I \in \mathcal{G}} |I| \inf_{y \in I} f(y) \leq \sum_{I \in \mathcal{G}} |I| f(x_I) \leq \sum_{I \in \mathcal{G}} |I| \sup_{y \in I} f(y) \]
for any points \(x_I \in I\).

Now suppose that \(\mathcal{P}_2\) has norm less than \(\delta\) for some \(\delta > 0\). By the lemma, \(\#\mathcal{G} \geq \#\mathcal{P}_2 - \#\mathcal{P}_1 + 1\), so there are at most \(\#\mathcal{P}_1-1\) intervals in \(\mathcal{P}_2\) which do not belong to \(\mathcal{P}'\). Call this collection \(\mathcal{B}\). Each interval in \(\mathcal{B}\) has length less than \(\delta\). Then
\[ \left| \sum_{I \in \mathcal{B}} |I| f(x_I) \right| \leq \sum_{I \in \mathcal{B}} |I| |f(x_I)| \leq (\# \mathcal{P}_1 - 1) \delta M. \]
Combining estimates gives that
\[ \begin{aligned}
-(\# \mathcal{P}_1 - 1) \delta M + \sum_{I \in \mathcal{G}} |I| \inf_{y \in I} f(y) & \leq \sum_{I \in \mathcal{P}_2} |I| f(x_I) \\
& \leq (\# \mathcal{P}_1 - 1) \delta M + \sum_{I \in \mathcal{G}} |I| \sup_{y \in I} f(y)
\end{aligned} \tag{1} \]
for any Riemann sum on the partition \(\mathcal{P}_2\).

Now consider intervals \(I\) belonging to \(\mathcal{P}'\) but not \(\mathcal{P}_2\). The sum of the lengths of intervals in \(\mathcal{G}\) is greater than \((b-a) - (\# \mathcal{P}_1 - 1) \delta\), so the sum of the lengths of intervals in \(\mathcal{P}'\) but not \(\mathcal{P}_2\) is again at most \((\# \mathcal{P}_1 - 1) \delta\). It follows that
\[ \sum_{I \in \mathcal{P}' \setminus \mathcal{P}_2} |I| \sup_{x \in I} f(x) \geq \sum_{I \in \mathcal{P}' \setminus \mathcal{P}_2} |I| (-M) \geq -M (\# \mathcal{P}_1 - 1) \delta, \]
which means that
\[ \begin{aligned}
\sum_{I \in \mathcal{G}} |I| \sup_{y \in I} f(y) & = \sum_{I \in \mathcal{P}'} |I| \sup_{y \in I} f(y) - \sum_{I \in \mathcal{P}' \setminus \mathcal{P}_2} |I| \sup_{x \in I} f(x) \\
& \leq U(f,\mathcal{P}') + M (\# \mathcal{P}_1 - 1) \delta.
\end{aligned} \]
Similar reasoning gives that
\[ \sum_{I \in \mathcal{P}' \setminus \mathcal{P}_2} |I| \inf_{x \in I} f(x) \leq M (\# \mathcal{P}_1 - 1) \delta \]
and
\[ \sum_{I \in \mathcal{G}} |I| \inf_{y \in I} f(y) \geq L(f,\mathcal{P}') - M (\# \mathcal{P}_1 - 1) \delta. \]
Combining these estimates with (1) implies that
\[ \left| \left( \sum_{I \in \mathcal{P}_2} |I| f(x_I) \right) - L \right| < \frac{\epsilon}{2} + 2 M (\# \mathcal{P}_1 - 1) \delta \]
for any partition \(\mathcal{P}_2\) of \([a,b]\) with norm less than \(\delta\). Constraining \(\delta\) so that \(2M (\#\mathcal{P}_1-1) \delta < \frac{\epsilon}{2}\) makes the right-hand side less than \(\epsilon\), which establishes that \(f\) must have Riemann integral \(L\) when \(L\) is the value of the Darboux integral of \(f\).
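The choice of \(\delta\) in this proof can be tried out numerically. The following sketch is an illustration under stated assumptions, not a substitute for the argument: it takes \(f(x) = x^2\) on \([0,1]\) (so \(L = 1/3\) and \(M = 1\)), uses a partition \(\mathcal{P}_1\) into 20 equal intervals, picks \(\delta\) from the constraint \(2M(\#\mathcal{P}_1 - 1)\delta < \epsilon/2\), and then samples random partitions \(\mathcal{P}_2\) with norm less than \(\delta\) together with random tags. The helper `random_partition` and all numerical values are hypothetical choices for this example.

```python
import random

def f(x):
    return x * x   # illustrative bounded integrand on [0, 1]

L, M, eps = 1 / 3, 1.0, 0.1   # Darboux integral, bound on |f|, tolerance
n1 = 20                        # number of intervals in P_1

# f is increasing, so lower/upper sums on P_1 use left/right endpoints.
lower = sum(f(k / n1) / n1 for k in range(n1))
upper = sum(f((k + 1) / n1) / n1 for k in range(n1))
assert L - eps / 2 < lower <= upper < L + eps / 2

# Any delta with 2 M (#P_1 - 1) delta < eps / 2 works; take 90% of the limit.
delta = 0.9 * (eps / 2) / (2 * M * (n1 - 1))

def random_partition(a, b, delta, rng):
    """Endpoints of a partition of [a, b] with every interval shorter than delta."""
    points = [a]
    while b - points[-1] >= delta:
        points.append(points[-1] + rng.uniform(0.25, 0.95) * delta)
    points.append(b)   # the final interval has length < delta by the loop test
    return points

rng = random.Random(0)
worst = 0.0
for _ in range(50):
    pts = random_partition(0.0, 1.0, delta, rng)
    # Riemann sum with one random tag in each interval of P_2.
    s = sum((r - l) * f(rng.uniform(l, r)) for l, r in zip(pts, pts[1:]))
    worst = max(worst, abs(s - L))

print(delta, worst, worst < eps)
```

For this smooth integrand the observed error is far smaller than \(\epsilon\); the bound in the proof is a worst-case estimate that uses only boundedness and Darboux integrability.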
Exercises
  1. Suppose \(f\) is a function on \([0,1]\) such that the sums
    \[ \frac{1}{n} \sum_{i=1}^n f\left( \frac{i}{n} \right)\]
    tend to some limit \(L\) as \(n \rightarrow \infty\). Does this imply that \(f\) is Riemann integrable? If \(f\) is known to be Riemann integrable, must the value of the limit equal the integral?
  2. Suppose that \(f\) is a one-periodic function on \(\mathbb{R}\) which is Riemann integrable on \([0,1]\). Show that the functions
    \[ \frac{1}{N} \sum_{n=1}^N f \left( x + \frac{n}{N} \right)\]
    converge uniformly to a constant function as \(N \rightarrow \infty\).
  3. Show that if \(f\) is bounded and the Darboux integral of \(f\) exists on \([a,b]\) and equals \(L\), then for every \(\epsilon > 0\), there is some \(\delta > 0\) such that \(L - \epsilon/2 \leq L(f,\mathcal{P}) \leq U(f,\mathcal{P}) \leq L + \epsilon/2\) for all partitions \(\mathcal{P}\) with norm less than \(\delta\). (We have already shown that there must be some partition \(\mathcal{P}\) whose upper and lower sums are close, but in fact upper and lower sums must be close for any sufficiently fine partition.)