Conditional Convergence

Recall that a series \(\sum_n a_n\) is called conditionally convergent when it is convergent but \(\sum_{n} |a_n|\) is not. Because the absolute values are non-negative, the partial sums of \(\sum_n |a_n|\) form a nondecreasing sequence, so the only way that \(\sum_n |a_n|\) can fail to be convergent is for its partial sums to be unbounded, i.e., for each \(M > 0\), one must have
\[ \sum_{n=1}^N |a_n| > M \]
for all \(N\) sufficiently large.
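As a numerical sketch (illustrative only, not part of the formal development), the alternating harmonic series \(\sum (-1)^{n+1}/n\) is the standard example: it converges to \(\ln 2\), while its series of absolute values is the harmonic series, whose partial sums are unbounded.

```python
import math

# Sketch: a_n = (-1)^(n+1)/n. The series converges (to ln 2), but the
# partial sums of |a_n| (the harmonic series) are unbounded.
N = 10**5
partial = sum((-1) ** (n + 1) / n for n in range(1, N + 1))
abs_partial = sum(1 / n for n in range(1, N + 1))

print(partial)      # close to ln 2 ≈ 0.693147
print(abs_partial)  # roughly ln N + 0.577..., growing without bound in N
```

Increasing \(N\) drives the first sum toward \(\ln 2\) while the second exceeds any fixed \(M\) eventually.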

1. Infinite Cancellation

The first thing to observe about conditionally-convergent series is that they have the rather extreme property that there is an infinite amount of cancellation which takes place in order to achieve convergence. By this, we mean that the sum of just the positive terms will be infinite, as will the sum of just the negative terms. The precise formulation of this idea is as follows.
Proposition
Suppose
\[ \sum_{n=1}^\infty a_n \]
is a conditionally convergent series. Let \(\{a_{n_j}\}_{j=1}^\infty\) be the subsequence of \(\{a_n\}_{n=1}^\infty\) consisting of all nonnegative terms and let \(\{a_{m_k}\}_{k=1}^\infty\) be the subsequence of \(\{a_n\}_{n=1}^\infty\) consisting of all strictly negative terms. Then
\[ \sum_{j=1}^\infty a_{n_j} = \infty \text{ and } \sum_{k=1}^\infty a_{m_k} = -\infty. \]
Proof
First a remark: one should not assume that there are infinitely many positive terms and infinitely many negative terms, but it's possible to prove that this must be the case. For if there are only finitely many negative terms, then \(|a_n| = a_n\) for all but finitely many \(n\) and consequently convergence would imply absolute convergence. Likewise, if there are only finitely many positive terms, then \(|a_n| = -a_n\) for all \(n\) sufficiently large and once again convergence would imply absolute convergence.

The key thing to notice is that every index \(n\) appears exactly once in either \(\{n_j\}_{j=1}^\infty\) (if \(a_n \geq 0\)) or \(\{m_k\}_{k=1}^\infty\) (if \(a_n < 0\)). In particular, for every \(N\), there is some \(J_N\) and \(K_N\) such that
\[ \sum_{n=1}^N a_n = \sum_{j=1}^{J_N} a_{n_j} + \sum_{k=1}^{K_N} a_{m_k}. \]
(Simply take \(J_N\) to be the largest index \(J\) for which \(n_J \leq N\) and similarly take \(K_N\) to be the largest index \(K\) for which \(m_K \leq N\).) Furthermore, for this same triple \((N,J_N,K_N)\),
\[ \sum_{n=1}^N |a_n| = \sum_{j=1}^{J_N} a_{n_j} - \sum_{k=1}^{K_N} a_{m_k}. \]
By conditional convergence of our original series, we must have that
\[ \sum_{j=1}^{J_N} a_{n_j} + \sum_{k=1}^{K_N} a_{m_k}\]
remains bounded (and in fact converges) in the limit \(N \rightarrow \infty\) while
\[ \sum_{j=1}^{J_N} a_{n_j} - \sum_{k=1}^{K_N} a_{m_k}\]
is unbounded and tends to \(\infty\) as \(N \rightarrow \infty\). Since differences of bounded sequences are themselves bounded, at least one of
\[ \sum_{j=1}^{J_N} a_{n_j} \text{ or } \sum_{k=1}^{K_N} a_{m_k}\]
must be unbounded as a function of \(N\). But if one of them were unbounded and the other bounded, the sum
\[ \sum_{j=1}^{J_N} a_{n_j} + \sum_{k=1}^{K_N} a_{m_k}\]
would be unbounded (which isn't the case). Thus both are unbounded as \(N \rightarrow \infty\). Now we have established that the partial sums
\[ \sum_{j=1}^J a_{n_j} \text{ and } \sum_{k=1}^K a_{m_k}\]
have subsequences (with \(J\) replaced by \(J_N\) and \(K\) replaced by \(K_N\)) which are unbounded. This forces the full sequences of partial sums to tend to \(\infty\) as \(J \rightarrow \infty\) and to \(-\infty\) as \(K \rightarrow \infty\), because both sequences of partial sums are monotone (the terms are always nonnegative in the first sum and strictly negative in the second).
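The proposition can be watched numerically (a sketch only) for the alternating harmonic series \(a_n = (-1)^{n+1}/n\): the positive part and the negative part each diverge on their own.

```python
# Sketch: for the alternating harmonic series, the sum of just the positive
# terms and the sum of just the negative terms each diverge.
def pos_sum(N):   # positive terms among a_1, ..., a_N: 1/1 + 1/3 + 1/5 + ...
    return sum(1 / n for n in range(1, N + 1, 2))

def neg_sum(N):   # negative terms among a_1, ..., a_N: -1/2 - 1/4 - ...
    return -sum(1 / n for n in range(2, N + 1, 2))

for N in (10**2, 10**4, 10**6):
    print(N, pos_sum(N), neg_sum(N))  # each grows like (1/2) log N in magnitude
```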

2. Absolute Convergence and Rearrangement

Here we give a rigorous proof of a fact that is likely more “intuitively obvious” than it should be: for absolutely convergent series, one can reorder the terms of the sum in any way desired and the new series will still converge and have the same sum. One can think of this as an infinite analogue of the finite commutativity of addition.
Theorem
Suppose that \(\sum_{n=1}^\infty a_n\) is absolutely convergent. Then for any bijection \(\phi \ : \ \mathbb{N} \rightarrow \mathbb{N}\),
\[ \sum_{j=1}^\infty a_{\phi(j)} = \sum_{n=1}^\infty a_{n}. \]
In other words, rearranging the terms of the sequence \(\{a_n\}_{n=1}^\infty\) preserves convergence of the series and specifically the sum of the series.
Proof
Meta (Main Idea)
The main idea is that when \(J'\) is large, the terms \(a_{\phi(1)},\ldots,a_{\phi(J')}\) have to include at least \(a_1,\ldots,a_N\) for some large \(N\). We can argue that when \(N\) is large, the finite sum of these terms will be close to the actual sum of the infinite series, and we can use the triangle inequality to show that all the terms in \(a_{\phi(1)},\ldots,a_{\phi(J')}\) whose indices are larger than \(N\) don't contribute significantly to the overall sum. For each \(\epsilon > 0\), let \(N\) be chosen so that
\[ \sum_{n=N+1}^{M} |a_n| < \frac{\epsilon}{2}\]
whenever \(M > N\). Such an \(N\) exists by the Cauchy criterion because the series converges absolutely. Now let \(J\) be any index such that \(\{1,\ldots,N\} \subset \{\phi(1),\ldots,\phi(J)\}\); we know such a \(J\) must exist because \(\phi\) is a bijection and so every index \(n \in \{1,\ldots,N\}\) equals \(\phi(j)\) for some \(j\). For convenience, for every natural number \(J'\), let \(I_{J'} := \phi(\{1,\ldots,J'\})\) and observe that \(\{1,\ldots,N\} \subset I_{J'}\) when \(J' \geq J\). Because finite sums are commutative, we can write
\[ \sum_{j=1}^{J'} a_{\phi(j)} = \sum_{n=1}^N a_n + \sum_{n \in I_{J'} \setminus \{1,\ldots,N\}} a_n, \]
so the triangle inequality implies that
\[ \left| \sum_{j=1}^{J'} a_{\phi(j)} - \sum_{n=1}^N a_n \right| \leq \sum_{n \in I_{J'} \setminus \{1,\ldots,N\}} |a_n|. \]
The sum on the right-hand side is a finite sum of terms \(|a_n|\) and every index \(n\) appearing in the sum is at least as large as \(N+1\). Therefore, if \(M\) is the maximum index appearing in this sum, we have
\[\sum_{n \in I_{J'} \setminus \{1,\ldots,N\}} |a_n| \leq \sum_{n=N+1}^{M} |a_n| < \frac{\epsilon}{2}. \]
But also observe that the triangle inequality implies that
\[ \left| \sum_{n=1}^N a_n - \sum_{n=1}^M a_n \right| \leq \frac{\epsilon}{2}\]
for all \(M > N\), so taking the limit of the inequalities
\[ -\frac{\epsilon}{2} + \sum_{n=1}^N a_n \leq \sum_{n=1}^M a_n \leq \frac{\epsilon}{2} + \sum_{n=1}^N a_n \]
as \(M \rightarrow \infty\) gives
\[ -\frac{\epsilon}{2} + \sum_{n=1}^N a_n \leq \sum_{n=1}^\infty a_n \leq \frac{\epsilon}{2} + \sum_{n=1}^N a_n \]
or simply
\[ \left| \sum_{n=1}^N a_n - \sum_{n=1}^\infty a_n \right| \leq \frac{\epsilon}{2}. \]
Combining everything we've observed, if \(J' \geq J\), we have
\[{}\left| \sum_{j=1}^{J'} a_{\phi(j)} - \sum_{n=1}^\infty a_n \right|{}\]
\[{}\leq \left| \sum_{j=1}^{J'} a_{\phi(j)} - \sum_{n=1}^N a_n \right|{}\]
\[{}+ \left| \sum_{n=1}^N a_n - \sum_{n=1}^\infty a_n \right|{}\]
\[{}< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.{}\]
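The theorem can be sketched numerically with \(a_n = (-1)^{n+1}/n^2\), which is absolutely convergent with sum \(\pi^2/12\). A finite shuffle of a truncation would be uninformative (finite sums commute exactly), so we instead use a genuine rearrangement pattern: two positive terms, then one negative term, repeated forever.

```python
import math

# Sketch: rearrange the absolutely convergent series sum (-1)^(n+1)/n^2,
# whose sum is pi^2/12, by taking two positive terms 1/(2k-1)^2 and then
# one negative term -1/(2k)^2, repeatedly. The limit is unchanged.
def rearranged_partial(blocks):
    s, p, g = 0.0, 0, 0
    for _ in range(blocks):
        for _ in range(2):            # next two unused positive terms
            p += 1
            s += 1 / (2 * p - 1) ** 2
        g += 1                        # next unused negative term
        s -= 1 / (2 * g) ** 2
    return s

print(rearranged_partial(10**5), math.pi**2 / 12)  # the two values agree closely
```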

3. Riemann Rearrangement Theorem

The reason that the theorem above is more intuitive than it should be is that it turns out to be false for conditionally-convergent series. The Riemann Rearrangement Theorem (below) actually demonstrates that conditionally convergent series can be reordered to make the new sum equal any value that one likes.
Theorem
Suppose that \(\sum_{n=1}^\infty a_n\) is a conditionally convergent series. Let \(r\) be any real number. Then there is a bijection \(\phi : \mathbb{N} \rightarrow \mathbb{N}\) such that
\[ \sum_{j=1}^\infty a_{\phi(j)} = r. \]
In other words, there is some reordering of the terms \(\{a_n\}_{n=1}^\infty\) such that the series converges to \(r\).
The proof is a bit involved, so we divide it into several steps.

Step 1: Sorting Terms Because the series \(\sum_{n=1}^\infty a_n\) converges, its terms tend to zero. This means that for any \(\epsilon > 0\), there are only finitely many terms \(a_n\) with \(|a_n| > \epsilon\). In particular, there must always be a term with largest magnitude (since for any \(n_0\) with \(a_{n_0} \neq 0\), there are only finitely many \(n\) with \(|a_n| > |a_{n_0}|\)). This means that we can “sort” the terms \(\{a_n\}_{n=1}^\infty\) into a nonincreasing sequence \(\{p_k\}_{k=1}^\infty\) of positive terms and a nondecreasing sequence \(\{g_k\}_{k=1}^\infty\) of negative terms. We note that there must indeed be infinitely many positive terms and also infinitely many negative terms because we know that the sum of just the positive terms diverges, as does the sum of just the negative terms.

Every positive term \(a_n\) appears exactly once in the sequence \(\{p_k\}_{k=1}^\infty\) and every negative \(a_n\) appears exactly once in \(\{g_k\}_{k=1}^\infty\). Terms \(a_n\) which equal zero (if there are any) don't appear in either sequence and will be handled separately.
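Step 1 can be sketched on a finite truncation (illustrative only; for the infinite sequence, the "sort" is justified by the argument just given rather than performed literally).

```python
# Sketch of Step 1 on a finite truncation of the alternating harmonic
# series: split out the positive and negative terms and sort them.
terms = [(-1) ** (n + 1) / n for n in range(1, 11)]

p = sorted((t for t in terms if t > 0), reverse=True)  # nonincreasing positives
g = sorted(t for t in terms if t < 0)                  # nondecreasing negatives

print(p)  # starts with 1.0, then 1/3, 1/5, ...
print(g)  # starts with -1/2, then -1/4, -1/6, ...
```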

Step 2: Defining the Rearrangement Let \(S_0 := 0\). For each \(J' \geq 1\), we are going to inductively define \(\phi(J')\) and the corresponding partial sum \(S_{J'}\) of the rearranged series. We'll do this by using the terms \(\{p_k\}_{k=1}^\infty\) and \(\{g_k\}_{k=1}^\infty\) in order. We'll say that a particular term \(p_k\) or \(g_k\) is “used” at step \(J'\) if the term \(a_n\) that it equals satisfies \(\phi(j) = n\) for some \(j \leq J'\) (so it's “used” if its corresponding \(a_n\) already appears somewhere in the partial sum \(S_{J'}\)). Now assuming that \(\phi(1),\ldots,\phi(J')\) are defined, define \(\phi(J'+1)\) as follows. If \(S_{J'} \leq r\), let \(\phi(J'+1)\) be defined so that \(a_{\phi(J'+1)}\) corresponds to the first positive term \(p_k\) which is still unused at step \(J'\). On the other hand, if \(S_{J'} > r\), define \(\phi(J'+1)\) so that \(a_{\phi(J'+1)}\) corresponds to the first negative term \(g_k\) still unused at step \(J'\). Proceeding in this way gives a well-defined \(\phi\) which is injective (since no term is used more than once).
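The inductive construction in Step 2 can be sketched in a few lines of Python for the alternating harmonic series, where the sorted sequences are simply \(p_k = 1/(2k-1)\) and \(g_k = -1/(2k)\); the target \(r = 3/2\) is an arbitrary illustrative choice.

```python
# Sketch of Step 2's greedy construction: while the running sum is <= r,
# take the first unused positive term; otherwise take the first unused
# negative term. Here p_k = 1/(2k-1) and g_k = -1/(2k) are already sorted.
def rearrange_partial_sums(r, steps):
    S, sums = 0.0, []
    p_idx = g_idx = 0                 # index of the last used p_k / g_k
    for _ in range(steps):
        if S <= r:                    # first unused positive term
            p_idx += 1
            S += 1 / (2 * p_idx - 1)
        else:                         # first unused negative term
            g_idx += 1
            S -= 1 / (2 * g_idx)
        sums.append(S)
    return sums

sums = rearrange_partial_sums(r=1.5, steps=10**5)
print(sums[-1])  # the partial sums approach r = 1.5
```

The later steps of the proof quantify exactly why these partial sums converge to \(r\): each overshoot or undershoot is bounded by the size of the most recently used term, and the terms tend to zero.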

Step 3: Understanding the Partial Sums A key claim to prove is that the partial sums \(S_{J'}\) switch infinitely many times between the case \(S_{J'} < r\) and the case \(S_{J'} \geq r\). If this were not so, then either \(S_{J'} < r\) for all \(J'\) sufficiently large or \(S_{J'} \geq r\) for all \(J'\) sufficiently large. This would in turn mean that \(a_{\phi(j)}\) would be drawn from \(\{p_k\}_{k=1}^\infty\) for all sufficiently large \(j\) in the first case, or from \(\{g_k\}_{k=1}^\infty\) for all sufficiently large \(j\) in the second case. To see why this can't happen, observe that \(\sum_{k=k_1}^{k_2} p_k \rightarrow \infty\) as \(k_2 \rightarrow \infty\) and \(\sum_{k=k_1}^{k_2} g_k \rightarrow -\infty\) as \(k_2 \rightarrow \infty\). That's because (for example) we know that for any \(M\), every sufficiently large \(N\) has the property that the sum of the first \(N\) positive \(a_n\)'s exceeds \(M + \sum_{k=1}^{k_1-1} p_k\). But if \(k_2\) is large enough, each of the first \(N\) positive \(a_n\)'s will appear in the list \(p_1,\ldots,p_{k_2}\), which means \(p_1 + \cdots + p_{k_2} > M + p_1 + \cdots + p_{k_1-1}\). Subtracting \(p_1 + \cdots + p_{k_1-1}\) from both sides establishes the claim. Now, if \(a_{\phi(j)}\) were always positive for \(j \geq J_0\), then \(S_{J'} - S_{J_0}\) would tend to \(+\infty\) as \(J' \rightarrow \infty\), because \(S_{J'}\) differs from \(S_{J_0}\) by adding more and more terms of \(\{p_k\}_{k=1}^\infty\), used in order. But once \(S_{J'} - S_{J_0}\) were large enough, it would no longer be the case that \(S_{J'} < r\) as supposed. A symmetric argument shows that \(S_{J'} \geq r\) can't hold for all sufficiently large \(J'\).

Now define \(J_1'\) to be the first index \(J'\) at which \(S_{J'} - r\) changes from negative to nonnegative, i.e., \(S_{J'_1} - r < 0\) but \(S_{J'_1+1} - r \geq 0\). Let \(J'_2 > J_1'\) be the first index after \(J_1'\) when the sequence \(\{S_{J'} - r\}_{J'}\) switches back, i.e., \(S_{J'_2} - r \geq 0\) and \(S_{J'_2+1} - r < 0\). Let \(J'_3,J'_4,\ldots\) continue on to be an ordered enumeration of the indices at which \(S_{J'} - r\) changes sign in this way. For any odd \(\ell\), since \(S_{J'_\ell+1} = S_{J'_\ell} + p_k\) for some \(k\), it follows that
\[0 \leq S_{J'_\ell+1} - r = p_k + (S_{J'_\ell} - r) \leq p_k.\]
The index \(k\) used here increases with \(\ell\): by the \(\ell\)-th sign change, at least \((\ell+1)/2\) positive terms have been used, so \(k \geq (\ell+1)/2\), and since the \(p_k\) are nonincreasing, it follows that \(p_k \leq p_{(\ell+1)/2}\). So \(S_{J'_\ell+1} - r \leq p_{(\ell+1)/2}\). For \(J'\) between \(J'_{\ell}+1\) and \(J'_{\ell+1}\), the partial sums \(S_{J'}\) decrease, so \(0 \leq S_{J'} - r \leq S_{J'_\ell+1} - r \leq p_{(\ell+1)/2}\). Similarly, if \(J'\) is between \(J'_\ell+1\) and \(J'_{\ell+1}\) for some even \(\ell\), it will be the case that \(g_{\ell/2} \leq S_{J'} - r < 0\).

Step 4: Convergence of the Series Now fix any \(\epsilon > 0\). Let \(K\) be so large that \(p_k < \epsilon\) for all \(k \geq K\) and \(g_k > - \epsilon\) for all \(k \geq K\) and then fix \(\ell \geq 2K\). We claim that whenever \(J' > J'_\ell\), it must be the case that
\[ |S_{J'} - r| < \epsilon. \]
This is because every \(J'\) in this range is between \(J'_{\ell_\star}+1\) and \(J'_{\ell_\star+1}\) for some \(\ell_\star \geq \ell\), so either \(-\epsilon < g_{\ell_\star/2} \leq S_{J'} - r < 0\) or \(0 \leq S_{J'} - r \leq S_{J'_{\ell_\star}+1} - r \leq p_{(\ell_\star+1)/2} < \epsilon\) depending on whether \(\ell_\star\) is even or odd. Thus it must be the case that
\[ \sum_{j=1}^\infty a_{\phi(j)} = r \]
as desired.

Step 5: Conclusions and Clean-up There is a final issue remaining, which is to verify that \(\phi\) is not just injective but bijective. Technically, the argument above will certainly not yield a bijection if there are any terms \(a_n = 0\), but we can fix that problem in an ad hoc way in just a moment. The more subtle issue is to understand why every nonzero term gets used. Here the answer is again related to the fact that \(S_{J'} - r\) changes sign infinitely often. With \(J'_1,J'_2,\ldots\) defined as above, we see that the term \(a_{\phi(J'_1+1)}\) must be positive, \(a_{\phi(J'_2+1)}\) must be negative, \(a_{\phi(J'_3+1)}\) positive, etc. In other words, there are, in fact, infinitely many occasions on which \(a_{\phi(J')}\) is positive and infinitely many occasions on which \(a_{\phi(J')}\) is negative. Because we choose these terms from \(\{p_k\}_{k=1}^\infty\) and \(\{g_k\}_{k=1}^\infty\) in order, every term from both sequences must eventually be used (as the only way for \(p_k\) not to be used would be to use only finitely many positive terms, i.e., \(p_1,\ldots,p_{k-1}\)). Thus we can say with certainty that every \(n\) for which \(a_n\) is nonzero admits some \(j\) such that \(\phi(j) = n\). So, in fact, what we have accomplished is a map \(\phi\) which sends \(\mathbb{N}\) bijectively to the set of indices \(\{n \ : \ a_n \neq 0\}\).

As a last step, suppose \(a_n = 0\) for at least one \(n\). If there are only finitely many \(n\) for which \(a_n = 0\), we can tack on a finite number of zeros to the beginning of the series just built, i.e., build some new \(\phi'\) such that \(a_{\phi'(1)},\ldots,a_{\phi'(N)}\) coincide with all the terms which are zero and then take \(\phi'(j+N) := \phi(j)\). Adding finitely many zeros does not change the convergence of the series, nor does it change the value of the sum. If, on the other hand, there are infinitely many terms which are zero, let us define \(\phi'(2j) := \phi(j)\) and let \(\phi'(j)\) restricted to the odd indices \(j\) simply enumerate the infinite set \(\{ n \ : \ a_n = 0\}\). This new ordering interleaves the terms of the series built above with zeros. This is also a totally harmless operation: the new series still converges and has the same sum \(r\).