Devlin's Angle
November 2000

The perplexing mathematics of presidential elections

When Slobodan Milosevic claimed victory in the Yugoslavian elections last month, the opposition cried foul and the population took to the streets in protest, eventually forcing Milosevic to admit defeat and stand down. Whichever of Gore and Bush is able to claim victory in this month's US Presidential election, it is unlikely that there will be similar cries that the process was unfair. After all, everyone knows -- don't we? -- that, dubious campaign gifts, negative ads, and occasional dirty tricks notwithstanding, when it comes to the actual election process, the US electoral system is as fair as can be. One person one vote, with victory going to the candidate with the most votes. Who can possibly object to that?

Well, John McCain could, for one. And in theory at least, so could Ralph Nader. (More on both of them later.) So too could anyone who takes a look at the mathematics of voting. It's not the idea of one person one vote that's the problem, it's the math that is used to turn those votes into a final decision. Ideally, that math should reflect the wishes of the electorate. But does it?

The answer comes as a surprise to most people. There are, in fact, several different ways to do the math, and they often lead to very different outcomes. That's right: there's a choice of how to do the math!

The electoral math used in the United States election process counts votes using a system known as plurality voting. In this system, also known as "first-past-the-post," the candidate with the most votes is declared the winner. Now in an election where there are just two candidates, that system works just fine. It's when there are three or more candidates that problems can arise. Plurality voting can result in the election of a candidate whom almost two-thirds of voters detest.

For instance, in 1998, in a three-party race, plurality voting resulted in the election of former wrestler Jesse Ventura as Governor of Minnesota, despite the fact that only 37% of the electors voted for him. The almost two-thirds of electors who voted Democrat or Republican had to come to terms with a governor that none of them wanted -- or expected. Judging by the comments immediately after the election, the majority of Democrat and Republican voters were strongly opposed to Reform Party candidate Ventura moving into the Governor's mansion. In which case, he won not because the majority of voters chose him, but because plurality voting effectively thwarted the will of the people. Had the voters been able to vote in such a way that, if their preferred candidate were not going to win, their preference between the remaining two could be counted, the outcome could have been quite different.

For instance, several countries, among them Australia, the Irish Republic, and Northern Ireland, use a system called single transferable vote. Introduced by Thomas Hare in England in the 1850s, this system takes account of the entire range of preferences each voter has for the candidates. All electors rank all the candidates in order of preference. When the votes are tallied, the candidates are first ranked based on the number of first-place votes each received. The candidate who comes out last is dropped from the list. This, of course, effectively "disenfranchises" all those voters who picked that candidate. So, their vote is automatically transferred to their second choice of candidate -- which means that their vote still counts. Then the process is repeated: the candidates are ranked a second time, according to the new distribution of votes. Again, the candidate who comes out last is dropped from the list. With just three candidates, this leaves one candidate, who is declared the winner. In a contest with more than three candidates, the process is repeated one or more additional times until only one candidate remains, with that individual winning the election. Since each voter ranks all the candidates in order, this method ensures that at every stage, every voter's preferences among the remaining candidates are taken into account.

An alternative system that avoids the kind of outcome seen in the 1998 Minnesota Governor's race is the Borda count, named after Jean-Charles de Borda, who devised it in 1781. Again, the idea is to try to take account of each voter's overall preferences among all the candidates. As with the single transferable vote, in this system, when the poll takes place, each voter ranks all the candidates. If there are n candidates, then when the votes are tallied, each candidate receives n points for each first-place ranking, n-1 points for each second-place ranking, n-2 points for each third-place ranking, down to just 1 point for each last-place ranking. The candidate with the greatest total number of points is then declared the winner.

Yet another system that avoids the Jesse Ventura phenomenon is approval voting. Here the philosophy is to try to ensure that the process does not lead to the election of someone whom the majority opposes. Each voter is allowed to vote for all those candidates of whom he or she approves, and the candidate who gets the most votes wins the election. This is the method used to elect the officers of both the American Mathematical Society and the Mathematical Association of America.

To see how these different systems can lead to very different results, let's consider a hypothetical scenario in my home state of California, where Green Party candidate Ralph Nader is expected to do well. Suppose that, on November 7, 15 million Californians go to the polls, and that their preferences between the three main candidates are as follows:

6 million rank Bush first, then Nader, then Gore.
5 million rank Gore first, then Nader, then Bush.
4 million rank Nader first, then Gore, then Bush.

If the votes are tallied by the plurality vote -- the present system -- then Bush's 6 million (first-place) votes make him the clear winner. And yet, 9 million voters (60% of the total) rank him dead last! That hardly seems fair.
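
For readers who like to check such claims by machine, here is a minimal Python sketch of the plurality tally for this hypothetical profile. The names `profile` and `tally_position` are illustrative choices, and the numbers are simply the millions of voters assumed above.

```python
from collections import Counter

# Hypothetical California profile from the text:
# (voters in millions, ranking from most to least preferred).
profile = [
    (6, ["Bush", "Nader", "Gore"]),
    (5, ["Gore", "Nader", "Bush"]),
    (4, ["Nader", "Gore", "Bush"]),
]

def tally_position(profile, position):
    """Count how many voters place each candidate at the given rank index."""
    counts = Counter()
    for voters, ranking in profile:
        counts[ranking[position]] += voters
    return counts

first_place = tally_position(profile, 0)   # index 0 is the top choice
last_place = tally_position(profile, -1)   # index -1 is the bottom choice

plurality_winner = max(first_place, key=first_place.get)
print(plurality_winner, first_place[plurality_winner])  # Bush 6
print(last_place["Bush"])                                # 9
```

The same candidate tops the first-place tally and the last-place tally, which is exactly the oddity described above.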

What happens if the votes are counted by the single transferable vote system -- the system used in Australia and Ireland? The first round of the tally process eliminates Nader, who is only ranked first by 4 million voters. Those 4 million voters all have Gore as their second choice, so in the second round of the tally process their votes are transferred to Gore. The result is that, in the second round, Bush gets 6 million first place votes while Gore gets 9 million. Thus, Gore wins by a whopping 9 million to 6 million margin.
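
The elimination rounds can be sketched in a few lines of Python. This is only an illustration of the tally as described in this column, not the procedure any real election authority uses; the function name `eliminate_until_one` and the alphabetical tie-break are assumptions of the sketch.

```python
# Hypothetical California profile from the text:
# (voters in millions, ranking from most to least preferred).
profile = [
    (6, ["Bush", "Nader", "Gore"]),
    (5, ["Gore", "Nader", "Bush"]),
    (4, ["Nader", "Gore", "Bush"]),
]

def eliminate_until_one(profile):
    """Drop the candidate with the fewest top-choice votes until one is left.

    Returns the winner and the vote counts of each round.
    Ties for last place are broken alphabetically (an assumption).
    """
    remaining = sorted({c for _, ranking in profile for c in ranking})
    rounds = []
    while len(remaining) > 1:
        counts = {c: 0 for c in remaining}
        for voters, ranking in profile:
            # Each ballot counts for its highest-ranked surviving candidate.
            top = next(c for c in ranking if c in remaining)
            counts[top] += voters
        rounds.append(counts)
        loser = min(remaining, key=lambda c: counts[c])
        remaining.remove(loser)
    return remaining[0], rounds

winner, rounds = eliminate_until_one(profile)
print(rounds[0])  # {'Bush': 6, 'Gore': 5, 'Nader': 4}
print(rounds[1])  # {'Bush': 6, 'Gore': 9}
print(winner)     # Gore
```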

But wait a minute. Looking at the original rankings, we see that 10 million voters prefer Nader to Gore -- that's two-thirds of the total vote. Can it really be fair for such a large majority of the electorate to have their preferences ignored so dramatically?

Thus, both the plurality vote and single transferable vote lead to results that run counter to the overwhelming desires of the electorate. What happens if we use the Borda count? Well, with this method, Bush gets

6m x 3 + 5m x 1 + 4m x 1 = 27m points,

Gore gets

6m x 1 + 5m x 3 + 4m x 2 = 29m points,

and Nader gets

6m x 2 + 5m x 2 + 4m x 3 = 34m points.

The result is a decisive win for Nader, with Gore coming in second and Bush trailing in third place.
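
The three totals above follow mechanically from the rankings, as this short Python sketch suggests. The helper `borda_scores` is a name made up for the illustration; it awards n points for a first place down to 1 point for a last place, as described earlier.

```python
# Hypothetical California profile from the text:
# (voters in millions, ranking from most to least preferred).
profile = [
    (6, ["Bush", "Nader", "Gore"]),
    (5, ["Gore", "Nader", "Bush"]),
    (4, ["Nader", "Gore", "Bush"]),
]

def borda_scores(profile):
    """Give n points for first place, n-1 for second, ..., 1 for last."""
    scores = {}
    for voters, ranking in profile:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            points = n - position  # position 0 earns n points
            scores[candidate] = scores.get(candidate, 0) + voters * points
    return scores

scores = borda_scores(profile)
print(scores["Bush"], scores["Gore"], scores["Nader"])  # 27 29 34
print(max(scores, key=scores.get))                       # Nader
```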

What happens with approval voting? Well, as I have set up the problem so far, we don't have enough information -- we don't know how many electors actively oppose each particular candidate. Let's assume that the Gore supporters and the Nader supporters could live with the others' candidate, but the voters in both groups really don't want to see Bush in the White House. Let's also assume that the Bush supporters could live with Nader, their second choice, but not with Gore. (This is not at all an unreasonable supposition, given the voting preferences we started with, but remember that this is a purely hypothetical example.) In this case, Nader gets 15 million votes, Gore gets 9 million votes, and Bush gets a mere 6 million. All in all, it's beginning to look as though Nader is the one who should receive the Electoral College's votes for California.
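
Here is a small Python sketch of that approval tally. The approval sets are assumptions: the Gore and Nader groups follow the supposition above, and the Bush group is taken to approve of Bush and Nader, which is what the 15 million total for Nader requires.

```python
from collections import Counter

# (voters in millions, set of candidates that group approves of).
# These approval sets are assumed for the hypothetical example.
ballots = [
    (6, {"Bush", "Nader"}),   # Bush supporters: assumed to accept Nader too
    (5, {"Gore", "Nader"}),   # Gore supporters: accept Nader, reject Bush
    (4, {"Nader", "Gore"}),   # Nader supporters: accept Gore, reject Bush
]

approvals = Counter()
for voters, approved in ballots:
    for candidate in approved:
        approvals[candidate] += voters  # one vote per approved candidate

print(approvals["Nader"], approvals["Gore"], approvals["Bush"])  # 15 9 6
```

Change the assumed approval sets and the totals change with them, which is why the rankings alone are not enough to run this method.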

Faced with such confusion in how to count votes in elections with three or more candidates, it's tempting to say that the only fair way to decide the issue is to choose the individual who would beat every other candidate in head-to-head, two-party contests. This approach was suggested by the Marquis de Condorcet in 1785, and as a result is known today as the Condorcet system.

For the scenario in our example, Nader also wins according to the Condorcet system. He gets 10 million votes in a straight Nader-Gore contest and 9 million votes in a Nader-Bush match-up, in each case a majority of the 15 million voters. Unfortunately, although it works for this example, and despite the fact that it has considerable appeal, the Condorcet method suffers from a major disadvantage: it does not always produce a clear winner!
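
The head-to-head counts for Nader can be read straight off the rankings. A minimal Python sketch, using a helper named `voters_preferring` that is invented here for illustration:

```python
# Hypothetical California profile from the text:
# (voters in millions, ranking from most to least preferred).
profile = [
    (6, ["Bush", "Nader", "Gore"]),
    (5, ["Gore", "Nader", "Bush"]),
    (4, ["Nader", "Gore", "Bush"]),
]

def voters_preferring(profile, a, b):
    """Total voters (in millions) who rank candidate a above candidate b."""
    return sum(
        voters
        for voters, ranking in profile
        if ranking.index(a) < ranking.index(b)
    )

nader_over_gore = voters_preferring(profile, "Nader", "Gore")
nader_over_bush = voters_preferring(profile, "Nader", "Bush")
print(nader_over_gore, nader_over_bush)  # 10 9
```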

For example, suppose the Californian voting profile were as follows:

5 million rank Bush first, then Gore, then Nader.
5 million rank Gore first, then Nader, then Bush.
5 million rank Nader first, then Bush, then Gore.

Then 10 million Californian voters prefer Bush to Gore, so Bush would easily win a Bush-Gore battle. Also, 10 million voters prefer Gore to Nader, so Gore would romp home in a Gore-Nader contest. The remaining two-party match-up would pit Bush against Nader. But when we look at the preferences, we see that 10 million people prefer Nader to Bush, so Nader comes out on top in that contest. In other words, there is no clear winner. Each candidate wins one of the three possible two-party battles!
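
The cycle is easy to confirm by listing every two-candidate contest. In this Python sketch, `pairwise_wins` and `condorcet_winner` are illustrative names; the second function returns `None` when no candidate beats all the others.

```python
from itertools import permutations

# The second hypothetical profile from the text (voters in millions).
cyclic_profile = [
    (5, ["Bush", "Gore", "Nader"]),
    (5, ["Gore", "Nader", "Bush"]),
    (5, ["Nader", "Bush", "Gore"]),
]

def pairwise_wins(profile):
    """List (winner, loser) for every head-to-head contest with a majority."""
    candidates = profile[0][1]
    wins = []
    for a, b in permutations(candidates, 2):
        a_votes = sum(v for v, r in profile if r.index(a) < r.index(b))
        b_votes = sum(v for v, r in profile if r.index(b) < r.index(a))
        if a_votes > b_votes:
            wins.append((a, b))
    return wins

def condorcet_winner(profile):
    """Return the candidate who beats every rival, or None if there is none."""
    candidates = profile[0][1]
    wins = set(pairwise_wins(profile))
    for c in candidates:
        if all((c, other) in wins for other in candidates if other != c):
            return c
    return None

print(pairwise_wins(cyclic_profile))
# [('Bush', 'Gore'), ('Gore', 'Nader'), ('Nader', 'Bush')]
print(condorcet_winner(cyclic_profile))  # None
```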

So what do we do next? Faced with such a confusing state of affairs, the obvious thing is to abandon all of the methods we have looked at and search for an alternative approach. After all, there must be a fair way to count the votes in an election, mustn't there?

Sadly -- and surprisingly -- the answer is no. In 1950, the Stanford economist Kenneth Arrow made a startling mathematical discovery -- a discovery for which he was subsequently awarded the Nobel Prize in Economics. Suppose, said Arrow, that we want to find a way of tallying the votes in an election. What kinds of conditions must that tallying system satisfy in order for it to give a fair outcome? One obvious condition is that if every voter prefers candidate A over candidate B, then the final ranking produced by the tally system should place A above B. Another obvious requirement is that if the tally system puts candidate A above candidate B, then that ordering between A and B should remain the same if one or more voters change their minds about some third candidate C.

All right, you say, so what? Why beat about the bush (not George, this time) stating -- in the words of Basil Fawlty (John Cleese) -- the bleeding obvious? Here's why. Arrow proved that there is only one vote-tallying system that satisfies those two seemingly innocuous, and eminently desirable, requirements: One person is appointed as a dictator and he or she rules absolutely. And that's as far away from the idea of democracy as you can get! In other words, if it's democracy you want, there is no fair way to tally the votes in an election.

I should stress that Arrow's theorem doesn't just say that none of the tallying systems that have been devised so far is fair. Arrow proved that no fair system can possibly exist. Period.

Thus, the best we can hope for is to pick the best of a range of imperfect election tallying systems. But how do we make that choice? Things might not be so bad if mathematicians themselves agreed which system is best. Unfortunately, pretty well the only thing everyone does agree on is that the present system -- plurality voting -- is the worst, and any of the other systems described here would do a better job of representing the preferences of the electorate.

Lest I have given the impression that the single transferable vote and the Borda count are without their problems, let me rectify that misapprehension right away. One worrying problem with the single transferable vote is that if some voters increase their evaluation of a particular candidate and raise him or her in their rankings, the result can be -- paradoxically -- that the candidate actually does worse! For example, consider an election in which there are four candidates, A, B, C, D, and 21 electors. Initially, the electors rank the candidates like this:

7 voters rank: A B C D
6 voters rank: B A C D
5 voters rank: C B A D
3 voters rank: D C B A

In the first round of the tally, the candidate with the fewest first-place votes is eliminated, namely D. After D's votes have been redistributed, the following ranking results:

7 voters rank: A B C
6 voters rank: B A C
5 + 3 = 8 voters rank: C B A

Then B is eliminated, leading to the new ranking:

7 + 6 = 13 voters rank: A C
8 voters rank: C A

Thus A wins the election.

Now suppose that the 3 voters who originally ranked the candidates D C B A change their mind about A, moving him from their last place choice to their first place: A D C B. These voters do not change their evaluation of the other three candidates, nor do any of the other voters change their rankings of any of the candidates. But when the votes are tallied this time, the end result is that B wins. (If you don't believe this, just work through the tally process one round at a time. The first round eliminates D, the second round eliminates C, and the final result is that 10 voters prefer A to B and 11 voters prefer B to A.)
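
If working through the rounds by hand does not appeal, a short Python sketch will do it. The function name `winner_by_elimination` and the use of strings such as "ABCD" for rankings are conveniences of this illustration, and ties for last place (none arise here) would be broken alphabetically.

```python
def winner_by_elimination(profile):
    """Drop the candidate with the fewest top-choice votes until one remains."""
    remaining = sorted(set(profile[0][1]))
    while len(remaining) > 1:
        counts = {c: 0 for c in remaining}
        for voters, ranking in profile:
            # Each ballot counts for its highest-ranked surviving candidate.
            counts[next(c for c in ranking if c in remaining)] += voters
        remaining.remove(min(remaining, key=counts.get))
    return remaining[0]

# (number of voters, ranking from most to least preferred)
before = [(7, "ABCD"), (6, "BACD"), (5, "CBAD"), (3, "DCBA")]
# The last group now ranks A first instead of last; nothing else changes.
after = [(7, "ABCD"), (6, "BACD"), (5, "CBAD"), (3, "ADCB")]

print(winner_by_elimination(before))  # A
print(winner_by_elimination(after))   # B
```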

For all the advantages offered by the single transferable vote system, the fact that a candidate can actually harm her chances by increasing her voter appeal -- to the point of losing an election that she would otherwise have won -- leads some mathematicians to conclude that the method should not be used.

The Borda count has at least two weaknesses. First, it is easy for blocks of voters to manipulate the outcome. For example, suppose there are 3 candidates A, B, C and 5 electors, who initially rank the candidates:

3 voters rank: A B C
2 voters rank: B C A

The Borda count for this ranking is as follows:

A: 3x3 + 2x1 = 11
B: 3x2 + 2x3 = 12
C: 3x1 + 2x2 = 7

Thus, B wins. Suppose now that A's supporters realize what is likely to happen and deliberately change their ranking from A B C to A C B. The Borda count then changes to:

A: 3x3 + 2x1 = 11
B: 3x1 + 2x3 = 9
C: 3x2 + 2x2 = 10

This time, A wins. By putting B lower on their lists, A's supporters are able to deprive B of the victory B would otherwise have had.

Of course, almost any method is subject to strategic voting by a sophisticated electorate, and Borda himself acknowledged that his system was particularly vulnerable, commenting: "My scheme is intended only for honest men." Somewhat more worrying to the student of electoral math is the fact that the entry of an additional candidate into the race can dramatically alter the final rankings, even if that additional candidate has no chance of winning, and even if none of the voters changes their rankings of the original candidates. For example, suppose that there are 3 candidates, A, B, C, in an election with 7 voters. The voters rank the candidates as follows:

3 voters rank: C B A
2 voters rank: A C B
2 voters rank: B A C

The Borda count for this ranking is:

A: 13; B: 14; C: 15.

Thus, the candidates' final ranking is C B A. Now candidate X enters the race, and the voters' ranking becomes:

3 voters rank: C B A X
2 voters rank: A X C B
2 voters rank: B A X C

The new Borda count is:

A: 20; B: 19; C: 18; X: 13.
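
Both sets of totals, before and after X enters, can be checked with a brief Python sketch. The helpers `borda_scores` and `final_ranking` are names chosen for this illustration, and rankings are written as strings purely for brevity.

```python
def borda_scores(profile):
    """Give n points for first place, n-1 for second, ..., 1 for last."""
    scores = {}
    for voters, ranking in profile:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + voters * (n - position)
    return scores

def final_ranking(profile):
    """Candidates ordered from highest to lowest Borda score."""
    scores = borda_scores(profile)
    return sorted(scores, key=scores.get, reverse=True)

# (number of voters, ranking from most to least preferred)
without_x = [(3, "CBA"), (2, "ACB"), (2, "BAC")]
with_x = [(3, "CBAX"), (2, "AXCB"), (2, "BAXC")]

print(borda_scores(without_x))  # {'C': 15, 'B': 14, 'A': 13}
print(final_ranking(without_x)) # ['C', 'B', 'A']
print(borda_scores(with_x))     # {'C': 18, 'B': 19, 'A': 20, 'X': 13}
print(final_ranking(with_x))    # ['A', 'B', 'C', 'X']
```

Note that every voter keeps A, B, and C in the same relative order in both profiles; only the position of X differs.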

Thus, the entry of the losing candidate X into the race has completely reversed the ranking of A, B, and C, giving the result A B C X. With even seemingly "sophisticated" vote-tallying methods having such drawbacks, how are we to decide which is the best method? Of course, the democratic way to settle the matter would be to vote on the available systems. But then, how do we tally the votes of that election?

When it comes to elections, it seems that even the math used to count the votes is subject to debate!

Finally, what of the luckless John McCain, who dropped out of the presidential race after he lost the Californian Primary? Unlike the hypothetical examples we've looked at so far, in the case of the Californian Primary, we have real data to look at -- the Sacramento Bee conducted an exit poll of voters. This showed that Californian voters would have voted 48 to 43 for McCain over Gore in a two-candidate presidential race, and would have voted 51 to 43 in favor of Gore in a two-candidate presidential battle against Bush. The newspaper did not ask voters how they would have voted in a McCain-Bush presidential race, since that was never on the cards, of course. However, the official polls showed that Republicans split 60-35 in favor of Bush, while the registered Democrats who voted Republican split 64-31 in favor of McCain. If we assume that the entire Democratic party would have split that way, then we conclude that in a McCain-Bush presidential race, the vote would go 50-45 in McCain's favor. Based on these data, if the Californian votes had been tallied using the Borda count, then McCain would have got 48 + 50 = 98 points; Gore 43 + 51 = 94; and Bush 45 + 43 = 88. In other words, McCain would have won!
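
The point totals here come from adding up each candidate's share in his two head-to-head contests. A minimal Python sketch of that arithmetic follows; the dictionary `pairwise_share` is an illustrative structure holding only the percentages quoted above, including the assumed 50-45 split for McCain against Bush.

```python
# (candidate, opponent) -> candidate's percentage in that two-way race,
# using the figures quoted in the text.
pairwise_share = {
    ("McCain", "Gore"): 48, ("Gore", "McCain"): 43,
    ("Gore", "Bush"): 51,   ("Bush", "Gore"): 43,
    ("McCain", "Bush"): 50, ("Bush", "McCain"): 45,  # assumed split
}

points = {}
for (candidate, _opponent), share in pairwise_share.items():
    # A candidate's total is the sum of his shares across both contests.
    points[candidate] = points.get(candidate, 0) + share

print(points)                       # {'McCain': 98, 'Gore': 94, 'Bush': 88}
print(max(points, key=points.get))  # McCain
```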

References: Alan D. Taylor, Mathematics and Politics: Strategy, Voting, Power and Proof, Springer-Verlag (1995). The McCain example was described by Dana Mackenzie in an article in the October issue of SIAM News.

Devlin's Angle is updated at the beginning of each month.

Keith Devlin (devlin@stmarys-ca.edu) is Dean of Science at Saint Mary's College of California, in Moraga, California, and a Senior Researcher at Stanford University. His latest book is The Math Gene: How Mathematical Thinking Evolved and Why Numbers Are Like Gossip, published by Basic Books.