Case Study 2: The 1948 Presidential Election

Soon after the 1936 fiasco, the Literary Digest went out of the business of polling. In fact, they went out of business altogether. At the same time, the practice of using public opinion polls to measure the pulse of the American electorate was thriving. By 1948 there were several major polls competing for the big prize, that of accurately predicting the outcome of presidential elections. The best known was the Gallup poll, and its two main competitors were the Roper poll and the Crossley poll.

By this time, all major polls were using what was believed to be a much more scientific method for choosing their samples, called quota sampling. Quota sampling had been introduced by George Gallup as early as 1935 and had been successfully used by him to predict the winner of the 1936, 1940 and 1944 elections. Quota sampling is nothing more than a systematic effort to force the sample to fit a certain national profile by using quotas: The sample should have so many women, so many men, so many blacks, so many whites, so many under 40, so many over 40, etc. The numbers in each category are chosen so that each group makes up the same proportion of the sample as it does of the electorate at large.
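To make the idea of proportional quotas concrete, here is a minimal sketch in Python. The function name allocate_quotas and the population shares are invented for illustration; they are not Gallup's actual categories or figures. The sketch assumes the shares add up to 1 and uses one common rounding rule (largest remainders) so that the quotas add up to the sample size.

```python
from math import floor

def allocate_quotas(shares, sample_size):
    """Turn population shares into whole-number quotas that sum to sample_size.

    Assumes the shares add up to 1. Leftover slots caused by rounding down
    go to the groups with the largest fractional parts.
    """
    exact = {group: share * sample_size for group, share in shares.items()}
    quotas = {group: floor(x) for group, x in exact.items()}
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for group in by_remainder[:leftover]:
        quotas[group] += 1
    return quotas

# Hypothetical shares of the electorate, made up for this example.
shares = {
    ("women", "under 40"): 0.24,
    ("women", "over 40"): 0.27,
    ("men", "under 40"): 0.23,
    ("men", "over 40"): 0.26,
}

quotas = allocate_quotas(shares, 3250)
for group, count in quotas.items():
    print(group, count)
```

Note that the sketch fixes only how many people of each kind end up in the sample. It says nothing about which particular people are chosen to fill each quota.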

If we assume that every important characteristic of the population is taken into account when setting up the quotas, it is reasonable to expect that quota sampling will produce a good cross-section of the population and therefore lead to accurate predictions. For the 1948 election between Thomas Dewey and Harry Truman, Gallup conducted a poll with a sample size of about 3250. Each individual in the sample was interviewed in person by a professional interviewer to minimize nonresponse bias, and each interviewer was given a very detailed set of quotas to meet. For example, an interviewer could have been given the following quotas: seven white males under 40 living in a rural area, five black males under 40 living in a rural area, six black females under 40 living in a rural area, etc. Other than meeting these quotas, the ultimate choice of who was interviewed was left to each interviewer.

Based on the results of this poll, Gallup predicted a victory for Dewey, the Republican candidate. The predicted breakdown of the vote was 50% for Dewey, 44% for Truman, and 6% for third-party candidates Strom Thurmond and Henry Wallace. The actual results of the election turned out to be almost exactly reversed: 50% for Truman, 45% for Dewey, and 5% for third-party candidates.

Truman's victory was a great surprise to the nation as a whole. So convinced was the Chicago Tribune of Dewey's victory that it went to press on its early edition for November 3, 1948, the morning after the election, with the headline "Dewey Defeats Truman" -- a blunder that led to Truman's famous retort "Ain't the way I heard it." The picture of Truman holding aloft a copy of the Tribune has become part of our national folklore. To pollsters and statisticians, the results of this election were a clear indication that as a method for selecting a representative sample, quota sampling can have some serious flaws.

The basic idea of quota sampling is on the surface a good one: Force the sample to be a representative cross-section of the population by having each important characteristic of the population proportionally represented in the sample. Since income is an important factor in determining how people vote, the sample should have all income groups represented in the same proportion as the population at large. Ditto for sex, race, age, etc. Right away we can see one of the potential problems: Where do we stop? No matter how careful one might be, there is always the possibility that some criterion that would affect the way people vote might be missed and the sample could be deficient in this regard.

An even more serious flaw in the method of quota sampling is the fact that ultimately the choice of who is in the sample is left to the human element. Recall that other than meeting the quotas the interviewers were free to choose whom they interviewed. Looking back over the history of quota sampling, one can see a clear tendency to overestimate the Republican vote. In 1936, using quota sampling, Gallup predicted the Republican candidate would get 44% of the vote, but the actual number was 38%. In 1940 the prediction was 48% and the actual vote was 45%; in 1944 the prediction was 48% and the actual vote was 46%. But in spite of the errors, Gallup was able to predict the winner correctly in 1936, 1940 and 1944. This was merely due to luck -- the spread between the candidates was large enough to cover the error. In 1948 Gallup, and all the other pollsters, simply ran out of luck. It was time to ditch quota sampling.
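The pattern in these numbers is easy to check with a little arithmetic. The short Python sketch below uses only the figures quoted in this case study (the 1948 figures are Gallup's 50% prediction for Dewey against the actual 45%); the variable names are our own.

```python
# Republican share of the vote, in percent: (Gallup prediction, actual result).
# All figures are the ones quoted in this case study.
republican_vote = {
    1936: (44, 38),
    1940: (48, 45),
    1944: (48, 46),
    1948: (50, 45),
}

# Signed error = prediction minus actual; a positive value is an overestimate.
errors = {year: predicted - actual
          for year, (predicted, actual) in republican_vote.items()}

for year, error in errors.items():
    print(f"{year}: Republican vote overestimated by {error} points")

mean_error = sum(errors.values()) / len(errors)
print(f"mean signed error: {mean_error:.1f} points")  # 4.0
```

The errors are 6, 3, 2 and 5 points, and every one of them has the same sign. Errors due to chance alone would be expected to fall on both sides of the true value, so four misses in the same direction point to a systematic bias rather than bad luck in any single year.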

The failure of quota sampling as a method for getting representative samples has a moral: Even with the most carefully laid plans, human intervention in choosing the sample is always subject to bias.
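A small simulation can illustrate how this kind of bias might creep in even when every quota is met exactly. The Python sketch below is purely hypothetical: the voters, the 60% and 40% figures, and the assumption that interviewers gravitate toward people who are easy to reach are all invented for the example and are not taken from the 1948 data.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# One hypothetical quota cell of 10,000 voters. Every number is invented.
# Each voter is a pair (easy_to_reach, votes_republican).
population = (
    [(True, True)] * 2400 + [(True, False)] * 1600      # easy to reach: 60% Republican
    + [(False, True)] * 2400 + [(False, False)] * 3600  # hard to reach: 40% Republican
)

def republican_share(voters):
    """Fraction of the given voters who vote Republican."""
    return sum(1 for _, votes_rep in voters if votes_rep) / len(voters)

true_share = republican_share(population)  # 0.48

# Interviewer's choice: fill the quota of 500 only from easy-to-reach voters.
easy_to_reach = [voter for voter in population if voter[0]]
interviewer_sample = rng.sample(easy_to_reach, 500)

# Chance alone: draw 500 voters at random from the whole cell.
random_sample = rng.sample(population, 500)

print(f"true Republican share:   {true_share:.2f}")
print(f"interviewer-chosen poll: {republican_share(interviewer_sample):.2f}")
print(f"randomly chosen poll:    {republican_share(random_sample):.2f}")
```

Both samples satisfy the same quota, since all 500 people come from the same cell. Under these assumptions the interviewer-chosen sample tends to overstate the Republican share, because the trait that guided the interviewer's choice happens to be related to how people vote, while the random sample tends to land near the true value.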