23.11.15

Aggregating polls: a tricky business (Part I)

The second component of our methodology is a model to aggregate the results of national polls. Hundreds of polls are conducted between two elections. We use all publicly available polls, plus some that are not publicly available, conducted by pollsters for companies that maintain their own private electoral observatories. However, we do not include polls conducted by political parties, since the house effect (see below for an explanation of this term) is likely to be very large. Figure 1 shows the polls conducted since the Congressional Election of 2011.

Figure 1.




The task of aggregating polls is more complicated than it looks. The simplest possibility would be just to average the polls from the latest period (one week, two weeks, one month). This local averaging, known as a moving average, might be carried out using overlapping or non-overlapping windows of time. Prediction can then be done under the assumption that public opinion will not change between that period and election day (or, probably more sensibly, using some time series prediction technique). This approach is followed, for instance, by Wikipedia and the mass media. The estimate in Wikipedia is based on a moving average of 15 days. El Mundo allows visitors to choose the time window. We have collected a list of close to 350 polls conducted since the Congressional Election of 2011. Figure 2 shows the effect of smoothing using moving averages.

Figure 2. 
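
To make the idea of a moving average concrete, here is a minimal Python sketch of a trailing (overlapping) window. It is only an illustration, not the code behind Figure 2: the function name `moving_average`, the 15-day default (borrowed from the Wikipedia example above) and the four vote shares are all assumptions invented for the example.

```python
from datetime import date, timedelta


def moving_average(polls, window_days=15):
    """Trailing-window mean of poll results.

    `polls` is a list of (date, vote_share) pairs. For each poll date,
    we average every poll whose date falls within the preceding
    `window_days` days (the poll's own date included).
    """
    polls = sorted(polls)  # order by date
    window = timedelta(days=window_days)
    smoothed = []
    for day, _ in polls:
        in_window = [share for d, share in polls if day - window < d <= day]
        smoothed.append((day, sum(in_window) / len(in_window)))
    return smoothed


# Hypothetical vote shares (%) for a single party; NOT real poll data.
polls = [
    (date(2015, 10, 1), 28.0),
    (date(2015, 10, 8), 30.0),
    (date(2015, 10, 12), 26.0),
    (date(2015, 11, 2), 29.0),
]

smoothed = moving_average(polls, window_days=15)
for day, value in smoothed:
    print(day.isoformat(), round(value, 2))
```

Note that the last poll stands alone in its window, so its "average" is just its own value: with short windows and sparse polling, the smoothed series can be as noisy as the raw one.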




Using unweighted averages means that all the polls in the reference period carry the same weight. In other words, it assumes that the quality of the polls in the selected time window is the same regardless of the pollster (“cooking” abilities may be very different among pollsters), the sample size and margin of error, or the survey method (CATI with landlines, mobiles or both, personal interviews, etc.). These are unlikely to be reasonable assumptions. In addition, to convert a poll average into a forecast you need to correct for the house effect, that is, the tendency of a given pollster to produce numbers that systematically favor one party. We will tackle these issues in the next post.
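
As a small illustration of why equal weights can matter, the sketch below compares an unweighted mean with a mean weighted by sample size, and computes the textbook 95% margin of error for a proportion under simple random sampling. This is not our aggregation model; the function names, the three polls and the choice of sample size as the weight are assumptions made only for this example.

```python
import math


def margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a
    vote share (in %) from a simple random sample of size n."""
    p = share / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


def unweighted_mean(polls):
    """Every poll counts the same, whatever its sample size."""
    return sum(share for share, _ in polls) / len(polls)


def sample_size_weighted_mean(polls):
    """Each poll counts in proportion to its sample size
    (one possible weighting, assumed here for illustration)."""
    total_n = sum(n for _, n in polls)
    return sum(share * n for share, n in polls) / total_n


# Hypothetical (vote share in %, sample size) pairs; NOT real poll data.
polls = [(30.0, 500), (26.0, 2000), (27.0, 1500)]

print(round(unweighted_mean(polls), 2))
print(round(sample_size_weighted_mean(polls), 2))
print(round(margin_of_error(30.0, 500), 2))
```

In this made-up example the small poll of 500 respondents pulls the unweighted mean upwards, even though its margin of error is about twice that of the poll with 2000 respondents.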