20.11.15

Putting the cart before the horse?


Forecasting in the social sciences is one of the most difficult and unsatisfying endeavors one can imagine, and probably the most challenging exercise of all is forecasting electoral results… Well, this is the basic purpose of our webpage FiveFiftyEight. This blog keeps an updated record of the electoral prospects of the political parties during the 2015 Spanish General Election campaign and describes the methodology behind our calculations.

The name of the webpage refers to FiveThirtyEight, the popular blog and forecasting model developed by Nate Silver. Silver has developed a highly successful methodology for forecasting the results of US elections: he correctly predicted the outcome in 49 of the 50 states in the 2008 Presidential election, and in 2012 he called every state correctly, even though most of the media claimed that the race was tied. Nate Silver is also the author of the best seller “The Signal and the Noise”.

Our methodology shares many features with Silver’s method, but also with other statistical approaches to electoral prediction such as Votamatic. The method rests on two distinctive elements: what we will call the fundamental model, and a method to aggregate polls. The final estimate synthesizes the two. Both components are needed because polls by themselves, even when averaged, are biased and less informative than one would hope (as shown recently by the forecast failures in the UK general election and in Argentina). On the one hand, the fundamental model represents voting behavior and its evolution over time. On the other hand, the polls reflect voting intentions. They are aggregated with weights that depend on many factors and, especially, on the forecast record of each pollster. Like Silver’s, our approach is probabilistic. We use Bayesian hierarchical models to synthesize data at the provincial, regional and national levels with the polls, and Bayesian updating to refresh the estimate whenever new information (for instance, a new poll) becomes available.
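To make the idea of weighting polls by pollster record concrete, here is a minimal sketch in Python. It is only an illustration and not the aggregation method used on this site: it assumes that each poll's weight is the inverse of its total error variance, taken as the sampling variance plus the square of a hypothetical historical error for the pollster. The field names (`share`, `sample_size`, `pollster_rmse`) and all the numbers are invented for the example.

```python
def poll_weight(share, sample_size, pollster_rmse):
    """Inverse-variance weight for one poll (illustrative assumption).

    share         -- estimated vote share as a fraction, e.g. 0.27
    sample_size   -- number of respondents
    pollster_rmse -- hypothetical historical error of the pollster, as a fraction
    """
    sampling_var = share * (1.0 - share) / sample_size
    return 1.0 / (sampling_var + pollster_rmse ** 2)


def aggregate_polls(polls):
    """Weighted average of poll shares using the weights above."""
    weights = [
        poll_weight(p["share"], p["sample_size"], p["pollster_rmse"])
        for p in polls
    ]
    weighted_sum = sum(w * p["share"] for w, p in zip(weights, polls))
    return weighted_sum / sum(weights)


# Made-up polls for a single party: the second pollster has a worse record,
# so its estimate pulls the average less.
example_polls = [
    {"share": 0.20, "sample_size": 1000, "pollster_rmse": 0.01},
    {"share": 0.30, "sample_size": 1000, "pollster_rmse": 0.05},
]
print(round(aggregate_polls(example_polls), 4))
```

A real aggregator would also have to account for the age of each poll, house effects and the fact that party shares must sum to one; none of that is attempted here.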

However, the specific methodology we use to perform these tasks is quite different from Silver’s approach. We do not apply ad hoc corrections to the statistical model. Instead, our method itself provides the weights used to mix the forecast of the fundamental model with the aggregated poll results (what Silver refers to as the “now-cast” or “snapshot”). Those weights change over time, depending on the information content of the fundamental model and of each poll update. There are many other differences with respect to Silver’s methodology that we will uncover in future posts.
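A textbook example of weights that come out of the model rather than being set by hand is the normal-normal Bayesian update, sketched below in Python. It is a deliberately simplified stand-in, not the hierarchical model described in this post: it assumes a single party, a normally distributed forecast from the fundamental model acting as the prior, and a normally distributed poll aggregate acting as the new observation. The function name `bayes_update` and every number are hypothetical.

```python
def bayes_update(prior_mean, prior_var, obs_mean, obs_var):
    """Combine a prior and an observation, both assumed normal.

    Returns the posterior mean, the posterior variance and the weight
    placed on the observation. The weight is not chosen by hand: it is
    the observation's share of the total precision (1 / variance).
    """
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    total_precision = prior_precision + obs_precision
    weight_on_obs = obs_precision / total_precision
    post_mean = (1.0 - weight_on_obs) * prior_mean + weight_on_obs * obs_mean
    post_var = 1.0 / total_precision
    return post_mean, post_var, weight_on_obs


# Start from a made-up fundamental-model forecast of 25% (sd 3 points)
# and update it as two made-up poll aggregates arrive.
mean, var = 0.25, 0.03 ** 2
for poll_mean, poll_sd in [(0.28, 0.02), (0.29, 0.02)]:
    mean, var, weight = bayes_update(mean, var, poll_mean, poll_sd ** 2)
    print(f"weight on new poll: {weight:.2f}, updated share: {mean:.3f}")
```

In this toy setting the weight on each new poll shrinks as the running estimate becomes more precise, which is one simple way in which weights can change over time with the information content of each source.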

In any case, there are enormous differences between forecasting US Presidential elections and predicting Congressional elections in Spain. In US Presidential elections, constituencies are typically contested by only two parties (the Democrats and the Republicans) in a “winner take all” system for the Electoral College. In the Spanish Congressional election there are many parties, and seats are allocated using D’Hondt’s system. There are also important differences in the data available for prediction. In the US a long time series of electoral results is available, whereas in Spain there are only a handful of elections. As if this were not difficult enough, the 2015 Congressional election poses an additional problem: new political parties (C's and Podemos) with substantial support. This means that the already short time series generated since the return of democracy is, in practice, even shorter. Finally, in the US there is at least one very large poll that provides representative results at the level of electoral districts. Unfortunately, this is not the case in Spain, which makes our combination of national polls with a fundamental model at the province level all the more relevant.
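For readers unfamiliar with D’Hondt’s system, the following Python sketch shows how it turns votes into seats within a single constituency. It is a minimal illustration under simplifying assumptions: the party names and vote counts are made up, legal vote thresholds are ignored, and ties are broken simply by the order in which the parties are listed.

```python
def dhondt(votes, seats):
    """Allocate seats in one constituency with the D'Hondt method.

    votes -- dict mapping party name to vote count
    seats -- number of seats to distribute
    Each seat goes to the party with the highest quotient
    votes / (seats already won + 1).
    """
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


# Made-up constituency with four parties and eight seats.
example_votes = {"A": 100_000, "B": 80_000, "C": 30_000, "D": 20_000}
print(dhondt(example_votes, 8))  # {'A': 4, 'B': 3, 'C': 1, 'D': 0}
```

In the example, party D wins almost 9% of the vote and no seat, while party A takes half the seats with about 43% of the vote. Small shifts in vote share can therefore move seats in ways that a two-party, winner-take-all forecast never has to consider.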

The reader may think that we are just providing excuses in advance in case our forecast is not successful. It may look like putting the cart before the horse or, as we say in Spain, “ponerse la venda antes de la herida” (literally, “putting on the bandage before the wound”). This is not the case. We just wish to give readers some understanding of the significant challenges of forecasting electoral results, and of the additional, and to an extent unique, challenges of doing so in the context of the Spanish Congressional election of 2015.