This is a thread on presidential election forecasting, polls, and fundamentals, based on my research and experience forecasting the 2012 and 2016 elections.

Let's walk through some of the debates and why it gets contentious.
If we had to guess, right now, who's going to win in 2020, it's probably Joe Biden. Why? Because Trump is unpopular, the economy's terrible, and Biden is leading in the polls. https://projects.economist.com/us-2020-forecast/president
But that's a very rough claim. It doesn't tell us how certain we should be, or say anything about which states Biden or Trump will win. It's also not very rigorous -- what if only 1 or 2 of those conditions were true? Then what would we think?
Traditionally, what election forecasters have done is look at past elections and compare some measure of each year's political/economic climate (the "fundamentals") to the election outcome. Then they extrapolate from those correlations to predict the current year's election.
For example, presidents who are more popular in June tend to get more votes in November. When second quarter GDP growth is higher, on average, the candidate from the incumbent president's party does better. https://twitter.com/DrewLinzer/status/1289290422754279424
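To make the idea concrete, here's a minimal sketch of a fundamentals-style forecast: an ordinary least squares fit on a handful of past elections. The growth and vote-share numbers below are made up for illustration, not real election data.

```python
# Toy fundamentals regression: predict the incumbent party's vote share
# from Q2 GDP growth, then extrapolate to the current year.
def ols_fit(xs, ys):
    """Simple one-variable least squares: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical past elections: (Q2 GDP growth %, incumbent-party vote %)
growth = [2.0, -1.0, 3.5, 0.5, 1.2]
vote   = [52.0, 47.5, 54.0, 49.5, 50.5]

a, b = ols_fit(growth, vote)
predicted = a + b * 1.0  # point forecast for a year with 1% Q2 growth
```

With only a handful of "observations" like this, the fitted line is extremely sensitive to which years and which variables you include -- which is exactly the problem described next.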
This sort of analysis is useful from an academic standpoint where researchers are trying to test theories about why people vote the way they do. But from a statistical forecasting perspective, there are major challenges/problems with this approach.
For starters, there's hardly any historical data to extrapolate from -- typically fewer than 20 past presidential elections are considered comparable. With that small sample size, any forecasts are going to be very imprecise.
Second, it's not clear which fundamental factors are the best predictors, and there's not enough data to test or optimize this. A lot depends on the judgment of the analyst. Put simply, if you select different fundamental factors, you will get different forecasts.
In 2015, @benlauderdale and I did our best to address this in a research paper -- built a Bayesian model with lots of inputs, and still concluded: "Until more elections have been observed, our expectations about forthcoming U.S. presidential elections cannot be very strong."
One solution to this forecasting problem is to bring in another source of data: pre-election polls. Then the fundamentals can be used as vaguely informative prior expectations that get updated as more polling becomes available. I wrote about this in JASA https://votamatic.org/wp-content/uploads/2013/07/Linzer-JASA13.pdf
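The basic mechanics can be sketched with a conjugate-Normal update: treat the fundamentals forecast as a prior on a state's vote share, then update it with a poll, weighting each source by its precision. This is a simplification of the full model in the paper, and all numbers are hypothetical.

```python
# Prior-plus-polls sketch: combine a fundamentals prior with a poll
# using the standard conjugate-Normal updating formula.
def normal_update(prior_mean, prior_sd, poll_mean, poll_sd):
    """Precision-weighted average of prior and poll; returns (mean, sd)."""
    w_prior = 1 / prior_sd ** 2
    w_poll = 1 / poll_sd ** 2
    post_mean = (w_prior * prior_mean + w_poll * poll_mean) / (w_prior + w_poll)
    post_sd = (w_prior + w_poll) ** -0.5
    return post_mean, post_sd

# Fundamentals say 51% +/- 3; a poll comes in at 54% +/- 2.
m, s = normal_update(51.0, 3.0, 54.0, 2.0)
```

The posterior lands between the prior and the poll (closer to the more precise source), and its sd is smaller than either input's -- the prior gets "washed out" as more polling accumulates.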
But the polls are no silver bullet either. Polling isn't available every day or in every state, and the polls have errors -- random and systematic. Now forecasters have to "fill gaps" in each state's polling AND determine how much info each poll should contribute to the forecast.
Some analysts try to estimate the direction of the error in the polls, and "correct" for it. With limited data, this is difficult too. The adjustments may seem sensible on their face, but a lot of times they only add complexity and unless done carefully, can make forecasts worse.
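One conservative alternative to guessing the error's direction is to budget for its size: pool polls by precision, but carry a shared systematic-error term that does not shrink as more polls arrive. A minimal sketch of that idea, with hypothetical numbers:

```python
import math

def pool_polls(polls, bias_sd):
    """polls: list of (point estimate, sampling sd). bias_sd is the
    assumed sd of a shared systematic error that does NOT average away."""
    weights = [1 / sd ** 2 for _, sd in polls]
    mean = sum(w * p for w, (p, _) in zip(weights, polls)) / sum(weights)
    sampling_sd = math.sqrt(1 / sum(weights))  # shrinks with more polls
    total_sd = math.sqrt(sampling_sd ** 2 + bias_sd ** 2)  # bias floor stays
    return mean, total_sd

# Three hypothetical polls of the same race, plus a 2-point bias budget.
polls = [(52.0, 3.0), (50.0, 2.5), (53.0, 3.5)]
m, s = pool_polls(polls, bias_sd=2.0)
```

No matter how many polls are added, `total_sd` never drops below `bias_sd` -- the model admits that a shared polling miss can't be averaged away.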
What election forecasters are left with is a bunch of uncertain fundamentals and polling data of uncertain quality.

The wisest approach may simply be to add more uncertainty in both the Democratic and Republican directions, recognizing the weaknesses of all of the evidence.
Forecasters differing on the "right" amount of uncertainty is how we ended up with 2016, where every major forecaster's Electoral Vote forecast was Clinton 323, Trump 215, but the chances of Clinton winning ranged from 71% to 99%. https://www.nytimes.com/interactive/2016/upshot/presidential-polls-forecast.html
Two identical forecasts can come up with very different probabilities of winning, by expanding the uncertainty around each point estimate. https://twitter.com/DrewLinzer/status/804824244748099584
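Here's what that looks like numerically: the same 3-point lead yields very different win probabilities depending on the standard deviation the forecaster places around it (hypothetical numbers, assuming a Normal forecast distribution).

```python
import math

def win_prob(margin, sd):
    """P(margin > 0) under a Normal(margin, sd) forecast distribution."""
    return 0.5 * (1 + math.erf(margin / (sd * math.sqrt(2))))

narrow = win_prob(3.0, 2.0)  # a confident forecaster: ~93% chance
wide = win_prob(3.0, 6.0)    # a cautious forecaster:  ~69% chance
```

Identical point estimates, wildly different headline probabilities -- which is how every major 2016 forecast could show the same electoral map but disagree so much on Clinton's chances.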
Election forecasters aren't oracles; they're analysts with systems for processing data. All else equal, good forecasters devise systems that are empirically defensible, robust to outliers, theoretically well-motivated, honest about uncertainty, and as transparent as possible.
Election forecasting serves a lot of important purposes. We're lucky to have a range of smart people doing it quantitatively, in rigorous but somewhat different ways. For all the challenges and debates, the alternative is pure punditry and guesswork, and that's way worse. /end
You can follow @DrewLinzer.