Statistician with near-perfect election formula says prepare yourself for President Trump

Response: Bill D. Herman

The degree of certainty claimed here is, on its face, absurd. It’s hard to find contests in _any_ arena with a 97% level of certainty in the outcome.
Consider the NBA, where the Golden State Warriors are on track for the best regular season record in league history, and the Philadelphia 76ers are almost the mirror image at 8-51. These are not merely the best/worst teams _this_ year. Decades from now, these teams will still be discussed as among the best/worst _of all time_. These teams played on January 30, and the money line put the odds of a Warriors win at about 97%.

This was for a game with very well-known rules, a very controlled environment, and two countervailing historical outliers creating a mismatch of epic proportions. It’s also a highly knowable environment with good, highly precise quantitative data. The sample sizes range from quite good (games played per team by that point in the season, career stats and trajectories for individual players) to pretty darned large, at least by pre-“big data” standards (every NBA game ever played). Add all of that up, and you have a future event that really can be predicted with 97% accuracy.
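For anyone who wants to check how a money line turns into a percentage, here is a minimal sketch of the standard conversion for American odds. The `-3233` line is a hypothetical value picked only because it works out to roughly 97%; it is not the actual line from that game, and the calculation ignores the bookmaker’s margin.

```python
def implied_probability(moneyline: float) -> float:
    """Convert an American money line to the win probability it implies.

    Ignores the bookmaker's built-in margin, so treat the result as approximate.
    """
    if moneyline < 0:
        # Favorite: you risk |moneyline| to win 100
        return -moneyline / (-moneyline + 100)
    # Underdog: you risk 100 to win `moneyline`
    return 100 / (moneyline + 100)


# Hypothetical line, chosen only because it implies roughly 97%
heavy_favorite = implied_probability(-3233)
print(f"Implied win probability: {heavy_favorite:.1%}")
```

A line that lopsided means risking more than $3,000 to win $100, which gives a sense of how rare a 97% favorite really is.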

NONE of these conditions hold in a presidential election. Our entire database of events (past presidential elections) numbers in the dozens. The rules (formal and informal) change at least a little every time, and many of them (e.g., the Republican candidate must be strongly pro-life) seem like they won’t apply or matter as much this time.
“Previous presidential candidates like Trump” is an empty set, so you can’t model it. We’ve never had the voting public of one party refuse to be corralled by party leadership to anything like this extent in my lifetime — which is roughly the era of modern funding rules.

What happens when the existing political leadership in a party splits on “their” candidate — or as may happen, actively undermines him? We don’t know because it’s never happened. The sample size for that variable, at least, is zero. His model is based on whether or not there are hotly contested primaries of the incumbent party, but (and I’ll grant that this usually holds) contested primaries in the challenger party are normal and healthy. Does the 2016 Republican primary look normal and healthy to the party?
What happens if Bloomberg runs? (Viable third-party candidate, modern rules: Count those events on one hand.) How about if Trump is subpoenaed and forced to talk in open court about how Trump University was a big scam? That’s also not in the model.
Certainty increases as sample sizes increase, but his model only goes back to 1912. That’s a couple dozen elections. The margin of error on any study with an n of 25 is so high that no reasonable claim of certainty is possible. Once we have other valid data with much larger sample sizes — repeated polling in Ohio in September, say — then talk to me about relative certainty.
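To put a rough number on that, here is a back-of-the-envelope sketch using the textbook normal-approximation margin of error for a proportion. It assumes each election is an independent yes/no observation, which is my simplification and not how his model works, and the approximation itself is shaky at an n this small. The function name and the sample sizes compared are just for illustration.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion observed in n trials.

    Uses the normal approximation; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Compare a couple dozen elections with a typical poll-sized sample
for n in (25, 1000):
    print(f"n = {n}: +/- {margin_of_error(n):.1%}")
```

At n = 25 the worst-case margin is close to 20 points in either direction; at n = 1,000 it shrinks to about 3.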
For comparison: Nate Silver may be too modest in his error estimates, but he only gives Clinton a 98% chance of beating Bernie Sanders in _Arkansas_ on Tuesday. (You know, the first place she got to be the First Lady?) With about 3 weeks before the 2008 election, he still gave John McCain a 5.9% chance of winning.
By that point, our financial system was toast — which played substantially into Obama’s hand. We had tons of good, recent polls about these two specific candidates after all the debates and most of the ads had come and gone, and McCain was repeatedly down by about 7 points. But one candidate in the 20th Century (Reagan) came back despite being behind by a similar margin in a later poll (let’s not talk about why), so there was still a chance for McCain, but it was slim indeed. There was still time for stuff to happen that could change the outcome.
Again, none of these conditions hold for the 2016 presidential election as of now. The economy is growing just slowly enough that it’s more of a Rorschach test than a huge factor for either party. Much more importantly, we have front runners but not yet nominees. The conventions and debates still have to happen. Almost a full year of news and political ads and sound bites and ground game moves still have to happen. Which is why one-on-one polls about hypothetical general election match-ups are pretty much useless right now.
There are far, far too many variables in play (all of the above) that are simply unknown to assign anything like 97% certainty to any model right now.
Anybody who tells you they’re 97% certain about November is full of crap. The statistician has published some solid work on statistics and political science, I’ll give you that, but he’s clearly drunk his own Kool-Aid on this one, and the press is all too happy to drink it right along with him.