Please excuse the brevity . . . ;-)
Let me start by saying that I know who Nate Silver is. Anybody interested in Sabermetrics knows who Nate Silver is, and the guy is very bright. He was also recently on the Daily Show and was an interesting guest . . . though I will say he confirms too many stereotypes about statisticians. ;-) I think Nate made a mistake by getting into election predictions, and especially made a mistake in thinking he could use an amalgamation of state and national polls to get something like the predictive power he was able to achieve in his work on sports statistics. The very first thing that enters my brain when I go to his site and peek around is that there is a seriously convincing illusion of precision. It's convincing because Silver is very smart; it's an illusion because his intelligence doesn't make the data work for him the way he wishes it would.
Now onto the details. I'm starting with his most recent blog post, which I assume is the one Beach and GC were referring to. I will say that his commitment to maximizing use of all available data is commendable (yet not surprising for a metrics guy) and is not common among people who write about politics and elections. All you have to do is read this board to see that people cling to data that confirm their bias and are dismissive of data that don't. You at least get the impression that Silver is looking at everything . . . for a while.
He goes on to talk about how the state polls don't seem to jibe with the national polls. He uses Georgia and Texas as primary examples indicating that it would be hard to get a national "lead" for Obama if we look at the state polls. Disappointingly, he never discusses the accuracy of the state polls (both of these have error margins greater than 4%), nor whether the methodology is consistent among state polls (it isn't) or consistent with national polls (it isn't). Read, if you will, the methodology for the state polling by the Texas Tribune poll. Without shifting criticism too deeply toward this one poll, you can get a very clear picture of how the methodology differs from other polls via the use of matching, which has its advantages but can introduce a bias not present in simpler snapshot polls. The methodology for the Georgia poll is not spelled out in detail, but it does not use matching. No mention of weighting is given with the poll results, and SurveyUSA's Methodology page gives no more details on the sampling and weighting schemes. (This is a problem in itself, by the way: unclear methodologies across the various polling sources.)
Eventually Silver comes out and says that he believes the state polls to be a better measure:
With that said, our decision to cast our lot mostly with the state polls is not arbitrary. In recent years, they've been a slightly more unbiased indicator of how the election will play out.
. . .
In recent elections - since state polling data became more robust - it's the state polls that have done a bit better. This was especially so in 1996, when national polls implied a double-digit victory for Bill Clinton over Bob Dole (and Ross Perot) but state polls were more in line with the single-digit victory that he actually achieved. In 2000, state polls provided an accurate portrayal of a too-close-to-call race, while national polls missed high on George W. Bush vs. Al Gore.
There have been other years like 1992 in which the national polls did a bit better. But on average since that year, the state polls have had a bias of 1.1 percentage points - half as much as the national polls, which have had a 2.1-point bias instead.
His decision that state polls are less biased is the result of looking at ten elections, of which six showed state polls to be more biased. Wait, what? Reread that if you like. He adds that the more recent polls have been more robust, which is a reasonable argument, and that three of the last five have shown more bias in national polls. So we're basing the choice of using state polls over national on the fact that in just over half of our sample of FIVE recent elections, the state polls have shown less bias. I probably don't need to tell you that his decision here is not a statistical one. There aren't enough data points to come to statistical conclusions and the observed difference is so marginal that I can guarantee he can't show statistically that the state polls have less bias. So we are left with this feeling that he has made a choice of one set of polls over another . . . I don't want to presume that he did this because those polls confirm his own bias, but the lack of a good technical reason for doing so makes you wonder.
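To put some numbers on that last point, here's a quick sign test, my own back-of-the-envelope check rather than anything Silver published. If state and national polls were equally likely to come out less biased in any given election, then seeing state polls "win" in 3 of the last 5 is exactly a coin-flip outcome:

```python
from math import comb

# One-sided sign test: probability that state polls beat national polls
# in k or more of n elections, if each election were a fair coin flip
# (i.e., neither set of polls is actually less biased).
def sign_test_p(k, n):
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p = sign_test_p(3, 5)  # state polls less biased in 3 of the last 5
print(p)  # 0.5 -- nowhere near any conventional significance threshold
```

A p-value of 0.5 is as uninformative as it gets; you'd need a much longer winning streak, or far more elections, before "state polls are less biased" becomes a statistical conclusion rather than a preference.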
My next observation is more general but is probably the most important: there is very little precision in election polls, no matter how carefully and cleverly they are designed and executed. Surveys that don't need to make any presumptions about the honesty or intentions of the respondents are hard enough (and quite expensive) to do accurately. You'll note that we very often talk about "likely voters," and the pollsters feel that is the most accurate predictor of what will happen when ballots are cast; and it would be, if we were getting an accurate assessment of who was really likely to vote. But if you read the methodologies (yes, plural; these vary as much as the statistical methodologies) for determining which respondents are "likely," you'll find that they do rely on people being accurate and honest about their sincerity and intentions. This is significant when you're talking about polls that separate candidates by only a few percentage points, especially ones within the margin of error. So what we're looking at in a poll that advertises an error margin of +/- 4% may actually be a larger true error margin, and possibly one that is asymmetrical due to bias. In the end it just means that the best of intentions don't necessarily result in polls with high predictive power.
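To see why that advertised +/- 4% is a floor rather than the whole story, here's the textbook margin-of-error arithmetic (a standard sketch, not any particular pollster's formula). The published figure covers sampling error only, and it assumes the sample is a true random draw from actual voters, which is exactly what a likely-voter screen can't guarantee:

```python
from math import sqrt

# 95% margin of error for a simple random sample, worst case p = 0.5.
# This is the number polls advertise; any bias from misclassifying
# "likely" voters sits on top of it and is not captured here.
def margin_of_error(n, p=0.5, z=1.96):
    return z * sqrt(p * (1 - p) / n)

print(round(margin_of_error(600) * 100, 1))   # 4.0 points for n = 600
print(round(margin_of_error(2400) * 100, 1))  # 2.0 -- halving MoE takes 4x the sample
```

Note that the formula is symmetric by construction, so it can't even express the asymmetrical error that a biased likely-voter screen would introduce.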
Silver combines these polls, with their varying methodologies, varying levels of accuracy, and varying propensities for bias, into a model that spits out an overall probability of winning and various other predictions of state results and electoral vote totals. He has to subjectively choose how to weight the polls based on size, history, quality, etc., introducing methodological problems of his own. And this is a guy who's really smart . . . the only dumb thing he does is make the whole thing look more accurate than it is. All that shows is arrogance, and I don't begrudge him that, but unfortunately a lay audience is not going to dig below the surface and is going to accept his authority on the matter because he's smart. We can't help doing this, and that's not Silver's fault.
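To illustrate how much those subjective weights matter, here's a toy poll average; the margins and weights are made up by me and have nothing to do with Silver's actual model. The same three polls produce noticeably different headline numbers depending on whose quality ratings you believe:

```python
# Three hypothetical polls of the same race, each reporting the
# candidate's margin in points (made-up numbers for illustration).
polls = [+2.0, -1.0, +4.0]
flat    = [1.0, 1.0, 1.0]   # treat all pollsters equally
quality = [0.5, 3.0, 1.0]   # subjective ratings: trust pollster #2 most

def weighted_avg(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

print(round(weighted_avg(polls, flat), 2))     # 1.67
print(round(weighted_avg(polls, quality), 2))  # 0.44
```

Same data, same arithmetic, but the aggregate margin moves by more than a point purely on the weighting choice, and in a race separated by a few points that choice is doing a lot of the work.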
It's not that somebody out there is doing this better than Silver (that I know of). It's just that Silver doesn't have this figured out either. He may well be right in the end but I am not convinced by his technical efforts to date.
In the end what will happen will happen and it doesn't matter what Silver thinks. I'd advise you all to not make such a big #badword#ing deal over it and insult one another repeatedly in the process, but I don't think that advice will be heeded.