First, the news clips. At The New Yorker, John Cassidy digs further into the question of data journalism vs. data punditry, and cites PEC favorably. He thinks data journalism is at its best when it isn’t trying to make predictions, but helps us understand what is happening now. I mostly agree with that, though I do think future predictions can be useful if they are transparent and put the assumptions on the table where we can see them.
Also, some long perspectives from Scott Lemieux at the New Republic and from me at the Daily News on Trump’s long odds. Based on a current margin of Clinton +7%, I put Clinton’s poll-based November win probability* at 70%. That’s not taking into account my observation yesterday that Trump’s ascent shows that “The Republican Party is broken. It probably broke slowly, from 1994 to 2014.” This is addressed in part by a long analysis piece by Patrick Healy and Jonathan Martin in today’s NYT.
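For readers wondering how a polling margin can turn into a win probability, here is a minimal sketch of one common approach. It is not necessarily the calculation behind the 70% figure above. It treats the November margin as today's margin plus normally distributed drift. The function name and the drift value are assumptions for illustration only; the drift was chosen simply because it reproduces roughly 70% from a 7-point lead.

```python
from statistics import NormalDist

def win_probability(margin_pct, drift_sd_pct):
    """Probability that the final margin is above zero, assuming the final
    margin is Normal(margin_pct, drift_sd_pct). Both arguments are in
    percentage points."""
    return 1.0 - NormalDist(mu=margin_pct, sigma=drift_sd_pct).cdf(0.0)

# ASSUMPTION: hypothetical drift between now and November, in percentage
# points. It is not taken from the post; it is picked so that a +7 margin
# maps to roughly 70%.
ASSUMED_DRIFT_SD = 13.35

p = win_probability(7.0, ASSUMED_DRIFT_SD)
print(f"Win probability for a +7 margin: {p:.0%}")
```

The point of the sketch is that the probability depends as much on the assumed drift as on the margin itself: a smaller drift would turn the same 7-point lead into a much higher probability.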
Now let us turn to a recent offering by FiveThirtyEight. They gave income information on Trump voters (which is good data journalism practice!) – and then created a false impression that Trump voters are well-off (which is questionable data punditry). Let me explain.
First let me praise FiveThirtyEight for showing the data, which revealed the problem with their headline. I think that is good practice on their part.
Now to the claim.
Based on exit polls, Trump voters have a median income of $72,000, above the national median. Therefore, writes FiveThirtyEight in its headline and lede, these voters are more affluent than the press would have you believe. This claim has been picked up by USA TODAY and Money magazine. In the case of USA TODAY, the original analysis has been mangled somewhat, which can happen when the original article has a misleading headline.
However, the exit-poll statistics describe Trump voters from Republican primaries. Republican primary voters are not representative of all voters. For example, they are better off than Democratic primary voters; Trump, Cruz, and Kasich voters all have this characteristic. So the $72,000 figure is consistent with “Republican primary voters are better-off than Democratic primary voters.” Which is not news.
The same data allow a within-group comparison, which tells a different story. Trump voters have a slightly lower median income than Cruz supporters…and a much lower one than Kasich supporters. I should point out that even this measure is hard to interpret, since each group of voters contains a mix of incomes. Still, these numbers support the idea that Trump voters tend to have lower incomes than Republican primary voters as a whole.
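The baseline problem can be made concrete with a small sketch. Only the $72,000 figure for Trump voters comes from the exit polls discussed above. The national median and the Cruz and Kasich values are hypothetical placeholders, chosen only to match the qualitative pattern described here, and the function names are my own.

```python
def gaps_vs_baseline(group_medians, baseline):
    """Return each group's median income minus a chosen baseline."""
    return {group: med - baseline for group, med in group_medians.items()}

def rank_within_group(group_medians):
    """Return group names ordered from lowest to highest median income."""
    return sorted(group_medians, key=group_medians.get)

# Trump: figure quoted above. Cruz and Kasich: HYPOTHETICAL placeholders
# ("slightly higher" and "a lot higher"), not real exit-poll values.
gop_primary_medians = {"Trump": 72_000, "Cruz": 73_000, "Kasich": 90_000}

# HYPOTHETICAL national median, used only to show the sign of the gap.
ASSUMED_NATIONAL_MEDIAN = 56_000

vs_nation = gaps_vs_baseline(gop_primary_medians, ASSUMED_NATIONAL_MEDIAN)
ordering = rank_within_group(gop_primary_medians)

print("Gap vs. national median:", vs_nation)
print("Lowest to highest within GOP primary voters:", ordering)
```

Against the national baseline, every Republican group looks well-off. Within the Republican primary electorate, Trump voters sit at the bottom. The same number supports opposite headlines depending on the comparison group.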
Another problem with the income analysis is that Trump supporters and Cruz supporters differ in income by only $1,000. This is a very small difference. There are far better differentiators among the different types of voters. Here are two.
First, see this excellent piece by The Upshot’s Neil Irwin and Josh Katz, “The Geography of Trumpism.” Trump support is strongest in counties where voters have less education, work in old-economy jobs, and, when asked about their ethnicity, say “I’m an American.” Irwin and Katz get their conclusions from a correlational analysis of many demographic variables. They are telling a deeper story with data than what the $72,000 income statistic appears to tell.
Another way to get a better picture comes from Google-wide Association Studies. Recall that Google search terms did better than polls in predicting primary outcomes. Those search terms tell a story that is far closer to The Upshot’s account than to FiveThirtyEight’s.
These search terms, to the extent that we can interpret them, point toward Kasich voters being the most affluent of the three groups of Republican voters. This is consistent with the exit-poll data when viewed more broadly.