A Vote For Roger Clemens Was A Vote For Barry Bonds: The Politics Of The Hall Of Fame Ballot, By The Numbers [UPDATE]

Hall of Fame ballots follow their own internal logic. For instance, regardless of how they feel about steroids, almost all voters agree with both or neither of the following statements:

  • Barry Bonds should be in the Hall of Fame.
  • Roger Clemens should be in the Hall of Fame.

Bonds and Clemens aren't similar players, but the essential question about their candidacies is the same: "Should an all-time-great player be elected even if he took steroids?" Voting for both is like saying yes, voting for neither says no, and voting for one and not the other is akin to making a bunch of fart noises.

Thanks to some new data, we can quantify not just the players' performance, but also the relationships among the arguments about their performance. Leokitty assembled a spreadsheet of Hall of Fame ballots publicly released by BBWAA voters over the last five years, allowing us to see who voted for whom instead of just the raw vote totals.

Using that information, I found the relationship between a vote for one player on the ballot and a vote for every other player. You can view the results here.

The values refer to the correlation between the two players: for instance, in 2013, there was a correlation of .97 between voting for Bonds and voting for Clemens, by far the highest on the list. Larry Walker and Jack Morris, at the other end of the spectrum, had a correlation of negative .24. Voters who thought Jack Morris was a Hall of Famer had completely different standards from the voters who thought Larry Walker was one. Other inimical pairings: Don Mattingly and Mike Piazza, and Don Mattingly and Roger Clemens.
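For readers who want to see what a number like that measures, here is a minimal sketch. It assumes a plain Pearson correlation on yes/no votes (the post doesn't name the exact method), and the ballots are invented for illustration, not taken from the spreadsheet.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Undefined when every voter (or no voter) picked a player.
        return float("nan")
    return cov / sqrt(var_x * var_y)

# Hypothetical ballots: each set holds the players one voter chose.
ballots = [
    {"Bonds", "Clemens", "Walker"},
    {"Bonds", "Clemens"},
    {"Morris"},
    {"Morris", "Bonds", "Clemens"},
    {"Walker"},
    {"Morris"},
]

def indicator(player):
    """1 if the voter picked the player, 0 otherwise, one entry per ballot."""
    return [1 if player in ballot else 0 for ballot in ballots]

bonds_clemens = pearson(indicator("Bonds"), indicator("Clemens"))
walker_morris = pearson(indicator("Walker"), indicator("Morris"))

print("Bonds/Clemens:", round(bonds_clemens, 2))
print("Walker/Morris:", round(walker_morris, 2))
```

In this toy set every voter treats Bonds and Clemens identically, so their correlation is as high as it can go, while Walker and Morris never share a ballot and come out negative.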

I also correlated each player's votes with the total number of players, out of a possible 10, that each voter had chosen. Voting for Alan Trammell strongly correlated with voting for a higher number of players overall, suggesting that Trammell's supporters are the voters who believe in a more inclusive Hall.

Update: Here's a heat map of the 2013 data, courtesy of reader MaltedLaurelBridge:

[Image: heat map of the 2013 vote correlations]
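The ballot-size comparison mentioned above can be sketched the same way: correlate a player's yes/no column with the number of names on each ballot. Again, this assumes a Pearson correlation, and the ballots below are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # no variation, so the correlation is undefined
    return cov / sqrt(var_x * var_y)

# Hypothetical ballots, not real voter data.
ballots = [
    {"Trammell", "Raines", "Bagwell", "Biggio", "Piazza"},
    {"Trammell", "Raines", "Bagwell", "Martinez"},
    {"Morris"},
    {"Morris", "Smith"},
    {"Trammell", "Biggio", "Piazza"},
    {"Biggio"},
]

# Number of names each voter checked (the real ballot caps this at 10).
ballot_sizes = [len(ballot) for ballot in ballots]

def size_correlation(player):
    """Correlate a vote for `player` with how many names were on the ballot.

    The player's own vote counts toward the ballot size here, which nudges
    every player's figure upward a little.
    """
    votes = [1 if player in ballot else 0 for ballot in ballots]
    return pearson(votes, ballot_sizes)

trammell = size_correlation("Trammell")
morris = size_correlation("Morris")

print("Trammell vs. ballot size:", round(trammell, 2))
print("Morris vs. ballot size:", round(morris, 2))
```

A positive figure means a player's supporters tend to fill out more of the ballot; a negative one means they tend to leave slots empty.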

The correlations in bold are statistically significant at a 95 percent confidence level. Here are some more observations I made; feel free to add your own in the discussion below:

  • Anti-steroids voters also seem more likely to vote for older candidates. The age of the voters themselves might be the common denominator.
  • Craig Biggio and Mike Piazza had a strong correlation, which might imply that they were on the wrong side of the "Should I vote for a deserving player on his first ballot?" razor.
  • Larry Walker's candidacy distills the sabermetric argument. In 2013, he was strongly aligned with votes for Tim Raines, Edgar Martinez, Jeff Bagwell, and Trammell, all beloved by nerds, and was aligned against Morris, the foil for the last few years against more statistically deserving candidates.
  • Fred McGriff was negatively correlated with almost every possible steroid user in this year's balloting. Perhaps some voters are trying to find a clean, statistically similar alternative to Mark McGwire, Sammy Sosa, and Rafael Palmeiro. Frank Thomas will presumably have a similar pattern next year.
  • McGwire, Sosa, and Palmeiro have high correlations with each other, but it's likely that there are cases of voters choosing one at the expense of the others.
  • Bert Blyleven is the classic example of sabermetrics influencing Hall of Fame voters, but the data suggests that other forces are at work. He was strongly correlated with Andre Dawson and Morris, players disdained by the stats crowd, in 2010. This type of coalition might be what eventually gets Tim Raines to 75 percent.
  • Don Mattingly voters have no logic, internal or otherwise. Candidates who have little chance of being elected yet stay on the ballot, like Mattingly, Tommy John, Dale Murphy, and Dave Parker, tend to be correlated with each other and against easy choices like Rickey Henderson and Roberto Alomar. You'd think those voters would advocate for a larger Hall, too, but their vote-total correlations are among the lowest on the board.
  • Lee Smith remains a fairly popular choice, but he's had no significant correlation with any other player since 2012. This doesn't (necessarily) mean that a vote for Smith makes no sense. It may be that the argument about saves and relief pitching doesn't match up with other Hall of Fame arguments.
  • The next couple of years of balloting will introduce Greg Maddux, Randy Johnson, Pedro Martinez, Tom Glavine, Mike Mussina, and John Smoltz. The last few years have been hitter-heavy, and it will be interesting to see how voters choose to balance their ballots between hitters and pitchers, not to mention how they choose to balance a ballot that has around twenty deserving candidates.
  • [Google Docs]
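A note on the significance cutoff mentioned earlier: whether a correlation clears the 95 percent bar depends on how many ballots are in the sample. The sketch below uses the Fisher z-transformation, a standard approximation, to estimate the smallest correlation that would count as significant. That is an assumption, not necessarily the test used for the spreadsheet, and the ballot count of 100 is a placeholder rather than the real sample size.

```python
from math import sqrt, tanh
from statistics import NormalDist

def critical_r(n_ballots, confidence=0.95):
    """Smallest |r| that is significant for a sample of n_ballots.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is roughly
    standard normal when the true correlation is zero.
    """
    # Two-sided cutoff, about 1.96 for 95 percent confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return tanh(z_crit / sqrt(n_ballots - 3))

def is_significant(r, n_ballots, confidence=0.95):
    """True if a correlation of r clears the cutoff for this sample size."""
    return abs(r) > critical_r(n_ballots, confidence)

n = 100  # hypothetical number of public ballots in one year
print("cutoff:", round(critical_r(n), 3))
print("Walker/Morris (-.24) significant?", is_significant(-0.24, n))
```

The fewer ballots in a given year, the larger a correlation has to be before it is more than noise, which is worth keeping in mind for the earlier years in the spreadsheet if they have fewer public ballots.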