Can Sports Show The Way To Smarter Voting?

Barry Petchesky|published: Wed 15th May, 13:11 2019

credits: Benjamin Currie | source: GMG

Electoral reform is coming to America. This year in MLB, a simple first-past-the-post voting system will no longer determine all-star starters. Fans will instead select three finalists in a Primary Round and then vote from among those finalists on a single Election Day.

Favoritism has plagued MLB all-star voting for decades. Some fan bases (the 1957 Reds, 2001 Mariners, and 2015 Royals among others) have taken this idea to the extreme. Sportswriters use unworthy starters as proof that experts need a say. Experts are also imperfect, though, as coaches awarded Rafael Palmeiro the 1999 Gold Glove despite his playing 135 games at DH.

In reality, the problem was never the 20 percent of voters who preferred David Bell in 2001 or the 25 percent of voters who wanted Omar Infante in 2015. The problem was a voting system that declared candidates to be the people’s choice even if 75 or 80 percent of voters considered them a joke. Major League Baseball identified the source of the issue and the new system should work relatively well. The biggest remaining question is whether MLB Advanced Media, the largest born-and-bred tech startup in New York City, can limit the number of illegal votes.

If MLB wishes to further improve the selection process, it can take advantage of centuries of research on collective decision making. Mathematicians and economists in the field of Social Choice Theory have analyzed how to minimize collective disappointment when a group decides to elect a new leader, award a prize, or choose a restaurant for lunch.

The following systems have not only been studied by some of the greatest minds in human history but have also been applied in the sports world.

Six Popular Voting Systems

Borda count is often used for end-of-season awards like MVPs and collegiate polls. Points are awarded as a decreasing function of rank. In AP polls, for example, a first-place vote is worth 25 points, second place is 24 points, and so on until 25th place is worth a single point. Candidates are then ranked based on total points earned.
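As a rough sketch of the tally described above, the snippet below awards points by ballot position and sorts by total. The function name `borda_ranking`, the `places` parameter, and the three made-up teams are illustrative assumptions; setting `places=25` would mirror the 25-to-1 AP scale.

```python
from collections import defaultdict

def borda_ranking(ballots, places=25):
    """Rank candidates by Borda points: first place earns `places` points,
    second earns one fewer, and so on down to 1 point for the last listed place."""
    points = defaultdict(int)
    for ballot in ballots:
        for position, candidate in enumerate(ballot[:places]):
            points[candidate] += places - position
    # Highest total first; ties are broken alphabetically only to keep output stable.
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))

# Three hypothetical voters ranking three hypothetical teams on a 3-place ballot.
ballots = [
    ["Team A", "Team B", "Team C"],
    ["Team B", "Team A", "Team C"],
    ["Team A", "Team C", "Team B"],
]
print(borda_ranking(ballots, places=3))
# [('Team A', 8), ('Team B', 6), ('Team C', 4)]
```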

Approval voting is used by the baseball Hall of Fame. Voters provide an unordered list of as many acceptable candidates as desired. The winner is the candidate on the greatest number of ballots. For the Hall of Fame, the BBWAA uses a modified version that limits voters to 10 choices and elects anyone appearing on at least 75 percent of ballots. From the 13th century until Napoleon’s conquest in 1797, Venice used approval voting to elect its Doge in a complicated system meant to ensure that the leader would be acceptable to the largest number of Venetians.
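The modified Hall of Fame version can be sketched in a few lines. This is only an illustration under simple assumptions: the function name `approval_elected` is made up, every submitted ballot counts toward the 75 percent denominator, and a ballot that lists more than 10 names is rejected with an error.

```python
from collections import Counter

def approval_elected(ballots, max_choices=10, threshold=0.75):
    """Return every candidate named on at least `threshold` of all ballots."""
    counts = Counter()
    for ballot in ballots:
        choices = set(ballot)  # a name counts once per ballot, order is ignored
        if len(choices) > max_choices:
            raise ValueError("ballot lists too many candidates")
        counts.update(choices)
    needed = threshold * len(ballots)
    return sorted(name for name, n in counts.items() if n >= needed)

# Four hypothetical ballots: A appears on 4 of 4, B on 3 of 4, C on 1 of 4.
ballots = [["A", "B"], ["A", "B"], ["A", "B", "C"], ["A"]]
print(approval_elected(ballots))  # ['A', 'B']
```

In plain approval voting with a single winner, the last line of the function would instead return the one name with the highest count.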

In the Condorcet method, the winner is the candidate that would defeat any other candidate in a one-on-one matchup. Although this method doesn’t always settle on a single winner, figure skating added a tiebreaker and used this method from 1998 until the 2002 Olympic vote-rigging scandal led to widespread changes.
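One way to check for a Condorcet winner is to run every one-on-one matchup directly from ranked ballots. The sketch below assumes that a candidate left off a ballot counts as ranked last; the function names and the five sample ballots are hypothetical.

```python
def prefers(ballot, x, y):
    """True if the ballot ranks x above y; unranked candidates count as last."""
    position = {name: i for i, name in enumerate(ballot)}
    unranked = len(ballot)
    return position.get(x, unranked) < position.get(y, unranked)

def condorcet_winner(ballots, candidates):
    """Return the candidate who wins every head-to-head matchup, or None."""
    for x in candidates:
        beats_everyone = all(
            sum(prefers(b, x, y) for b in ballots)
            > sum(prefers(b, y, x) for b in ballots)
            for y in candidates
            if y != x
        )
        if beats_everyone:
            return x
    return None  # no single candidate beats all the others

ballots = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "A", "C"],
    ["C", "A", "B"],
    ["B", "C", "A"],
]
print(condorcet_winner(ballots, ["A", "B", "C"]))  # A (beats B 3-2 and C 3-2)
```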

Score voting is often used in events where contestants perform in order, like the NBA dunk contest, gymnastics, and diving. Voters award each candidate some number of points and the candidate with the most total points wins. Honeybees also use a form of score voting involving complicated dancing rituals to choose the best site for a new hive.

Instant runoff voting is used in a growing number of locations including non-Presidential elections in Maine. Voters provide a ranking of their favorite candidates, and if a candidate has a majority of first-place votes they immediately win. Otherwise, the candidate with the fewest first-place votes is eliminated. This process repeats using the top remaining choices on each ballot until one candidate wins with a majority of votes. Host cities for the World Cup and the Olympics are decided by exhaustive ballots, a process that is identical except that voters can change their mind between each round.
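The elimination loop described above can be sketched as follows. The details are assumptions made for illustration: a ballot with no remaining candidates is ignored, a tie for last place is broken alphabetically, and a final-round tie returns `None`. Real election rules specify their own tiebreakers.

```python
def instant_runoff(ballots):
    """Return the instant-runoff winner, or None if the last round is a tie."""
    remaining = {name for ballot in ballots for name in ballot}
    while remaining:
        # Each ballot counts for its highest-ranked candidate still in the race.
        tally = {name: 0 for name in remaining}
        for ballot in ballots:
            top = next((name for name in ballot if name in remaining), None)
            if top is not None:
                tally[top] += 1
        active = sum(tally.values())
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > active:
            return leader  # strict majority of the ballots still counting
        fewest = min(tally.values())
        trailing = sorted(name for name in remaining if tally[name] == fewest)
        if len(trailing) == len(remaining):
            return None  # everyone left is tied
        remaining.remove(trailing[0])  # assumed alphabetical tiebreak
    return None

# Hypothetical electorate: A leads on first choices but B wins after C is eliminated.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
print(instant_runoff(ballots))  # B
```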

First past the post (or plurality voting) is used by the NFL to select its MVP. Voters choose one candidate and the candidate with the most votes wins. Many countries use first-past-the-post to elect leaders because it is the simplest system to understand and to implement.

As the following scenarios illustrate, many odd outcomes are possible when combining multiple preferences into one. Some of these outcomes are the result of a bad system, while others come from fine methods that just weren’t well explained, which decreases confidence in the system.

The 2012 Cy Young Award winner... in this reality. credits: Jared Wickerham | source: Getty

2012 AL Cy Young

In the 2012 AL Cy Young race there were 28 ballots cast. David Price and Justin Verlander were by far the most popular candidates. Thirteen ballots had Verlander first and Price second, while 12 ballots had Price first and Verlander second.

It was the other three ballots that decided the election: One voter who thought that Fernando Rodney was the best pitcher in the American League, ahead of Verlander and then Price, and the two beat writers for the Angels who put hometown star Jered Weaver in second place, behind Price.

In reality, the BBWAA’s method—a Borda count—awarded David Price the Cy Young because Weaver pushed Verlander two spots behind Price instead of one. Based on first-past-the-post, Price still would have won because of the Rodney voter who pushed Verlander down to second.

But instant-runoff voting would have declared the election a tie, as exactly half of voters preferred Price over Verlander, and half Verlander over Price. In situations where only two candidates have a real chance of winning (like most general elections in the U.S.), instant-runoff allows voters to express their support for a third option without fundamentally changing the two-person race—if used in American Presidential elections, it would do away with the phenomenon of third-party candidates affecting the race.
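The three outcomes for the 2012 race can be reproduced from the ballot counts given above. In this sketch the 3-2-1 weights are placeholders rather than the BBWAA's actual point values, and only the places mentioned in the text are tallied, so totals for Rodney and Weaver are incomplete. With these ballots, Price's margin over Verlander works out to the first-place weight minus the third-place weight, so any strictly decreasing weights pick the same winner.

```python
from collections import Counter

# (ranking, number of voters), as described in the text.
ballots = [
    (("Verlander", "Price"), 13),
    (("Price", "Verlander"), 12),
    (("Rodney", "Verlander", "Price"), 1),
    (("Price", "Weaver", "Verlander"), 2),
]
POINTS = (3, 2, 1)  # placeholder weights for 1st, 2nd, 3rd (an assumption)

borda = Counter()
first_place = Counter()
price_over_verlander = 0
verlander_over_price = 0
for ranking, count in ballots:
    first_place[ranking[0]] += count
    for position, pitcher in enumerate(ranking):
        borda[pitcher] += POINTS[position] * count
    if ranking.index("Price") < ranking.index("Verlander"):
        price_over_verlander += count
    else:
        verlander_over_price += count

print("Borda:", borda["Price"], "to", borda["Verlander"])                  # 69 to 67
print("First place:", first_place["Price"], "to", first_place["Verlander"])  # 14 to 13
print("Head to head:", price_over_verlander, "to", verlander_over_price)     # 14 to 14
```

The head-to-head count is what instant-runoff reduces to once Rodney and Weaver are eliminated, which is why that system produces a tie.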

An important aspect of any voting system is the probability of strategic voting. If you were a voter who considered Justin Verlander to be the best pitcher in the league and David Price the second-best, you’d place Verlander first on your ballot, but where should you place Price? The instinctual answer is in second, because that makes your ballot an accurate reflection of your beliefs. But if you truly think Verlander deserves to win, shouldn’t you fill out the ballot that would make this outcome most likely? That would mean leaving Price off your ballot altogether.

If a single voter had left Price off the ballot—in other words, had voted strategically in favor of Verlander—Verlander would have won.

In the world of sports, using strategic voting to guarantee the correct outcome is generally frowned upon. Outside of sports, even though the ethical question is identical, voting your conscience instead of for the lesser of two evils is not always viewed so positively.

The Gibbard–Satterthwaite theorem proves that strategic voting is inevitable in a non-dictatorial rank-order system with more than two candidates. But some systems can minimize the impact of strategy while others “make an election more of a game of skill than a real test of the wishes of the electors,” as Charles Dodgson (the mathematician better known as Lewis Carroll, author of Alice in Wonderland) noted.

The Borda count is especially susceptible to voters and even candidates gaming the system: put your choice first, and leave that candidate's real competitors off altogether. As Jean-Charles de Borda admitted, his system is “only intended for honest men!”

2016 AL Cy Young

If the BBWAA had used first-past-the-post in 2016, then Justin Verlander would have easily won the AL Cy Young with 14 first-place votes compared to just eight for Rick Porcello.

Of course, that’s not how it happened in the real world. Porcello won the award via a Borda count, because more voters had him ranked above Verlander among their non-first-place votes. Porcello would also have been a Condorcet winner, because a majority of voters preferred him to each individual alternative. Instant runoff would have picked Porcello too, because he was second on the ballot of every Corey Kluber and Zach Britton voter.

An interesting theoretical scenario is what would happen if we divided the voters into two groups of 15. Instant-runoff does not satisfy the consistency criterion, so it is possible to gerrymander the voters so that both groups declare Verlander the victor, even though he loses when the groups are combined.

credits: Robert Wilcox

Notice that Rick Porcello would be eliminated first via instant-runoff in Group 1 even though he would be a Condorcet and Borda winner in that group. Instant-runoff is an improvement over first-past-the-post (because a good candidate is only eliminated if they are in third place or below), but a candidate with broad support can still lose.
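The same kind of inconsistency can be shown with a small invented electorate. The ballots below are not the 2016 Cy Young ballots; they are a hypothetical 30-voter example with candidates X, Y, and Z, built so that X wins each group of 15 under instant-runoff while Y wins when the groups are combined. The simplified `irv_winner` helper assumes every ballot ranks every candidate and that no round ends in a tie for last place.

```python
from collections import Counter

def irv_winner(ballots):
    """Instant-runoff for complete ballots with no ties for last place."""
    remaining = {name for ballot in ballots for name in ballot}
    while True:
        tally = Counter(
            next(name for name in ballot if name in remaining) for ballot in ballots
        )
        leader, votes = tally.most_common(1)[0]
        if 2 * votes > len(ballots):
            return leader
        # Drop whoever has the fewest top-choice votes this round.
        remaining.remove(min(remaining, key=lambda name: tally[name]))

# Group 1: Y is eliminated first (4 first-place votes), then X beats Z 8-7.
group_1 = (
    [["X", "Y", "Z"]] * 6
    + [["Z", "Y", "X"]] * 5
    + [["Y", "X", "Z"]] * 2
    + [["Y", "Z", "X"]] * 2
)
# Group 2: X wins outright with 8 of 15 first-place votes.
group_2 = [["X", "Y", "Z"]] * 8 + [["Y", "X", "Z"]] * 7

print(irv_winner(group_1))            # X
print(irv_winner(group_2))            # X
print(irv_winner(group_1 + group_2))  # Y (Z is eliminated, Y then leads 16-14)
```

In Group 1 of this made-up example, Y would beat X head-to-head 9-6 and beat Z 10-5, yet Y is the first candidate eliminated.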

In reality, instant-runoff would not make gerrymandering worse—the constraints on redistricting advantages are legal rather than mathematical. The bigger impact of the inconsistency is the difficulty of providing an accurate running total of votes in the midst of the election because precincts cannot simply report the number of votes for each candidate.

Imagine if the BBWAA had released Group 1’s instant-runoff results first and Verlander knew that he would receive majority support from Group 2. Rick Porcello might have given his concession speech if he did not realize that he could pull ahead of Britton and then defeat Verlander head-to-head.

The need for up-to-the-minute results leads news sites to display first-past-the-post results, which inevitably leads to misleading graphics.

credits: Robert Wilcox

The above table is exactly how most news sites would have presented the 2016 Cy Young results if they had used instant-runoff. If the goal is to stir up controversy for cable news then it is a great format, because it makes no attempt to explain why Porcello won. Some websites at least sort by the final result and include the instant-runoff totals, but a Sankey diagram is a better representation of the complete process:

credits: Robert Wilcox

Final 2017 AP College Football Top 25 Poll

After the 2017 college football season, the AP used its usual Borda count to rank LSU 18th with 368 points, Mississippi State 19th with 359 points, and Stanford 20th with 336 points.

credits: Robert Wilcox

But what if the AP issued a Top 20 ranking instead of the Top 25? Naively we would expect the top 20 teams to remain unchanged. And yet, Stanford would have leapfrogged to 18th, pushing LSU to 19th and Mississippi State to 20th.

credits: Robert Wilcox

One flaw in the Borda count, and thus in the AP poll, is that it does not perform as well near the bottom. Every unranked team is equal in its eyes, so a Borda count becomes closer to first-past-the-post further down the rankings. For example, Safid Deen listed Mississippi State 21st and LSU 22nd but did not rank Stanford. Under a Top 20, Deen’s ballot would have treated all three teams as equal despite his belief that Stanford was decidedly inferior.
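A toy example shows how shortening the ballot can reorder teams. The three ballots and team names below are invented, not taken from the AP poll: one voter loves Team S, while two voters place Team L and Team M in the middle of their ballots and leave Team S off entirely. The helper `borda_totals` is a hypothetical name.

```python
from collections import Counter

def borda_totals(ballots, places):
    """Borda points when only the top `places` spots on each ballot count."""
    totals = Counter()
    for ballot in ballots:
        for position, team in enumerate(ballot[:places]):
            totals[team] += places - position
    return totals

ballots = [
    ["Team S", "Other 1", "Other 2", "Other 3", "Team L", "Team M"],
    ["Other 1", "Other 2", "Team L", "Team M", "Other 3", "Other 4"],
    ["Other 1", "Other 2", "Team L", "Team M", "Other 3", "Other 4"],
]

full = borda_totals(ballots, places=6)
short = borda_totals(ballots, places=3)
for team in ("Team L", "Team M", "Team S"):
    print(team, full[team], short[team])
# Team L 10 2
# Team M 7 0
# Team S 6 3
```

With six places counted, the order is L, M, S. With only three places counted, Team S moves ahead of both, because the lower spots that separated L and M from S no longer earn any points.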

You might say the difference between 18th and 20th is meaningless, but the difference between a ranked team and an unranked team is an important distinction. The AP could improve its Top 25, and better measure the will of voters, by asking them to rank 30 teams.

For a new hypothetical, what if the top 17 teams in that 2017 poll were all suddenly ruled ineligible?

credits: Robert Wilcox

Again we would expect that 18th through 25th would stay in the same order, making No. 18 LSU the national champs. And again, we’d be wrong. This time it is Mississippi State who jumps LSU to become national champions.

As we see below, Stanford won the most first-place votes and thus would have won using first-past-the-post. But Mississippi State was second or third on many ballots and thus would have won with instant-runoff voting. We can see in the diagram below how NC State and USF supporters largely shifted to Mississippi State rather than to LSU or Stanford.

credits: Robert Wilcox

The main advantage of the Borda count is the possibility of expressing a strong preference for one candidate. But it’s not without its flaws. Let’s go back to the original Top 25 as it actually happened. LSU benefitted from voters placing Michigan State and Northwestern ahead of Mississippi State. This scenario, where two Big Ten teams affected a comparison between two SEC teams, is the core issue of Arrow’s impossibility theorem. Kenneth Arrow, winner of the 1972 Nobel Prize in Economics, proved that any voting system using rank-order ballots will have one of three problems: a consensus choice may lose, a dictator will choose the outcome, or irrelevant alternatives—like our Big Ten schools—can change the outcome.

2017 NL MVP

In 2017, Giancarlo Stanton defeated Joey Votto for NL MVP by the slimmest of margins, just two points, using the Borda count. But using first past the post or instant runoff, Stanton and Votto would have tied for first.

The presence of other players helped Stanton rise above Votto, so what would have happened if MLB announced the voting results in stages, as more and more players were considered? The following charts show what the rankings would be before either of the Rockies’ candidates were ranked, and then the actual rankings.

credits: Robert Wilcox

This is a very specific hypothetical, and it would be very strange for MVP voting to be announced in this manner, so it’s understandable that Joey Votto was unaware until this very moment that Nolan Arenado and Charlie Blackmon cost him the 2017 MVP. Figure skating, however, uses exactly this approach of adding skaters to the rankings after each performance.

1995 Women’s World Figure Skating Championship

At the 1995 World Championships, Nicole Bobek was in second and Surya Bonaly in third with only a 14-year-old Michelle Kwan left to go. Kwan skated well enough to finish in fourth place, but who won the silver medal? Bonaly, who leapfrogged ahead of Bobek despite both women already having skated.

credits: Robert Wilcox

Although the exact scoring system is complicated, we can see that Kwan’s performance in the free skate pushed down Bobek, but not Bonaly. Before Kwan skated, judges were unable to express a strong preference for Bonaly’s free skate. The final decision was likely the best reflection of what the judges wanted, but the result seemed less definitive because of the late-game change.

After a similar occurrence at the 1997 European Championships, the International Skating Union decided to act. It settled on the one-by-one (OBO) system, based on Condorcet’s method of pairwise comparisons, and very publicly declared that it had eliminated flip-flops. However, the OBO method it chose does not solve that problem.

The results below are from the top 11 skaters at the 2002 Olympic women’s short program. Each cell represents the number of judges who prefer the skater on the left to the skater above. There were nine judges, so one skater was considered victorious over another if at least five judges preferred their performance. We see that Kwan was preferred over every other skater—unanimously, save for the slim majority who gave her the edge over Irina Slutskaya. Thus Kwan was a Condorcet winner.

credits: Robert Wilcox

Things get weird further down the table, where we see that Vanessa Gusmeroli defeated Julia Sebestyen, Sebestyen defeated Fumie Suguri, and Suguri defeated Gusmeroli. Such an outcome is a Condorcet cycle; voters had no clear preference among these three, and the system offers no way to eliminate the possibility of such a cycle. A tiebreaker is required when a cycle occurs, and these tiebreakers can still lead to flip-flops.
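A cycle like this is easy to reproduce with a pairwise table. The nine judges and three skaters below are hypothetical, not the 2002 panel; the sketch assumes each judge ranks every skater. Each skater wins one matchup 6-3 and loses another 3-6, so nobody beats both rivals.

```python
from itertools import permutations

def pairwise_counts(ballots, candidates):
    """counts[(x, y)] is the number of ballots ranking x above y."""
    counts = {pair: 0 for pair in permutations(candidates, 2)}
    for ballot in ballots:
        for i, x in enumerate(ballot):
            for y in ballot[i + 1:]:
                counts[(x, y)] += 1
    return counts

skaters = ["Skater 1", "Skater 2", "Skater 3"]
judges = (
    [["Skater 1", "Skater 2", "Skater 3"]] * 3
    + [["Skater 2", "Skater 3", "Skater 1"]] * 3
    + [["Skater 3", "Skater 1", "Skater 2"]] * 3
)
counts = pairwise_counts(judges, skaters)
print(counts[("Skater 1", "Skater 2")])  # 6
print(counts[("Skater 2", "Skater 3")])  # 6
print(counts[("Skater 3", "Skater 1")])  # 6

# A Condorcet winner must win every matchup; in a cycle the list is empty.
winners = [
    x for x in skaters
    if all(counts[(x, y)] > counts[(y, x)] for y in skaters if y != x)
]
print(winners)  # []
```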

We’ll never know how the ISU would have responded to such an embarrassment because the only Olympics to use the OBO system were the 2002 games. Other problems with scoring—including judges getting caught rigging the outcome—forced an overhaul of the entire system.

Although the ISU chose this system for the wrong reasons, many great thinkers including Ramon Llull, Marquis de Condorcet, and Charles Dodgson have advocated the use of pairwise comparisons. Nobel Laureate Eric Maskin, aware of the importance of messaging, calls this system true majority rule.

Major League Baseball could use Condorcet’s method to pick the final winner of an all-star slot. It would be easy to explain that a player won because a majority of ballots preferred that player over each competitor. Since there will be only three candidates for each position, each voter would only need to list a favorite and runner-up. This method would force joke or homer candidates to have significantly wider support from voters in every fan base, not just their own.

Barry Bonds speaking in favor of first-past-the-post voting. credits: Lachlan Cunningham | source: Getty

2019 Baseball Hall of Fame voting

The main difference between approval voting and other systems is that there is no ranking or rating component; you either like a candidate or you don’t. Approval voting selects candidates that most voters like, even if they lack a base of especially passionate supporters.

Barry Bonds and Roger Clemens are polarizing figures, which ultimately hurts their chances in the current system. Some Hall of Fame voters consider them the most deserving candidates on the ballot, while others will never vote for them because of their PED usage. With first-past-the-post, they’d already be in Cooperstown: For example, more voters would likely list Bonds as the single most deserving HOF candidate than would list, say, Mike Mussina. Love-or-hate candidates perform significantly better under first-past-the-post voting (used in most American elections, remember) than in any other system. Whether that outcome is a feature or a bug is up for debate.

The Hall of Fame’s use of a 75-percent cutoff theoretically prevents vote-splitting concerns and allows each candidate to be considered separately from anyone else. If MLB were to use this same method, approval voting, to select its all-stars, it would need a certain number of finalists. Voters would then confront the issue of later-no-harm where their favorite candidate can lose because they also approved of a less-preferred candidate.

An example: Suppose a voter loves Nolan Arenado, likes Kris Bryant, and hates Johan Camargo. (Sorry, Johan.) Voting for both Arenado and Bryant is good if the decision comes down to Bryant and Camargo. But if the decision comes down to Arenado and Bryant then the vote for Bryant makes their vote for Arenado, their preferred candidate, worthless.
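The trade-off can be made concrete with invented vote totals. In the sketch below, "A" stands for the voter's favorite, "B" for the acceptable option, and "C" for the disliked one; the counts from other voters are hypothetical, and `approval_leaders` is a made-up helper that returns whoever is tied for the most approvals.

```python
def approval_leaders(other_votes, my_ballot):
    """Add one voter's approvals to the existing totals and return the leader(s)."""
    totals = dict(other_votes)
    for name in my_ballot:
        totals[name] = totals.get(name, 0) + 1
    best = max(totals.values())
    return sorted(name for name, n in totals.items() if n == best)

# Race between the favorite and the acceptable option.
close_a_b = {"A": 10, "B": 10, "C": 3}
print(approval_leaders(close_a_b, ["A"]))       # ['A']
print(approval_leaders(close_a_b, ["A", "B"]))  # ['A', 'B'], the extra approval cancels out

# Race between the acceptable option and the disliked one.
close_b_c = {"A": 3, "B": 10, "C": 10}
print(approval_leaders(close_b_c, ["A"]))       # ['B', 'C']
print(approval_leaders(close_b_c, ["A", "B"]))  # ['B'], the extra approval decides it
```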

If, on the other hand, voters know who has a real chance to win—generally the case in Hall of Fame voting—or have weak preferences between approved candidates, then approval voting is both simple and effective.

There is a precedent for using approval voting to select all-star starters. In the old all-star voting system, three outfielders started the game. The ability of each voter to choose up to three outfielders was a variation of approval voting. In the new system, voters are only allowed to pick one catcher in the Primary Round, but the three highest-voted catchers will be selected as finalists; this is first-three-past-the-post. A voting system tends to perform best when voters can provide at least as many preferences as winners chosen—in this case, letting fans vote for three candidates at each position in the Primary Round.

1990 NBA Slam Dunk Contest

One advantage of score voting is that it is not a rank-order system, so Arrow’s impossibility theorem does not apply. Gymnastics and diving have never had the same controversies as figure skating because they use score voting.

In the NBA dunk contest, players receive up to 10 points from each of five judges after each dunk. The dunk contest generally works because every judge uses basically the same range: a good dunk is eight points, a really good dunk is nine points, and any great dunk is worth 10 points. This is the assumption upon which the dunk contest’s integrity, such as it is, relies.

Since the NBA realized that, in the finals, most judges would choose either a nine or a 10, it allowed judges to use tenths of points from 1989 until 1995. The lowest of 14 dunks in the last two rounds of the 1990 contest was a 47.2, which was an average of more than 9.4 from each judge. The decimals improved the rankings because there was more room for distinguishing among great dunks.

credits: Robert Wilcox

Judges get booed for low scores, but a rogue judge awarding anything less than a nine could have singlehandedly determined the winner. All that matters is the difference between two scores, so a judge that only awards scores from 9.1 to 10.0 is only using one-tenth of their possible power. This system only works as long as every judge can be pressured to follow the same unwritten rules of the contest.
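A quick tally illustrates how much leverage a wider scoring range gives one judge. The scores below are invented and stored in tenths of a point (so 99 means 9.9) to avoid floating-point rounding; `dunk_totals` is a hypothetical helper.

```python
def dunk_totals(scorecards):
    """Sum each dunk's scores across judges. Scores are in tenths of a point."""
    totals = {}
    for card in scorecards:
        for dunk, tenths in card.items():
            totals[dunk] = totals.get(dunk, 0) + tenths
    return totals

# Four judges slightly prefer Dunk A: 9.9 versus 9.7.
four_judges = [{"Dunk A": 99, "Dunk B": 97}] * 4

# A fifth judge prefers Dunk B but stays within the usual 9-to-10 range.
narrow_dissent = {"Dunk A": 95, "Dunk B": 100}
print(dunk_totals(four_judges + [narrow_dissent]))  # {'Dunk A': 491, 'Dunk B': 488}

# The same judge, now willing to hand out an 8.0, flips the result alone.
wide_dissent = {"Dunk A": 80, "Dunk B": 100}
print(dunk_totals(four_judges + [wide_dissent]))    # {'Dunk A': 476, 'Dunk B': 488}
```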

How could the NBA improve the dunk contest? Allowing judges to give their ratings all at once would alleviate some of the issues. If judges got to see all the dunks before assigning scores, there would be fewer 10s.

If MLB were to use score voting in the Primary Round, fans would likely use the entire range and award zero points to indisputably undeserving candidates. This would be a good thing: If everyone uses the entire range of scores in a somewhat consistent manner, then score voting is a very effective method. Voters can indicate a strong preference for one candidate and a weak preference for another without the need to insert other candidates in between.

Citius, Altius, Borda count. credits: Lintao Zhang | source: Getty

Olympic host voting

The International Olympic Committee selects host cities by holding multiple rounds of voting and eliminating the lowest-scoring city in each round until one city remains. But if the IOC used a strict first-past-the-post system, the winner would quite often be different.

For the 2012 summer games, Madrid was the leader after two rounds but eventually lost out to London. Then for 2016, Madrid led after the first round only to lose to Rio. After being eliminated in the first round of 2020 voting, Madrid has taken a break from bidding.

For both the 2010 and 2014 Winter Games, Pyeongchang won the first round but then was eliminated in the second round. Pyeongchang finally won the 2018 games.

Beyond the IOC’s well-documented bribery issues, this system works quite well. But the method of elimination in each round does not need to be first-past-the-post: Since early rounds advance multiple candidates, asking for only the number-one choice can hurt cities that have broad support but few diehard fans. The IOC could use anti-plurality (like voting in Survivor), a Borda count, score voting, or approval voting in early rounds to ensure that well-liked candidates are not removed too soon.

Organizing five or more rounds of balloting is impractical when millions of voters participate. Instant runoff is a better alternative. Instant-runoff voting operates on the same principle as the IOC’s system, but voters provide all of their preferences at once. A computer then determines each round of voting by using the top remaining choice on each ballot.

The above scenarios only scratch the surface of Social Choice Theory. There are more than a dozen mathematical criteria to evaluate voting systems and it is impossible to satisfy every criterion. The relative importance of each criterion is a matter of opinion and heavily depends on what is being voted on.

If you want to have some fun, play around with these interactive ballots or just look at some pretty pictures. If you prefer reading books, check out Gaming the Vote, Collective Choice and Social Welfare, Sports Math, Fair Division, and Chaotic Elections for a variety of investigations into voting systems and more.

There are several competing campaigns to reform the way people vote. Visit Election Science to learn why approval voting is the best, Fair Vote to learn why instant-runoff is the best, and Range Voting to learn why score voting is the best.

Most importantly, even a perfect “will of the people” is really the “will of the people who vote,” so don’t forget to support your favorite MLB players in the Primary Round and again on Election Day.

Robert Wilcox is a freelancer based in South Carolina and a creator of the iOS game Sudoku Farm.