On Sunday, the Seattle Seahawks made a grievously bad playcall, and Malcolm Butler won the Super Bowl for the Patriots, and millions of football-watching Americans reeled in reflexive disbelief at the sheer boneheadedness on display.
On Monday, just as reflexively, the smart-guy corner of the internet rushed to explain that well, actually, that decision wasn't as dumb as you think it is. They were wrong. It was dumb.
The attempt to argue otherwise exposed a sickness in sports analysis—one that leaves sports fans aggrieved at having to admit that Emmitt Smith is right and Nate Silver is wrong. Like many of the internet's more obvious maladies, this is probably Malcolm Gladwell's fault.
Over the last few years, a cottage industry has grown around the Gladwellian proposition that the truth is rarely simple, and a greater understanding can be—must be!—revealed through deceptively small changes in the way we view the world, perhaps with the aid of game theory or statistics. The call-and-response is ritualized, by now. A Professional Smart Person sees some knee-jerk reaction by the public, ducks into a spreadsheet, emerges with a surprising, insightful position that quells the idiots, and onlookers cheer while cracking wry jokes about the basic human condition of stupidity. (Burn him, he's a witch! or Behold, the dark arts of A TENTH GRADE MATH BOOK, CHRIST.) This is often deserved!
But the contrary-insight-through-statistics rubric has become a genre, and this genre has sprung an entire fleet of websites—oh, look, including this one—dedicated to complexity for its own sake, and all of this is very far removed from sports as they're played. The sophisticated sports analyst has learned certain truths: The on-field success or failure of an individual decision does not necessarily reflect its underlying soundness. Or: Too often, coaches choose a strategy to avoid blame, rather than to get the best chance at winning.
This leaves simple facts undervalued when something as dead-ass obvious as the Super Bowl playcall comes up. Here are some facts: The Seahawks had Marshawn Lynch. They were one yard away from winning the Super Bowl.
More facts: Marshawn Lynch is very strong, and exceptionally skilled. He averaged 2.96 yards after contact per rushing attempt this year—highest in the league, according to Pro Football Focus—and he converted 17 of 20 attempts on third- or fourth-and-short.
The odds of success were high had Seattle simply run the ball. The odds of success were also high if Seattle passed the ball—plays from the half-yard line are never disadvantaged—but the logic behind doing so puts emphasis on meta-analysis that ignores the primary goal of that moment, which wasn't to improve the odds of scoring a touchdown, but to actually score a touchdown.
Fundamentally, this is a category error. The object is not to maximize the probability of getting into the end zone or to maximize the probability of winning after a presumed play that would get the ball into the end zone; the object is to just score a touchdown. These are different things. Given all the moving pieces at the end of a football game, running a play and hoping for an incomplete pass so that it will shade your run-success probability up a few points over the following two downs is some subprime mortgage-level shit.
Case in point! Here's Ben Morris at FiveThirtyEight, giving words to the concept:
An NFL head coach's goal isn't to maximize his team's chances of scoring a touchdown on a given play; it's to maximize its chances of winning the game. That distinction seems to have gotten lost in all the rancor and rush to condemn Carroll.
He goes on to show that there might even be some statistical benefit to passing, since modern kickers are much better than the Win Probability models account for (though Morris's assumptions here don't seem to account for a defense selling out to force a turnover), and acknowledges that you're talking marginal improvements either way. But this is entirely the point! Presumed "marginal improvements" are the reason the pass was called in the first place, pushing aside the crucial task of actually scoring a touchdown.
(Another assumption in all of these pieces is that 26 seconds and one timeout is not enough time to run three plays if one is not an incomplete pass. But getting back to the line is much quicker in a goal line situation, and coming out of a timeout on second down, with another chance to regroup after their last timeout, the Seahawks had plenty of opportunity for the "call two plays in the huddle" move coaches always talk about—particularly since those plays could have been, "Marshawn runs right," or, "Russell bootlegs left." These are doubly moot points, anyway, since Belichick has already said that he'd have called timeout after a second down run, forcing New England to continue to defend the run and pass, regardless.)
Anyway. Here's another example—this one from Justin Wolfers at the New York Times—of someone presenting a strange appeal to game theory:
It's the biggest moment in your N.F.L. coaching career: 26 seconds remain in the Super Bowl, your team is 4 points behind, you have the ball just one yard short of the end zone, it's second down, and your team has arguably the N.F.L.'s best running back. What's your call? Run or pass?
Here's what I would do: Call a game theorist, someone who specializes in the branch of economics that analyzes strategic interactions.
From here, Wolfers offers some baseline game-theory handwaving—things like, If you pass 100 percent of the time, they will know you pass 100 percent of the time, as though this is at all instructive—and goes on to explain that, through the use of game theory, you make advanced calculations about your chances on run/pass based on ... rock-paper-scissors logic.
To see why, realize that Carroll and Belichick were essentially playing the football equivalent of Rock-Paper-Scissors. If Carroll will definitely play scissors, Belichick will respond with rock. The only way to make Belichick's job hard is for Carroll to make it impossible for him to guess what he will play next. And the only way to do that is for his strategy to appear random. After the game, Carroll suggested as much: "We went to three receivers, they sent in their goal-line people. We had plenty of downs and timeouts. We really didn't want to run against their goal-line group right there."
This is insane for a number of reasons—as mentioned above, outside the realm of abstraction, there is no rock-paper-scissors counter to throwing Marshawn Lynch into the line on short yardage. (As a great short-yardage back, he has the ability to defeat theory by imposing fact.) Rock doesn't beat scissors; rock desperately hopes it might have a two-in-10 chance of stopping scissors, rather than paper's 0-in-10 chance.
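In game-theory terms, the objection above is a dominance argument: if one call does at least as well as the other against every defense, there is nothing to randomize. Here is a minimal sketch of that check. The run row takes the rhetorical two-in-10 and 0-in-10 figures above at face value; the pass row and the names `score_prob`, `dominates`, `stack_run` and `play_pass` are hypothetical placeholders, not measured values.

```python
# Offense's chance of scoring, keyed by (offense call, defense call).
# Run row: the rhetorical figures from the paragraph above
# (a defense selling out stops the run 2 in 10; otherwise 0 in 10).
# Pass row: HYPOTHETICAL placeholders chosen only for illustration.
score_prob = {
    ("run", "stack_run"): 0.8,
    ("run", "play_pass"): 1.0,
    ("pass", "stack_run"): 0.6,  # assumed
    ("pass", "play_pass"): 0.4,  # assumed
}

defenses = ["stack_run", "play_pass"]


def dominates(a, b, payoffs, defenses):
    """True if call `a` is at least as good as `b` against every defense
    and strictly better against at least one."""
    at_least = all(payoffs[(a, d)] >= payoffs[(b, d)] for d in defenses)
    strictly = any(payoffs[(a, d)] > payoffs[(b, d)] for d in defenses)
    return at_least and strictly


run_dominates = dominates("run", "pass", score_prob, defenses)
print(run_dominates)
```

With a pass row good enough to beat the run against a stacked line, the check would fail and mixing would matter again. The argument here is that, with Lynch, it doesn't get that far.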
Even more grating is that throughout this post—and throughout the other posts like it—the argument totally ignores the option of a bootleg. A bootleg serves all of the keep-'em-honest requirements of Chapter 1 game theory without putting the ball in the air: The threat of Lynch makes the Patriots stack the middle, and Wilson rolls out around the undefended end.
The case for avoiding the pass isn't just reasoning from the actual result, either. The interception was catastrophic, but taking a sack at the 5- or 6-yard line, or getting a 10-yard holding penalty, would also have been a disaster. Of 19 Seahawks plays inside the 3-yard line, 11 were runs by Lynch (for five touchdowns), and two more were bootlegs by Wilson, both for touchdowns, while the six pass calls produced 3-for-5 passing and two touchdowns, but also a sack.
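For anyone who wants the arithmetic on those 19 plays spelled out, here is a small tally. The counts are exactly the ones stated above; the dictionary layout and the names `plays`, `total_calls` and `td_rate` are just for illustration, and samples this small only describe what happened, not a true rate.

```python
# Counts as stated in the paragraph above; nothing here is new data.
plays = {
    "lynch_run": {"calls": 11, "touchdowns": 5},
    "wilson_bootleg": {"calls": 2, "touchdowns": 2},
    "pass": {"calls": 6, "touchdowns": 2},
}

# Sanity check that the three categories account for every play.
total_calls = sum(p["calls"] for p in plays.values())

# Touchdowns per call for each play type.
td_rate = {name: p["touchdowns"] / p["calls"] for name, p in plays.items()}

print(total_calls)
for name, rate in td_rate.items():
    print(f"{name}: {rate:.0%}")
```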
And while the Seahawks had shown the stack rub play early in the season—enough to be on Bill Belichick's demon radar—it's not nearly as natural and practiced a part of the offense as a Russell Wilson bootleg.
Wolfers goes on to explain that Marshawn's statistics don't mean he's good enough to be successful against a stacked goal line—though presumably he isn't talking about Lynch's yards-after-contact or success on short-yardage power downs—and that if there were truly an optimal choice, "Belichick would probably figure it out, and he would instruct his players to guard against the run. When most of the defenders focus only on stopping one running back, they usually succeed."
The thing about all of these pieces is that they actually do prove what they're trying to prove. They're just arguing an orthogonal point, one that has nothing to do with football.
Here's Brian Burke of Advanced Football Analytics—a friend to the site and a sharp guy—trying to muster a defense of the call at Slate:
But an interception wasn't the only added risk of a passing play. There was also the possibility of a sack and higher probabilities of a penalty or turnover. There are any number of possible combinations of outcomes to consider on Seattle's three remaining downs—too many to directly evaluate. So I ran the situation through a game simulation. The simulator plays out the remainder of the game thousands of times from a chosen point—in this case from the second down on. I ran the simulation twice, once forcing the Seahawks to run on second down and once forcing them to pass. I anticipated that the results would support my logic (and Carroll's explanation) that running would be a bad idea. It turns out I was wrong. The simulation—which is different than Win Probability—gave Seattle an 85 percent chance of winning by running and a 77 percent chance by passing. It turns out the added risk of a sack, penalty, or turnover was not worth the other considerations of time and down.
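The forced-choice setup Burke describes (play the situation out many times, once forcing a run on second down and once forcing a pass) can be sketched in miniature. This is a toy under loud assumptions, not Burke's simulator: every probability in `OUTCOMES` is a hypothetical placeholder, the offense is assumed to run on any later down, and the clock, timeouts, penalties and everything after a score are ignored, so it estimates the chance of a touchdown rather than the chance of winning.

```python
import random

# HYPOTHETICAL per-play probabilities; placeholders, not Burke's inputs.
# Whatever is left over after "touchdown" and "turnover" means the play
# failed and the offense moves on to the next down.
OUTCOMES = {
    "run": {"touchdown": 0.55, "turnover": 0.01},
    "pass": {"touchdown": 0.50, "turnover": 0.03},
}


def simulate_once(second_down_call, rng, downs=3):
    """Play out second through fourth down; True if the offense scores."""
    call = second_down_call
    for _ in range(downs):
        p = OUTCOMES[call]
        roll = rng.random()
        if roll < p["touchdown"]:
            return True
        if roll < p["touchdown"] + p["turnover"]:
            return False
        call = "run"  # assumption: after second down the offense just runs
    return False  # out of downs


def estimate(second_down_call, trials=10_000, seed=0):
    """Fraction of simulated series that end in a touchdown."""
    rng = random.Random(seed)
    scores = sum(simulate_once(second_down_call, rng) for _ in range(trials))
    return scores / trials


print(estimate("run"), estimate("pass"))
```

Change the placeholder numbers and the gap between the two calls moves with them, which is why the inputs matter more than the machinery.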
I emailed Burke to ask about the simulation as it regards Lynch. "The general/public model is for baseline teams," he wrote back, "but that's not far off at all from a model that accounts uniquely for Lynch. He's great, but he's only slightly better on any one play or handful of plays than other good RBs. (I think he totaled -1 yards from the 1-yard line this season.) It's that a talented guy like him gets lots and lots of opportunities, and his slight advantage accumulates over the course of a game to become what we know as Beast-mode."
Burke's piece is the best executed of the bunch, and seems correct in its observations (though I'd quibble with Burke's opinion of Lynch in short yardage, since carries from the 1-yard line aren't fundamentally different from the 3-and-in and short-yardage situations covered by the numbers above). Even this one, though, in arguing toward that narrow sort of mitigation (Clearly this was a bad decision, it says, but there was not a huge drop in scoring or win probability either way), actually proves the global point: There was no good reason to pass, not even a marginal gain.
And that's the point here. This isn't about whether it was the most actuarially depreciative call in recent memory—the values have been run down about as well as you'd need—but about the Seahawks bungling a drop-dead simple choice in the most important moment imaginable. "B is not a tremendously greater risk than A, but certainly it's riskier and there is no tangible benefit to be had in the task at hand" is the claim. The Seahawks went with B. It was every bit as dumb as you thought it was.