Rain, Trains, And Dead Kids: What To Put In Your Movie If You Want To Win An Oscar

Welcome to Dataspin, a weekly data visualization of whatever the fuck.

Accusations of "Oscar baiting" get thrown around a lot. As brilliantly parodied by Tropic Thunder, there's an accepted notion that movies touching on certain topics—like the Holocaust, mental illness, and class and race relations—are more likely to get recognized by the Academy thanks to their perceived seriousness. Does this really happen? Do certain plot points really dominate the Oscars?

To test this, we took a look at the IMDB plot keywords associated with every film ever nominated for Best Picture, a total of 52,922 keywords (15,774 unique) across 494 films.* After removing some plot-unrelated tags (such as Title Spoken By Character), there were 310 keywords tagged in at least five percent of Best Picture nominees. Here they are, weighted by use:

[Chart: common Best Picture keywords, weighted by use]

This isn't exactly what we're looking for, though. For example, Murder was one of the most common keywords, tagged in 150 nominees (30 percent), but it was equally common across all films. A whopping 13,109 films have this tag on IMDB, meaning that just 1.1 percent of movies that contain a murder become Best Picture nominees. This isn't really "bait."
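The distinction between those two rates is the whole trick here, and it's just two divisions. A minimal sketch using the Murder numbers above (the counts come from the piece; the function names are ours):

```python
# Two ways to measure a keyword, using the Murder counts from the article.
NOMINEES = 494  # Best Picture nominees in the dataset

def share_of_nominees(tagged_nominees, total_nominees=NOMINEES):
    """What fraction of Best Picture nominees carry this tag?"""
    return tagged_nominees / total_nominees

def nomination_rate(tagged_nominees, tagged_films):
    """What fraction of ALL films with this tag got nominated?"""
    return tagged_nominees / tagged_films

# Murder: tagged in 150 nominees, but in 13,109 films overall.
print(f"{share_of_nominees(150):.1%}")   # → 30.4%
print(f"{nomination_rate(150, 13109):.1%}")  # → 1.1%
```

A keyword can dominate the first number while barely registering on the second, which is why Murder looks impressive until you divide the other way.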

So let's calculate it differently. Given a specific keyword, what are the chances of a movie with that keyword getting nominated for Best Picture? From the most common keywords above, movies with these tags got nods at least five percent of the time:

[Chart: keywords with at least a five percent nomination rate]

The most Oscar-baiting keyword was, somewhat surprisingly, Rainstorm. Forty-eight of the 362 movies with this tag—over 13 percent!—went on to be nominated for Best Picture. Here's the rest of your Top 10:

[Chart: the Top 10 Oscar-bait keywords]

Just looking at those top tags, you can definitely get a feel for the type of gloomy gravitas that generally catches the Academy's eye. This list is basically just a description of The Life of Emile Zola, which won in 1938. While early favorite "Holocaust" did not make the list—only 3.8 percent of Holocaust-tagged films were nominated—Anti-Semitism did rank very high.

If we expand to the Top 100 bait tags, here's how this year's nominees stack up:

  • Les Misérables - 5: Main Character Dies (5th), 19th Century (22nd), Redemption (53rd), Army (87th), Wedding (100th).
  • Beasts of the Southern Wild - 5: Face Slap (43rd), Crying (73rd), Swimming (82nd), Dancing (86th), River (96th).
  • Lincoln - 4: Speech (8th), 19th Century (22nd), Washington D.C. (26th), Assassination (63rd).
  • Django Unchained - 4: 19th Century (22nd), Racial Slur (29th), Racism (44th), Death of Friend (52nd).
  • Silver Linings Playbook - 4: Widow (48th), Tears (50th), Crying (73rd), Dancing (86th).
  • Zero Dark Thirty - 3: Controversy (30th), Death of Friend (52nd), Assassination (63rd).
  • Argo - 1: Washington D.C. (26th).
  • Amour - 1: Wheelchair (82nd).
  • Life of Pi - none.

It's no surprise to see Amour and Life of Pi toward the bottom, as these are arguably the least "Oscar-y" films of the year—and distant longshots to win. A bit more surprising is Beasts's tally, largely driven by some earthy, naturalistic keywords that the Academy seems to be into. We were also shocked to see Argo, a movie about making movies, pull only one. Perhaps a good sign for Lincoln?

Finally, here are the keywords (tagged in a minimum of 25 movies) that were least likely to result in a nomination, giving you a good sense of what sort of stuff the Academy generally looks down upon:

[Chart: keywords least likely to result in a nomination]

This imaginary movie sounds like it'd do $80M on opening weekend.

*A caveat about this data: IMDB's database is enormous and uneven, and the number of keywords per movie varies quite a bit, with more notable films (like Oscar nominees) and more recent films generally getting more tags. The tags are also not doled out perfectly consistently—for example, some NYC-based films were tagged "New York City," others "Manhattan," and some both. While this isn't ideal, IMDB is still far and away the strongest source for this sort of information so we'll make do. Also, I somehow missed one movie and actually pulled data for 493 nominees. I do not know which one. I hope it was Crash. I do not believe that this substantially impacted the results.
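The inconsistent-tagging problem in the footnote—"New York City" here, "Manhattan" there—is the kind of thing you'd patch before counting. A sketch of one way to do it, with a hand-made alias map (the aliases and function name here are illustrative, not what we actually ran):

```python
# Hypothetical cleanup step: fold synonymous IMDB tags into one
# canonical form before counting, so "Manhattan" and "New York City"
# don't get tallied as two different plot elements.
ALIASES = {"manhattan": "new york city", "nyc": "new york city"}

def normalize(tags):
    """Lower-case each tag and map known synonyms to a canonical tag."""
    return {ALIASES.get(t.lower(), t.lower()) for t in tags}

print(sorted(normalize(["Manhattan", "New York City", "Rainstorm"])))
# → ['new york city', 'rainstorm']
```

The catch, of course, is that someone has to build the alias map by hand, which is why we made do with the raw tags.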

Got an idea for the column? Email me.

Image by Jim Cooke.