The last time we played around with sportswriter analytics, we wondered if we could algorithmically determine a column's author based on his favorite words. (We could!) For a follow-up, I decided to look at the readability of different writers. Reading level is a nebulous concept, but we have a statistical measure known as the Flesch-Kincaid readability test to give us a rough benchmark.
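Here's the formula:

$$\text{grade level} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$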
The output of the test is a single value that can be interpreted as the "grade level" of a piece of writing, i.e., the lowest level at which a writer can be considered accessible. The test assigns higher grade scores to writers who use words with lots of syllables and sentences with lots of words. It's an imperfect measure, but it's still a good way of assigning a value without bias. (The test is considered a relatively good indicator of readability; it's often used by government agencies to ensure that documents are accessible to the general public.)
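To make the test concrete, here is a minimal Python sketch of the standard grade-level calculation. It is not the tool used to produce the numbers in this piece: the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup, so its scores will drift from those of a more careful implementation.

```python
import re

def count_syllables(word):
    """Rough estimate (an assumption): one syllable per run of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade_level(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on ., ! and ?; abbreviations will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

short_text = "The cat sat. The dog ran."
long_text = "Professional organizations consistently overestimate quarterback productivity."

# Longer words and longer sentences push the score up.
print(fk_grade_level(short_text) < fk_grade_level(long_text))
```

Very short, simple samples can score below zero, which is one reason the result is best read as a rough benchmark and not a literal school grade.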
First, I decided to see how the sports section compared with other parts of the newspaper. I chose The New York Times and compared the reading level of the sports columns to the writing in other sections.
| NYT section | Flesch-Kincaid grade level |
|---|---|
That seems right. The topics that people follow for fun and that rarely contribute to society in any tangible way (sports, arts, politics) produced the lowest reading level. I'd bet that the ranking above would hold for other newspapers, but not the grade levels. For example, the New York Post's sports section has a Flesch-Kincaid grade level of 7.2.
I also ran the test on sportswriters from a variety of publications. Below are the writers I chose; next to each writer's name is the media outlet from which I took a writing sample.
| Writer | Flesch-Kincaid grade level |
|---|---|
| Charlie Pierce (Boston Globe) | 10 |
| Jason Whitlock (Fox Sports) | 9 |
| Will Leitch (Deadspin) | 8.9 |
| Darren Rovell (CNBC) | 8.6 |
| Michael Wilbon (ESPN) | 8.2 |
| Bill Simmons (Grantland) | 8.1 |
| Tucker Wyatt (SI for Kids' "Kid Reporter") | 7.2 |
| Rick Reilly (ESPN) | 5.2 |
Looking at this chart, I really don't see any surprises. Charlie Pierce is known for his smart prose. Whitlock, Leitch, Rovell, Wilbon, and Simmons all bunch together, which makes a rough kind of sense (they're all popular columnists writing for large, diverse audiences). Tucker Wyatt, a 13-year-old "kid reporter" for Sports Illustrated for Kids, writes at a 7.2 grade level. Rick Reilly put up a 5.2. (For the record, the reading level of my previous HSAC articles checked in at 9.7. This one scored an 8.2.)
Before you go drawing any sweeping conclusions, a caveat: The Flesch-Kincaid formula is not perfect. It penalizes writers for using short, punchy sentences and small words. And we're not talking about the intelligence of a writer; we're talking about the level at which his prose is pitched. A writer with a broad following would almost certainly have a lower reading level.
That being said, the lowest reading-level result on any writing sample that I tested came from people I am certain are the dumbest writers on the internet: the folks who left comments on Yahoo! Sports articles. They wrote at a 3.3 grade level.
Regressing is a numbers-minded column by our clever friends at the Harvard College Sports Analysis Collective, a student club dedicated to quantitative analysis of sports strategy and business. Follow HSAC on Twitter, @Harvard_Sports. If you have any comments or ideas for future columns, email them to email@example.com. Email Ben Blatt at firstname.lastname@example.org.