A Computer’s Coming for My Job

TOWER GROVE — Over the past couple weeks, I’ve spent some time updating the book I wrote on the Cardinals for a new edition that will include the 2011 World Series and Tony La Russa’s retirement. The final update was completed last week from a hotel room in Curacao, which is fitting because some of the previous book was written off the coast of Greece. The book is more well-traveled than it deserves.

The trick with the update was to write something I’ve written dozens of times before — hey, did you hear that the Cardinals came back from 10 1/2 games back in August? No, seriously, it happened! And Game 6, wasn’t that great? — with fresh sentences. I kept stumbling into familiar phrases. Unforgettable. Underachievers. Unexpected. These were verbal blankies that I kept wrapping around my copy because they’re cozy and reliable. I was unable to avoid all of them because there are only so many ways to peel the same onion.

That said, I hope I was able to keep from sounding repetitive. The worst thing to do would be to become so numbed by writing about the events (again) that the emotion, thrill and drama are sucked out of the description. The writer has to remember the reader may be visiting these stories for the first time, and the writer owes that reader the same verve.

Or else, the story will seem … well, robotic.

There’s that word again.

For the past six months or so, I’ve been intrigued by this growing notion that a computer will someday be able to do the job of a sportswriter. Some believe these “robot journalists” are already capable — and some web sites are employing a program to recreate game stories. The Wall Street Journal had an article this week about a “jobless recovery” for the economy, where robots are filling more and more positions once held by organic life. It would have been so much easier to plug the book’s manuscript into a program and have it spit out the updates. A computer can read the box score from Game 6 and connect the dots. A computer can identify that La Russa’s record needed to be updated. A computer can rewrite the section on the longest homers at Busch Stadium III to reflect the two moonshots hit there this past season. That would have erased my concern about sounding repetitive in my descriptions.

It wouldn’t have solved my issue with sounding lifeless.

It would be lifeless.

Almost a year ago, Deadspin came upon an article about a college baseball game that didn’t mention a perfect game had been thrown. Deadspin’s writers assumed the story had been written by a computer because of the gaping hole in the coverage. Alas, it was written by a person, but Narrative Science, the leader in the field of algorithm journalism, took Deadspin’s find as a challenge: would a computer misfire on a perfect game? Would the historic importance of retiring all 27 batters faced be lost on a program? At The Next Web, there’s a recounting of the story that the program produced:

Tuesday was a great day for W. Roberts, as the junior pitcher threw a perfect game to carry Virginia to a 2-0 victory over George Washington at Davenport Field.

Twenty-seven Colonials came to the plate and the Virginia pitcher vanquished them all, pitching a perfect game. He struck out 10 batters while recording his momentous feat. Roberts got Ryan Thomas to ground out for the final out of the game.

Tom Gately came up short on the rubber for the Colonials, recording a loss. He went three innings, walked two, struck out one, and allowed two runs.

Kudos to the computer for not ignoring the perfect game. That alone gives this round to the robot journalist, as NPR argued. But the computer gets cocky and becomes preoccupied with the perfect game. Within the first three sentences, the perfect game is not only referenced four times but we also are told exactly what a perfect game is (“Twenty-seven Colonials came to the plate and the Virginia pitcher vanquished them all”) and that it is indeed a “momentous feat.” We are not, however, given any context. Was this the first time in the pitcher’s career he had thrown one? Was it the first in school history? This is not a good article. It gets one fact right and everything else wrong. It is bogged down by repetition, unnecessary explanations, and, really, who uses “vanquished” with a straight face anymore? Excuse me, a straight emoticon. Deadspin, NPR and others seem content to recognize that the computer mentioned the perfect game without asking the seminal question all articles must answer: Can it be published?

The writer at TNW reads the above robo-written article and calls it “impressive.”

“Sports reporters should have real cause for concern, it seems,” he writes.

And he’s right.

But not for the reason he thinks.

We’re not in trouble because someone has written a program that can vomit out a game story with all of the info in a box score and lace it up with hyperventilating verbs. That doesn’t scare me. Bring it on, HAL. We’re in trouble because a fellow writer thinks it’s good. That’s a real cause for concern. When the consumer — heck, when somebody in the business of writing, a peer — sees the above game story and finds it acceptable, then we’ve lost ground. Skynet is upon us. There will be no demand for quality coverage or quality writing in the sports page when a reader like this cannot be bothered to recognize it.

Allow me to hop up on my soap box, because I’ve written about this before, back in September, on Tumblr and really it’s a much better fit for what I’m trying to do here …


ST. LOUIS – There was a legend when I was a young baseball writer that eventually every reporter covers enough games, sees enough results, and writes enough copy that he could keep a Rolodex of gamers. They could be indexed by outcome (Rout, seven runs or greater; Walk-off), theme (Injury, back from; Redemption, veteran) or feat (Home Runs, three hit; Shutout, one hit allowed). Simply spin the directory, thumb the appropriate gamer and fill in the new names, appropriate score, location and perhaps spruce it up with some hip new verbs. And, voila! -30-.

Far from being something to achieve, the Rolodex of gamers was a cautionary tale, something to work diligently to avoid before you became repetitive and obsolete.

Apparently the Rolodex is real. It’s coming for our jobs.

In Sunday’s New York Times, a business column by Steve Lohr explored the growing efficiency and effectiveness of “robot journalists,” or artificial intelligence programs that are writing – ahem, producing – articles. Lohr’s story focuses on Narrative Science, a tech company in Evanston, Ill., home of that other top journalism school in America, Northwestern’s Medill School of Journalism. Narrative Science specializes in computer-generated content, and that content is getting less computer-like all the time. For his lede, Lohr quotes from an article written by the Narrative Science program. Of course, it’s a sports story.

“WISCONSIN appears to be in the driver’s seat en route to a win, as it leads 51-10 after the third quarter. Wisconsin added to its lead when Russell Wilson found Jacob Pedersen for an eight-yard touchdown to make the score 44-3 … . ”

Sportswriting is the natural place for the roboreporters to start their revolution. Games are easy to distill into numbers, right down to the integers that are the very definition of sport – the final score. What happened in a baseball game can be conveyed in a box score or strings of code that detail play by play. We are able to quantify everything these days – right down to the millimeter of break on Mariano Rivera’s cut fastball – and all that info plays right into the roboreporter’s wheelhouse. The computer doesn’t have any problem taking this data and transforming it into a paint-by-numbers game story that tells what happened.

The improving quality of articles from these AI programs prompted Businessweek.com to ask in August 2010, “Are Sportswriters Really Necessary?” The article, by Justin Bachman, used press releases from college sports information departments – again press releases, not game stories from beat writers; press releases! – to compare the flesh-and-blood copy against the J-bot’s. In the computer-generated story, the program writes (relatively speaking) about a college baseball game: “The Hawkeyes (16-21) were unable to overcome a four-run sixth inning deficit. The Hawkeyes clawed back in the eighth inning, putting up one run.”

“There’s no human author and no human editing,” Narrative Science’s CEO Stuart Frankel told Bachman more than a year ago. “But the stories sound really good.”

No, news flash, they don’t.

They sound formulaic. They sound stilted. They are, by the nature of their learning database, going to rely on cliché instead of toy with cliché. They are dull.

But, here’s the problem: It may not matter.

Narrative Science has 20 customers, according to The New York Times article. One of them is the Big Ten Network, which used computer-generated coverage from football and basketball games to update its Web site’s content. About halfway through Lohr’s chilling article is this heart-stopper:

Those reports helped drive a surge in referrals to the Web site from Google’s search algorithm, which highly ranks new content on popular subjects, (Big Ten Network official Michael) Calderon says. The network’s Web traffic for football games last season was 40 percent higher than in 2009.

Traffic rules. Clicks matter.

We have all been schooled in the importance of SEO, Search Engine Optimization. It is why on first reference the Cardinals are always the St. Louis Cardinals and never the Cards. It’s why online headlines seem so cumbersome, less conversational. Tags are way more important than the byline, but the byline can be a tag to make it easier for Google to find a specific writer. Now, here is a program that specializes in producing articles tailor-made for SEO success. Writers with hearts have to force themselves to make headlines and sentences more SEO-friendly and, thus, draw traffic through Google and Yahoo! and other sites. J-bots do it innately. It is literally what they were created to do — cater to the clicks.

“The leaders of Narrative Science emphasized that their technology would be primarily a low-cost tool for publications to expand and enrich coverage when editorial budgets are under pressure,” Lohr wrote in Sunday’s Times.

Well, that’s comforting. So these automated writing programs are designed to help newspapers or news organizations that have dwindling travel budgets, limited ability to pay reporters overtime, shrinking staffs, reduced manpower on the copy desk and a readership that is ever-hungrier for free content.

That narrows it down to only every newspaper that, um, still exists.

As the news business changes and rolls with punch after punch, the hope is that not just any content wins, but quality content does. Several long-form sportswriting sites (Grantland, The Classical) are out to prove this (again). It’s not enough for a consumer to know the difference between a computer-generated game story and one written by a beat writer. That consumer has to value the game story written by the beat writer. The legend of the Rolodex wasn’t a goal; it was a reminder of something to avoid — no matter how many thousands of games you cover, don’t fall into the trap of repetition. As the Narrative Science advances illustrate, any old program can retell What happened. It’s the other tenets of journalism that are missing. A deft beat writer can also explain Why it happened and How it happened. Context isn’t the computer’s strength. The two S’s are – speed and salary.

As journalism students at Mizzou, a few of us on the sports desk had a derisive name for stories that focused only on What happened and didn’t offer anything stylish, substantive or, you know, human. The name came from the small type that makes up box scores and it implied that the writer just dropped adjectives, active words, names and transitions into the raw stats. We called these stories “Agate with Verbs.” It was a joke.

It’s not so funny anymore.


Social media has radically transformed how we do business. Twitter has not only made writers more accessible but has also made them more competitive and news ever more instant. A computer cannot work the lobby in Dallas for scoops from the Winter Meetings. A computer can stick a recorder in David Freese’s face and ask about the game-winning home run in the World Series, but it won’t understand what he means when he says he’s come a long way from “almost walking away.” Heck, a few human reporters didn’t catch the reference. Despite the rise of algorithm journalism and small corners of the world where the robot’s perfect game story is “impressive,” technology isn’t a threat to sports reporters. Technology only enhances our worth.

Are sportswriters necessary?

It doesn’t take a computer to come up with an answer.



