
Behind the Scenes of @RealHumanPraise: Last night I went to the taping of The Colbert Report to witness the unveiling of @RealHumanPraise, a Twitter bot I wrote that reuses blurbs from movie reviews to post sockpuppet praise for Fox News. Stuff like this, originally from an Arkansas Democrat-Gazette review of the 2006 Snow Angels:

There is brutality in Fox News Sunday, but little bitterness. Like sunlight on ice, its painful beauty glints and stabs the eyes.

Or this, adapted (and greatly improved) from Scott Weinberg's review of Bruce Lee's Return of the Dragon:

Certainly the only TV show in history to have Bill O'Reilly and John Gibson do battle in the Roman Colosseum.

Here's the segment that reveals the bot. The bot actually exists: you can follow it on Twitter, and as of this writing about 11,000 people have done so. (By comparison, my second-most-popular bot has 145 followers.) I personally think this is crazy, because by personal decree of Stephen Colbert (I may be exaggerating) @RealHumanPraise makes a new post every two minutes, around the clock. So I created a meta-bot, Best of RHP, which retweets a popular review every 30 minutes. Aaah... manageable.

I figured I'd take you behind the scenes of @RealHumanPraise. When last we talked bot, I was showing off Col. Bert Stephens, my right-wing bot designed to automatically argue with Rob Dubbin's right-wing bot Ed Taters. Rob parlayed this dynamic into permission to develop a prototype for use on the upcoming show with guest David Folkenflik, who revealed real-world Fox News sockpuppeting in his book Murdoch's World.

Rob's original idea was a bot that used Metacritic reviews. He quickly discovered that Metacritic was "unscrapeable", and switched to Rotten Tomatoes, which has a pretty nice API. I came in after the prototype stage. Rob can code--he wrote Ed Taters--but he's not a professional developer and he had his hands full writing the show. So around the 23rd of October I started grabbing as many reviews from Rotten Tomatoes as the API rate limit would allow. I used IMDB data dumps to make sure I searched for movies that were likely to have a lot of positive reviews, and over the weekend I came up with a pipeline that turned the raw data from Rotten Tomatoes into potentially usable blurbs.

The pipeline uses TextBlob to parse the blurbs. I used a combination of Rotten Tomatoes and IMDB data to locate the names of actors, characters, and directors within the text, and a regular expression to replace them with generic strings.
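The name-replacement step could look roughly like the following. This is a minimal sketch, not the bot's actual code: the `genericize` function, the `NAMES` table, the name "Jane Example", and the `fullname_female` placeholder key are all made up for illustration.

```python
import re

# Hypothetical lookup table. In practice it would be built per movie from
# cast and crew data; "Jane Example" is a made-up name.
NAMES = {
    "Jane Example": "%(fullname_female)s",
    "Example": "%(surname_female)s",
}

def genericize(blurb, names=NAMES):
    """Replace every known name in a blurb with a generic placeholder."""
    # Escape literal percent signs so the result survives %-formatting later.
    blurb = blurb.replace("%", "%%")
    if not names:
        return blurb
    # Try longer names first so a full name wins over a bare surname.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b"
    )
    return pattern.sub(lambda match: names[match.group(0)], blurb)

print(genericize("Jane Example carries the movie. Example is fantastic."))
```

Sorting the names longest-first matters because regex alternation takes the first branch that matches, so a bare surname listed early would otherwise shadow the full name.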

The final dataset format is heavily based on the mad-libs format I use for Col. Bert Stephens, and something like this will be making it into olipy. Here's an example:

It's easy to forgive the movie a lot because of %(surname_female)s. She's fantastic.

Because I was getting paid for this bot, I put in the extra work to get things like gendered pronouns right. When that blurb is chosen, an appropriate surname from the Fox roster will be plugged in for %(surname_female)s.
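The %(name)s syntax is ordinary Python string formatting with a dictionary, so filling in a blurb can be sketched in a few lines. The `ROSTER` contents and the `fill_blurb` helper here are illustrative assumptions, not the bot's real data or code.

```python
import random

# Illustrative roster only; the bot's real list is not shown in this post.
ROSTER = {
    "surname_male": ["O'Reilly", "Gibson"],
    "surname_female": ["Kelly"],
}

def fill_blurb(template, roster=ROSTER, rng=random):
    """Pick one name per slot, then let %-formatting fill the template."""
    values = {slot: rng.choice(options) for slot, options in roster.items()}
    # Keys the template does not mention are simply ignored.
    return template % values

template = (
    "It's easy to forgive the movie a lot because of "
    "%(surname_female)s. She's fantastic."
)
print(fill_blurb(template))
```

Keeping male and female surnames in separate slots is what lets the "She's" in the blurb stay correct no matter which name gets picked.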

I worked on the code over the weekend and got everything working except the (relatively simple) "post to Twitter" part. On the 28th I went into the Colbert Report office and spent the afternoon with Rob polishing the bot. We were mostly tweaking the vocabulary replacements, where "movie" becomes "TV show" and so on. It doesn't work all the time but we got it working well enough that we could bring in a bunch of blurbs that wouldn't have made sense before.
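A vocabulary replacement pass of that kind might be sketched like this. Only the "movie" to "TV show" mapping comes from the description above; the other `VOCAB` entries and the `swap_vocabulary` function are assumptions for the sake of the example.

```python
import re

# Only "movie" -> "TV show" is taken from the post; the rest are guesses.
VOCAB = {
    "movie": "TV show",
    "movies": "TV shows",
    "film": "TV show",
    "films": "TV shows",
}

def swap_vocabulary(text, vocab=VOCAB):
    """Replace whole words found in vocab, keeping a leading capital."""
    ordered = sorted(vocab, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in ordered) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        word = match.group(0)
        new = vocab[word.lower()]
        if word[0].isupper():
            new = new[0].upper() + new[1:]
        return new

    return pattern.sub(replace, text)

print(swap_vocabulary("The movie is great. Movies like this are rare."))
```

The word boundaries keep "moviegoers" intact, but a pass like this still can't fix everything, such as a blurb that only makes sense for something shown in a theater.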

Most of the tweets mention a Fox personality or show, but a minority praise the network in general. These tweets have been given the Ed Taters/Col. Bert Stephens treatment: a small number of their nouns and adjectives are replaced with other nouns and adjectives found in the corpus, giving the impression that the sock-puppetry machine is running off the rails. This data is marked up with Penn part-of-speech tags like so:

... the film's %(slow,JJ)s, %(toilsome,JJ)s %(journey,NN)s does not lead to any particularly %(shocking,JJ)s or %(interesting,JJ)s revelations.
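Here is one way a template like that could be processed, as a rough sketch rather than the bot's real implementation. Each slot carries the original word and its tag, and now and then a slot gets swapped for a different word with the same tag. The `derail` function, the `swap_chance` parameter, and the tiny `words_by_tag` table are all hypothetical; a real table would be collected from the tagged corpus.

```python
import random
import re

# Matches slots such as %(slow,JJ)s and captures the word and the Penn tag.
SLOT = re.compile(r"%\(([^,()]+),([A-Z$]+)\)s")

def derail(template, words_by_tag, swap_chance=0.25, rng=random):
    """Fill each slot with its own word, or sometimes a same-tag stranger."""
    def fill(match):
        word, tag = match.group(1), match.group(2)
        candidates = words_by_tag.get(tag)
        if candidates and rng.random() < swap_chance:
            return rng.choice(candidates)
        return word
    return SLOT.sub(fill, template)

template = (
    "... the film's %(slow,JJ)s, %(toilsome,JJ)s %(journey,NN)s does not "
    "lead to any particularly %(shocking,JJ)s or %(interesting,JJ)s revelations."
)
# Hypothetical stand-in for words gathered from the rest of the corpus.
words_by_tag = {"JJ": ["brutal", "painful"], "NN": ["sunlight", "ice"]}
print(derail(template, words_by_tag))
```

Setting `swap_chance` to 0 gives back the original sentence, and turning it up makes the output progressively more unhinged.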

Here's a very crazy example. Again, you'll eventually see tools for doing this in olipy. It ultimately derives from a mad-libs prototype I wrote a few months ago as a way of cheering up Adam when he was recovering from an injury.

We deployed the bot that afternoon of the 28th and let it start accumulating a backlog. It wasn't hard to keep the secret but it did get frustrating not knowing for sure whether it would make it to air. It's a little different from what The Colbert Report normally does, and I get the feeling they weren't sure how best to present it. In the end, as you can see from the show, they decided to just show the bot doing its stuff, and it worked.

It was a huge thrill to see Stephen Colbert engage with software I wrote! I wasn't expecting to see the entire second segment devoted to the bot, and then just when I thought it was over he brought it out again during the Folkenflik interview. While we were all waiting around to see whether they had to re-record anything, he pulled out his iPad Mini yet again and read some more aloud to us. Can't get enough!

After the show Rob took me on a tour of the parts of the Colbert Report that were not Rob's office (where I'd spent my entire visit on the 28th). We bumped into Stephen and he shook my hand and said "good job." I felt this was a validation of my particular talents: I wrote software that made Stephen Colbert crack up.

Sumana, Beth, Rob and I went out for a celebratory dinner, and then I went home and watched the follower count for RHP start to climb. Within twenty minutes of the second segment airing, RHP had ten times as many Twitter followers as my personal account. And you know what? It can have 'em. I'll just keep posting old pictures of space-program hardware.


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.