Worst Episode Ever: Time for some more IMDB fun. Last time I looked at whole years of television. This time, I'll graph the ratings for individual episodes of TV shows. Can we watch shows get better or worse over time?

We sort of can. The problem is that only a true fan bothers to go to IMDB and rate individual episodes of a TV show. So you can't really trust the episode ratings--they're too high. But we can visualize trends in show quality, as perceived by the fans.
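For anyone who wants to put a number on a trend like this, here is a minimal sketch of a least-squares line through per-episode ratings, along with its r-squared value. This is not necessarily how the graphs in this post were made. The function name `fit_trend` and the sample ratings are made up for illustration; episodes are simply numbered 0, 1, 2, ... in air order.

```python
def fit_trend(ratings):
    """Fit a least-squares line through (episode index, rating) points.

    Returns (slope, intercept, r_squared). Episodes are indexed
    0, 1, 2, ... in the order given.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two episodes to fit a line")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ratings) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ratings))
    syy = sum((y - mean_y) ** 2 for y in ratings)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # If every rating is identical there is no variation to explain;
    # report 0.0 rather than dividing by zero.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 0.0
    return slope, intercept, r_squared


# Hypothetical ratings, NOT real IMDB data.
example = [7.0, 7.2, 7.1, 7.5, 7.4, 7.8]
slope, intercept, r2 = fit_trend(example)
print(f"slope per episode: {slope:+.3f}, r-squared: {r2:.2f}")
```

A positive slope means the ratings drift upward over the run of the show. An r-squared near zero means the line explains very little of the episode-to-episode scatter.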

For these visualizations you want long-running series with lots of die-hard fans. So let's start with Star Trek:

(Note the very last data point in that one. That's the series finale, which everyone hates.)

There's a lot of scatter, but you can generally see the common Star Trek pattern of the show getting better as the ensemble cast comes together. Except for the original series, which ended with a lousy season. Now let's look at another nerd favorite, "Buffy the Vampire Slayer":

Beth requested that one. I've seen exactly one episode of Buffy so I wasn't expecting anything in particular. It looks like a show that's consistently good, but wildly inconsistent within the bounds of "consistently good". It doesn't really get better over time. Maybe the Voyager and DS9 graphs look the same to someone who's not a Trek fan.

But compare "Mystery Science Theater 3000", which gets drastically better over time. When I was younger I would have disputed this finding, but now I basically agree with this graph:

I did a lot more graphs, but I'll just show two more. Here's the graph for "The Simpsons", a very long-running show with a very fickle fan base (see title of this post):

Wow! I love this graph! I don't know enough about the history of the show to name the historical trends, but I'm pretty sure a Simpsons fan will see a big part of their life history reflected in this graph.

I wanted to see if this sort of coherent shape was just an artifact of the fact that "The Simpsons" has been on the air for over 20 years, so I graphed another long-running show notorious for huge variation in quality, "Saturday Night Live":

You can definitely see where things went wrong, but even within a season there's huge variation in quality. The Simpsons is created by the same people every week, whereas SNL has two wild cards every week: its guest host and musical guest. And since it's sketch-based, three good or three awful minutes can make or break the entire episode.

Next up, the third and possibly final part of this analysis, in which I'll pit fans of a show against the general public.

PS: For the record, according to IMDB data, the actual worst episode ever of "The Simpsons" was #9.11, "All Singing, All Dancing".

Update: People in comments had questions I can't answer because I only know how to do very basic statistics, but they also had questions about how many people rated the episodes, which I can answer. This table shows how many people have rated each series as a whole, as well as the median and mean numbers of ratings for every episode that has any ratings. I also included how many people rated the first episode, how many rated an episode in the middle, and how many rated the last/most recent episode.
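As a rough illustration, summary numbers like these could be computed from a list of per-episode vote counts along the following lines. The function name `summarize_votes`, the use of the population standard deviation, and the choice of "middle" as the episode at the halfway index are all assumptions; the post does not specify any of them. The vote counts in the example are made up.

```python
import statistics


def summarize_votes(votes_per_episode):
    """Summarize how many people rated each episode of a series.

    Episodes with no ratings are skipped, matching the description
    "every episode that has any ratings".
    """
    rated = [v for v in votes_per_episode if v > 0]
    if not rated:
        raise ValueError("no episode has any ratings")
    return {
        "median": statistics.median(rated),
        "mean": statistics.mean(rated),
        # Assumption: population standard deviation. Use
        # statistics.stdev for the sample version instead.
        "std": statistics.pstdev(rated),
        "first": rated[0],
        # Assumption: "middle" is the episode at the halfway index.
        "middle": rated[len(rated) // 2],
        "most_recent": rated[-1],
    }


# Hypothetical vote counts in air order, NOT real IMDB data.
summary = summarize_votes([900, 120, 0, 150, 130, 400])
for name, value in summary.items():
    print(f"{name}: {value}")
```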

Series                                   Series ratings  Per-episode median  mean    std     First show/Middle/Most recent
"Buffy the Vampire Slayer" (1997)        34564           498                 553.41  224.88  8625111091
"Enterprise" (2001)                      8843            140                 189.27  242.28  2397130152
"Mystery Science Theater 3000" (1988)    6650            57                  65.54   47.41   2178131
"Saturday Night Live" (1975)             10151           15                  19.86   15.65   1121160
"Star Trek" (1966)                       12695           419                 480.95  222.83  6683891923
"Star Trek: Deep Space Nine" (1993)      9779            172                 188.32  107.37  1501151290
"Star Trek: The Next Generation" (1987)  16974           329                 375.62  354.49  21893184580
"Star Trek: Voyager" (1995)              11245           153                 169.08  110.96  1558177348
"The Simpsons" (1989)                    15578           319                 355.07  173.09  221430996

So SNL actually has very few ratings per episode, while The Simpsons is on par with ST:TNG. It's common for the first episode and the finale to have many more ratings than others. And here's a graph of the number of people who have rated "The Simpsons" over time:

Posted by Zack at Wed Feb 22 2012 22:13

I'm wondering where the season boundaries fall on these charts: for instance, I stopped watching "Buffy the Vampire Slayer" one episode before the end of season 4 because I had Read On The Internets that seasons 5-7 were not nearly as good as 1-4 (the conclusion to the season 4 overplot is one episode before the end, the final episode sets up season 5, AIUI).

Posted by Andrew at Thu Feb 23 2012 03:18

It'd be good to see some r-squared values on those lines. And perhaps an indication of the polling size for each episode - it may well be that a series that's gone off the boil attracts only die-hard (and super-supportive) fans by the end.

Another idea that might be worth trying would be to fit separate regressions for each season of a TV series. On the whole, things seem to be headed in one direction or the other, but The Simpsons and SNL seem to have more "structure" in their ratings.

Anyway, nice work!

Posted by Nathaniel at Thu Feb 23 2012 08:00

Part of the reason the Simpsons and SNL have more structure might just be that a lot of people rate those shows, so there's less sampling variability in each data point?

I'm a little skeptical that there's a trend at all for some of those Star Trek graphs. Did you compute t or p values?

Posted by Leonard at Thu Feb 23 2012 08:40

I didn't, because I don't actually know any statistics. But I'll post an update about number-of-raters.

Posted by Andrew at Mon Feb 27 2012 11:15

Hey - good job following things up a bit. Your Simpsons results kind-of represent the sort of phenomenon that I thought might be going on. And the SNL numbers make one a lot less confident about the patterns there.

Of course, that these numbers document a self-selected sample anyway should make one intrinsically cautious. But it's still good to see them. Thanks for your efforts!

Posted by Seb at Mon Mar 05 2012 09:52


Thanks for all the work.
But why do you insist on using linear interpolation?

It's sometimes obvious that the variation is not linear, as with "Saturday Night Live".

And, just like Andrew, I think it would be a good indication to display the r-squared values.

Nice work


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.