Anthology Of Ruby Cookbook Recipes That's Not The Ruby Cookbook Itself #3: Sparklines: Okay, thanks to much work by Kevin, the nameserver is back up, and I can return you to the Book Sales Trilogy, where I sell the Ruby Cookbook using software that tracks how many copies people have already bought. That's what I call recycling!

In previous installments I showed you how to get sales rank information about a book from Amazon, and how to create a graph of the data over time. Now my vengeance will be complete: I will unleash sparklines upon the world!

Sparklines are really interesting: bits of anonymous data that add quantitative analysis to text, in some cases without breaking the flow of a sentence. The meaning of a sparkline depends on context or a tiny textual label, not on big sets of axial markings. I've been a fan of sparklines for a while, and what better way to propagandize them than by inclusion into a book of cool tricks? THERE IS NONE. So here we go with the final part of the trilogy:

  1. Getting book information with Ruby/Amazon.
  2. Generating graphs with Gruff.
  3. Generating sparklines with the sparklines gem.

As you know, Bob, in previous episodes we defined a SalesReport class which encapsulates information about a product's sales rank over time. Then we extended the class to give it the ability to write out a big graph describing the history of its sales rank. Now we're going to extend it again, and give it the ability to write out a sparkline.

Like Gruff, the sparklines gem is a great piece of code by Geoffrey Grosenbach. It's simpler than Gruff because sparklines are simpler than graphs. It does have some issues you need to watch out for, though. I cover sparklines in Recipe 12.5, "Adding Graphical Context with Sparklines". In the book I give some silly examples focused on embedding sparklines into HTML pages with the data: URI scheme, and incorporating sparklines into Rails views with the sparkline_generator gem. Here, I'll show you how to write sparkline graphics to static PNG files.

The sparkline images on the Crummy homepage (one for the Cookbook and one for Beginning Python) show sales rank values for the past 30 days. How do I make them? Let's see.

class SalesReport
  # Make a sparkline for the sales of this product.
  def make_sparkline(graph_path, time_units=30, samples_per_unit=24)
    path = File.join(graph_path, "#{@asin}-salesrank-sparkline.png")

    # Gather a sample of the data
    sample = []
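    (size-(time_units*samples_per_unit)-1).step(size-1, samples_per_unit) do |i|
      # 1.0 forces float division, in case the rank is stored as an Integer
      # (1/rank would then truncate to 0).
      sample << 1.0/self[i][1]
    end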

A sparkline needs to be small, and unlike Gruff, sparklines doesn't compress data points. Gruff takes an image size in pixels, and whether you give it 8 data points or 800, your graph is that size. But if you give sparklines 800 data points, you get a really long sparkline. I can't use all the data I've gathered since these books showed up on Amazon: I need to economize.

This is fine. I don't need the whole history for a sparkline like I do with a big graph. I'll settle for the sales rank history from the past 30 days. But I run my data collector every hour, and that's still 24*30=720 data points. Solution: I go back 720 hours, and skip ahead 24 samples at a time. This way I pick a sample sales rank for every day, ending with the most recent sample. You can change time_units and samples_per_unit to customize your sparkline.
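The index arithmetic is easier to see on a plain array. Here's a minimal sketch, assuming the hourly readings live in an ordinary array; the hourly data and the sample_indices helper are made up for illustration and aren't part of SalesReport:

```ruby
# Hypothetical helper: the indices visited by the stepping loop,
# for a data set holding `size` hourly readings.
def sample_indices(size, time_units = 30, samples_per_unit = 24)
  first = size - (time_units * samples_per_unit) - 1
  first.step(size - 1, samples_per_unit).to_a
end

# Pretend we have 1000 hourly readings (made-up data: reading i is just i).
hourly = (0...1000).to_a
indices = sample_indices(hourly.size)

puts indices.first   # 279: 720 hours before the newest reading
puts indices.last    # 999: the newest reading
puts indices.size    # 31: one per day, counting both endpoints
```

Note that this assumes at least 721 readings exist. With fewer, the starting index goes negative, and a plain Ruby Array would then count from the end instead of failing.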

Now I have an array and I just have to send it to be turned into a sparkline:

    Sparklines.plot_to_file(path, scale(sample), :type => 'smooth',
                            :line_color => 'black')
  end

Well, not quite. Earlier I mentioned that sparklines line graphs don't compress data horizontally. There's a fixed number of pixels between each point. Well, sparklines doesn't compress data vertically either. A line graph can handle values between 0 and 100. You can make the sparkline bigger but I'm pretty sure you can't increase that range. If you give values outside that range they get clipped or ignored.

In this case, our numbers are already between 0 and 100, since we're taking the reciprocals of numbers greater than one. But they're also between 0 and 1, as we saw in the previous installment. When the range is 0-100, this makes for boring sparklines. If your book rocketed to the top of the Amazon charts and stayed there, its amazing success would look like this: (caution: embedded sparkline not visible in IE).
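To see how little of the 0-100 range the raw reciprocals use, here's a quick sketch with invented sales ranks. Whether SalesReport stores ranks as Integers or Floats isn't shown in this post, so the sketch forces floating-point division to be safe:

```ruby
# Made-up sales ranks (lower is better).
ranks = [1500, 1200, 900, 4]

# 1.0 forces floating-point division; 1/rank with Integer ranks would
# truncate to 0 for every rank above 1.
reciprocals = ranks.map { |rank| 1.0 / rank }

reciprocals.each { |r| puts format("%.5f", r) }
puts reciprocals.max <= 1.0   # true: nothing gets near the top of 0-100
puts 1 / 1500                 # 0: integer division loses everything
```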

To make our data visible, we need to stretch it out. We might as well go for maximum stretch: treat the smallest sampled rank as zero, and the largest as 100. That's what the scale method mentioned above does. Here's the definition:

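  # Scale data so that the smallest item becomes bottom (default 0) and the
  # largest becomes top (default 100).
  def scale(data, bottom=0, top=100)
    min, max = data.min, data.max
    # A flat data set has no range to stretch; pin it to the bottom
    # instead of dividing by zero.
    return data.collect { bottom } if max == min
    # to_f avoids integer division when the data are Integers.
    scale_ratio = (top-bottom).to_f/(max-min)
    data.collect { |x| bottom + (x-min) * scale_ratio }
  end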

I gave a similar method in the sparklines recipe in the book, but it's hard-coded to scale a range to 0-100. This implementation is more general and you can use it anywhere. I really should have made this a separate recipe because it kept coming up. In 12.14 "Generating MIDI Music" (which I hope to write about later) I reused this formula for a different range and got it wrong, and had to send in an erratum. In 2.12 "Using Complex Numbers" I used a similar formula to scale an ASCII drawing of the Mandelbrot set to any desired size. So it turns out this is a useful method to have.
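Here's a standalone sketch of the same idea that you can paste into irb. The name scale_values and the sample numbers are made up for this example; the second call uses an arbitrary 10-20 target range just to show that the formula isn't tied to 0-100:

```ruby
# Standalone version of the scaling idea, written as a plain method.
# scale_values is a name invented for this sketch.
def scale_values(data, bottom = 0, top = 100)
  min, max = data.minmax
  return data.map { bottom.to_f } if max == min  # flat data: nothing to stretch
  ratio = (top - bottom).to_f / (max - min)
  data.map { |x| bottom + (x - min) * ratio }
end

p scale_values([2, 4, 6, 10])          # [0.0, 25.0, 50.0, 100.0]
p scale_values([2, 4, 6, 10], 10, 20)  # [10.0, 12.5, 15.0, 20.0]
```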

Anyway, now we can scale any data set to take maximum advantage of the vertical space allotted us by a sparkline. In the book I lightly discuss the ramifications of scaling all these different data sets to the same 0-100 range. The Cookbook sells much better than Beginning Python, but you can't tell that from the sparklines. Which is fine here, because I want these sparklines to show trends.
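A tiny sketch of that tradeoff, with invented numbers and a throwaway helper (stretch_to_100 is not from the book or the gem): two series that differ by a factor of 100 come out identical once each is stretched to fill 0-100.

```ruby
# Hypothetical helper: stretch a series so it fills 0-100.
# Assumes the series is not flat (max != min).
def stretch_to_100(data)
  min, max = data.minmax
  data.map { |x| (x - min) * 100.0 / (max - min) }
end

strong_seller = [100, 200, 300]  # made-up numbers
weak_seller   = [1, 2, 3]        # same shape, 100 times smaller

p stretch_to_100(strong_seller)  # [0.0, 50.0, 100.0]
p stretch_to_100(weak_seller)    # [0.0, 50.0, 100.0]
```

The trend survives; the magnitude doesn't.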

The sparklines gem lets you do other kinds of sparklines: pie charts are my favorite, as you'll see in the book. I'm really excited about the possibilities of sparklines: they're like the little crawls at the bottom of television news broadcasts, except they're classy and related to the main text.

The full sales rank monitor program is available here. I hope other authors find it useful.

What's next for Best of Ye Book of Ruby-Receipts? I don't know. These promotional tutorials take surprisingly long to write: it's almost as much work as writing the original recipes was. So I may give it a rest for a while. But let me know if there's a recipe in the Cookbook you'd like to see me explore in more detail, in the context of a real-world program like the sales rank monitor.

Posted by Aaron Swartz at Sun Jul 30 2006 20:44

"show pagerank values for the past 30 days"

I think you mean salesrank

Posted by Leonard at Sun Jul 30 2006 20:48

Fixed, thanks.

Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.