
Lack-Of-Cliffhanger Cliffhanger: I read Dan Simmons' Hyperion and for the most part it was great. The universe is well fleshed out and has fairly credible newslang and people act believably. It reminds me of the universe in The Legend Lives! IF game, which I always liked and which I think has the most interesting universe I've seen in sci-fi IF, albeit one of the worst titles. (Obligatory disclaimer: Planetfall and Suspended and probably others are better games.) Geez, I haven't played any IF or felt guilty about abandoning my IF project for probably a year. Anyway.

The book is really good in general but parts of it are very boring (boringness is not correlated with "action" in the crude sense: some of the most boring scenes describe firefights and daring escapes). In particular there's a boring part about 40 pages from the end which goes on and on with fisticuffs and murder and repercussions and blah and blah. Maybe it's just boring because it's a framed story inside a much more exciting story that's coming to a head, and I've come to expect the last-minute revelation in these stories that someone is not who he seems. In my sci-fi epic, revelations that people are not who they seem will come at unpredictable times.

This framed story's eating up more and more pages and I realize that not only is this book the first part of a multi-part series, but it's not going to have a real conclusion or even a cliffhanger. Instead it will be as though it was originally a longer book that got split down the middle. Sure enough, that's exactly what happens. Just a chapter break with no next chapter after it, even though with about ten more pages there could have been a real heart-pounding "to be continued" cliffhanger. I never realized this before, but this is more frustrating than a real cliffhanger!

So, good book, but no resolution, even by the accepted standards of first books in series. I'd advise you to get the first and the second books at once, but maybe you won't like the first book and you'll blame me for forcing you to buy into the capitalist publishing machine when all I did was advise you to do this. Also, the copy I got in the used bookstore has much better cover art than the overly literal cover art you see for the book at Amazon. So get the, ah, 1991 edition if you want to feel threatened every time you close the book, instead of feeling like you're reading an Edward Scissorhands novelization.

RSS feed construction helper: This is kind of a neat tool I wrote but I'm not sure what to call it. It's a wrapper around the famous PyRSS2Gen library. It provides simple pickle-based backend storage for state relating to an RSS feed that's screen-scraped from a web page. Some of the state it stores is redundant with the actual content of the RSS feed, but some of it is contextual information like "when was the first time a screen-scrape attempt found the item that has this guid?" (I am too tired to explain why this information is useful, but trust me, it quite often is.)
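To make the "first time we saw this guid" idea concrete, here's a minimal sketch of pickle-backed state. It is not the code in RSSHelper.py: the names FeedState, note, and save are made up for illustration, and the real module presumably stores more than this one dictionary.

```python
import os
import pickle
import tempfile
import time


class FeedState:
    """Hypothetical pickle-backed store (an assumption, not RSSHelper.py)."""

    def __init__(self, path):
        self.path = path
        self.first_seen = {}  # guid -> Unix timestamp of the first sighting
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.first_seen = pickle.load(f)

    def note(self, guid, now=None):
        """Record a sighting; return the time this guid was FIRST seen."""
        if now is None:
            now = time.time()
        # setdefault keeps the original timestamp on every later sighting.
        return self.first_seen.setdefault(guid, now)

    def save(self):
        with open(self.path, "wb") as f:
            pickle.dump(self.first_seen, f)


# Two scrape runs against the same pickle file.
state_path = os.path.join(tempfile.mkdtemp(), "feed_state.pickle")

first_run = FeedState(state_path)
first_run.note("book-123", now=1000.0)
first_run.save()

second_run = FeedState(state_path)
remembered = second_run.note("book-123", now=2000.0)  # already known
fresh = second_run.note("book-456", now=2000.0)       # new on this run
```

The second run gets back the timestamp from the first run for the guid it already knows about, so a scraper can tell new items from old ones even when the page itself carries no dates.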

The tool provides convenience functions for fetching a new version of the web page, and does the good-citizen ETag and Last-Modified thing, so all you have to do is write a hook method that scrapes the web page into a bunch of RSS items and adds them to the feed. As items go on the top of the feed, older ones automatically drop off the end.
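Here's roughly what the good-citizen fetch and the drop-off-the-end behavior look like, as a sketch using only the modern Python standard library. The function names and the max_items default are assumptions for illustration; RSSHelper.py may do all of this differently.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators saved on the last fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if nothing changed."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: keep the old validators
            return None, etag, last_modified
        raise


def add_to_top(items, new_items, max_items=15):
    """Put new items first and let the oldest ones fall off the end."""
    return (list(new_items) + list(items))[:max_items]


feed = add_to_top(["c", "b", "a"], ["d"], max_items=3)
```

A scraping hook would call fetch_if_changed, skip all its work when the body comes back as None, and otherwise pass whatever it scraped to something like add_to_top.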

In conjunction with Beautiful Soup, this makes it incredibly easy for me to screen-scrape a web page into an RSS feed and have it keep working over time. Up to this point I've been trying to run the various Syndication Automat feeds out of NewsBruiser notebooks. It's a clever idea but cleverness is just about all it's got going for it. It's clunky and awkward, and there are some cases I just can't handle, such as the Dover page where the items I want to RSSify don't have any dates on them. (That's why the contextual information mentioned earlier in this entry is useful, BTW.)

I thought any alternative to using NewsBruiser would be a lot of backend work, but it's not. My module's about 150 lines of code. Here it is in a temporary location until I come up with a real name for it: RSSHelper.py. Here's a sample script that uses it and Beautiful Soup and ASCII, Dammit (actually the "HTML, Dammit" subset) to make a Dover automat feed.

It originally took me an hour of work and eventual failure to get a NewsBruiser notebook-backed feed for the Dover site. It took about five minutes to write that script linked above. And the new script actually works, instead of putting the same books into the feed over and over again.

I've got the same feeling with this as I did when I stopped writing custom parsers for screen-scraping and started using Beautiful Soup. Is this as cool as I think it is? Do things like this already exist? What should I call it?

PS: Danny deserves a midwife credit since I came up with this and wrote it while working on my second hack for the Life Hacks book.



Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.