[Comments] (22) It Lives!: NewsBruiser now has a rudimentary comment system with Bayesian spam filtering. Go ahead and try it out; if you break it I'll fix it. You know what this means: I need another icon. A speech bubble? A burst of flame? Well, I could probably draw a speech bubble myself.

Posted by Leonard Richardson at Tue Oct 21 2003 02:35

Hopefully this will show up on the static pages.

Posted by Greg Knauss at Tue Oct 21 2003 03:27


Admit it, geekboy, you've got body image problems. That's why you've squirreled yourself away for a goodly portion of your life behind a keyboard, plugged into a world where you're judged by your intellect and not the pathetic physical shell that you spent your adolescence sure you'd have evolved out of by now. You may desperately _want_ to be a brain in a jar, but things just aren't going to work that way. So what are you going to do about it? Huh? Geekboy? Huh?

Why, how about transferring your battered self-esteem into your penis, like most guys have done by age fifteen or so? It's fast! And easy! And once your self-image is based entirely around an accident of genetics and human physiology, will you be satisfied? Hell, no! You're all about meddling in God's realm! Boldly go where angels fear to tread!

Grow it! Expand it! EN1AR6E it! Use it to drive nails or beat small woodland creatures to death! You don't have to be content/ashamed of your manhood falling comfortably within the normal range of the bell curve, within a single standard deviation of the mean for one second more! No! Now that every aspect of your personality -- moral, intellectual and spiritual selves included -- has been tossed overboard as so much useless garbage, you've only got to concentrate on the one thing that makes you feel good as a human being: your penis.

Be huge! Be gargantuan! Be a freak of biology, lugging your organ behind you in a wheelbarrow! Be incapable of congress! Or even arousal, for fear that you'll pass out! Drive women wild with desire, assuming they have hips like the jaws of snakes, that can unhinge. Cervix be damned! Full speed ahead!

You _know_ you're hung up about sex, and here, at least, is something you can fool yourself into thinking makes a difference. You think _love_ makes good sex? Friendship? Tenderness? Playfulness? Weenie. What makes good sex is how physically mismatched the various organs are. If you can’t give her a vaginal Indian burn, you’re not a man.

And all this -- the reduction of as complex and elegant a machine as a human being; the collapse of its mind and its soul into a single, crude point -- is possible without any work on your part whatsoever! Isn't that great? It's a pill! Imagine that! We're selling instant, effortless gratification to superficial people! How about that?

So send $69.95 and whatever tattered remains of your self-respect you can find to greg@eod.com. And I'll be sure and get a bottle of magic penis pill out to you as soon as I possibly can. Once I stop laughing.

Posted by Kevan at Tue Oct 21 2003 05:28

Excellent. Of course, it'd be even better if the Bayesian filter whispered its opinion after every comment.

Is it possible to set up various Bayesian dictionaries - different names, but all working the same way? "This comment has a 1.5% chance of being spam, a 24.8% chance of being sarcastic and a 73.0% chance of being secretly written by Leonard himself."

Posted by Kevan at Tue Oct 21 2003 05:28

(Ah, I'm allowed HTML, but my innocent linebreaks get stamped on?)

Posted by Peter A. Peterson II at Tue Oct 21 2003 09:28

Wow. I decide to sleep for just one night this month, and I wake up to find NewsBruiser comments. What's this world coming to? An END. That's what.

Posted by Peter A. Peterson II at Tue Oct 21 2003 09:31

Greg, I was just thinking (speaking of accidents of genetics) that unhinging hips could pave the way for easier childbirth... there could be a market here.

Posted by Leonard Richardson at Tue Oct 21 2003 11:39

It's supposed to filter HTML; I'm not sure why it's not doing so. Every comment gets one line in the entry file, which is why linebreaks get smashed. Task #1 for tonight is to make the entry loader smart enough to handle multi-line fields and then treat blank lines as new paragraphs and do proper HTML filtering.
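
A rough sketch of how that could look, not the actual NewsBruiser code. It assumes the one-line-per-comment file is kept and newlines are backslash-escaped inside a field; the names `encode_field`, `decode_field` and `render_comment` are made up for illustration. For "HTML filtering" it takes the bluntest approach, escaping all markup rather than allowing a set of safe tags.

```python
import html
import re


def encode_field(text):
    """Squeeze a multi-line field onto one line (hypothetical storage format).

    Backslashes are doubled first so that a literal backslash followed by
    'n' in the comment is not confused with an escaped newline.
    """
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def decode_field(line):
    """Reverse encode_field: turn the escapes back into real characters."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def render_comment(text):
    """Escape all markup, then treat blank lines as paragraph breaks.

    Single newlines inside a paragraph become <br> so that hard-wrapped
    comments keep their line structure.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n".join(
        "<p>" + html.escape(p).replace("\n", "<br>") + "</p>"
        for p in paragraphs
        if p.strip()
    )
```

With something like this, a comment survives the round trip through a single-line field, and `render_comment(decode_field(stored))` produces one `<p>` per blank-line-separated chunk.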

Comments with spam probability >50% are not displayed except on the entry edit form. Comments with spam probability >80% are outright rejected (the latter is configurable; the former should probably be as well). There's currently one global spam corpus for the entire install, but late last night I decided that wasn't a good idea, so tonight I'm going to make it a per-notebook corpus. I also want to be able to export and import the corpus so that like-minded people can share their conception of what is spam.
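
As a minimal sketch of that two-threshold policy (the names `Verdict` and `triage` are invented here, and both cutoffs are shown as configurable, which the real code does not yet do for the 50% one):

```python
from enum import Enum


class Verdict(Enum):
    SHOW = "show"      # displayed with the entry
    HIDE = "hide"      # stored, but visible only on the entry edit form
    REJECT = "reject"  # refused outright, never stored


# Defaults taken from the description above; both are assumptions
# about how a configurable version might expose them.
HIDE_THRESHOLD = 0.50
REJECT_THRESHOLD = 0.80


def triage(spam_probability,
           hide_above=HIDE_THRESHOLD,
           reject_above=REJECT_THRESHOLD):
    """Map a spam probability in [0, 1] to what happens to the comment."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    if spam_probability > reject_above:
        return Verdict.REJECT
    if spam_probability > hide_above:
        return Verdict.HIDE
    return Verdict.SHOW
```

Both comparisons are strict, so a comment scoring exactly 50% is still shown and one scoring exactly 80% is hidden rather than rejected.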

There's currently no preference-storing cookie for the comment form, but that's just because I didn't have time to implement it before bedtime.

I do want to experiment with named Bayesian pools (e.g. "abusive") to see if other types of tar pit comments can be proactively filtered out. If nothing else, it can act as an IP blocker. You could also create non-judgemental pools like "sarcastic", though that sounds like a feature no one would really use. I like the idea of publishing the spam percentage, but worry, possibly unreasonably, that this would help spammers.
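
To make the named-pool idea concrete, here is a toy sketch. It is a plain two-class naive Bayes with add-one smoothing, which is an assumption and not necessarily the formula the spam filter uses; `Pool`, `tokenize` and `score_all` are hypothetical names.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into crude word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class Pool:
    """One named corpus ('spam', 'abusive', 'sarcastic', ...).

    Each pool answers a single yes/no question, so it keeps token
    counts for comments that belong to it and comments that do not.
    """

    def __init__(self, name):
        self.name = name
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}

    def train(self, text, belongs):
        self.counts[belongs].update(tokenize(text))
        self.docs[belongs] += 1

    def probability(self, text):
        """Estimate P(text belongs to this pool) with add-one smoothing."""
        vocab = set(self.counts[True]) | set(self.counts[False])
        # The extra +1 reserves one slot for words never seen in training.
        totals = {
            side: sum(self.counts[side].values()) + len(vocab) + 1
            for side in (True, False)
        }
        log_odds = math.log((self.docs[True] + 1) / (self.docs[False] + 1))
        for tok in tokenize(text):
            p_in = (self.counts[True][tok] + 1) / totals[True]
            p_out = (self.counts[False][tok] + 1) / totals[False]
            log_odds += math.log(p_in / p_out)
        # Clamp so math.exp cannot overflow on very long comments.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


def score_all(pools, text):
    """Score one comment against every pool; returns {name: probability}."""
    return {name: pool.probability(text) for name, pool in pools.items()}
```

Every pool works the same way and differs only in its name and training data, so `score_all` can report one percentage per pool for a comment. An untrained pool returns 0.5, meaning "no opinion".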

Misc. other things: sending mail to the notebook owner when an annotation (comment or trackback) comes in, an RSS feed for annotations, the icon (I'm thinking an empty speech bubble when there are no comments and one with a ! inside it when there are comments).

As long as I'm engaged in Bayesian madness (when do the Bayesian filtering conferences start cropping up?) I'm also thinking of writing a toy that predictively files notebook entries under what it considers to be the appropriate categories, just to see what kind of accuracy it can get.

Posted by Greg Knauss at Tue Oct 21 2003 12:22

You also might want to consider permalinks for each comment, so they can be targeted by links.

Posted by Kristofer Straub at Tue Oct 21 2003 13:23

This thing is simmering with flavor! I wanted a comments system in mine. But I'm wondering if my host has Python (or if they'd let me set it up). That might be my dream-killer.

Posted by Seth Schoen at Tue Oct 21 2003 14:12


P. S. Avast.

Posted by Sumana at Tue Oct 21 2003 14:17

This comment thread is apparently serving as a reverse mic-check for NYCB readers, so I'll add my tag. Also, I suspect that the pirate who posted just before me was not really Seth.

Posted by Leonard Richardson at Tue Oct 21 2003 15:34

That comment was from zork.net, so I think it probably was Seth.

It appears I must also add some styling or formatting to distinguish entries from each other. This is getting hard to read.

Posted by YourFavLittleSis (TheOtherOne) at Tue Oct 21 2003 16:25

I think the icon should be a flame.

Posted by Peter A. Peterson II at Tue Oct 21 2003 16:38

I think it should be a little speech bubble with a number in it, the number being the number of comments entered. Actually though, that sounds a little expensive (individual graphics), so what about a speech bubble with a number next to it (both linking to the comments section), i.e. "[icon](12)".

I like the flame idea, but it might be more accurate to have the icon be the tail of a boat with a rod and line hanging out the back.

So Leonard, when do you implement Meta-moderation?

Posted by Seth Schoen at Tue Oct 21 2003 17:29

Hash: SHA1

Arrrr, pirates who impersonate me will taste STEEL.
Or else their PGP signatures will fail to validate.
Version: GnuPG v1.2.3 (GNU/Linux)


Posted by Seth Schoen at Tue Oct 21 2003 17:34

Hash: SHA1

In conclusion, I think that Newsbruiser's end of line behavior for
comments must be keel-hauled.
Version: GnuPG v1.2.3 (GNU/Linux)


Posted by The Sister advocate at Tue Oct 21 2003 21:31

Susie's comments aren't working, BTW.

Posted by The Sister advocate at Tue Oct 21 2003 21:47

Never mind, is fixed. Thanks, Sumana (I think)...

Posted by uncle pedro at Tue Oct 21 2003 21:59

Perhaps this is from looking at virgule-based sites for too long, but I think that the attributions should be before the quote. It's a little like replying *above* the quotation. I also find myself skipping the quote, reading the attribution (so I know who is writing), then going back up to the quote. My eyes have logged 6 miles today alone because of this!

Posted by Brendan at Tue Oct 21 2003 23:57

Hey! Hey! You know what these are? These are METACOMMENTS!

Posted by Sean Neakums at Wed Oct 22 2003 06:32

Sweet! Now I can go back and comment on all of my favourite entries!

Posted by pedro at Wed Oct 22 2003 10:54

Do you endorse the comment system?


