[Comments] (2) Wiki Is Not A Toy: Despite the fact that I'm a big jerk who laughs when wikis are spammed, I've been thinking about the problem since yesterday. The reason I got all those copycat wiki spammers stepping on each other's toes (and, incidentally, mine) in the NewsBruiser wiki was not that there was spam indexed in the page history, as with some other wikis, but that there was spam right there on the front page. Because I was lazy. My wiki was living in sin. The price of having a wiki is eternal vigilance, and I wasn't taking it seriously.

So in addition to the anti-spam measures I took yesterday, this morning I updated my subwiki installation and started toughening it. I put in robots meta tags similar to those in NewsBruiser, and email notification of changes so it'll poke me if it gets spammed. Unfortunately, un-spamming a page in subwiki is pretty difficult, which is why I was so lazy in the first place, so now I need to write a 'revert' function. Then I can have a 'click here to revert' link in the update email, like I do with "click here to mark as spam" in the NewsBruiser comment emails.
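
Here's roughly the shape I have in mind for the revert function, as a minimal sketch. It assumes a page is just a list of revision texts held in memory, which is not how subwiki actually stores pages; PageHistory, revert_link, the wiki.cgi URL, and the query parameter names are all made up for illustration. The idea is that a revert never deletes anything: it saves the old text again as a new revision, so the spam stays in the history and the revert itself can be undone.

```python
from urllib.parse import urlencode


class PageHistory:
    """Hypothetical in-memory revision store for one wiki page."""

    def __init__(self, initial_text=""):
        # Revision 0 is the page as first created.
        self.revisions = [initial_text]

    def current(self):
        """Return the text of the newest revision."""
        return self.revisions[-1]

    def save(self, text):
        """Store a new revision and return its revision number."""
        self.revisions.append(text)
        return len(self.revisions) - 1

    def revert(self, to_revision):
        """Re-save an old revision as the newest one.

        Nothing is deleted, so the spammed revision stays in the history.
        """
        if not 0 <= to_revision < len(self.revisions):
            raise IndexError("no such revision: %r" % (to_revision,))
        return self.save(self.revisions[to_revision])


def revert_link(base_url, page_name, revision):
    """Build the 'click here to revert' URL for a notification email.

    The parameter names (page, action, rev) are assumptions.
    """
    query = urlencode({"page": page_name, "action": "revert", "rev": revision})
    return "%s?%s" % (base_url, query)


if __name__ == "__main__":
    history = PageHistory("Welcome to the wiki.")
    history.save("BUY CHEAP PILLS")  # the spam edit becomes revision 1
    history.revert(0)                # revision 2 is a copy of revision 0
    print(history.current())
    print(revert_link("http://wiki.example.com/wiki.cgi", "FrontPage", 0))
```

A real version would want the link to carry some kind of secret token, since otherwise anyone who guesses the URL can revert pages too.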

I wish Bayesian techniques would work well on wikis, but it seems like a just-change-the-links attack would be even easier to do for wiki pages than for weblog comments.

Posted by Zack at Thu Dec 16 2004 15:24

Making Light is having a discussion about blog spam just now:

Posted by Kevan at Mon Dec 20 2004 14:20

For what it's worth, a couple of months ago I added two ruthlessly unBayesian filters to the Encyclopaedia Morningtonia: "reject anything that puts more than twenty links onto a page" and "reject anything that uses URLs that contain either of these two obvious spam words". Since then only one (very, very modest) spammer has got through, with three hundred-odd being rejected.

Quite a polite and obvious "hey, this looks like" error message, as well, so presumably it's mostly bots, or vat-grown humans who are paid per page, and haven't got time to muck around fighting their way past clever filters.
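
For anyone who wants to try the same thing, the two filters could be sketched something like this. It's only a guess at the logic, not the Encyclopaedia Morningtonia's actual code: check_edit and the URL regex are invented for the example, the two banned words are placeholders (the real ones aren't given here), and it counts every link in the submitted page text rather than only the newly added ones.

```python
import re

MAX_LINKS = 20  # anything with more than twenty links is rejected

# Placeholder words: the real pair of spam words is not stated above.
BANNED_URL_WORDS = ("spamword1", "spamword2")

# Deliberately crude: anything that starts like an http(s) URL.
URL_PATTERN = re.compile(r'https?://[^\s<>"\]]+', re.IGNORECASE)


def find_urls(text):
    """Return every http(s) URL found in the submitted page text."""
    return URL_PATTERN.findall(text)


def check_edit(new_text):
    """Return (accepted, reason) for a submitted page.

    reason is an empty string when the edit is accepted.
    """
    urls = find_urls(new_text)
    if len(urls) > MAX_LINKS:
        return False, "Hey, this looks like spam: too many links."
    for url in urls:
        lowered = url.lower()
        for word in BANNED_URL_WORDS:
            if word in lowered:
                return False, "Hey, this looks like spam: suspicious URL."
    return True, ""


if __name__ == "__main__":
    print(check_edit("See http://example.com/ for details."))
    print(check_edit("http://spamword1.example.com/"))
```

Both checks look only at URLs, not at the surrounding words, so a spammer who swaps the link text gains nothing; one who posts a handful of links to clean-looking domains would still get through.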

