
The Bayes Motel: I kept wondering about the suitability of Bayesian textual analysis to such-and-such a problem domain. Would it magically solve the problem as it does with spam, or would it be like trying to evolve an artificial intelligence by rating randomly generated sentences? Eventually I decided that the time had come for action! So I hired this guy to write a really simple generalized Bayesian rating application which you could use for just long enough to see if the problem was tractable. Yes, I don't just sit around--I make others do my bidding! But then he subcontracted it to a company in the Philippines, and they went out of business, and their contracts were picked up by a floating libertarian utopia on a raft built out of abandoned oil drums, so to make a long story short, I ended up doing it myself for about half the price.

It's called The Bayes Motel and it does a pretty good job. By which I mean I wrote applications using it for a domain where Bayesian rating obviously works, and one where it almost works, and... it works and it almost works. TBM also does something I've been wanting for a while without realizing it: different tokens are displayed with different colors, to help you visualize how a piece of text got a certain Bayesian score. Pretty slick! And inefficient. Oh well, it's not supposed to be a big heavy-duty application anyway, just a prototyping tool. I hope you find it useful.
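To give a feel for what the colored tokens buy you, here is a minimal sketch of the idea in Python. This is not The Bayes Motel's actual code: the `TinyBayes` class, the "good"/"bad" labels, the add-one smoothing, and the red-to-green color ramp are all assumptions made up for illustration. Each token gets its own probability of belonging to the "bad" pile, the text's score combines those probabilities (assuming equal priors and independent tokens), and the per-token probability picks the color.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayes:
    """Hypothetical two-category rater; labels are 'good' and 'bad'."""

    def __init__(self):
        self.counts = {"good": Counter(), "bad": Counter()}
        self.docs = {"good": 0, "bad": 0}

    def train(self, text, label):
        # Count each token once per document (presence, not frequency).
        self.counts[label].update(set(tokenize(text)))
        self.docs[label] += 1

    def token_prob(self, token):
        """Estimate P(bad | token) with add-one smoothing.

        Smoothing keeps the result strictly between 0 and 1. An unseen
        token comes out at 0.5 when both piles hold the same number of
        documents.
        """
        good = (self.counts["good"][token] + 1) / (self.docs["good"] + 2)
        bad = (self.counts["bad"][token] + 1) / (self.docs["bad"] + 2)
        return bad / (good + bad)

    def score(self, text):
        """Combine token probabilities into one score for the text."""
        log_odds = sum(
            math.log(p / (1 - p))
            for p in (self.token_prob(t) for t in tokenize(text))
        )
        # Clamp so very long texts cannot overflow math.exp.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))


def color_for(p):
    """Map a probability to a hex color: green at 0, red at 1."""
    red = round(255 * p)
    green = round(255 * (1 - p))
    return f"#{red:02x}{green:02x}00"


def explain(model, text):
    """Return (token, probability, color) for each token in the text."""
    return [
        (tok, model.token_prob(tok), color_for(model.token_prob(tok)))
        for tok in tokenize(text)
    ]
```

Train it with a few calls to `train(text, "good")` and `train(text, "bad")`, then feed the output of `explain` to whatever renders your text. The red tokens are the ones dragging the score up, which is usually all you need to see whether the problem is tractable.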

If I were a real hacker I would forget this Bayesian nonsense and do a Python implementation of "Fast and accurate text classification via multiple linear discriminant projections". Even in accomplishment I feel guilty!


Comments:

Posted by Josh Myer at Mon Jun 13 2005 11:43

BayesMotel: probabilities check in, and they don't check out.

(What a wonderfully useful idea, by the way)

Posted by Leonard at Mon Jun 13 2005 22:21

Glad you like it. Some variant on "x checks in but it doesn't check out" was my original slogan for the Motel, but I forgot all about it while I was doing development.



Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.