
The Bayes Motel: I kept wondering about the suitability of Bayesian textual analysis to such-and-such a problem domain. Would it magically solve the problem as it does with spam, or would it be like trying to evolve an artificial intelligence by rating randomly generated sentences? Eventually I decided that the time had come for action! So I hired this guy to write a really simple generalized Bayesian rating application which you could use for just long enough to see if the problem was tractable. Yes, I don't just sit around--I make others do my bidding! But then he subcontracted it to a company in the Philippines, and they went out of business, and their contracts were picked up by a floating libertarian utopia on a raft built out of abandoned oil drums, so to make a long story short, I ended up doing it myself for about half the price.

It's called The Bayes Motel and it does a pretty good job. By which I mean I wrote applications using it for a domain where Bayesian rating obviously works, and one where it almost works, and... it works and it almost works. TBM also does something I've been wanting for a while without realizing it: different tokens are displayed with different colors, to help you visualize how a piece of text got a certain Bayesian score. Pretty slick! And inefficient. Oh well, it's not supposed to be a big heavy-duty application anyway, just a prototyping tool. I hope you find it useful.
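To make the colored-token idea concrete, here is a minimal sketch of how per-token Bayesian scores could drive a display color. It is not TBM's actual code: the function names, the "good"/"bad" labels, the add-one smoothing, the equal priors, and the 0.4/0.6 color thresholds are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(examples):
    """Count, per label, how many documents contain each token.

    examples: iterable of (text, label) pairs, label being "good" or "bad".
    """
    counts = {"good": Counter(), "bad": Counter()}
    docs = Counter()
    for text, label in examples:
        docs[label] += 1
        counts[label].update(set(tokenize(text)))
    return counts, docs


def token_probability(token, counts, docs):
    """Smoothed estimate of P(good | token), assuming equal priors."""
    good = (counts["good"][token] + 1) / (docs["good"] + 2)
    bad = (counts["bad"][token] + 1) / (docs["bad"] + 2)
    return good / (good + bad)


def score(text, counts, docs):
    """Return (overall P(good), [(token, P(good | token)), ...])."""
    per_token = [(t, token_probability(t, counts, docs)) for t in tokenize(text)]
    # Naive Bayes: sum the per-token log-odds, then squash back to a probability.
    log_odds = sum(math.log(p / (1 - p)) for _, p in per_token)
    log_odds = max(-50.0, min(50.0, log_odds))  # keep math.exp from overflowing
    return 1 / (1 + math.exp(-log_odds)), per_token


def color(p):
    """Map a token probability to a display color (thresholds are arbitrary)."""
    if p > 0.6:
        return "green"
    if p < 0.4:
        return "red"
    return "gray"


if __name__ == "__main__":
    training = [
        ("cheap pills buy now", "bad"),
        ("buy cheap watches", "bad"),
        ("lunch meeting tomorrow", "good"),
        ("meeting notes attached", "good"),
    ]
    counts, docs = train(training)
    total, per_token = score("cheap meeting tomorrow", counts, docs)
    for token, p in per_token:
        print(f"{token:10s} {p:.2f} {color(p)}")
    print(f"overall P(good) = {total:.2f}")
```

The point of coloring each token is that the overall score stops being a black box: you can see at a glance which words pushed it up or down. It also hints at why this approach is inefficient, since every token's probability has to be computed again at display time.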

If I were a real hacker I would forget this Bayesian nonsense and do a Python implementation of "Fast and accurate text classification via multiple linear discriminant projections". Even in accomplishment I feel guilty!


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.