
September Film Roundup: A whopping two films this month, at best. October's also not looking great. For movies, I mean. Everything else looks pretty good.

On Scarne On Dice: At a book sale where the deal was "$5 for all the books you can fit in a bag" I picked up a book that barely fit in the bag, Scarne on Dice, originally published in 1943 and updated in 1974. The author, John Scarne, combines a ton of genuine gambling expertise with the demeanor of a megalomaniac crackpot. The jacket copy, written by some unknown soul *cough*, describes him as "the man who made the phrase 'According to Hoyle' obsolete and replaced it with 'According To Scarne'". He's invented his own kind of dice, Scarney Dice®, which are normal six-sided dice except that the two face and the five face have the word "DEAD" on them.

With Scarney Dice you can play a number of games such as Scarney 3000® ("the favorite dice game of the members of the John Scarne Game Club of my hometown of Fairview, New Jersey"), Scarney Put-and-Take Dice, Scarney Duplicate Jackpots, Scarney 21 Up and Down, Scarney Bingo Dice, and Scarney Black Jack. Many of these games feature dice combinations called "Big Scarney" and "Little Scarney", or require a player to call "Scarney" when exploiting a winning position.

There are also three chapters of the book devoted to card games Scarne has invented, games like Scarney® ("the first really new card game concept of this century"), Scarney Gin, and Scarney Baccarat. These games—stay with me here—are card games, they include no dice, and they have no place in a book called Scarne on Dice, especially since John Scarne also wrote a whole other book called Scarne on Cards. But since we're going down this route, how about the family portrait in the front of the book where John Scarne poses with his wife, his son, his books, and the board games he invented, most notably a checkers-like thing called Teeko. Did I mention that he named his son after his board game? Oh, and after himself, of course. John Teeko Scarne.

But unlike every other person like this I've ever encountered, John Scarne actually knows his stuff. He convincingly debunks parapsychology dice-rolling experiments by contrasting the way the experiment was run with the way casinos handle dice. He explains ludicrous systems for beating the casinos and then explains why they're mathematically impossible. His chapters on how to spot loaded dice, rigged games, steer joints, and general cheating are clearly a light rewrite of the lectures he went around giving on Army bases to stop GIs losing their paychecks to craps hustlers. He has a convincing description of what it would take to run an underground gambling operation, down to a detailed payroll.

What is going on here? My initial guess was that gambling is a field where being a Jeffrey Lebowski-esque blowhard is tolerated and even encouraged. That's still my primary guess, actually. But after reading the most interesting 200 pages of this massive tome and skimming the rest I wonder if something else is going on. This book is mostly about craps, a folk game with a relatively clear origin in Hazard but no real chain of custody between its origin and the modern day. Maybe Scarne just wants to make damn sure that his contributions to ludology are properly credited. Unfortunately, his habit of naming everything after himself just made it that much easier to ignore his innovations and play the same games people have been playing for hundreds of years.

But there was one game that John Scarne invented whose genius I appreciate, even though I'll never play it. It's a drinking game called Scarney Pie-Eyed Dice and it survives in a modified version called Twenty-One Aces. Scarne describes a couple variants but here's the simplest one: in Scarney Pie-Eyed Dice the players take turns rolling two dice until someone rolls nothing but twos and fives (these are the "DEAD" faces of official Scarney dice). The first person to accomplish this orders a drink. Scarne recommends "a double rye with celery tonic, vodka with chili sauce", or something equally weird. The second person to roll twos and fives drinks the drink, and the third person to roll twos and fives pays for the drink.

That's just great. It creates two types of tension at once (who's going to drink the drink, and who's going to pay for it), and it uses creativity from an unrelated field as a game mechanic. Good job.
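For anyone who wants to see how long a round drags on, here's a minimal Python sketch of the simple variant described above. It's a sketch under assumptions, not Scarne's rules as printed: the post doesn't say whether one player can land more than one role, so this version allows it, and the player names and helper functions are made up for illustration.

```python
import random
from fractions import Fraction

DEAD_FACES = {2, 5}  # the faces marked "DEAD" on Scarney Dice

# Chance that one roll of two dice shows nothing but twos and fives:
# each die has 2 qualifying faces out of 6, and the dice are independent.
P_DEAD_ROLL = Fraction(len(DEAD_FACES), 6) ** 2


def is_dead_roll(dice):
    """True when every die in the roll shows a two or a five."""
    return all(d in DEAD_FACES for d in dice)


def play_round(players, rng):
    """Play one round and return (orderer, drinker, payer).

    Assumption: turns keep rotating around the table and the same player
    may end up with more than one role. The post doesn't say either way.
    """
    roles = []
    turn = 0
    while len(roles) < 3:
        player = players[turn % len(players)]
        roll = (rng.randint(1, 6), rng.randint(1, 6))
        if is_dead_roll(roll):
            roles.append(player)
        turn += 1
    return tuple(roles)


if __name__ == "__main__":
    rng = random.Random(1943)  # seeded only so reruns match
    orderer, drinker, payer = play_round(["Ann", "Bob", "Cy"], rng)
    print(f"{orderer} orders, {drinker} drinks, {payer} pays")
    print(f"chance of a qualifying roll: {P_DEAD_ROLL}")
```

Each roll qualifies one time in nine, so a round needs about 27 rolls on average to fill all three roles, which gives the table plenty of time to dread the drink order.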

The Bot of Mormon: I don't usually do in-depth analyses of my bots, especially one that's probably not gonna break ten followers, but my most recent bot is very personal to me, and the making of it turned out to be much stranger than I expected. It's The Bot of Mormon, "the most correct bot", a text-generating process with a very niche audience but the niche audience includes me, so I'm happy. A few of my recent favorites:

And again I say unto you, and more especially the elephants and cureloms and cumoms.

— The Bot Of Mormon (@TheBotOfMormon) October 16, 2014

A large and tough businessman, I pray only that I might always be found as Abraham Lincoln said: "Die when I may, by a wild olive tree."

— The Bot Of Mormon (@TheBotOfMormon) October 16, 2014

"As we read in the Book of Mormon, but I will have him come to the phone."

— The Bot Of Mormon (@TheBotOfMormon) October 14, 2014

A note: In a bid for more followers, as well as not alienating all my relatives, I designed the Bot of Mormon to be a bit of harmless humor for believing LDS folk (early versions could be pretty offensive, and I chose not to go that route). However, Saints might take offense at this blog post about how and why I made the bot. So, fair warning. Here we go.


It's not much of an exaggeration to trace my interest in generative text back to my experience growing up in Mormonism. Mark Twain famously called the Book of Mormon "chloroform in print", and I believe the reason it's so boring is that it was produced by a process similar to automatic writing. It's full of stalling and retreats to stock phrases. But what starts with the Book of Mormon sure doesn't end there. When I was a kid, church every week was a three-hour festival of stock phrases and repetition.

See, in the LDS church the task of coming up with things to say every week rotates around the general membership. Topics are assigned, and there are only about fifty topics total. Since every acceptable topic has been covered a million times before, the simplest way to make a new talk is to remember bits of old talks and mash them together.

When I was a kid I experienced this from both ends, and writing the talks was especially intense for me because despite my best efforts, I didn't actually believe. My talks were literally constructed by assembling meaningless symbols into patterns that matched what I saw other people doing. Naturally, ever since I caught the botmaking bug I've wanted to recreate this experience with a bot. I registered @TheBotOfMormon quite a while ago. But I couldn't figure out what to do until recently, when I hit upon the idea of taking as my corpus not the Book of Mormon itself, but the General Conference talks.

General Conference is a big twice-yearly event in Salt Lake where the top brass show y'all how it's done. These guys used to be lawyers and corporate executives, and their talks are all vetted by committee, so the result is... well, sometimes someone will say something offensive, but even that I wouldn't call "interesting". What is interesting is that Conference is where Mormonism meets the twenty-first century. By which I mean that's where you can see the pros use nineteenth-century language and rhetoric to talk about same-sex marriage (undesirable!) and the Internet (a mixed bag!). That's the kind of juxtaposition I thought would make a good bot. As it turns out, I was right... sort of. Eventually.

To give you a picture of what goes on in General Conference, here's a table I made of the top ten topics by decade, according to the keywords in the <meta> tags for each talk.

     1970s              1980s            1990s         2000s         2010s
 1.  obedience          Jesus Christ     Jesus Christ  faith         Jesus Christ
 2.  missionary work    missionary work  faith         Jesus Christ  service
 3.  spirituality       service          family        service       faith
 4.  testimony          obedience        priesthood    testimony     priesthood
 5.  Jesus Christ       priesthood       love          obedience     obedience
 6.  welfare            faith            service       family        adversity
 7.  priesthood         love             Holy Ghost    Holy Ghost    family
 8.  family             family           obedience     prayer        love
 9.  plan of salvation  spirituality     prayer        love          Holy Ghost
10.  youth              adversity        Atonement     priesthood    Atonement

You can see the shape of the fifty acceptable topics there. Anyway, I downloaded the Conference talks and set about applying my usual bag of tricks to the corpus to come up with an interesting transformation. Imagine my surprise when none of my techniques worked!

The _ebooks algorithm, up to this point an unending generator of hilarity from any corpus, failed miserably. The word-frequency filter I used to find the interesting signs for Minecraft Signs also failed. Markov chains were useless, big surprise. I had a dim idea that the key to bot gold here was the subordinate clauses: the sentences that run on and on in a lawyerly way, embroidering themselves with their own Talmudic interpretations. I tried Queneau assembly of sentences at the clause level. This was good enough to get the bot launched, but it wasn't great. Each individual clause is very likely to be boring, its boringness has no relationship to word frequency, and combining clauses doesn't help. The corpus is fractally boring.

"Here you will find happiness, we know that the rejoicing, or anything else, they are in a state contrary to the nature of happiness."

— The Bot Of Mormon (@TheBotOfMormon) October 2, 2014
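For readers who haven't met Queneau assembly: the idea is to cut each source sentence into positional pieces, then build a new sentence whose opening comes from some sentence's opening, whose middle comes from middles, and whose ending comes from an ending. Here's a rough Python sketch of the clause-level version. It is not the bot's actual code. The punctuation-based clause splitter, the function names, and the three placeholder sentences are all assumptions made up for illustration.

```python
import random
import re

# Placeholder sentences invented for this sketch, not taken from any talk.
SAMPLE = [
    "Behold, the river ran dry, and the people went home.",
    "In the morning, if the weather holds, we shall mend the fence.",
    "Verily, the cat sat down.",
]


def split_clauses(sentence):
    """Split a sentence at commas, semicolons, and colons.

    Each delimiter stays attached to the clause it ends.
    """
    parts = re.findall(r"[^,;:]+[,;:]?", sentence)
    return [p.strip() for p in parts if p.strip()]


def queneau_assemble(sentences, rng):
    """Build one new sentence out of clauses from different sources.

    Opening clauses only fill the opening slot, closing clauses only the
    closing slot, and interior clauses only interior slots, so every
    clause keeps the position it held in its source sentence.
    """
    split = [split_clauses(s) for s in sentences]
    split = [c for c in split if len(c) >= 2]  # need an opener and a closer
    if not split:
        raise ValueError("no sentence has at least two clauses")
    openers = [c[0] for c in split]
    closers = [c[-1] for c in split]
    middles = [m for c in split for m in c[1:-1]]
    # Borrow the clause count from a real sentence so lengths stay plausible.
    n_middle = len(rng.choice(split)) - 2 if middles else 0
    picked = [rng.choice(openers)]
    picked += [rng.choice(middles) for _ in range(n_middle)]
    picked.append(rng.choice(closers))
    return " ".join(picked)


if __name__ == "__main__":
    print(queneau_assemble(SAMPLE, random.Random()))
```

The output is only as lively as the clauses going in, which is the problem described above: if every clause is dull, shuffling them by position just produces a differently arranged dull sentence.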

Okay, I thought, time to break out the big guns. I incorporated the Book of Mormon into my corpus, then the Doctrine & Covenants, and even the Pearl of Great Price, the bizarro crown jewel of the LDS canon. None of it helped. (The Pearl of Great Price helped a little, because it's really weird, but it's also very short.)

Behold, and began to put heavy burdens upon their backs, and prayers of faith.

— The Bot Of Mormon (@TheBotOfMormon) October 6, 2014

But legend told of a secret weapon: the Journal of Discourses. Basically a large collection of General Conference talks from the late 19th century, during the polygamy era, containing a ton of fiery rhetoric and juicy doctrines downplayed or outright disowned by the modern church. Some might consider it dirty pool, but I was desperate to get some interesting content out of my bot. I Queneau-ified every Discourse in the Journal and added it to the corpus... to no avail. It was still dull! On the sentence fragment level, it's tough to even distinguish between the 'scandalous' stuff in the Journal and the dishwater they serve up at Conference nowadays.

And now behold, as it were, most of them in environments very different from their own.

— The Bot Of Mormon (@TheBotOfMormon) October 9, 2014

At this point I was so frustrated that I honestly started to question my unbelief. What are the odds that a corpus of text spanning hundreds of authors over nearly 200 years could be so uniformly dull? Was some divine hand at work, keeping things from getting too interesting? With shaking hands I ran my tests against a control sample: the Gutenberg text of a non-Mormon book of sermons. And it turns out nineteenth-century religious language is what's fractally boring. It's nothing to do with Mormonism in particular. The modern stuff is dull because it copies and recombines the nineteenth-century stuff.

And that, finally, was the key to what little success I've achieved with @TheBotOfMormon. When the bot is funny, the funny thing is not the rambling juxtaposition of sentence fragments per se. It's the juxtaposition of modern concepts with nineteenth-century language. To get the bot to work I would have to actually recreate that juxtaposition, not just hope for it.

Enter the Corpus of Historical American English. (Thanks, BYU! Seriously, what a great project.) This has word frequencies for every decade from the 1810s up to 2009. I picked out all the words that were 10x more common between 1930 and 1980 than they were between 1830 and 1880. I tagged all the sentence fragments that were distinctly twentieth-century. Now I can guarantee that every assemblage has an old-timey component and a more modern component, and the chances of humor go way up.
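The filtering step above can be sketched in a few lines of Python. Treat this as an illustration under loud assumptions rather than the bot's real code. The frequency numbers are invented stand-ins for the actual corpus data. The smoothing floor for words that never appear in the older era is a guess. And treating every untagged fragment as "old-timey" is a simplification this sketch makes.

```python
import random
import re

# Invented per-million frequencies standing in for the real data.
FREQ_1830_1880 = {"behold": 40.0, "telephone": 0.1, "airplane": 0.0, "faith": 90.0}
FREQ_1930_1980 = {"behold": 4.0, "telephone": 60.0, "airplane": 25.0, "faith": 70.0}

RATIO = 10.0      # "10x more common", as in the post
SMOOTHING = 0.01  # assumed floor so words unseen in the old era still compare

# Invented fragments, for illustration only.
FRAGMENTS = [
    "And now behold,",
    "as it were,",
    "we boarded the airplane.",
    "I will have him come to the telephone.",
]


def modern_words(old_freq, new_freq, ratio=RATIO):
    """Words at least `ratio` times more frequent in the newer era."""
    return {
        word
        for word, new in new_freq.items()
        if new >= ratio * max(old_freq.get(word, 0.0), SMOOTHING)
    }


def is_modern(fragment, vocabulary):
    """Tag a fragment as modern if it contains any modern word."""
    words = re.findall(r"[a-z']+", fragment.lower())
    return any(w in vocabulary for w in words)


def assemble(fragments, vocabulary, rng):
    """Pair one untagged (assumed old-timey) fragment with one modern one."""
    modern = [f for f in fragments if is_modern(f, vocabulary)]
    old = [f for f in fragments if not is_modern(f, vocabulary)]
    if not modern or not old:
        raise ValueError("need at least one fragment of each kind")
    return f"{rng.choice(old)} {rng.choice(modern)}"


if __name__ == "__main__":
    vocab = modern_words(FREQ_1830_1880, FREQ_1930_1980)
    print(assemble(FRAGMENTS, vocab, random.Random()))
```

The point of the design is the guarantee: instead of hoping a random draw happens to mix eras, the assembler always draws one piece from each pile.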

The lesson I want to take from this is that every corpus is different. I thought I could handle the LDS corpus with the same tools I use on Gutenberg, because they're both full of archaic language, but I was totally wrong. Once I engaged with the text this became obvious, but I came into this holding the text at arm's length because it held a lot of bad childhood memories.

There's no generic bot kit that will work on anything. (Well, there is, but it uses Markov chains and I don't like it.) Even my really simple bots like I Like Big Bot and Boat Names required a lot of custom behind-the-scenes work to find the most interesting subset of the data.

Perhaps this can serve as my new rule. A new bot needs to present a different way of being a bot, not just a different corpus. And adding more text to a corpus I don't know how to handle just makes the problem worse.



Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.