# News You Can Bruise for 2011August 18

Queneau Assembly: That's my name for the formerly-unnamed technique I used in "Board Game Dadaist". It all started in April, the night I was guest critic for Adam's ITP class. Afterwards I went out to dinner with Adam and Rob, and Adam was talking up Markov chains. Dude loves him some Markov chains. I said "Man, I'm tired of Markov chains. Markov chains are so 70s, they have little coke spoons dangling from them. I'm gonna come up with a better algorithm for creating generative text."

Big talk, but fortunately I didn't have to come up with a better algorithm, because I already had. Back in 2008 I released a project called "Spurious", which generates new Shakespearean sonnets by picking lines from the existing sonnets. It generates two sonnets at once using two different algorithms. Algorithm B (the one lower down on that page) is totally random: you could get a new sonnet made entirely of the first lines of other sonnets. But Algorithm A (the first one on that page) creates what I'm calling a Queneau assembly. The first line of a new sonnet is the first line of some existing sonnet. The second line is the second line of some other sonnet. The third line comes from the set of third lines, and so on to the end.

Oulipo founder Raymond Queneau did something very similar in his 1961 book "Hundred Thousand Billion Poems". This may be where I got the algorithm I used in "Spurious", though I don't think it was a conscious homage. In "Hundred Thousand Billion Poems" there are ten sonnets bound such that you can "turn the page" for a single line of the sonnet, changing that line while leaving the rest of the poem intact. Each generated poem feels like a sonnet because it starts with a "first line" and ends with a "last line" and every line in between is placed where it was in some manually generated sonnet.

I've named the technique in honor of Queneau because I can't find anyone who used it earlier. It's not universally better than a Markov chain, because it only works in certain cases:

• It doesn't work on free text, only on texts with a small-scale internal structure (a poem, a paragraph, a short biography).
• You must have many source texts with that structure.
• You might not like how much of the original text it preserves.

That said, the Queneau assembly gives very entertaining results, and it's now my go-to dada technique, promoted over Markov chains and even unadulterated randomness.

The simple algorithm

I've come up with a number of algorithms for making Queneau assemblies. I'll talk about the simplest first, just so you'll see how this works. This is a refined version of the algorithm I used for "Mashteroids" (yes, those asteroid descriptions were me reinventing Queneau assemblies). It's not the algorithm I used for "Board Game Dadaist"; I'll talk about that later.

You've got a body of N texts, T0, T1, ..., TN-1. Each text can be split into some number of chunks, eg. T00, T01, ..., T0M-1.

Split each text into chunks and assign each chunk to one of three buckets. The first chunk from each text goes into the "first" bucket. The last chunk from each text goes into the "last" bucket". All the other chunks go into the "middle" bucket.

Also keep track of how text lengths are distributed: how likely it is that a text consists of one chunk, how likely that it consists of two chunks, and so on.

Now you're ready to assemble. Pick a length for your new text that reflects the length distribution of the existing texts. Then pick a chunk from the "first" bucket. If your target length is greater than 1, pick a chunk from the "last" bucket". If your target length is greater than 2, pick chunks from the "middle" bucket intil you've got enough. Concatenate the chunks first-middle-last, and you've got a Queneau assembly!

Now let's look at the scales on which you might create a Queneau assembly. Outside of poetry, the paragraph is the Queneau assembly's natural habitat. A pragraph has a flow to it, especially when you've got something like a description of a board game or an asteroid that's only one paragraph long.

You need to handle things like quotes and parentheses that open in one sentence and don't close by the end of the sentence, or that close without having opened. I wrote code for this in BGD but it doesn't catch all the cases.

In "Board Game Dadaist", the names of games are also Queneau assemblies. Here the chunks are words. I take the first word from the name of game A, the second word from the name of game B, and so on. So "Pirates! Denver" might come from "Pirates! Miniature Battles on the High Seas" and "Monopoly: Denver Broncos".

Quotes and parentheses are still problems, though it's not as bad. The big problem I ran into was repeated words, and words like "the" which are not allowed to end a game name. (The simple algorithm, with its "last" bucket, prevents "the" from showing up last unless it showed up as the last word of an existing game. In the algorithm I used for "Board Game Dadaist", I had to special-case this.)

In general, Queneau assemblies will not create coherent English sentences. Much as it pains me to admit, a Markov chain is better for that. It works for board game titles because we allow titles a lot of creative license, even up to the point of suspending the rules of grammar. "Pirates! Denver" makes no sense as a sentence, but it's a perfectly good game title.

Many games have single-word titles, eg. "Carcassonne". I wanted to have single-word titles in BGD, but I didn't want to duplicate real names. So I applied the Queneau assembly algorithm on the word level.

Here, the chunk is a run of letters that's all vowels or all consonants. So "Carcassonne" would be split into the chunks ["C", "a", "rc", "a", "ss", "o", "nn", e"]. I keep two sets of buckets, one for vowels and one for consonants. If the first chunk was a vowel chunk, the second chunk is a consonant chunk, and I alternate til I reach the end.

This means that single-word BGD titles are almost never English words, but they do capture the feel of those one-word titles that aren't words (examples: "Zajekan", "Fraseda", "Kongin", "Q-blardo").

The BGD algorithm

Now that you see how it works, I'll explain the algorithm I actually use for "Board Game Dadaist". Instead of three buckets, I have a lot of numbered buckets. When I split a text into T00, T01, ..., T0M-1, I put T00 into bucket 0, T01 into bucket 1, and so on, with T0M-1 going into bucket M-1. I create an assembly by picking from bucket 0, then bucket 1, and so on until I've reached the target length.

This is the algorithm that "Hundred Thousand Billion Poems" uses, and when the texts have more structure than "beginning/middle/end", this algorithm works a lot better. I don't think it matters much for BGD descriptions, but I do think it matters for game names. I would like to combine this algorithm with the "last" bucket from the simple algorithm, because right now board game descriptions sometimes end abruptly with a sentence like "Contents:".

[Main]

 Unless otherwise noted, all content licensed by Leonard Richardsonunder a Creative Commons License.