Run RandomSentenceBus.py to queue up a batch of random sentences built
from Basic English words and a list of fictional curse words. The
script creates two files, Curses.corpus and Curses.guests. Both files
must then be made writable by the user the web server runs as.
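
As a rough illustration of the idea (not the actual contents of
RandomSentenceBus.py), a random sentence can be assembled from two word
lists as sketched below. The names BASIC_WORDS, CURSE_WORDS, and
random_sentence, the tiny word lists, and the sentence length are all
assumptions made for this example.

```python
import random

# Hypothetical stand-ins for the two word lists; the real script draws on
# Basic English and a list of fictional curse words.
BASIC_WORDS = ["the", "dog", "goes", "to", "a", "house", "with", "water"]
CURSE_WORDS = ["frak", "smeg", "gorram"]

def random_sentence(rng, length=6, curse_chance=0.5):
    """Return (sentence, is_dirty) built from randomly chosen words."""
    words = [rng.choice(BASIC_WORDS) for _ in range(length)]
    is_dirty = rng.random() < curse_chance
    if is_dirty:
        # Overwrite one position with a curse word to make the sentence dirty.
        words[rng.randrange(length)] = rng.choice(CURSE_WORDS)
    return " ".join(words), is_dirty

rng = random.Random(0)  # fixed seed so repeated runs give the same sentences
for _ in range(3):
    print(random_sentence(rng))
```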

Run curses.cgi through a web server, and you'll be given the
opportunity to determine which sentences are "clean" and which ones
contain the fictional curse words.

As you rate sentences you'll start seeing words show up in different
colors: a reddish tint means the word is associated with "dirty"
sentences, and a greenish tint means the word is associated with
"clean" sentences. Any other color means the word shows up in both
types of sentences.
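
One plausible way to derive such a tint is to mix red and green in
proportion to how often a word has appeared in each kind of sentence.
This is only a sketch: the function name word_color, the brightness cap
of 200, and the black fallback for unseen words are assumptions, and
curses.cgi may compute its colors differently.

```python
def word_color(clean_count, dirty_count):
    """Map a word's clean/dirty sentence counts to a CSS hex color."""
    total = clean_count + dirty_count
    if total == 0:
        return "#000000"  # assumption: unseen words are shown in plain black
    dirty_share = dirty_count / total
    # More dirty sightings push the color toward red, more clean toward green.
    red = round(200 * dirty_share)
    green = round(200 * (1 - dirty_share))
    return "#{:02x}{:02x}00".format(red, green)

print(word_color(5, 0))  # word seen only in clean sentences
print(word_color(0, 5))  # word seen only in dirty sentences
print(word_color(3, 3))  # word seen equally in both
```

A word seen equally often in both kinds of sentence comes out as an
even mix of red and green, which matches the "some other color" case
described above.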

Once you have rated enough sentences, you should start to see the
fictional curse words turning red and the other words turning green.
New sentences will then be categorized automatically as "dirty" or
"clean" depending on whether or not they contain a curse word.
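
The automatic categorization can be pictured with the sketch below,
which counts how often each word appears in rated sentences and calls a
sentence "dirty" when any of its words is seen almost exclusively in
dirty sentences. The names rate, dirty_share, and guess, and the 0.9
threshold, are assumptions for illustration; the real script may use a
different scoring method.

```python
from collections import Counter

clean_counts = Counter()  # word -> number of clean sentences containing it
dirty_counts = Counter()  # word -> number of dirty sentences containing it

def rate(sentence, is_dirty):
    """Record a human rating by counting each distinct word once."""
    counts = dirty_counts if is_dirty else clean_counts
    counts.update(set(sentence.lower().split()))

def dirty_share(word):
    """Fraction of this word's sightings that were in dirty sentences."""
    dirty, clean = dirty_counts[word], clean_counts[word]
    return dirty / (dirty + clean) if dirty + clean else None

def guess(sentence, threshold=0.9):
    """Return "dirty", "clean", or None when every word is unknown."""
    shares = [dirty_share(w) for w in set(sentence.lower().split())]
    known = [s for s in shares if s is not None]
    if not known:
        return None
    return "dirty" if max(known) >= threshold else "clean"

rate("the dog frak house", True)
rate("the dog goes to a house", False)
rate("a smeg dog", True)
rate("the house with water", False)

print(guess("the frak water"))
print(guess("the dog house"))
```

Ordinary words appear in both kinds of sentence, so their dirty share
stays below the threshold, while a curse word only ever appears in
dirty ones.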

Sources for the word lists:

http://www.diac.com/~entente/basicpg.html#wurdz
http://en.wikipedia.org/wiki/List_of_fictional_curse_words