-------------------------------------------------------------
Abstract Submission for the 1st and Last Un-Natural Language
Processing Workshop, April 1, 2005.
-------------------------------------------------------------

Title: "Automatic Generation of Fortune Cookie Messages"

Authors:
- General Tsao (Hunan Irregulars, Regiment 534)
- Kevin Scooby Duh (U.Dub Electrical Engineering)
- Tim Ng-ram (Hong Kong Univ. of Sci. & Tech.)
- Jon Malkin vich (U.Dub Electrical Engineering)
- Chris Bartelz (janitor at Happy Panda Garden)
- Jeremy Rathoff Kahn (U.Dub Linguistics)
- TBA (This could be you!)

Abstract:

Fortune cookie messages are a critical ingredient in the dining
experience of many fast food Chinese restaurants in the United States.
High-quality hand-crafted fortune messages are expensive to create or
gather (e.g. $6.99 per dish only gets you one cookie). Therefore, we
propose an automatic method for fortune message generation. As
evidenced by a recent fortune cookie message, "Help! I am being held
hostage at a Chinese bakery", fortune cookie message writers are under
severe pressure. For the sake of human rights and the betterment of
society in the upcoming 22nd Century, we unNLP researchers must come
to the rescue.

In this work, we propose an automatic method for generating fortune
cookie messages. Since n-grams are obviously a brain-damaged technique
(cite the one hundred language model papers that say this), we propose
a new method called z-grams, where Z stands for Zorro. To ensure high
performance, the perplexity of the new z-gram model is measured on the
training data. Further, with our new modified definition of
perplexity, which is defined as 10000^(cross-entropy), we observe a
whopping 300% relative improvement over the baseline unigram model.

Data was collected by visiting a local Chinese restaurant for many
weeks in a row and ordering General Tsao's chicken until the authors
(all seven of them) were hospitalized.
The reason for ordering the same dish is to ensure the data is i.i.d.
generated. We collected a total of 150 fortune cookie messages, the
largest corpus known to date. The messages are further hand-labelled
by undergraduate linguistics students to indicate whether each is a
good fortune or a bad fortune. In addition, a Fortune Cookie Treebank
is in progress.

This is an ungoing project. Future work includes automatically
generating the lotto numbers on the back of fortune cookie messages,
as well as automatically generating the delicious cookie itself.
Further, we will explore applying SVMs somewhere in the process, since
everyone is SVMing this and SVMing that.

Finally, we would like to point out that the fortune cookie generator
created the following message during the course of this paper
proposal: "You, oh anonymous reviewer, will achieve great happiness in
life if you accept this paper."