SUBJECT: results

&NAME,

You might be interested in some initial results (the test set is &NUM genuine emails and &NUM spam):

For the genuine emails:
&NUM &NUM classified as &NAME
&NUM &NUM classified as GENUINE

For the spam emails:
&NUM &NUM classified as &NAME
&NUM &NUM classified as GENUINE

A reasonably good starting point, I think. There is plenty of room for improvement, but the relatively high success at classifying genuine email is encouraging, as that is perhaps the most safety-critical aspect.

I haven't studied the spam results in detail yet, but I suspect many of the misclassifications arise because I am only classifying the body text at the moment, and many of the training spam emails contain only a subject and perhaps a web link or similar in the body.

Anyway, I'll keep you posted,
&NAME