SUBJECT: Re: tokeniser

What's the command line switch for using an exclusion list in the tokeniser? And is there a similar one for the tagger?

The thing I showed you the other day was for the tagger, NOT the tokeniser. The syntax is:

&CHAR fn

where fn is the file containing the exclusions.

The tokeniser is a standard flex (bog-standard &CHAR lexical analyser-generator) program, and as such would probably need to have the exclusions written into the .lex file containing the lexical rules. It isn't difficult to modify the lexer (I've made quite a few modifications for my dodgy email data), and the whole thing compiles into a &CHAR program, so you could modify it to take a list of exclusions; however, I don't think flex generates an exclusion list feature automatically (although I could be wrong - have a look at the online documentation).

I guess that's not great news, but I hope you can do something with it.

&CHAR
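
In case it helps, here is a rough sketch of the kind of helper code you could paste into the definitions section of the .lex file to support an exclusion list. None of this is in the existing tokeniser: the names ExclList, excl_add, excl_load, excl_contains and excl_free are made up for illustration, and the one-exclusion-per-line file format is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical exclusion list: a growable array of strings.
   Not part of the existing tokeniser; purely illustrative. */
typedef struct {
    char **items;
    size_t count;
    size_t cap;
} ExclList;

/* Store a copy of word; returns 0 on success, -1 if allocation fails. */
int excl_add(ExclList *xl, const char *word)
{
    size_t n;
    char *copy;

    if (xl->count == xl->cap) {
        size_t ncap = xl->cap ? xl->cap * 2 : 16;
        char **p = realloc(xl->items, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        xl->items = p;
        xl->cap = ncap;
    }
    n = strlen(word) + 1;
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, word, n);
    xl->items[xl->count++] = copy;
    return 0;
}

/* Read exclusions from fp, assuming one per line (an assumed format).
   Blank lines are skipped; lines longer than the buffer are split. */
int excl_load(ExclList *xl, FILE *fp)
{
    char buf[256];

    while (fgets(buf, sizeof buf, fp) != NULL) {
        buf[strcspn(buf, "\r\n")] = '\0';   /* strip the line ending */
        if (buf[0] != '\0' && excl_add(xl, buf) != 0)
            return -1;
    }
    return 0;
}

/* Return 1 if tok is in the list, 0 otherwise (plain linear search). */
int excl_contains(const ExclList *xl, const char *tok)
{
    size_t i;

    for (i = 0; i < xl->count; i++)
        if (strcmp(xl->items[i], tok) == 0)
            return 1;
    return 0;
}

/* Release everything held by the list and reset it to empty. */
void excl_free(ExclList *xl)
{
    size_t i;

    for (i = 0; i < xl->count; i++)
        free(xl->items[i]);
    free(xl->items);
    xl->items = NULL;
    xl->count = 0;
    xl->cap = 0;
}
```

The idea would be to load the file once at start-up and then, in whichever rule action currently splits a token, check excl_contains(&list, yytext) first and emit the token unsplit if it matches. Which rules need that check depends on how the .lex file is organised, so treat this as a starting point rather than a drop-in change.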