/**************************************************************
JLex: A Lexical Analyzer Generator for Java(TM)
Written by Elliot Berk <ejberk@cs.princeton.edu>. Copyright 1996.
Maintained by C. Scott Ananian <cananian@alumni.princeton.edu>.
See below for copyright notice, license, and disclaimer.
New releases from http://www.cs.princeton.edu/~appel/modern/java/JLex/
Version 1.2.5, 7/25/99-5/16/00, [C. Scott Ananian]
Stomped on one more 8-bit character bug. Should work now (really!).
Added unicode support, including unicode escape sequences.
Rewrote internal JavaLexBitSet class as SparseBitSet for efficient
unicoding.
Added an NFA character class simplification pass for unicode efficiency.
Changed byte- and stream-oriented I/O routines to use characters and
java.io.Reader and java.io.Writer instead --- which means we read in
unicode specifications correctly and write out a proper unicode java
source file. As a happy side-effect, the output java file is written
with your platform's preferred newline character(s).
Rewrote CInput to fix bugs with line-counting in the specification file
and "unusual behaviour" when the last line of the specification wasn't
terminated with a newline. Thanks to Matt Hanna <mhanna@cs.caltech.edu>
for pointing out the bug.
Fixed a bug that would cause JLex not to terminate given certain input
specifications. Thanks to Mark Greenstreet <mrg@cs.ubc.ca> and
Frank B. Brokken <frank@suffix.icce.rug.nl> for reporting this.
CUP parser integration improved according to suggestions made by
David MacMahon <davidm@smartsc.com>. The %cup directive now tells
JLex to generate a parser conforming to the java_cup.runtime.Scanner
interface; see manual for more details.
Fixed bug with null string literals ("") in regexps. Reported by
Charles Fischer <fischer@cs.wisc.edu>.
Rewrote start-of-line and end-of-line handling, closing active bug #5.
Also fixed line-counting code, closing active bug #12. All
new-line handling is now platform-independent.
Used unpackFromString more extensively to allow larger cmap, etc,
tables. This helps unicode support work reliably. It's also
prettier now if you happen to read the source to the generated
lexer.
Generated lexer now accepts unicode LS (U+2028) and PS (U+2029) as
line separators for strict unicode compliance; see
http://www.unicode.org/unicode/reports/tr18/
Fixed bug with character constants in action strings. Reported by
Andrew Appel against 1.2.5b3.
Fixed bug with illegal \^C-style escape sequences. Reported by
Toshiya Iwai <iwai@isdnet.co.jp> against 1.2.5b4.
Fixed "newline in quoted string" error when unpaired single- or
double-quotes were present in comments in the action phrase.
Reported by Stephen Ostermiller <1010JLex@ostermiller.com>
against 1.2.5b4. Reported by Eric Esposito <eric.esposito@unh.edu>
against 1.2.4 and 1.2.5b2.
Fixed "newline in quoted string" error when /* or // appeared
in quoted strings in the action phrase. Reported by
David Eichmann <david-eichmann@uiowa.edu> against 1.2.5b5.
Fixed 'illegal constant' errors in case statements caused by
Sun's JDK 1.3 more closely adhering to the Java Language
Specification. Reported by a number of people, but
Harold Grovesteen <hgrovesteen@home.com> was the first to direct me to
a Sun bug report (4119776) which quoted the relevant section of the
JLS (15.27) to convince me that the JLex construction actually was
illegal. Reported against 1.2.5b6, but this bit of code has been
present since the very first version of JLex (1.1.1).
Version 1.2.4, 7/24/99, [C. Scott Ananian]
Correct the parsing of '-' in character classes, closing active
bug #1. Behaviour follows egrep: leading and trailing dashes in
a character class lose their special meaning, so [-+] and [+-] do
what you would expect them to.
New %ignorecase directive for generating case-insensitive lexers by
expanding matched character classes in a unicode-friendly way.
Handle unmatched braces in quoted strings or comments within
action code blocks.
Fixed input lexer to allow whitespace in character classes, closing
active bug #9. Whitespace in quotes had been previously fixed.
Made Yylex.YYEOF and %yyeof work like the manual says they should.
Version 1.2.3, 6/26/97, [Raimondas Lencevicius]
Fixed the yy_nxt[][] assignment that has generated huge code
exceeding 64K method size limit. Now the assignment
is handled by unpacking a string encoding of integer array.
To achieve that, added
"private int [][] unpackFromString(int size1, int size2, String st)"
function and coded the yy_nxt[][] values into a string
by printing integers into a string and representing
integer sequences as "value:length" pairs.
Improvement: generated .java file reduced 2 times, .class file
reduced 6 times for sample grammar. No 64K errors.
Possible negatives: Some editors and OSs may not be able to handle
the huge one-line generated string. String unpacking may be slower
than direct array initialization.
Version 1.2.2, 10/24/97, [Martin Dirichs]
Notes:
Changed yy_instream to yy_reader of type BufferedReader. This reflects
the improvements in the JDK 1.1 concerning InputStreams. As a
consequence, changed yy_buffer from byte[] to char[].
The lexer can now be initialized with either an InputStream
or a Reader. A third, private constructor is called by the other
two to execute user specified constructor code.
Version 1.2.1, 9/15/97 [A. Appel]
Fixed bugs 6 (character codes > 127) and 10 (deprecated String constructor).
Version 1.2, 5/5/97, [Elliot Berk]
Notes:
Simply changed the name from JavaLex to JLex. No other changes.
Version 1.1.5, 2/25/97, [Elliot Berk]
Notes:
Simple optimization to the creation of the source files.
Added a BufferedOutputStream in the creation of the DataOutputStream
field m_outstream of the class CLexGen. This helps performance by
doing some buffering, and was suggested by Max Hailperin,
Associate Professor of Computer Science, Gustavus Adolphus College.
Version 1.1.4, 12/12/96, [Elliot Berk]
Notes:
Added %public directive to make generated class public.
Version 1.1.3, 12/11/96, [Elliot Berk]
Notes:
Converted assertion failure on invalid character class
when a dash '-' is not preceded with a start-of-range character.
Converted this into parse error E_DASH.
Version 1.1.2, October 30, 1996 [Elliot Berk]
Fixed BitSet bugs by installing a BitSet class of my own,
called JavaLexBitSet. Fixed support for '\r', non-UNIX
sequences. Added try/catch block around lexer generation
in main routine to moderate error information presented
to user. Fixed macro expansion, so that macros following
quotes are expanded correctly in regular expressions.
Fixed dynamic reallocation of accept action buffers.
Version 1.1.1, September 3, 1996 [Andrew Appel]
Made the class "Main" instead of "JavaLex",
improved the installation instructions to reflect this.
Version 1.1, August 15, 1996 [Andrew Appel]
Made yychar, yyline, yytext global to the lexer so that
auxiliary functions can access them.
**************************************************************/
/***************************************************************
JLEX COPYRIGHT NOTICE, LICENSE, AND DISCLAIMER
Copyright 1996-2000 by Elliot Joel Berk and C. Scott Ananian
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both the copyright notice and this permission notice and warranty
disclaimer appear in supporting documentation, and that the name of
the authors or their employers not be used in advertising or publicity
pertaining to distribution of the software without specific, written
prior permission.
The authors and their employers disclaim all warranties with regard to
this software, including all implied warranties of merchantability and
fitness. In no event shall the authors or their employers be liable
for any special, indirect or consequential damages or any damages
whatsoever resulting from loss of use, data or profits, whether in an
action of contract, negligence or other tortious action, arising out
of or in connection with the use or performance of this software.
**************************************************************/
/***************************************************************
Package Declaration
**************************************************************/
// package JLex;
/***************************************************************
Imported Packages
**************************************************************/
import java.lang.System;
import java.lang.Integer;
import java.lang.Character;
import java.util.Enumeration;
import java.util.Stack;
import java.util.Hashtable;
import java.util.Vector;
/******************************
Questions:
2) How should I use the Java package system
to make my tool more modularized and
coherent?
Unimplemented:
!) Fix BitSet issues -- expand only when necessary.
2) Repeated accept rules.
6) Clean up the CAlloc class and use buffered
allocation.
9) Add to spec about extending character set.
11) m_verbose -- what should be done with it?
12) turn lexical analyzer into a coherent
Java package
13) turn lexical analyzer generator into a
coherent Java package
16) pretty up generated code
17) make it possible to have white space in
regular expressions
18) clean up all of the class files the lexer
generator produces when it is compiled,
and reduce this number in some way.
24) character format to and from file: writeup
and implementation
25) Debug by testing all arcane regular expression cases.
26) Look for and fix all UNDONE comments below.
27) Fix package system.
28) Clean up unnecessary classes.
*****************************/
/***************************************************************
Class: CSpec
**************************************************************/
class CSpec
{
/***************************************************************
Member Variables
**************************************************************/
/* Lexical States. */
Hashtable m_states; /* Hashtable taking state indices (Integer)
to state name (String). */
/* Regular Expression Macros. */
Hashtable m_macros; /* Hashtable taking macro name (String)
to corresponding char buffer that
holds macro definition. */
/* NFA Machine. */
CNfa m_nfa_start; /* Start state of NFA machine. */
Vector m_nfa_states; /* Vector of states, with index
corresponding to label. */
Vector m_state_rules[]; /* An array of Vectors of Integers.
The ith Vector represents the lexical state
with index i. The contents of the ith
Vector are the indices of the NFA start
states that can be matched while in
the ith lexical state. */
int m_state_dtrans[];
/* DFA Machine. */
Vector m_dfa_states; /* Vector of states, with index
corresponding to label. */
Hashtable m_dfa_sets; /* Hashtable taking set of NFA states
to corresponding DFA state,
if the latter exists. */
/* Accept States and Corresponding Anchors. */
Vector m_accept_vector;
int m_anchor_array[];
/* Transition Table. */
Vector m_dtrans_vector;
int m_dtrans_ncols;
int m_row_map[];
int m_col_map[];
/* Special pseudo-characters for beginning-of-line and end-of-file. */
static final int NUM_PSEUDO=2;
int BOL; // beginning-of-line
int EOF; // end-of-line
/** NFA character class minimization map. */
int m_ccls_map[];
/* Regular expression token variables. */
int m_current_token;
char m_lexeme;
boolean m_in_quote;
boolean m_in_ccl;
/* Verbose execution flag. */
boolean m_verbose;
/* JLex directives flags. */
boolean m_integer_type;
boolean m_intwrap_type;
boolean m_yyeof;
boolean m_count_chars;
boolean m_count_lines;
boolean m_cup_compatible;
boolean m_unix;
boolean m_public;
boolean m_ignorecase;
char m_init_code[];
int m_init_read;
char m_init_throw_code[];
int m_init_throw_read;
char m_class_code[];
int m_class_read;
char m_eof_code[];
int m_eof_read;
char m_eof_value_code[];
int m_eof_value_read;
char m_eof_throw_code[];
int m_eof_throw_read;
char m_yylex_throw_code[];
int m_yylex_throw_read;
/* Class, function, type names. */
char m_class_name[] = {
'Y', 'y', 'l',
'e', 'x'
};
char m_implements_name[] = {};
char m_function_name[] = {
'y', 'y', 'l',
'e', 'x'
};
char m_type_name[] = {
'Y', 'y', 't',
'o', 'k', 'e',
'n'
};
/* Lexical Generator. */
private CLexGen m_lexGen;
/***************************************************************
Constants
***********************************************************/
static final int NONE = 0;
static final int START = 1;
static final int END = 2;
/***************************************************************
Function: CSpec
Description: Constructor.
**************************************************************/
CSpec
(
CLexGen lexGen
)
{
m_lexGen = lexGen;
/* Initialize regular expression token variables. */
m_current_token = m_lexGen.EOS;
m_lexeme = '\0';
m_in_quote = false;
m_in_ccl = false;
/* Initialize hashtable for lexer states. */
m_states = new Hashtable();
m_states.put(new String("YYINITIAL"),new Integer(m_states.size()));
/* Initialize hashtable for lexical macros. */
m_macros = new Hashtable();
/* Initialize variables for lexer options. */
m_integer_type = false;
m_intwrap_type = false;
m_count_lines = false;
m_count_chars = false;
m_cup_compatible = false;
m_unix = true;
m_public = false;
m_yyeof = false;
m_ignorecase = false;
/* Initialize variables for JLex runtime options. */
m_verbose = true;
m_nfa_start = null;
m_nfa_states = new Vector();
m_dfa_states = new Vector();
m_dfa_sets = new Hashtable();
m_dtrans_vector = new Vector();
m_dtrans_ncols = CUtility.MAX_SEVEN_BIT + 1;
m_row_map = null;
m_col_map = null;
m_accept_vector = null;
m_anchor_array = null;
m_init_code = null;
m_init_read = 0;
m_init_throw_code = null;
m_init_throw_read = 0;
m_yylex_throw_code = null;
m_yylex_throw_read = 0;
m_class_code = null;
m_class_read = 0;
m_eof_code = null;
m_eof_read = 0;
m_eof_value_code = null;
m_eof_value_read = 0;
m_eof_throw_code = null;
m_eof_throw_read = 0;
m_state_dtrans = null;
m_state_rules = null;
}
}
/***************************************************************
Class: CEmit
**************************************************************/
class CEmit
{
/***************************************************************
Member Variables
**************************************************************/
private CSpec m_spec;
private java.io.PrintWriter m_outstream;
/***************************************************************
Constants: Anchor Types
**************************************************************/
private final int START = 1;
private final int END = 2;
private final int NONE = 4;
/***************************************************************
Constants
**************************************************************/
private final boolean EDBG = true;
private final boolean NOT_EDBG = false;
/***************************************************************
Function: CEmit
Description: Constructor.
**************************************************************/
CEmit
(
)
{
reset();
}
/***************************************************************
Function: reset
Description: Clears member variables.
**************************************************************/
private void reset
(
)
{
m_spec = null;
m_outstream = null;
}
/***************************************************************
Function: set
Description: Initializes member variables.
**************************************************************/
private void set
(
CSpec spec,
java.io.PrintWriter outstream
)
{
if (CUtility.DEBUG)
{
CUtility.assert(null != spec);
CUtility.assert(null != outstream);
}
m_spec = spec;
m_outstream = outstream;
}
/***************************************************************
Function: emit_imports
Description: Emits import packages at top of
generated source file.
**************************************************************/
/*void emit_imports
(
CSpec spec,
OutputStream outstream
)
throws java.io.IOException
{
set(spec,outstream);
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}*/
/*m_outstream.println("import java.lang.String;");
m_outstream.println("import java.lang.System;");
m_outstream.println("import java.io.BufferedReader;");
m_outstream.println("import java.io.InputStream;");*/
/*
reset();
}*/
/***************************************************************
Function: print_details
Description: Debugging output.
**************************************************************/
private void print_details
(
)
{
int i;
int j;
int next;
int state;
CDTrans dtrans;
CAccept accept;
boolean tr;
System.out.println("---------------------- Transition Table "
+ "----------------------");
for (i = 0; i < m_spec.m_row_map.length; ++i)
{
System.out.print("State " + i);
accept = (CAccept) m_spec.m_accept_vector.elementAt(i);
if (null == accept)
{
System.out.println(" [nonaccepting]");
}
else
{
System.out.println(" [accepting, line "
+ accept.m_line_number
+ " <"
+ (new java.lang.String(accept.m_action,0,
accept.m_action_read))
+ ">]");
}
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(m_spec.m_row_map[i]);
tr = false;
state = dtrans.m_dtrans[m_spec.m_col_map[0]];
if (CDTrans.F != state)
{
tr = true;
System.out.print("\tgoto " + state + " on [" + ((char) 0));
}
for (j = 1; j < m_spec.m_dtrans_ncols; ++j)
{
next = dtrans.m_dtrans[m_spec.m_col_map[j]];
if (state == next)
{
if (CDTrans.F != state)
{
System.out.print((char) j);
}
}
else
{
state = next;
if (tr)
{
System.out.println("]");
tr = false;
}
if (CDTrans.F != state)
{
tr = true;
System.out.print("\tgoto " + state + " on [" + ((char) j));
}
}
}
if (tr)
{
System.out.println("]");
}
}
System.out.println("---------------------- Transition Table "
+ "----------------------");
}
/***************************************************************
Function: emit
Description: High-level access function to module.
**************************************************************/
void emit
(
CSpec spec,
java.io.PrintWriter outstream
)
throws java.io.IOException
{
set(spec,outstream);
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
if (CUtility.OLD_DEBUG) {
print_details();
}
emit_header();
emit_construct();
emit_helpers();
emit_driver();
emit_footer();
reset();
}
/***************************************************************
Function: emit_construct
Description: Emits constructor, member variables,
and constants.
**************************************************************/
private void emit_construct
(
)
throws java.io.IOException
{
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
/* Constants */
m_outstream.println("\tprivate final int YY_BUFFER_SIZE = 512;");
m_outstream.println("\tprivate final int YY_F = -1;");
m_outstream.println("\tprivate final int YY_NO_STATE = -1;");
m_outstream.println("\tprivate final int YY_NOT_ACCEPT = 0;");
m_outstream.println("\tprivate final int YY_START = 1;");
m_outstream.println("\tprivate final int YY_END = 2;");
m_outstream.println("\tprivate final int YY_NO_ANCHOR = 4;");
// internal
m_outstream.println("\tprivate final int YY_BOL = "+m_spec.BOL+";");
m_outstream.println("\tprivate final int YY_EOF = "+m_spec.EOF+";");
// external
if (m_spec.m_integer_type || true == m_spec.m_yyeof)
m_outstream.println("\tpublic final int YYEOF = -1;");
/* User specified class code. */
if (null != m_spec.m_class_code)
{
m_outstream.print(new String(m_spec.m_class_code,0,
m_spec.m_class_read));
}
/* Member Variables */
m_outstream.println("\tprivate java.io.BufferedReader yy_reader;");
m_outstream.println("\tprivate int yy_buffer_index;");
m_outstream.println("\tprivate int yy_buffer_read;");
m_outstream.println("\tprivate int yy_buffer_start;");
m_outstream.println("\tprivate int yy_buffer_end;");
m_outstream.println("\tprivate char yy_buffer[];");
if (m_spec.m_count_chars)
{
m_outstream.println("\tprivate int yychar;");
}
if (m_spec.m_count_lines)
{
m_outstream.println("\tprivate int yyline;");
}
m_outstream.println("\tprivate boolean yy_at_bol;");
m_outstream.println("\tprivate int yy_lexical_state;");
/*if (m_spec.m_count_lines || true == m_spec.m_count_chars)
{
m_outstream.println("\tprivate int yy_buffer_prev_start;");
}*/
m_outstream.println();
/* Function: first constructor (Reader) */
m_outstream.print("\t");
if (true == m_spec.m_public) {
m_outstream.print("public ");
}
m_outstream.print(new String(m_spec.m_class_name));
m_outstream.print(" (java.io.Reader reader)");
if (null != m_spec.m_init_throw_code)
{
m_outstream.println();
m_outstream.print("\t\tthrows ");
m_outstream.print(new String(m_spec.m_init_throw_code,0,
m_spec.m_init_throw_read));
m_outstream.println();
m_outstream.println("\t\t{");
}
else
{
m_outstream.println(" {");
}
m_outstream.println("\t\tthis ();");
m_outstream.println("\t\tif (null == reader) {");
m_outstream.println("\t\t\tthrow (new Error(\"Error: Bad input "
+ "stream initializer.\"));");
m_outstream.println("\t\t}");
m_outstream.println("\t\tyy_reader = new java.io.BufferedReader(reader);");
m_outstream.println("\t}");
m_outstream.println();
/* Function: second constructor (InputStream) */
m_outstream.print("\t");
if (true == m_spec.m_public) {
m_outstream.print("public ");
}
m_outstream.print(new String(m_spec.m_class_name));
m_outstream.print(" (java.io.InputStream instream)");
if (null != m_spec.m_init_throw_code)
{
m_outstream.println();
m_outstream.print("\t\tthrows ");
m_outstream.println(new String(m_spec.m_init_throw_code,0,
m_spec.m_init_throw_read));
m_outstream.println("\t\t{");
}
else
{
m_outstream.println(" {");
}
m_outstream.println("\t\tthis ();");
m_outstream.println("\t\tif (null == instream) {");
m_outstream.println("\t\t\tthrow (new Error(\"Error: Bad input "
+ "stream initializer.\"));");
m_outstream.println("\t\t}");
m_outstream.println("\t\tyy_reader = new java.io.BufferedReader(new java.io.InputStreamReader(instream));");
m_outstream.println("\t}");
m_outstream.println();
/* Function: third, private constructor - only for internal use */
m_outstream.print("\tprivate ");
m_outstream.print(new String(m_spec.m_class_name));
m_outstream.print(" ()");
if (null != m_spec.m_init_throw_code)
{
m_outstream.println();
m_outstream.print("\t\tthrows ");
m_outstream.println(new String(m_spec.m_init_throw_code,0,
m_spec.m_init_throw_read));
m_outstream.println("\t\t{");
}
else
{
m_outstream.println(" {");
}
m_outstream.println("\t\tyy_buffer = new char[YY_BUFFER_SIZE];");
m_outstream.println("\t\tyy_buffer_read = 0;");
m_outstream.println("\t\tyy_buffer_index = 0;");
m_outstream.println("\t\tyy_buffer_start = 0;");
m_outstream.println("\t\tyy_buffer_end = 0;");
if (m_spec.m_count_chars)
{
m_outstream.println("\t\tyychar = 0;");
}
if (m_spec.m_count_lines)
{
m_outstream.println("\t\tyyline = 0;");
}
m_outstream.println("\t\tyy_at_bol = true;");
m_outstream.println("\t\tyy_lexical_state = YYINITIAL;");
/*if (m_spec.m_count_lines || true == m_spec.m_count_chars)
{
m_outstream.println("\t\tyy_buffer_prev_start = 0;");
}*/
/* User specified constructor code. */
if (null != m_spec.m_init_code)
{
m_outstream.print(new String(m_spec.m_init_code,0,
m_spec.m_init_read));
}
m_outstream.println("\t}");
m_outstream.println();
}
/***************************************************************
Function: emit_states
Description: Emits constants that serve as lexical states,
including YYINITIAL.
**************************************************************/
private void emit_states
(
)
throws java.io.IOException
{
Enumeration states;
String state;
int index;
states = m_spec.m_states.keys();
/*index = 0;*/
while (states.hasMoreElements())
{
state = (String) states.nextElement();
if (CUtility.DEBUG)
{
CUtility.assert(null != state);
}
m_outstream.println("\tprivate final int "
+ state
+ " = "
+ (m_spec.m_states.get(state)).toString()
+ ";");
/*++index;*/
}
m_outstream.println("\tprivate final int yy_state_dtrans[] = {");
for (index = 0; index < m_spec.m_state_dtrans.length; ++index)
{
m_outstream.print("\t\t" + m_spec.m_state_dtrans[index]);
if (index < m_spec.m_state_dtrans.length - 1)
{
m_outstream.println(",");
}
else
{
m_outstream.println();
}
}
m_outstream.println("\t};");
}
/***************************************************************
Function: emit_helpers
Description: Emits helper functions, particularly
error handling and input buffering.
**************************************************************/
private void emit_helpers
(
)
throws java.io.IOException
{
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
/* Function: yy_do_eof */
m_outstream.println("\tprivate boolean yy_eof_done = false;");
if (null != m_spec.m_eof_code)
{
m_outstream.print("\tprivate void yy_do_eof ()");
if (null != m_spec.m_eof_throw_code)
{
m_outstream.println();
m_outstream.print("\t\tthrows ");
m_outstream.println(new String(m_spec.m_eof_throw_code,0,
m_spec.m_eof_throw_read));
m_outstream.println("\t\t{");
}
else
{
m_outstream.println(" {");
}
m_outstream.println("\t\tif (false == yy_eof_done) {");
m_outstream.print(new String(m_spec.m_eof_code,0,
m_spec.m_eof_read));
m_outstream.println("\t\t}");
m_outstream.println("\t\tyy_eof_done = true;");
m_outstream.println("\t}");
}
emit_states();
/* Function: yybegin */
m_outstream.println("\tprivate void yybegin (int state) {");
m_outstream.println("\t\tyy_lexical_state = state;");
m_outstream.println("\t}");
/* Function: yy_initial_dtrans */
/*m_outstream.println("\tprivate int yy_initial_dtrans (int state) {");
m_outstream.println("\t\treturn yy_state_dtrans[state];");
m_outstream.println("\t}");*/
/* Function: yy_advance */
m_outstream.println("\tprivate int yy_advance ()");
m_outstream.println("\t\tthrows java.io.IOException {");
/*m_outstream.println("\t\t{");*/
m_outstream.println("\t\tint next_read;");
m_outstream.println("\t\tint i;");
m_outstream.println("\t\tint j;");
m_outstream.println();
m_outstream.println("\t\tif (yy_buffer_index < yy_buffer_read) {");
m_outstream.println("\t\t\treturn yy_buffer[yy_buffer_index++];");
/*m_outstream.println("\t\t\t++yy_buffer_index;");*/
m_outstream.println("\t\t}");
m_outstream.println();
m_outstream.println("\t\tif (0 != yy_buffer_start) {");
m_outstream.println("\t\t\ti = yy_buffer_start;");
m_outstream.println("\t\t\tj = 0;");
m_outstream.println("\t\t\twhile (i < yy_buffer_read) {");
m_outstream.println("\t\t\t\tyy_buffer[j] = yy_buffer[i];");
m_outstream.println("\t\t\t\t++i;");
m_outstream.println("\t\t\t\t++j;");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\tyy_buffer_end = yy_buffer_end - yy_buffer_start;");
m_outstream.println("\t\t\tyy_buffer_start = 0;");
m_outstream.println("\t\t\tyy_buffer_read = j;");
m_outstream.println("\t\t\tyy_buffer_index = j;");
m_outstream.println("\t\t\tnext_read = yy_reader.read(yy_buffer,");
m_outstream.println("\t\t\t\t\tyy_buffer_read,");
m_outstream.println("\t\t\t\t\tyy_buffer.length - yy_buffer_read);");
m_outstream.println("\t\t\tif (-1 == next_read) {");
m_outstream.println("\t\t\t\treturn YY_EOF;");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\tyy_buffer_read = yy_buffer_read + next_read;");
m_outstream.println("\t\t}");
m_outstream.println();
m_outstream.println("\t\twhile (yy_buffer_index >= yy_buffer_read) {");
m_outstream.println("\t\t\tif (yy_buffer_index >= yy_buffer.length) {");
m_outstream.println("\t\t\t\tyy_buffer = yy_double(yy_buffer);");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\tnext_read = yy_reader.read(yy_buffer,");
m_outstream.println("\t\t\t\t\tyy_buffer_read,");
m_outstream.println("\t\t\t\t\tyy_buffer.length - yy_buffer_read);");
m_outstream.println("\t\t\tif (-1 == next_read) {");
m_outstream.println("\t\t\t\treturn YY_EOF;");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\tyy_buffer_read = yy_buffer_read + next_read;");
m_outstream.println("\t\t}");
m_outstream.println("\t\treturn yy_buffer[yy_buffer_index++];");
m_outstream.println("\t}");
/* Function: yy_move_end */
m_outstream.println("\tprivate void yy_move_end () {");
m_outstream.println("\t\tif (yy_buffer_end > yy_buffer_start &&");
m_outstream.println("\t\t '\\n' == yy_buffer[yy_buffer_end-1])");
m_outstream.println("\t\t\tyy_buffer_end--;");
m_outstream.println("\t\tif (yy_buffer_end > yy_buffer_start &&");
m_outstream.println("\t\t '\\r' == yy_buffer[yy_buffer_end-1])");
m_outstream.println("\t\t\tyy_buffer_end--;");
m_outstream.println("\t}");
/* Function: yy_mark_start */
m_outstream.println("\tprivate boolean yy_last_was_cr=false;");
m_outstream.println("\tprivate void yy_mark_start () {");
if (m_spec.m_count_lines || true == m_spec.m_count_chars)
{
if (m_spec.m_count_lines)
{
m_outstream.println("\t\tint i;");
m_outstream.println("\t\tfor (i = yy_buffer_start; "
+ "i < yy_buffer_index; ++i) {");
m_outstream.println("\t\t\tif ('\\n' == yy_buffer[i] && !yy_last_was_cr) {");
m_outstream.println("\t\t\t\t++yyline;");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\tif ('\\r' == yy_buffer[i]) {");
m_outstream.println("\t\t\t\t++yyline;");
m_outstream.println("\t\t\t\tyy_last_was_cr=true;");
m_outstream.println("\t\t\t} else yy_last_was_cr=false;");
m_outstream.println("\t\t}");
}
if (m_spec.m_count_chars)
{
m_outstream.println("\t\tyychar = yychar");
m_outstream.println("\t\t\t+ yy_buffer_index - yy_buffer_start;");
}
}
m_outstream.println("\t\tyy_buffer_start = yy_buffer_index;");
m_outstream.println("\t}");
/* Function: yy_mark_end */
m_outstream.println("\tprivate void yy_mark_end () {");
m_outstream.println("\t\tyy_buffer_end = yy_buffer_index;");
m_outstream.println("\t}");
/* Function: yy_to_mark */
m_outstream.println("\tprivate void yy_to_mark () {");
m_outstream.println("\t\tyy_buffer_index = yy_buffer_end;");
m_outstream.println("\t\tyy_at_bol = "+
"(yy_buffer_end > yy_buffer_start) &&");
m_outstream.println("\t\t "+
"('\\r' == yy_buffer[yy_buffer_end-1] ||");
m_outstream.println("\t\t "+
" '\\n' == yy_buffer[yy_buffer_end-1] ||");
m_outstream.println("\t\t "+ /* unicode LS */
" 2028/*LS*/ == yy_buffer[yy_buffer_end-1] ||");
m_outstream.println("\t\t "+ /* unicode PS */
" 2029/*PS*/ == yy_buffer[yy_buffer_end-1]);");
m_outstream.println("\t}");
/* Function: yytext */
m_outstream.println("\tprivate java.lang.String yytext () {");
m_outstream.println("\t\treturn (new java.lang.String(yy_buffer,");
m_outstream.println("\t\t\tyy_buffer_start,");
m_outstream.println("\t\t\tyy_buffer_end - yy_buffer_start));");
m_outstream.println("\t}");
/* Function: yylength */
m_outstream.println("\tprivate int yylength () {");
m_outstream.println("\t\treturn yy_buffer_end - yy_buffer_start;");
m_outstream.println("\t}");
/* Function: yy_double */
m_outstream.println("\tprivate char[] yy_double (char buf[]) {");
m_outstream.println("\t\tint i;");
m_outstream.println("\t\tchar newbuf[];");
m_outstream.println("\t\tnewbuf = new char[2*buf.length];");
m_outstream.println("\t\tfor (i = 0; i < buf.length; ++i) {");
m_outstream.println("\t\t\tnewbuf[i] = buf[i];");
m_outstream.println("\t\t}");
m_outstream.println("\t\treturn newbuf;");
m_outstream.println("\t}");
/* Function: yy_error */
m_outstream.println("\tprivate final int YY_E_INTERNAL = 0;");
m_outstream.println("\tprivate final int YY_E_MATCH = 1;");
m_outstream.println("\tprivate java.lang.String yy_error_string[] = {");
m_outstream.println("\t\t\"Error: Internal error.\\n\",");
m_outstream.println("\t\t\"Error: Unmatched input.\\n\"");
m_outstream.println("\t};");
m_outstream.println("\tprivate void yy_error (int code,boolean fatal) {");
m_outstream.println("\t\tjava.lang.System.out.print(yy_error_string[code]);");
m_outstream.println("\t\tjava.lang.System.out.flush();");
m_outstream.println("\t\tif (fatal) {");
m_outstream.println("\t\t\tthrow new Error(\"Fatal Error.\\n\");");
m_outstream.println("\t\t}");
m_outstream.println("\t}");
/* Function: yy_next */
/*m_outstream.println("\tprivate int yy_next (int current,char lookahead) {");
m_outstream.println("\t\treturn yy_nxt[yy_rmap[current]][yy_cmap[lookahead]];");
m_outstream.println("\t}");*/
/* Function: yy_accept */
/*m_outstream.println("\tprivate int yy_accept (int current) {");
m_outstream.println("\t\treturn yy_acpt[current];");
m_outstream.println("\t}");*/
// Function: private int [][] unpackFromString(int size1, int size2, String st)
// Added 6/24/98 Raimondas Lencevicius
// May be made more efficient by replacing String operations
// Assumes correctly formed input String. Performs no error checking
m_outstream.println("\tprivate int[][] unpackFromString"+
"(int size1, int size2, String st) {");
m_outstream.println("\t\tint colonIndex = -1;");
m_outstream.println("\t\tString lengthString;");
m_outstream.println("\t\tint sequenceLength = 0;");
m_outstream.println("\t\tint sequenceInteger = 0;");
m_outstream.println();
m_outstream.println("\t\tint commaIndex;");
m_outstream.println("\t\tString workString;");
m_outstream.println();
m_outstream.println("\t\tint res[][] = new int[size1][size2];");
m_outstream.println("\t\tfor (int i= 0; i < size1; i++) {");
m_outstream.println("\t\t\tfor (int j= 0; j < size2; j++) {");
m_outstream.println("\t\t\t\tif (sequenceLength != 0) {");
m_outstream.println("\t\t\t\t\tres[i][j] = sequenceInteger;");
m_outstream.println("\t\t\t\t\tsequenceLength--;");
m_outstream.println("\t\t\t\t\tcontinue;");
m_outstream.println("\t\t\t\t}");
m_outstream.println("\t\t\t\tcommaIndex = st.indexOf(',');");
m_outstream.println("\t\t\t\tworkString = (commaIndex==-1) ? st :");
m_outstream.println("\t\t\t\t\tst.substring(0, commaIndex);");
m_outstream.println("\t\t\t\tst = st.substring(commaIndex+1);");
m_outstream.println("\t\t\t\tcolonIndex = workString.indexOf(':');");
m_outstream.println("\t\t\t\tif (colonIndex == -1) {");
m_outstream.println("\t\t\t\t\tres[i][j]=Integer.parseInt(workString);");
m_outstream.println("\t\t\t\t\tcontinue;");
m_outstream.println("\t\t\t\t}");
m_outstream.println("\t\t\t\tlengthString =");
m_outstream.println("\t\t\t\t\tworkString.substring(colonIndex+1);");
m_outstream.println("\t\t\t\tsequenceLength="+
"Integer.parseInt(lengthString);");
m_outstream.println("\t\t\t\tworkString="+
"workString.substring(0,colonIndex);");
m_outstream.println("\t\t\t\tsequenceInteger="+
"Integer.parseInt(workString);");
m_outstream.println("\t\t\t\tres[i][j] = sequenceInteger;");
m_outstream.println("\t\t\t\tsequenceLength--;");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t}");
m_outstream.println("\t\treturn res;");
m_outstream.println("\t}");
}
/***************************************************************
Function: emit_header
Description: Emits class header.
**************************************************************/
private void emit_header
(
)
throws java.io.IOException
{
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
m_outstream.println();
m_outstream.println();
if (true == m_spec.m_public) {
m_outstream.print("public ");
}
m_outstream.print("class ");
m_outstream.print(new String(m_spec.m_class_name,0,
m_spec.m_class_name.length));
if (m_spec.m_implements_name.length > 0) {
m_outstream.print(" implements ");
m_outstream.print(new String(m_spec.m_implements_name,0,
m_spec.m_implements_name.length));
}
m_outstream.println(" {");
}
/***************************************************************
Function: emit_table
Description: Emits transition table.
**************************************************************/
private void emit_table
(
)
throws java.io.IOException
{
int i;
int elem;
int size;
CDTrans dtrans;
boolean is_start;
boolean is_end;
CAccept accept;
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
m_outstream.println("\tprivate int yy_acpt[] = {");
size = m_spec.m_accept_vector.size();
for (elem = 0; elem < size; ++elem)
{
accept = (CAccept) m_spec.m_accept_vector.elementAt(elem);
m_outstream.print("\t\t/* "+elem+" */ ");
if (null != accept)
{
is_start = (0 != (m_spec.m_anchor_array[elem] & CSpec.START));
is_end = (0 != (m_spec.m_anchor_array[elem] & CSpec.END));
if (is_start && true == is_end)
{
m_outstream.print("YY_START | YY_END");
}
else if (is_start)
{
m_outstream.print("YY_START");
}
else if (is_end)
{
m_outstream.print("YY_END");
}
else
{
m_outstream.print("YY_NO_ANCHOR");
}
}
else
{
m_outstream.print("YY_NOT_ACCEPT");
}
if (elem < size - 1)
{
m_outstream.print(",");
}
m_outstream.println();
}
m_outstream.println("\t};");
// CSA: modified yy_cmap to use string packing 9-Aug-1999
int[] yy_cmap = new int[m_spec.m_ccls_map.length];
for (i = 0; i < m_spec.m_ccls_map.length; ++i)
yy_cmap[i] = m_spec.m_col_map[m_spec.m_ccls_map[i]];
m_outstream.print("\tprivate int yy_cmap[] = unpackFromString(");
emit_table_as_string(new int[][] { yy_cmap });
m_outstream.println(")[0];");
m_outstream.println();
// CSA: modified yy_rmap to use string packing 9-Aug-1999
m_outstream.print("\tprivate int yy_rmap[] = unpackFromString(");
emit_table_as_string(new int[][] { m_spec.m_row_map });
m_outstream.println(")[0];");
m_outstream.println();
// 6/24/98 Raimondas Lencevicius
// modified to use
// int[][] unpackFromString(int size1, int size2, String st)
size = m_spec.m_dtrans_vector.size();
int[][] yy_nxt = new int[size][];
for (elem=0; elem<size; elem++) {
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(elem);
CUtility.assert(dtrans.m_dtrans.length==m_spec.m_dtrans_ncols);
yy_nxt[elem] = dtrans.m_dtrans;
}
m_outstream.print
("\tprivate int yy_nxt[][] = unpackFromString(");
emit_table_as_string(yy_nxt);
m_outstream.println(");");
m_outstream.println();
}
/***************************************************************
Function: emit_driver
Description: Output an integer table as a string. Written by
Raimondas Lencevicius 6/24/98; reorganized by CSA 9-Aug-1999.
From his original comments:
yy_nxt[][] values are coded into a string
by printing integers and representing
integer sequences as "value:length" pairs.
**************************************************************/
private void emit_table_as_string(int[][] ia) {
int sequenceLength = 0; // RL - length of the number sequence
boolean sequenceStarted = false; // RL - has number sequence started?
int previousInt = -20; // RL - Bogus -20 state.
// RL - Output matrix size
m_outstream.print(ia.length);
m_outstream.print(",");
m_outstream.print(ia.length>0?ia[0].length:0);
m_outstream.println(",");
StringBuffer outstr = new StringBuffer();
// RL - Output matrix
for (int elem = 0; elem < ia.length; ++elem)
{
for (int i = 0; i < ia[elem].length; ++i)
{
int writeInt = ia[elem][i];
if (writeInt == previousInt) // RL - sequence?
{
if (sequenceStarted)
{
sequenceLength++;
}
else
{
outstr.append(writeInt);
outstr.append(":");
sequenceLength = 2;
sequenceStarted = true;
}
}
else // RL - no sequence or end sequence
{
if (sequenceStarted)
{
outstr.append(sequenceLength);
outstr.append(",");
sequenceLength = 0;
sequenceStarted = false;
}
else
{
if (previousInt != -20)
{
outstr.append(previousInt);
outstr.append(",");
}
}
}
previousInt = writeInt;
// CSA: output in 75 character chunks.
if (outstr.length() > 75) {
String s = outstr.toString();
m_outstream.println("\""+s.substring(0,75)+"\" +");
outstr = new StringBuffer(s.substring(75));
}
}
}
if (sequenceStarted)
{
outstr.append(sequenceLength);
}
else
{
outstr.append(previousInt);
}
// CSA: output in 75 character chunks.
if (outstr.length() > 75) {
String s = outstr.toString();
m_outstream.println("\""+s.substring(0,75)+"\" +");
outstr = new StringBuffer(s.substring(75));
}
m_outstream.print("\""+outstr+"\"");
}
/***************************************************************
Function: emit_driver
Description:
**************************************************************/
private void emit_driver
(
)
throws java.io.IOException
{
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
emit_table();
if (m_spec.m_integer_type)
{
m_outstream.print("\tpublic int ");
m_outstream.print(new String(m_spec.m_function_name));
m_outstream.println(" ()");
}
else if (m_spec.m_intwrap_type)
{
m_outstream.print("\tpublic java.lang.Integer ");
m_outstream.print(new String(m_spec.m_function_name));
m_outstream.println(" ()");
}
else
{
m_outstream.print("\tpublic ");
m_outstream.print(new String(m_spec.m_type_name));
m_outstream.print(" ");
m_outstream.print(new String(m_spec.m_function_name));
m_outstream.println(" ()");
}
/*m_outstream.println("\t\tthrows java.io.IOException {");*/
m_outstream.print("\t\tthrows java.io.IOException");
if (null != m_spec.m_yylex_throw_code)
{
m_outstream.print(", ");
m_outstream.print(new String(m_spec.m_yylex_throw_code,0,
m_spec.m_yylex_throw_read));
m_outstream.println();
m_outstream.println("\t\t{");
}
else
{
m_outstream.println(" {");
}
m_outstream.println("\t\tint yy_lookahead;");
m_outstream.println("\t\tint yy_anchor = YY_NO_ANCHOR;");
/*m_outstream.println("\t\tint yy_state "
+ "= yy_initial_dtrans(yy_lexical_state);");*/
m_outstream.println("\t\tint yy_state "
+ "= yy_state_dtrans[yy_lexical_state];");
m_outstream.println("\t\tint yy_next_state = YY_NO_STATE;");
/*m_outstream.println("\t\tint yy_prev_stave = YY_NO_STATE;");*/
m_outstream.println("\t\tint yy_last_accept_state = YY_NO_STATE;");
m_outstream.println("\t\tboolean yy_initial = true;");
m_outstream.println("\t\tint yy_this_accept;");
m_outstream.println();
m_outstream.println("\t\tyy_mark_start();");
/*m_outstream.println("\t\tyy_this_accept = yy_accept(yy_state);");*/
m_outstream.println("\t\tyy_this_accept = yy_acpt[yy_state];");
m_outstream.println("\t\tif (YY_NOT_ACCEPT != yy_this_accept) {");
m_outstream.println("\t\t\tyy_last_accept_state = yy_state;");
m_outstream.println("\t\t\tyy_mark_end();");
m_outstream.println("\t\t}");
if (NOT_EDBG)
{
m_outstream.println("\t\tjava.lang.System.out.println(\"Begin\");");
}
m_outstream.println("\t\twhile (true) {");
m_outstream.println("\t\t\tif (yy_initial && yy_at_bol) "+
"yy_lookahead = YY_BOL;");
m_outstream.println("\t\t\telse yy_lookahead = yy_advance();");
m_outstream.println("\t\t\tyy_next_state = YY_F;");
/*m_outstream.println("\t\t\t\tyy_next_state = "
+ "yy_next(yy_state,yy_lookahead);");*/
m_outstream.println("\t\t\tyy_next_state = "
+ "yy_nxt[yy_rmap[yy_state]][yy_cmap[yy_lookahead]];");
if (NOT_EDBG)
{
m_outstream.println("java.lang.System.out.println(\"Current state: \""
+ " + yy_state");
m_outstream.println("+ \"\tCurrent input: \"");
m_outstream.println(" + ((char) yy_lookahead));");
}
if (NOT_EDBG)
{
m_outstream.println("\t\t\tjava.lang.System.out.println(\"State = \""
+ "+ yy_state);");
m_outstream.println("\t\t\tjava.lang.System.out.println(\"Accepting status = \""
+ "+ yy_this_accept);");
m_outstream.println("\t\t\tjava.lang.System.out.println(\"Last accepting state = \""
+ "+ yy_last_accept_state);");
m_outstream.println("\t\t\tjava.lang.System.out.println(\"Next state = \""
+ "+ yy_next_state);");
m_outstream.println("\t\t\tjava.lang.System.out.println(\"Lookahead input = \""
+ "+ ((char) yy_lookahead));");
}
// handle bare EOF.
m_outstream.println("\t\t\tif (YY_EOF == yy_lookahead "
+ "&& true == yy_initial) {");
if (null != m_spec.m_eof_code)
{
m_outstream.println("\t\t\t\tyy_do_eof();");
}
if (true == m_spec.m_integer_type)
{
m_outstream.println("\t\t\t\treturn YYEOF;");
}
else if (null != m_spec.m_eof_value_code)
{
m_outstream.print(new String(m_spec.m_eof_value_code,0,
m_spec.m_eof_value_read));
}
else
{
m_outstream.println("\t\t\t\treturn null;");
}
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\tif (YY_F != yy_next_state) {");
m_outstream.println("\t\t\t\tyy_state = yy_next_state;");
m_outstream.println("\t\t\t\tyy_initial = false;");
/*m_outstream.println("\t\t\t\tyy_this_accept = yy_accept(yy_state);");*/
m_outstream.println("\t\t\t\tyy_this_accept = yy_acpt[yy_state];");
m_outstream.println("\t\t\t\tif (YY_NOT_ACCEPT != yy_this_accept) {");
m_outstream.println("\t\t\t\t\tyy_last_accept_state = yy_state;");
m_outstream.println("\t\t\t\t\tyy_mark_end();");
m_outstream.println("\t\t\t\t}");
/*m_outstream.println("\t\t\t\tyy_prev_state = yy_state;");*/
/*m_outstream.println("\t\t\t\tyy_state = yy_next_state;");*/
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t\telse {");
m_outstream.println("\t\t\t\tif (YY_NO_STATE == yy_last_accept_state) {");
/*m_outstream.println("\t\t\t\t\tyy_error(YY_E_MATCH,false);");
m_outstream.println("\t\t\t\t\tyy_initial = true;");
m_outstream.println("\t\t\t\t\tyy_state "
+ "= yy_state_dtrans[yy_lexical_state];");
m_outstream.println("\t\t\t\t\tyy_next_state = YY_NO_STATE;");*/
/*m_outstream.println("\t\t\t\t\tyy_prev_state = YY_NO_STATE;");*/
/*m_outstream.println("\t\t\t\t\tyy_last_accept_state = YY_NO_STATE;");
m_outstream.println("\t\t\t\t\tyy_mark_start();");*/
/*m_outstream.println("\t\t\t\t\tyy_this_accept = yy_accept(yy_state);");*/
/*m_outstream.println("\t\t\t\t\tyy_this_accept = yy_acpt[yy_state];");
m_outstream.println("\t\t\t\t\tif (YY_NOT_ACCEPT != yy_this_accept) {");
m_outstream.println("\t\t\t\t\t\tyy_last_accept_state = yy_state;");
m_outstream.println("\t\t\t\t\t}");*/
m_outstream.println("\t\t\t\t\tthrow (new Error(\"Lexical Error: Unmatched Input.\"));");
m_outstream.println("\t\t\t\t}");
m_outstream.println("\t\t\t\telse {");
m_outstream.println("\t\t\t\t\tyy_anchor = yy_acpt[yy_last_accept_state];");
/*m_outstream.println("\t\t\t\t\tyy_anchor "
+ "= yy_accept(yy_last_accept_state);");*/
m_outstream.println("\t\t\t\t\tif (0 != (YY_END & yy_anchor)) {");
m_outstream.println("\t\t\t\t\t\tyy_move_end();");
m_outstream.println("\t\t\t\t\t}");
m_outstream.println("\t\t\t\t\tyy_to_mark();");
m_outstream.println("\t\t\t\t\tswitch (yy_last_accept_state) {");
emit_actions("\t\t\t\t\t");
m_outstream.println("\t\t\t\t\tdefault:");
m_outstream.println("\t\t\t\t\t\tyy_error(YY_E_INTERNAL,false);");
/*m_outstream.println("\t\t\t\t\t\treturn null;");*/
m_outstream.println("\t\t\t\t\tcase -1:");
m_outstream.println("\t\t\t\t\t}");
m_outstream.println("\t\t\t\t\tyy_initial = true;");
m_outstream.println("\t\t\t\t\tyy_state "
+ "= yy_state_dtrans[yy_lexical_state];");
m_outstream.println("\t\t\t\t\tyy_next_state = YY_NO_STATE;");
/*m_outstream.println("\t\t\t\t\tyy_prev_state = YY_NO_STATE;");*/
m_outstream.println("\t\t\t\t\tyy_last_accept_state = YY_NO_STATE;");
m_outstream.println("\t\t\t\t\tyy_mark_start();");
/*m_outstream.println("\t\t\t\t\tyy_this_accept = yy_accept(yy_state);");*/
m_outstream.println("\t\t\t\t\tyy_this_accept = yy_acpt[yy_state];");
m_outstream.println("\t\t\t\t\tif (YY_NOT_ACCEPT != yy_this_accept) {");
m_outstream.println("\t\t\t\t\t\tyy_last_accept_state = yy_state;");
m_outstream.println("\t\t\t\t\t\tyy_mark_end();");
m_outstream.println("\t\t\t\t\t}");
m_outstream.println("\t\t\t\t}");
m_outstream.println("\t\t\t}");
m_outstream.println("\t\t}");
m_outstream.println("\t}");
/*m_outstream.println("\t\t\t\t");
m_outstream.println("\t\t\t");
m_outstream.println("\t\t\t");
m_outstream.println("\t\t\t");
m_outstream.println("\t\t\t");
m_outstream.println("\t\t}");*/
}
/***************************************************************
Function: emit_actions
Description:
**************************************************************/
private void emit_actions
(
String tabs
)
throws java.io.IOException
{
int elem;
int size;
int bogus_index;
CAccept accept;
if (CUtility.DEBUG)
{
CUtility.assert(m_spec.m_accept_vector.size()
== m_spec.m_anchor_array.length);
}
bogus_index = -2;
size = m_spec.m_accept_vector.size();
for (elem = 0; elem < size; ++elem)
{
accept = (CAccept) m_spec.m_accept_vector.elementAt(elem);
if (null != accept)
{
m_outstream.println(tabs + "case " + elem
+ ":");
m_outstream.print(tabs + "\t");
m_outstream.print(new String(accept.m_action,0,
accept.m_action_read));
m_outstream.println();
m_outstream.println(tabs + "case " + bogus_index + ":");
m_outstream.println(tabs + "\tbreak;");
--bogus_index;
}
}
}
/***************************************************************
Function: emit_footer
Description:
**************************************************************/
private void emit_footer
(
)
throws java.io.IOException
{
if (CUtility.DEBUG)
{
CUtility.assert(null != m_spec);
CUtility.assert(null != m_outstream);
}
m_outstream.println("}");
}
}
/***************************************************************
Class: CBunch
**************************************************************/
class CBunch
{
/***************************************************************
Member Variables
**************************************************************/
Vector m_nfa_set; /* Vector of CNfa states in dfa state. */
SparseBitSet m_nfa_bit; /* BitSet representation of CNfa labels. */
CAccept m_accept; /* Accepting actions, or null if nonaccepting state. */
int m_anchor; /* Anchors on regular expression. */
int m_accept_index; /* CNfa index corresponding to accepting actions. */
/***************************************************************
Function: CBunch
Description: Constructor.
**************************************************************/
CBunch
(
)
{
m_nfa_set = null;
m_nfa_bit = null;
m_accept = null;
m_anchor = CSpec.NONE;
m_accept_index = -1;
}
}
/***************************************************************
Class: CMakeNfa
**************************************************************/
class CMakeNfa
{
/***************************************************************
Member Variables
**************************************************************/
private CSpec m_spec;
private CLexGen m_lexGen;
private CInput m_input;
/***************************************************************
Function: CMakeNfa
Description: Constructor.
**************************************************************/
CMakeNfa
(
)
{
reset();
}
/***************************************************************
Function: reset
Description: Resets CMakeNfa member variables.
**************************************************************/
private void reset
(
)
{
m_input = null;
m_lexGen = null;
m_spec = null;
}
/***************************************************************
Function: set
Description: Sets CMakeNfa member variables.
**************************************************************/
private void set
(
CLexGen lexGen,
CSpec spec,
CInput input
)
{
if (CUtility.DEBUG)
{
CUtility.assert(null != input);
CUtility.assert(null != lexGen);
CUtility.assert(null != spec);
}
m_input = input;
m_lexGen = lexGen;
m_spec = spec;
}
/***************************************************************
Function: allocate_BOL_EOF
Description: Expands character class to include special BOL and
EOF characters. Puts numeric index of these characters in
input CSpec.
**************************************************************/
void allocate_BOL_EOF
(
CSpec spec
)
{
CUtility.assert(CSpec.NUM_PSEUDO==2);
spec.BOL = spec.m_dtrans_ncols++;
spec.EOF = spec.m_dtrans_ncols++;
}
/***************************************************************
Function: thompson
Description: High level access function to module.
Deposits result in input CSpec.
**************************************************************/
void thompson
(
CLexGen lexGen,
CSpec spec,
CInput input
)
throws java.io.IOException
{
int i;
CNfa elem;
int size;
/* Set member variables. */
reset();
set(lexGen,spec,input);
size = m_spec.m_states.size();
m_spec.m_state_rules = new Vector[size];
for (i = 0; i < size; ++i)
{
m_spec.m_state_rules[i] = new Vector();
}
/* Initialize current token variable
and create nfa. */
/*m_spec.m_current_token = m_lexGen.EOS;
m_lexGen.advance();*/
m_spec.m_nfa_start = machine();
/* Set labels in created nfa machine. */
size = m_spec.m_nfa_states.size();
for (i = 0; i < size; ++i)
{
elem = (CNfa) m_spec.m_nfa_states.elementAt(i);
elem.m_label = i;
}
/* Debugging output. */
if (CUtility.DO_DEBUG)
{
m_lexGen.print_nfa();
}
if (m_spec.m_verbose)
{
System.out.println("NFA comprised of "
+ (m_spec.m_nfa_states.size() + 1)
+ " states.");
}
reset();
}
/***************************************************************
Function: discardCNfa
Description:
**************************************************************/
private void discardCNfa
(
CNfa nfa
)
{
m_spec.m_nfa_states.removeElement(nfa);
}
/***************************************************************
Function: processStates
Description:
**************************************************************/
private void processStates
(
SparseBitSet states,
CNfa current
)
{
int size;
int i;
size = m_spec.m_states.size();
for (i = 0; i < size; ++i)
{
if (states.get(i))
{
m_spec.m_state_rules[i].addElement(current);
}
}
}
/***************************************************************
Function: machine
Description: Recursive descent regular expression parser.
**************************************************************/
private CNfa machine
(
)
throws java.io.IOException
{
CNfa start;
CNfa p;
SparseBitSet states;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("machine",m_spec.m_lexeme,m_spec.m_current_token);
}
start = CAlloc.newCNfa(m_spec);
p = start;
states = m_lexGen.getStates();
/* Begin: Added for states. */
m_spec.m_current_token = m_lexGen.EOS;
m_lexGen.advance();
/* End: Added for states. */
if (m_lexGen.END_OF_INPUT != m_spec.m_current_token) // CSA fix.
{
p.m_next = rule();
processStates(states,p.m_next);
}
while (m_lexGen.END_OF_INPUT != m_spec.m_current_token)
{
/* Make state changes HERE. */
states = m_lexGen.getStates();
/* Begin: Added for states. */
m_lexGen.advance();
if (m_lexGen.END_OF_INPUT == m_spec.m_current_token)
{
break;
}
/* End: Added for states. */
p.m_next2 = CAlloc.newCNfa(m_spec);
p = p.m_next2;
p.m_next = rule();
processStates(states,p.m_next);
}
// CSA: add pseudo-rules for BOL and EOF
SparseBitSet all_states = new SparseBitSet();
for (int i = 0; i < m_spec.m_states.size(); ++i)
all_states.set(i);
p.m_next2 = CAlloc.newCNfa(m_spec);
p = p.m_next2;
p.m_next = CAlloc.newCNfa(m_spec);
p.m_next.m_edge = CNfa.CCL;
p.m_next.m_next = CAlloc.newCNfa(m_spec);
p.m_next.m_set = new CSet();
p.m_next.m_set.add(m_spec.BOL);
p.m_next.m_set.add(m_spec.EOF);
p.m_next.m_next.m_accept = // do-nothing accept rule
new CAccept(new char[0], 0, m_input.m_line_number+1);
processStates(all_states,p.m_next);
// CSA: done.
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("machine",m_spec.m_lexeme,m_spec.m_current_token);
}
return start;
}
/***************************************************************
Function: rule
Description: Recursive descent regular expression parser.
**************************************************************/
private CNfa rule
(
)
throws java.io.IOException
{
CNfaPair pair;
CNfa p;
CNfa start = null;
CNfa end = null;
int anchor = CSpec.NONE;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("rule",m_spec.m_lexeme,m_spec.m_current_token);
}
pair = CAlloc.newCNfaPair();
if (m_lexGen.AT_BOL == m_spec.m_current_token)
{
anchor = anchor | CSpec.START;
m_lexGen.advance();
expr(pair);
// CSA: fixed beginning-of-line operator. 8-aug-1999
start = CAlloc.newCNfa(m_spec);
start.m_edge = m_spec.BOL;
start.m_next = pair.m_start;
end = pair.m_end;
}
else
{
expr(pair);
start = pair.m_start;
end = pair.m_end;
}
if (m_lexGen.AT_EOL == m_spec.m_current_token)
{
m_lexGen.advance();
// CSA: fixed end-of-line operator. 8-aug-1999
CNfaPair nlpair = CAlloc.newNLPair(m_spec);
end.m_next = CAlloc.newCNfa(m_spec);
end.m_next.m_next = nlpair.m_start;
end.m_next.m_next2 = CAlloc.newCNfa(m_spec);
end.m_next.m_next2.m_edge = m_spec.EOF;
end.m_next.m_next2.m_next = nlpair.m_end;
end = nlpair.m_end;
anchor = anchor | CSpec.END;
}
/* Check for null rules. Charles Fischer found this bug. [CSA] */
if (end==null)
CError.parse_error(CError.E_ZERO, m_input.m_line_number);
/* Handle end of regular expression. See page 103. */
end.m_accept = m_lexGen.packAccept();
end.m_anchor = anchor;
/* Begin: Removed for states. */
/*m_lexGen.advance();*/
/* End: Removed for states. */
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("rule",m_spec.m_lexeme,m_spec.m_current_token);
}
return start;
}
/***************************************************************
Function: expr
Description: Recursive descent regular expression parser.
**************************************************************/
private void expr
(
CNfaPair pair
)
throws java.io.IOException
{
CNfaPair e2_pair;
CNfa p;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("expr",m_spec.m_lexeme,m_spec.m_current_token);
}
if (CUtility.DEBUG)
{
CUtility.assert(null != pair);
}
e2_pair = CAlloc.newCNfaPair();
cat_expr(pair);
while (m_lexGen.OR == m_spec.m_current_token)
{
m_lexGen.advance();
cat_expr(e2_pair);
p = CAlloc.newCNfa(m_spec);
p.m_next2 = e2_pair.m_start;
p.m_next = pair.m_start;
pair.m_start = p;
p = CAlloc.newCNfa(m_spec);
pair.m_end.m_next = p;
e2_pair.m_end.m_next = p;
pair.m_end = p;
}
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("expr",m_spec.m_lexeme,m_spec.m_current_token);
}
}
/***************************************************************
Function: cat_expr
Description: Recursive descent regular expression parser.
**************************************************************/
private void cat_expr
(
CNfaPair pair
)
throws java.io.IOException
{
CNfaPair e2_pair;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("cat_expr",m_spec.m_lexeme,m_spec.m_current_token);
}
if (CUtility.DEBUG)
{
CUtility.assert(null != pair);
}
e2_pair = CAlloc.newCNfaPair();
if (first_in_cat(m_spec.m_current_token))
{
factor(pair);
}
while (first_in_cat(m_spec.m_current_token))
{
factor(e2_pair);
/* Destroy */
pair.m_end.mimic(e2_pair.m_start);
discardCNfa(e2_pair.m_start);
pair.m_end = e2_pair.m_end;
}
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("cat_expr",m_spec.m_lexeme,m_spec.m_current_token);
}
}
/***************************************************************
Function: first_in_cat
Description: Recursive descent regular expression parser.
**************************************************************/
private boolean first_in_cat
(
int token
)
{
switch (token)
{
case CLexGen.CLOSE_PAREN:
case CLexGen.AT_EOL:
case CLexGen.OR:
case CLexGen.EOS:
return false;
case CLexGen.CLOSURE:
case CLexGen.PLUS_CLOSE:
case CLexGen.OPTIONAL:
CError.parse_error(CError.E_CLOSE,m_input.m_line_number);
return false;
case CLexGen.CCL_END:
CError.parse_error(CError.E_BRACKET,m_input.m_line_number);
return false;
case CLexGen.AT_BOL:
CError.parse_error(CError.E_BOL,m_input.m_line_number);
return false;
default:
break;
}
return true;
}
/***************************************************************
Function: factor
Description: Recursive descent regular expression parser.
**************************************************************/
private void factor
(
CNfaPair pair
)
throws java.io.IOException
{
CNfa start = null;
CNfa end = null;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("factor",m_spec.m_lexeme,m_spec.m_current_token);
}
term(pair);
if (m_lexGen.CLOSURE == m_spec.m_current_token
|| m_lexGen.PLUS_CLOSE == m_spec.m_current_token
|| m_lexGen.OPTIONAL == m_spec.m_current_token)
{
start = CAlloc.newCNfa(m_spec);
end = CAlloc.newCNfa(m_spec);
start.m_next = pair.m_start;
pair.m_end.m_next = end;
if (m_lexGen.CLOSURE == m_spec.m_current_token
|| m_lexGen.OPTIONAL == m_spec.m_current_token)
{
start.m_next2 = end;
}
if (m_lexGen.CLOSURE == m_spec.m_current_token
|| m_lexGen.PLUS_CLOSE == m_spec.m_current_token)
{
pair.m_end.m_next2 = pair.m_start;
}
pair.m_start = start;
pair.m_end = end;
m_lexGen.advance();
}
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("factor",m_spec.m_lexeme,m_spec.m_current_token);
}
}
/***************************************************************
Function: term
Description: Recursive descent regular expression parser.
**************************************************************/
private void term
(
CNfaPair pair
)
throws java.io.IOException
{
CNfa start;
boolean isAlphaL;
int c;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("term",m_spec.m_lexeme,m_spec.m_current_token);
}
if (m_lexGen.OPEN_PAREN == m_spec.m_current_token)
{
m_lexGen.advance();
expr(pair);
if (m_lexGen.CLOSE_PAREN == m_spec.m_current_token)
{
m_lexGen.advance();
}
else
{
CError.parse_error(CError.E_SYNTAX,m_input.m_line_number);
}
}
else
{
start = CAlloc.newCNfa(m_spec);
pair.m_start = start;
start.m_next = CAlloc.newCNfa(m_spec);
pair.m_end = start.m_next;
if (m_lexGen.L == m_spec.m_current_token &&
Character.isLetter(m_spec.m_lexeme))
{
isAlphaL = true;
}
else
{
isAlphaL = false;
}
if (false == (m_lexGen.ANY == m_spec.m_current_token
|| m_lexGen.CCL_START == m_spec.m_current_token
|| (m_spec.m_ignorecase && isAlphaL)))
{
start.m_edge = m_spec.m_lexeme;
m_lexGen.advance();
}
else
{
start.m_edge = CNfa.CCL;
start.m_set = new CSet();
/* Match case-insensitive letters using character class. */
if (m_spec.m_ignorecase && isAlphaL)
{
start.m_set.addncase(m_spec.m_lexeme);
}
/* Match dot (.) using character class. */
else if (m_lexGen.ANY == m_spec.m_current_token)
{
start.m_set.add('\n');
start.m_set.add('\r');
// CSA: exclude BOL and EOF from character classes
start.m_set.add(m_spec.BOL);
start.m_set.add(m_spec.EOF);
start.m_set.complement();
}
else
{
m_lexGen.advance();
if (m_lexGen.AT_BOL == m_spec.m_current_token)
{
m_lexGen.advance();
// CSA: exclude BOL and EOF from character classes
start.m_set.add(m_spec.BOL);
start.m_set.add(m_spec.EOF);
start.m_set.complement();
}
if (false == (m_lexGen.CCL_END == m_spec.m_current_token))
{
dodash(start.m_set);
}
/*else
{
for (c = 0; c <= ' '; ++c)
{
start.m_set.add((byte) c);
}
}*/
}
m_lexGen.advance();
}
}
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("term",m_spec.m_lexeme,m_spec.m_current_token);
}
}
/***************************************************************
Function: dodash
Description: Recursive descent regular expression parser.
**************************************************************/
private void dodash
(
CSet set
)
throws java.io.IOException
{
int first = -1;
if (CUtility.DESCENT_DEBUG)
{
CUtility.enter("dodash",m_spec.m_lexeme,m_spec.m_current_token);
}
while (m_lexGen.EOS != m_spec.m_current_token
&& m_lexGen.CCL_END != m_spec.m_current_token)
{
// DASH loses its special meaning if it is first in class.
if (m_lexGen.DASH == m_spec.m_current_token && -1 != first)
{
m_lexGen.advance();
// DASH loses its special meaning if it is last in class.
if (m_spec.m_current_token == m_lexGen.CCL_END)
{
// 'first' already in set.
set.add('-');
break;
}
for ( ; first <= m_spec.m_lexeme; ++first)
{
if (m_spec.m_ignorecase)
set.addncase((char)first);
else
set.add(first);
}
}
else
{
first = m_spec.m_lexeme;
if (m_spec.m_ignorecase)
set.addncase(m_spec.m_lexeme);
else
set.add(m_spec.m_lexeme);
}
m_lexGen.advance();
}
if (CUtility.DESCENT_DEBUG)
{
CUtility.leave("dodash",m_spec.m_lexeme,m_spec.m_current_token);
}
}
}
/**
* Extract character classes from NFA and simplify.
* @author C. Scott Ananian 25-Jul-1999
*/
class CSimplifyNfa
{
private int[] ccls; // character class mapping.
private int original_charset_size; // original charset size
private int mapped_charset_size; // reduced charset size
void simplify(CSpec m_spec) {
computeClasses(m_spec); // initialize fields.
// now rewrite the NFA using our character class mapping.
for (Enumeration e=m_spec.m_nfa_states.elements(); e.hasMoreElements(); ) {
CNfa nfa = (CNfa) e.nextElement();
if (nfa.m_edge==CNfa.EMPTY || nfa.m_edge==CNfa.EPSILON)
continue; // no change.
if (nfa.m_edge==CNfa.CCL) {
CSet ncset = new CSet();
ncset.map(nfa.m_set, ccls); // map it.
nfa.m_set = ncset;
} else { // single character
nfa.m_edge = ccls[nfa.m_edge]; // map it.
}
}
// now update m_spec with the mapping.
m_spec.m_ccls_map = ccls;
m_spec.m_dtrans_ncols = mapped_charset_size;
}
/** Compute minimum set of character classes needed to disambiguate
* edges. We optimistically assume that every character belongs to
* a single character class, and then incrementally split classes
* as we see edges that require discrimination between characters in
* the class. [CSA, 25-Jul-1999] */
private void computeClasses(CSpec m_spec) {
this.original_charset_size = m_spec.m_dtrans_ncols;
this.ccls = new int[original_charset_size]; // initially all zero.
int nextcls = 1;
SparseBitSet clsA = new SparseBitSet(), clsB = new SparseBitSet();
Hashtable h = new Hashtable();
System.out.print("Working on character classes.");
for (Enumeration e=m_spec.m_nfa_states.elements(); e.hasMoreElements(); ) {
CNfa nfa = (CNfa) e.nextElement();
if (nfa.m_edge==CNfa.EMPTY || nfa.m_edge==CNfa.EPSILON)
continue; // no discriminatory information.
clsA.clearAll(); clsB.clearAll();
for (int i=0; i<ccls.length; i++)
if (nfa.m_edge==i || // edge labeled with a character
nfa.m_edge==CNfa.CCL && nfa.m_set.contains(i)) // set of characters
clsA.set(ccls[i]);
else
clsB.set(ccls[i]);
// now figure out which character classes we need to split.
clsA.and(clsB); // split the classes which show up on both sides of edge
System.out.print(clsA.size()==0?".":":");
if (clsA.size()==0) continue; // nothing to do.
// and split them.
h.clear(); // h will map old to new class name
for (int i=0; i<ccls.length; i++)
if (clsA.get(ccls[i])) // a split class
if (nfa.m_edge==i ||
nfa.m_edge==CNfa.CCL && nfa.m_set.contains(i)) { // on A side
Integer split = new Integer(ccls[i]);
if (!h.containsKey(split))
h.put(split, new Integer(nextcls++)); // make new class
ccls[i] = ((Integer)h.get(split)).intValue();
}
}
System.out.println();
System.out.println("NFA has "+nextcls+" distinct character classes.");
this.mapped_charset_size = nextcls;
}
}
/***************************************************************
Class: CMinimize
**************************************************************/
class CMinimize
{
/***************************************************************
Member Variables
**************************************************************/
CSpec m_spec;
Vector m_group;
int m_ingroup[];
/***************************************************************
Function: CMinimize
Description: Constructor.
**************************************************************/
CMinimize
(
)
{
reset();
}
/***************************************************************
Function: reset
Description: Resets member variables.
**************************************************************/
private void reset
(
)
{
m_spec = null;
m_group = null;
m_ingroup = null;
}
/***************************************************************
Function: set
Description: Sets member variables.
**************************************************************/
private void set
(
CSpec spec
)
{
if (CUtility.DEBUG)
{
CUtility.assert(null != spec);
}
m_spec = spec;
m_group = null;
m_ingroup = null;
}
/***************************************************************
Function: min_dfa
Description: High-level access function to module.
**************************************************************/
void min_dfa
(
CSpec spec
)
{
set(spec);
/* Remove redundant states. */
minimize();
/* Column and row compression.
Save accept states in auxilary vector. */
reduce();
reset();
}
/***************************************************************
Function: col_copy
Description: Copies source column into destination column.
**************************************************************/
private void col_copy
(
int dest,
int src
)
{
int n;
int i;
CDTrans dtrans;
n = m_spec.m_dtrans_vector.size();
for (i = 0; i < n; ++i)
{
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(i);
dtrans.m_dtrans[dest] = dtrans.m_dtrans[src];
}
}
/***************************************************************
Function: trunc_col
Description: Truncates each column to the 'correct' length.
**************************************************************/
private void trunc_col
(
)
{
int n;
int i;
CDTrans dtrans;
n = m_spec.m_dtrans_vector.size();
for (i = 0; i < n; ++i)
{
int[] ndtrans = new int[m_spec.m_dtrans_ncols];
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(i);
System.arraycopy(dtrans.m_dtrans, 0, ndtrans, 0, ndtrans.length);
dtrans.m_dtrans = ndtrans;
}
}
/***************************************************************
Function: row_copy
Description: Copies source row into destination row.
**************************************************************/
private void row_copy
(
int dest,
int src
)
{
CDTrans dtrans;
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(src);
m_spec.m_dtrans_vector.setElementAt(dtrans,dest);
}
/***************************************************************
Function: col_equiv
Description:
**************************************************************/
private boolean col_equiv
(
int col1,
int col2
)
{
int n;
int i;
CDTrans dtrans;
n = m_spec.m_dtrans_vector.size();
for (i = 0; i < n; ++i)
{
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(i);
if (dtrans.m_dtrans[col1] != dtrans.m_dtrans[col2])
{
return false;
}
}
return true;
}
/***************************************************************
Function: row_equiv
Description:
**************************************************************/
private boolean row_equiv
(
int row1,
int row2
)
{
int i;
CDTrans dtrans1;
CDTrans dtrans2;
dtrans1 = (CDTrans) m_spec.m_dtrans_vector.elementAt(row1);
dtrans2 = (CDTrans) m_spec.m_dtrans_vector.elementAt(row2);
for (i = 0; i < m_spec.m_dtrans_ncols; ++i)
{
if (dtrans1.m_dtrans[i] != dtrans2.m_dtrans[i])
{
return false;
}
}
return true;
}
/***************************************************************
Function: reduce
Description:
**************************************************************/
private void reduce
(
)
{
int i;
int j;
int k;
int nrows;
int reduced_ncols;
int reduced_nrows;
SparseBitSet set;
CDTrans dtrans;
int size;
set = new SparseBitSet();
/* Save accept nodes and anchor entries. */
size = m_spec.m_dtrans_vector.size();
m_spec.m_anchor_array = new int[size];
m_spec.m_accept_vector = new Vector();
for (i = 0; i < size; ++i)
{
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(i);
m_spec.m_accept_vector.addElement(dtrans.m_accept);
m_spec.m_anchor_array[i] = dtrans.m_anchor;
dtrans.m_accept = null;
}
/* Allocate column map. */
m_spec.m_col_map = new int[m_spec.m_dtrans_ncols];
for (i = 0; i < m_spec.m_dtrans_ncols; ++i)
{
m_spec.m_col_map[i] = -1;
}
/* Process columns for reduction. */
for (reduced_ncols = 0; ; ++reduced_ncols)
{
if (CUtility.DEBUG)
{
for (i = 0; i < reduced_ncols; ++i)
{
CUtility.assert(-1 != m_spec.m_col_map[i]);
}
}
for (i = reduced_ncols; i < m_spec.m_dtrans_ncols; ++i)
{
if (-1 == m_spec.m_col_map[i])
{
break;
}
}
if (i >= m_spec.m_dtrans_ncols)
{
break;
}
if (CUtility.DEBUG)
{
CUtility.assert(false == set.get(i));
CUtility.assert(-1 == m_spec.m_col_map[i]);
}
set.set(i);
m_spec.m_col_map[i] = reduced_ncols;
/* UNDONE: Optimize by doing all comparisons in one batch. */
for (j = i + 1; j < m_spec.m_dtrans_ncols; ++j)
{
if (-1 == m_spec.m_col_map[j] && true == col_equiv(i,j))
{
m_spec.m_col_map[j] = reduced_ncols;
}
}
}
/* Reduce columns. */
k = 0;
for (i = 0; i < m_spec.m_dtrans_ncols; ++i)
{
if (set.get(i))
{
++k;
set.clear(i);
j = m_spec.m_col_map[i];
if (CUtility.DEBUG)
{
CUtility.assert(j <= i);
}
if (j == i)
{
continue;
}
col_copy(j,i);
}
}
m_spec.m_dtrans_ncols = reduced_ncols;
/* truncate m_dtrans at proper length (freeing extra) */
trunc_col();
if (CUtility.DEBUG)
{
CUtility.assert(k == reduced_ncols);
}
/* Allocate row map. */
nrows = m_spec.m_dtrans_vector.size();
m_spec.m_row_map = new int[nrows];
for (i = 0; i < nrows; ++i)
{
m_spec.m_row_map[i] = -1;
}
/* Process rows to reduce. */
for (reduced_nrows = 0; ; ++reduced_nrows)
{
if (CUtility.DEBUG)
{
for (i = 0; i < reduced_nrows; ++i)
{
CUtility.assert(-1 != m_spec.m_row_map[i]);
}
}
for (i = reduced_nrows; i < nrows; ++i)
{
if (-1 == m_spec.m_row_map[i])
{
break;
}
}
if (i >= nrows)
{
break;
}
if (CUtility.DEBUG)
{
CUtility.assert(false == set.get(i));
CUtility.assert(-1 == m_spec.m_row_map[i]);
}
set.set(i);
m_spec.m_row_map[i] = reduced_nrows;
/* UNDONE: Optimize by doing all comparisons in one batch. */
for (j = i + 1; j < nrows; ++j)
{
if (-1 == m_spec.m_row_map[j] && true == row_equiv(i,j))
{
m_spec.m_row_map[j] = reduced_nrows;
}
}
}
/* Reduce rows. */
k = 0;
for (i = 0; i < nrows; ++i)
{
if (set.get(i))
{
++k;
set.clear(i);
j = m_spec.m_row_map[i];
if (CUtility.DEBUG)
{
CUtility.assert(j <= i);
}
if (j == i)
{
continue;
}
row_copy(j,i);
}
}
m_spec.m_dtrans_vector.setSize(reduced_nrows);
if (CUtility.DEBUG)
{
/*System.out.println("k = " + k + "\nreduced_nrows = " + reduced_nrows + "");*/
CUtility.assert(k == reduced_nrows);
}
}
/***************************************************************
Function: fix_dtrans
Description: Updates CDTrans table after minimization
using groups, removing redundant transition table states.
**************************************************************/
private void fix_dtrans
(
)
{
Vector new_vector;
int i;
int size;
Vector dtrans_group;
CDTrans first;
int c;
new_vector = new Vector();
size = m_spec.m_state_dtrans.length;
for (i = 0; i < size; ++i)
{
if (CDTrans.F != m_spec.m_state_dtrans[i])
{
m_spec.m_state_dtrans[i] = m_ingroup[m_spec.m_state_dtrans[i]];
}
}
size = m_group.size();
for (i = 0; i < size; ++i)
{
dtrans_group = (Vector) m_group.elementAt(i);
first = (CDTrans) dtrans_group.elementAt(0);
new_vector.addElement(first);
for (c = 0; c < m_spec.m_dtrans_ncols; ++c)
{
if (CDTrans.F != first.m_dtrans[c])
{
first.m_dtrans[c] = m_ingroup[first.m_dtrans[c]];
}
}
}
m_group = null;
m_spec.m_dtrans_vector = new_vector;
}
/***************************************************************
Function: minimize
Description: Removes redundant transition table states.
**************************************************************/
private void minimize
(
)
{
Vector dtrans_group;
Vector new_group;
int i;
int j;
int old_group_count;
int group_count;
CDTrans next;
CDTrans first;
int goto_first;
int goto_next;
int c;
int group_size;
boolean added;
init_groups();
group_count = m_group.size();
old_group_count = group_count - 1;
while (old_group_count != group_count)
{
old_group_count = group_count;
if (CUtility.DEBUG)
{
CUtility.assert(m_group.size() == group_count);
}
for (i = 0; i < group_count; ++i)
{
dtrans_group = (Vector) m_group.elementAt(i);
group_size = dtrans_group.size();
if (group_size <= 1)
{
continue;
}
new_group = new Vector();
added = false;
first = (CDTrans) dtrans_group.elementAt(0);
for (j = 1; j < group_size; ++j)
{
next = (CDTrans) dtrans_group.elementAt(j);
for (c = 0; c < m_spec.m_dtrans_ncols; ++c)
{
goto_first = first.m_dtrans[c];
goto_next = next.m_dtrans[c];
if (goto_first != goto_next
&& (goto_first == CDTrans.F
|| goto_next == CDTrans.F
|| m_ingroup[goto_next] != m_ingroup[goto_first]))
{
if (CUtility.DEBUG)
{
CUtility.assert(dtrans_group.elementAt(j) == next);
}
dtrans_group.removeElementAt(j);
--j;
--group_size;
new_group.addElement(next);
if (false == added)
{
added = true;
++group_count;
m_group.addElement(new_group);
}
m_ingroup[next.m_label] = m_group.size() - 1;
if (CUtility.DEBUG)
{
CUtility.assert(m_group.contains(new_group)
== true);
CUtility.assert(m_group.contains(dtrans_group)
== true);
CUtility.assert(dtrans_group.contains(first)
== true);
CUtility.assert(dtrans_group.contains(next)
== false);
CUtility.assert(new_group.contains(first)
== false);
CUtility.assert(new_group.contains(next)
== true);
CUtility.assert(dtrans_group.size() == group_size);
CUtility.assert(i == m_ingroup[first.m_label]);
CUtility.assert((m_group.size() - 1)
== m_ingroup[next.m_label]);
}
break;
}
}
}
}
}
System.out.println(m_group.size() + " states after removal of redundant states.");
if (m_spec.m_verbose
&& true == CUtility.OLD_DUMP_DEBUG)
{
System.out.println();
System.out.println("States grouped as follows after minimization");
pgroups();
}
fix_dtrans();
}
/***************************************************************
Function: init_groups
Description:
**************************************************************/
private void init_groups
(
)
{
int i;
int j;
int group_count;
int size;
CAccept accept;
CDTrans dtrans;
Vector dtrans_group;
CDTrans first;
boolean group_found;
m_group = new Vector();
group_count = 0;
size = m_spec.m_dtrans_vector.size();
m_ingroup = new int[size];
for (i = 0; i < size; ++i)
{
group_found = false;
dtrans = (CDTrans) m_spec.m_dtrans_vector.elementAt(i);
if (CUtility.DEBUG)
{
CUtility.assert(i == dtrans.m_label);
CUtility.assert(false == group_found);
CUtility.assert(group_count == m_group.size());
}
for (j = 0; j < group_count; ++j)
{
dtrans_group = (Vector) m_group.elementAt(j);
if (CUtility.DEBUG)
{
CUtility.assert(false == group_found);
CUtility.assert(0 < dtrans_group.size());
}
first = (CDTrans) dtrans_group.elementAt(0);
if (CUtility.SLOW_DEBUG)
{
CDTrans check;
int k;
int s;
s = dtrans_group.size();
CUtility.assert(0 < s);
for (k = 1; k < s; ++k)
{
check = (CDTrans) dtrans_group.elementAt(k);
CUtility.assert(check.m_accept == first.m_accept);
}
}
if (first.m_accept == dtrans.m_accept)
{
dtrans_group.addElement(dtrans);
m_ingroup[i] = j;
group_found = true;
if (CUtility.DEBUG)
{
CUtility.assert(j == m_ingroup[dtrans.m_label]);
}
break;
}
}
if (false == group_found)
{
dtrans_group = new Vector();
dtrans_group.addElement(dtrans);
m_ingroup[i] = m_group.size();
m_group.addElement(dtrans_group);
++group_count;
}
}
if (m_spec.m_verbose
&& true == CUtility.OLD_DUMP_DEBUG)
{
System.out.println("Initial grouping:");
pgroups();
System.out.println();
}
}
/***************************************************************
Function: pset
**************************************************************/
private void pset
(
Vector dtrans_group
)
{
int i;
int size;
CDTrans dtrans;
size = dtrans_group.size();
for (i = 0; i < size; ++i)
{
dtrans = (CDTrans) dtrans_group.elementAt(i);
System.out.print(dtrans.m_label + " ");
}
}
/***************************************************************
Function: pgroups
**************************************************************/
private void pgroups
(
)
{
int i;
int dtrans_size;
int group_size;
group_size = m_group.size();
for (i = 0; i < group_size; ++i)
{
System.out.print("\tGroup " + i + " {");
pset((Vector) m_group.elementAt(i));
System.out.println("}");
System.out.println();
}
System.out.println();
dtrans_size = m_spec.m_dtrans_vector.size();
for (i = 0; i < dtrans_size; ++i)
{
System.out.println("\tstate " + i
+ " is in group "
+ m_ingroup[i]);
}
}
}
/***************************************************************
Class: CNfa2Dfa
**************************************************************/
class CNfa2Dfa
{
/***************************************************************
Member Variables
**************************************************************/
private CSpec m_spec;
private int m_unmarked_dfa;
private CLexGen m_lexGen;
/***************************************************************
Constants
**************************************************************/
private static final int NOT_IN_DSTATES = -1;
/***************************************************************
Function: CNfa2Dfa
**************************************************************/
CNfa2Dfa
(
)
{
reset();
}
/***************************************************************
Function: set
Description:
**************************************************************/
private void set
(
CLexGen lexGen,
CSpec spec
)
{
m_lexGen = lexGen;
m_spec = spec;
m_unmarked_dfa = 0;
}
/***************************************************************
Function: reset
Description:
**************************************************************/
private void reset
(
)
{
m_lexGen = null;
m_spec = null;
m_unmarked_dfa = 0;
}
/***************************************************************
Function: make_dfa
Description: High-level access function to module.
**************************************************************/
void make_dfa
(
CLexGen lexGen,
CSpec spec
)
{
int i;
reset();
set(lexGen,spec);
make_dtrans();
free_nfa_states();
if (m_spec.m_verbose && true == CUtility.OLD_DUMP_DEBUG)
{
System.out.println(m_spec.m_dfa_states.size()
+ " DFA states in original machine.");
}
free_dfa_states();
}
/***************************************************************
Function: make_dtrans
Description: Creates uncompressed CDTrans transition table.
**************************************************************/
private void make_dtrans
(
)
/* throws java.lang.CloneNotSupportedException*/
{
CDfa next;
CDfa dfa;
CBunch bunch;
int i;
int nextstate;
int size;
CDTrans dtrans;
CNfa nfa;
int istate;
int nstates;
System.out.print("Working on DFA states.");
/* Reference passing type and initializations. */
bunch = new CBunch();
m_unmarked_dfa = 0;
/* Allocate mapping array. */
nstates = m_spec.m_state_rules.length;
m_spec.m_state_dtrans = new int[nstates];
for (istate = 0; nstates > istate; ++istate)
{
/* CSA bugfix: if we skip all zero size rules, then
an specification with no rules produces an illegal
lexer (0 states) instead of a lexer that rejects
everything (1 nonaccepting state). [27-Jul-1999]
if (0 == m_spec.m_state_rules[istate].size())
{
m_spec.m_state_dtrans[istate] = CDTrans.F;
continue;
}
*/
/* Create start state and initialize fields. */
bunch.m_nfa_set = (Vector) m_spec.m_state_rules[istate].clone();
sortStates(bunch.m_nfa_set);
bunch.m_nfa_bit = new SparseBitSet();
/* Initialize bit set. */
size = bunch.m_nfa_set.size();
for (i = 0; size > i; ++i)
{
nfa = (CNfa) bunch.m_nfa_set.elementAt(i);
bunch.m_nfa_bit.set(nfa.m_label);
}
bunch.m_accept = null;
bunch.m_anchor = CSpec.NONE;
bunch.m_accept_index = CUtility.INT_MAX;
e_closure(bunch);
add_to_dstates(bunch);
m_spec.m_state_dtrans[istate] = m_spec.m_dtrans_vector.size();
/* Main loop of CDTrans creation. */
while (null != (dfa = get_unmarked()))
{
System.out.print(".");
System.out.flush();
if (CUtility.DEBUG)
{
CUtility.assert(false == dfa.m_mark);
}
/* Get first unmarked node, then mark it. */
dfa.m_mark = true;
/* Allocate new CDTrans, then initialize fields. */
dtrans = new CDTrans(m_spec.m_dtrans_vector.size(),m_spec);
dtrans.m_accept = dfa.m_accept;
dtrans.m_anchor = dfa.m_anchor;
/* Set CDTrans array for each character transition. */
for (i = 0; i < m_spec.m_dtrans_ncols; ++i)
{
if (CUtility.DEBUG)
{
CUtility.assert(0 <= i);
CUtility.assert(m_spec.m_dtrans_ncols > i);
}
/* Create new dfa set by attempting character transition. */
move(dfa.m_nfa_set,dfa.m_nfa_bit,i,bunch);
if (null != bunch.m_nfa_set)
{
e_closure(bunch);
}
if (CUtility.DEBUG)
{
CUtility.assert((null == bunch.m_nfa_set
&& null == bunch.m_nfa_bit)
|| (null != bunch.m_nfa_set
&& null != bunch.m_nfa_bit));
}
/* Create new state or set state to empty. */
if (null == bunch.m_nfa_set)
{
nextstate = CDTrans.F;
}
else
{
nextstate = in_dstates(bunch);
if (NOT_IN_DSTATES == nextstate)
{
nextstate = add_to_dstates(bunch);
}
}
if (CUtility.DEBUG)
{
CUtility.assert(nextstate < m_spec.m_dfa_states.size());
}
dtrans.m_dtrans[i] = nextstate;
}
if (CUtility.DEBUG)
{
CUtility.assert(m_spec.m_dtrans_vector.size() == dfa.m_label);
}
m_spec.m_dtrans_vector.addElement(dtrans);
}
}
System.out.println();
}
/***************************************************************
Function: free_dfa_states
**************************************************************/
private void free_dfa_states
(
)
{
m_spec.m_dfa_states = null;
m_spec.m_dfa_sets = null;
}
/***************************************************************
Function: free_nfa_states
**************************************************************/
private void free_nfa_states
(
)
{
/* UNDONE: Remove references to nfas from within dfas. */
/* UNDONE: Don't free CAccepts. */
m_spec.m_nfa_states = null;
m_spec.m_nfa_start = null;
m_spec.m_state_rules = null;
}
/***************************************************************
Function: e_closure
Description: Alters and returns input set.
**************************************************************/
private void e_closure
(
CBunch bunch
)
{
Stack nfa_stack;
int size;
int i;
CNfa state;
/* Debug checks. */
if (CUtility.DEBUG)
{
CUtility.assert(null != bunch);
CUtility.assert(null != bunch.m_nfa_set);
CUtility.assert(null != bunch.m_nfa_bit);
}
bunch.m_accept = null;
bunch.m_anchor = CSpec.NONE;
bunch.m_accept_index = CUtility.INT_MAX;
/* Create initial stack. */
nfa_stack = new Stack();
size = bunch.m_nfa_set.size();
for (i = 0; i < size; ++i)
{
state = (CNfa) bunch.m_nfa_set.elementAt(i);
if (CUtility.DEBUG)
{
CUtility.assert(bunch.m_nfa_bit.get(state.m_label));
}
nfa_stack.push(state);
}
/* Main loop. */
while (false == nfa_stack.empty())
{
state = (CNfa) nfa_stack.pop();
if (CUtility.OLD_DUMP_DEBUG)
{
if (null != state.m_accept)
{
System.out.println("Looking at accepting state " + state.m_label
+ " with <"
+ (new String(state.m_accept.m_action,0,
state.m_accept.m_action_read))
+ ">");
}
}
if (null != state.m_accept
&& state.m_label < bunch.m_accept_index)
{
bunch.m_accept_index = state.m_label;
bunch.m_accept = state.m_accept;
bunch.m_anchor = state.m_anchor;
if (CUtility.OLD_DUMP_DEBUG)
{
System.out.println("Found accepting state " + state.m_label
+ " with <"
+ (new String(state.m_accept.m_action,0,
state.m_accept.m_action_read))
+ ">");
}
if (CUtility.DEBUG)
{
CUtility.assert(null != bunch.m_accept);
CUtility.assert(CSpec.NONE == bunch.m_anchor
|| 0 != (bunch.m_anchor & CSpec.END)
|| 0 != (bunch.m_anchor & CSpec.START));
}
}
if (CNfa.EPSILON == state.m_edge)
{
if (null != state.m_next)
{
if (false == bunch.m_nfa_set.contains(state.m_next))
{
if (CUtility.DEBUG)
{
CUtility.assert(false == bunch.m_nfa_bit.get(state.m_next.m_label));
}
bunch.m_nfa_bit.set(state.m_next.m_label);
bunch.m_nfa_set.addElement(state.m_next);
nfa_stack.push(state.m_next);
}
}
if (null != state.m_next2)
{
if (false == bunch.m_nfa_set.contains(state.m_next2))
{
if (CUtility.DEBUG)
{
CUtility.assert(false == bunch.m_nfa_bit.get(state.m_next2.m_label));
}
bunch.m_nfa_bit.set(state.m_next2.m_label);
bunch.m_nfa_set.addElement(state.m_next2);
nfa_stack.push(state.m_next2);
}
}
}
}
if (null != bunch.m_nfa_set)
{
sortStates(bunch.m_nfa_set);
}
return;
}
/***************************************************************
Function: move
Description: Returns null if resulting NFA set is empty.
**************************************************************/
void move
(
Vector nfa_set,
SparseBitSet nfa_bit,
int b,
CBunch bunch
)
{
int size;
int index;
CNfa state;
bunch.m_nfa_set = null;
bunch.m_nfa_bit = null;
size = nfa_set.size();
for (index = 0; index < size; ++index)
{
state = (CNfa) nfa_set.elementAt(index);
if (b == state.m_edge
|| (CNfa.CCL == state.m_edge
&& true == state.m_set.contains(b)))
{
if (null == bunch.m_nfa_set)
{
if (CUtility.DEBUG)
{
CUtility.assert(null == bunch.m_nfa_bit);
}
bunch.m_nfa_set = new Vector();
/*bunch.m_nfa_bit
= new SparseBitSet(m_spec.m_nfa_states.size());*/
bunch.m_nfa_bit = new SparseBitSet();
}
bunch.m_nfa_set.addElement(state.m_next);
/*System.out.println("Size of bitset: " + bunch.m_nfa_bit.size());
System.out.println("Reference index: " + state.m_next.m_label);
System.out.flush();*/
bunch.m_nfa_bit.set(state.m_next.m_label);
}
}
if (null != bunch.m_nfa_set)
{
if (CUtility.DEBUG)
{
CUtility.assert(null != bunch.m_nfa_bit);
}
sortStates(bunch.m_nfa_set);
}
return;
}
/***************************************************************
Function: sortStates
**************************************************************/
private void sortStates
(
Vector nfa_set
)
{
CNfa elem;
int begin;
int size;
int index;
int value;
int smallest_index;
int smallest_value;
CNfa begin_elem;
size = nfa_set.size();
for (begin = 0; begin < size; ++begin)
{
elem = (CNfa) nfa_set.elementAt(begin);
smallest_value = elem.m_label;
smallest_index = begin;
for (index = begin + 1; index < size; ++index)
{
elem = (CNfa) nfa_set.elementAt(index);
value = elem.m_label;
if (value < smallest_value)
{
smallest_index = index;
smallest_value = value;
}
}
begin_elem = (CNfa) nfa_set.elementAt(begin);
elem = (CNfa) nfa_set.elementAt(smallest_index);
nfa_set.setElementAt(elem,begin);
nfa_set.setElementAt(begin_elem,smallest_index);
}
if (CUtility.OLD_DEBUG)
{
System.out.print("NFA vector indices: ");
for (index = 0; index < size; ++index)
{
elem = (CNfa) nfa_set.elementAt(index);
System.out.print(elem.m_label + " ");
}
System.out.println();
}
return;
}
/***************************************************************
Function: get_unmarked
Description: Returns next unmarked DFA state.
**************************************************************/
private CDfa get_unmarked
(
)
{
int size;
CDfa dfa;
size = m_spec.m_dfa_states.size();
while (m_unmarked_dfa < size)
{
dfa = (CDfa) m_spec.m_dfa_states.elementAt(m_unmarked_dfa);
if (false == dfa.m_mark)
{
if (CUtility.OLD_DUMP_DEBUG)
{
System.out.print("*");
System.out.flush();
}
if (m_spec.m_verbose && true == CUtility.OLD_DUMP_DEBUG)
{
System.out.println("---------------");
System.out.print("working on DFA state "
+ m_unmarked_dfa
+ " = NFA states: ");
m_lexGen.print_set(dfa.m_nfa_set);
System.out.println();
}
return dfa;
}
++m_unmarked_dfa;
}
return null;
}
/***************************************************************
function: add_to_dstates
Description: Takes as input a CBunch with details of
a dfa state that needs to be created.
1) Allocates a new dfa state and saves it in
the appropriate CSpec vector.
2) Initializes the fields of the dfa state
with the information in the CBunch.
3) Returns index of new dfa.
**************************************************************/
private int add_to_dstates
(
CBunch bunch
)
{
CDfa dfa;
if (CUtility.DEBUG)
{
CUtility.assert(null != bunch.m_nfa_set);
CUtility.assert(null != bunch.m_nfa_bit);
CUtility.assert(null != bunch.m_accept
|| CSpec.NONE == bunch.m_anchor);
}
/* Allocate, passing CSpec so dfa label can be set. */
dfa = CAlloc.newCDfa(m_spec);
/* Initialize fields, including the mark field. */
dfa.m_nfa_set = (Vector) bunch.m_nfa_set.clone();
dfa.m_nfa_bit = (SparseBitSet) bunch.m_nfa_bit.clone();
dfa.m_accept = bunch.m_accept;
dfa.m_anchor = bunch.m_anchor;
dfa.m_mark = false;
/* Register dfa state using BitSet in CSpec Hashtable. */
m_spec.m_dfa_sets.put(dfa.m_nfa_bit,dfa);
/*registerCDfa(dfa);*/
if (CUtility.OLD_DUMP_DEBUG)
{
System.out.print("Registering set : ");
m_lexGen.print_set(dfa.m_nfa_set);
System.out.println();
}
return dfa.m_label;
}
/***************************************************************
Function: in_dstates
**************************************************************/
private int in_dstates
(
CBunch bunch
)
{
CDfa dfa;
if (CUtility.OLD_DEBUG)
{
System.out.print("Looking for set : ");
m_lexGen.print_set(bunch.m_nfa_set);
}
dfa = (CDfa) m_spec.m_dfa_sets.get(bunch.m_nfa_bit);
if (null != dfa)
{
if (CUtility.OLD_DUMP_DEBUG)
{
System.out.println(" FOUND!");
}
return dfa.m_label;
}
if (CUtility.OLD_DUMP_DEBUG)
{
System.out.println(" NOT FOUND!");
}
return NOT_IN_DSTATES;
}
}
/***************************************************************
Class: CAlloc
**************************************************************/
class CAlloc
{
/***************************************************************
Function: newCDfa
**************************************************************/
static CDfa newCDfa
(
CSpec spec
)
{
CDfa dfa;
dfa = new CDfa(spec.m_dfa_states.size());
spec.m_dfa_states.addElement(dfa);
return dfa;
}
/***************************************************************
Function: newCNfaPair
Description:
**************************************************************/
static CNfaPair newCNfaPair
(
)
{
CNfaPair pair = new CNfaPair();
return pair;
}
/***************************************************************
Function: newNLPair
Description: return a new CNfaPair that matches a new
line: (\r\n?|[\n\uu2028\uu2029])
Added by CSA 8-Aug-1999, updated 10-Aug-1999
**************************************************************/
static CNfaPair newNLPair(CSpec spec) {
CNfaPair pair = newCNfaPair();
pair.m_end=newCNfa(spec); // newline accepting state
pair.m_start=newCNfa(spec); // new state with two epsilon edges
pair.m_start.m_next = newCNfa(spec);
pair.m_start.m_next.m_edge = CNfa.CCL;
pair.m_start.m_next.m_set = new CSet();
pair.m_start.m_next.m_set.add('\n');
if (spec.m_dtrans_ncols-CSpec.NUM_PSEUDO > 2029) {
pair.m_start.m_next.m_set.add(2028); /*U+2028 is LS, the line separator*/
pair.m_start.m_next.m_set.add(2029); /*U+2029 is PS, the paragraph sep.*/
}
pair.m_start.m_next.m_next = pair.m_end; // accept '\n', U+2028, or U+2029
pair.m_start.m_next2 = newCNfa(spec);
pair.m_start.m_next2.m_edge = '\r';
pair.m_start.m_next2.m_next = newCNfa(spec);
pair.m_start.m_next2.m_next.m_next = pair.m_end; // accept '\r';
pair.m_start.m_next2.m_next.m_next2 = newCNfa(spec);
pair.m_start.m_next2.m_next.m_next2.m_edge = '\n';
pair.m_start.m_next2.m_next.m_next2.m_next = pair.m_end; // accept '\r\n';
return pair;
}
/***************************************************************
Function: newCNfa
Description:
**************************************************************/
static CNfa newCNfa
(
CSpec spec
)
{
CNfa p;
/* UNDONE: Buffer this? */
p = new CNfa();
/*p.m_label = spec.m_nfa_states.size();*/
spec.m_nfa_states.addElement(p);
p.m_edge = CNfa.EPSILON;
return p;
}
}
/***************************************************************
Class: Main
Description: Top-level lexical analyzer generator function.
**************************************************************/
public class Main
{
/***************************************************************
Function: main
**************************************************************/
public static void main
(
String arg[]
)
throws java.io.IOException
{
CLexGen lg;
if (arg.length < 1)
{
System.out.println("Usage: JLex.Main <filename>");
return;
}
/* Note: For debuging, it may be helpful to remove the try/catch
block and permit the Exception to propagate to the top level.
This gives more information. */
try
{
lg = new CLexGen(arg[0]);
lg.generate();
}
catch (Error e)
{
System.out.println(e.getMessage());
}
}
}
/***************************************************************
Class: CDTrans
**************************************************************/
class CDTrans
{
/*************************************************************
Member Variables
***********************************************************/
int m_dtrans[];
CAccept m_accept;
int m_anchor;
int m_label;
/*************************************************************
Constants
***********************************************************/
static final int F = -1;
/*************************************************************
Function: CTrans
***********************************************************/
CDTrans
(
int label,
CSpec spec
)
{
m_dtrans = new int[spec.m_dtrans_ncols];
m_accept = null;
m_anchor = CSpec.NONE;
m_label = label;
}
}
/***************************************************************
Class: CDfa
**************************************************************/
class CDfa
{
/***************************************************************
Member Variables
***********************************************************/
int m_group;
boolean m_mark;
CAccept m_accept;
int m_anchor;
Vector m_nfa_set;
SparseBitSet m_nfa_bit;
int m_label;
/***************************************************************
Function: CDfa
**************************************************************/
CDfa
(
int label
)
{
m_group = 0;
m_mark = false;
m_accept = null;
m_anchor = CSpec.NONE;
m_nfa_set = null;
m_nfa_bit = null;
m_label = label;
}
}
/***************************************************************
Class: CAccept
**************************************************************/
class CAccept
{
/***************************************************************
Member Variables
**************************************************************/
char m_action[];
int m_action_read;
int m_line_number;
/***************************************************************
Function: CAccept
**************************************************************/
CAccept
(
char action[],
int action_read,
int line_number
)
{
int elem;
m_action_read = action_read;
m_action = new char[m_action_read];
for (elem = 0; elem < m_action_read; ++elem)
{
m_action[elem] = action[elem];
}
m_line_number = line_number;
}
/***************************************************************
Function: CAccept
**************************************************************/
CAccept
(
CAccept accept
)
{
int elem;
m_action_read = accept.m_action_read;
m_action = new char[m_action_read];
for (elem = 0; elem < m_action_read; ++elem)
{
m_action[elem] = accept.m_action[elem];
}
m_line_number = accept.m_line_number;
}
/***************************************************************
Function: mimic
**************************************************************/
void mimic
(
CAccept accept
)
{
int elem;
m_action_read = accept.m_action_read;
m_action = new char[m_action_read];
for (elem = 0; elem < m_action_read; ++elem)
{
m_action[elem] = accept.m_action[elem];
}
}
}
/***************************************************************
Class: CAcceptAnchor
**************************************************************/
class CAcceptAnchor
{
/***************************************************************
Member Variables
**************************************************************/
CAccept m_accept;
int m_anchor;
/***************************************************************
Function: CAcceptAnchor
**************************************************************/
CAcceptAnchor
(
)
{
m_accept = null;
m_anchor = CSpec.NONE;
}
}
/***************************************************************
Class: CNfaPair
**************************************************************/
class CNfaPair
{
/***************************************************************
Member Variables
**************************************************************/
CNfa m_start;
CNfa m_end;
/***************************************************************
Function: CNfaPair
**************************************************************/
CNfaPair
(
)
{
m_start = null;
m_end = null;
}
}
/***************************************************************
Class: CInput
Description:
**************************************************************/
class CInput
{
/***************************************************************
Member Variables
**************************************************************/
private java.io.BufferedReader m_input; /* JLex specification file. */
boolean m_eof_reached; /* Whether EOF has been encountered. */
boolean m_pushback_line;
char m_line[]; /* Line buffer. */
int m_line_read; /* Number of bytes read into line buffer. */
int m_line_index; /* Current index into line buffer. */
int m_line_number; /* Current line number. */
/***************************************************************
Constants
**************************************************************/
static final boolean EOF = true;
static final boolean NOT_EOF = false;
/***************************************************************
Function: CInput
Description:
**************************************************************/
CInput
(
java.io.Reader input
)
{
if (CUtility.DEBUG)
{
CUtility.assert(null != input);
}
/* Initialize input stream. */
m_input = new java.io.BufferedReader(input);
/* Initialize buffers and index counters. */
m_line = null;
m_line_read = 0;
m_line_index = 0;
/* Initialize state variables. */
m_eof_reached = false;
m_line_number = 0;
m_pushback_line = false;
}
/***************************************************************
Function: getLine
Description: Returns true on EOF, false otherwise.
Guarantees not to return a blank line, or a line
of zero length.
**************************************************************/
boolean getLine
(
)
throws java.io.IOException
{
String lineStr;
int elem;
/* Has EOF already been reached? */
if (m_eof_reached)
{
return EOF;
}
/* Pushback current line? */
if (m_pushback_line)
{
m_pushback_line = false;
/* Check for empty line. */
for (elem = 0; elem < m_line_read; ++elem)
{
if (false == CUtility.isspace(m_line[elem]))
{
break;
}
}
/* Nonempty? */
if (elem < m_line_read)
{
m_line_index = 0;
return NOT_EOF;
}
}
while (true)
{
if (null == (lineStr = m_input.readLine()))
{
m_eof_reached = true;
m_line_index = 0;
return EOF;
}
m_line = (lineStr + "\n").toCharArray();
m_line_read=m_line.length;
++m_line_number;
/* Check for empty lines and discard them. */
elem = 0;
while (CUtility.isspace(m_line[elem]))
{
++elem;
if (elem == m_line_read)
{
break;
}
}
if (elem < m_line_read)
{
break;
}
}
m_line_index = 0;
return NOT_EOF;
}
}
/********************************************************
Class: Utility
*******************************************************/
class CUtility
{
/********************************************************
Constants
*******************************************************/
static final boolean DEBUG = true;
static final boolean SLOW_DEBUG = true;
static final boolean DUMP_DEBUG = true;
/*static final boolean DEBUG = false;
static final boolean SLOW_DEBUG = false;
static final boolean DUMP_DEBUG = false;*/
static final boolean DESCENT_DEBUG = false;
static final boolean OLD_DEBUG = false;
static final boolean OLD_DUMP_DEBUG = false;
static final boolean FOODEBUG = false;
static final boolean DO_DEBUG = false;
/********************************************************
Constants: Integer Bounds
*******************************************************/
static final int INT_MAX = 2147483647;
static final int MAX_SEVEN_BIT = 127;
static final int MAX_EIGHT_BIT = 255;
static final int MAX_SIXTEEN_BIT=65535;
/********************************************************
Function: enter
Description: Debugging routine.
*******************************************************/
static void enter
(
String descent,
char lexeme,
int token
)
{
System.out.println("Entering " + descent
+ " [lexeme: " + lexeme
+ "] [token: " + token + "]");
}
/********************************************************
Function: leave
Description: Debugging routine.
*******************************************************/
static void leave
(
String descent,
char lexeme,
int token
)
{
System.out.println("Leaving " + descent
+ " [lexeme:" + lexeme
+ "] [token:" + token + "]");
}
/********************************************************
Function: assert
Description: Debugging routine.
*******************************************************/
static void assert
(
boolean expr
)
{
if (DEBUG && false == expr)
{
System.out.println("Assertion Failed");
throw new Error("Assertion Failed.");
}
}
/***************************************************************
Function: doubleSize
**************************************************************/
static char[] doubleSize
(
char oldBuffer[]
)
{
char newBuffer[] = new char[2 * oldBuffer.length];
int elem;
for (elem = 0; elem < oldBuffer.length; ++elem)
{
newBuffer[elem] = oldBuffer[elem];
}
return newBuffer;
}
/***************************************************************
Function: doubleSize
**************************************************************/
static byte[] doubleSize
(
byte oldBuffer[]
)
{
byte newBuffer[] = new byte[2 * oldBuffer.length];
int elem;
for (elem = 0; elem < oldBuffer.length; ++elem)
{
newBuffer[elem] = oldBuffer[elem];
}
return newBuffer;
}
/********************************************************
Function: hex2bin
*******************************************************/
static char hex2bin
(
char c
)
{
if ('0' <= c && '9' >= c)
{
return (char) (c - '0');
}
else if ('a' <= c && 'f' >= c)
{
return (char) (c - 'a' + 10);
}
else if ('A' <= c && 'F' >= c)
{
return (char) (c - 'A' + 10);
}
CError.impos("Bad hexidecimal digit" + c);
return 0;
}
/********************************************************
Function: ishexdigit
*******************************************************/
static boolean ishexdigit
(
char c
)
{
if (('0' <= c && '9' >= c)
|| ('a' <= c && 'f' >= c)
|| ('A' <= c && 'F' >= c))
{
return true;
}
return false;
}
/********************************************************
Function: oct2bin
*******************************************************/
static char oct2bin
(
char c
)
{
if ('0' <= c && '7' >= c)
{
return (char) (c - '0');
}
CError.impos("Bad octal digit " + c);
return 0;
}
/********************************************************
Function: isoctdigit
*******************************************************/
static boolean isoctdigit
(
char c
)
{
if ('0' <= c && '7' >= c)
{
return true;
}
return false;
}
/********************************************************
Function: isspace
*******************************************************/
static boolean isspace
(
char c
)
{
if ('\b' == c
|| '\t' == c
|| '\n' == c
|| '\f' == c
|| '\r' == c
|| ' ' == c)
{
return true;
}
return false;
}
/********************************************************
Function: isnewline
*******************************************************/
static boolean isnewline
(
char c
)
{
if ('\n' == c
|| '\r' == c)
{
return true;
}
return false;
}
/********************************************************
Function: bytencmp
Description: Compares up to n elements of
byte array a[] against byte array b[].
The first byte comparison is made between
a[a_first] and b[b_first]. Comparisons continue
until the null terminating byte '\0' is reached
or until n bytes are compared.
Return Value: Returns 0 if arrays are the
same up to and including the null terminating byte
or up to and including the first n bytes,
whichever comes first.
*******************************************************/
static int bytencmp
(
byte a[],
int a_first,
byte b[],
int b_first,
int n
)
{
int elem;
for (elem = 0; elem < n; ++elem)
{
/*System.out.print((char) a[a_first + elem]);
System.out.print((char) b[b_first + elem]);*/
if ('\0' == a[a_first + elem] && '\0' == b[b_first + elem])
{
/*System.out.println("return 0");*/
return 0;
}
if (a[a_first + elem] < b[b_first + elem])
{
/*System.out.println("return 1");*/
return 1;
}
else if (a[a_first + elem] > b[b_first + elem])
{
/*System.out.println("return -1");*/
return -1;
}
}
/*System.out.println("return 0");*/
return 0;
}
/********************************************************
Function: charncmp
*******************************************************/
static int charncmp
(
char a[],
int a_first,
char b[],
int b_first,
int n
)
{
int elem;
for (elem = 0; elem < n; ++elem)
{
if ('\0' == a[a_first + elem] && '\0' == b[b_first + elem])
{
return 0;
}
if (a[a_first + elem] < b[b_first + elem])
{
return 1;
}
else if (a[a_first + elem] > b[b_first + elem])
{
return -1;
}
}
return 0;
}
}
/********************************************************
Class: CError
*******************************************************/
class CError
{
/********************************************************
Function: impos
Description:
*******************************************************/
static void impos
(
String message
)
{
System.out.println("JLex Error: " + message);
}
/********************************************************
Constants
Description: Error codes for parse_error().
*******************************************************/
static final int E_BADEXPR = 0;
static final int E_PAREN = 1;
static final int E_LENGTH = 2;
static final int E_BRACKET = 3;
static final int E_BOL = 4;
static final int E_CLOSE = 5;
static final int E_NEWLINE = 6;
static final int E_BADMAC = 7;
static final int E_NOMAC = 8;
static final int E_MACDEPTH = 9;
static final int E_INIT = 10;
static final int E_EOF = 11;
static final int E_DIRECT = 12;
static final int E_INTERNAL = 13;
static final int E_STATE = 14;
static final int E_MACDEF = 15;
static final int E_SYNTAX = 16;
static final int E_BRACE = 17;
static final int E_DASH = 18;
static final int E_ZERO = 19;
static final int E_BADCTRL = 20;
/********************************************************
Constants
Description: String messages for parse_error();
*******************************************************/
static final String errmsg[] =
{
"Malformed regular expression.",
"Missing close parenthesis.",
"Too many regular expressions or expression too long.",
"Missing [ in character class.",
"^ must be at start of expression or after [.",
"+ ? or * must follow an expression or subexpression.",
"Newline in quoted string.",
"Missing } in macro expansion.",
"Macro does not exist.",
"Macro expansions nested too deeply.",
"JLex has not been successfully initialize