CS 600.226 Data Structures: Class Challenge

FSA Minimization Algorithms


Back to the Challenge page, or to syllabus.

The bottom-up algorithm (acyclic FSA only).
The top-down general algorithm.



The Bottom-Up Algorithm for Acyclic FSAs


Example (bottom-up algorithm)

Input FSA


Final states have green color (2,3,5,8,9). Input symbols are only a and b.

All strings which are recognized by this FSA are thus:
a, b, bb, bbba, bbbb, baaa, baab, ab, abba, abbb, aaaa, aaab.
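The list above can be checked mechanically. The following is a minimal Python sketch, not part of the original challenge material. The arc table `ARCS` is an assumption reconstructed from the merges and transition tables in the examples on this page; where the text does not pin down a target (for instance, whether an arc ends in 6 or 7, or in 8 or 9), one was picked arbitrarily. State 1 is assumed to be the start state, and `accepted_strings` is a hypothetical helper name.

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}
FINAL = {2, 3, 5, 8, 9}


def accepted_strings(arcs, final, start=1):
    """Return every string accepted by an acyclic FSA (depth-first walk)."""
    found = []

    def walk(state, prefix):
        if state in final:
            found.append(prefix)
        for symbol, target in sorted(arcs[state].items()):
            walk(target, prefix + symbol)

    walk(start, "")
    return found


if __name__ == "__main__":
    # Shortest strings first, then alphabetical order.
    print(sorted(accepted_strings(ARCS, FINAL), key=lambda s: (len(s), s)))
```

The walk terminates only because the FSA is acyclic; on an FSA with loops it would recurse forever.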

Initial Step

Nodes with no outgoing arcs, grouped according to their FINAL status: {8,9} (final) and {10} (non-final).

Call the two new states "9" (for {8,9}) and "10" (for {10}). The PROCESSED set now contains {9,10}. The FSA after merge:
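The initial step can be sketched in Python as follows. This is an illustration under assumptions: the arc table `ARCS` is reconstructed from the examples on this page (ambiguous targets were picked arbitrarily), and `initial_step` is a hypothetical helper name. Each group is named after its largest member only to match the name "9" used above for {8,9}.

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}
FINAL = {2, 3, 5, 8, 9}


def initial_step(arcs, final):
    """Group the states without outgoing arcs by their FINAL status."""
    sinks = sorted(s for s in arcs if not arcs[s])
    final_sinks = [s for s in sinks if s in final]
    other_sinks = [s for s in sinks if s not in final]
    # Each non-empty group becomes one merged state, named after its largest member.
    return {max(group): group for group in (final_sinks, other_sinks) if group}


processed = initial_step(ARCS, FINAL)
print(processed)
```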

Main Loop

Iteration 1

The CANDIDATES set, {6,7}, is now merged into a single state (let's call it "6"), since states 6 and 7 have the same pattern of outgoing arcs. The FSA after merge (states that were PROCESSED earlier have a thick circle around them):

Iteration 2

States 4 and 5 (the CANDIDATES) cannot be merged: not only does 4 have no outgoing arc labeled b, but 5 is final whereas 4 is not. They have to be kept separate (say, keeping their names, "4" and "5"). The FSA after merge:
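The merge test used in these iterations can be expressed as a comparison of "signatures": the FINAL status of a state together with its outgoing arcs, with every target renamed to the merged state it now belongs to. The Python sketch below is an illustration, not the course's reference code. `ARCS` is an assumed arc table reconstructed from the examples on this page, and `signature` and `merged_into` are hypothetical names.

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}
FINAL = {2, 3, 5, 8, 9}


def signature(state, arcs, final, merged_into):
    """FINAL status plus outgoing arcs, with targets renamed to their merged state."""
    pattern = tuple(sorted((sym, merged_into[t]) for sym, t in arcs[state].items()))
    return (state in final, pattern)


# Merges done so far in the walkthrough: {8,9} -> "9", {10} -> "10", {6,7} -> "6".
merged_into = {8: 9, 9: 9, 10: 10, 6: 6, 7: 6}

# 6 and 7 have equal signatures, so they could be merged in iteration 1.
print(signature(6, ARCS, FINAL, merged_into) == signature(7, ARCS, FINAL, merged_into))
# 4 and 5 differ in both FINAL status and arc pattern, so they stay separate.
print(signature(4, ARCS, FINAL, merged_into) == signature(5, ARCS, FINAL, merged_into))
```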

Iteration 3

States 2 and 3 can be merged (say, into "2"): both arcs labeled a go to state 4, and both arcs labeled b go to state 5. The FSA after merge:

Iteration 4

No merge possible, but treat "1" as if merged. The FSA after merge:
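Putting the initial step and the main loop together, the whole bottom-up procedure might look like the Python sketch below. It is one possible reading of the walkthrough, not a definitive implementation. `ARCS` is an assumed arc table reconstructed from the examples on this page, `minimize_acyclic` is a hypothetical name, and merged states are named after their smallest member (so {8,9} is called 8 here rather than "9").

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}
FINAL = {2, 3, 5, 8, 9}


def minimize_acyclic(arcs, final):
    """Bottom-up merging for an acyclic FSA; returns the merged groups per pass."""
    merged_into = {}       # PROCESSED: original state -> name of its merged state
    remaining = set(arcs)
    history = []
    while remaining:
        # CANDIDATES: unprocessed states whose arcs all lead to PROCESSED states.
        # In the first pass these are exactly the states with no outgoing arcs.
        candidates = sorted(s for s in remaining
                            if all(t in merged_into for t in arcs[s].values()))
        if not candidates:
            raise ValueError("input FSA contains a cycle")
        groups = {}
        for s in candidates:
            pattern = tuple(sorted((sym, merged_into[t])
                                   for sym, t in arcs[s].items()))
            groups.setdefault((s in final, pattern), []).append(s)
        for members in groups.values():
            for s in members:
                merged_into[s] = members[0]   # smallest member names the group
        remaining -= set(candidates)
        history.append(sorted(groups.values()))
    return history


for step, groups in enumerate(minimize_acyclic(ARCS, FINAL)):
    print("initial step" if step == 0 else f"iteration {step}", groups)
```

A state is handled only after all of its targets have been processed, which is why this version is restricted to acyclic input.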





The Top-down Algorithm for (General) FSAs

This algorithm works for general FSAs (i.e., FSAs that may contain loops). If the input is guaranteed to be acyclic, there is a trade-off to consider: the top-down version needs [linearly] less memory than the bottom-up version, but it may take substantially longer to run unless the 'outgoing arc sharing' tests for the splits (see below) are implemented in a clever way.

Example (top-down algorithm)

Input FSA


Final states have green color (2,3,5,8,9). Input symbols are only a and b.

All strings which are recognized by this FSA are thus:
a, b, bb, bbba, bbbb, baaa, baab, ab, abba, abbb, aaaa, aaab.

Initial Step

Initial split: {2,3,5,8,9} (final states, C1), {1,4,6,7,10} (non-final states, C2).

Iteration 1

Transition table: (orig. state) x (symbol) -> CLASS (for C1):

Symbol \ Orig. state    2     3     5     8     9
a                       C2    C2    C2    -     -
b                       C1    C1    C2    -     -

Splitting C1 into: {2,3} (new name for next iteration: C1), {5} (C2), and {8,9} (C3).

Transition table: (orig. state) x (symbol) -> CLASS (for C2):

Symbol \ Orig. state    1     4     6     7     10
a                       C1    C2    C1    C1    -
b                       C1    -     C1    C1    -

Splitting C2 into: {1,6,7} (new name for next iteration: C4), {4} (C5), and {10} (C6).
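One refinement pass such as Iteration 1 can be sketched in Python as shown below. This is an illustration under assumptions: `ARCS` is an arc table reconstructed from the examples on this page, and `refine` is a hypothetical name. Classes are stored in a list, so list index 0 corresponds to C1, index 1 to C2, and so on; a missing arc (shown as "-" in the tables) becomes `None`.

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}


def refine(partition, arcs, alphabet="ab"):
    """Split every class by the classes its members reach on each symbol."""
    class_of = {s: i for i, block in enumerate(partition) for s in block}
    refined = []
    for block in partition:
        groups = {}   # one column pattern of the transition table -> states
        for s in block:
            column = tuple(class_of.get(arcs[s].get(sym)) for sym in alphabet)
            groups.setdefault(column, []).append(s)
        refined.extend(groups.values())   # dicts keep insertion order
    return refined


initial_split = [[2, 3, 5, 8, 9], [1, 4, 6, 7, 10]]   # C1 (final), C2 (non-final)
print(refine(initial_split, ARCS))
```

Because groups are emitted in order of first appearance, the resulting list order happens to match the class names C1 to C6 chosen in the text.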

Iteration 2

Transition table: (orig. state) x (symbol) -> CLASS (for C1):

Symbol \ Orig. state    2     3
a                       C5    C5
b                       C2    C2

No further split possible (good :-)). The class gets the name C1 (again). NB: even though no split is possible now, the CLASS might still split in a later iteration if any of the classes at the other end of its outgoing arcs split.

Transition table: (orig. state) x (symbol) -> CLASS (for C2):

Symbol \ Orig. state    5
a                       C6
b                       C4

Obviously, any CLASS containing only one state can safely be left intact from now on, except that it still gets a class name in every iteration: here C2 ({5}; not quite 'new' in this case, but...)

Transition table: (orig. state) x (symbol) -> CLASS (for C3):

Symbol \ Orig. state    8     9
a                       -     -
b                       -     -

No further split possible. Name for next iteration: C3 ({8,9}).

Transition table: (orig. state) x (symbol) -> CLASS (for C4):

Symbol \ Orig. state    1     6     7
a                       C1    C3    C3
b                       C1    C3    C3

Splitting C4 into: {1} (new name for next iteration: C4), {6,7} (C5).

Since C5 and C6 already contain only one state ({4} and {10}, respectively), assign them only a new name for the next iteration: C5->C6 ({4}), C6->C7 ({10}).

Iteration 3

Only tables with 2 or more columns shown:

Transition table: (orig. state) x (symbol) -> CLASS (for C1):

Symbol \ Orig. state    2     3
a                       C6    C6
b                       C2    C2

No further split possible. Name for next iteration: C1.

Transition table for C2 not shown (1 state only: {5}). New name: C2.

Transition table: (orig. state) x (symbol) -> CLASS (for C3):

Symbol \ Orig. state    8     9
a                       -     -
b                       -     -

No further split possible. Name for next iteration: C3.

Transition table for C4 not shown (1 state only: {1}). New name: C4.

Transition table: (orig. state) x (symbol) -> CLASS (for C5):

Symbol \ Orig. state    6     7
a                       C3    C3
b                       C3    C3

No further split possible. Name for next iteration: C5.

Transition table for C6 not shown (1 state only: {4}). New name: C6.

Transition table for C7 not shown (1 state only: {10}). New name: C7.

No class split during this iteration - termination condition met.
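The complete loop, from the initial split to the termination condition, can be sketched in Python as follows. It is a minimal illustration rather than a tuned implementation: every class is re-examined in every iteration, with no clever 'outgoing arc sharing' test. `ARCS` is an assumed arc table reconstructed from the examples on this page, and `refine` and `minimize_top_down` are hypothetical names.

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}
FINAL = {2, 3, 5, 8, 9}


def refine(partition, arcs, alphabet):
    """One iteration: split every class by the classes its members reach."""
    class_of = {s: i for i, block in enumerate(partition) for s in block}
    refined = []
    for block in partition:
        groups = {}
        for s in block:
            column = tuple(class_of.get(arcs[s].get(sym)) for sym in alphabet)
            groups.setdefault(column, []).append(s)
        refined.extend(groups.values())
    return refined


def minimize_top_down(arcs, final, alphabet="ab"):
    """Refine the final/non-final split until no class splits any more."""
    partition = [sorted(s for s in arcs if s in final),
                 sorted(s for s in arcs if s not in final)]
    partition = [block for block in partition if block]   # drop an empty side
    while True:
        refined = refine(partition, arcs, alphabet)
        if len(refined) == len(partition):   # no split: termination condition
            return refined
        partition = refined


print(minimize_top_down(ARCS, FINAL))
```

Nothing in this loop depends on the FSA being acyclic, which is what makes the top-down approach applicable to general FSAs.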

Write out the resulting FSA, picking one state from each of the sets {2,3}, {6,7}, and {8,9} as the representative of that CLASS. Ignore arcs that go nowhere once the other nodes have been deleted.

Or alternatively, simply consider the CLASSes to be states, number them (easy!) and write out the resulting FSA:

Resulting minimal FSA (using CLASSes as states):
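Writing out the FSA with CLASSes as states can also be done programmatically. The Python sketch below is an illustration under assumptions: `ARCS` is an arc table reconstructed from the examples on this page, state 1 is assumed to be the start state, and `quotient` is a hypothetical name. The classes are numbered 1 to 7 in the order C1 to C7 from the last iteration.

```python
# Assumed arc table: state -> {symbol: target}. Reconstructed, not authoritative.
ARCS = {
    1: {"a": 2, "b": 3},
    2: {"a": 4, "b": 5},
    3: {"a": 4, "b": 5},
    4: {"a": 6},
    5: {"a": 10, "b": 7},
    6: {"a": 8, "b": 9},
    7: {"a": 8, "b": 9},
    8: {}, 9: {}, 10: {},
}
FINAL = {2, 3, 5, 8, 9}
# Final classes C1..C7 as listed at the end of iteration 3.
CLASSES = [[2, 3], [5], [8, 9], [1], [6, 7], [4], [10]]


def quotient(partition, arcs, final, start=1):
    """Build the FSA whose states are the classes, numbered from 1."""
    number = {s: i for i, block in enumerate(partition, start=1) for s in block}
    # Any member can represent its class: at termination all members of a class
    # reach the same classes on every symbol, so the first member is used.
    q_arcs = {number[block[0]]: {sym: number[t]
                                 for sym, t in arcs[block[0]].items()}
              for block in partition}
    q_final = {number[s] for s in final}
    return q_arcs, q_final, number[start]


q_arcs, q_final, q_start = quotient(CLASSES, ARCS, FINAL)
print("start:", q_start, "final:", sorted(q_final))
for state, out in q_arcs.items():
    print(state, out)
```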

