Next: Pattern matching with DFA, Previous: Introduction to the DFA, Up: DFA
The acronym DFA means Deterministic Finite state Automaton (See http://www.eti.pg.gda.pl/~jandac/thesis/node12.html or Hopcroft & Ullman "Introduction to Language Theory" for more details). DFA are common tools in compilers design (Read Aho, Ravi Sethi, Ullman "COMPILERS: Principles, Techniques and Tools" for a complete introduction), a lot of powerfull text searching algorithm like Knuth-Morris-Pratt or Boyer-Moore algorithms are based on DFA's (See http://www-igm.univ-mlv.fr/~lecroq/string/ for a bibliography of pattern matching algorithms).
Basically, a DFA is a set of states connected by labeled transitions. The labels are the values read on the board, in GNU Go these values are EMPTY, WHITE, BLACK or OUT_BOARD, denoted respectively by '.','O','X' and '#'.
The best way to represent a DFA is to draw its transition graph: the pattern "????..X" is recognized by the following DFA:
This means that starting from state [1], if you read '.','X' or 'O' on the board, go to state [2] and so on until you reach state [5]. From state [5], if you read '.', go to state [6] otherwise go to error state [0]. And so on until you reach state [8]. As soon as you reach state [8], you recognize Pattern "????..X"
Adding a pattern like "XXo" ('o' is a wildcard for not 'X') will transform directly the automaton by synchronization product (Building the DFA). Consider the following DFA:
By adding a special error state and completing each state by a transition to error state when there is none, we transform easily a DFA in a Complete Deterministic Finite state Automaton (CDFA). The synchronization product (Building the DFA) is only possible on CDFA's.
The graph of a CDFA is coded by an array of states: The 0 state is the "error" state and the start state is 1.
---------------------------------------------------- state | . | O | X | # | att ---------------------------------------------------- 1 | 2 | 2 | 9 | 0 | 2 | 3 | 3 | 3 | 0 | 3 | 4 | 4 | 4 | 0 | 5 | 6 | 0 | 0 | 0 | 6 | 7 | 0 | 0 | 0 | 7 | 0 | 0 | 8 | 0 | 8 | 0 | 0 | 0 | 0 | Found pattern "????..X" 9 | 3 | 3 | A | 0 | A | B | B | 4 | 0 | B | 5 | 5 | 5 | 0 | Found pattern "XXo" ----------------------------------------------------
To each state we associate an often empty list of attributes which is the list of pattern indexes recognized when this state is reached. In 'dfa.h' this is basically represented by two stuctures:
/* dfa state */
typedef struct state
{
int next[4]; /* transitions for EMPTY, BLACK, WHITE and OUT_BOARD */
attrib_t *att;
}
state_t;
/* dfa */
typedef struct dfa
{
attrib_t *indexes; /* Array of pattern indexes */
int maxIndexes;
state_t *states; /* Array of states */
int maxStates;
}
dfa_t;