Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DFA Minimization Brzozowski algorithm

I am trying to implement Brzozowski's algorithm for minimizing my DFA Following is the algorithm for the same.

DFA = d(r(d(r(NFA)))) 

where r() is reversal of NFA and D() converts NFA to DFA.

But I do not understand what is the meaning of r() searching on google also does not gives much information.

Can anybody please explain what is r() of NFA.

Any other simple algorithm or C++ implementation available please let me know the link.

like image 630
Avinash Avatar asked Oct 18 '25 19:10

Avinash


2 Answers

Brzozowski's algorithm

Brzozowski's algorithm is clearer written as:

minimized_DFA = subset(reverse(subset(reverse(NFA))))

where subset denotes subset construction (also known as powerset construction). Subset construction builds a DFA by simulating all transitions for each equivalent set of states (due to epsilon transitions) in the NFA.

Reversing an NFA involves these steps:

  1. Reverse the direction of all transition edges in the NFA.
  2. Create a new starting state that has epsilon transitions to all accepting states in the NFA.
  3. Mark all accepting states as non-accepting in the NFA.
  4. Make the old starting state the new accepting state.

Steps 2-4 effectively swaps the roles of accepting and starting states.


Brzozowski's algorithm example

Here's an example of minimizing a DFA based on a Udacity quiz for a compilers course (the steps are the same with an NFA as initial input).

Initial DFA:

initial DFA

This DFA accepts strings like {"", "a", "aa", "aaa", "aaabba", "ba", "bab", "ababa", "ababb"} and rejects strings like {"b", "ab", "aab", "aabb", "bb", "bbb"}. In other words, it rejects strings that have a "b" unless they also have a "ba" substring. It's pretty obvious that s1-s3 and s2-s4 are redundant.

Step 1: reverse(DFA):

reverse(DFA)

Step 2: subset(reverse(DFA)):

Run subset construction to build a table of DFA states to represent possible transitions from each unique epsilon closure (^ denotes a start state, $ denotes an accepting state):

A = e-closure({s5}) = {s0,s2,s4}
B = e-closure({s0,s1,s2,s3,s4}) = {s0,s1,s2,s3,s4}
C = e-closure({s2,s4}) = {s2,s4}
D = e-closure({s1,s2,s3,s4}) = {s1,s2,s3,s4}

     a   b
-----------
A^$  B   C
B$   B   B
C    D   C
D    D   B

subset(reverse(DFA))

Step 3: reverse(subset(reverse(DFA))):

Reverse the DFA. After eliminating common prefixes, another pass enables elimination of common suffixes.

reverse(subset(reverse(DFA)))

Step 4: subset(reverse(subset(reverse(DFA)))):

Run subset construction one more time to minimize the NFA.

A = e-closure({E}) = {A,B}
B = e-closure({B,D}) = {B,D}
C = e-closure({A,B,C,D} = {A,B,C,D}

    a  b
--------
A^$ A  B
B   C  B
C$  C  C

subset(reverse(subset(reverse(DFA))))


References:

  • Engineering a Compiler by Cooper and Torczon, 2nd edition. Subset construction is described on page 49 and Brzozowski's algorithm is on page 75.

  • Udacity's Compilers: Theory and Practice course, lesson 3.

Graphviz code for above diagrams:

// initial DFA
digraph G {
    rankdir=LR;
    size="8,5"
    
    node [shape=point]; qi;
    node [shape=doublecircle]; s0 s2 s4;
    node [shape=circle];

    qi -> s0;
    s0 -> s0 [label="a"];
    s0 -> s1 [label="b"];
    s1 -> s2 [label="a"];
    s2 -> s2 [label="a"];
    s2 -> s4 [label="b"];
    s4 -> s4 [label="a,b"];
    s1 -> s3 [label="b"];
    s3 -> s3 [label="b"];
    s3 -> s2 [label="a"];
}
// reverse(DFA)
digraph G {
    rankdir=LR;
    size="8,5"
    
    node [shape=point]; qi;
    node [shape=doublecircle]; s0;
    node [shape=circle];

    qi -> s5;
    s0 -> s0 [label="a"];
    s1 -> s0 [label="b"];
    s2 -> s1 [label="a"];
    s2 -> s2 [label="a"];
    s4 -> s2 [label="b"];
    s4 -> s4 [label="a,b"];
    s3 -> s1 [label="b"];
    s3 -> s3 [label="b"];
    s2 -> s3 [label="a"];
    s5 -> s2 [label="e"];
    s5 -> s0 [label="e"];
    s5 -> s4 [label="e"];
}
// subset(reverse(DFA))
digraph G {
    rankdir=LR;
    size="8,5"
    
    node [shape=point]; qi;
    node [shape=doublecircle]; A B;
    node [shape=circle];

    qi -> A;
    A -> B [label="a"];
    A -> C [label="b"];
    B -> B [label="a,b"];
    D -> B [label="b"];
    C -> D [label="a"];
    C -> C [label="b"];
    D -> D [label="a"];
}
// reverse(subset(reverse(DFA)))
digraph G {
    rankdir=LR;
    size="8,5"
    
    node [shape=point]; qi;
    node [shape=doublecircle]; A;
    node [shape=circle];

    qi -> E;
    B -> A [label="a"];
    C -> A [label="b"];
    B -> B [label="a,b"];
    B -> D [label="b"];
    D -> C [label="a"];
    C -> C [label="b"];
    D -> D [label="a"];
    E -> A [label="e"];
    E -> B [label="e"];
}
// subset(reverse(subset(reverse(DFA))))
digraph G {
    rankdir=LR;
    size="8,5"
    
    node [shape=point]; qi;
    node [shape=doublecircle]; A C;
    node [shape=circle];

    qi -> A;
    A -> A [label="a"];
    A -> B [label="b"];
    B -> B [label="b"];
    B -> C [label="a"];
    C -> C [label="a,b"];
}
like image 68
ggorlen Avatar answered Oct 20 '25 09:10

ggorlen


This is the implementation from OpenFst.

In this paper is a diagram (page 15) that show results of applying the reverse operation.

An easier way to help understand FSM operations is to use a library like OpenFst to create and manipulable the machines and then visualize the results using Graphviz.

like image 32
Paul Dixon Avatar answered Oct 20 '25 10:10

Paul Dixon