How teachers help us learn deterministic finite automata →

Minimizing finite state automata

September 18, 2013 by Artem Kaznatcheev 12 Comments

Computer science is a strange mix of engineering, science, and math. This is captured well by the historic roots of deterministic finite state automata (DFAs). The first ideas that can be recognized as a precursor to DFAs can be found with Gilberth & Gilberth (1921) introducing flow process charts into mechanical and industrial engineering. Independently, McCullock & Pitts (1943) used nerve-nets as a model of neural activity. This preceded Turing’s 1948 entry into brain science with B-type neural networks, and Rosenblatt’s perceptrons (1957). Unlike Turing and Rosenblatt, McCullock & Pitts model did not incorporate learning. However, the nerve nets went on to have a more profound effect on ratiocination because — as Kleene (1951) recognized — they became the modern form of DFAs. Although DFAs are now in less vogue than they were a few decades ago, they are an essential part of a standard computer science curriculum due to their centrality in computer science. Yesterday, I had the privilege of familiarizing students with the importance of DFAs by giving a lecture of Prakash Panangaden’s COMP 330 course. Since Prakash already introduced the students to the theory of DFAs, regular expressions, and languages, I was tasked with explaining the more practical task of DFA minimization.

Consider the two DFAs above, they both recognize the same language $\{a,b\}$ and yet the first has three states, and the second has four. Can we tell from just examining the left machine that it has some redundant states? Well, consider states $s_a,s_b$ in the right machine, no matter what happens in the future, they will have the same output. If the future has no more letters, i.e. is $\latex \epsilon$ then both states accept, and if the future has any non-zero string then they both transition to state $s_2$ where the machine will remain forever and reject. We could easily merge the two states into a single state $s_1'$ and still recognize the same language. Let’s formalize this insight into an equivalence relationship on states:

$p \approx q \iff \forall x \in \Sigma^* \;\; (\delta^*(p,x) \in F \iff \delta^*(q,x) \in F)$

This can be used to design a quotient automaton on equivalence classes $[p] = \{q | \; q \approx p\}$ . If we start with a DFA $M = (Q,\Sigma,\delta,s_0,F)$ then the quotient automaton is $M/\approx \; = (Q',\Sigma,\delta',[s_0],F')$ where $Q' = \{[p]| \; p \in Q\}$ , $F' = \{[p]| \; p \in F\}$ , and $\delta'([p],a) = [\delta(p,a)]$ . To prove that this quotient machine recognizes the same languages, we need three lemmas. The first checks that $\delta'$ is well defined and in-fact a function:

Lemma 1: $[p] = [q] \Rightarrow \forall a \in \Sigma \quad [\delta(p,a)] = [\delta(q,a)]$

Proof:

$\begin{aligned} \delta^*(\delta(p,a),y) \in F & \iff \delta^*(p,ay) \in F \\ & \iff \delta^*(q,ay) \in F \\ & \iff \delta^*(\delta(q,a),y) \in F \end{aligned}$

Lemma 2: $\forall x \in \Sigma^* \quad \delta^*([p],x) = [\delta^*(p,x)]$

Proof: By induction on length of $x$ with Lemma 1 as base case.

Lemma 3: $p \in F \iff [p] \in F'$

Proof: $(\Rightarrow)$ by definition of $F'$ , and $(\Leftarrow)$ by definition of $p \approx q$ with $x$ set to $\epsilon$ .

This allows us to prove that the quotient automaton recognizes the same language as the original automaton.

Theorem 1: $L(M/\approx) = L(M)$

Proof:

$\begin{aligned} x \in L(M/\approx) & \iff \delta'^*([s_0],x) \in F' \\ & \iff [\delta^*(s_0,x)] \in F' \\ & \iff \delta^*(s_0,x) \in F \\ & \iff x \in L(M) \end{aligned}$

Now, we just need an algorithm to find the equivalence classes. The most straightforward way is to assume that all states are mergeable, and then mark the ones we can’t merge, progressing backwards from states that differ in output on $\epsilon$ to preceding states that need to be distinct to preserve determinism. More formally, this gives us the algorithm:

Write down a table of all pairs $\{p,q\}$ ; initialize all as unmarked.
Mark $\{p,q\}$ if $p \in F$ and $q \not\in F$ (or vice versa).
Repeat until a whole pass through the table of pairs without a new marking. For each unmarked ${p,q}$ , if there is some $a \in \Sigma$ such that $\latex {\delta(p,a), \delta(q,a)\}$ are marked then mark $\{p,q\}$ .
If ${p,q}$ is not marked then $p \approx q$ .

I encourage you to use induction to prove that this algorithm is correct. More formally:

Theorem 2: $\{p,q\}$ is marked if and only if $\exists x \in \Sigma^*$ such that $\delta^*(p,x) \in F$ and $\delta^*(q,x) \not\in F$ (or vice versa).

References

Gilbreth, F.B., & Gilbreth, L.M. (1921) Process Charts. American Society of Mechanical Engineers.

Kleene, S. C. (1951). Representation of events in nerve nets and finite automata. Project RAND Research Memorandum RM-704.

McCulloch, Warren S., & Pitts, Walter (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133 DOI: 10.1007/BF02478259

Rosenblatt, F. (1957). The Perceptron — a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.

Filed under Commentary, Technical Tagged with cstheory, finite state automata

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

12 Responses to Minimizing finite state automata

Neuroskeptic (@Neuro_Skeptic) says:

September 19, 2013 at 15:44

“Computer science is a strange mix of engineering, science, and math.” …and turning things off and on again until they work.

Reply
- Artem Kaznatcheev says:
  
  March 7, 2015 at 18:29
  
  One of my favourite hacker koans:
  
  A novice was trying to fix a broken Lisp machine by turning the power off and on.
  
  Tom Knight, seeing what the student was doing, spoke sternly: “You cannot fix a machine by just power-cycling it with no understanding of what is going wrong.”
  
  Knight turned the machine off and on. The machine worked.
  
  Reply
Pingback: How teachers help us learn deterministic finite automata | Theory, Evolution, and Games Group
Pingback: Limits on efficient minimization and the helpfulness of teachers. | Theory, Evolution, and Games Group
Deepanjan Roy says:

October 2, 2013 at 01:52

I’m not sure if this is a typo, but in the fourth paragraph (just above Lemma 1), should it be Q’ = { [q] | q \in Q} instead of Q’ = { [p] | q \in Q}?

Reply
Deepanjan Roy says:

October 2, 2013 at 02:35

Also, in the first line of proof of Lemma 1, I think it should be $\delta * (\delta (p, a), y ) \in F$ .

Reply
- Artem Kaznatcheev says:
  
  October 2, 2013 at 07:09
  
  You are correct on both counts, thank you! I’ve fixed the typos.
  
  Reply
  - Deepanjan Roy says:
    
    October 2, 2013 at 11:44
    
    No problem. Thanks so much for writing up all of these!
    
    Reply
Matthew Wetmore says:

October 8, 2013 at 14:23

for Lemma 2, should we be proving the statement for $\delta'^*$ and not $\delta^*$ ? The statement as it is doesn’t “type check” since $\delta^*$ operates on states in $M$ and not on the states in $M'$ .

Reply
- Artem Kaznatcheev says:
  
  October 8, 2013 at 17:44
  
  Yup! The one on the left should be primed, I will fix it up shortly. Thank you :D.
  
  Reply
Pingback: Cataloging a year of blogging: the algorithmic world | Theory, Evolution, and Games Group
Pingback: Evolution is a special kind of (machine) learning | Theory, Evolution, and Games Group