Minimizing finite state automata

Computer science is a strange mix of engineering, science, and math. This is captured well by the historic roots of deterministic finite state automata (DFAs). The first ideas that can be recognized as a precursor to DFAs can be found with Gilberth & Gilberth (1921) introducing flow process charts into mechanical and industrial engineering. Independently, McCullock & Pitts (1943) used nerve-nets as a model of neural activity. This preceded Turing’s 1948 entry into brain science with B-type neural networks, and Rosenblatt’s perceptrons (1957). Unlike Turing and Rosenblatt, McCullock & Pitts model did not incorporate learning. However, the nerve nets went on to have a more profound effect on ratiocination because — as Kleene (1951) recognized — they became the modern form of DFAs. Although DFAs are now in less vogue than they were a few decades ago, they are an essential part of a standard computer science curriculum due to their centrality in computer science. Yesterday, I had the privilege of familiarizing students with the importance of DFAs by giving a lecture of Prakash Panangaden’s COMP 330 course. Since Prakash already introduced the students to the theory of DFAs, regular expressions, and languages, I was tasked with explaining the more practical task of DFA minimization.


Consider the two DFAs above, they both recognize the same language \{a,b\} and yet the first has three states, and the second has four. Can we tell from just examining the left machine that it has some redundant states? Well, consider states s_a,s_b in the right machine, no matter what happens in the future, they will have the same output. If the future has no more letters, i.e. is $\latex \epsilon$ then both states accept, and if the future has any non-zero string then they both transition to state s_2 where the machine will remain forever and reject. We could easily merge the two states into a single state s_1' and still recognize the same language. Let’s formalize this insight into an equivalence relationship on states:

p \approx q \iff \forall x \in \Sigma^* \;\; (\delta^*(p,x) \in F \iff \delta^*(q,x) \in F)

This can be used to design a quotient automaton on equivalence classes [p] = \{q | \; q \approx p\}. If we start with a DFA M = (Q,\Sigma,\delta,s_0,F) then the quotient automaton is M/\approx \; = (Q',\Sigma,\delta',[s_0],F') where Q' = \{[p]| \; p \in Q\}, F' = \{[p]| \; p \in F\}, and \delta'([p],a) = [\delta(p,a)]. To prove that this quotient machine recognizes the same languages, we need three lemmas. The first checks that \delta' is well defined and in-fact a function:

Lemma 1: [p] = [q] \Rightarrow \forall a \in \Sigma \quad [\delta(p,a)] = [\delta(q,a)]


\begin{aligned}  \delta^*(\delta(p,a),y) \in F & \iff \delta^*(p,ay) \in F \\  & \iff \delta^*(q,ay) \in F \\  & \iff \delta^*(\delta(q,a),y) \in F  \end{aligned}

Lemma 2: \forall x \in \Sigma^* \quad \delta^*([p],x) = [\delta^*(p,x)]

Proof: By induction on length of x with Lemma 1 as base case.

Lemma 3: p \in F \iff [p] \in F'

Proof: $(\Rightarrow)$ by definition of F', and $(\Leftarrow)$ by definition of p \approx q with x set to \epsilon.

This allows us to prove that the quotient automaton recognizes the same language as the original automaton.

Theorem 1: L(M/\approx) = L(M)


\begin{aligned}  x \in L(M/\approx) & \iff \delta'^*([s_0],x) \in F' \\  & \iff [\delta^*(s_0,x)] \in F' \\  & \iff \delta^*(s_0,x) \in F \\  & \iff x \in L(M)  \end{aligned}

Now, we just need an algorithm to find the equivalence classes. The most straightforward way is to assume that all states are mergeable, and then mark the ones we can’t merge, progressing backwards from states that differ in output on \epsilon to preceding states that need to be distinct to preserve determinism. More formally, this gives us the algorithm:

  1. Write down a table of all pairs \{p,q\}; initialize all as unmarked.
  2. Mark \{p,q\} if p \in F and q \not\in F (or vice versa).
  3. Repeat until a whole pass through the table of pairs without a new marking. For each unmarked {p,q}, if there is some a \in  \Sigma such that $\latex {\delta(p,a), \delta(q,a)\}$ are marked then mark \{p,q\}.
  4. If {p,q} is not marked then p \approx q.

I encourage you to use induction to prove that this algorithm is correct. More formally:

Theorem 2: \{p,q\} is marked if and only if $\exists x \in \Sigma^*$ such that \delta^*(p,x) \in F and \delta^*(q,x) \not\in F (or vice versa).


Gilbreth, F.B., & Gilbreth, L.M. (1921) Process Charts. American Society of Mechanical Engineers.

Kleene, S. C. (1951). Representation of events in nerve nets and finite automata. Project RAND Research Memorandum RM-704.

McCulloch, Warren S., & Pitts, Walter (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133 DOI: 10.1007/BF02478259

Rosenblatt, F. (1957). The Perceptron — a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

12 Responses to Minimizing finite state automata

  1. “Computer science is a strange mix of engineering, science, and math.” …and turning things off and on again until they work.

    • One of my favourite hacker koans:

      A novice was trying to fix a broken Lisp machine by turning the power off and on.

      Tom Knight, seeing what the student was doing, spoke sternly: “You cannot fix a machine by just power-cycling it with no understanding of what is going wrong.”

      Knight turned the machine off and on. The machine worked.

  2. Pingback: How teachers help us learn deterministic finite automata | Theory, Evolution, and Games Group

  3. Pingback: Limits on efficient minimization and the helpfulness of teachers. | Theory, Evolution, and Games Group

  4. I’m not sure if this is a typo, but in the fourth paragraph (just above Lemma 1), should it be Q’ = { [q] | q \in Q} instead of Q’ = { [p] | q \in Q}?

  5. Also, in the first line of proof of Lemma 1, I think it should be \delta * (\delta (p, a), y ) \in F.

  6. for Lemma 2, should we be proving the statement for \delta'^* and not \delta^*? The statement as it is doesn’t “type check” since \delta^* operates on states in M and not on the states in M'.

  7. Pingback: Cataloging a year of blogging: the algorithmic world | Theory, Evolution, and Games Group

  8. Pingback: Evolution is a special kind of (machine) learning | Theory, Evolution, and Games Group

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s