Über die algorithmische Komplexität regulärer Sprachen

(On the algorithmic complexity of regular languages)

Gregor Gramlich

Ph.D. thesis in German, July 2007.

Abstract: We consider the approximate minimization of NFAs and regular expressions. It is known that exact minimization is PSpace hard in the general case. We show that even weak approximations solve hard problems and thus efficient approximations with reasonable approximation factors probably don't exist.
We also consider the problem of learning regular languages and show positive and negative results for the problem of learning of learning a unary regular language in some well known frameworks of machine learning.

Learning Unary Automata

Gregor Gramlich, Ralf Herrmann

Proceedings of the Seventh International Workshop on Descriptional Complexity of Formal Systems (DCFS'05), Como (Italy), pp. 122-133.

Abstract. We determine the complexity of learning problems for unary regular languages. We begin by investigating the minimum consistent dfa (resp. nfa) problem which is known not to be approximable within any polynomial, unless P=NP. For the case of unary dfa's, we exhibit an efficient algorithm. On the other hand, we show the intractability of the unary minimum consistent nfa problem but provide an efficient quadratic approximation for its optimization version. The VC dimension for the class of languages accepted by unary dfa's with at most n states is computed as n + log n ± Θ(log log n), which (together with the efficient solution for the consistency problem) yields an efficient PAC learning algorithm for this class. We also show that there are no efficient PAC learning algorithms for the class of languages accepted by unary nfa's with at most n states, unless every problem in NP is solvable by a quasipolynomial time Monte-Carlo algorithm. Here we assume that nfa's with few states serve as hypotheses.
In the context of learning with equivalence queries, we consider the number of counterexamples required to learn a unary regular language that is accepted by a dfa with at most n states. When submitting dfa's with at most nd states (d ≤ n) as queries, we show the upper bound O((n2)/d) and the lower bound Ω((n2 ln d)/ (d (ln n)2)). If only prime cycle lengths ≤ n are allowed as queries, we prove that Θ(n2/(ln n)) counterxamples are necessary and sufficient.

Conference: Seventh International Workshop on Descriptional Complexity of Formal Systems, (DCFS) 2005.
Como, Italy, June/July 2005

Minimizing NFA's and Regular Expressions

Gregor Gramlich, Georg Schnitger

Proceedings of the 22nd Annual Symposium on Theoretical Aspects of Computer Science (STACS'05), Stuttgart (Germany), Springer-Verlag, Lecture Notes in Computer Science 3404, 2005, pp. 399-411.

Abstract. We show inapproximability results concerning minimization of nondeterministic finite automata (nfa's) as well as regular expressions relative to given nfa's, regular expressions or deterministic finite automata (dfa's). We show that it is impossible to efficiently minimize a given nfa or regular expression with n states, transitions, resp. symbols within the factor o(n), unless P=PSPACE. Our inapproximability results for a given dfa with n states are based on cryptographic assumptions and we show that any efficient algorithm will have an approximation factor of at least n/(poly(log n)). Our setup also allows us to analyze the minimum consistent dfa problem.

Conference: 22nd International Symposium on Theoretical Aspects of Computer Science (STACS) 2005.
Stuttgart, Germany, February 2005

Probabilistic and Nondeterministic Unary Automata

Gregor Gramlich

Proceedings of Mathematical Foundations of Computer Science (MFCS) 2003, Springer-Verlag, Lecture Notes in Computer Science 2747, 2003, pp. 460-469.

Abstract. We investigate unary regular languages and compare deterministic finite automata (DFA's), nondeterministic finite automata (NFA's) and probabilistic finite automata (PFA's) with respect to their size.
Given a unary PFA with n states and an epsilon-isolated cutpoint, we show that the minimal equivalent DFA has at most n^(1/(2 epsilon)) states in its cycle. This result is almost optimal, since for any c<1 a family of PFA's can be constructed such that every equivalent DFA has at least n^(c/(2 epsilon)) states. Thus we show that for the model of probabilistic automata with a constant error bound, there is only a polynomial blowup for cyclic languages.
Given a unary NFA with n states, we show that efficiently approximating the size of a minimal equivalent NFA within the factor n^(1/2)/ln n is impossible unless P=NP. This result even holds under the promise that the accepted language is cyclic. On the other hand we show that we can approximate a minimal NFA within the factor ln n, if we are given a cyclic unary n-state DFA.

Conference: Mathematical Foundations of Computer Science
28th International Symposium, Bratislava, Slovakia, August 2003
The article won the "Best student paper award".

Unäre stochastische Automaten

(Unary Stochastic Automata)

Master Thesis in Computer Science (German)

Diplomarbeit am Institut für Informatik, Johann Wolfgang Goethe-Universität Frankfurt