Two Variable vs. Linear Temporal Logic in Model Checking and Games

Model checking linear-time properties expressed in first-order logic has non-elementary complexity, and thus various restricted logical languages are employed. In this paper we consider two such restricted specification logics, linear temporal logic (LTL) and two-variable first-order logic (FO2). LTL is more expressive but FO2 can be more succinct, and hence it is not clear which should be easier to verify. We take a comprehensive look at the issue, giving a comparison of verification problems for FO2, LTL, and various sublogics thereof across a wide range of models. In particular, we look at unary temporal logic (UTL), a subset of LTL that is expressively equivalent to FO2; we also consider the stutter-free fragment of FO2, obtained by omitting the successor relation, and the expressively equivalent fragment of UTL, obtained by omitting the next and previous connectives. We give three logic-to-automata translations which can be used to give upper bounds for FO2 and UTL and various sublogics. We apply these to get new bounds for both non-deterministic systems (hierarchical and recursive state machines, games) and for probabilistic systems (Markov chains, recursive Markov chains, and Markov decision processes). We couple these with matching lower-bound arguments. Next, we look at combining FO2 verification techniques with those for LTL. We present here a language that subsumes both FO2 and LTL, and inherits the model checking properties of both languages. Our results give both a unified approach to understanding the behaviour of FO2 and LTL, along with a nearly comprehensive picture of the complexity of verification for these logics and their sublogics.


Introduction
The complexity of verification problems clearly depends on the specification language for describing properties. Arguably the most important such language is Linear Temporal Logic (LTL). LTL has a simple syntax, one can verify LTL properties over Kripke structures in polynomial space, and one can check satisfiability also in polynomial space. Moreover, Kamp [Kam68] has shown that LTL has the same expressiveness as first-order logic over words. For example, the first-order property "after we are born, we live until we die" is expressed in LTL by the formula □(born → (live U die)).
In contrast with LTL, model checking first-order queries has non-elementary complexity [Sto74]; thus LTL can be thought of as a tractable syntactic fragment of FO. Another approach to obtaining tractability within first-order logic is to keep first-order syntax but restrict to two-variable formulas. The resulting specification language FO 2 has also been shown to have dramatically lower complexity than full first-order logic. In particular, Etessami, Vardi and Wilke [EVW02] showed that satisfiability for FO 2 is NEXPTIME-complete and that FO 2 is strictly less expressive than FO (and thus less expressive than LTL also). Indeed, [EVW02] shows that FO 2 has the same expressive power as Unary Temporal Logic (UTL): the fragment of LTL with only the unary operators "previous", "next", "sometime in the past", "sometime in the future". Consider the example above. We have shown that it can be expressed in LTL, but it is easy to show that it cannot be expressed in UTL, and therefore cannot be expressed in FO 2 .
Although FO 2 is less expressive than LTL, there are some properties that are significantly easier to express in FO 2 than in LTL. Consider the property that two n-bit identifiers agree, which can be expressed by an FO 2 formula of size polynomial in n. It is easy to show that there is an exponential blow-up in transforming this FO 2 formula into an equivalent LTL formula. We thus have three languages UTL, LTL and FO 2 , with UTL and FO 2 equally expressive, LTL more expressive, and with FO 2 incomparable in succinctness with LTL. Are verification tasks easier to perform in LTL, or in FO 2 ? This is the main question we address in this paper. There are well-known examples of problems that are easier in LTL than in FO 2 : in particular satisfiability, which is PSPACE-complete for LTL and NEXPTIME-complete for FO 2 [EVW02]. We will show that there are also tasks where FO 2 is more tractable than LTL.
Our main contribution is a uniform approach to the verification of FO 2 via automata. We show that translations to the appropriate automata can give optimal bounds for verification of FO 2 on both non-deterministic and probabilistic structures. We also show that such translations allow us to understand the verification of the fragment of FO 2 formed by removing the successor relation from the signature, denoted FO 2 [<]. It turns out, somewhat surprisingly, that for this fragment we can get the same complexity upper bounds for verification as for the simplest temporal logic, TL[◇,◇⁻] (unary temporal logic with only the "sometime in the future" and "sometime in the past" operators). For our translations from FO 2 [<] to automata, we make use of a key result from Weis [Wei11], showing that models of FO 2 [<] formulas realise only a polynomial number of types. We extend this "few types" result from finite to infinite words and use it to characterise the structure of automata for FO 2 [<].
The outcome of our translations is a comprehensive analysis of the complexity of FO 2 and UTL verification problems, together with those for the respective stutter-free fragments FO 2 [<] and TL[◇,◇⁻]. We begin with model checking problems for Kripke structures and for recursive state machines (RSMs), which we compare to known results for LTL on these models. We then turn to two-player games, considering the complexity of the problem of determining which player has a strategy to ensure that a given formula is satisfied. We then move from non-deterministic systems to probabilistic systems. We start with Markov chains and recursive Markov chains, the analogs of Kripke structures and RSMs in the probabilistic case. Finally we consider one-player stochastic games, looking at the question of whether the player can devise a strategy that is winning with a given probability.
Towards the end of the paper, we consider extensions of FO 2 , and in particular how FO 2 verification techniques can be combined with those for Linear Temporal Logic (LTL). We present here a language that we denote FO 2 [LTL], subsuming both FO 2 and LTL. We show that the complexity of verification problems for FO 2 [LTL] can be attacked by our automatatheoretic methods, and indeed reduces to verification of FO 2 and LTL individually. As a result we show that the worst-case complexity of probabilistic verification, as well as non-deterministic verification, for FO 2 [LTL] is (roughly speaking) the maximum of the complexity for FO 2 and LTL.
This paper expands on results presented in two conference papers, [BLW11, BLW12]. Organization: Section 2 contains preliminaries, while Section 3 gives fundamental results on the model theory of FO 2 and its relation to UTL that will be used in the remainder of the paper. Section 4 presents the logic-to-automata translations used in our upper bounds. The first is a translation of a given UTL formula to a large disjoint union of Büchi automata with certain structural restrictions. This can also be used to give a translation from a given FO 2 formula to a (still larger) union of Büchi automata. The second does something similar for FO 2 [<] formulas. The last translation maps FO 2 [<] and FO 2 formulas to deterministic parity automata, which is useful for certain problems involving games.
Section 6 gives upper and lower bounds for non-deterministic systems, while Section 7 is concerned with probabilistic systems. In Section 8 we consider model checking of FO 2 [LTL], which subsumes both FO 2 and LTL, and finally in Section 9 we consider the impact of extending all the previous logics with let definitions.

Logic, Automata and Complexity Classes
We consider a first-order signature with set of unary predicates P = {P 1 , . . . , P m } and binary predicates < (less than) and suc (successor). Fixing two distinct variables x and y, we denote by FO 2 the set of first-order formulas over the above signature involving only the variables x and y. We denote by FO 2 [<] the sublogic in which the binary predicate suc is not used. We write ϕ(x) for a formula in which only the variable x occurs free.
In this paper we are interested in interpretations of FO 2 on infinite words. An ω-word u = u 0 u 1 . . . over the powerset alphabet Σ = 2 P represents a first-order structure extending (N, <, suc), in which predicate P i is interpreted by the set {n ∈ N : P i ∈ u n } and the binary predicates < and suc have the obvious interpretations.
We also consider Linear Temporal Logic LTL on ω-words. The formulas of LTL are built from atomic propositions using Boolean connectives and the temporal operators ◯ (next), ◯⁻ (previously), ◇ (eventually), ◇⁻ (sometime in the past), U (until), and S (since). Formally, LTL is defined by the following grammar:
ϕ ::= P i | ¬ϕ | ϕ ∧ ϕ | ◯ϕ | ◯⁻ϕ | ◇ϕ | ◇⁻ϕ | ϕ U ϕ | ϕ S ϕ
where P 0 , P 1 , . . . are propositional variables. Unary temporal logic (UTL) denotes the subset without U and S, while TL[◇,◇⁻] denotes the stutter-free subset of UTL without ◯ and ◯⁻. We use □ϕ as an abbreviation for ¬◇¬ϕ.
Let (u, i) be the suffix u i u i+1 . . . of ω-word u. We define the semantics of LTL inductively on the structure of the formulas as follows:
(1) (u, i) |= P k iff atomic proposition P k holds at position i of u;
(2) (u, i) |= ϕ 1 ∧ ϕ 2 iff (u, i) |= ϕ 1 and (u, i) |= ϕ 2 ;
(3) (u, i) |= ¬ϕ iff it is not the case that (u, i) |= ϕ;
(4) (u, i) |= ◯ϕ iff (u, i + 1) |= ϕ;
(5) (u, i) |= ◯⁻ϕ iff i > 0 and (u, i − 1) |= ϕ;
(6) (u, i) |= ϕ 1 U ϕ 2 iff ∃j ≥ i s.t. (u, j) |= ϕ 2 and ∀k, i ≤ k < j, we have (u, k) |= ϕ 1 ;
(7) (u, i) |= ϕ 1 S ϕ 2 iff ∃j ≤ i s.t. (u, j) |= ϕ 2 and ∀k, j < k ≤ i, we have (u, k) |= ϕ 1 ;
(8) (u, i) |= ◇ϕ iff (u, i) |= true U ϕ;
(9) (u, i) |= ◇⁻ϕ iff (u, i) |= true S ϕ.
It is well known that over ω-words LTL has the same expressiveness as first-order logic, and UTL has the same expressiveness as FO 2 . Moreover, while FO 2 is less expressive than LTL, it can be exponentially more succinct [EVW02]; for concrete examples of these facts, see the introduction. We can combine the succinctness of FO 2 and the expressiveness of LTL by extending the former with the temporal operators U and S (applied to formulas with at most one free variable). We call the resulting logic FO 2 [LTL]. The syntax of FO 2 [LTL] divides formulas into two syntactic classes: temporal formulas and first-order formulas. Temporal formulas are given by the grammar
ϕ ::= P i | ψ | ¬ϕ | ϕ ∧ ϕ | ϕ U ϕ | ϕ S ϕ
where P i is an atomic proposition and ψ is a first-order formula with one free variable. First-order formulas are given by the grammar
ψ ::= P i (x) | x < y | suc(x, y) | ϕ(x) | ¬ψ | ψ ∧ ψ | ∃x ψ | ∃y ψ
(and symmetrically with the roles of x and y exchanged), where ϕ is a temporal formula. Here the first-order formula ϕ(x) asserts that the temporal formula ϕ holds at position x. The temporal operators ◇, ◇⁻, and □ can all be introduced as derived operators. The relative expressiveness of the logics defined thus far is illustrated in Figure 1.
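Read operationally, the clauses above suggest a simple evaluator. The sketch below is illustrative only: it covers just the future fragment (◯, U, ◇), whose truth values on an ultimately periodic word u·v^ω are periodic from position |u| onwards, and the tuple-based formula encoding is ad hoc. It computes the set of satisfying positions, handling U as a least fixpoint.

```python
# Evaluate future-fragment LTL on the ultimately periodic word u·v^ω.
# Formulas are nested tuples: ('prop', p), ('not', f), ('and', f, g),
# ('true',), ('next', f), ('until', f, g), ('eventually', f).

def ltl_eval(phi, u, v):
    """Positions in {0, ..., |u|+|v|-1} where phi holds on u·v^ω.
    Positions >= |u| stand for the loop; with future operators only,
    truth there is periodic, so one copy of the loop suffices."""
    n, m = len(u), len(v)
    letters = list(u) + list(v)
    succ = list(range(1, n + m)) + [n]          # last loop position wraps
    ev = lambda f: ltl_eval(f, u, v)

    op = phi[0]
    if op == 'true':
        return set(range(n + m))
    if op == 'prop':
        return {i for i in range(n + m) if phi[1] in letters[i]}
    if op == 'not':
        return set(range(n + m)) - ev(phi[1])
    if op == 'and':
        return ev(phi[1]) & ev(phi[2])
    if op == 'next':
        holds = ev(phi[1])
        return {i for i in range(n + m) if succ[i] in holds}
    if op == 'until':            # least fixpoint of  s2 ∪ (s1 ∩ next(val))
        s1, s2 = ev(phi[1]), ev(phi[2])
        val = set(s2)
        while True:
            new = val | {i for i in s1 if succ[i] in val}
            if new == val:
                return val
            val = new
    if op == 'eventually':       # ◇f abbreviates true U f, clause (8)
        return ev(('until', ('true',), phi[1]))
    raise ValueError(op)

# On u = [{'a'}], v = [{'b'}]: a U b holds at positions 0 and 1.
assert ltl_eval(('until', ('prop', 'a'), ('prop', 'b')), [{'a'}], [{'b'}]) == {0, 1}
```

Past operators are omitted deliberately: the truth value of a past formula at a loop position can differ between unrollings of the loop, so they need a more careful treatment than this one-copy quotient.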
Finally, we consider an extension of FO 2 [LTL] with let definitions. We inductively define the formulas and the unary predicate subformulas that occur free in such a formula. The atomic formulas of FO 2 [LTL] Let are as in FO 2 [LTL], with the formula P (x) occurring freely in itself. The constructors include all those of FO 2 [LTL], with the set of free subformula occurrences being preserved by all of these constructors.
There is one new formula constructor, of the form
let P i (x) = ϕ 1 (x) in ϕ 2
where P i is a unary predicate, ϕ 1 (x) is an FO 2 [LTL] Let formula in which x is the only free variable and no occurrence of predicate P i is free, and ϕ 2 is an arbitrary FO 2 [LTL] Let formula. A subformula P j (z) occurs freely in the above formula iff it occurs freely in ϕ 1 (x), or it occurs freely in ϕ 2 and P j is not P i . The semantics of FO 2 [LTL] Let is defined via a translation function T to FO 2 [LTL], with the only non-trivial rule being
T (let P i (x) = ϕ 1 (x) in ϕ 2 ) = T (ϕ 2 )[P i (x) → T (ϕ 1 ), P i (y) → T (ϕ 1 )[x → y]]
where T (ϕ 1 )[x → y] denotes the formula obtained by substituting variable y for all free occurrences of x in T (ϕ 1 ), and the right-hand side denotes the result of substituting T (ϕ 1 ) for every free occurrence of P i (x) and T (ϕ 1 )[x → y] for every free occurrence of P i (y). We let UTL Let be the extension of UTL by the operator above, and similarly define TL[◇,◇⁻] Let and FO 2 [<] Let .
For ϕ a temporal logic formula or an FO 2 formula with one free variable, we denote by L(ϕ) the set {w ∈ Σ ω : (w, 0) |= ϕ} of infinite words that satisfy ϕ at the initial position. The quantifier depth of an FO 2 formula ϕ is denoted qdp(ϕ) and the operator depth of a UTL formula ϕ is denoted odp(ϕ). In either case the length of the formula is denoted |ϕ|.
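To see the mechanism behind the succinctness of let definitions, consider expanding them away. The toy sketch below uses a hypothetical tuple encoding and ignores the x/y renaming performed by the real translation T: it simply substitutes each let body for the occurrences of its predicate. When every body uses the previous predicate twice, the expansion doubles at each level, so an O(n)-size formula expands to size 2^Ω(n).

```python
# Toy let-expansion: ('let', name, body, scope) is removed by replacing
# every occurrence of ('pred', name) in the scope with the body.
# (The real translation T also renames x to y; we ignore that here.)

def expand(phi):
    if phi[0] == 'let':
        _, name, body, scope = phi
        return subst(expand(scope), name, expand(body))
    if phi[0] == 'pred':
        return phi
    return (phi[0],) + tuple(expand(a) for a in phi[1:])

def subst(phi, name, repl):
    if phi == ('pred', name):
        return repl
    if phi[0] == 'pred':
        return phi
    return (phi[0],) + tuple(subst(a, name, repl) for a in phi[1:])

def size(phi):
    return 1 + sum(size(a) for a in phi[1:] if isinstance(a, tuple))

def nested(n):
    """let P0 = Q in let P1 = P0 ∧ P0 in ... in Pn."""
    scope = ('pred', f'P{n}')
    for i in range(n, 0, -1):
        body = ('and', ('pred', f'P{i-1}'), ('pred', f'P{i-1}'))
        scope = ('let', f'P{i}', body, scope)
    return ('let', 'P0', ('pred', 'Q'), scope)

assert size(nested(8)) == 35                 # linear in n
assert size(expand(nested(8))) == 2 ** 9 - 1  # exponential in n
```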
The notion of a subformula of an FO 2 [LTL] formula is defined as usual. For an FO 2 [LTL] Let formula ϕ, let sub(ϕ) denote the set of subformulas of the equivalent FO 2 [LTL] formula T (ϕ), where T is the translation function defined above.
Büchi Automata. Our results will be obtained by transforming formulas to automata that accept ω-words. We will be most concerned with generalised Büchi automata (GBA). A GBA A is a tuple (Σ, S, S 0 , ∆, F) with alphabet Σ, set of states S, set of initial states S 0 ⊆ S, transition relation ∆ and collection F of sets of final states. A run is accepting if, for each F ∈ F, some state s ∈ F is visited infinitely often along the run. We can have labels either on states or on transitions; the two models are equivalent. For more details, see [VW86]. We will consider two important classes of Büchi automata: the automaton A is said to be deterministic in the limit if all states reachable from accepting states are deterministic; A is unambiguous if for each state s each word is accepted along at most one run that starts at s.
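Concretely, for an ultimately periodic run (a stem followed by a repeated loop) the generalised Büchi condition just asks that every acceptance set meet the loop. A minimal sketch, with hypothetical state names:

```python
# A lasso run stem·loop^ω visits exactly the states of the loop
# infinitely often, so it is accepting iff each F in the family
# intersects the loop.

def gba_lasso_accepting(stem, loop, family):
    recurring = set(loop)          # the states visited infinitely often
    return all(recurring & set(F) for F in family)

# Family {{q1}, {q2}}: both q1 and q2 must recur.
assert gba_lasso_accepting(['q0'], ['q1', 'q2'], [{'q1'}, {'q2'}])
assert not gba_lasso_accepting(['q0'], ['q1'], [{'q1'}, {'q2'}])
```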
Deterministic Parity Automata. For some model checking problems, we will need to work with deterministic automata. In particular, we will use deterministic parity automata. A deterministic parity automaton A is a tuple (Σ, S, s 0 , ∆, P r) with alphabet Σ, set of states S, an initial state s 0 ∈ S, transition function ∆ and a priority function P r mapping each state to a natural number. The transition function ∆ maps each state and alphabet symbol to exactly one state. The run of such an automaton on an input ω-word induces an infinite sequence of priorities. The acceptance condition is that the highest priority occurring infinitely often in this sequence is even.
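On an ultimately periodic input the unique run of a deterministic parity automaton is itself ultimately periodic, so acceptance can be checked by finding the repeating part of the run and inspecting its priorities. A small illustrative sketch (the transition map and state names are hypothetical):

```python
# Follow the unique run on u·v^ω; once a (state, loop-position) pair
# repeats, the priorities seen in between repeat forever.  Accept iff
# the highest priority on that cycle is even.

def dpa_accepts(delta, priority, s0, u, v):
    state = s0
    for a in u:                          # read the stem
        state = delta[(state, a)]
    seen, cycle_prios = {}, []
    i, step = 0, 0
    while (state, i) not in seen:
        seen[(state, i)] = step
        state = delta[(state, v[i])]
        cycle_prios.append(priority[state])
        i, step = (i + 1) % len(v), step + 1
    start = seen[(state, i)]
    return max(cycle_prios[start:]) % 2 == 0

# Two-state automaton accepting "infinitely many a's":
delta = {('qa', 'a'): 'qa', ('qa', 'b'): 'qb',
         ('qb', 'a'): 'qa', ('qb', 'b'): 'qb'}
prio = {'qa': 2, 'qb': 1}
assert dpa_accepts(delta, prio, 'qb', ['b'], ['a', 'b'])    # b·(ab)^ω
assert not dpa_accepts(delta, prio, 'qb', ['a'], ['b'])     # a·b^ω
```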
Complexity Classes. Our complexity bounds involve counting classes. #P is the class of functions f for which there is a non-deterministic polynomial-time Turing machine T such that f (x) is the number of accepting computation paths of T on input x. A complete problem for #P is #SAT, the problem of counting the number of satisfying assignments of a given Boolean formula. We will be considering computations of probabilities, not integers, so our problems will technically not be in #P; but some of them will have representations computable in the related class FP #P , and will be #P-hard. For brevity, we will sometimes abuse notation by saying that such probability computation problems are #P-complete. The class of functions #EXP is defined analogously to #P, except with T a non-deterministic exponential-time machine. We will deal with a decision version of #EXP, PEXP: the set of problems solvable by a non-deterministic Turing machine in exponential time, where the acceptance condition is that more than half of the computation paths accept [BFT98].
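For intuition, here is the defining computation on the canonical #P-complete problem: counting satisfying assignments by brute force over all valuations (exponential time, of course; the DIMACS-style clause encoding is purely illustrative).

```python
# #SAT: count satisfying assignments of a CNF formula.  A clause is a
# list of non-zero integers; literal k means variable |k|, negated if k < 0.
from itertools import product

def count_sat(num_vars, clauses):
    count = 0
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            count += 1
    return count

# (x1 ∨ x2) ∧ (¬x1 ∨ x2) is satisfied exactly when x2 is true.
assert count_sat(2, [[1, 2], [-1, 2]]) == 2
```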
Notation: In our complexity bounds, we will often write poly to denote a fixed but arbitrary polynomial.

FO 2 model theory and succinctness
We now discuss the model theory of FO 2 , summarizing and slightly extending the material presented in Etessami, Vardi, and Wilke [EVW02] and in Weis and Immerman [WI09].
Recall that we will consider strings over alphabet Σ = 2 P , where P is the set of unary predicates appearing in the input FO 2 [<] formula. We start by recalling the small-model property of FO 2 that underlies the NEXPTIME satisfiability result of Etessami, Vardi, and Wilke [EVW02]; it is also implicit in Theorem 6.2 of [WI09].
The domain of a word u ∈ Σ * ∪ Σ ω is the set dom(u) = {i ∈ N : 0 ≤ i < |u|} of positions in u. The range of u is the set ran(u) = {u i : i ∈ dom(u)} of letters occurring in u. Write also inf(u) for the set of letters that occur infinitely often in u. For k ≥ 0 we write u ∼ k v if u and v satisfy the same FO 2 [<] sentences of quantifier depth at most k, and we write τ k (u, i), the k-type of position i in u, for the set of FO 2 [<] formulas ϕ(x) of quantifier depth at most k that hold in u with x interpreted as i.
The small model property of [EVW02] can then be stated as follows:

Proposition 3.1 ([EVW02]). Let Σ = 2 P . Then (i) for any string u ∈ Σ * and positive integer k there exists v ∈ Σ * such that u ∼ k v and |v| ∈ 2 O(|P|k) ; (ii) for any infinite string u ∈ Σ ω and positive integer k there are finite strings v and w, with |v|, |w| ∈ 2 O(|P|k) , such that u ∼ k vw ω .
For completeness, we give a constructive proof of Proposition 3.1, which will be used in one of our translations of FO 2 to automata. This is Lemma 3.9 at the end of this section. For this it is convenient to use the following inductive characterisation of ∼ k , which is proven in [EVW02] by a straightforward induction:

Proposition 3.2 ([EVW02]). For k ≥ 1, the k-type τ k (u, i) is determined by the letter u i together with the set of (k − 1)-types occurring at positions strictly less than i and the set of (k − 1)-types occurring at positions strictly greater than i.

The next proposition states that we can collapse any two positions in a string that have the same k-type without affecting the k-type of the string.

Proposition 3.3 ([EVW02]). Let u ∈ Σ * and let i < j be positions of u with τ k (u, i) = τ k (u, j). Then u ∼ k v, where v is the string obtained from u by deleting the positions i + 1, . . . , j.
From these two propositions it follows that every finite string is equivalent under ∼ k to a string of length exponential in k and |P|.
Proposition 3.4. Given a nonnegative integer k, for all strings u ∈ Σ * there exists a string v ∈ Σ * such that u ∼ k v and |v| is bounded by 2 O(|P|k) .
Proof. We prove by induction on k that the set {τ k (u, i) : i ∈ dom(u)} of k-types occurring along u has size at most |Σ|(2|Σ| + 2) k .
The base case k = 0 is clear, since the 0-type of a position is determined by its letter. For the induction step, assume that the number of (k − 1)-types occurring along u is at most |Σ|(2|Σ| + 2) k−1 . Define a boundary point in u to be the position of the first or last occurrence of a given (k − 1)-type. Then there are at most 2|Σ|(2|Σ| + 2) k−1 boundary points. By Proposition 3.2 the k-type at a given position i in u is determined by u i together with the set of (k − 1)-types occurring strictly before i and the set occurring strictly after i. As i increases by one, this pair of sets can change only if the position being left is a boundary point, so the pair assumes at most 2|Σ|(2|Σ| + 2) k−1 + 1 distinct values as i ranges over dom(u). Thus the number of k-types along u is at most
(3.1) |Σ| · (2|Σ|(2|Σ| + 2) k−1 + 1) ≤ |Σ|(2|Σ| + 2) k .
By Proposition 3.3, given any string v in which there are two distinct positions with the same k-type there exists a shorter string w with v ∼ k w. Collapsing repeatedly, and using the bound (3.1) on the number of k-types, we conclude that there exists a string v such that u ∼ k v and |v| ≤ |Σ|(2|Σ| + 2) k ∈ 2 O(|P|k) .
The relation ∼ k is also easy to compute:

Proposition 3.5. Given k and strings u, v ∈ Σ * of length at most h, the k-types of all positions of u can be computed, and it can be decided whether u ∼ k v, in time h · 2 O(|P|k) .

Proof. For m = 0, 1, . . . , k we successively pass along u labelling each position i with its m-type τ m (u, i). Each rank m requires two passes: we pass leftward through u computing the set of (m − 1)-types to the left of each position, then we pass rightward computing the set of (m − 1)-types to the right of each position. This requires 2k passes in total, with each pass taking time linear in h and at most quadratic in the number of k-types that occur along u. The bound now follows using the estimate of the number of types given in Proposition 3.4.
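The labelling procedure in this proof is straightforward to implement. In the illustrative sketch below, types are represented as nested tuples, following the characterisation of a k-type as the letter together with the sets of (k−1)-types strictly before and strictly after the position:

```python
# Label every position of a finite word with its m-type, for m = 0..k.
# 0-types are the letters; the m-type of position i is the triple
# (letter, {(m-1)-types before i}, {(m-1)-types after i}).

def type_labels(u, k):
    labels = list(u)                              # 0-types: the letters
    for _ in range(k):
        before, seen = [], set()
        for t in labels:                          # left-to-right pass
            before.append(frozenset(seen))
            seen.add(t)
        after, seen = [None] * len(u), set()
        for i in range(len(u) - 1, -1, -1):       # right-to-left pass
            after[i] = frozenset(seen)
            seen.add(labels[i])
        labels = [(u[i], before[i], after[i]) for i in range(len(u))]
    return labels

# On aaaa only the first, last and middle positions have distinct 1-types,
assert len(set(type_labels('aaaa', 1))) == 3
# and the number of k-types respects the |Σ|(2|Σ|+2)^k bound:
assert len(set(type_labels('abbaab', 2))) <= 2 * (2 * 2 + 2) ** 2
```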
Combining Propositions 3.4 and 3.5 we get:

Corollary 3.6. Given k there exists a set Rep k (Σ) ⊆ Σ * of representative strings such that each v ∈ Rep k (Σ) has |v| ≤ |Σ|(2|Σ| + 2) k and for each string u ∈ Σ * there exists a unique v ∈ Rep k (Σ) such that u ∼ k v. Moreover Rep k (Σ) can be computed from k in time doubly exponential in |P|k.

The following result is classical, and can be proven using games.
Proposition 3.7. Given u, v ∈ Σ * and u′, v′ ∈ Σ ω , for all k, if u ∼ k v and u′ ∼ k v′ then uu′ ∼ k vv′.
From the above we infer that the equivalence class of an infinite string under ∼ k is determined by a prefix of the string and the set of letters appearing infinitely often within it.
Proposition 3.8. Fix k ∈ N. Given u = u 0 u 1 . . . ∈ Σ ω , there exists N ∈ N such that for all n ≥ N and any word w ∈ Σ ω with inf(w) = ran(w) = inf(u) it holds that u ∼ k u 0 u 1 . . . u n w.
Proof. Define a strictly increasing sequence of integers n 0 < n 1 < . . . < n k inductively as follows.
Let n 0 be such that for all i ≥ n 0 letter u i occurs infinitely often in u. For 0 < s ≤ k let n s be such that ran(u n s−1 . . . u ns ) = inf(u). Now define N := n k .
Let n ≥ N and let v := u 0 u 1 . . . u n w for some w such that inf(w) = ran(w) = inf(u). We claim that for all 0 ≤ s ≤ k:
(1) τ s (u, i) = τ s (v, i) for all i ≤ n s ;
(2) τ s (u, i) = τ s (v, j) for all i, j > n s such that u i = v j .
This claim entails the proposition. We prove the claim by induction on s. The base case s = 0 is obvious.
The induction step for Clause 1 is as follows. Suppose that i ≤ n s ; we must show that τ s (u, i) = τ s (v, i). Certainly u i = v i since u and v agree on the first N letters. Similarly for all j < i we have τ s−1 (u, j) = τ s−1 (v, j) by Clauses 1 and 2 of the induction hypothesis. Now for all j > i there exists j′ > i such that u j = v j′ , and hence by Clause 2 of the induction hypothesis τ s−1 (u, j) = τ s−1 (v, j′ ). We conclude that τ s (u, i) = τ s (v, i) by Proposition 3.2.
The induction step for Clause 2 is as follows. Suppose that i, j > n s and u i = v j ; we must show that τ s (u, i) = τ s (v, j). We will again use Proposition 3.2. Certainly for all i′ > i there exists j′ > j such that u i′ = v j′ , and hence τ s−1 (u, i′ ) = τ s−1 (v, j′ ). Now let i′ < i. If i′ ≤ n s then i′ < j, u i′ = v i′ , and hence τ s−1 (u, i′ ) = τ s−1 (v, i′ ). Otherwise suppose n s < i′ < i. By definition of n s there exists j′ with n s−1 < j′ ≤ n s such that u i′ = v j′ . Then τ s−1 (u, i′ ) = τ s−1 (v, j′ ) by Clause 2 of the induction hypothesis, and j′ < j. The symmetric inclusions are established in the same way, and thus τ s (u, i) = τ s (v, j) by Proposition 3.2.
Combining Proposition 3.7 and Proposition 3.8, we complete the proof of Proposition 3.1, giving a slight strengthening of the conclusion for infinite words.
Lemma 3.9. For any string u ∈ Σ ω and positive integer k there exists v ∈ Σ * with |v| ∈ 2 O(|P|k) such that v ∼ k u′ for infinitely many prefixes u′ of u, and u ∼ k vw ω , where w is a list of the letters occurring infinitely often in u.
3.1. FO 2 and temporal logic. We now examine the relationship between FO 2 and UTL. Again we will be summarizing previous results while adding some new ones about the complexity of translation.
As mentioned previously, Etessami, Vardi and Wilke [EVW02] have studied the expressiveness and complexity of FO 2 on words. They show that FO 2 has the same expressiveness as unary temporal logic UTL, giving a linear translation of UTL into FO 2 , and an exponential translation in the reverse direction.
With regard to complexity, [EVW02] shows that satisfiability for FO 2 over finite words or ω-words is NEXP-complete. The NEXP upper bound follows immediately from their "small model" theorem (see Proposition 3.1 stated earlier). NEXP-hardness is by reduction from a tiling problem. This reduction requires either the use of the successor predicate, or consideration of models where an arbitrary Boolean combination of predicates can hold, that is, they consider words over an alphabet of the form Σ = 2 {P 1 ,P 2 ,...,Pn} .
The NEXP-hardness result for FO 2 [<] does not carry over from satisfiability to model checking since the collection of alphabet symbols that can appear in a word generated by the system being checked is bounded by the size of the system. However the complexity of model checking is polynomially related to the complexity of satisfiability when the latter is measured as a function of both formula size and alphabet size. Hence in the rest of the section we will deal with words over alphabet Σ = {P 0 , P 1 , . . . , P n }, i.e., in which a unique proposition holds in each position. We call this the unary alphabet restriction.
One obvious approach to obtaining upper bounds for model checking FO 2 [<] would be to give a polynomial translation to TL[◇,◇⁻], and then use a logic-to-automata translation for TL[◇,◇⁻]. Without the unary alphabet restriction, an exponential blow-up in translating from FO 2 [<] to TL[◇,◇⁻] was shown necessary by Etessami, Vardi, and Wilke:

Proposition 3.11 ([EVW02]). There is a sequence (ψ n ) n≥1 of FO 2 [<] sentences over {P 1 , P 2 , . . . , P n } of size O(n 2 ) such that the shortest temporal logic formula equivalent to ψ n has size 2 Ω(n) .
The sequence of formulas given in [EVW02] to prove the above theorem does not respect the unary alphabet restriction; in particular, their proof does not apply in this restricted setting. However, below we show that the exponential blow-up is necessary even in this restricted setting. Our proof is indirect; it uses the following result about extensions of FO 2 with let definitions:

Lemma 3.12. There is a sequence (ϕ n ) n≥1 of FO 2 [<] Let sentences mentioning predicates {P 1 , P 2 , . . . , P n } such that the shortest model of ϕ n under the unary alphabet restriction has size 2 Ω(|ϕ n |) .
Proof. We define ϕ n to be a nested sequence of let definitions introducing auxiliary unary predicates R 1 , . . . , R n .
The body of the nested sequence of let definitions states that for all x and for all 1 ≤ i ≤ n there exists y such that the vector of truth values (R 1 (x), R 2 (x), . . . , R n (x)) agrees with the vector (R 1 (y), R 2 (y), . . . , R n (y)) in all positions but position i. The set of vectors realised in a model is therefore non-empty and closed under flipping any single coordinate, so the vector (R 1 (x), R 2 (x), . . . , R n (x)) must take all 2 n possible truth values as x ranges over all positions in the word, i.e., any model of ϕ n must have length at least 2 n .
We now claim that ϕ n is satisfiable. To show this, recursively define a sequence of words w (k) over alphabet Σ = {P 0 , P 1 , . . . , P n } by w (0) = ε and w (k+1) = w (k) P n−k w (k) for 0 ≤ k < n. Finally write w = w (n) P 0 . Then the vector of truth values (R 1 (x), R 2 (x), . . . , R n (x)) counts down from 2 n − 1 to 0 in binary as x moves along w.
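The recursion for the witness word is easily checked mechanically. A small sketch, with predicate names rendered as strings:

```python
# w(0) = ε,  w(k+1) = w(k) · P_{n-k} · w(k),  and finally w = w(n) · P_0.
def witness(n):
    w = []
    for k in range(n):
        w = w + [f'P{n - k}'] + w
    return w + ['P0']

# |w| = 2^n, matching the lower bound on models of phi_n.
assert len(witness(10)) == 2 ** 10
# The letters follow the binary "ruler" pattern: position p carries
# P_{n-j} where 2^j is the largest power of two dividing p + 1.
assert witness(4)[:7] == ['P4', 'P3', 'P4', 'P2', 'P4', 'P3', 'P4']
```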
In contrast, we show that basic temporal logic enhanced with let definitions has the small model property:

Lemma 3.13. There is a polynomial poly such that every satisfiable TL[◇,◇⁻] Let formula ϕ has a model of size poly(|ϕ|).
Proof. In [EVW02, Section 5], Etessami, Vardi, and Wilke prove a small model property for TL[◇,◇⁻] that follows the same lines as the one given for FO 2 , but with polynomial rather than exponential bounds on sizes. Instead of using types based on quantifier rank, the notion of type is based on the nesting of modalities; they thus look at the modal k-type, where k is the nesting depth of modalities in ϕ. They show how to collapse infinite ω-words in order to get "smaller" ω-words with essentially the same type structure. Then in Lemma 4 of [EVW02] it is shown that for each u ∈ Σ ω there are strings v, w such that the type of u at position 0 equals the type of vw ω at position 0, and the lengths of both v and w are less than (t + 1) 2 , where t is the number of types occurring along u (that is, a polynomial analogue of Proposition 3.1).
The type is determined by the predicate holding at the given position together with the combination of temporal subformulas of ϕ holding there. Each temporal subformula, i.e. each subformula whose outermost connective is ◇ or ◇⁻, can change its truth value at most once along the infinite word. Therefore there are at most polynomially many (in |Σ| and in the number of temporal subformulas of ϕ) different combinations, and so also at most polynomially many types along u.
Lemma 2.1 tells us that the number of temporal subformulas of ϕ is linear in |ϕ|, and therefore the number t of types occurring along any word is polynomial in |ϕ|. Thus, applying the above-mentioned type-collapsing argument of [EVW02], we conclude that there is a polynomial-size model of ϕ.
The small model property for TL[◇,◇⁻] Let will allow us to lift NP model-checking results to this language. Most relevant to our discussion of succinctness, it can be combined with the previous result to show that FO 2 [<] can be exponentially more succinct than TL[◇,◇⁻]:

Proposition 3.14. Even assuming the unary alphabet restriction, there is no polynomial translation from FO 2 [<] formulas to equivalent TL[◇,◇⁻] formulas.
Proof. Suppose for a contradiction that such a polynomial translation existed. We could then apply it locally to the body of every let definition in an FO 2 [<] Let formula, thereby translating any FO 2 [<] Let formula to a TL[◇,◇⁻] Let formula of polynomial size. By Lemma 3.13 it would follow that every satisfiable FO 2 [<] Let formula has a model of polynomial size, contradicting Lemma 3.12.
Proposition 3.14 shows that we cannot obtain better bounds for FO 2 [<] merely by translation to TL[◇,◇⁻]. Weis [Wei11] showed an NP bound on satisfiability of FO 2 [<] under the unary alphabet restriction (compared to NEXP-completeness of satisfiability in the general case). His approach is to show that models realise only polynomially many types. We will later show that the approach of Weis can be extended to obtain complexity bounds for model checking FO 2 [<] that are as low as one could hope, i.e., that match the complexity bounds for the simplest temporal logic, TL[◇,◇⁻]. We do so by building sufficiently small unambiguous Büchi automata for FO 2 [<] formulas.

Translations
This section contains a key contribution of this paper: three logic-to-automata translations, for UTL, FO 2 , and FO 2 [<]. We will later use these translations to obtain upper complexity bounds for model checking both non-deterministic and probabilistic systems. As we will show, for most of the problems it is sufficient to translate a given formula to an unambiguous Büchi automaton. Our first translation produces such an automaton from a given UTL formula. This is then lifted to full FO 2 via a standard syntactic transformation from FO 2 to UTL. Our second translation goes directly from the stutter-free fragment FO 2 [<] to unambiguous Büchi automata, and is used to obtain optimal bounds for this fragment. Our third translation constructs a deterministic parity automaton from an FO 2 formula. Having a deterministic automaton is necessary for solving two-player games and for quantitative model checking of Markov decision processes.

4.1. Translation I: From UTL to unambiguous Büchi automata. We begin with a translation that takes UTL formulas to Büchi automata. Combining this with the standard syntactic transformation of FO 2 to UTL, we obtain a translation from FO 2 to Büchi automata.
Recall from the preliminaries section that a Büchi automaton A is said to be deterministic in the limit if all accepting states and their descendants are deterministic, and that A is unambiguous if each word has at most one accepting run.
We will aim at the following result:

Theorem 4.1. Let ϕ be a UTL formula over set of propositions P with operator depth n with respect to ◯ and ◯⁻. Given an alphabet Σ ⊆ 2 P , there is a family of at most 2 |ϕ| 2 Büchi automata {A i } i∈I such that (i) {w ∈ Σ ω : w |= ϕ} is the disjoint union of the languages L(A i ); (ii) A i has at most O(|ϕ| · |Σ| n+1 ) states; (iii) A i is unambiguous and deterministic in the limit; (iv) there is a polynomial-time procedure that outputs A i given input ϕ and index i ∈ I.
We first outline the construction of the family {A i } i∈I . Let ϕ be a formula of TL[◇,◇⁻] over set of atomic propositions P. Following Wolper's construction [Wol01], define cl (ϕ), the closure of ϕ, to consist of all subformulas of ϕ (including ϕ) and their negations, where we identify ¬¬ψ with ψ. Furthermore, say that s ⊆ cl (ϕ) is a subformula type if (i) for each formula ψ ∈ cl (ϕ) precisely one of ψ and ¬ψ is a member of s; (ii) ψ ∈ s implies ◇ψ ∈ s and ◇⁻ψ ∈ s, whenever these formulas belong to cl (ϕ); (iii) ψ 1 ∧ ψ 2 ∈ s iff ψ 1 ∈ s and ψ 2 ∈ s. Given subformula types s and t, write s ∼ t if s and t agree on all formulas whose outermost connective is a temporal operator, i.e., for all formulas ψ we have ◇ψ ∈ s iff ◇ψ ∈ t, and ◇⁻ψ ∈ s iff ◇⁻ψ ∈ t. Note that these types are different from the types based on modal depth considered before.
Fix an alphabet Σ ⊆ 2 P and write tp Σ ϕ for the set of subformula types s ⊆ cl (ϕ) with s ∩ P ∈ Σ. In subsequent applications Σ will arise as the set of propositional labels in a structure to be model checked. Following [Wol01] we define a generalised Büchi automaton A Σ ϕ = (Σ, S, S 0 , ∆, λ, F) as follows. The set of states is S = tp Σ ϕ , with the set S 0 of initial states comprising those s ∈ tp Σ ϕ such that (i) ϕ ∈ s and (ii) ◇⁻ψ ∈ s if and only if ψ ∈ s for any formula ψ. The state labelling function λ : S → Σ is defined by λ(s) = s ∩ P. The transition relation ∆ consists of those pairs (s, t) such that (i) ◇⁻ψ ∈ t iff either ψ ∈ t or ◇⁻ψ ∈ s; (ii) ◇ψ ∈ s iff either ψ ∈ s or ◇ψ ∈ t. The collection F of acceptance sets contains, for each formula ◇ψ ∈ cl (ϕ), the set F ◇ψ = {s ∈ S : ψ ∈ s or ◇ψ ∉ s}. A run of A Σ ϕ on a word u ∈ Σ ω is a map f : N → S that starts in S 0 , respects the transition relation, and satisfies λ(f (i)) = u i for all i. Moreover it can be shown that if the run is accepting then for all formulas ψ ∈ cl (ϕ), if ψ ∈ f (i) then (u, i) |= ψ [Wol01, Lemma 2]. But since f (i) contains each subformula or its negation, we have ψ ∈ f (i) if and only if (u, i) |= ψ for all ψ ∈ cl (ϕ). We conclude that A Σ ϕ is unambiguous and accepts the language L(ϕ). The following lemma summarises some structural properties of the automaton A Σ ϕ .
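For a concrete feel for the construction, the sketch below instantiates it for the single formula ◇p over Σ = {∅, {p}}. The rendering is ad hoc (◇p is written 'Fp'), and the transition rules spell out both directions of the ◇-clause:

```python
# Subformula-type automaton for phi = ◇p over Sigma = {{}, {p}}.
from itertools import combinations

P, FP = 'p', 'Fp'                                   # p and ◇p

STATES = [frozenset(c) for r in range(3)
          for c in combinations([P, FP], r)
          if not (P in c and FP not in c)]          # p in s forces ◇p in s

DELTA = {(s, t) for s in STATES for t in STATES
         if (FP not in t or FP in s)                # ◇p at i+1 implies ◇p at i
         and (FP not in s or P in s or FP in t)}    # a pending ◇p persists

INITIAL = [s for s in STATES if FP in s]            # phi must hold at position 0
ACCEPTING = {s for s in STATES if P in s or FP not in s}

assert len(STATES) == 3 and len(INITIAL) == 2 and len(DELTA) == 6
# The two ◇p-states reach each other, forming one SCC of size |Sigma| = 2,
# while the empty type, where ◇p has ceased to hold, is a sink:
assert (frozenset({FP}), frozenset({FP, P})) in DELTA
assert (frozenset({FP, P}), frozenset({FP})) in DELTA
assert all(t == frozenset() for (s, t) in DELTA if s == frozenset())
```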
Lemma 4.2. Consider the automaton A_ϕ^Σ as a directed graph with set of vertices S and set of edges ∆. Then (i) states s and t are in the same strongly connected component iff s ∼ t; (ii) each strongly connected component has size at most |Σ|; (iii) the dag of strongly connected components has depth at most |ϕ| and outdegree at most 2^{|ϕ|}; (iv) A_ϕ^Σ is deterministic within each strongly connected component, i.e., given transitions (s, t) and (s, u) with s, t and u in the same strongly connected component, we have t = u if and only if λ(t) = λ(u).
Proof. (i) If s ∼ t then by definition of the transition relation ∆ we have that (s, t) ∈ ∆ and likewise (t, s) ∈ ∆. Thus s and t are in the same strongly connected component. Conversely, suppose that s and t are in the same strongly connected component. By clauses (i) and (ii) in the definition of the transition relation ∆, along a cycle we have that ♦⁻ψ ∈ s iff ♦⁻ψ ∈ t and likewise ¬♦ψ ∈ s iff ¬♦ψ ∈ t. But for each formula ψ ∈ cl(ϕ) either s contains ψ or its negation, and similarly for t; it follows that s ∼ t.
(ii) If s ∼ t, then s = t if and only if λ(s) = λ(t). Thus the number of states in an SCC is at most the number |Σ| of labels.
(iii) Suppose that (s, t) ∈ ∆ is an edge connecting two distinct SCCs, i.e., s ≁ t. Then either there is a subformula ♦ψ ∈ s such that ¬♦ψ ∈ t, or a subformula ♦⁻ψ ∈ t such that ¬♦⁻ψ ∈ s. In either case note that the new formula (¬♦ψ, respectively ♦⁻ψ) lies in all states reachable from t. Since there are at most |ϕ| such subformulas, we conclude that the depth of the DAG of SCCs is at most |ϕ|.
(iv) This follows immediately from (i) and (ii).
We proceed to the proof of Theorem 4.1.
Proof. We first treat the case n = 0, i.e., ϕ does not mention ◯ or ◯⁻. Let A_ϕ^Σ = (Σ, S, S₀, ∆, λ, F) be the automaton corresponding to ϕ, as defined above. For each path π = C₀, C₁, ..., C_k of SCCs in the SCC dag of A_ϕ^Σ we define a sub-automaton A_π as follows. A_π has set of states S_π = C₀ ∪ C₁ ∪ · · · ∪ C_k; its set of initial states is S₀ ∩ S_π; its transition relation is ∆_π = ∆ ∩ (S_π × S_π), i.e., the transition relation of A_ϕ^Σ restricted to S_π; its collection of accepting sets is F_π = {F ∩ C_k : F ∈ F}.
It follows from observations (ii) and (iii) in Lemma 4.2 that A_π has at most |ϕ||Σ| states, and from observation (iii) that there are at most 2^{|ϕ|²} such automata. Since A_ϕ^Σ is unambiguous, each accepting run of A_ϕ^Σ yields an accepting run of A_π for a unique path π, and so the languages L(A_π) partition L(A_ϕ^Σ). Finally, A_π is deterministic in the limit since all its accepting states lie in the bottom strongly connected component C_k, and all states in that component are deterministic by Lemma 4.2(iv). If we convert A_π from a generalised Büchi automaton to an equivalent Büchi automaton (using the construction from [Wol01]), then the resulting automaton remains unambiguous and deterministic in the limit. This transformation touches only the bottom strongly connected component of A_π, whose size grows at most quadratically.
This completes the proof in case n = 0. The general case can be handled by reduction to this case. A UTL formula ϕ can be transformed to a normal form in which all next-time and last-time operators are pushed inside the other Boolean and temporal operators. Now the formula can be regarded as a TL[♦, ♦⁻] formula ϕ' over the extended set of propositions P' = {◯^i P, ◯⁻^i P : 0 ≤ i ≤ n, P ∈ P}. Applying the case n = 0 to ϕ' we obtain a family of automata {A'_i} as above; each A'_i is unambiguous and deterministic in the limit, and has at most O(|ϕ'||Σ'|) = O(|ϕ||Σ|^n) states. Now we can construct a transducer T with |Σ|^n states that transforms (in the natural way) an ω-word over alphabet Σ into an ω-word over alphabet Σ'. Determining the letter of Σ' at a given position requires knowledge of the next n input letters, so such a machine can be made deterministic by having T produce its output n positions behind the input. To do this we maintain an n-place buffer in the states of T, which requires |Σ|^n states.
We construct the automaton A_i over alphabet Σ by composing A'_i with T, i.e., by synchronising the output of T with the input of A'_i. The states of the composition are those pairs of states of A'_i and T that are consistent with respect to their label in Σ'. Thus the product has at most O(|ϕ||Σ|^{n+1}) states.
This completes the proof of Theorem 4.1.
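The SCC-path decomposition at the heart of this proof can be illustrated concretely. The following self-contained sketch (toy graph and helper names are ours, not the paper's) condenses an automaton's state graph into its dag of SCCs and enumerates the paths π that begin at a component containing an initial state:

```python
# Toy illustration of the decomposition into sub-automata A_pi: condense the
# state graph into its SCC dag, then enumerate dag paths starting from a
# component that contains an initial state.

def sccs(nodes, edges):
    """SCCs via a reachability fixpoint (adequate for toy sizes)."""
    reach = {v: {v} for v in nodes}
    changed = True
    while changed:
        changed = False
        for (u, v) in edges:
            if not reach[v] <= reach[u]:
                reach[u] |= reach[v]
                changed = True
    comps = []
    for v in nodes:
        comp = frozenset(w for w in nodes if v in reach[w] and w in reach[v])
        if comp not in comps:
            comps.append(comp)
    return comps

def scc_paths(nodes, edges, initial):
    comps = sccs(nodes, edges)
    comp_of = {v: c for c in comps for v in c}
    dag = {c: {comp_of[v] for (u, v) in edges
               if comp_of[u] == c and comp_of[v] != c} for c in comps}
    paths = []
    def extend(path):
        paths.append(path)          # every prefix is itself a path C_0...C_k
        for nxt in dag[path[-1]]:
            extend(path + [nxt])
    for c in comps:
        if initial & c:
            extend([c])
    return paths

# State 0 is initial; {1, 2} form a cycle; 3 is a sink component.
paths = scc_paths({0, 1, 2, 3}, [(0, 1), (1, 2), (2, 1), (1, 3), (3, 3)],
                  initial={0})
```

On this four-state example there are three such paths, hence three sub-automata A_π.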
From Theorem 4.1 we can get a translation of FO2 to automata with bounds as stated below:

Theorem 4.3. Given an FO2 formula ϕ, there is a collection of 2^{2^{poly(|ϕ|)}} generalised Büchi automata A_i, each of size at most 2^{poly(|ϕ|)}, such that the languages they accept partition the language {w ∈ Σ^ω : w |= ϕ}. Moreover, each automaton A_i is unambiguous and can be constructed by a non-deterministic Turing machine running in time polynomial in its size.
Proof. First we apply Lemma 3.10 to translate the FO2 formula ϕ to an equivalent UTL formula ϕ', noting that the size of ϕ' is exponential in the size of ϕ, while the operator depth of ϕ' is polynomial in the quantifier depth of ϕ. We then apply Theorem 4.1 to ϕ'.

4.2. Translation II: From FO2[<] to unambiguous Büchi automata. The previous translation via UTL will be useful for giving bounds on verifying both UTL and FO2. However it does not give insight into the sublanguage FO2[<]. We will thus give another translation specific to this fragment. The main idea for getting upper bounds on verification problems for FO2[<] will be to show that for any FO2[<] formula ϕ, the number of one-variable subformula types realised along a finite or infinite word is polynomial in the size of ϕ. Informally these subformula types are the collections of one-variable subformulas of ϕ that might hold at a given position. Note that the types we consider here are collections of FO2[<] formulas, not temporal logic formulas as in the last section. Also note the contrast with the k-types of Proposition 3.1, which consider all formulas of a given quantifier rank.
Recall that the domain of a word u ∈ Σ* ∪ Σ^ω is the set dom(u) = {i ∈ N : 0 ≤ i < |u|} of positions in u. Given an FO2[<] formula ϕ, let cl(ϕ) denote the set of all subformulas of ϕ with at most one free variable (including atomic predicates). Given a finite or infinite word u ∈ Σ* ∪ Σ^ω and a position i ∈ dom(u), we define the subformula type of u at position i, written τ(u, i), to be the set of FO2[<] formulas ψ(x) ∈ cl(ϕ) such that (u, i) |= ψ(x). We have omitted ϕ in this notation since it will be fixed for the remainder of the proof.
Few Types Property for FO2[<]. We will base our result on the following theorem of Weis [Wei11], showing that FO2[<] formulas divide a finite word into a small number of segments based on subformula type:

Proposition 4.4 [Wei11]. Given an FO2[<] formula ϕ, every finite word u ∈ Σ* can be factored as u = v₁v₂...v_n, where v_i ∈ Σ*, n is polynomial in |ϕ| and |Σ|, and for any two positions i, j lying in the same factor v_k and having the same symbol, τ(u, i) = τ(u, j).
We will need an extension of this result to infinite words:

Proposition 4.5. Given an FO2[<] formula ϕ, every infinite word u ∈ Σ^ω can be factored as u = v₁v₂...v_n, where v_k ∈ Σ* for k < n and v_n ∈ Σ^ω, n is polynomial in |ϕ| and |Σ|, and for any two positions i, j lying within the same factor and having the same symbol we have τ(u, i) = τ(u, j).
Proof. We note that for any u ∈ Σ^ω, from some position onwards the subformula type is determined by the current symbol alone. In fact, the proof of Proposition 3.8 shows that we have u = vw for some prefix v ∈ Σ* of u and w ∈ Σ^ω such that for any two positions i, j of vw with i, j > |v| having the same symbol, τ(vw, i) = τ(vw, j).
Given an infinite word u, we can take a finite prefix v as above and apply Proposition 4.4 to it, adding on the infinite suffix w as one additional factor. Now if i and j both lie in this final factor, then agreement on the symbol determines the entire set of formulas, and hence we are done. Otherwise, fix any two positions i, j ≤ |v| in u with the same symbol a ∈ Σ holding at both i and j, and lying in the same factor within v. We claim that the subformula types τ(u, i) and τ(u, j) contain the same set of formulas. An atomic predicate ψ ∈ cl(ϕ) holds at position i iff it holds at j by assumption, since there is only one symbol true at each position. Positions i and j then by assumption satisfy the same subformula type within v. But using the hypothesis on v we can easily see inductively that a subformula holds at a position within v iff it holds at that position within vw.
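By way of illustration, the following brute-force sketch (the formula, closure, and tuple encoding are our own toy choices, not from the paper) computes the subformula types τ(u, i) of a short word for ϕ(x) = ∃y (x < y ∧ b(y)):

```python
# Brute-force evaluation of a tiny FO2[<] formula under the unary alphabet
# restriction; 'letter', 'lt', 'exists' etc. are our own encoding.

def holds(f, u, env):
    op = f[0]
    if op == 'letter':                      # ('letter', a, var): a holds at var
        return u[env[f[2]]] == f[1]
    if op == 'lt':                          # ('lt', v1, v2): v1 < v2
        return env[f[1]] < env[f[2]]
    if op == 'not':
        return not holds(f[1], u, env)
    if op == 'and':
        return holds(f[1], u, env) and holds(f[2], u, env)
    if op == 'exists':                      # ('exists', var, body)
        return any(holds(f[2], u, dict(env, **{f[1]: j}))
                   for j in range(len(u)))
    raise ValueError(op)

# phi(x) = ∃y (x < y ∧ b(y)): "some b occurs strictly after x".
phi = ('exists', 'y', ('and', ('lt', 'x', 'y'), ('letter', 'b', 'y')))
cl = [('letter', 'a', 'x'), ('letter', 'b', 'x'), phi]  # one-variable closure

def tau(u, i):
    return frozenset(f for f in cl if holds(f, u, {'x': i}))

u = 'ababb'
types = [tau(u, i) for i in range(len(u))]
```

On u = ababb only three distinct types occur, and u factors as (abab)(b): within each factor, positions carrying the same symbol have the same type.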
We now present a result showing that the few subformula types property can be used to get a better translation to automata:

Theorem 4.6. Assume the unary alphabet restriction. Then given an FO2[<] formula ϕ, there is a collection of 2^{poly(|ϕ|,|Σ|)} generalised Büchi automata A_i (each of polynomial size in |ϕ| and |Σ|) such that the languages they accept are disjoint and the union of these languages is exactly {w ∈ Σ^ω : w |= ϕ}. Moreover, each automaton A_i is unambiguous and deterministic in the limit and can be constructed by a non-deterministic Turing machine in polynomial time.
Define a subformula pre-type to be a set τ ⊆ cl(ϕ) such that for each formula ψ ∈ cl(ϕ) exactly one of ψ and ¬ψ belongs to τ, and ψ₁ ∧ ψ₂ ∈ τ iff ψ₁ ∈ τ and ψ₂ ∈ τ. This notion is similar to the notion of "subformula type of a node" used in the prior results, except that a collection of formulas satisfying the above property may not be consistent, since the semantics of existential quantifiers is not taken into account.
In general the formulas in a (subformula) pre-type τ can have either x or y as free variables. We write τ (x) for the subformula pre-type obtained by interchanging x and y in all formulas in τ with y as free variable. Thus all formulas in τ (x) have free variable x. We similarly define τ (y).
An order formula is an atomic formula of the form x = y, x < y, or y < x. Given m, n ∈ N let α_{m,n} denote the unique order formula satisfied by the valuation x, y ↦ m, n. Given a pair of pre-types τ₁, τ₂, an order formula α, and a subformula θ of ϕ, we write τ₁(x), τ₂(y), α |= θ(x, y) to denote that when θ is transformed by replacing top-level subformulas by their truth values as specified by τ₁(x), τ₂(y), or α, the resulting Boolean combination evaluates to true. Note that this implies that if a word w and positions i, j satisfy τ₁(x) ∪ τ₂(y) ∪ {α}, then they also satisfy θ.
A closure labelling is a function f : N → 2^{cl(ϕ)} such that (i) each f(i) is a subformula pre-type whose atomic predicates are exactly those holding at position i, and (ii) for each formula ∃yθ ∈ cl(ϕ) and each i ∈ N, we have ∃yθ ∈ f(i) if and only if f(i)(x), f(j)(y), α_{i,j} |= θ for some j ∈ N. It is easy to see that an ω-word w : N → Σ has a unique extension to a closure labelling, namely f(i) = τ(w, i). We now define a generalised Büchi automaton A_ϕ corresponding to ϕ.
Definition 4.7. The alphabet of A_ϕ is Σ, and the other components of A_ϕ are as follows:

States. The states of A_ϕ are tuples (s, τ, t), where τ ⊆ cl(ϕ) is a pre-type and s, t ⊆ 2^{cl(ϕ)} are sets of pre-types of size at most p(|ϕ|, |Σ|), where p is the polynomial from Proposition 4.5, such that the following consistency condition holds: for each formula ∃yθ ∈ τ we have that either τ(x), τ(y), x = y |= θ, or τ(x), τ'(y), x < y |= θ for some τ' ∈ t, or τ(x), τ'(y), y < x |= θ for some τ' ∈ s. (This condition corresponds to the second clause in the definition of closure labelling.) Informally, a state consists of an assertion about the subformula pre-types seen in the past, the current subformula pre-type, and the subformula pre-types to be seen in the future.
Initial States. A state (s, τ, t) is initial if s = ∅ and ϕ ∈ τ.

Accepting States. There is a set of accepting states F_τ for each pre-type τ. We have (s, τ', t) ∈ F_τ if and only if τ' = τ or τ ∉ t.
Transitions. For each a ∈ Σ there is an a-labelled transition from (s, τ, t) to (s', τ', t') iff (i) the unique proposition contained in τ is a; (ii) s' = s ∪ {τ}; and (iii) τ' ∈ t and t = t' ∪ {τ'}.

The following proposition, whose proof follows straightforwardly from Proposition 4.5, shows that the automaton captures the formula:

Proposition 4.8. If (s₀, τ₀, t₀), (s₁, τ₁, t₁), (s₂, τ₂, t₂), ... is an accepting run of A_ϕ, then the function f : N → 2^{cl(ϕ)} defined by f(n) = τ_n is a closure labelling. Moreover every closure labelling f such that ϕ ∈ f(0) arises from a run of A_ϕ in this manner.
We now analyse the automaton A_ϕ. Because of the polynomial restriction on the number of pre-types, the automaton has at most exponentially many states. But by Proposition 4.5, any accepting run goes through only polynomially many states. For every path π in the DAG of strongly connected components, we take the sub-automaton A_π of A_ϕ obtained by restricting to the components in this path. We claim that this is the required decomposition of A_ϕ. Note that an NP machine can construct these restrictions by iteratively making choices of successor components that are strictly lower in the DAG. Clearly the automata corresponding to distinct paths accept disjoint languages, since they correspond to different collections of pre-types holding in the word. One can show that for any word satisfying the formula, the unique accepting run is the one in which the state at a position corresponds to the pre-types seen before the position, the pre-type seen at the position, and the pre-types seen after the position. In particular, this shows that each automaton is unambiguous. Finally, because the only non-deterministic choice is whether to leave an SCC or not, upon reaching the bottom SCC the automaton is deterministic; hence each automaton is deterministic in the limit. Thus this decomposition witnesses Theorem 4.6.
The above translation of FO2[<] formulas to unambiguous Büchi automata can be extended to handle formulas with successor, i.e., the full logic FO2, at the same time removing the unary alphabet restriction. Given an FO2 formula ϕ over set of predicates P, we can consider an "equivalent" FO2[<] formula ϕ' over a set P' of at most 2^{O(|ϕ||P|)} new predicates. Intuitively each predicate in P' specifies the truth values of all predicates in P in a neighbourhood of radius |ϕ| around the current position. Applying Theorem 4.6 to ϕ' we obtain a collection of doubly-exponentially many automata A_i, each of size exponential in |ϕ| and |Σ|. Thus, we get a weaker version of Theorem 4.3 of the previous subsection, in which the size bound on the component automata has an exponential dependence on the alphabet as well as the formula size.

4.3. Translation III: From FO2 to deterministic parity automata. While the previous translations are useful for relating FO2 to unambiguous automata, for some problems it is useful to have deterministic automata. We now give a translation of FO2 formulas to "small" deterministic parity automata. We give the translation first for the fragment FO2[<] without successor and show later how to handle the full logic. Specifically, we will show:

Theorem 4.9. Given an FO2[<] formula ϕ over set of predicates P with quantifier depth k, there exists a deterministic parity automaton A_ϕ accepting the language L(ϕ) such that A_ϕ has 2^{2^{O(|P|k)}} states and 2^{O(|P|)} priorities, and can be computed from ϕ in time |ϕ|^{O(1)} · 2^{2^{O(|P|k)}}.
The definition of the automaton A_ϕ in Theorem 4.9 relies on the small-model property, as stated in Proposition 3.1. By Lemma 3.9, to know whether u ∈ Σ^ω satisfies an FO2[<] formula of quantifier depth k it suffices to know some k-type such that infinitely many prefixes of u have that type, as well as which letters occur infinitely often in u. We will translate ϕ to a deterministic parity automaton A_ϕ that detects this information. As A_ϕ reads an input string u it stores a representative of the k-type of the prefix read so far. By Proposition 3.1(i) the number of such representatives is bounded by 2^{2^{O(|P|k)}}. Applying Lemma 3.9, we use a parity acceptance condition to determine whether u satisfies ϕ, based on which representatives and input letters occur infinitely often.
We are now ready to formally define A_ϕ. To this end, define the last appearance record of a finite string u = u₀...u_n ∈ Σ* to be the substring LAR(u) := u_{i₁}u_{i₂}...u_{i_m} such that for all k ∈ dom(u) there exists a unique i_j ≥ k with u_{i_j} = u_k. Thus we obtain LAR(u) from u by keeping only the last occurrence of each symbol occurring in u. Write LAR(Σ) for the set {LAR(u) : u ∈ Σ*} of all possible last appearance records. Recall also the set of strings Rep_k(Σ) from Corollary 3.6 that represent the different k-types of strings in Σ*.
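The last appearance record admits a one-line implementation; this small sketch (ours, not the paper's) makes the definition concrete:

```python
# Keep only the last occurrence of each symbol of u, in the order of those
# last occurrences.
def lar(u):
    return ''.join(c for i, c in enumerate(u) if c not in u[i + 1:])
```

For example, lar('abcab') keeps the final c, a, and b in order, giving 'cab'.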
Definition 4.10. Let ϕ(x) be an FO 2 [<]-formula of quantifier depth k. We define a deterministic parity automaton A ϕ as follows.
• The priority of a state (s, ℓ, i), where ℓ = ℓ₁ℓ₂...ℓ_j, is determined by model checking ϕ on the ultimately periodic word with prefix s and period ℓ_iℓ_{i+1}...ℓ_j: the priority is even if this word satisfies ϕ and odd otherwise, with larger values of i giving higher priorities.

It follows from Proposition 3.7 that in a run of A_ϕ on a finite word u = u₀u₁...u_n ∈ Σ* the last state (s, ℓ, i) is such that s has the same k-type as u. Also we note that ℓ is the LAR of u and i is the position in the previous LAR of u_n.
Proposition 4.11. The automaton A_ϕ accepts the language L(ϕ).

Proof. Let u ∈ Σ^ω and let N be as in Proposition 3.8. Suppose that the highest infinitely often occurring priority in the run of A_ϕ on u is even. Then there exists n ≥ N such that the state reached after reading u₀...u_n determines an ultimately periodic word that satisfies ϕ and agrees with u on the data of Lemma 3.9. We conclude that (u, 0) |= ϕ.
Similarly we can show that if the highest infinitely often occurring priority in a run of A_ϕ on u is odd then (u, 0) ⊭ ϕ.
Proposition 4.12. If ϕ over set of monadic predicates P has quantifier depth k, then A_ϕ has at most 2^{2^{O(|P|k)}} states and can be computed from ϕ in time |ϕ|^{O(1)} · 2^{2^{O(|P|k)}}.
Proof. The set of states Rep_k(Σ) has size at most 2^{2^{O(|P|k)}} and can be constructed in time at most 2^{2^{O(|P|k)}} by Corollary 3.6. We can establish the existence of a transition between any pair of states of A_ϕ in time at most 2^{O(|P|k)} by Proposition 3.5. Finally we can compute the priority of a state (s, ℓ, i) by model checking ϕ on a lasso of length at most 2^{O(|P|k)}, which can be done in time |ϕ|^{O(1)} · 2^{O(|P|k)}.
Extension to FO2 with successor. We now extend to successor using the same approach as in the proof of Theorem 4.1. By Lemma 3.10, given an FO2 formula ϕ of quantifier depth k there is an equivalent UTL formula ϕ' of at most exponential size and operator depth at most 2k. Moreover, ϕ' can be transformed to a normal form in which all next-time and last-time operators are pushed inside the other operators. Again, we consider ϕ' also as a TL[♦, ♦⁻] formula over an extended set of predicates P' = {◯^i P_j, ◯⁻^i P_j : P_j ∈ P, i ≤ k}.
By a straightforward transformation we get an equivalent FO2[<] formula ϕ'' over P'. Overall, this transformation creates exponentially larger formulas, but the quantifier depth is only doubled and the set of predicates grows only quadratically. Applying Theorem 4.9 to ϕ'' over the set of predicates P' gives:

Theorem 4.13. Given an FO2 formula ϕ with quantifier depth k, there is a deterministic parity automaton having 2^{2^{O(k²|P|)}} states and 2^{O(k|P|)} priorities that accepts the language L(ϕ).

Models considered
Next we collect together definitions of the various different types of state machine that we consider in this paper. For non-deterministic machines we will be interested in the existence of an accepting path through the machine that satisfies a formula, while for probabilistic models we want to know the probability of such paths.
Kripke Structures, Hierarchical and Recursive State Machines. Our most basic model of non-deterministic computation is a Kripke structure, which is just a graph with an additional set of nodes (the initial states) and a labelling of nodes with a subset of a collection of propositions. The behaviour represented by such a structure is the set of paths through the graph, where paths can be seen as ω-words.
We will look also at more expressive and succinct structures for representing behaviours. A recursive state machine (RSM) consists of component machines A₁, ..., A_k. Each component A_i has a set N_i of nodes and a set B_i of boxes, with distinguished sets En_i of entry nodes and Ex_i of exit nodes; a map Y_i assigns to each box in B_i the index of the component it invokes. The transitions of A_i are pairs (u, v) in which the source u is either a node in N_i or a pair (b, x), where b is a box in B_i and x is an exit node in Ex_j for j = Y_i(b). We require that the destination v be either a node in N_i or a pair (b, e), where b is a box in B_i and e is an entry node in En_j for j = Y_i(b). Informally, an RSM represents behaviours that can transition through a box into the entry node of the machine called by the box, and can transition via an exit node back to the calling box, as with function calls. The semantics can be found in [ABE + 05]. A hierarchical state machine (HSM) is an RSM in which the dependency relation between boxes is acyclic. HSMs have the same expressiveness as flat state machines, but can be exponentially more succinct.
Markov Chains. The basic probabilistic model corresponding to a Kripke structure is a (labelled) Markov chain, specified as M = (Σ, X, V, E, M, ρ), consisting of an alphabet Σ; a set X of states; a valuation V : X → Σ; a set E ⊆ X × X of edges; a transition probability M_{xy} for each pair of states (x, y) ∈ E such that for each state x, ∑_y M_{xy} = 1; and an initial probability distribution ρ on the set of states X.
A Markov chain defines a probability distribution on trajectories, i.e., paths through the chain. Given a language L ⊆ Σ^ω, we denote by P_M(L) the probability of the set of trajectories of M whose image under V lies in L. We consider the complexity of the following model checking problem: given a Markov chain M and an LTL or FO2 formula ϕ, calculate P_M(L(ϕ)). There is a decision version of this problem that asks whether this probability exceeds a given rational threshold.
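For the simplest instances the computation is elementary. The sketch below (a made-up four-state chain, with the property taken to be "eventually b") reduces P_M(L(ϕ)) to reachability and solves the resulting linear system exactly over the rationals:

```python
# Exact computation of P_M(L(phi)) for the toy property "eventually b" on a
# hypothetical chain: states 0, 1 labelled a; state 2 labelled b (absorbing);
# state 3 labelled a (absorbing trap). This reduces to reachability of state 2,
# i.e., to a linear system in the transition probabilities M_xy.
from fractions import Fraction as F

def solve_linear(A, b):
    """Gaussian elimination over Fractions: solve A q = b."""
    n = len(A)
    rows = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                rows[r] = [v - rows[r][col] * w
                           for v, w in zip(rows[r], rows[col])]
    return [rows[r][n] for r in range(n)]

# Transitions: 0 -> 1 (1/2), 0 -> 3 (1/2); 1 -> 2 (1/3), 1 -> 0 (2/3).
# With q_2 = 1 and q_3 = 0 fixed, the unknowns satisfy
#   q_0 = 1/2 * q_1            (the 1/2 * q_3 term vanishes)
#   q_1 = 1/3 + 2/3 * q_0
q0, q1 = solve_linear([[F(1), F(-1, 2)],
                       [F(-2, 3), F(1)]],
                      [F(0), F(1, 3)])
```

Here q0 = 1/4 and q1 = 1/2 exactly, thanks to Fraction arithmetic; with initial distribution concentrated on state 0, P_M(L(ϕ)) = 1/4.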
Recursive Markov Chains. Recursive Markov chains (RMCs) are the analogue of RSMs in the probabilistic context. They are defined as RSMs, except that the transition relation consists of triples (u, p_{u,v}, v) where u and v are as with RSMs, and the p_{u,v} are non-negative reals with ∑_v p_{u,v} ∈ {0, 1} for every u. As with Markov chains, these define a probability distribution on trajectories, but now trajectories are paths which must obey the box-entry/box-exit discipline of an RSM. The semantics of an RMC can be found in [EY05]. A hierarchical Markov chain (HMC) is the probabilistic analogue of an HSM, that is, an RMC in which the calling graph is acyclic. An HMC can be converted to an ordinary Markov chain via unfolding, possibly incurring an exponential blow-up. An example of an RMC is shown in Figure 2.

Markov Decision Processes. We will also deal with verification problems related to control of a probabilistic process by a scheduler. A Markov decision process (MDP) M = (Σ, X, N, R, V, E, M, ρ) consists of an alphabet Σ; a set X of states, which is partitioned into a set N of non-deterministic states and a set R of randomising states; a valuation V : X → Σ; a set E ⊆ X × X of edges; a transition probability M_{xy} for each pair of states (x, y) ∈ E with x ∈ R, such that ∑_y M_{xy} = 1; and an initial probability distribution ρ. This model is considered in [CY95] under the name concurrent Markov chain.
We can view non-deterministic states as being controlled by the scheduler which, given a trajectory leading to a non-deterministic state s, chooses a transition out of s. There are two basic qualitative model checking problems: the universal problem (∀) asks that a given formula be satisfied with probability one for all schedulers; the existential problem (∃) asks that the formula be satisfied with probability one for some scheduler. The latter corresponds to the problem of designing a system that behaves correctly in a probabilistic environment. In the quantitative model checking problem, we ask for the maximal probability of the formula being satisfied on a given MDP when the scheduler chooses optimal moves in the non-deterministic states.
Two-player Games. A two-player game G = (Σ, X, X₁, X₂, V, E, x₀) consists of an alphabet Σ; a set X of states, which is partitioned into a set X₁ of states controlled by Player I and a set X₂ controlled by Player II; a set E ⊆ X × X of transitions; a valuation V : X → Σ; and an initial state x₀.
The game starts in the initial state and then the player who controls the current state, taking into account the whole history of the game, chooses one of the possible transitions. The verification problem of interest is whether Player I has a strategy such that for all infinite plays the induced infinite word u ∈ Σ ω satisfies a given formula ϕ.
Stochastic Two-player Games. A stochastic two-player game (2½-player game) G = (Σ, X, X₁, X₂, R, V, E, M, ρ) consists of an alphabet Σ; a set X of states, which is partitioned into a set X₁ of states controlled by the first player, a set X₂ controlled by the second player, and a set R of randomising states; a valuation V : X → Σ; a set E ⊆ X × X of transitions; a transition probability M_{xy} for each pair of states (x, y) ∈ E with x ∈ R, such that ∑_y M_{xy} = 1; and an initial probability distribution ρ. See Figure 3 for an example.
The universal (∀) qualitative model checking problem asks if the first player can enforce that the infinite word u, induced by the path through the game, satisfies ϕ with probability one.

Verifying non-deterministic systems
Model checking for traditional Kripke structures is fairly well understood. All of our logics subsume propositional logic, and the model checking problems we deal with generalise propositional satisfiability; hence they are all NP-hard. LTL and UTL are both PSPACE-complete [SC82], while TL[♦, ♦⁻] is NP-complete. Translation I shows how to convert an FO2 formula to a disjoint union of exponential-sized automata. An NEXPTIME algorithm can guess such an automaton, take its product with a given Kripke structure, and then determine non-emptiness of the resulting product. Coupled with the hardness argument in [EVW02], this gives an alternative proof of the result of Etessami, Vardi, and Wilke:

Theorem 6.1 [EVW02]. FO2 model checking is complete for NEXPTIME.
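The final non-emptiness check in that algorithm is standard; the sketch below (toy graph and helper names are ours) uses the simple quadratic test "some reachable accepting state lies on a cycle" (the classical nested depth-first search refines this to linear time):

```python
# Büchi non-emptiness on an explicit product graph: succ maps each state to
# its successor list; the language is non-empty iff some accepting state
# reachable from an initial state can return to itself.

def reachable(starts, succ):
    seen, stack = set(), list(starts)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(succ[v])
    return seen

def buchi_nonempty(init, succ, accepting):
    for f in reachable(init, succ) & accepting:
        if f in reachable(succ[f], succ):  # f lies on a non-trivial cycle
            return True
    return False

succ = {0: [1], 1: [2], 2: [1]}
```

Here buchi_nonempty({0}, succ, {2}) holds since state 2 lies on the cycle 1-2, whereas with accepting set {0} the language is empty.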
Below we extend these results to give a comparison of the complexity of model checking for recursive state machines and two-player games, applying all of the translations in the previous section.

Proposition 6.2. FO2[<] model checking of RSMs (and hence of HSMs and Kripke structures) can be done in NP.

Proof. We give the upper bound for RSMs only, since the other classes are special cases. We describe an NP algorithm that checks satisfiability of an FO2[<] sentence ϕ on the language of an RSM M. Model checking the structure involves only combinations of propositions occurring in the structure, and hence by expanding out these combinations explicitly we can assume that the unary alphabet restriction holds. Thus we can apply Translation II, from FO2[<] to Büchi automata (Theorem 4.6). It suffices to check that one of the automata A_i produced by the translation accepts a word produced by M. We can thus guess such an A_i and then check intersection of A_i with M in polynomial time, by forming the product and checking that we can reach an accepting bottom strongly connected component. This reachability analysis can be done efficiently using the "summary edge construction"; see, e.g., [ABE + 05].
In the same way, we can obtain the result for model checking full FO2 on RSMs, but now using the FO2-to-automata translation of Translation I (Theorem 4.3). Again we guess an automaton A_i, which is now of exponential size. Thus we have:

Proposition 6.3. FO2 model checking of RSMs can be done in NEXPTIME.
This result matches the known result for ordinary Kripke structures.
6.2. Two-player games with FO 2 winning condition. Two-player games are known to be in 2EXPTIME for LTL [PR89]. We now show that the same is true for FO 2 , making use of Translation III in the previous section, which translates to deterministic parity automata. We also utilise the fact that a parity game with n vertices, m edges and d priorities can be solved in time O(dmn d ) [Jur00].
From these two results we easily conclude the 2EXPTIME upper bound:

Proposition 6.4. Two-player games with FO2 winning conditions are solvable in 2EXPTIME.
Proof. Using Theorem 4.13, we construct in 2EXPTIME a deterministic parity automaton for the FO2 formula ϕ with doubly exponentially many states and at most exponentially many priorities. By taking the product of this automaton with the graph of the game, we get a parity game with doubly exponentially many states but only exponentially many priorities. (In fact if we define the automaton over an alphabet Σ ⊆ 2^P containing only sets of propositions that occur as labels of states in the game, then polynomially many priorities suffice.) We can then determine the winner in doubly exponential time, again applying the O(d·m·n^d) bound for solving parity games of [Jur00] mentioned above.
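The product parity game can then be handed to any parity-game solver. As a concrete toy illustration we sketch Zielonka's classical recursive algorithm rather than the small-progress-measures algorithm behind the O(d·m·n^d) bound, simply because it is shorter to state; the three-vertex game is our own example. Player 0 wins a vertex iff the highest priority seen infinitely often from it is even:

```python
# Zielonka's recursive algorithm on a tiny max-parity game (made-up example).
owner = {'a': 0, 'b': 1, 'c': 0}       # who moves at each vertex
prio  = {'a': 2, 'b': 1, 'c': 0}       # vertex priorities
succ  = {'a': ['a'], 'b': ['b'], 'c': ['a', 'b']}

def attractor(V, target, p):
    """Vertices from which player p can force a visit to target within V."""
    A = set(target)
    changed = True
    while changed:
        changed = False
        for v in V - A:
            succs = [w for w in succ[v] if w in V]
            if (owner[v] == p and any(w in A for w in succs)) or \
               (owner[v] != p and succs and all(w in A for w in succs)):
                A.add(v)
                changed = True
    return A

def zielonka(V):
    """Return (winning set of player 0, winning set of player 1) on V."""
    if not V:
        return set(), set()
    d = max(prio[v] for v in V)
    p = d % 2                          # player favoured by the top priority
    A = attractor(V, {v for v in V if prio[v] == d}, p)
    W = zielonka(V - A)
    if not W[1 - p]:                   # opponent wins nothing without A
        out = [set(), set()]
        out[p] = set(V)
        return tuple(out)
    B = attractor(V, W[1 - p], 1 - p)
    W2 = list(zielonka(V - B))
    W2[1 - p] |= B
    return tuple(W2)

W0, W1 = zielonka({'a', 'b', 'c'})
```

On this game player 0 wins from a and c (by moving to the even self-loop at a), while player 1 wins from b.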
Combining this with the result of Alur, La Torre, and Madhusudan, who showed that two-player games are 2EXPTIME-hard [ATM03] already for the simplest fragment TL[♦, ♦⁻], along with the fact that we can convert a UTL formula to an FO2 formula in polynomial time, we get 2EXPTIME-completeness:

Corollary 6.5. Deciding two-player games with FO2 winning conditions is complete for 2EXPTIME.
The table below summarises both the known results and the results from this paper (in bold) concerning non-deterministic systems. All bounds are tight.

Verifying probabilistic systems
We now turn to probabilistic systems. Here we will make use of two key properties of the automata produced by the first two translations: unambiguity and determinism in the limit. We will need two lemmas, which show that the complexity bounds for model checking unambiguous Büchi automata on various probabilistic systems are the same as the bounds for deterministic Büchi automata on these systems. First, following [CSS03], we note the following property of unambiguous automata:

Proposition 7.1. Given a Markov chain M and an unambiguous Büchi automaton A, the probability that A accepts a trajectory of M can be computed in polynomial time.

Proof. We define a directed graph M ⊗ A representing the synchronised product of M and A. The vertices of M ⊗ A are pairs (x, s) ∈ X × S with matching propositional labels, i.e., such that V(x) = λ(s); the set of directed edges is {((x, s), (y, t)) : (x, y) ∈ E and (s, t) ∈ ∆}. We say that a strongly connected component (SCC) of M ⊗ A is accepting if (i) for each set of accepting states F ∈ F it contains a pair (x, s) with s ∈ F, and (ii) for each pair (x, s) in the SCC and each transition (x, y) ∈ E, there exists (s, t) ∈ ∆ such that (y, t) is in the same SCC as (x, s). This guarantees that we can stay in the SCC and visit each of its states infinitely often.
Let ξ_{x,s} denote the probability that a trajectory of M beginning at state x expands to an accepting run of A from state s. These probabilities satisfy the following equations: ξ_{x,s} = 1 if (x, s) lies in an accepting SCC of M ⊗ A; ξ_{x,s} = 0 if no accepting SCC is reachable from (x, s); and otherwise ξ_{x,s} = ∑ M_{xy} · ξ_{y,t}, where the sum ranges over all edges ((x, s), (y, t)) of M ⊗ A.
The correctness of the third equation follows from the following calculation, where L(A, s) denotes the set of words accepted by A from state s: P_{M,x}(L(A, s)) = ∑_{(x,y)∈E} M_{xy} · P_{M,y}(⋃_{(s,t)∈∆} L(A, t)) = ∑_{(x,y)∈E} ∑_{(s,t)∈∆} M_{xy} · P_{M,y}(L(A, t)), where splitting the union into a sum uses the fact that, A being unambiguous, an accepted word has an accepting run through exactly one successor state t.
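The whole computation can be traced on a toy instance. In the sketch below the three-state chain, the two-state (deterministic, hence unambiguous) automaton for a⁺b^ω, and all helper names are our own inventions; the accepting-SCC test and the final substitution pass are simplifications that suffice for this example (in general one computes SCCs properly and solves the residual linear system):

```python
# Toy instance of the product construction: chain x0 (a) -> x1 (b, absorbing)
# with prob 1/3, x0 -> x2 (a, absorbing) with prob 2/3; state-labelled
# automaton sa (label a, initial) -> {sa, sb}, sb (label b, accepting) -> sb.
from fractions import Fraction as F

V = {'x0': 'a', 'x1': 'b', 'x2': 'a'}                    # chain labels
Mprob = {('x0', 'x1'): F(1, 3), ('x0', 'x2'): F(2, 3),
         ('x1', 'x1'): F(1), ('x2', 'x2'): F(1)}
lam = {'sa': 'a', 'sb': 'b'}                             # automaton labels
delta = {('sa', 'sa'), ('sa', 'sb'), ('sb', 'sb')}
accepting = {'sb'}

# Product M ⊗ A on label-matching pairs.
nodes = [(x, s) for x in V for s in lam if V[x] == lam[s]]
edges = {(u, v): Mprob[(u[0], v[0])]
         for u in nodes for v in nodes
         if (u[0], v[0]) in Mprob and (u[1], v[1]) in delta}

def reach(v):
    seen, stack = set(), [v]
    while stack:
        w = stack.pop()
        if w not in seen:
            seen.add(w)
            stack.extend(t for (u, t) in edges if u == w)
    return seen

def in_accepting_scc(v):
    # v lies in an SCC containing an accepting state (adequate for this toy).
    return any(a in reach(v) and v in reach(a) and a[1] in accepting
               for a in nodes)

xi = {}
for v in nodes:
    if in_accepting_scc(v):
        xi[v] = F(1)
    elif not any(in_accepting_scc(w) for w in reach(v)):
        xi[v] = F(0)
# Remaining states satisfy xi_v = sum_w M_vw * xi_w; here the residual system
# is acyclic, so one substitution pass suffices.
for v in nodes:
    if v not in xi:
        xi[v] = sum(p * xi[w] for ((u, w), p) in edges.items() if u == v)
```

Starting from x0 the chain emits a b^ω with probability 1/3, and the computed ξ value at (x0, sa) agrees.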
For an RMC M, we can compute reachability probabilities q_{(u,ex)} of exiting a component M_i starting at state u ∈ V_i via exit ex ∈ Ex_i. Etessami and Yannakakis [EY05] show that these probabilities are the unique solution of a system of non-linear equations, which can be found in polynomial space using a decision procedure for the existential theory of the reals. Following [EY05], for every vertex u ∈ V_i we let ne(u) = 1 − ∑_{ex∈Ex_i} q_{(u,ex)} be the probability that a trajectory beginning from node u never exits the component M_i of u. Etessami and Yannakakis [YE05] also show that one can check properties specified by deterministic Büchi automata in PSPACE, while for non-deterministic Büchi automata they give a bound of EXPSPACE. Thus the prior results would give a bound of EXPSPACE for UTL and 2EXPSPACE for FO2. We will improve upon both these bounds. We observe that the technique of [YE05] can be used to check properties specified by non-deterministic Büchi automata that are unambiguous in the same complexity as deterministic ones. This will then allow us to apply our logic-to-automata translations.
Proposition 7.2. Given an unambiguous Büchi automaton A and an RMC M, we can compute the probability that A accepts a trajectory of M in PSPACE.
Proof. Let A be an unambiguous Büchi automaton with set of states Q, transition function ∆ and labelling function λ. Let M be an RMC with valuation V . We define a product RMC M ⊗ A with component and call structure coming from M whose states are pairs (x, s), with x a state of M and s a state of A such that V (x) = λ(s) (i.e., x and s have the same label). Such a pair (x, s) is accepting if s is an accepting state of A. A run through the product chain is accepting if at least one of the accepting states is visited infinitely often. Note that a path through M may expand to several runs in M ⊗ A since A is non-deterministic. For each i, for each vertex x ∈ V i , exit ex ∈ Ex i and states s, t ∈ Q we define p(x, s → ex, t) to be the probability that a trajectory in RMC M that begins from a configuration with state x and some non-empty context (i.e. not at top-level) expands to an accepting run in M ⊗ A from (x, s) to (ex, t).
Just as in the case of deterministic automata, we can compute p(x, s → ex, t) as the solution of a system of non-linear equations: for each internal transition (x, p_{x,x'}, x') of M_i and each automaton transition (s, s') ∈ ∆ with matching labels, p(x, s → ex, t) accumulates the terms p_{x,x'} · p(x', s' → ex, t), with the base case p(ex, t → ex, t) = 1. If x ∈ V_i is the entrance of a box b ∈ B_i then we include equations composing the exit probabilities of the called component with the continuation from the corresponding return port of b. The justification for these equations is as follows. Since A is unambiguous, each trajectory of M expands to at most one accepting run of M ⊗ A. Thus in summing over automaton states s' in the two equations above we are summing probabilities over disjoint events, which correctly gives us the probability of the union of these events.
We now explain how these probabilities can be used to compute the probability of acceptance. We assume without loss of generality that the transition function of A is total.
We construct a finite-state summary chain for the product M ⊗ A exactly as in the case of deterministic automata [YE05]. For each component M_i of M, vertex x of M_i, exit ex ∈ Ex_i and each pair of states s, t of A, the probability of a transition from (x, s) to (ex, t) in the summary chain is calculated from p(x, s → ex, t) after adjusting for the probability ne(x) that M never exits M_i starting at vertex x. Note that since the automaton A is non-blocking, the probability of never exiting the current component of M ⊗ A starting at (x, s) is the same as ne(x) (the probability of never exiting the current component from vertex x in the RMC M alone).
To summarise, we first compute the reachability probabilities q(u,ex) and the probabilities ne(u) for the RMC M. Then we consider the product M ⊗ A and solve a system of non-linear equations to compute the probabilities p(x, s → ex, t) of the summary transitions. From these data we build the summary chain, identify accepting SCCs and compute the resulting probabilities in the same way as in [YE05]. All these steps can be expressed as a formula whose truth value can be decided using the existential theory of the reals in PSPACE.

7.1. Markov chains. We are now ready to prove a new bound for the model checking problem on our most basic probabilistic system, Markov chains. Courcoubetis and Yannakakis [CY95] showed that one can determine if an LTL formula holds with non-zero probability in a Markov chain in PSPACE. This gives a PSPACE upper bound for TL[◇,◇⁻] and an EXPSPACE upper bound for FO2. We will show how to get better bounds, even in the quantitative case, using the logic-to-automata translations.

Proposition 7.4. The probability of an FO2 formula holding on a Markov chain can be computed in PEXP.

Proof. The result follows by the same argument as in Proposition 7.3, as we are essentially in the same situation, but now by Theorem 4.3 we have a collection of doubly-exponentially many automata, each of exponential size.

7.2. Recursive Markov chains.

Proposition 7.5. The probability of an FO2[<] formula holding on an RMC can be computed in PSPACE.

Proof. By Theorem 4.6, we can convert a formula ϕ into an equivalent disjoint union of exponentially many unambiguous automata of polynomial size (in |ϕ| and |Σ|). Using polynomial space we can generate each automaton, calculate the probability that the RMC generates an accepting trajectory by Proposition 7.2, and sum these probabilities over all the automata.

Proposition 7.7. The probability of an FO2 formula holding on an RMC can be computed in EXPSPACE.
Proof. The result follows by the same argument as in Proposition 7.5, but now by Theorem 4.3 we have a family of doubly-exponentially many automata, each of exponential size, with a non-deterministic exponential-time algorithm for building each automaton. Applying Proposition 7.2 to each automaton in turn, we obtain the EXPSPACE upper bound for FO2.
For an ordinary Markov chain, calculating the probability of an LTL formula can be done in PSPACE [Yan10], while we have seen previously that we can calculate the probability of an FO2 formula in PEXP. One can achieve the same bounds for LTL and FO2 on hierarchical Markov chains. In each case we expand the HMC into an ordinary Markov chain and then use the model checking algorithm for a Markov chain. This does not impact the complexity, since the space complexity is only polylogarithmic in the size of the machine for LTL and the time complexity is only polynomial in the machine size for FO2. We thus get:

Proposition 7.8. The probability of an FO2 formula holding on an HMC can be computed in PEXP, while for an LTL formula it can be computed in PSPACE.

7.3. Markov decision processes. Courcoubetis and Yannakakis [CY95] have shown that the maximal probability with which a scheduler can achieve an LTL objective on an MDP can be computed in 2EXPTIME; in particular, this applies to UTL objectives. It follows from results of [ATM03] that even the qualitative problem of determining whether every scheduler achieves probability 1 is 2EXPTIME-hard. Combining the 2EXPTIME upper bound with the exponential translation from FO2 to UTL [EVW02] yields a 3EXPTIME bound for FO2. Below we see that using our FO2-to-automaton construction we are able to improve this bound to 2EXPTIME.

We begin with the universal formulation of qualitative model checking of MDPs. To deal with MDPs, we will make use of determinism in the limit.

Proposition 7.9. Determining whether for all schedulers an FO2[<]-formula ϕ holds almost surely on a Markov decision process M is co-NP-complete.
Proof. The corresponding complement problem asks whether there exists a scheduler σ such that the probability of ¬ϕ is greater than 0. For this problem there is an NP algorithm, as we now explain. Courcoubetis and Yannakakis [CY95] give a polynomial-time algorithm for qualitative model checking of deterministic Büchi automata on MDPs. As noted there, the algorithm applies to automata that are deterministic in the limit as well. Therefore we can simply guess a particular automaton A_i from the family of automata corresponding to ¬ϕ, as described in Theorem 4.6; the theorem guarantees that this automaton is deterministic in the limit. Running the polynomial-time algorithm on M and A_i then checks whether the probability of acceptance is positive.
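The determinism-in-the-limit condition used here is easy to test on a concrete automaton: every state reachable from an accepting state must have at most one successor per letter. A small sketch, with an assumed encoding of automata as transition dictionaries (not the paper's notation):

```python
from collections import deque

def deterministic_in_the_limit(delta, accepting):
    """delta[(state, letter)] = set of successor states.

    An automaton is deterministic in the limit if every state reachable
    from an accepting state has at most one successor per letter.
    """
    succ = {}
    letters = set()
    for (s, a), targets in delta.items():
        succ.setdefault(s, set()).update(targets)
        letters.add(a)
    # Collect states reachable from some accepting state (including them).
    seen, queue = set(accepting), deque(accepting)
    while queue:
        s = queue.popleft()
        for t in succ.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return all(len(delta.get((s, a), ())) <= 1
               for s in seen for a in letters)
```

For example, an automaton that is non-deterministic only before reaching its accepting state passes the check, while one whose accepting state can reach a non-deterministic state fails it.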
It is easy to see that the co-NP bound is tight, even for TL[◇,◇⁻], since qualitative model checking for MDPs generalises validity of TL[◇,◇⁻] formulas, which is co-NP-hard.
Proposition 7.10. Determining whether for all schedulers a UTL-formula ϕ holds almost surely on a Markov decision process M is in EXPTIME. For FO 2 the problem is in co-NEXPTIME.
Proof. The result for FO 2 follows along the lines of the proof of Proposition 7.9, but now we guess an automaton A i of exponential size (using Theorem 4.3).
Similarly, for UTL we can use Theorem 4.1. We still have exponential-sized automata A_i, but only exponentially many of them, so we can iterate over all of them, which gives a singly exponential algorithm.
Note that here the FO 2 problem is easier than the corresponding LTL problem, which is known to be 2EXPTIME-complete.
For the existential case of the qualitative model-checking problem, an upper bound of 2EXPTIME for all of our languages will follow from the quantitative case below. On the other hand, the arguments from [ATM03] can be adapted to get a 2EXPTIME lower bound (see Proposition 7.18) even for qualitative model-checking of TL[◇,◇⁻] in the existential case. Hence we have:

Proposition 7.11. Determining if there is a scheduler that enforces a formula with probability one is 2EXPTIME-complete for each of TL[◇,◇⁻], UTL, LTL, FO2[<] and FO2.
We now turn to the quantitative case. We apply the translation from FO2 to deterministic parity automata from Subsection 4.3, along with the result that the value of a Markov decision process with a parity winning objective can be computed in polynomial time [CH12]. Using Theorem 4.13 we immediately get bounds for FO2 that match the known bounds for LTL:

Proposition 7.12. We can compute the maximum probability of an FO2 formula ϕ over all schedulers on a Markov decision process M in 2EXPTIME.
7.4. Stochastic two-player games with FO2 winning conditions. We can reduce the qualitative case of stochastic two-player games to the case of ordinary two-player games using the following result of Chatterjee, Jurdzinski and Henzinger:

Proposition 7.13 ([CJH03]). Every (universal) qualitative simple stochastic parity game with n vertices, m edges and d priorities can be translated to a simple parity game with the same set of priorities, with O(dn) vertices and O(d(m + n)) edges, and hence it can be solved in time O(d(m + n)(nd)^{d/2}).

Combining the reduction with our results for two-player games, we ascertain the complexity of stochastic two-player games:

Corollary 7.14. The universal qualitative model checking problem for stochastic two-player games (2½-player games) with an FO2 winning condition is 2EXPTIME-complete.

Proof. Hardness follows from 2EXPTIME-hardness for two-player games with FO2 winning conditions. Membership is a consequence of the above reduction and our bounds for two-player games (see Proposition 6.4 and Proposition 7.13).

7.5. Lower bounds. We can get corresponding tight lower bounds for most of the probabilistic model checking problems.

Proposition 7.15. Computing the probability of a TL[◇,◇⁻] formula holding on a Markov chain is at least as hard as #SAT.

Proof. The proof is by reduction from #SAT. Let ϕ be a propositional formula over literals a_1, a_2, ..., a_n. We construct a Markov chain M such that each trajectory generated by M corresponds to an assignment of truth values to the literals a_1, ..., a_n, with each of the 2^n possible truth assignments arising with equal probability. We also construct a TL[◇,◇⁻] formula ψ such that only trajectories of M that encode satisfying valuations contribute to the probability P_M(L(ψ)). Therefore the number of satisfying assignments of the original propositional formula ϕ is 2^n · P_M(L(ψ)).
See Figure 4 for a depiction of the Markov chain M in the case n = 3. All probabilities equal 1/2, except those on transitions leading to the final vertex f. A path going through vertex a_i corresponds to assigning true to the literal a_i, and a path through ā_i to an assignment of false. We construct the TL[◇,◇⁻] formula ψ corresponding to the propositional formula ϕ by replacing each positive literal a_i in ϕ with ◇a_i and each negative literal ¬a_i in ϕ with ◇ā_i.

Proposition 7.16. The quantitative model checking problem for FO2 on Markov chains is PEXP-hard.

Proof. PEXP-hardness is by reduction from the problem of whether a strict majority of the computation paths of a given non-deterministic EXPTIME Turing machine T on a given input I are accepting. The Markov chain generates a uniform distribution over strings of the appropriate length, and the formula checks whether a given string encodes an accepting computation of T. The ability of FO2 to check the validity of such a string has already been exploited in the NEXPTIME-hardness proof for FO2 satisfiability in [EVW02]. The details of this approach can be found in the proof of Proposition 9.3. Combining with the upper bound from Proposition 7.4, the quantitative model checking problem for FO2 on Markov chains is PEXP-complete.
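The counting arithmetic behind the #SAT reduction above can be checked on a toy instance. The helper below simulates the uniform distribution over the 2^n trajectories directly; the example formula and names are illustrative, not from the paper:

```python
from fractions import Fraction
from itertools import product

def sat_count_via_chain(phi, n):
    """Recover #SAT(phi) as 2^n * P_M(L(psi)).

    Each trajectory of the chain fixes one truth assignment, each with
    probability 2^-n, and satisfies psi exactly when the assignment
    satisfies phi.
    """
    p = Fraction(0)                   # p plays the role of P_M(L(psi))
    for assignment in product([False, True], repeat=n):
        if phi(assignment):           # trajectory encodes a satisfying valuation
            p += Fraction(1, 2 ** n)  # each trajectory has probability 2^-n
    return 2 ** n * p

# phi = (a1 or not a2) and a3 has 3 satisfying assignments over 3 variables.
phi = lambda a: (a[0] or not a[1]) and a[2]
```

Here `sat_count_via_chain(phi, 3)` returns 3, matching the identity #SAT(ϕ) = 2^n · P_M(L(ψ)).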
Turning to lower bounds for MDPs, note that co-NEXPTIME-hardness for FO 2 is inherited from the lower bound for Markov chains. On the other hand, we can show that the EXPTIME bound for UTL is tight: Proposition 7.17. Determining whether for all schedulers a UTL-formula ϕ holds almost surely on a Markov decision process M is EXPTIME-hard.
Proof. The argument is based on the idea of Courcoubetis and Yannakakis for lower bounds in the LTL case. We reduce the acceptance problem for an alternating PSPACE Turing machine to the problem of whether there is a scheduler that enforces that a UTL formula ϕ holds with positive probability. Thus we reduce to the complement of the problem of interest.
Consider an alternating PSPACE Turing machine T with input I. Without loss of generality we assume that each configuration of T has exactly two successors and that T uses space at most n on an input I of length n. Then we can encode a branch of the computation tree of T as a finite string in which each configuration is represented by a consecutive block of n + 1 letters: one bit to represent the choice to branch left or right, and n letters to represent the configuration. Let L_T(I) be the language of infinite strings, each of which is an infinite concatenation of finite strings that encode accepting computations. It is standard that one can write a UTL formula ϕ that captures L_T(I).
Next we describe the MDP M. Intuitively the goal of the scheduler is to choose a path through M so as to generate a word in L_T(I). A high-level depiction of M is given in Figure 5. The boxes init-conf and next-conf contain gadgets that are used by the scheduler to generate the initial configuration and all successive configurations of T as strings of length n. The number of such strings is exponential in n, but clearly the gadgets can be constructed using only linearly many states. After producing an existential configuration of the Turing machine, the scheduler sends control to the state sch, where it decides whether T should branch left or right. After generating a universal configuration, an honest scheduler sends control to pro, the only randomising state in M, where the branching direction of T is selected uniformly at random. When the scheduler has successfully generated an accepting computation it visits acc, which is the only accepting state of M, and the simulation starts over again from the beginning. Only those computations that visit acc infinitely often and in which the scheduler behaves honestly satisfy ϕ.
We claim that there exists a scheduler such that P M (L(ϕ)) > 0 if and only if T accepts its input.
If the Turing Machine T accepts its input, then the scheduler can simply follow the strategy from the alternating computation of T . Regardless of the choice made by the probabilistic opponent, the scheduler can always go to an accepting vertex with probability 1. Therefore even if we repeat the whole simulation, for this scheduler P M (L(ϕ)) = 1, which is greater than 0 as required.
The infinite repetition is important in the second case, when the Turing machine T rejects its input. If the simulation ran only once, it could happen that only one of the two probabilistic branching choices leads to a rejecting state, and by chance that branch is never taken. By repeating the simulation infinitely many times we guarantee that with probability 1 the rejecting vertex is eventually reached, after which the run stays there forever, i.e. P_M(L(ϕ)) will be 0 as required.
Combining with the upper bound from Proposition 7.10, determining whether for all schedulers a UTL-formula holds with probability one on a Markov decision process is EXPTIME-complete.

The above was a lower bound for checking whether all schedulers enforce the property with probability 1. We now show a tight lower bound for the existence of a probability-one scheduler:

Proposition 7.18. Given a Markov decision process and a TL[◇,◇⁻] formula, determining whether the formula holds with probability one for some scheduler is 2EXPTIME-hard.
Proof. The proof is an adaptation of the 2EXPTIME-hardness proof of Alur et al. for model checking TL[◇,◇⁻] formulas on two-player games in [ATM03]. The proof there is based on a reduction from the membership problem for an alternating exponential-space Turing machine, where a game graph and a TL[◇,◇⁻] formula are constructed such that the Turing machine accepts the given input if and only if the existential player has a winning strategy in the game.
We can adapt the proof by assigning the existential vertices of the game graph to the scheduler and the universal vertices to the probabilistic player (by setting uniform outgoing probabilities from these vertices). When the Turing machine accepts its input, we are guaranteed that there is a corresponding scheduler that leads to acceptance with probability 1. On the other hand, if the Turing machine does not accept its input then after some finite number of transitions in the Markov decision process, either the scheduler "cheats" (does not follow the Turing machine transition function or cell numbering) or we reach a rejecting state. In both cases, the probability of acceptance is less than 1.

Table 7.5 summarises the known results and the results from this paper (in bold) on probabilistic systems. An asterisk indicates bounds that are not known to be tight. Note that for the more complex verification problems, from strategy synthesis for MDPs onwards, all problems are 2EXP-complete. Intuitively, the complexity of the model overwhelms the difference in the respective logics. Similarly, we see that in the stutter-free case the extra succinctness of FO2[<] comes at "no cost" over TL[◇,◇⁻]: at least for the complexity classes we consider, and where we can establish tight bounds, the respective columns are identical.
We now turn to combining FO2 with automata-based techniques for LTL, examining verification of the hybrid language FO2[LTL]. As was done with FO2, we first show that we can translate FO2[LTL] into temporal logic with an exponential blow-up in the size of the formula, giving a simple upper bound. While for FO2 the translation was to unary temporal logic, in this case we have a translation to LTL_Let. We can look at every FO2[LTL] formula as being rewritable using let definitions such that every let definition involves either a pure FO2 formula or a pure LTL operator. We get this form by introducing a let definition for every subformula with one free variable. For example, rewriting the formula ϕ = ((∃y (suc(x, y) ∧ P_1(x))) U P_0)(x) with let definitions introduces a fresh predicate, say R_1, defined by the pure FO2 formula ∃y (suc(x, y) ∧ P_1(x)), after which the outer formula becomes the pure LTL formula R_1 U P_0. Note that although the above uses a combination of FO2 and LTL, each individual definition is either "pure FO2" or "pure LTL", and we can apply the translation of FO2 to UTL in Lemma 3.10 to each FO2 definition. This gives the following result:

Lemma 8.1. Given an FO2[LTL] formula ϕ, we can convert it to an equivalent LTL_Let formula ψ such that |ψ| = O(2^{|ϕ|^2}).
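The rewriting underlying Lemma 8.1 can be sketched as a simple recursion over formula syntax trees. The AST encoding and the names R0, R1, ... below are illustrative assumptions, not the paper's notation:

```python
# Walk a toy formula AST (nested tuples) and introduce a fresh name for each
# subformula, so that each definition is either a pure FO2 formula or a
# single LTL operator applied to already-defined names.

LTL_OPS = {'U', 'X', 'F', 'G'}

def to_let_form(ast, defs=None):
    """Return (name, defs): `name` abbreviates `ast` given definitions `defs`."""
    if defs is None:
        defs = []
    if ast[0] in LTL_OPS:          # a pure LTL operator: recurse on arguments
        args = tuple(to_let_form(sub, defs)[0] for sub in ast[1:])
        body = (ast[0],) + args
    else:                          # anything else is treated as pure FO2
        body = ast
    name = f"R{len(defs)}"
    defs.append((name, body))
    return name, defs

# The example from the text: (exists y (suc(x,y) and P1(x))) U P0
phi = ('U', ('exists', 'y', 'suc(x,y) & P1(x)'), ('P0',))
```

Running `to_let_form(phi)` produces three definitions: R0 for the FO2 subformula, R1 for P0, and R2 = R0 U R1, each of which is pure in one of the two languages.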
We could then translate the let definitions away for LTL, to get an ordinary LTL formula, thus showing that FO2[LTL] and LTL have the same expressiveness. However, there is no need to perform this second transformation to get a bound on the complexity of model checking. Let definitions do not increase the complexity of model checking LTL, since non-deterministic Büchi automata for LTL and LTL_Let have the same asymptotic size:

Lemma 8.2. Given an LTL_Let formula ϕ, there is an unambiguous Büchi automaton A with at most O(2^{|ϕ|^2}) states accepting exactly the language {w ∈ Σ^ω : w |= ϕ}. Moreover this automaton can be constructed in polynomial time in its size.
This follows from the fact that the number of subformulas of an LTL_Let formula is linear in the formula size (Lemma 2.1) and from the following result of Couvreur et al.:

Lemma 8.3 ([CSS03]). Given an LTL formula ϕ, there is an unambiguous Büchi automaton A with at most O(|Σ| · |sub(ϕ)| · 2^{|sub(ϕ)|}) states accepting exactly the language {w ∈ Σ^ω : w |= ϕ}. Moreover this automaton can be constructed in polynomial time in its size.
As a corollary of Lemmas 8.1 and 8.2 we see that we can convert an FO2[LTL] formula to an unambiguous Büchi automaton in doubly exponential time, giving a doubly exponential bound on the complexity of model checking. However, just as in the previous section, we show that we can do better by direct analysis than via this translation approach.
We begin by looking at the translation given in Lemma 8.1 from a different perspective. Let us extend the set of atomic propositions P and the alphabet Σ = 2^P by adding a new atomic proposition from a set R for every predicate created in that translation. Thus we have an extended alphabet Σ′ = 2^{P∪R}. There is an obvious restriction mapping taking an infinite word w′ over Σ′ to a word over Σ, simply by ignoring all propositions in R; we denote this by restrict(w′, Σ).
Lemma 8.4. Given an FO2[LTL] formula ϕ over alphabet Σ, there is an FO2 formula ϕ_F and an LTL formula ϕ_L over Σ′ having the following two properties for all w ∈ Σ^ω: (i) if w |= ϕ then there is a unique extension w′ of w such that w′ |= ϕ_L ∧ ϕ_F; (ii) if w ⊭ ϕ then there is no extension w′ of w such that w′ |= ϕ_L ∧ ϕ_F. Moreover, |ϕ_L|, |ϕ_F| = O(|ϕ|^2).

Proof. We use the translation in Lemma 8.1, but consider it as simply returning the collection of let definitions. Corresponding to each definition is a conjunct stating that R_i holds iff ϕ_i holds. We now examine the form of this conjunct.
Each ϕ_i is either a basic two-variable formula or an LTL atomic formula. If ϕ_i is in LTL then the iff can be expressed again in LTL: □(R_i ↔ ϕ_i). If ϕ_i is in FO2 then the iff can be expressed as ∀x.(R_i(x) ↔ ϕ_i(x)). We can simply let ϕ_F be the conjunction of the FO2 conjuncts and ϕ_L the conjunction of the LTL conjuncts to obtain the desired conclusion.
For the formula from the example at the beginning of this section we obtain, over Σ′, the FO2 formula ϕ_F = ∀x.(R_1(x) ↔ ∃y (suc(x, y) ∧ P_1(x))) and the LTL formula ϕ_L = R_1 U P_0, giving an equisatisfiable formula ϕ_L ∧ ϕ_F, where ϕ_L is an LTL formula and ϕ_F is an FO2 formula over the extended alphabet Σ′. Now we can build a Büchi automaton B_L for ϕ_L using the construction from Lemma 8.3, as well as a collection of 2^{2^{poly(|ϕ_F|)}} Büchi automata B_i^F for ϕ_F, using Theorem 4.3. For each i we build a product automaton A_i = B_L ⊗ B_i^F synchronising on the truth values of the newly introduced atomic propositions R_i. We claim that each product automaton A_i is unambiguous, the languages they accept are disjoint, and their union is exactly {w ∈ Σ^ω : w |= ϕ}. This follows from the fact that each word over Σ has only one extension to a word over Σ′ accepted by B_L, along with the fact that the languages accepted by the B_i^F are disjoint.
After producing the synchronised cross product, we can restrict the input alphabet back to Σ, because the values of all newly introduced atomic propositions in R are fully determined by the truth values of the atomic predicates P_i and the relations defined by ϕ.
Therefore we get the following theorem:

Theorem 8.5. Given an FO2[LTL] formula ϕ, there is a collection of doubly-exponentially many (in |ϕ|) generalized Büchi automata A_i, each of exponential size in |ϕ|, such that the languages they accept are disjoint and the union of these languages is exactly {w ∈ Σ^ω : w |= ϕ}. Moreover, each automaton A_i is unambiguous and can be constructed by a non-deterministic Turing machine in polynomial time in its size.

Now let us consider model checking Markov decision processes. Recall that in the proof of the corresponding bound for FO2, Proposition 7.10, we relied on the fact that the automata are deterministic in the limit. Thus our translation for FO2[LTL] does not give us the same bounds as for FO2. And indeed, the corresponding bound for checking whether all schedulers achieve probability 1 is worse for LTL in this case, namely doubly exponential. We will show that we can achieve the same bound as for LTL.
Proposition 8.9. Determining whether for all schedulers an FO 2 [LTL]-formula ϕ holds on a Markov decision process with probability one is in the complexity class 2EXPTIME.
Proof. We decide the corresponding complement problem, which asks whether there exists a scheduler σ such that the probability of satisfying ¬ϕ is greater than 0. By applying the translation from Theorem 8.5, we get a collection of doubly-exponentially many automata, each of exponential size. We can go through all these automata and check if the probability is greater than 0 for one of them. For each automaton, we make a call to the exponential-time algorithm for qualitative model checking of Büchi automata on MDPs from Courcoubetis and Yannakakis [CY95]. Each call takes time exponential in the size of the automaton, hence doubly exponential in |ϕ|, and there are only doubly-exponentially many calls, so the whole procedure runs in doubly exponential time.

In the process of examining two-variable logics and their extensions, we have utilized results on logics extended with Let definitions. We now return to considering the impact of Let for several temporal logics. First, we note that model checking TL[◇,◇⁻]_Let, UTL_Let and LTL_Let properties on both non-deterministic (Kripke structures, HSMs, RSMs) and probabilistic systems (Markov chains, HMCs, RMCs, MDPs (∀)) has similar computational complexity as for the corresponding logics without let definitions. We get these results by simply substituting the let definitions to obtain formulas in the base logic, and then analyzing the complexity of model checking the resulting formulas.
In the case of LTL_Let, we have already noted that the size of the automaton for LTL is exponential only in the number of subformulas (see, e.g., Couvreur et al. [CSS03]); this leads to Lemma 8.2. Similarly, for TL[◇,◇⁻]_Let and UTL_Let, we get corresponding automata of the same asymptotic size as for TL[◇,◇⁻] and UTL respectively, because their size depends on the number of subformulas and the operator depth and not directly on the size of the formula (see the translation in Subsection 4.1).
In the case of FO2_Let, we can use Lemma 3.10 to translate the formula to UTL_Let and then use the result above that the sizes of the automata for UTL and UTL_Let formulas of the same length are asymptotically equal. Moreover, since LTL_Let and FO2_Let have unambiguous Büchi automata of the same asymptotic size as for LTL and FO2 respectively, we can combine them in the same way as in the proof of Theorem 8.5 to get the same complexity upper bounds for model checking.

Finally, we will show that, in contrast to the cases above, the complexity of model checking FO2[<]_Let is exponentially worse than that of FO2[<] on both non-deterministic and probabilistic systems. Thus this is the only logic we have considered where the introduction of let definitions makes a difference in the computational complexity of model checking. The following two theorems show lower bounds on the complexity of model checking FO2[<]_Let which match exactly the upper bounds for FO2_Let (compare with Proposition 6.2).
Proposition 9.2. The satisfiability of an FO2[<]_Let formula under the unary alphabet restriction is NEXP-hard.
Proof. The proof is by reduction from the acceptance problem of a non-deterministic EXPTIME Turing machine T on a given input I. Let Γ and Q be respectively the tape alphabet and the set of control states of T. We consider infinite strings over the alphabet Σ := ({P_0, P_1, ..., P_{2n−1}} × (Γ ∪ (Γ × Q))) ∪ {#}.
An infinite word u ∈ Σ^ω encodes a computation of T as follows. Each configuration is encoded in a block of contiguous letters in u, with successive configurations arranged in successive blocks. Each such block comprises 2^n symbols denoting the contents of each tape cell in the configuration. A symbol encoding a tape cell consists of: a letter from Γ ∪ (Γ × Q) to denote the contents of the tape cell and whether the read head of the Turing machine is currently on the cell (and if so, the current control state of T), and a predicate P_i denoting the address of the tape cell and the configuration number. Here we use the power of Let definitions to transform the sequence of 2n predicates into the values of a 2n-bit counter (see the proof of Lemma 3.12), which represents the address of the configuration and tape cell. Having thus encoded a computation of T in a finite prefix of u, we require that the remaining infinite tail of u be the string #^ω.
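The counter bookkeeping in this encoding can be illustrated concretely: an address is the set of predicates P_i that hold at a cell, and consecutive cells must carry consecutive addresses modulo 2^(2n). A sketch under an assumed encoding (not the paper's):

```python
def address_preds(value, n):
    """Indices of the predicates P_i set in the 2n-bit address `value`."""
    return frozenset(i for i in range(2 * n) if (value >> i) & 1)

def counter_consistent(pred_sets, n):
    """Check that successive cells carry successive addresses mod 2^(2n)."""
    values = [sum(1 << i for i in preds) for preds in pred_sets]
    modulus = 1 << (2 * n)
    return all(b == (a + 1) % modulus for a, b in zip(values, values[1:]))
```

For example, with n = 2 the address sequence 14, 15, 0 is consistent (the counter wraps around at 2^4 = 16), while 0, 1, 3 is not.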
We can use short FO2[<]_Let formulas to identify the position in the string representing the previous or next tape cell in the same configuration. We can also use such formulas to identify the position of the same tape cell in the previous or next configuration. Thus we can easily check that the tape symbols are consistent with the transition function of T. Finally, we ensure that T is in an accepting state in the last configuration.
Proposition 9.3. The decision problem of whether a Markov chain M satisfies an FO2[<]_Let formula ϕ with probability greater than 1/2 is PEXP-hard.
Proof. The proof is by reduction from the problem of whether a strict majority of the computation paths of a given non-deterministic EXPTIME Turing machine T on a given input I are accepting. Without loss of generality we can assume that any non-halting configuration of T has exactly two successors and that all computations of T on input I take exactly 2^n steps, where n is the length of I.
The basic idea, following the proof of NEXPTIME-hardness of satisfiability for FO2[<]_Let, is to encode computations of T as strings. We can define an FO2[<]_Let formula that is satisfied by a word u ∈ Σ^ω precisely when u encodes a legitimate computation of T on input I, according to the encoding scheme used in Proposition 9.2; the definition is just as described in the proof of that proposition.
The Markov chain M in our reduction is constructed from two copies of a component M′. The definition of M′ is very simple: it consists of a directed clique augmented with a single sink state. In detail, there is a state s_σ for each letter σ ∈ Σ; s_# is a sink that makes a transition to itself with probability 1; the next-state distribution from s_σ, σ ≠ #, is given by a uniform distribution over all states; finally, the label of state s_σ is σ.
The Markov chain M consists of two disjoint copies M_left and M_right of M′ that are identical except that their states are distinguished by propositions P_left and P_right. The initial distribution of M is uniform over all states.
We can partition Σ^ω into three sets N, A and R, respectively comprising those strings that do not encode computations of T on input I, those that encode accepting computations, and those that encode rejecting computations. Moreover each of these sets is definable in FO2[<]_Let, by formulas ϕ_N, ϕ_A and ϕ_R respectively.
We define the formula ϕ by ϕ := ((∀x P_left(x)) ∧ (ϕ_N ∨ ϕ_A)) ∨ ((∀x P_right(x)) ∧ ϕ_A). To complete the reduction, we claim that P_M(L(ϕ)) > 1/2 if and only if a strict majority of the computations of the Turing machine T on input I are accepting. To see this, observe that if M produces a trajectory from N ⊆ Σ^ω then that trajectory is equally likely to have come from M_left or M_right. Using this we can see that P_M(L(ϕ)) = (P_M(A) + P_M(N))/2 + P_M(A)/2. Thus P_M(L(ϕ)) > 1/2 iff 2·P_M(A) > 1 − P_M(N). From this we see that P_M(L(ϕ)) > 1/2 if and only if |A| > |R|, as required.
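The final arithmetic step can be checked mechanically. The sketch below verifies, over a grid of exact rational values with a + r + nn = 1, that (a + nn)/2 + a/2 > 1/2 exactly when a > r (the function and variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

def equivalence_holds(denominator=20):
    """Check (P(A) + P(N))/2 + P(A)/2 > 1/2  iff  P(A) > P(R).

    Writes a, r, nn for the measures of A, R, N with a + r + nn = 1 and
    tests all rational values with the given denominator.
    """
    for i, j in product(range(denominator + 1), repeat=2):
        a, r = Fraction(i, denominator), Fraction(j, denominator)
        if a + r > 1:
            continue
        nn = 1 - a - r
        acceptance = (a + nn) / 2 + a / 2
        if (acceptance > Fraction(1, 2)) != (a > r):
            return False
    return True
```

Algebraically this is immediate: (a + nn)/2 + a/2 = (1 − r)/2 + a/2, which exceeds 1/2 exactly when a > r.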

Conclusions and ongoing work
In this paper we have compared the complexity of verifying properties in the two best-known elementary fragments of monadic first-order logic on words: LTL and FO2. We provided several different logic-to-automaton constructions that are useful for verification of FO2. One translation allows us to understand the complexity of verifying full FO2 via analysis of unary temporal logic; a second is useful for the sublanguage of FO2 with only the linear order; the third is useful for getting deterministic automata, which are needed for obtaining bounds for certain game-related problems. We have shown that these translations, put together, allow us to understand the complexity of verification and synthesis problems for both non-deterministic and probabilistic transition systems, including those arising from hierarchical and recursive state machines.
While LTL is more expressive than FO2, FO2 can be exponentially more succinct. We have shown that the effect of these opposing factors on the complexity of model checking depends on the model: for example, FO2 has higher complexity on Markov chains, while LTL has higher complexity on MDPs. By contrast, in the stutter-free case the extra succinctness of FO2[<] comes for free: all verification problems have the same complexity as for TL[◇,◇⁻]. For the most structured models, e.g. two-player games and quantitative verification of MDPs, the complexity of the model dominates any difference in the logics.
We are currently examining the succinctness of Let definitions when added to each of our logics. A number of succinctness results can be found in this work, but we have left open the succinctness of Let in certain situations, e.g., for the logic FO 2 [LTL]. Finally, we are investigating the extension of the techniques introduced here from words to trees.