Presburger Arithmetic with algebraic scalar multiplications

We consider Presburger arithmetic (PA) extended by scalar multiplication by an algebraic irrational number $\alpha$, and call this extension $\alpha$-Presburger arithmetic ($\alpha$-PA). We show that the complexity of deciding sentences in $\alpha$-PA is substantially harder than in PA. Indeed, when $\alpha$ is quadratic and $r\geq 4$, deciding $\alpha$-PA sentences with $r$ alternating quantifier blocks and at most $c\ r$ variables and inequalities requires space at least $K 2^{\cdot^{\cdot^{\cdot^{2^{C\ell(S)}}}}}$ (tower of height $r-3$), where the constants $c, K, C>0$ only depend on $\alpha$, and $\ell(S)$ is the length of the given $\alpha$-PA sentence $S$. Furthermore, deciding $\exists^{6}\forall^{4}\exists^{11}$ $\alpha$-PA sentences with at most $k$ inequalities is PSPACE-hard, where $k$ is another constant depending only on~$\alpha$. When $\alpha$ is non-quadratic, already four alternating quantifier blocks suffice for undecidability of $\alpha$-PA sentences.

1. Introduction

1.1. Main results. Let α be a real number. An α-Presburger sentence (short: an α-PA sentence) is a statement of the form

  Q_1 x_1 ∈ Z^{n_1} … Q_r x_r ∈ Z^{n_r} : Φ(x_1, …, x_r),  (1.1)

where Q_1, …, Q_r ∈ {∀, ∃} are r alternating quantifiers, x_1, …, x_r are r blocks of integer variables, and Φ is a Boolean combination of linear inequalities in x_1, …, x_r with coefficients and constant terms in Z[α]. As the number r of alternating quantifier blocks and the dimensions n_1, …, n_r increase, the truth of α-PA sentences becomes harder to decide. In this paper, we study the computational complexity of deciding α-PA sentences.
In the special case r = 2, Q_1 = ∀ and Q_2 = ∃, the sentence S can ask whether projections of integer points in a convex polyhedron P ⊆ R^{k+m} cover all integer points in another polyhedron R ⊆ R^k:

  ∀ y ∈ R ∩ Z^k  ∃ x ∈ Z^m : (x, y) ∈ P.  (1.3)

Here both P and R are defined over Q[α]. When α is rational, the classical problems (1.2) and (1.3) are respectively known as Integer Programming and Parametric Integer Programming (see e.g. [Schr], [Len], [Kan]). Further variations on the theme and an increasing number of quantifiers allow more general formulas with integer valuations of the polytope algebra. For a survey of this area, see Barvinok [Bar].
Recall that classical Presburger arithmetic (PA) is the first-order theory of (Z, <, +), introduced by Presburger in [Pre]. When α ∈ Q, deciding the truth of an α-PA sentence is equivalent to deciding whether or not a PA sentence is true. The latter decision problem has been studied extensively, and we review some of these results below. The focus of this paper is the case when α is irrational, which is implicitly assumed whenever we mention α-PA.
Let α ∈ Q_alg, where Q_alg is the field of real algebraic numbers. We think of α ∈ Q_alg as being given by its defining Z[x]-polynomial of degree d, with a rational interval to single out a unique root. We say that α ∈ Q_alg is quadratic if d = 2. Similarly, the elements γ ∈ Z[α] are represented in the form γ = c_0 + c_1 α + … + c_{d−1} α^{d−1}, where c_0, …, c_{d−1} ∈ Z. For example, α = √2 is quadratic and given by {α^2 − 2 = 0, α > 0}. Thus Z[√2] = {a + b√2 : a, b ∈ Z}. For γ ∈ Z[α], the encoding length ℓ(γ) is the total bit length of the c_i's defined above. Similarly, the encoding length ℓ(S) of an α-PA sentence S is defined to be the total bit length of all symbols in S, with integer coefficients and constants represented in binary.
The only existing result that directly relates to the complexity of deciding α-PA sentences is the following theorem due to Khachiyan and Porkolab, which extends Lenstra's classical result [Len] on Integer Programming in fixed dimensions.
Theorem 1.1 [KP]. For every fixed n, sentences of the form ∃y ∈ Z^n : Ay ≤ b with A ∈ Q_alg^{m×n}, b ∈ Q_alg^m can be decided in polynomial time.
For a bounded number of variables, two important cases are known to be polynomial time decidable, namely the analogues of (1.2) and (1.3) with rational polyhedra P and R. These are classical results by Lenstra [Len] and Kannan [Kan], respectively. Scarpellini [Sca] showed that all ∃^n-sentences are still polynomial time decidable for every fixed n. However, for two alternating quantifiers, Schöning proved in [Schö] that deciding ∃y ∀x : Φ(x, y) is NP-complete. Here Φ is any Boolean combination of linear inequalities in two variables, instead of those in the particular form (1.3). This improved an earlier result by Grädel in [Grä], who also showed that PA sentences with m + 1 alternating quantifier blocks and m + 5 variables are complete for the m-th level of the Polynomial Hierarchy PH. In these results, the number of inequalities (atoms) in Φ is still part of the input, i.e., allowed to vary.
Much of the recent work concerns the most restricted PA sentences, for which the number of alternations (r + 2), the number of variables and the number of inequalities in Φ are all fixed. Thus, the input is essentially a bounded list of integer coefficients and constants in Φ, encoded in binary. For r = 0, such sentences are polynomial time decidable by Woods [Woo]. For r = 1, Nguyen and Pak [NP] showed that deciding ∃^1∀^2∃^2 PA sentences with at most 10 inequalities is NP-complete. More generally, they showed that such sentences with r + 2 alternations, O(r) variables and inequalities are complete for the r-th level of PH. Thus, limiting the "format" of a PA formula does not substantially reduce the complexity. This is our main motivation for the lower bounds in Theorems 1.3 and 1.4 for α-PA sentences.
1.3. Proof outline. Let S1S be the monadic second-order theory of (N, +1), where +1 denotes the usual successor function, and let WS1S be the weak monadic second-order theory of (N, +1), that is, the monadic second-order logic of (N, +1) in which quantification over sets is restricted to quantification over finite subsets. The main result of [H2] states that for quadratic α, one can decide T_α-sentences by translating them into corresponding S1S-sentences, and then deciding the latter. Since α-PA sentences form a subset of all T_α-sentences, this method can be used to decide α-PA sentences. Thus, upper complexity bounds for S1S can theoretically be transferred to deciding α-PA sentences. Moreover, the work in [H2] also shows that one can translate S1S-sentences into T_α-sentences. However, no efficient direct translation between L_α-sentences and S1S-sentences was given in [H1, H2]. Ideally, one would like to do this translation with as few extra alternations of quantifiers as possible. In Theorems 1.2 and 1.3, we explicitly quantify this translation. We strengthen the result from [H2] by showing that one can translate α-PA sentences to WS1S-sentences. The translation then allows us to find upper and lower complexity bounds for deciding α-PA sentences.
The most powerful feature of α-PA sentences is that we can talk about Ostrowski representations of integers, which will be used throughout the paper as the main encoding tool. We first obtain the upper bound in Theorem 1.2 by directly translating α-PA sentences into statements about automata via Ostrowski encoding, and then using known upper bounds for certain decision problems about automata. Next, we show the lower bound for three alternating quantifiers (Theorem 1.4) by a general argument on the Halting Problem with a polynomial space constraint, again using Ostrowski encoding. We generalize this argument to get a lower bound for r ≥ 3 alternating quantifier blocks (Theorem 1.3). For the latter result, we first translate WS1S-sentences to α-PA sentences with only one extra alternation, and then invoke a known tower lower bound for WS1S. Finally, in the proof of Theorem 1.5, we again use the expressibility of Ostrowski representations to reduce the upper bound on the number of alternating quantifier blocks needed for undecidability of α-PA sentences. The use of Ostrowski representations allows us to replace more general arguments from [HTy] by explicit computations, and thereby reduce the quantifier complexity of certain α-PA sentences.
Notation. We use bold notation like x, y to indicate vectors of variables.

Continued fractions and Ostrowski representation
Ostrowski representation and continued fractions play a crucial role throughout this paper. We recall basic definitions and facts in this subsection. We refer the reader to Rockett and Szüsz [RS] for more details and proofs.
A finite continued fraction expansion [a_0; a_1, …, a_k] is an expression of the form

  a_0 + 1/(a_1 + 1/(a_2 + ⋯ + 1/a_k)).

For a real number α, we say [a_0; a_1, …, a_k, …] is the continued fraction expansion of α if α = lim_{k→∞} [a_0; a_1, …, a_k] and a_0 ∈ Z, a_i ∈ N_{>0} for i > 0. For the rest of this subsection, fix a positive irrational real number α and let [a_0; a_1, a_2, …] be the continued fraction expansion of α.
Let k ≥ 1. A quotient p_k/q_k ∈ Q is said to be the k-th convergent of α if p_k ∈ N, q_k ∈ Z, gcd(p_k, q_k) = 1 and p_k/q_k = [a_0; a_1, …, a_k].
It is well known that the convergents of α follow the recurrence relation:

  (p_{−1}, q_{−1}) = (1, 0);  (p_0, q_0) = (a_0, 1);
  p_n = a_n p_{n−1} + p_{n−2},  q_n = a_n q_{n−1} + q_{n−2}  for n ≥ 1.  (2.1)

This can be written in matrix form as:

  ( q_n  q_{n+1} ; p_n  p_{n+1} ) = Γ_0 Γ_1 ⋯ Γ_{n+1},  where  Γ_i := ( 0  1 ; 1  a_i ).  (2.2)

Fact 2.1 [RS, Chapter II.2, Theorem 2]. The set of best rational approximations of α is precisely the set of all convergents {p_k/q_k} of α. In other words, for every p_k/q_k, we have:

  |zα − w| > |q_k α − p_k|  for all w, z ∈ Z with 0 < z < q_k.  (2.3)

The k-th difference of α is defined as β_k := q_k α − p_k. We use the following properties of the k-th difference:

  β_k > 0 if and only if k is even;  (2.4)
  β_n = a_n β_{n−1} + β_{n−2} for n ≥ 1;  (2.5)
  |β_n| < |β_{n−1}|.  (2.6)

These can be easily proved using (2.1). We now introduce a class of numeration systems introduced by Ostrowski [Ost].
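The recurrence (2.1) is straightforward to run. A minimal sketch (the function name `convergents` and the choice α = √2 are illustrative, not from the paper):

```python
def convergents(cf, k):
    """Convergents p_n/q_n of [a_0; a_1, ...] via the recurrence (2.1)."""
    p, q = [1, cf[0]], [0, 1]          # (p_{-1}, q_{-1}) = (1, 0), (p_0, q_0) = (a_0, 1)
    for n in range(1, k + 1):
        p.append(cf[n] * p[-1] + p[-2])
        q.append(cf[n] * q[-1] + q[-2])
    return list(zip(p[1:], q[1:]))     # [(p_0, q_0), (p_1, q_1), ...]

# alpha = sqrt(2) has continued fraction [1; 2, 2, 2, ...]
convs = convergents([1] + [2] * 10, 10)
print(convs[:4])                       # [(1, 1), (3, 2), (7, 5), (17, 12)]
```

Successive ratios p_k/q_k approach √2 ≈ 1.41421 from alternating sides, which is the content of Fact 2.1 and property (2.4).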
Fact 2.2. Every X ∈ N can be written uniquely as

  X = Σ_{n≥0} b_{n+1} q_n,  (2.7)

where b_{n+1} ∈ N, 0 ≤ b_1 < a_1, 0 ≤ b_{n+1} ≤ a_{n+1} for n ≥ 1, and b_n = 0 whenever b_{n+1} = a_{n+1}.

We refer to (2.7) as the α-Ostrowski representation of X. When α is clear from the context, we simply say the Ostrowski representation of X. We also denote the coefficients b_{n+1} in (2.7) by [q_n](X). When X is obvious from the context, we just write [q_n]. We denote by Ost(X) the set of q_n with [q_n](X) > 0.
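The representation (2.7) is witnessed by a greedy algorithm: repeatedly subtract the largest multiple of the largest q_n that fits. A minimal sketch for α = √2 (an illustrative choice; `ostrowski` is our own name):

```python
# continued fraction of sqrt(2) is [1; 2, 2, ...]; denominators q_n: 1, 2, 5, 12, 29, ...
a = [1] + [2] * 12
q = [1, 2]
for n in range(2, 12):
    q.append(a[n] * q[-1] + q[-2])

def ostrowski(X):
    """Greedy digits with X = sum b[n] * q[n], as in (2.7); b[n] is the coefficient of q_n."""
    b = [0] * len(q)
    for n in reversed(range(len(q))):
        if q[n] <= X:
            b[n], X = divmod(X, q[n])
    return b

# the greedy digits automatically satisfy the constraints of (2.7)
for X in range(500):
    b = ostrowski(X)
    assert X == sum(bn * qn for bn, qn in zip(b, q))
    assert b[0] < a[1] and all(b[n] <= a[n + 1] for n in range(1, len(q) - 1))
    assert all(b[n - 1] == 0 for n in range(1, len(q) - 1) if b[n] == a[n + 1])
```

The loop checks both uniqueness-defining digit constraints from (2.7) on the first 500 naturals.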
Observe that a_0 − α ∈ (−1, 0). Let I_α be the interval [a_0 − α, 1 + (a_0 − α)). Define f_α : N → I_α to be the function that maps X to αX − U, where U is the unique natural number such that αX − U ∈ I_α. In other words:

  f_α(X) = αX − ⌊αX − (a_0 − α)⌋.  (2.8)

Let g_α : N → N be the function that maps X to the natural number U satisfying αX − U ∈ I_α. The reader can check that αX = f_α(X) + g_α(X). Here and below, c_0, …, c_{k_α−1} denotes the repeating block in the (eventually periodic) continued fraction expansion of a quadratic α, with minimum period k_α.
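Numerically, g_α is just a shifted floor: αX − U ∈ I_α forces U = ⌊αX − (a_0 − α)⌋. A quick floating-point check for α = √2 (an illustrative choice, adequate at this scale):

```python
import math

alpha, a0 = math.sqrt(2), 1              # a_0 = floor(sqrt(2)) = 1

def g(X):
    """g_alpha(X): the unique U with alpha*X - U in I_alpha = [a_0 - alpha, 1 + a_0 - alpha)."""
    return math.floor(alpha * X - (a0 - alpha))

def f(X):
    """f_alpha(X) = alpha*X - g_alpha(X); always lands in I_alpha."""
    return alpha * X - g(X)

for X in range(1000):
    assert math.isclose(f(X) + g(X), alpha * X, abs_tol=1e-9)   # alpha*X = f(X) + g(X)
    assert a0 - alpha <= f(X) < 1 + (a0 - alpha)                # f(X) in I_alpha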
Fact 2.4. Let i ∈ N. There exist c_i, d_i ∈ Z such that for every n ∈ N with k_α | n, we have:

  (p_{n+i}, q_{n+i}) = c_i (p_n, q_n) + d_i (p_{n+1}, q_{n+1}).

The coefficients c_i, d_i can be computed in time poly(i).
Proof. By (2.2), we have:

  ( q_{n+i}  q_{n+i+1} ; p_{n+i}  p_{n+i+1} ) = Γ_0 ⋯ Γ_{n+1} · Γ_{n+2} ⋯ Γ_{n+i+1}.

Since Γ_{k_α+t} = Γ_t for every t ∈ N and k_α | n, we have Γ_{n+2} ⋯ Γ_{n+i+1} = Γ_2 ⋯ Γ_{i+1}. Let (c_i, d_i) be the first column of Γ_2 ⋯ Γ_{i+1}. This choice immediately gives that

  (q_{n+i}, p_{n+i}) = c_i (q_n, p_n) + d_i (q_{n+1}, p_{n+1}).

Thus (p_{n+i}, q_{n+i}) = c_i (p_n, q_n) + d_i (p_{n+1}, q_{n+1}). Note that c_i, d_i only depend on i and can be computed in time poly(i) by (2.11).
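The proof is effectively an algorithm: c_i and d_i are the first column of Γ_2 ⋯ Γ_{i+1}. A sketch, assuming the matrix form (2.2) with Γ_j = (0 1; 1 a_j), using α = √2 (illustrative; here k_α = 1, so every n ≥ 1 qualifies):

```python
def coeffs(i, a):
    """(c_i, d_i) = first column of Gamma_2 ... Gamma_{i+1}, Gamma_j = [[0,1],[1,a_j]]."""
    M = [[1, 0], [0, 1]]
    for j in range(2, i + 2):
        G = [[0, 1], [1, a[j]]]
        M = [[sum(M[r][k] * G[k][c] for k in range(2)) for c in range(2)] for r in range(2)]
    return M[0][0], M[1][0]

a = [1] + [2] * 20                      # sqrt(2): a_0 = 1, a_j = 2 for j >= 1
p, q = [1, 1], [0, 1]                   # lists hold (p_{-1}, p_0, ...) and (q_{-1}, q_0, ...)
for n in range(1, 16):
    p.append(a[n] * p[-1] + p[-2])
    q.append(a[n] * q[-1] + q[-2])

# Fact 2.4: (p_{n+i}, q_{n+i}) = c_i (p_n, q_n) + d_i (p_{n+1}, q_{n+1})
for n in range(1, 8):
    for i in range(6):
        c, d = coeffs(i, a)
        assert p[n + 1 + i] == c * p[n + 1] + d * p[n + 2]
        assert q[n + 1 + i] == c * q[n + 1] + d * q[n + 2]
```

The double loop verifies the identity of Fact 2.4 for several shifts i and base indices n.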

α-Presburger formulas
Fix some α ∈ R. An α-PA formula is of the form

  Q_1 y_1 ∈ Z^{n_1} … Q_r y_r ∈ Z^{n_r} : Φ(y_1, …, y_r, x),

where Q_1, …, Q_r ∈ {∀, ∃}, Φ is a Boolean combination of linear inequalities in y_1 ∈ Z^{n_1}, …, y_r ∈ Z^{n_r}, x ∈ Z^m with coefficients and constant terms in Z[α], and y_1, …, y_r, x are integer variables; or any logically equivalent first-order formula in the language L_α = {+, 0, 1, <, λ_p : p ∈ Z[α]}, where λ_p is a unary function symbol for multiplication by p ∈ Z[α]. We will denote a generic α-PA formula as F(x), where x are the free variables of F, i.e., those not associated with a quantifier. An α-PA sentence is an α-PA formula without free variables.
Given an α-PA formula F (x) and X ∈ Z |x| , we say F (X) holds (or is true) if the statement obtained by replacing the free variables in F by X and letting the quantified variables y i range over Z n i , is true. We say that a set S ⊆ Z m is α-PA definable (or an α-PA set) if there exists an α-PA formula F (x) such that S = {X ∈ Z |x| : F (X)}.
When α = 0, an α-PA formula is just a classical PA formula. Hence 0-PA is just PA, and therefore decidable. Let S_α = (R, <, +, Z, x ↦ αx). As pointed out in the introduction, the first-order theory T_α of S_α contains all true α-PA sentences. Since T_α is decidable by [H2], we have:

Theorem 3.1. Let α be quadratic. Then α-PA is decidable.
The main difference between the situation when α is rational and when it is irrational is that, when α is irrational, α-PA formulas can express properties of the α-Ostrowski representation of natural numbers. This increases the computational complexity of the decision procedure for α-PA in comparison to the one for PA.
3.1. α-PA formulas for working with Ostrowski representation. Let α be an irrational number, not necessarily quadratic. In this section, we will show that various properties of Ostrowski representations can be expressed using α-PA formulas.
By Fact 2.1, the convergents {p_n/q_n} of α can be characterized by the best approximation property. Namely, u/v with v > 1 is a convergent p_n/q_n for some n ∈ N if and only if

  ∀ w, z : (0 < z < v) → |zα − w| > |vα − u|.  (3.1)

Here gcd(u, v) = 1 is implied, since if k = gcd(u, v) > 1, then |α v/k − u/k| < |αv − u| and 0 < v/k < v. Now consider two consecutive convergents (u, v) = (p_n, q_n) and (u′, v′) = (p_{n+1}, q_{n+1}) for some n ∈ N. For any integers 0 < z < v′ and w, first we have |zα − w| > |v′α − u′|. If moreover |zα − w| < |vα − u|, then we must have v < z < v′. Then among all such pairs (w, z), the one with the minimum |zα − w| must necessarily be another convergent of α, which is impossible since we assumed that (u, v) and (u′, v′) are consecutive. Thus, a necessary and sufficient condition for (u, v) and (u′, v′) to be consecutive convergents is simply:

  C_∀(u, v, u′, v′) := ∀ w, z : (0 < z < v′) → (|zα − w| > |v′α − u′| ∧ |zα − w| ≥ |vα − u|).  (3.2)

Note that C_∀ is a ∀-formula. More generally, consider the α-PA formula C_∀(u_0, v_0, …, u_k, v_k) obtained as the conjunction of the conditions (3.2) over the consecutive pairs (u_i, v_i), (u_{i+1}, v_{i+1}) for 0 ≤ i < k. Then C_∀ is true if and only if (u_0, v_0) = (p_n, q_n), …, (u_k, v_k) = (p_{n+k}, q_{n+k}) for some n with q_n > 1, i.e., they are k + 1 consecutive convergents of α.
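The characterization (3.1) can be checked by brute force for small v. A sketch for α = √2 (illustrative; floating point is adequate at this scale):

```python
import math

alpha = math.sqrt(2)

def is_best_approx(u, v):
    """Check (3.1): no 0 < z < v approximates alpha better than (or as well as) u/v."""
    target = abs(v * alpha - u)
    for z in range(1, v):
        w = round(z * alpha)            # the best integer w for this z
        if abs(z * alpha - w) <= target:
            return False
    return True

# convergents of sqrt(2): 3/2, 7/5, 17/12, 41/29 are all best approximations ...
for (u, v) in [(3, 2), (7, 5), (17, 12), (41, 29)]:
    assert is_best_approx(u, v)
# ... while a non-convergent like 4/3 is beaten by the smaller denominator 2
assert not is_best_approx(4, 3)
```

This is exactly the ∀-condition of (3.1), with the quantifier over w resolved by rounding.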
Also, Z′ is uniquely determined by Z if After_∃ or After_∀ holds.
Proof. This proof is similar to the proofs of Lemmas 4.6, 4.7 and 4.8 in [H2].
In other words, Compatible is satisfied if and only if Ost(X) and Ost(Z) can be directly concatenated at the point v = q n to form Ost(X + Z) (see (2.7)).

Quadratic irrationals: Upper bound
In this section we prove Theorem 1.2. Let α ∈ R be an irrational quadratic number. We will show that for every α-PA formula F(x), there is a finite automaton A such that A accepts precisely those words that are Ostrowski representations of numbers satisfying F. This will then allow us to use an automata-based decision procedure to decide α-PA sentences. It should be emphasized that the tower height in Theorem 1.2 only depends on the number of alternating quantifiers, but not on the number of variables in the sentence.
4.1. Finite automata and Ostrowski representations. We first remind the reader of the definitions of finite automata and recognizability. For more details, we refer the reader to Khoussainov and Nerode [KN]. Let Σ be a finite set. We denote by Σ * the set of words of finite length on Σ.
Definition 4.1. A nondeterministic finite automaton (NFA) A over Σ is a tuple (S, I, T, F), where S is a finite non-empty set, called the set of states of A; I is a subset of S, called the set of initial states; T ⊆ S × Σ × S is a non-empty set, called the transition table of A; and F is a subset of S, called the set of final states of A. An automaton A = (S, I, T, F) is deterministic (a DFA) if I contains exactly one element, and for every s ∈ S and σ ∈ Σ there is exactly one s′ ∈ S such that (s, σ, s′) ∈ T. We say that an automaton A over Σ accepts a word w = w_n … w_1 ∈ Σ* if there is a sequence s_n, …, s_1, s_0 ∈ S such that s_n ∈ I, s_0 ∈ F and (s_i, w_i, s_{i−1}) ∈ T for i = 1, …, n. By the size of an automaton, we mean its number of states. It is well known that recognizability by NFAs and DFAs is equivalent:

Fact 4.2 [KN, Theorem 2.3.3]. If L is recognized by an NFA of size m, then L is recognized by a DFA of size 2^m.
Let Σ be a set containing 0. Let z = (z_1, …, z_n) ∈ (Σ*)^n and let m be the maximal length of z_1, …, z_n. We add to each z_i the necessary number of 0's to get a word z_i′ of length m.
The convolution of z is defined as the word z_1 * ⋯ * z_n ∈ (Σ^n)* whose i-th letter is the element of Σ^n consisting of the i-th letters of z_1′, …, z_n′.
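A convolution sketch in Python (padding on the right is our assumption here; the definition only requires all words to reach length m):

```python
def convolution(*words):
    """Pad words with '0' to a common length m, then zip their letters into tuples."""
    m = max(len(w) for w in words)
    padded = [w + "0" * (m - len(w)) for w in words]
    return [tuple(w[i] for w in padded) for i in range(m)]

print(convolution("101", "11"))        # [('1', '1'), ('0', '1'), ('1', '0')]
```

Each letter of the convolution is a column over the padded words, i.e., an element of Σ^n.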
Fact 4.4. Σ-recognizable sets are closed under Boolean operations and first-order quantifiers. If X_1, X_2 ⊆ (Σ*)^n are recognized by DFAs of size m_1 and m_2, respectively, then:
a) X_1^c is recognized by a DFA of size m_1;
b) X_1 ∩ X_2 and X_1 ∪ X_2 are recognized by DFAs of size m_1 m_2.
Fact 4.5. Let X̃ ⊆ (Σ*)^{n+1} be recognized by a DFA of size m, and let X ⊆ (Σ*)^n be the coordinate projection of X̃. Then X and X^c are recognized by DFAs of size 2^m.

Proof. The set X is recognized by an NFA of size m (see [KN, Theorem 2.3.9]). Thus X is recognized by a DFA of size 2^m by Fact 4.2. It follows from Fact 4.4 that X^c can be recognized by a DFA of size 2^m.
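Closure under intersection (size m_1 m_2 in Fact 4.4(b)) is the standard product construction; a sketch with illustrative toy DFAs (all names are ours):

```python
from itertools import product

ALPHABET = "01"

def product_dfa(D1, D2):
    """Intersection of two DFAs: run both in lockstep on m1*m2 product states."""
    (S1, s1, T1, F1), (S2, s2, T2, F2) = D1, D2
    S = list(product(S1, S2))
    T = {((a, b), c): (T1[a, c], T2[b, c]) for (a, b) in S for c in ALPHABET}
    F = {(a, b) for (a, b) in S if a in F1 and b in F2}
    return S, (s1, s2), T, F

def run(D, word):
    """Does DFA D = (states, start, transitions, finals) accept word?"""
    S, s, T, F = D
    for c in word:
        s = T[s, c]
    return s in F

# D_even: even number of 1's; D_end0: word ends in 0
D_even = (["e", "o"], "e",
          {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}, {"e"})
D_end0 = (["s", "z", "n"], "s",
          {("s", "0"): "z", ("s", "1"): "n", ("z", "0"): "z",
           ("z", "1"): "n", ("n", "0"): "z", ("n", "1"): "n"}, {"z"})
D = product_dfa(D_even, D_end0)
assert run(D, "110") and not run(D, "10")
```

The product automaton has 2 · 3 = 6 states, matching the m_1 m_2 bound of Fact 4.4(b); union is the same construction with final states (a, b) such that a ∈ F_1 or b ∈ F_2.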
Let α be a quadratic irrational. Since the continued fraction expansion [a_0; a_1, a_2, …] of α is periodic, it is bounded. Let M ∈ N be the maximum of the a_i's. Set Σ_α := {0, …, M}. Recall from Fact 2.2 that every N ∈ N can be written uniquely as N = Σ_{n≥0} b_{n+1} q_n. For l ∈ N with q_l > N, let ρ_α(N, l) := b_1 b_2 … b_l ∈ Σ_α* be the corresponding digit word (padded with 0's).

Definition 4.6. Let X ⊆ N^n. We say that X is α-recognized by a finite automaton A over Σ_α^n if the set {ρ_α(N_1, l_1) * ⋯ * ρ_α(N_n, l_n) : (N_1, …, N_n) ∈ X, l_1, …, l_n ∈ N} is recognized by A. We say X is α-recognizable if it is α-recognized by some finite automaton.
It follows easily from general facts about sets recognizable by finite automata that α-recognizable sets are closed under Boolean operations and coordinate projections (see [KN, Chapter 2.3]). A crucial tool is the following result from Hieronymi and Terry [HTe].

Theorem 4.7 [HTe, Theorem B]. Let α be quadratic. Then {(x, y, z) ∈ N^3 : x + y = z} is α-recognizable.
Next, recall f α and g α from (2.8) and Fact 2.3.
Fact 4.8. The set {(x, y) ∈ N^2 : f_α(x) < f_α(y)} is α-recognizable.

It follows easily from (2.4) and Fact 2.3 that whether f_α(x) < f_α(y) can be determined from the Ostrowski representations of x and y (see [H2, Fact 2.13]). It is an easy exercise to construct a finite automaton that α-recognizes this set.

Lemma 4.9. The graph of the function g_α : N → N is α-recognizable.
Proof. We can assume that α is purely periodic, with minimum period k (see Section 2.1). By Fact 2.6, there are μ, μ′ ∈ Q such that p_n = μ q_n + μ′ q_{n+k} for every n ≥ 0.
Since g α (x) is a linear combination of x and Shift(x), we see that g α is α-recognizable.
Here x′ ∈ Z^{d′} is a tuple of auxiliary variables of length d′ ≤ δd. A has the extra property that it accepts at most one tuple (x, x′) ∈ Z^{d+d′} for every x ∈ Z^d. Finally, the constant δ only depends on α.
Proof. Each single variable x in F takes values over Z, but can be replaced by x_1 − x_2 for two variables x_1, x_2 ∈ N. Hence, we can assume that all variables take values over N. It is easy to see that we can assume F to be negation free. Recall that coefficients/constants in Z[α] are given in the form cα + d with c, d ∈ Z. Hence, each inequality in F can be reorganized into the form:

  a·y + α b·z < c·t + α d·w.  (⋆)

Here a, b, c, d are tuples of coefficients in N, and y, z, t, w are subtuples of x. For each homogeneous term a·y, we use an additional variable u = a·y and replace each appearance of a·y in the inequalities by u. Note that the length ℓ(F) grows at most linearly after adding all such extra variables. The atoms in our formulas are then either equalities of the form u = a·y or inequalities of the form u + αv < w + αz. We can rewrite each equality u = a·y into single additions by utilizing binary representations of the coefficients. For example, the equality u = 5y + 2z can be replaced by the following conjunction:

  ∃ y_2, y_4, z_2, u_1 : y_2 = y + y ∧ y_4 = y_2 + y_2 ∧ u_1 = y_4 + y ∧ z_2 = z + z ∧ u = u_1 + z_2.

Note that the number of variables we introduce is linear in the length of the binary representation of a. So ℓ(F) still grows linearly when we introduce the new variables. We have αx = f_α(x) + g_α(x) for every x ∈ Z. Here g_α(x) ∈ Z and f_α(x) always lies in the unit length interval I_α. For u, v, w, z ∈ N, we have u + αv < w + αz if and only if:

  u + g_α(v) < w + g_α(z),  or  u + g_α(v) = w + g_α(z) and f_α(v) < f_α(z).

Now we see that each atom in F can be substituted by a Boolean combination of simpler operations/functions, namely f_α, g_α, single additions x + y = z and comparisons x < y. We collect into a tuple x′ ∈ Z^{d′} all the auxiliary variables introduced in these substitutions. Observe that the total length ℓ(F), which also includes d′, only increases by some linear factor δ = δ(α). By Theorem 4.7, Fact 4.8 and Lemma 4.9, each of the above simpler operations/functions can be recognized by a DFA of constant size. Using Fact 4.4, we can combine those DFAs to get a DFA of size 2^{γℓ(F)} that recognizes F in the sense of (4.1).
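The rewriting of u = a·y into single additions via binary coefficient representations can be sketched mechanically (function and variable names are ours, not the paper's):

```python
def single_additions(coeffs):
    """Rewrite sum_i c_i*y_i as a list of triples (x, l, r) meaning x = l + r,
    doubling each variable along the binary representation of its coefficient."""
    eqs, terms, counter = [], [], iter(range(10**9))
    new = lambda: "t%d" % next(counter)
    for c, y in coeffs:
        cur, bit = y, 1
        while bit <= c:
            if c & bit:
                terms.append(cur)      # this power-of-two multiple is a summand
            bit <<= 1
            if bit <= c:
                nxt = new()
                eqs.append((nxt, cur, cur))   # doubling: nxt = cur + cur
                cur = nxt
    u = terms[0]
    for t in terms[1:]:
        s = new()
        eqs.append((s, u, t))          # accumulate the set bits
        u = s
    return eqs, u                      # u names the final sum

# u = 5y + 2z as a conjunction of single additions
eqs, u = single_additions([(5, "y"), (2, "z")])
env = {"y": 3, "z": 4}
for x, l, r in eqs:
    env[x] = env[l] + env[r]
assert env[u] == 5 * 3 + 2 * 4         # the chain evaluates to 5y + 2z
```

The number of generated equalities is linear in the bit lengths of the coefficients, matching the linear growth of ℓ(F) claimed in the proof.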
Note that for each value x ∈ Z^d of the original variables, the auxiliary x′ ∈ Z^{d′} are uniquely determined by x.
Corollary 4.11. Let S ⊆ N n be α-PA definable. Then S is α-recognizable.
Proof. Follows directly from the above proposition, combined with Fact 4.5 for quantifier elimination.
Proof of Theorem 1.2. Consider an α-PA sentence S of the form (1.1). Without loss of generality, we can change domains from x_i ∈ Z^{n_i} to x_i ∈ N^{n_i}, as shown in the proof of Proposition 4.10. Also, by negating S if necessary, we can assume that Q_r = ∃. By Proposition 4.10, there is an NFA A of size 2^{δℓ(F)} that α-recognizes the quantifier-free part Φ(x_1, …, x_r) in the sense of (4.1). Since Q_r = ∃, we can group x_r and the auxiliary variables x′ into one quantifier block. Repeatedly applying Facts 4.4 and 4.5 one after another, we can successively eliminate all r − 1 remaining quantifier blocks. This blows up the size of A by at most r − 1 exponentiations. The resulting DFA A′ has size at most a tower of height r in δℓ(F). So deciding S is equivalent to deciding whether "Q_1 x_1 : A′ accepts x_1" holds. Note that Q_1 can be ∃ or ∀. However, since A′ is deterministic, we can freely take its complement without blowing up its size. Thus, we can safely assume Q_1 = ∃. Now, viewing A′ as a directed graph, the sentence "∃ x_1 : A′ accepts x_1" can be easily decided by a breadth-first search argument. This can be done in time linear in the size of A′.

1 The case of a sharp inequality can be handled similarly.

4:14    P. Hieronymi, D. Nguyen, and I. Pak    Vol. 17:3
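The final emptiness test is plain graph reachability; a sketch (the transition table and names are our own toy example):

```python
from collections import deque

def accepts_some_word(initial, transitions, final):
    """Decide "exists x : A accepts x": is a final state reachable from an initial one?"""
    adj = {}
    for (s, _symbol), t in transitions.items():
        adj.setdefault(s, set()).add(t)        # forget symbols; keep the directed graph
    seen, queue = set(initial), deque(initial)
    while queue:
        s = queue.popleft()
        if s in final:
            return True
        for t in adj.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return False

T = {(0, "0"): 1, (0, "1"): 0, (1, "0"): 2, (1, "1"): 0, (2, "0"): 2, (2, "1"): 2}
assert accepts_some_word({0}, T, {2})          # e.g. the word "00" reaches state 2
assert not accepts_some_word({0}, T, set())    # no final states: empty language
```

After building the adjacency lists, the BFS visits each state and edge at most once, i.e., time linear in the size of the automaton, as claimed.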

Quadratic irrationals: PSPACE-hardness
We now give a proof of Theorem 1.4. Throughout this section, fix a PSPACE-complete language L ⊆ {0, 1} * and a 1-tape Turing Machine (TM) M that decides L. This means that given a finite input word x ∈ {0, 1} * on its tape T , the Turing machine M will run in space poly(|x|) and output 1 if x ∈ L and 0 otherwise.
The main technical theorem we establish in this section is the following:

Theorem 5.1. Let α ∈ Q_alg be a quadratic irrational. For every s ∈ N, there is a map X : {0, 1}^s → N and an ∃^6∀^4∃^11 α-PA formula Accept such that: Accept(X(x)) holds if and only if x ∈ L.
Both X(x) and Accept can be computed in time O(s^c) for all x ∈ {0, 1}^s and all s ∈ N, where the constant c only depends on α. Furthermore, the number of inequalities in Accept only depends on α but not on s.
The main argument of the proof of Theorem 5.1 translates Turing machine computations into Ostrowski representations of natural numbers. This argument is presented in Subsection 5.3. An explicit bound on the number of variables and inequalities for the constructed sentences is then given in Subsection 5.4, where we also treat the case α = √2. Theorem 1.4 follows.
Before proving Theorem 5.1, we construct in Subsection 5.1 some explicit α-PA formulas to deal with the Ostrowski representation, exploiting the periodicity of the continued fraction expansion of α. Then we recall the definitions of Turing machine computations in Subsection 5.2.
5.1. Ostrowski representation for quadratic irrationals. Let α be a quadratic irrational. Recall from Section 2.1 that we only need to consider a purely periodic α with minimum period k. Set K := lcm(2, k) and keep this K for the remainder of this section.
We first construct an α-PA formula that defines the set of convergents (p_n, q_n) for which K | n.
Recall γ_i from Remark 2.5 (also see Fact 2.4). Now define the α-PA formula D_∀^K(u, v, u′, v′) by imposing the conditions (3.1) and (3.2) on the pairs (γ_i(u, u′), γ_i(v, v′)).

Proof. First, the condition ∀ w, z : (0 < z < γ_{i+1}(v, v′)) → … implies that the pairs (γ_i(u, u′), γ_i(v, v′)), 0 ≤ i ≤ k + 1, are k + 2 consecutive convergents (see (3.1) and (3.2)). In other words, there is an n > 0 such that

  (γ_i(u, u′), γ_i(v, v′)) = (p_{n+i}, q_{n+i})  for 0 ≤ i ≤ k + 1.

So (u, v) = (p_n, q_n) and (u′, v′) = (p_{n+1}, q_{n+1}). Then by (2.12), a_2(u′, v′) + (u, v) must be the next convergent (p_{n+2}, q_{n+2}). Combined with (2.1), we have a_2(u′, v′) + (u, v) = a_{n+2}(u′, v′) + (u, v), which implies a_{n+2} = a_2. Similarly, we have a_{n+i} = a_i for all 2 ≤ i ≤ k + 1. Since k is the minimum period of α, we must have k | n. Also, because 0 < αv − u = αq_n − p_n, we have 2 | n (see (2.4)). Therefore, K | n.

In prenex normal form, D_∀^K is a ∀^2-formula. Next, we can also define the set of convergents q_n for which M | n, where M > 10 is some multiple of K (see Subsection 5.3.1 for why we need M > 10). To do this, we take a large enough prime P. There must exist some multiple M = mK of K for which

  (p_M, q_M) ≡ (p_0, q_0)  and  (p_{M+1}, q_{M+1}) ≡ (p_1, q_1)  (mod P).

To see this, recall from Subsection 2.1 that the coefficients of the recurrence repeat with period K, so the pairs (p_n, q_n) mod P are eventually periodic. Also, by the recurrence (2.1), it then follows that (p_{mK+i}, q_{mK+i}) ≡ (p_i, q_i) (mod P) for every i.
Clearly, if P is large enough then M > 10. Now define:

  D_∀^M(u, v, u′, v′) := D_∀^K(u, v, u′, v′) ∧ (u, v, u′, v′) ≡ (p_0, q_0, p_1, q_1) (mod P).

Note that congruences can be expressed by a ∀-formula with one extra variable². So D_∀^M is a ∀^3-formula in prenex normal form. To summarize: D_∀^M(u, v, u′, v′) holds if and only if there exists t ≥ 1 such that (u, v) = (p_{tM}, q_{tM}) and (u′, v′) = (p_{tM+1}, q_{tM+1}).
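Finding such an M is a finite search: iterate the recurrence (2.1) mod P until the pair of consecutive convergents returns to its starting value at a multiple of K. A sketch for the purely periodic α = 1 + √2 = [2; 2, 2, …] (so k = 1, K = 2; P = 13 is an illustrative prime; `find_M` is our own name):

```python
def find_M(a, K, P):
    """Smallest multiple M of K with (p_M, q_M, p_{M+1}, q_{M+1}) = (p_0, q_0, p_1, q_1)
    mod P, for the purely periodic continued fraction [a; a, a, ...]."""
    start = (a % P, 1, (a * a + 1) % P, a % P)     # (p_0, q_0, p_1, q_1) mod P
    pm, qm, pn, qn = start
    n = 0
    while True:
        pm, pn = pn, (a * pn + pm) % P             # recurrence (2.1) mod P
        qm, qn = qn, (a * qn + qm) % P
        n += 1
        if n % K == 0 and (pm, qm, pn, qn) == start:
            return n

M = find_M(2, 2, 13)
assert M % 2 == 0 and M > 10                       # here M = 28
```

The loop must terminate because the state (p_n, q_n, p_{n+1}, q_{n+1}) mod P ranges over a finite set and the recurrence is invertible, so the orbit is purely periodic.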
Recall from (2.7) that every T ∈ N has a unique Ostrowski representation:

  T = Σ_{n≥0} [q_n](T) · q_n.

We often just write [q_n] when the natural number T is clear from the context.
In the proof of Theorem 1.4, we consider numbers T such that [q_n](T) = 0 when n is odd, and [q_n](T) is either 0 or 1 when n is even. The reader can easily check that there is a bijection between the set of such natural numbers and finite words on {0, 1}.

2 We have x_1 ≡ x_2 (mod P) if and only if ∀w : x_1 − x_2 − Pw = 0 ∨ |x_1 − x_2 − Pw| ≥ P.

We will use

this observation throughout this section. In order to do so, we first observe that the above set of natural numbers is definable by the following α-PA formula ZeroOne_{∀∃}(T), where After and Compatible are as defined in (3.4) and (3.8).

Proof. The statement follows easily from (2.4) and Facts 3.2 and 3.3.
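The bijection with {0, 1}-words is concrete: a word w_0 w_1 … w_{m−1} corresponds to T = Σ_{j : w_j = 1} q_{2j}, and the word is recovered from the greedy Ostrowski digits of T. A sketch for the convergents of √2 (illustrative choice; names are ours):

```python
# q_n for alpha = sqrt(2): 1, 2, 5, 12, 29, 70, ...
q = [1, 2]
while len(q) < 24:
    q.append(2 * q[-1] + q[-2])

def word_to_num(w):
    """w_0 w_1 ... in {0,1}* -> the T with [q_{2j}](T) = w_j and all odd digits 0."""
    return sum(q[2 * j] for j, c in enumerate(w) if c == "1")

def num_to_word(T, length):
    """Greedy Ostrowski digits of T, read off at the even positions."""
    digits = [0] * len(q)
    for n in reversed(range(len(q))):
        if q[n] <= T:
            digits[n], T = divmod(T, q[n])
    assert all(digits[n] == 0 for n in range(1, len(q), 2))   # odd digits vanish
    return "".join(str(digits[2 * j]) for j in range(length))

for w in ["", "1", "101", "0110", "11111"]:
    assert num_to_word(word_to_num(w), len(w)) == w
```

Since all chosen digits are 0 or 1 and the odd positions are empty, the constraints of (2.7) hold automatically, so uniqueness of the Ostrowski representation makes the round trip exact.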
Let T, X be natural numbers. If ZeroOne_{∀∃}(T) and ZeroOne_{∀∃}(X) hold, we can think of T and X as finite words on {0, 1}. Thus, it is natural to ask whether we can express, by an α-PA formula, that the word corresponding to X is a prefix of the word corresponding to T. It is not hard to see that such a formula Pref_{∀∃}(X, T) can be constructed. Note that Pref_{∀∃} is a ∀^4∃^2-formula in prenex normal form.

5.2. Universal Turing machines.
Recall that we fixed a PSPACE-complete language L ⊆ {0, 1}* and a 1-tape Turing machine M that decides L. Neary and Woods [NW] constructed a small universal 1-tape Turing machine (UTM) U = (Q, Σ, σ_1, δ, q_1, q_2), with |Q| = 8 states and |Σ| = 4 tape symbols. Using U, we can simulate M in polynomial time and space. More precisely, let x ∈ {0, 1}* be an input to M. Then we can encode M and x in polynomial time as a string ⟨Mx⟩ ∈ Σ*. Upon input ⟨Mx⟩, the UTM U simulates M on x, and halts in one of two possible configurations, according to whether M accepts or rejects x. It is worth pointing out here that we take this detour via a universal Turing machine to keep the number of variables constant.
Let λ = λ(|x|) be the polynomial bound on the tape length, so that the computation U(⟨Mx⟩) always uses fewer than λ tape positions.
Let x ∈ {0, 1}*. We now consider the simulation U(⟨Mx⟩). Denote by T_i(x) ∈ Σ^{λ−1} the contents of U's tape on step i of the computation U(⟨Mx⟩). For j ≤ λ − 1, T_{i,j}(x) ∈ Σ is the j-th symbol of T_i(x). Also denote by s_i(x) ∈ Q the state of U on step i. We denote the i-th head position of U by π_i(x). Note that 1 ≤ π_i(x) ≤ λ − 1. As usual, we will suppress the dependence on x if x is clear from the context.
Set B := Σ ∪ (Q × Σ) ∪ {[×, ×]}, where × is a special marker symbol; note that |B| = 4 + 32 + 1 = 37. For each step i, we now encode the tape content T_i(x), the state s_i(x) and the tape head position π_i(x) by a finite B-word T_i(x): position j carries the symbol T_{i,j}(x), except at the head position π_i(x), which carries the pair (s_i(x), T_{i,π_i(x)}(x)). The marker block [×, ×] is at the beginning of each T_i(x), and is distinct from the other λ − 1 blocks in T_i(x). Note that T_i(x) has in total λ blocks. Observe that T_1(x) codes the starting configuration ⟨Mx⟩ of the simulation U(⟨Mx⟩). Now we concatenate the T_i over all steps 1 ≤ i ≤ ρ, where ρ is the terminating step of the simulation. We set

  T(x) := T_1(x) T_2(x) ⋯ T_ρ(x).  (5.6)

5.3. Proof of Theorem 5.1. The detailed description of X(x) and Accept will be provided in Subsection 5.3.5. Here, we give an initial outline of the proof. We begin by associating to a transcript T a natural number T(T). Our goal then is to construct an α-PA formula Accept(X) consisting of four subformulas:

  Accept(X) := ∃ T : ZeroOne_{∀∃}(T) ∧ Transcript_{∀∃}(T) ∧ Pref_{∀∃}(X, T) ∧ E_{∃∀}(T).

In this formula, Transcript_{∀∃}(T) ensures that there is a transcript T to which T corresponds. The formula Pref_{∀∃}(X, T) guarantees that X is a prefix of T, and E_{∃∀}(T) says that T ends in "yes". We will need to associate to each x ∈ {0, 1}* an X(x) ∈ N such that the conclusion of Theorem 1.4 holds.
For the rest of the proof, the meaning of c_i, d_i, a, b will change depending on the context. Recall the formulas D_∀^K, D_∀^M, ZeroOne_{∀∃}, Pref_{∀∃} from Section 5.1.

5.3.1. Encoding transcripts. We first encode the transcripts T by a number T ∈ N satisfying ZeroOne_{∀∃}(T). Recall that T is a finite B-word, and observe that |B| = 37. From now on, we view B as a set of 37 distinct strings in {0, 1}^6, each containing at least one 1. Then we pick a large enough prime P in D_∀^M so that M > 10.

Definition 5.5. Let T be a transcript, and let B_t ∈ B be the t-th block in T. We associate to T the natural number T(T) ∈ N such that ZeroOne_{∀∃}(T) holds and, for all t ∈ N, the t-th block of '0'/'1' Ostrowski digits of T(T) spells out the 6-bit string of B_t.

Let r_1 = λM and r_2 = (λ + 1)M. Then the block B_{t+λ} of T corresponds to those [q_{tM+i}](T) with r_1 ≤ i < r_2. By Fact 2.4, we can write each convergent (p_{tM+i}, q_{tM+i}) with r_1 − 1 ≤ i ≤ r_2 as a linear combination c_i(u, v) + d_i(u′, v′). Note that the coefficients c_i, d_i ∈ Z are independent of t, but do depend on λ. They can be computed explicitly in time poly(λ).
Let B = f(B, B′, B″). Then we sum up all q_{tM+r_1+2j} for every 0 ≤ j < 6 such that the j-th bit in B is '1'. This sum can be expressed as av + bv′ for some a, b ∈ Z computable in time poly(λ). Again, c_i, d_i and a, b depend on λ and also on the triple B, B′, B″, but are independent of t. Then B_{t+λ} = B if and only if we can uniquely write T = W_1 + (av + bv′) + W_2, where W_1 < q_{tM+r_1−1} and Ost(W_2) ⊂ {q_n : n ≥ tM + r_2}. Let Z_1 = W_1 + (av + bv′) and Z_2 = W_2. Thus if B_{t+λ} = B, they satisfy:
i) 0 ≤ Z_1 − (av + bv′) < q_{tM+r_1−1};
ii) Ost(Z_2) ⊂ {q_n : n ≥ tM + r_2}.
Both (i) and (ii) can be expressed using quantifier-free α-PA formulas. For (i), recall that q_{tM+r_1−1} is again a linear combination of v, v′, so the corresponding α-PA formula is just a conjunction of two linear inequalities in Z_1, v, v′. For (ii), using an auxiliary variable Z_3, we can express it as After(p_{tM+r_2−1}, q_{tM+r_2−1}, p_{tM+r_2}, q_{tM+r_2}, Z_2, Z_3) (see (3.4)). Thus the final α-PA formula Next_∃^{B,B′,B″} that we want combines (i) and (ii), existentially quantified over Z_1, Z_2, Z_3.  (5.8)

Constructing Read_∃^{B,B′,B″}. Let B, B′, B″ ∈ B, and let u, v, u′, v′ ∈ N and t ≥ 1 be such that (u, v) = (p_{tM}, q_{tM}) and (u′, v′) = (p_{tM+1}, q_{tM+1}). We will construct an α-PA formula Read_∃^{B,B′,B″}(u, v, u′, v′, T) that holds if the three blocks B_{t−1}, B_t, B_{t+1} in T match B, B′, B″. Since the construction is very similar to that of Next_∃^{B,B′,B″}, we leave verifying some of the details to the reader. Note that the blocks B_{t−1} B_t B_{t+1} in T correspond to [q_n](T) with (t − 1)M ≤ n < (t + 2)M. So we just need to express (p_{tM+i}, q_{tM+i}) for −M − 1 ≤ i ≤ 2M as linear combinations c_i(u, v) + d_i(u′, v′). Then we sum up all q_{tM+i} that should correspond to the '1' bits in B, B′, B″, which is again some linear combination av + bv′. This time the coefficients c_i, d_i, a, b do not depend on λ and can be computed in constant time. Now we have B_{t−1} B_t B_{t+1} = B B′ B″ if and only if we can uniquely write T = Z_1 + Z_2, where Z_1 and Z_2 satisfy two conditions (i′)–(ii′) similar to (i)–(ii) above. Again, we can express these two conditions as quantifier-free α-PA formulas, obtaining the α-PA formula we want.  (5.10)

Note that Tran_∃ is an ∃^6-formula. Let c, d ∈ Q be such that q_{(t+λ)M} = c q_{tM} + d q_{tM+1}. To ensure that T obeys the transition rule f : B^3 → B everywhere, we simply impose (5.11). In this formula, D_∀^M(u, v, u′, v′) guarantees that v = q_{tM} is the beginning of some block B_t, and q_{(t+λ)M} = cv + dv′ is the beginning of the block B_{t+λ}, should it not exceed T. Thus, for all T ∈ N, we have that Transcript_{∀∃}(T) holds if and only if there is a transcript T with T(T) = T.
We now argue that Transcript_{∀∃} is a ∀^4∃^6 α-PA formula. First, there are ∀^4 variables u, v, u′, v′. Each Tran^{B,B′,B″}_∃ is an ∃^6-formula, and these quantifiers commute with the big disjunction. Also, ¬D^M_∀ is an ∃^3-formula, which can be merged with the ∃^6 part.⁵ Thus Transcript_{∀∃} is a ∀^4∃^6-formula.
We need one last α-PA formula to say that the computation corresponding to T ends in the "yes" configuration (see (5.5)). Recall that "yes" has fixed length. Assume "yes" starts at v = q_{tM}. Then, just like before, we can sum up all q_{tM+i} that correspond to the '1' bits in "yes". This sum can be written as av + bv′, with a, b ∈ Z explicit constants independent of λ. Also observe that q_{tM−1} = q_{tM+1} − a_1 q_{tM} = v′ − a_1 v. So we define an α-PA formula as follows: (5.12)

Observe that E_{∃∀}(T) holds if and only if the computation corresponding to T ends in "yes". Note that E_{∃∀} is an ∃^5∀^3-formula.

5.3.5. Completing the construction. Finally, given x ∈ {0, 1}*, we can easily construct in time poly(|x|) the content of the first segment T_1(x) in T(x) (see (5.6)). Again, T_1(x) is the starting configuration of the simulation U(⟨M_x⟩), which is basically just M_x. We denote by X(x) the natural number X such that ZeroOne_{∀∃}(X) holds and for all t ∈ N:
(1) [q_t](X) = 0 for t > 10,
(2) [q_0](X)[q_1](X) … [q_{10}](X) = T_1(x).
It is easy to see that we can compute X(x) in time poly(|x|).

⁵ We need to rewrite every implication "a → b" as "¬a ∨ b".
Now construct the α-PA formula Accept(X): (5.13). From the construction, it is clear that Accept is an ∃*∀*∃* α-PA formula such that Accept(X(x)) holds if and only if x ∈ L.
Finally, recall that in all the α-PA formulas constructed in Subsection 5.3.1, and also in Accept, the number of quantifiers/variables is constant, the number of linear inequalities (atoms) depends only on α, and the linear coefficients/constants can be computed in time poly(|x|). This completes the proof of Theorem 5.1.

6. Quadratic irrationals: General lower bound
In this section, we establish Theorem 1.3. Its proof follows the proof of Theorem 5.1 very closely. For a Turing machine M and s ∈ N, recall that in Theorem 1.4 we constructed an ∃^6∀^4∃^11 α-PA formula Accept and a function X : {0, 1}^s → N such that for every input x ∈ {0, 1}^s, Accept(X(x)) holds if and only if M accepts x within space poly(s). Here we show that we can extend the space in which M accepts x in exchange for adding alternating blocks of quantifiers to the α-PA formula. For r, s ∈ N, we define g_0(s) = s and g_{r+1}(s) = g_r(s) 2^{g_r(s)}, r ≥ 0.
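The growth of g_r can be checked with a short sketch (the function name g is ours, used only for illustration):

```python
# Minimal sketch of the tower function from the text:
# g_0(s) = s and g_{r+1}(s) = g_r(s) * 2^{g_r(s)}.

def g(r: int, s: int) -> int:
    """Compute g_r(s); iterating r times yields a tower of height r."""
    val = s
    for _ in range(r):
        val = val * 2 ** val
    return val

assert g(0, 5) == 5          # g_0 is the identity
assert g(1, 3) == 3 * 2**3   # g_1(3) = 24
```

Already g(3, 1) = 2048 and g(4, 1) has more than 600 decimal digits, which is the non-elementary growth driving the lower bound.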
The following is the main theorem we establish in this section. By a basic diagonalization argument, the problem of whether a given Turing machine M halts on an input string x within space g_r(|M| + |x|) itself requires space at least g_r(|M| + |x|) to decide. Theorem 1.3 follows.
Recall that in Next^{B,B′,B″}_∃, if v = q_{tM} and v′ = q_{tM+1}, then the shifted convergent q_{(t+λ)M} can be written as cv + dv′, with c, d ∈ Z having lengths poly(λ). The resulting sentence (5.13) has length poly(λ), and is PSPACE-complete to decide. To prove Theorem 6.1, we need to construct an α-PA formula S_r such that S_r has length poly(λ), at most r − 2 alternating quantifiers, and defines the graph of the shift map Shift_r : q_{tM} ↦ q_{(t+g_r(λ))M}.
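The linearity of the shift map on convergents can be illustrated in the simplest quadratic case. For the golden ratio α = (1 + √5)/2 all partial quotients equal 1, the denominators q_n are Fibonacci numbers, and the coefficients of the shift q_n ↦ q_{n+k} are themselves Fibonacci numbers (a special case of the relation q_{(t+λ)M} = cv + dv′ above; the function names below are ours, not the paper's):

```python
# Hedged sketch: for alpha = golden ratio, q_n = F_{n+1} (Fibonacci),
# and the Fibonacci identity F_{m+k} = F_k F_{m+1} + F_{k-1} F_m gives
# q_{n+k} = c*q_n + d*q_{n+1} with c = F_{k-1}, d = F_k.

def fib(n: int) -> int:
    a, b = 0, 1          # F_0, F_1
    for _ in range(n):
        a, b = b, a + b
    return a

def shift_coeffs(k: int):
    """Return (c, d) with q_{n+k} = c*q_n + d*q_{n+1} for all n."""
    return fib(k - 1), fib(k)

def q(n: int) -> int:
    return fib(n + 1)    # convergent denominators of the golden ratio

c, d = shift_coeffs(7)   # shift by 7 places: (c, d) = (8, 13)
for n in range(10):
    assert q(n + 7) == c * q(n) + d * q(n + 1)
```

Note that c, d have length O(k) in binary, matching the poly(λ) bound on the coefficient lengths stated above.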
The following construction is classical. It was first used by Meyer [Mey] to prove that WS1S has non-elementary decision complexity, and was later improved by Stockmeyer [Sto]. An expository version is given in Reinhardt [Rei]. For clarity and completeness, we reproduce it below in the setting of WS1S. Afterwards, we translate it to α-PA formulas.
6.1. A lower bound for WS1S. Let WS1S be the weak monadic second order theory of (N, +1), that is the monadic second order logic of (N, +1) in which quantification over sets is restricted to quantification over finite subsets of N. Formulas in the language of this theory are called WS1S-formulas. We will use lower case letters x, y, t, u, z to denote variables ranging over N and use upper case letters A, C, D, E to denote variables ranging over finite subsets of N.
We think of each subset X ∈ P_fin(N) as an infinite word over {0, 1} that is eventually 0. When we write X = x_0 x_1 … x_n 0^ω, we mean that X is the finite set {i ∈ {0, …, n} : x_i = 1}. The relation i ∈ X simply means that the i-th digit x_i is 1.
In the statement of Lemma 6.2, the separator | is inserted just to improve readability. The blocks in C represent the integers 0, 1, …, 2^{g_r(λ)} − 1 in binary. The blocks in A mark the beginnings of the blocks in C. The first '1' in A is at position x and the last '1' in A is at position y.⁶ In total, the difference y − x is g_r(λ) 2^{g_r(λ)} = g_{r+1}(λ).
Proof of Lemma 6.2. We construct the F^λ_r recursively, starting with the base case:

Here x + λ represents λ iterations of the successor function s_N. For F^λ_0 we will not need any conditions on A, C.
Let r > 0 and suppose we have already constructed F^λ_r(x, y, A, C) with the desired property. We will exploit the fact that the blocks in C represent the integers 0, 1, …, 2^{g_r(λ)} − 1 in binary, and that adding 1 to the integer represented by one block gives the integer represented by the next block. For that, we will use least-significant-digit-first encoding. We recall the carry rule for addition by 1 in binary: if X = x_0 x_1 … and Y = y_0 y_1 …, then

Σ_{i=0}^∞ y_i 2^i = 1 + Σ_{i=0}^∞ x_i 2^i

if and only if for all i ∈ N the conditions below hold. The two conditions can be expressed by the WS1S-formulas (6.1) and (6.2):

Observe that if we apply these rules on blocks of length g_r(λ), starting with 0…0, then we get:

So the blocks eventually cycle back to 0…0. Thus we will characterize C as the binary word obtained by applying this transformation rule until the block 0…0 is reached again.

⁶ Position indexing starts at 0.
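The least-significant-digit-first increment rule can be sketched as follows (the helper inc is ours): adding 1 turns the leading run of 1s into 0s and the first 0 into a 1, and on fixed blocks of length k the counter visits all 2^k blocks before cycling back to 0…0.

```python
# Sketch of the LSD-first carry rule: flip leading 1s to 0, first 0 to 1.

def inc(block):
    """Increment a fixed-length LSD-first binary block; 1...1 wraps to 0...0."""
    out = list(block)
    for i, bit in enumerate(out):
        if bit == 0:
            out[i] = 1       # first 0 becomes 1; done
            return out
        out[i] = 0           # carry: a leading 1 becomes 0
    return out               # overflow: all-ones wraps to all-zeros

k = 3
block, seen = [0] * k, []
for _ in range(2 ** k):
    seen.append(tuple(block))
    block = inc(block)
assert len(set(seen)) == 2 ** k   # every k-bit block appears exactly once
assert block == [0] * k           # and the counter cycles back to 0...0
```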
For readability, we use commas and semicolons to denote conjunctions of atoms and subclauses in (6.3). Lines 2-3 of (6.3) set up the block structures in A and C. They make sure that A and C are empty outside the range [x, y], and that the blocks in A are of the form 100…. Line 4 of (6.3) expresses the increment rule (6.1) for every two consecutive blocks in C. Here z, w represent the first digits in two consecutive blocks. Line 5 of (6.3) expresses the carry rule (6.2). Line 6 of (6.3) ensures that the blocks in C do not cycle back to 0…0, because their last digits cannot decrease from 1 down to 0. The last line of (6.3) ensures that the last block in C is 1…1.
By induction, it is easy to see that F^λ_r has r alternating quantifier blocks, starting with ∀. Observe that F^λ_{r+1} has 5 more variables than F^λ_r, namely z, w, D, E, t. Therefore, again by induction, F^λ_r has at most 5(r + 1) variables. We can also bound the length ℓ(F^λ_r) by O(r + λ), because we needed to iterate the successor function s_N λ times to represent y = x + λ.

Theorem 6.3. Deciding WS1S-sentences S with k + 3 alternating quantifiers requires space at least ρ 2^{2^{⋯^{2^{η ℓ(S)}}}}, where the tower has height k, and ρ, η > 0 are absolute constants.
Sketch of proof. Consider the following decidable problem: given a Turing machine M and an input string X, does M halt on X within space g_r(|M| + |X|)? By the same construction as in Theorem 5.1, we can write down a WS1S-sentence S of length O(|M| + |X|) such that S holds if and only if M halts on X within space g_r(|M| + |X|). Here λ = Ω(|M| + |X|). The last part [∀u, v, u′, v′ …] in (5.13) should be replaced by:

∀x, y, A, C  F_r(x, y, A, C) → transition rules …
Here x and y are bit positions in the transcript T = U(⟨M, X⟩), with y = x + g_r(λ).⁷ The resulting sentence S has the form ∃ … ∀ … ¬F_r ∨ … Since F_r has r alternating quantifiers, S has r + 2 alternating quantifiers. The length ℓ(S) is roughly the input length |M| + |X| plus ℓ(F_r), which is also O(|M| + |X|).

⁷ Here U is the universal TM used to emulate M(X).

4:24  P. Hieronymi, D. Nguyen, and I. Pak  Vol. 17:3

6.2. Proof of Theorem 6.1. We first translate the WS1S-formula F^λ_r(x, y, A, C) with r alternating quantifiers into an α-PA formula S_r with r + 1 alternating quantifiers. To do this, we replace in F^λ_r each variable x ranging over individuals by a separate quadruple (u_x, v_x, u′_x, v′_x) of integer variables. We replace each variable ranging over sets by an integer variable. We replace the relation x ∈ X in F^λ_r by the relation of whether x is in Ost(X). By (3.6) and (3.7), this relation is definable both by an ∃ α-PA formula and by an ∀ α-PA formula. Recall from Fact 2.4 that there are constants c, d ∈ Z such that if v = q_{tM} and v′ = q_{tM+1}, then q_{(t+1)M} = cv + dv′ for all t ∈ N. We replace every x + 1 term in F^λ_r using this linear relation on the quadruples. Observe that S_0 has just O(1) terms, instead of O(λ) terms like F^λ_0. By induction, S_r has O(r) inequalities and variables. The total length ℓ(S_r) (including symbols and integer coefficients) is still O(r + λ).
Because of the D^M_∀ predicate, S_0 now has one quantifier. For r > 0, we can merge the ∀ quantifiers in the D^M_∀ predicates with the ∀z, w, t, … quantifiers in F^λ_r (of course, replaced by quadruples). Because x ∈ Ost(X) is definable both by an ∃ α-PA formula and by an ∀ α-PA formula, the body of the sentence S_{r+1}, consisting of Boolean combinations of ∈ / ∉, can be written using only ∀ quantifiers. These extra ∀ quantifiers can again be merged into the ∀z, w, t part. This means S_{r+1} has only one more alternating quantifier than S_r. So S_r(u_x, v_x, u′_x, v′_x, u_y, v_y, u′_y, v′_y, A, C) is an α-PA formula with r + 1 alternating quantifier blocks.

Now we are back to encoding Turing machine computations. We give a brief outline of how the construction follows the proof of Theorem 5.1. In the definition of Transcript_{∀∃} in (5.11), we replace [∀u, v, u′, v′ …] by:

In these transition rules, Read^{B,B′,B″}_∃ is kept as before with u_x, v_x, u′_x, v′_x, but Next^{B,B′,B″}_∃ can be rewritten using the shifted convergents u_y, v_y, u′_y, v′_y. Altogether, this expresses the transition rule for each jump y = x + g_r(λ). The resulting formula Transcript has the form ∀ … ¬S_r ∨ … Since S_r has r + 1 alternating quantifiers, this formula has r + 2 alternating quantifiers. We now construct Accept_{M,s,r} using Transcript as in (5.13). This α-PA formula has r + 3 alternating blocks of quantifiers, and the number of variables and inequalities used is just O(r).

7. Non-quadratic irrationals: Undecidability
In this section, we consider the case that α is non-quadratic. As pointed out in the introduction, it follows from [HTy] that α-PA is undecidable whenever α is non-quadratic. Here we will show that even α-PA sentences with only four alternating quantifier blocks are undecidable. We prove a slightly stronger result, for which we have to introduce an extension of α-PA. Let K be a subfield of R. A K-Presburger sentence (short: K-PA sentence) is a statement of the form

Q_1 x_1 ∈ Z^{n_1} … Q_r x_r ∈ Z^{n_r}  Φ(x_1, …, x_r),   (7.1)

where Q_1, …, Q_r ∈ {∀, ∃} are r alternating quantifiers, and Φ is a Boolean combination of linear inequalities in x_1, …, x_r with coefficients and constant terms in K. We define K-PA formulas and other relevant notions analogously to the case of α-PA sentences in Section 3. The following is the main result of this section.
When α is non-quadratic, the numbers 1, α, α² are Q-linearly independent. As Q(α, α²) = Q(α), Theorem 1.5 follows from Theorem 7.1.

7.1. Further tools. In this section we work with two different irrationals α and β, and we will need to refer to the Ostrowski representations based on both α and β. We denote by p_n/q_n and p′_n/q′_n the n-th convergents of α and β, respectively. Let Ost_α := {q_n : n ∈ N} and Ost_β := {q′_n : n ∈ N}. For X ∈ N, denote by Ost_α(X) the set of q_n with non-zero coefficients in the α-Ostrowski representation of X. Then Ost_β(X) is defined accordingly for the β-Ostrowski representation of X. All earlier notations can be easily adapted to α and β separately. For brevity, we define the remaining functions and notations just for α. The corresponding versions for β are defined accordingly, with the obvious relabelings.
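The Ostrowski representation of an integer X can be computed greedily from the convergent denominators. As a hedged illustration (the helper ostrowski is ours, not the paper's formal definition): for the golden ratio this is exactly the Zeckendorf representation, a sum of non-consecutive Fibonacci numbers.

```python
# Sketch: greedy Ostrowski representation for alpha = golden ratio,
# whose convergent denominators are the Fibonacci numbers. The greedy
# choice automatically picks non-consecutive denominators (Zeckendorf).

def ostrowski(X: int, denoms):
    """Greedily write X as a sum over denoms (given in decreasing order)."""
    parts = []
    for q in denoms:
        if q <= X:
            parts.append(q)
            X -= q
    assert X == 0          # denoms must reach down to 1
    return parts

# Fibonacci denominators up to 1000 (starting 1, 2 to avoid the repeated 1).
fibs = [1, 2]
while fibs[-1] < 1000:
    fibs.append(fibs[-1] + fibs[-2])

assert ostrowski(100, sorted(fibs, reverse=True)) == [89, 8, 3]
```

Here Ost_α(100) = {3, 8, 89}: the set of denominators carrying a non-zero digit, as in the notation above.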
The function f α defined in (2.8) and its interaction with the corresponding function f β play a crucial role. We collect two easy facts about f α here.
Proof. Let Σ_{n=0}^{m} b_{n+1} q_n be the α-Ostrowski representation of X. Without loss of generality, we may assume that α q_m − p_m > 0. Then set Z_2 = X + q_{m+2} and Z_1 = X + q_{m+3}.
Since α q_{m+2} − p_{m+2} > 0 and α q_{m+3} − p_{m+3} < 0, we get from Fact 2.3 that

Now it follows easily from [H2, Fact 2.13] and Fact 2.3 that for all Y ∈ N

Fact 7.3. Let X ∈ N and let J ⊆ R be an open interval around f_α(X). Then there is

Let Y ∈ N be such that Y α q_{n+2} = X. It is left to show that f_α(Y) ∈ J. By Fact 2.3 and [H2, Fact 2.13] we get that

7.2. Uniform definition of all finite subsets of N². Let α, β be two positive irrational numbers such that 1, α, β are Q-linearly independent. The goal of this section is to produce a 6-ary Q(α, β)-PA formula Member such that for every set S ⊆ N² there is X ∈ N⁴ such that for all (s, t) ∈ N²,

(s, t) ∈ S ⟺ Member(X, s, t).

The Q-linear independence of 1, α, β is necessary, as we will see that the existence of such a relation implies the undecidability of the theory. The failure of our argument in the case of Q-linear dependence of 1, α, β can be traced back to the fact that the following lemma fails when 1, α, β are Q-linearly dependent.
Definition 7.5. Define g : N⁴ → R to be the function that maps (X, Y) to

Definition 7.6. Let Best be the relation on N × N × N² × N that holds precisely for all tuples (d, e, X, Y_1) for which there is a Y_2 ∈ N such that

Observe that for given (d, e, X) ∈ N × N × N² there is at most one Y_1 ∈ N_{≤d} such that Best(d, e, X, Y_1) holds. We will later see in Lemma 7.8 that for given d ∈ N we can take e ∈ N large enough such that for all X_1 ∈ N and Y_1 ≤ d the set

This implies the result.
The following lemma is crucial in what follows. It essentially says that for every subinterval of I_α ∩ I_β and every d ∈ Ost_α, we can recover (Ost_α)_{≤d} just using parameters from this interval and Ost_β. This should be compared to condition (ii) in [HTy, Th. A].
Lemma 7.8. Let d ∈ Ost_α, e_0 ∈ Ost_β, X ∈ N² and s ∈ N be such that

Then there is e ∈ Ost_β and an open interval J ⊆ (f_α(X_1), f_α(X_2)) such that e ≥ e_0 and for all Z ∈ N,

f_α(Z) ∈ J ⟹ Best(d, e, X_1, Z, s).
Proof. Let e ∈ Ost_β be large enough such that for every w_1 ∈ N_{≤d} there is w_2 ∈ N_{≤e} such that

The existence of such an e follows from the finiteness of N_{≤d} and the density of f_β(N) in I_β. Let w ∈ N_{≤e} be such that

By Lemma 7.4 we can find an ε > 0 such that for all (w_1, w_2) ∈ N_{≤d} × N_{≤e} with (w_1, w_2) ≠ (s, w),

It is left to show that Best(d, e, X_1, Z, s) holds. We have that for all (w_1, w_2) ∈ N_{≤d} × N_{≤e} with (w_1, w_2) ≠ (s, w),

Moreover,

Thus Best(d, e, X_1, Z, s) holds, as desired.
Lemma 7.9. Let d ∈ Ost_α, s ∈ N, X ∈ N² be such that

By Lemma 7.8 there is an open interval J ⊆ (f_α(X_1), f_α(X_2)) and e_1 ∈ Ost_β such that e_1 > d and for all Z ∈ N,

f_α(Z) ∈ J ⟹ Best(d, e_1, X_1, Z, s).
Observe that we require the sequences (k_i)_{i=0,…,2n} and (l_i)_{i=1,…,2n} to be increasing sequences of non-consecutive natural numbers. Therefore the above description of Z_1, Z_2 and Z_3 immediately gives us the α-Ostrowski representations of Z_1 and Z_3 and the β-Ostrowski representation of Z_2. In particular, Ost_β(Z_2) = {q′_{l_i} : i = 1, …, 2n} and Ost_α(Z_3) = {q_{k_i} : i = 0, …, 2n, i even}. It is now left to prove that for all s, t ∈ N,

(s, t) ∈ S ⟺ Member(Z_1, Z_2, Z_3, Z_4, s, t).
"⇒": Let (s, t) ∈ S. Let i ∈ {1, . . . , 2n} be such that (s, t) = (c i , c i+1 ). Observe that i is odd. We show that holds. By (7.3) and the fact that i − 1 is even, we have that Thus by (4) Since d 4 ∈ Ost α (Z 3 ) and d 1 ≤ d 4 < d 2 , it follows that d 4 = d 1 = q k i−1 and that i is odd. Since e 1 , e 2 ∈ Ost β (Z 2 ) and we get from (3) that e 1 = q l i and e 2 = q l i+1 . Thus by (7.5) By (4) we get that s = c i and t = c i+1 . Since i is odd, (s, t) = (c i , c i+1 ) ∈ S.
Proof. For Admissible (Definition 7.10), we replace each variable d_i, which earlier represented some convergent q_n ∈ Ost_α, by a 6-tuple

and require that C_{∀,α}(d_i) holds, in order to guarantee (7.6).
Here v_i takes the earlier role of d_i. Similarly, we replace each e_i in Admissible by a 6-tuple e_i and also require that C_{∀,β}(e_i) holds. Here C_{∀,α} and C_{∀,β} are from (3.3), with the extra subscript α or β indicating which irrational is being considered. These C_{∀,α} and C_{∀,β} conditions can be combined into a ∀²-part. Altogether, the new Admissible has 42 variables.
It is now easy to see that Admissible is ∃*∀*-definable, and so is Member. A direct count reveals that Admissible is at most ∃^{50}∀^{10}, and Member is at most ∃^{100}∀^{10}.
7.3. Proof of Theorem 7.1. Here we follow an argument given in the proof of Thomas [Tho, Th. 16.5]. Let U = (Q, Σ, σ_1, δ, q_1, q_2) be a universal 1-tape Turing machine with 8 states and 4 symbols, as given in [NW]. Here Q = {q_1, …, q_8} are the states, Σ = {σ_1, …, σ_4} are the tape symbols, σ_1 is the blank symbol, q_1 is the start state, and q_2 is the unique halt state. Also, δ : [8] × [4] → [8] × [4] × {±1} is the transition function. In other words, we have δ(i, j) = (i′, j′, d) if upon reading symbol σ_j in state q_i, the machine changes to state q_{i′}, writes symbol σ_{j′}, and moves left (d = −1) or right (d = 1). Given an input x ∈ Σ*, we will now produce an ∃*∀*∃*∀* α-PA sentence ϕ_x such that ϕ_x holds if and only if U(x) halts.
We will now use sets A_1, …, A_8 ⊆ N² and B_1, …, B_4 ⊆ N² to code the computation of U(x). The A_i's code the current state of the Turing machine: for (s, t) ∈ N², we have (s, t) ∈ A_i if and only if at step s of the computation, U is in state q_i and its head is over the t-th cell of the tape. The B_j's code which symbols are written on the tape at a given step of the computation: we have (s, t) ∈ B_j if and only if at step s of the computation, the symbol σ_j is written on the t-th cell of the tape. The computation U(x) then halts if and only if there are A_1, …, A_8 ⊆ N² and B_1, …, B_4 ⊆ N² such that:

a) The A_i's are pairwise disjoint; the B_j's are pairwise disjoint.
b) (0, 0) ∈ A_1, i.e., the computation starts in the initial state.
c) There exists some (u, v) ∈ A_2, i.e., the computation eventually halts.
d) For each s ∈ N, there is at most one t ∈ N such that (s, t) ∈ ∪_i A_i, i.e., at each step of the computation, U is in only one state with its head in one position.
e) If x = x_0 … x_n ∈ Σ*, then for every 0 ≤ t ≤ n, we have x_t = σ_j ⟺ (0, t) ∈ B_j, i.e., the first rows of the B_j's code the input string x.
f) Whenever (s, t) ∈ B_j:
   f1) if (s, t) ∉ A_i for all i ∈ [8], then (s + 1, t) ∈ B_j. That is, if the current head position is not at t, then the t-th symbol does not change.
   f2) if (s, t) ∈ A_i for some i ∈ [8] and δ(i, j) = (δ¹_{ij}, δ²_{ij}, δ³_{ij}) ∈ [8] × [4] × {±1}, then (s + 1, t) ∈ B_{δ²_{ij}} and (s + 1, t + δ³_{ij}) ∈ A_{δ¹_{ij}}. That is, if the head position is at t and the state is i, then a transition rule is applied.

We use the predicate Member to code membership (s, t) ∈ A_i, B_j. By Theorem 7.11, there exist tuples X_i = (X_{i1}, …, X_{i4}), Y_j = (Y_{j1}, …, Y_{j4}) ∈ N⁴ that represent A_i and B_j. In other words, we have

(s, t) ∈ A_i ⟺ Member(X_i, s, t),   (s, t) ∈ B_j ⟺ Member(Y_j, s, t).
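The encoding of a run as sets A_i, B_j can be sketched concretely. The tiny two-state machine below is ours, purely for illustration (it is not the universal machine U from [NW]); we record its run as sets and check conditions a), b), c), f1) above.

```python
# Hedged sketch: record a made-up 1-tape Turing machine's run as
#   A[i] = {(step s, head position t) : machine in state q_i},
#   B[j] = {(step s, cell t) : cell t holds symbol sigma_j},
# then verify a few of the tableau conditions from the text.

# states: 1 = start, 2 = halt; symbols: 1 = blank, 2
# delta[(state, symbol)] = (new state, symbol written, head move)
delta = {(1, 1): (1, 2, 1), (1, 2): (2, 2, 1)}

A = {i: set() for i in (1, 2)}
B = {j: set() for j in (1, 2)}
tape = {0: 1, 1: 1, 2: 2}          # input: blank, blank, sigma_2
state, pos, last = 1, 0, 3
for s in range(last + 1):
    A[state].add((s, pos))          # record state and head position
    for t, sym in tape.items():     # record the full tape row
        B[sym].add((s, t))
    if state == 2:                  # halt state reached
        break
    nstate, nsym, move = delta[(state, tape.get(pos, 1))]
    tape[pos] = nsym
    state, pos = nstate, pos + move

# a) pairwise disjointness of the A_i and of the B_j
assert A[1].isdisjoint(A[2]) and B[1].isdisjoint(B[2])
# b) the run starts in the initial state; c) it reaches the halt state
assert (0, 0) in A[1] and A[2]
# f1) cells away from the head keep their symbol at the next step
heads = A[1] | A[2]
for j in (1, 2):
    for (s, t) in B[j]:
        if (s, t) not in heads and s < last:
            assert (s + 1, t) in B[j]
```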
For the input condition e), there exist Z_j = (Z_{j1}, …, Z_{j4}) ∈ N⁴ such that

x_t = σ_j ⟺ Member(Z_j, 0, t)  for all 0 ≤ t ≤ n.

Note that the Z_j can be explicitly constructed from the input x (see the proof of Theorem 7.11). Now the sentence ϕ_x that encodes the halting of U(x) is:

Since Member is ∃*∀*-definable, the sentence ϕ_x is ∃*∀*∃*∀*. Whether U(x) halts is undecidable, and hence so is deciding ϕ_x. A direct count shows that Member appears at most 200 times in ϕ_x. From the last estimate in the proof of Lemma 7.12, we see that ϕ_x is at most an ∃^k∀^k∃^k∀^k sentence, where k = 20000. This completes the proof.

8. Final remarks and open problems
8.1. Comparing Theorem 1.1 and Theorem 1.4, we see a big complexity jump in going from one to three alternating quantifier blocks, even when α is quadratic. It is an interesting open problem to determine the complexity of α-PA sentences when r = 2, 3. Here we make the following conjecture:

Conjecture 8.1. Let α be non-quadratic. Then α-PA sentences with three alternating blocks of quantifiers are undecidable.
Similarly, when α is quadratic, we make the following conjecture:

Conjecture 8.2. Let α be quadratic. Then deciding α-PA sentences with two alternating blocks of quantifiers and a fixed number of variables and inequalities is NP-hard.
We note that ∃*∀* √5-PA sentences can already express non-trivial questions, such as: given a, b ∈ Z, is there a Fibonacci number F_n congruent to a modulo b? Note that the sequence {F_n mod b} is periodic with period O(b), called the Pisano period. These periods were introduced by Lagrange and are heavily studied in number theory (see e.g. [Sil, §29]), but the question above is likely computationally hard.

The key tool behind Theorem 1.1 from [KP] is the following general integer optimization result on convex semialgebraic sets.
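The Pisano-period question above admits an obvious brute-force check (the helpers pisano and fib_residues are ours), which makes the contrast with a potential logical hardness result concrete:

```python
# Sketch: the Pisano period pi(b) of {F_n mod b}, and a brute-force test
# of whether some Fibonacci number is congruent to a given residue mod b.

def pisano(b: int) -> int:
    """Period of the Fibonacci sequence modulo b (b >= 2)."""
    x, y, n = 0, 1, 0
    while True:
        x, y, n = y, (x + y) % b, n + 1
        if (x, y) == (0, 1):       # the pair (F_n, F_{n+1}) repeats
            return n

def fib_residues(b: int) -> set:
    """All residues attained by F_n mod b; one full period suffices."""
    res, x, y = set(), 0, 1
    for _ in range(pisano(b)):
        res.add(x)
        x, y = y, (x + y) % b
    return res

assert pisano(10) == 60            # the classical Pisano period mod 10
assert 4 not in fib_residues(8)    # no Fibonacci number is 4 mod 8
```

The brute force takes time O(b) per query; the point of the remark is that deciding the corresponding ∃*∀* √5-PA sentences uniformly is likely much harder.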
Theorem 8.3 [KP]. Consider a first-order formula F(y) over the reals of the form

y ∈ R^k : Q_1 x_1 ∈ R^{n_1} … Q_m x_m ∈ R^{n_m}  P(y, x_1, …, x_m),

where P(y, x_1, …, x_m) is a Boolean combination of equalities/inequalities of the form g_i(y, x_1, …, x_m) ∗_i 0 with ∗_i ∈ {>, <, =} and g_i ∈ Z[y, x_1, …, x_m]. Let k, m, n_1, …, n_m be fixed, and suppose that the set S_F := {y ∈ R^k : F(y) = true} is convex. Then we can either decide in polynomial time that S_F ∩ Z^k = ∅, or produce in polynomial time some y ∈ S_F ∩ Z^k.
This immediately implies Theorem 1.1. Here there is no restriction on the number of the g_i's or their degrees. The coefficients of the g_i's are encoded in binary. Note that convexity is crucially important in the theorem. In Manders and Adleman [MA], it is shown that, given a, b, c ∈ Z, deciding ∃ y ∈ N² : a y_1² + b y_2 + c = 0 is NP-complete. Here the semialgebraic set {y ∈ R² : 0 ≤ a y_1² + b y_2 + c < 1} is not necessarily convex.