Reasoning about Data Repetitions with Counter Systems

We study linear-time temporal logics interpreted over data words with multiple attributes. We restrict the atomic formulas to equalities of attribute values in successive positions and to repetitions of attribute values in the future or past. We demonstrate correspondences between satisfiability problems for logics and reachability-like decision problems for counter systems. We show that allowing/disallowing atomic formulas expressing repetitions of values in the past corresponds to the reachability/coverability problem in Petri nets. This gives us 2EXPSPACE upper bounds for several satisfiability problems. We prove matching lower bounds by reduction from a reachability problem for a newly introduced class of counter systems. This new class is a succinct version of vector addition systems with states in which counters are accessed via pointers, a potentially useful feature in other contexts. We strengthen further the correspondences between data logics and counter systems by characterizing the complexity of fragments, extensions and variants of the logic. For instance, we precisely characterize the relationship between the number of attributes allowed in the logic and the number of counters needed in the counter system.


Introduction
Words with multiple data. Finite data words [Bou02] are ubiquitous structures that include timed words, runs of counter automata or runs of concurrent programs with an unbounded number of processes. These are finite words in which every position carries a label from a finite alphabet and a data value from some infinite alphabet. More generally, structures over an infinite alphabet provide an adequate abstraction for objects from several domains: for example, infinite runs of counter automata can be viewed as infinite data words, finite arrays are finite data words [AvW12], finite data trees model XML documents with attribute values [Fig10], and so on. A wealth of specification formalisms for data words (or slight variants) has been introduced, ranging from automata, see e.g. [NSV04,Seg06], to logical languages such as first-order logic [BDM + 11,Dav09] or temporal logics [LP05,KV06,Laz06,Fig10,KSZ10,Fig11] (see also a related formalism in [Fit02]). Depending on the type of structures, other formalisms have been considered, such as XPath [Fig10] or monadic second-order logic [BCGK12]. In full generality, most formalisms lead to undecidable decision problems, and a well-known research trend consists of finding a good trade-off between expressiveness and decidability. Restrictions to regain decidability are protean: bounding the models (from trees to words for instance), restricting the number of variables, see e.g. [BDM + 11], limiting the set of temporal operators or the use of the data-manipulating operator, see e.g. [FS09,DDG12]. As far as classes of automata for data languages are concerned, other questions arise, related to closure properties or to logical characterisations, see e.g. [BDM + 11,BL10,KST12].
Moreover, interesting and surprising results have been exhibited about relationships between logics for data words and counter automata (including vector addition systems with states) [BDM + 11,DL09,BL10], leading to a first classification of automata on data words [BL10,Bol11]. This is why logics for data words are not only interesting for their own sake but also for their deep relationships with data automata or with counter automata. Herein, we pursue this line of work further and consider words in which every position contains a vector of data values.
Motivations. In [DDG12], a decidable linear-time temporal logic interpreted over (finite or infinite) sequences of variable valuations (understood as words with multiple data) is introduced, in which the atomic formulae are of the form either x ≈ X^i y or x ≈ ⟨⊤?⟩y. The formula x ≈ X^i y states that the current value of variable x is the same as the value of y i steps ahead (local constraint), whereas x ≈ ⟨⊤?⟩y states that the current value of x is repeated in a future value of y (future obligation). Such atomic properties can be naturally expressed with a freeze operator that stores a data value for later comparison, and in [DDG12] it is shown that the satisfiability problem is decidable with the temporal operators in {X, X⁻¹, U, S}. The freeze operator allows one to store a data value in a register and then to test later equality between the value in the register and a data value at some other position. This is a powerful mechanism, but the logic in [DDG12] uses it in a limited way: only repetitions of data values can be expressed, which restricts very naturally the use of the freeze operator. The decidability result is robust since it holds for finite or infinite sequences, for any set of MSO-definable temporal operators, and with the addition of atomic formulas of the form x ≈ ⟨⊤?⟩⁻¹y stating that the current value of x is repeated in a past value of y (past obligation). Decidability can be shown either by reduction into FO²(∼, <, +ω), a first-order logic over data words introduced in [BDM + 11], or by reduction into the verification of fairness properties in Petri nets, shown decidable in [Jan95]. In both cases, an essential use of the decidability of the reachability problem for Petri nets is made, for which no primitive recursive algorithm is known, see e.g. [Ler11] (see also a first upper bound established recently in [LS15]).
Hence, even though the logics shown decidable in [DDG12] make only poor use of the freeze operator (or, equivalently, the only properties about data are related to controlled repetitions), the complexity of their satisfiability problems is unknown. Moreover, it is unclear whether the reductions into the reachability problem for Petri nets are really needed; this would be the case if reductions in the other direction existed. Note that in [KSZ10], a richer logic BD-LTL has been introduced and it has been shown that satisfiability is equivalent to the reachability problem for Petri nets. Moreover, in [DHLT14], two fragments BD-LTL− and BD-LTL+ of that richer logic have been introduced and shown to admit 2expspace-complete satisfiability problems. The forthcoming logic LRV is shown in [DHLT14] to be strictly less expressive than BD-LTL+.

Figure 1: Placing LRV and variants in the family of data logics. (The figure relates BD-LTL [KSZ10] (≡ Reach(VASS)), LTL↓₁(X, X⁻¹, U, S) [DL09] (undecidable), PLRV, LRV, PLRV⊤ = CLTL^{XF,XF⁻¹} [DDG12] and LRV⊤ = CLTL^{XF} [DDG12].)
Our main motivation is to investigate logics that express repetitions of values, revealing the correspondence between expressivity of the logic and reachability problems for counter machines, including well-known problems for Petri nets. This work can be seen as a study of the precision with which counting needs to be done as a consequence of having a mechanism for demanding "the current data value is repeated in the future/past" in a logic. Hence, this is not the study of yet another logic, but of a natural feature shared by most studied logics on data words [DDG12, BDM + 11, DL09,FS09,KSZ10,Fig10,Fig11]: the property of demanding that a data value be repeated. We consider different ways in which one can demand the repetition of a value, and study the repercussion in terms of the "precision" with which we need to count in order to solve the satisfiability problem. Our measurement of precision here distinguishes the reachability versus the coverability problem for Petri nets and the number of counters needed as a function of the number of variables used in the logic.
Our contribution. We introduce the linear-time temporal logic LRV ("Logic of Repeating Values") interpreted over finite words with multiple data, equipped with atomic formulas of the form x ≈ X^i y, x ≈ ⟨φ?⟩y or x ≉ ⟨φ?⟩y, where x ≈ ⟨φ?⟩y [resp. x ≉ ⟨φ?⟩y] states that the current value of x is repeated [resp. is not repeated] in some future value of y at a position where φ holds true. When we impose φ = ⊤, we obtain the logic introduced in [DDG12], which we denote by LRV⊤ (a different name is used in [DDG12], namely CLTL^{XF}). Note that the syntax for future obligations is freely inspired by propositional dynamic logic PDL with its test operator '?'. Even though LRV contains the past-time temporal operators X⁻¹ and S, it has no past obligations. We write PLRV to denote the extension of LRV with past obligations of the form x ≈ ⟨φ?⟩⁻¹y or x ≉ ⟨φ?⟩⁻¹y. Figure 1 illustrates how LRV and its variants compare to existing data logics.
Our main results are listed below.
i. We begin where [DDG12] stopped: the reachability problem for Petri nets is reduced to the satisfiability problem of PLRV (i.e., the logic with past obligations).
ii. Without past obligations, the satisfiability problem is much easier: we reduce the satisfiability problems of LRV and LRV⊤ to the control-state reachability problem for VASS, via a detour through a reachability problem on gainy VASS. However, the number of counters in the VASS is exponential in the number of variables used in the formula. This gives us a 2expspace upper bound.
iii. The exponential blow-up mentioned above is unavoidable: we show a polynomial-time reduction in the converse direction, starting from a linear-sized counter machine (without zero tests) that can access exponentially many counters. This gives us a matching 2expspace lower bound.
iv. Several augmentations of the logic do not alter the complexity: we show that the complexity is preserved when MSO-definable temporal operators are added or when infinite words with multiple data are considered.
v. The power of nested test formulas: we show that the complexity of the satisfiability problem for LRV⊤ drops to pspace-complete when the number of variables in the logic is bounded by a constant, while the complexity of satisfiability for LRV does not drop even when only one variable is allowed. Recall that the difference between LRV⊤ and LRV is that the latter allows any φ in x ≈ ⟨φ?⟩y while the former restricts φ to just ⊤.
vi. The power of pairs of repeating values: we show that the satisfiability problem of LRV augmented with x, y ≈ ⟨⊤?⟩ x′, y′ (repetitions of pairs of data values) is undecidable, even when x, y ≈ ⟨⊤?⟩⁻¹ x′, y′ is not allowed (i.e., even when past obligations are not allowed).
vii. Implications for classical logics: we show a 3expspace upper bound for the satisfiability problem for forward-EMSO²(+1, <, ∼) over data words, using results on LRV.
For proving the result mentioned in point iii.
above, we introduce a new class of counter machines that we call chain systems, and we show a key hardness result for them. This class is interesting in its own right and could be used in situations where the power of binary encoding is needed. We prove the (k + 1)expspace-completeness of the control-state reachability problem for chain systems of level k (we only use k = 1 in this paper, but the proof for arbitrary k is no more complex than the proof for the particular case k = 1). In chain systems, the number of counters is equal to an exponential tower of height k, but we cannot access the counters directly in the transitions. Instead, we have a pointer that we can move along a chain of counters. The (k + 1)expspace lower bound is obtained by a non-trivial extension of the expspace-hardness result from [Lip76,Esp98]. Then we show that the control-state reachability problem for the class of chain systems with k = 1 can be reduced to the satisfiability problem for LRV (see Section 5). It was known that data logics are strongly related to classes of counter automata, see e.g. [BDM + 11,DL09,BL10], but herein we show how varying the expressive power of logics leads to correspondences with different reachability problems for counter machines.

Preliminaries
We write N [resp. Z] to denote the set of non-negative integers [resp. integers] and [i, j] to denote the set {k ∈ Z : i ≤ k and k ≤ j}. For every v ∈ Z^n, v(i) denotes the i-th element of v for every i ∈ [1, n]. We write v ⪯ v′ whenever for every i ∈ [1, n], we have v(i) ≤ v′(i). For a (possibly infinite) alphabet Σ, Σ* represents the set of finite words over Σ, and Σ+ the set of finite non-empty words over Σ. For a finite word u = a_1 … a_k over Σ, we write |u| to denote its length k. For every 0 ≤ i < |u|, u(i) represents the (i + 1)-th letter of the word, here a_{i+1}. We use card(X) to denote the number of elements of a finite set X.
2.1. Logics of Repeating Values. Let VAR = {x_1, x_2, . . .} be a countably infinite set of variables. We denote by LRV the logic whose formulas are defined as follows:

φ ::= x ≈ X^i y | x ≈ ⟨φ?⟩y | x ≉ ⟨φ?⟩y | ¬φ | φ ∧ φ | Xφ | X⁻¹φ | φUφ | φSφ

where x, y ∈ VAR and i ∈ N. Formulas of one of the forms x ≈ X^i y, x ≈ ⟨φ?⟩y or x ≉ ⟨φ?⟩y are said to be atomic, and an expression of the form X^i x (abbreviating i next symbols followed by a variable) is called a term.
A valuation is a map from VAR to N, and a model is a finite non-empty sequence σ of valuations. All the subsequent developments can be equivalently done with the domain N replaced by an infinite set D since only equality tests are performed in the logics.
We write |σ| to denote the length of σ. For every model σ and 0 ≤ i < |σ|, the satisfaction relation |= is defined inductively as follows (the temporal operators next (X), previous (X⁻¹), until (U) and since (S) and the Boolean connectives are defined in the usual way):
• σ, i |= x ≈ X^j y iff i + j < |σ| and σ(i)(x) = σ(i + j)(y);
• σ, i |= x ≈ ⟨φ?⟩y iff there is i < j < |σ| such that σ(i)(x) = σ(j)(y) and σ, j |= φ;
• σ, i |= x ≉ ⟨φ?⟩y iff there is i < j < |σ| such that σ(i)(x) ≠ σ(j)(y) and σ, j |= φ.
We also use the notation X^i x ≈ X^j y as an abbreviation for the formula X^i(x ≈ X^{j−i} y) (assuming without any loss of generality that i ≤ j). Similarly, X^j y ≈ x is an abbreviation for x ≈ X^j y.
Given a set of temporal operators O definable from those in {X, X⁻¹, S, U} and a natural number k ≥ 0, we write LRV_k(O) to denote the fragment of LRV restricted to formulas with temporal operators from O and with at most k variables. The satisfiability problem for LRV (written SAT(LRV)) consists in checking, for a given LRV formula φ, whether there exists a model σ such that σ |= φ. Note that there is a logarithmic-space reduction from the satisfiability problem for LRV to its restriction where atomic formulas of the form x ≈ X^i y satisfy i ∈ {0, 1} (at the cost of introducing new variables).
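To make this flattening concrete, here is a sketch in our own notation (the fresh variable y_1 and the exact shape of the invariant are our illustration, not the paper's construction), writing G for the "always" operator:

```latex
% Hedged sketch: removing x \approx X^2 y at the cost of a fresh variable y_1.
% The invariant forces y_1 to copy, at every non-final position, the next
% value of y; the atomic formula then needs only one X.
G\big(\mathrm{X}\top \rightarrow y_1 \approx \mathrm{X}\,y\big)
\qquad\text{and replace}\qquad
x \approx \mathrm{X}^2 y \;\leadsto\; x \approx \mathrm{X}\,y_1
```

Iterating this replacement on the largest i yields the claimed restriction to i ∈ {0, 1} with only a logarithmic-space overhead.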
Let PLRV be the extension of LRV with additional atomic formulas of the form x ≈ ⟨φ?⟩⁻¹y and x ≉ ⟨φ?⟩⁻¹y. The satisfaction relation is extended as follows:
• σ, i |= x ≈ ⟨φ?⟩⁻¹y iff there is 0 ≤ j < i such that σ(i)(x) = σ(j)(y) and σ, j |= φ;
• σ, i |= x ≉ ⟨φ?⟩⁻¹y iff there is 0 ≤ j < i such that σ(i)(x) ≠ σ(j)(y) and σ, j |= φ.
We write LRV⊤ [resp. PLRV⊤] to denote the fragment of LRV [resp. PLRV] in which atomic formulas are restricted to x ≈ X^i y and x ≈ ⟨⊤?⟩y [resp. x ≈ X^i y, x ≈ ⟨⊤?⟩y and x ≈ ⟨⊤?⟩⁻¹y]. These are precisely the fragments considered in [DDG12] and shown decidable by reduction into the reachability problem for Petri nets.
Proposition 2.1. [DDG12] (I) The satisfiability problem for LRV⊤ is decidable (by reduction to the reachability problem for Petri nets). (II) The satisfiability problem for LRV⊤ restricted to a single variable is pspace-complete. (III) The satisfiability problem for PLRV⊤ is decidable (by reduction to the reachability problem for Petri nets).
In [DDG12], there are no reductions in the directions opposite to (I) and (III). The characterisation of the computational complexity of the satisfiability problems for LRV⊤ and PLRV⊤ has remained open so far, and it will be a contribution of this paper.
2.2. Properties. In the table below, we justify our choices of atomic formulae by presenting several abbreviations (with their obvious semantics). By contrast, we include in LRV both x ≈ ⟨φ?⟩y and x ≉ ⟨φ?⟩y, since when φ is an arbitrary formula there is no obvious way to express one with the other.
Abbreviation — Definition

Models for LRV can be viewed as finite data words in (Σ × D)*, where Σ is a finite alphabet and D is an infinite domain. E.g., equalities between dedicated variables can simulate that a position is labelled by a letter from Σ; moreover, we may assume that the data values are encoded with the variable x. Let us express that whenever there are i < j such that i and j [resp. i + 1 and j + 1, i + 2 and j + 2] are labelled by a [resp. a′, a″], we have σ(i + 1)(x) = σ(j + 1)(x). This can be stated in LRV. This is an example of key constraints, see e.g. [NS11, Definition 2.1], and the current paper also contains numerous examples of properties that can be captured by LRV.
2.3. Basics on VASS. A vector addition system with states (VASS) is a tuple A = ⟨Q, C, δ⟩ where Q is a finite set of control states, C is a finite set of counters and δ ⊆ Q × Z^C × Q is a finite set of transitions. A configuration is a pair ⟨q, v⟩ ∈ Q × N^C, and ⟨q, v⟩ → ⟨q′, v′⟩ whenever there is a transition (q, u, q′) ∈ δ such that v′ = v + u. Let *→ be the reflexive and transitive closure of →. The reachability problem for VASS (written Reach(VASS)) consists of checking whether ⟨q_0, v_0⟩ *→ ⟨q_f, v_f⟩, given two configurations ⟨q_0, v_0⟩ and ⟨q_f, v_f⟩. The reachability problem for VASS is decidable, but all known algorithms [May84,Kos82,Lam92,Ler11] take non-primitive recursive space in the worst case. The best known lower bound is expspace [Lip76,Esp98], whereas a first upper bound has been established recently in [LS15]. The control-state reachability problem consists in checking whether ⟨q_0, v_0⟩ *→ ⟨q_f, v⟩ for some v ∈ N^C, given a configuration ⟨q_0, v_0⟩ and a control state q_f. This problem is known to be expspace-complete [Lip76,Rac78]. The relation → denotes the one-step transition in a perfect computation. In this paper, we also need computations with gains or with losses, so we define the variant relations →_gainy and →_lossy. We write ⟨q, v⟩ →_gainy ⟨q′, v′⟩ if there is a transition (q, u, q′) ∈ δ and w, w′ ∈ N^C such that v ⪯ w, w′ = w + u and w′ ⪯ v′. Let *→_gainy be the reflexive and transitive closure of →_gainy. Similarly, we write ⟨q, v⟩ →_lossy ⟨q′, v′⟩ if there is a transition (q, u, q′) ∈ δ and w, w′ ∈ N^C such that w ⪯ v, w′ = w + u and v′ ⪯ w′. Let *→_lossy be the reflexive and transitive closure of →_lossy. Counter automata with imperfect computations, such as lossy channel systems [AJ96,FS01], lossy counter automata [May03] or gainy counter automata [Sch10b], have been intensively studied (see also [Sch10a]). In this paper, imperfect computations are used with VASS in Section 4.
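To make the imperfect-step relations concrete, here is a small executable sketch (ours, not part of the paper's formal development); the componentwise feasibility conditions stated in the comments follow directly from the definitions above.

```python
# Hedged sketch (ours) of the three step relations for a VASS transition
# with update vector u over counters C. Counter valuations are tuples of
# non-negative integers; u may have negative entries.

def perfect_step(v, u):
    """Perfect step: v' = v + u, defined only when v' stays non-negative."""
    w = tuple(a + b for a, b in zip(v, u))
    return w if all(x >= 0 for x in w) else None

def gainy_step_possible(v, u, v2):
    """<q,v> ->_gainy <q',v2>: there are w, w' in N^C with v <= w,
    w' = w + u and w' <= v2 (componentwise). Choosing w(i) as small as
    possible, this is feasible iff v2(i) >= max(v(i) + u(i), 0) for all i."""
    return all(b >= max(a + c, 0) for a, c, b in zip(v, u, v2))

def lossy_step_possible(v, u, v2):
    """<q,v> ->_lossy <q',v2>: there are w, w' in N^C with w <= v,
    w' = w + u and v2 <= w' (componentwise). Choosing w = v, this is
    feasible iff v(i) + u(i) >= v2(i) for all i (v2 being non-negative)."""
    return all(a + c >= b for a, c, b in zip(v, u, v2))
```

For instance, from the valuation (0, 0) a gainy step can fire the decrementing update (−1, 0) and stay at (0, 0), which is impossible in the perfect semantics.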

The Power of Past: From Reach(VASS) to SAT(PLRV)
While [DDG12] concentrated on decidability results, here we begin with a hardness result. When past obligations are allowed, as in PLRV, SAT(PLRV) is equivalent to the very difficult problem of reachability in VASS (see recent developments in [LS15]). Combined with the result of the next section, where we prove that removing past obligations leads to a reduction into the control-state reachability problem for VASS, this suggests that reasoning with past obligations is much more complicated.
Theorem 3.1. There is a polynomial-space reduction from Reach(VASS) into SAT(PLRV).
The proof of Theorem 3.1 is analogous to the proof of [BDM + 11, Theorem 16], except that the properties are expressed in PLRV instead of FO²(∼, <, +1).
Proof. First, the reachability problem for VASS can be reduced in polynomial space to its restriction in which the initial and final configurations have all counters equal to zero and each transition can only increment or decrement a unique counter. In the sequel, we consider an instance of this subproblem: A = ⟨Q, C, δ⟩ is a VASS, the initial configuration is ⟨q_i, 0⟩ and the final configuration is ⟨q_f, 0⟩. Now, we build a formula φ in PLRV such that ⟨q_f, 0⟩ is reachable from ⟨q_i, 0⟩ iff φ is satisfiable. To do so, we encode runs of A by data words, following exactly the proof of [BDM + 11, Theorem 16] except that the properties are expressed in PLRV instead of FO²(∼, <, +1). Letters from the finite alphabet are also encoded by equalities between variables. The objective is to encode a word ρ ∈ δ* that represents an accepting run from ⟨q_i, 0⟩ to ⟨q_f, 0⟩. We use the alphabet δ of transitions, which we code using a logarithmic number of variables: one can simulate m different labels in PLRV by using ⌈log(m)⌉ + 1 variables and their equivalence classes. In order to simulate the alphabet δ, we use the variables x_0, . . . , x_N, with N = ⌈log(card(δ))⌉. For any t ∈ δ, let t̄ ∈ PLRV be the formula that tests for label t at the current position. More precisely, for any fixed injective function λ : δ → 2^{[1,N]}, we define t̄ as the conjunction of the equalities x_i ≈ x_0 for i ∈ λ(t) and of the disequalities ¬(x_i ≈ x_0) for i ∈ [1, N] \ λ(t). Note that t̄ uses exclusively the variables x_0, . . . , x_N, that it is of size logarithmic in card(δ), and that t̄ holds at a position for at most one t ∈ δ. We build a PLRV formula φ so that any word from δ* corresponding to a model of φ is an accepting run for A from ⟨q_i, 0⟩ to ⟨q_f, 0⟩; and conversely, for any accepting run of A from ⟨q_i, 0⟩ to ⟨q_f, 0⟩ there is a model of φ corresponding to the run. The following are standard counter-blind conditions to check.
(1) Every position satisfies t̄ for some t ∈ δ.
(2) The first position satisfies t̄ for some transition t = (q_i, instr, q) ∈ δ with q ∈ Q.
(3) The last position satisfies t̄ for some transition t = (q, instr, q_f) ∈ δ with q ∈ Q.
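The logarithmic label encoding above can be sketched programmatically; the helper names (make_encoding, decode) are ours, and the bit-pattern subset assignment merely stands in for an arbitrary injective λ.

```python
# Hedged sketch of the logarithmic label encoding (helper names are ours).
# A label t in delta is identified by the set lambda(t) of indices i such
# that x_i carries the same data value as x_0 at the current position.
import math

def make_encoding(delta):
    """Pick an injective map lambda from labels to subsets of {1..N},
    with N = ceil(log2(card(delta))) (and N >= 1)."""
    N = max(1, math.ceil(math.log2(len(delta))))
    lam = {}
    for idx, t in enumerate(sorted(delta)):
        lam[t] = frozenset(i + 1 for i in range(N) if (idx >> i) & 1)
    return N, lam

def decode(valuation, N, lam):
    """Recover the label from a valuation of x_0..x_N (a list of N + 1
    data values): t holds iff x_i equals x_0 exactly for the i in
    lambda(t)."""
    cls = frozenset(i for i in range(1, N + 1) if valuation[i] == valuation[0])
    for t, s in lam.items():
        if s == cls:
            return t
    return None  # no formula t-bar holds at this position
```

With three labels, two extra variables suffice, matching N = ⌈log 3⌉ = 2.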
In the formula φ, we use the distinguished variable x to relate increments and decrements.
Here are the main properties to satisfy.
(1) For every counter c ∈ C, there are no two positions labelled by a transition with instruction inc(c) having the same value for x, where inc(c) is a shortcut for the disjunction of the formulas t̄ over all transitions t = (q, inc(c), q′) ∈ δ. A similar constraint can be expressed with dec(c).
(2) For every counter c ∈ C, for every position labelled by a transition with instruction inc(c), there is a future position labelled by a transition with instruction dec(c) with the same value for x, where dec(c) is a shortcut for the disjunction of the formulas t̄ over all transitions t = (q, dec(c), q′) ∈ δ. This guarantees that the final configuration ends with all counters equal to zero.
(3) Similarly, for every counter c ∈ C, for every position labelled by a transition with instruction dec(c), there is a past position labelled by a transition with instruction inc(c) with the same value for x. This guarantees that every decrement follows a corresponding increment, so that counter values are never negative.
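The three properties above can be sketched as follows (our reconstruction in the notation of Section 2, not the paper's verbatim displayed formulas), writing G for the "always" operator:

```latex
% (1) increments of c are pairwise distinct on x:
G\big(\mathrm{inc}(c) \rightarrow \neg\big(x \approx \langle\, \mathrm{inc}(c)?\,\rangle\, x\big)\big) \\
% (2) every increment is matched by a future decrement with the same x:
G\big(\mathrm{inc}(c) \rightarrow x \approx \langle\, \mathrm{dec}(c)?\,\rangle\, x\big) \\
% (3) every decrement is matched by a past increment with the same x:
G\big(\mathrm{dec}(c) \rightarrow x \approx \langle\, \mathrm{inc}(c)?\,\rangle^{-1} x\big)
```

Together with the analogue of (1) for dec(c), these formulas set up a bijection between increments and earlier-matched decrements of each counter along the encoded run.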
Let φ be the conjunction of all the formulas defined above. Since all the properties considered herein are those used in the proof of [BDM + 11, Theorem 16] (but expressed here in PLRV instead of FO²(∼, <, +1)), it follows that φ is satisfiable iff there is an accepting run of A from ⟨q_i, 0⟩ to ⟨q_f, 0⟩.

Leaving the Past Behind Simplifies Things: From SAT(LRV) to Control State Reachability
In this section, we show the reduction from SAT(LRV) to the control-state reachability problem in VASS. We obtain a 2expspace upper bound for SAT(LRV) as a consequence. This is done in two steps: (1) simplifying formulas of the form x ≈ ⟨φ?⟩y to remove the test formula φ (i.e., a reduction from SAT(LRV) into SAT(LRV⊤)); and (2) reducing SAT(LRV⊤) into the control-state reachability problem in VASS.

4.1. Elimination of Test Formulas. We give a polynomial-time algorithm that, given ϕ ∈ LRV, computes a formula ϕ′ ∈ LRV⊤ that preserves satisfiability: there is a model σ such that σ |= ϕ iff there is a model σ′ such that σ′ |= ϕ′. We give the reduction in two steps. First, we eliminate formulas with inequality tests of the form x ≉ ⟨ψ?⟩y, using only positive tests of the form x ≈ ⟨ψ?⟩y. We then eliminate formulas of the form x ≈ ⟨ψ?⟩y, using only formulas of the form x ≈ ⟨⊤?⟩y. Although both reductions share some common structure, they use independent coding strategies and exploit different features of the logic; we therefore present them separately.
We first show how to eliminate all formulas with inequality tests of the form x ≉ ⟨ψ?⟩y. Let LRV^≉ be the logic LRV where there are no appearances of formulas of the form x ≉ ⟨ψ?⟩y; and let PLRV^≉ be PLRV without x ≉ ⟨ψ?⟩y or x ≉ ⟨ψ?⟩⁻¹y.
Henceforward, vars(ϕ) denotes the set of all variables occurring in ϕ, and sub_?(ϕ) the set of all test subformulas ψ such that x ≈ ⟨ψ?⟩y or x ≉ ⟨ψ?⟩y appears in ϕ for some x, y ∈ vars(ϕ).
In both reductions we make use of the following easy lemma.
Proof. This is a standard reduction. Given a formula ϕ with a test subformula x ≈ ⟨ψ?⟩y, we introduce two fresh variables p and q, replace x ≈ ⟨ψ?⟩y by x ≈ ⟨p ≈ q?⟩y, and require that at every position p ≈ q holds iff ψ holds; the resulting formula is satisfiable iff ϕ is satisfiable. We apply these replacements repeatedly, at most a polynomial number of times if we apply them to the innermost occurrences.

Proposition 4.2. For every ϕ ∈ LRV, one can compute in polynomial time a formula ϕ′ ∈ LRV^≉ that preserves satisfiability.

Proof. For every ϕ ∈ LRV, we compute ϕ′ ∈ LRV^≉ in polynomial time, which preserves satisfiability. The variables of ϕ′ consist of all the variables of ϕ, plus: a distinguished variable k, and variables v^≉_{y,ψ} and v_{x ≉ ⟨ψ?⟩y} for every subformula ψ of ϕ and variables x, y of ϕ. The variables v_{x ≉ ⟨ψ?⟩y} will be used to get rid of ≉ in formulas of the form x ≉ ⟨ψ?⟩y, and the variables v^≉_{y,ψ} to treat formulas of the form ¬(x ≉ ⟨ψ?⟩y). Finally, k is a special variable, which has a constant value, different from all the values of the variables of ϕ.
Assume ϕ is in negation normal form. Note that each positive occurrence of x ≉ ⟨ψ?⟩y can be safely replaced with x ≉ v_{x ≉ ⟨ψ?⟩y} ∧ v_{x ≉ ⟨ψ?⟩y} ≈ ⟨ψ?⟩y. Indeed, the latter formula implies the former, and it is not difficult to see that whenever there is a model for the former formula, there is also one for the latter. On the other hand, translating formulas of the form ¬(x ≉ ⟨ψ?⟩y) is more involved, as these implicate some form of universal quantification. For treating these formulas, we use the variables v^≉_{y,ψ} and k, as explained next. Let i be the first position of the model such that all future positions j > i verifying ψ have the same value on variable y, say value n. As we will see, with a formula of LRV^≉ one can ensure that v^≉_{y,ψ} has the same value as k for all j ≤ i and value n for all other positions. The enforced values are illustrated below, with an initial prefix where variable v^≉_{y,ψ} is equal to k until we reach position i, from which point all values of v^≉_{y,ψ} coincide with value n (represented as a dashed area).
Among these positions, those that have the value of x equal to n satisfy ¬(x ≉ ⟨ψ?⟩y). The positions satisfying ¬(x ≉ ⟨ψ?⟩y) are of two types: the first type are those positions such that no future position satisfies ψ. The second type are those such that all future positions satisfying ψ have the same value n on variable y, and the variable x takes the value n. The first type of positions is captured by the formula ¬XFψ. As can be seen from the illustration above, the second type of positions is captured by the formula x ≈ Xv^≉_{y,ψ}. Thus, ¬(x ≉ ⟨ψ?⟩y) can be replaced with ¬XFψ ∨ x ≈ Xv^≉_{y,ψ}. Past obligations are treated in a symmetrical way. We now formalise these ideas, showing that σ |= ϕ implies σ_ϕ |= ϕ′, where ϕ′ is the translation of ϕ and σ_ϕ is an extension of σ with the new variables of the translation, with values corresponding to the intuition above. On the other hand, we will also show that σ′ |= ϕ′ implies σ′ |= ϕ. Next, we formally define σ_ϕ and the translation ϕ′, and then we show these two facts.
• First, the variable k will act as a constant; it will always have the same data value at any position of the model, which must be different from those of all variables of ϕ.
• Second, for every position we ensure that if v^≉_{x,ψ} differs from k, then it keeps its value until the last position verifying ψ; and if ψ holds at any of these positions, then v^≉_{x,ψ} contains the value of x.
• Finally, let ϕ_≉ be the result of replacing (f) every appearance of ¬(x ≉ ⟨ψ?⟩y) by ¬XFψ ∨ x ≈ Xv^≉_{y,ψ}, and (g) every positive appearance of x ≉ ⟨ψ?⟩y by x ≉ v_{x ≉ ⟨ψ?⟩y} ∧ v_{x ≉ ⟨ψ?⟩y} ≈ ⟨ψ?⟩y. The formula ϕ′ is defined as the conjunction of ϕ_≉ with the formulas enforcing the constraints on k and the v-variables described above. Notice that ϕ′ can be computed from ϕ in polynomial time in the size of ϕ.

Claim 4.3. ϕ is satisfiable if, and only if, ϕ′ is satisfiable.

Proof.
[⇒] Note that one direction would follow from (4.1): if ϕ is satisfiable in some σ, then ϕ′ is satisfiable in σ_ϕ. In order to establish (4.1), we show that for every subformula γ of ϕ and 0 ≤ i < |σ_ϕ|, we have σ, i |= γ iff σ_ϕ, i |= γ_≉. (4.2) We further assume that γ is not an atomic formula that is dominated by a negation in ϕ.
We can easily extend this coding to allow for past obligations. We only need some extra variables v^{−1,≉}_{x,ψ} and v^{−1}_{x ≉ ⟨ψ?⟩⁻¹y} that behave in the same way as the ones previously defined, but with past obligations. That is, we also define a formula val-v^{−1,≉}_{x,ψ} analogous to val-v^≉_{x,ψ}, but making use of v^{−1,≉}_{x,ψ}, X⁻¹ and F⁻¹. And finally, we further replace every appearance of ¬(x ≉ ⟨ψ?⟩⁻¹y) and every positive appearance of x ≉ ⟨ψ?⟩⁻¹y in the symmetrical way.

Proposition 4.4. For every ϕ ∈ LRV^≉, one can compute in polynomial time a formula ϕ′ ∈ LRV⊤ that preserves satisfiability.

Proof. In a nutshell, for every ϕ ∈ LRV^≉, we compute in polynomial time a formula ϕ′ ∈ LRV⊤ that preserves satisfiability. Besides all the variables from ϕ, ϕ′ uses a new distinguished variable k, and a variable v_{y,ψ} for every subformula ψ of ϕ and every variable y of ϕ. We enforce k to have a constant value, different from all values of the variables of ϕ. At every position, we enforce ψ to hold if v_{y,ψ} ≈ y, and ψ not to hold if v_{y,ψ} ≈ k. Then x ≈ ⟨ψ?⟩y is replaced by x ≈ ⟨⊤?⟩v_{y,ψ}. Next, we formalise these ideas. Let ϕ ∈ LRV^≉. For any model σ, let σ_ϕ be such that (a) |σ| = |σ_ϕ|, (b) for every 0 ≤ i < |σ| and x ∈ vars(ϕ), σ(i)(x) = σ_ϕ(i)(x), (c) there is some data value d ∉ {σ_ϕ(i)(x) | x ∈ vars(ϕ), 0 ≤ i < |σ|} such that for every 0 ≤ i < |σ|, σ_ϕ(i)(k) = d, and (d) for every 0 ≤ i < |σ|, σ_ϕ(i)(v_{x,ψ}) = σ_ϕ(i)(x) if σ, i |= ψ, and σ_ϕ(i)(v_{x,ψ}) = σ_ϕ(i)(k) otherwise.
For every other unmentioned variable, σ and σ_ϕ coincide. It is evident that for every ϕ and σ, a model σ_ϕ with the aforementioned properties exists. Next, we define ϕ′ ∈ LRV⊤ so that σ |= ϕ if, and only if, σ_ϕ |= ϕ′. We assume that for every ψ ∈ sub_?(ϕ), we have sub_?(ψ) = ∅. This is without any loss of generality by Lemma 4.1.
• First, the variable k will act as a constant; it will always have the same data value at any position of the model, which must be different from those of all variables of ϕ.
• Second, any variable v_{x,ψ} has either the value of k or that of x. Further, the latter holds if, and only if, ψ is true.
• Finally, let ϕ̂ be the result of replacing every appearance of x ≈ ⟨ψ?⟩y by x ≈ ⟨⊤?⟩v_{y,ψ} in ϕ. We then define ϕ′ as the conjunction of ϕ̂ with the formulas enforcing the two conditions above.
Notice that ϕ′ can be computed from ϕ in polynomial time.
Claim 4.5. ϕ is satisfiable if, and only if, ϕ′ is satisfiable. Proof.
Finally, note that we can extend this coding to treat past obligations in the obvious way.
By combining Proposition 4.2 and Proposition 4.4, we get the following result.
We have seen how to eliminate test formulas φ from x ≈ ⟨φ?⟩y and x ≈ ⟨φ?⟩⁻¹y. Combining this with the decidability proof for PLRV⊤ satisfiability from [DDG12], we get that both SAT(LRV) and SAT(PLRV) are decidable.

4.2.
From LRV⊤ Satisfiability to Control State Reachability. Recall that in [DDG12], SAT(LRV⊤) is reduced to the reachability problem for a subclass of VASS. Herein, this construction is refined by introducing incremental errors in order to improve the complexity.
In [DDG12], the standard concept of atoms from the Vardi–Wolper construction of automata for LTL is used (see also Appendix A). Refer to the diagram at the top of Figure 2. The formula x ≈ ⟨⊤?⟩y in the left atom creates an obligation for the current value of x to appear some time in the future in y. This obligation cannot be satisfied in the second atom, since y has to satisfy some other constraint there (y ≈ z). To remember this unsatisfied obligation about y while taking the transition from the first atom to the second, the counter X_{{y}} is incremented. The counter can be decremented later, in transitions that allow the repetition in y. If several transitions allow such a repetition, only one of them needs to decrement the counter (since there was only one obligation at the beginning). The other transitions, which should not decrement the counter, can take the alternative labelled "no change" in the right part of Figure 2.
The idea here is to replace the combination of the decrementing transition and the "no change" transition in the top of Figure 2 with a single transition with incremental errors, as shown in the bottom. After Lemma 4.9 below, which formalises ideas from [DDG12, Section 7] (see also Appendix A), we prove that the transition with incremental errors is sufficient.
Lemma 4.9. For an LRV⊤ formula φ that uses the variables {x_1, . . . , x_k}, a VASS A_φ = ⟨Q, C, δ⟩ can be defined, along with sets Q_0, Q_f ⊆ Q of initial and final states respectively, such that:
• the set of counters C consists of all nonempty subsets of {x_1, . . . , x_k};
• the set of transitions is closed under component-wise interpolation, and for every pair of atoms there is a "no change" transition that does not decrement any counter — we call this property optional decrement;
• letting 0 be the counter valuation that assigns 0 to all counters, φ is satisfiable iff ⟨q_0, 0⟩ *→ ⟨q_f, 0⟩ for some q_0 ∈ Q_0 and q_f ∈ Q_f.
At the top of Figure 2, only one counter X_{{y}} is shown and is decremented by 1, for simplicity. In general, multiple counters can be changed, and they can be incremented/decremented by any number up to k, depending on the initial and target atoms of the transition. If a counter can be incremented by k_1 and can be decremented by k_2, then there will also be transitions between the same pair of atoms allowing changes of k_1 − 1, . . . , 1, 0, −1, . . . , −(k_2 − 1). This corresponds to the closure under component-wise interpolation mentioned in Lemma 4.9.
The optional decrement property corresponds to the fact that there will always be a "no change" transition that does not decrement any counter. Now, we show that a single transition that decrements all counters by the maximal possible number can simulate the set of all transitions between two atoms, using incremental errors.
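The interplay between maximal decrements and incremental errors can be illustrated with a small sketch (hypothetical helper functions, not part of the paper's construction): under gainy semantics, a transition that decrements a counter by the maximal amount can mimic any smaller decrement, because the difference is absorbed as a prior gain.

```python
def step(valuation, update):
    """Apply an update vector; fail (return None) if a counter would go negative."""
    result = {c: valuation[c] + update.get(c, 0) for c in valuation}
    return result if all(v >= 0 for v in result.values()) else None

def gainy_step(valuation, update, gains):
    """Gainy semantics: counters may spontaneously increase before the update."""
    boosted = {c: valuation[c] + gains.get(c, 0) for c in valuation}
    return step(boosted, update)

# A decrement by 2 is blocked from value 1, but with a gain of 1 it has the
# same effect as a plain decrement by 1 (the interpolated transition).
blocked = step({'X': 1}, {'X': -2})          # None: not executable
simulated = gainy_step({'X': 1}, {'X': -2}, {'X': 1})
```

This is why the single maximal-decrement transition suffices once incremental errors are allowed.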
Let A_inc = ⟨Q, C, δ_min⟩ and Q_0, Q_f ⊆ Q, where Q, Q_0, Q_f and C are the same as those of A_φ, and δ_min is defined as follows: between every pair of atoms related in A_φ, δ_min contains the single transition whose update decrements each counter by the maximal amount allowed between those atoms.
Proof. By induction on the length n of the run ⟨q_1, v_1⟩ →*_gainy ⟨q_2, 0⟩. The base case n = 0 is trivial, since there is no change in the configuration.
Induction step: the idea is to simulate the first gainy transition by a normal transition that decreases each counter as much as possible, while ensuring that (1) the resulting value is non-negative and (2) the induction hypothesis can be applied to the resulting valuation. We compute the required update for each counter individually; by closure under component-wise interpolation, there is always a transition with the required update function. Let ⟨q_1, v_1⟩ →_gainy ⟨q_3, v_3⟩ →*_gainy ⟨q_2, 0⟩ and v'_1 ≥ v_1. We define an update function u' counter by counter: for each counter X ∈ C, u'(X) is defined so that conditions (1) and (2) above hold. By the definition of minup_{q_1,q_3} and the closure of the set of transitions under component-wise interpolation, such a transition exists.
Proof. The proof is in four steps.
Step 1:
Step 2: This is the step that requires new insight; it follows from Lemmas 4.10 and 4.11.
Step 3: This is a standard trick. Let A_dec = ⟨Q, C, δ_rev⟩ be a VASS such that for every transition (q, u, q') ∈ δ_min of A_inc, A_dec has a transition (q', −u, q) ∈ δ_rev, where −u maps each counter X ∈ C to −u(X): we can simply reverse every transition in a run of A_inc to get a run of A_dec, and vice versa.
Step 4: This is another standard trick. Since A_dec does not have zero-tests, we can remove all decrementing errors from a run of A_dec from ⟨q_f, 0⟩ to ⟨q_0, 0⟩, to get another run from ⟨q_f, 0⟩ to ⟨q_0, v⟩, where v is some counter valuation (possibly different from 0). Using decremental errors at the last configuration, A_dec can then reach the configuration ⟨q_0, 0⟩. In other words, it suffices to check whether some state q_0 ∈ Q_0 is reachable from ⟨q_f, 0⟩ in A_dec. Checking the latter condition is precisely the control state reachability problem for VASS. If the control state in the above instance is reachable, then Rackoff's proof gives a bound on the length of a shortest run reaching it [Rac78] (see also [DJLL09]). The bound is doubly exponential in the size of the VASS. Since in our case the size of the VASS is exponential in the size of the LRV formula φ, the bound is triply exponential. A non-deterministic Turing machine can maintain a binary counter to count up to this bound, using doubly exponential space. The machine starts by guessing some initial state q_0 and a counter valuation set to 0; this can be done in polynomial space. In one step, the machine guesses a transition to be applied next and updates the current configuration accordingly, while incrementing the binary counter. At any step, the space required to store the current configuration is at most doubly exponential. If a final control state is not reached by the time the binary counter reaches its triply exponential bound, the machine rejects its input; otherwise, it accepts. Since this non-deterministic machine operates in doubly exponential space, an application of Savitch's Theorem [Sav70] gives us the required 2expspace upper bound for the satisfiability problem of LRV.
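The guess-and-count argument can be made concrete with a sketch (an explicit bounded search rather than the space-bounded nondeterministic machine of the proof; names and tuple formats are illustrative): a target control state is reachable iff it is reachable by some run whose length stays within the Rackoff-style bound.

```python
from collections import deque

def control_state_reachable(transitions, q0, targets, num_counters, step_bound):
    """Explore all runs of length <= step_bound; counters stay non-negative."""
    start = (q0, (0,) * num_counters)
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        (q, v), length = frontier.popleft()
        if q in targets:
            return True
        if length == step_bound:
            continue  # length bound exhausted on this branch
        for p, update, r in transitions:
            if p != q:
                continue
            w = tuple(x + d for x, d in zip(v, update))
            if all(x >= 0 for x in w) and (r, w) not in seen:
                seen.add((r, w))
                frontier.append(((r, w), length + 1))
    return False

# Tiny VASS: one counter, increment at 'a', decrement moving to 'b'.
ts = [('a', (1,), 'a'), ('a', (-1,), 'b')]
```

Unlike this sketch, the machine in the proof stores only one configuration and a binary step counter, which is what yields the doubly exponential space bound.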

Simulating Exponentially Many Counters
In Section 4, the reduction from SAT(LRV) to control state reachability for VASS involves an exponential blow-up, since we use one counter for each nonempty subset of variables. Whether this blow-up can be avoided depends on whether LRV is really powerful enough to reason about subsets of variables, or whether there is a smarter reduction without a blow-up.
Similar questions are open in other related areas [MR09,MR12].
Here we prove that LRV is indeed powerful enough to reason about subsets of variables. We establish a 2expspace lower bound. The main idea behind this lower bound proof is that the power of LRV to reason about subsets of variables can be used to simulate exponentially many counters. The lower bound is developed in three parts, each explained in a sub-section of this section. The first part defines chain systems, which are like VASS except that transitions cannot access counters directly: instead, they access a pointer into a chain of counters. The second part shows lower bounds for the control state reachability problem for chain systems. The third part shows that LRV can reason about chain systems. 5.1. Chain Systems. We introduce a new class of counter systems that is instrumental in showing that SAT(LRV) is 2expspace-hard. This is an intermediate formalism between counter automata with zero-tests whose counters are bounded by triply exponential values (for which the control state reachability problem is 2expspace-hard) and properties expressed in LRV. Systems with chained counters have no zero-tests, and the only updates are increments and decrements. However, the systems are equipped with finitely many chains of counters, each chain having an exponential number of counters in some part of the input (see details below). Let exp(0, n) def= n and exp(k + 1, n) def= 2^{exp(k, n)} for every k ≥ 0.
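The tower function can be computed directly; `exp` here is the paper's exp(k, n), not the base-e exponential:

```python
def exp(k, n):
    """exp(0, n) = n and exp(k + 1, n) = 2 ** exp(k, n)."""
    for _ in range(k):
        n = 2 ** n
    return n
```

At level k = 1 a chain α thus has 2^f(α) counters, which is the case used later in the reduction from chain systems to LRV.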
Definition 5.1. A counter system with chained counters (herein called a chain system) is a tuple A = ⟨Q, f, k, Q_0, Q_F, δ⟩ where (1) f : [1, n] → N, where n ≥ 1 is the number of chains and exp(k, f(α)) is the number of counters of the chain α, with k ≥ 0; (2) Q is a non-empty finite set of states; (3) Q_0 ⊆ Q and Q_F ⊆ Q are the sets of initial and final states, respectively; (4) δ is a finite set of transitions of the form ⟨q, instr, q'⟩, where instr is one of the instructions described below. By convention, we sometimes write q −instr→ q' instead of ⟨q, instr, q'⟩ ∈ δ.
The system A = ⟨Q, f, k, Q_0, Q_F, δ⟩ is said to be at level k. The natural numbers in the range of f and the value k are encoded in unary. We say that a transition containing inc(α) is α-incrementing, and a transition containing dec(α) is α-decrementing. The idea is that for each chain α ∈ [1, n] we have exp(k, f(α)) counters, but we cannot access them directly in the transitions as we do in VASS. Instead, we have a pointer to a counter, which we can move. We can ask whether we are pointing to the first counter (first(α)?) or not (¬first(α)?), or to the last counter (last(α)?) or not (¬last(α)?), and we can move the pointer to the next (next(α)) or previous (prev(α)) counter.
A run is a finite sequence ρ ∈ δ* such that the target state of each transition coincides with the source state of the next one. A run is accepting whenever ρ(1) starts with an initial state from Q_0 and ρ(|ρ|) ends with a final state from Q_F. For a position i of ρ and a chain α, let c^α_i denote the index of the counter of the chain α pointed to at position i. A run is perfect iff for every α ∈ [1, n], there is some injective function γ mapping every α-decrementing position i to an α-incrementing position γ(i) = j such that j < i and c^α_i = c^α_j. A run is gainy and ends at zero (different from 'ends in zero' defined below) iff for every chain α ∈ [1, n], there is some injective function γ mapping every α-incrementing position i to an α-decrementing position γ(i) = j such that j > i and c^α_i = c^α_j. In the sequel, we shall simply say that the run is gainy. Below, we define two problems for which we shall characterize the computational complexity.
Problem: Existence of a perfect accepting run of level k ≥ 0 (Per(k)) Input: A chain system A of level k. Question: Does A have a perfect accepting run?
Problem: Existence of a gainy accepting run of level k ≥ 0 (Gainy(k)) Input: A chain system A of level k. Question: Does A have a gainy accepting run? Per(k) is actually a control state reachability problem in VASS where the counters are encoded succinctly. Gainy(k) is a reachability problem in VASS with incrementing errors and the reached counter values are equal to zero. Here, the counters are encoded succinctly too.
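The matching conditions defining perfect runs and gainy runs ending at zero amount to prefix and suffix counting, as the following sketch illustrates (chains are abstracted to plain counter identifiers; the helper names are illustrative, not part of the construction):

```python
from collections import Counter

def is_perfect(events):
    """events: list of ('inc'|'dec', counter_id). Perfect iff every decrement
    can be injectively matched to an earlier increment of the same counter,
    i.e. no counter ever goes below zero."""
    value = Counter()
    for op, c in events:
        value[c] += 1 if op == 'inc' else -1
        if value[c] < 0:
            return False
    return True

def is_gainy_ending_at_zero(events):
    """Every increment injectively matched to a later decrement of the same
    counter: the reversed sequence with inc/dec swapped must be perfect."""
    swapped = [('dec' if op == 'inc' else 'inc', c) for op, c in reversed(events)]
    return is_perfect(swapped)
```

The second function already hints at the reversal correspondence used below to interreduce Per_zero(k) and Gainy_zero(k).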
Let A = ⟨Q, f, k, Q_0, Q_F, δ⟩ be a chain system of level k. A run ρ ends in zero whenever for every chain α ∈ [1, n], c^α_L = 0 with L = |ρ|. We write Per_zero(k) and Gainy_zero(k) to denote the variants of Per(k) and Gainy(k), respectively, in which only runs that end in zero are considered. First, note that Per_zero(k) and Per(k) are interreducible in logarithmic space, since it is always possible to add suitable self-loops at final states in order to guarantee that c^α_L = 0 for every chain α ∈ [1, n]. Similarly, Gainy_zero(k) and Gainy(k) are interreducible in logarithmic space.
Proof. Below, we show that Per_zero(k) and Gainy_zero(k) are interreducible in logarithmic space, which proves the lemma since logarithmic-space reductions are closed under composition. From A, let us define a reverse chain system Ā of level k, where the reverse operation swaps increments and decrements (as well as next and prev) in instructions, reverses the direction of transitions, and is lifted accordingly to sets of transitions and to systems. The reverse operation extends to sequences of transitions as follows: the reverse of ε is ε, and the reverse of t · u is the reverse of u followed by the reverse of t. Note that this extends the reverse operation defined in the proof of Theorem 4.12 (step 3).
One can show the following implications, for any run ρ of A that ends in zero:
(1) ρ is perfect and accepting for A implies the reverse of ρ is gainy and accepting for Ā;
(2) the reverse of ρ is gainy and accepting for Ā implies ρ is perfect and accepting for A;
(3) ρ is gainy and accepting for A implies the reverse of ρ is perfect and accepting for Ā;
(4) the reverse of ρ is perfect and accepting for Ā implies ρ is gainy and accepting for A.
The proof of Lemma 5.3 consists of simulating perfect runs by the runs of a VASS in which the control states record the positions of the pointers in the chains.
Proof. It is sufficient to show that Per zero (k) is in (k + 1)expspace.
Let A = ⟨Q, f, k, Q_0, Q_F, δ⟩ be a chain system with f : [1, n] → N. We reduce this instance of Per_zero(k) to several instances of the control state reachability problem for VASS, such that the number of instances is bounded by O(|A|^2) and the size of each instance is in O(exp(k, |A|)^{|A|}), which provides the (k + 1)expspace upper bound by [Rac78]. The only instructions in the VASS defined below are: increment a counter, decrement a counter, or the skip action, which just changes the control state without modifying the counter values (and can obviously be simulated by an increment followed by a decrement).
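The expansion of a chain system into a VASS can be sketched as follows (a simplified, illustrative encoding: pointer positions live in the control state and every chain counter becomes an ordinary VASS counter; the function and tuple formats are assumptions of this sketch, not the paper's exact construction):

```python
from itertools import product

def chain_to_vass(states, chain_lengths, transitions):
    """Expand a chain system into a VASS: a VASS control state pairs the chain
    state with one pointer per chain; counter (alpha, i) is the i-th counter
    of chain alpha, so there are sum(chain_lengths) VASS counters."""
    counters = [(a, i) for a, n in enumerate(chain_lengths) for i in range(n)]
    vass_states = [(q, ptrs) for q in states
                   for ptrs in product(*(range(n) for n in chain_lengths))]
    vass_transitions = []
    for (q, instr, arg, q2) in transitions:
        for (s, ptrs) in vass_states:
            if s != q:
                continue
            ptrs2, update = list(ptrs), {}
            if instr == 'inc':
                update[(arg, ptrs[arg])] = 1
            elif instr == 'dec':
                update[(arg, ptrs[arg])] = -1
            elif instr == 'next' and ptrs[arg] + 1 < chain_lengths[arg]:
                ptrs2[arg] += 1
            elif instr == 'prev' and ptrs[arg] > 0:
                ptrs2[arg] -= 1
            elif instr in ('next', 'prev'):
                continue  # the pointer would fall off the chain
            elif instr == 'first?' and ptrs[arg] != 0:
                continue  # test fails in this control state
            elif instr == 'last?' and ptrs[arg] != chain_lengths[arg] - 1:
                continue
            vass_transitions.append(((q, ptrs), update, (q2, tuple(ptrs2))))
    return counters, vass_states, vass_transitions

# One chain of length 2: increments at 'q', one pointer move to 'r'.
counters, vstates, vtrans = chain_to_vass(
    ['q', 'r'], [2], [('q', 'inc', 0, 'q'), ('q', 'next', 0, 'r')])
```

The number of control states is |Q| times the product of the chain lengths, which is where the exp(k, |A|)-sized instances in the proof come from.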
In order to prove the above equivalence, we can use the transformations (I) and (II) stated below: if instr_i is inc(α), the VASS transition increments the counter currently pointed to in the chain α (a similar clause holds for decrements); otherwise (i.e., if instr_i is neither inc(α) nor dec(α) for any α), instr'_i = skip.

5.2. Hardness Results for Chain Systems. We show that Per(k) is (k + 1)expspace-hard. The implication is that replacing direct access to counters by a pointer that can move along a chain of counters does not decrease the power of VASS, while providing access to more counters. To demonstrate this, we extend Lipton's expspace-hardness proof for the control state reachability problem in VASS [Lip76] (see also its exposition in [Esp98]). Since our pointers can be moved only one step at a time, this extension involves new insights into the control flow of the algorithms used in Lipton's proof, allowing us to implement them even with the limitations imposed by step-wise pointer movements.
Proof. We give a reduction from the control state reachability problem for counter automata with zero tests whose counters are bounded by 2^(2^N), where N = exp(k, n^γ) for some γ ≥ 1 and n is the size of the counter automaton. Without any loss of generality, we can assume that the initial counter values are zero. The main challenge in the reduction to Per(k) is to simulate the zero tests of the counter automaton in a chain system. The main idea, as in Lipton's proof [Lip76], is to use the fact that counters are bounded by 2^(2^N). For each counter c we keep a complement counter c̄ so that the sum of the values of c and c̄ is always 2^(2^N). Testing c for zero is then equivalent to testing that c̄ has the value 2^(2^N), which can be done by decrementing c̄ 2^(2^N) times. This involves extending ideas from [Lip76].
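The complement-counter idea can be sketched with ordinary integers (illustrative only; a chain system cannot compare a counter against the bound directly, and instead detects the condition by decrementing the complement that many times):

```python
class ComplementedCounter:
    """Maintain c + c_bar == bound, so that c == 0 can be detected without a
    zero test: c == 0 exactly when c_bar can survive `bound` decrements."""
    def __init__(self, bound):
        self.bound, self.c, self.c_bar = bound, 0, bound

    def inc(self):
        self.c += 1
        self.c_bar -= 1   # every increment of c decrements the complement

    def dec(self):
        self.c -= 1
        self.c_bar += 1

    def is_zero(self):
        # decrementing c_bar `bound` times succeeds iff c_bar == bound iff c == 0
        return self.c_bar >= self.bound
```

The invariant c + c_bar == bound is exactly the sum maintained between the chains counters and counters̄ below.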
In light of this, the chain system will have two chains: counters and counters̄. The value of the counter c_j of the counter automaton will be maintained in the j-th counter of the chain counters, while the value of the complement counter c̄_j will be maintained in the j-th counter of the chain counters̄. In the descriptions that follow, whenever we write "increment the j-th counter of counters", we implicitly assume that the increment is followed by the decrement of the j-th counter of counters̄, to maintain the sum at 2^(2^N). The transitions of the chain system are designed to ensure the following high-level structure:
(1) Begin by setting the values of the first D counters of the chain counters̄ to 2^(2^N), together with the other initializations that will be needed for simulating zero tests. (Here, D is the number of counters in the original counter automaton.)
(2) For every transition q, c_j ← c_j + 1, q' of the counter automaton, the chain system will have transitions corresponding to the following listing:
(a) move the pointers to the j-th counters of counters and counters̄: (next(counters); next(counters̄))^j; (we use ';' to compose transitions and '(·)^j' to repeat a finite sequence of instructions j times;)
(b) increment the j-th counter of counters, to simulate incrementing c_j: inc(counters);
(c) decrement the j-th counter of counters̄, to maintain the sum of the j-th counters of counters and counters̄ at 2^(2^N): dec(counters̄);
(d) move the pointers of counters and counters̄ back to the first counter and go to q': (prev(counters); prev(counters̄))^j; goto q'
(3) For every transition q : if c_j = 0 goto q_1 else c_j ← c_j − 1 goto q_2 of the counter automaton, the chain system will have transitions corresponding to the following listing:
(a) non-deterministically guess whether the j-th counter is nonzero or zero: q: goto nonzero or zero;
(b) if the guess is that the j-th counter is nonzero, decrement it and go to q_2: nonzero: (next(counters); next(counters̄))^j; dec(counters); inc(counters̄); (prev(counters); prev(counters̄))^j; goto q_2
(c) otherwise, the guess is that the j-th counter is zero, hence the complement of the j-th counter is 2^(2^N); decrement the complement 2^(2^N) times, increment it back to its original value and go to q_1: zero: (next(counters); next(counters̄))^j; [decrement counters̄ 2^(2^N) times; increment counters̄ 2^(2^N) times]; (prev(counters); prev(counters̄))^j; goto q_1.
In (2) above, the chain system simply imitates the counter automaton, maintaining the invariant c_j + c̄_j = 2^(2^N). In (3), the zero test of the counter automaton is replaced by a non-deterministic choice between nonzero and zero. The counter c_j itself can be nonzero or zero, so there are four possible cases. In the cases where the non-deterministic choice coincides with the counter value, the chain system continues to simulate the counter automaton. On the other hand, if nonzero is chosen when c_j = 0, the chain system gets stuck, since (next(counters); next(counters̄))^j; dec(counters) cannot be executed (the j-th counter of counters has value zero and cannot be decremented). If zero is chosen when c_j ≠ 0 (and hence c̄_j < 2^(2^N)), the chain system also gets stuck, since (as we shall show later) the sequence of transitions (next(counters); next(counters̄))^j; [decrement counters̄ 2^(2^N) times] cannot be executed. In the rest of this sub-section, we will show that [decrement counters̄ 2^(2^N) times; increment counters̄ 2^(2^N) times] can be implemented in such a way that there is one run that does exactly what is required, and any other run deviating from the expected behaviour gets stuck. This allows us to conclude that the final state of the given counter automaton is reachable if and only if it is reachable in the constructed chain system.
The basic principle for decrementing some counter 2^(2^N) times is the same one used in Lipton's proof [Lip76], which we now recall briefly. We will have two chains of counters s and s̄. We denote the counters of these chains by s_N, s_{N−1}, ..., s_1 and s̄_N, s̄_{N−1}, ..., s̄_1 respectively. If the counter s̄_N has the value 2^(2^N), we describe how to decrement it 2^(2^N) times. Further, if the counter s̄_i has the value 2^(2^i), we describe how to decrement it 2^(2^i) times. For this, we will use four more chains y, ȳ, z and z̄. Our description assumes that for each i, the counters y_i and z_i have the value 2^(2^i) (initializing these values will be described later, as part of implementing step (1) in the proof of Theorem 5.4). Decrementing s̄_i 2^(2^i) times is done by nested loops, as shown in Figure 3. The outer loop is indexed by y_{i−1} and the inner loop by z_{i−1}. Since both y_{i−1} and z_{i−1} have the value 2^(2^(i−1)), the instruction inside the inner loop (decrementing s̄_i) is executed 2^(2^(i−1)) × 2^(2^(i−1)) = 2^(2^i) times. To implement these loops, we need to test when y_{i−1} and z_{i−1} become zero. This is done the same way as above, replacing i, i−1 by i−1, i−2 respectively. These two zero tests are done recursively by the same set of transitions. After the recursive zero test, there needs to be a mechanism to determine whether the recursive call was made from the inner loop or the outer loop. This mechanism is provided by a chain of counters stack: if the i-th counter of the chain stack has the value 1 (respectively 0), then the recursive call was made by the inner (respectively outer) loop.
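The counting behind the nested loops of Figure 3 can be sketched directly (an illustration of the arithmetic only, not of the transition-level implementation with its recursive zero tests; the data layout is an assumption of this sketch):

```python
def dec_many(counters, i):
    """Decrement the stage-i counter of the chain s exactly 2^(2^i) times,
    using an outer loop over y_{i-1} and an inner loop over z_{i-1}, both of
    which are assumed to hold 2^(2^(i-1))."""
    outer = counters['y'][i - 1]        # = 2^(2^(i-1)) iterations
    inner = counters['z'][i - 1]        # = 2^(2^(i-1)) iterations
    for _ in range(outer):
        for _ in range(inner):
            counters['s'][i] -= 1       # executed outer * inner = 2^(2^i) times

# Stage 2: y_1 = z_1 = 2^(2^1) = 4, and the stage-2 counter holds 2^(2^2) = 16.
counters = {'y': [0, 4], 'z': [0, 4], 's': [0, 0, 16]}
dec_many(counters, 2)
```

In the chain system the loop bounds are not stored as Python integers: exhausting y_{i−1} and z_{i−1} is itself detected by the same decrementing procedure one stage lower.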
We now give the implementation of the instructions that follow the control state named zero in item (3) of page 21 above. In some intermediate configurations during the execution of the implementation, the four pointers of the four chains y, z, s and stack all point to the i-th counters of their chains. In this case, we say that the system is at stage i. The N-th stage is the last one. We frequently need to move all four pointers to the next stage or the previous stage, for which we introduce some macros. The macro Nextstage is short for the sequence of transitions next(y); next(ȳ); ...; next(stack); next(stack̄), which moves all the pointers one stage up. The macro Prevstage similarly moves all the pointers one stage down. The macro transfer(z, s) is intended to transfer the value of the counter of z at the current stage to the respective counter of s. The macro transfer(y, s) is similar, with y and ȳ replacing z and z̄ respectively. The macro inc(next(z)) moves the pointer of the chain z one stage up, increments z once and moves the pointer back one stage down: next(z); inc(z); prev(z). The macros inc(next(y)) and inc(next(s)) are similar to inc(next(z)), with y and s respectively replacing z. The macro dec(next(s)) is similarly defined to decrement the next counter of the chain s. The macro stack == 1 tests whether the counter at the current stage of the chain stack is greater than 0: dec(stack); inc(stack). The following is the set of transitions at the control state zero in (3) in the proof of Theorem 5.4 above. The listing below is written in the form of a program in a low-level programming language for readability. It can easily be translated into a set of transitions of a chain system. There is a control state between every two consecutive instructions below, but only the important ones are given names like "outerloop2".
A line such as "innernonzero2: dec(z); inc(z); goto innerloop2" actually represents the set of transitions  An instruction of the form "q : inc(stack); goto outernonzero2 or outerzero2" represents the set of transitions { q, inc(stack), outernonzero2 , q, inc(stack), outerzero2 }. Depending on the non-deterministic choices made at control states that have multiple transitions enabled, there will be several different runs. We prove later that there is one run which has the intended effect and the other runs will never reach the final state.
The action [decrement counters̄ 2^(2^N) times] in item (3) of page 21 is implemented in Chain system 5.1 by "transfer(counters, s); goto Dec_zerorep" in the first line. There is a non-deterministic choice for exiting "transfer(counters, s)", and we wish to block any run that exits before incrementing s̄ 2^(2^N) times. This is done by "goto Dec_zerorep", which forces the chain system to decrement s̄ 2^(2^N) times.
Suppose, as in the proof of Theorem 5.4, we want to simulate the c_j = 0 case of the counter automaton's transition q : if c_j = 0 goto q_1 else c_j ← c_j − 1 goto q_2. The action [increment counters̄ 2^(2^N) times] in the proof of Theorem 5.4 is implemented above by "transfer(counters, s); goto Dec_zeropass" in the second line. If Dec_zeropass returns successfully, the control can go to q_1 as required. The only difference between Dec_zerorep and Dec_zeropass is the return address, zerorep or zeropass. A generic Dec_address is shown above, which implements the algorithm shown in Figure 3. Formally, every instruction label (such as outerloop2, innerloop2, etc.) should also be parameterized with address, but these parameters have been omitted for the sake of readability. The case "if first(stack)?" implements the base case i = 1. If i > 1, the else branch is taken, where the first instruction moves all pointers one stage down, so that they now point to y_{i−1} and z_{i−1} as required by the algorithm of Figure 3. The loop between outerloop2 and outerzero2 implements the outer loop of Figure 3, and the loop between innerloop2 and innerzero2 implements the inner loop of Figure 3. The non-deterministic choice in innertest2 decides whether to continue the inner loop or to exit. The macro transfer(z, s) at the beginning of innerzero2 resets the (i−1)-th counter of z back to 2^(2^(i−1)), at the same time setting the (i−1)-th counter of s to 2^(2^(i−1)). So the run that behaves as expected increments z̄_{i−1} exactly 2^(2^(i−1)) times. All other runs are blocked by the recursive call to Dec_address at the end of innerzero2. The purpose of inc(stack); dec(stack); in the middle of innerzero2 is to ensure that the recursive call to Dec_address returns to the correct state. After similar tests in outerzero2, "Nextstage" at the beginning of outerexit2 moves all the pointers back to stage i. Then, in backtrack, the correct return state is determined.
Exactly how this is done will be clear in the proof of the following lemma, where we construct runs of the chain system by induction on stage.
Lemma 5.5. In the state zero in the listing above, if the j th counter in the chain counters has the value 0, there is a run that reaches the state q 1 without causing any changes. Any run from the state zero will be blocked if the j th counter in the chain counters has a value other than 0.
Proof. We first prove the existence of a run reaching q_1 when the j-th counter of the chain counters has the value 0. The j-th counter of the chain counters̄ then has the value 2^(2^N). Our run executes transfer(counters, s) to set counters_j and s̄_N to 2^(2^N) and to set s_N and counters̄_j to 0. With these values in the counters, we will prove next that there is a run from Dec_zerorep to zerorep that sets s̄_N to 0 and s_N to 2^(2^N). From zerorep, our run continues by executing transfer(counters, s) to set counters_j and s_N to 0 and to set counters̄_j and s̄_N to 2^(2^N). Then there is again a run from Dec_zeropass to zeropass, which sets s̄_N to 0 and s_N to 2^(2^N). From zeropass, our run moves the pointers of the chains counters and counters̄ back to the first position and goes to state q_1. Now suppose that in the state Dec_address, s̄_N is set to 2^(2^N), s_N is set to 0 and the pointer of the chain stack is at the last position, as mentioned in the previous paragraph. We will prove that there is a run that reaches address, setting s̄_N to 0 and s_N to 2^(2^N). We will in fact prove the following claim by induction on i.
Claim: Suppose that in the state Dec_address, s̄_i is set to 2^(2^i), s_i is set to 0 and the pointers of the chains stack, s, y, z are at position i. Then there is a run ρ_i that reaches DecFinished, setting s̄_i to 0 and s_i to 2^(2^i), with the pointers in the same positions.
When we first call Dec_zerorep or Dec_zeropass, the pointers of the chains stack, s, y, z are at the last position (N). Hence, when the run ρ_N given by the above claim reaches DecFinished, it can continue to the state backtrack and then to zerorep or zeropass respectively.
Since we set stack_i to 1, we can return to outertest2 at the end of ρ_i to continue our ρ_{i+1} */
: outertest2: dec(stack); inc(stack); goto outernonzero2 2^(2^i) − 1 times, then goto outerzero2
: outernonzero2: dec(y); inc(ȳ); goto outerloop2
: outerzero2: transfer(y, s); goto Dec_address /* Follow ρ_i here. Since now stack_i is 0, we can return to outerexit2 at the end of ρ_i to continue our ρ_{i+1} */
: outerexit2: Nextstage; goto DecFinished
This completes the induction step and the proof of the claim.
Finally, we prove that any run from zero will be blocked if the j-th counter of the chain counters has a value other than 0. Indeed, the j-th counter of the chain counters̄ then has a value less than 2^(2^N) when the state is Dec_zerorep. Hence no run can execute innerloop2 and outerloop2 2^(2^(N−1)) times each. Hence, any run will either get blocked while the pointer of the chain s is still at position N, or it will go to the state Dec_address with s̄_{N−1} less than 2^(2^(N−1)). In the latter case, we can argue in the same way to conclude that any run is blocked either while the pointer of the chain s is still at N − 1, or it goes to the state Dec_address with s̄_{N−2} less than 2^(2^(N−2)), and so on. Any run is blocked at some stage between N and 1.
Next, we explain the implementation of step (1) in the proof of Theorem 5.4. We show how to bring the counters to their required initial values (at the beginning of a run, all values are equal to zero). We briefly recall the required initial values: (1) each counter c has value zero; (2) for every i ∈ [1, N], ȳ_i, z̄_i and s̄_i are equal to zero; (3) for every i ∈ [1, N], stack̄_i is equal to zero and stack_i is equal to one; (4) each complement counter c̄ has value 2^(2^N); (5) for every i ∈ [1, N], y_i, z_i and s_i have the value 2^(2^i). The initialization for points (1)-(3) is easy to perform, and below we focus on the initialization for point (5). The initialization of the complement counters in point (4) will be dealt with later, by simply adjusting what is done below. To help achieve the initial values for point (5), we have another chain init, with N counters. We follow the convention that if, at stage i, the counter of the chain init has value 1, then all the counters of all the chains at stage i or below have been properly initialized. By convention, the condition last(init)? is true if the pointer of the chain init is at stage N. The macros Nextstage and Prevstage move the pointer of the chain init, in addition to all the other pointers.
We now give the code used to initialize y, z, s and stack to the required values, assuming that all counters are initially set to 0 and all the pointers of all the chains point to stage 1. The counters y_1, z_1, s_1 and stack_1 are initialized directly. For the counters at higher stages, we again use nested loops similar to those shown in Figure 3, assuming that the counters at lower stages have been initialized. These nested loops are implemented between innerinit-innerzero1 and outerinit-outerzero1. The zero tests involved in terminating these loops are performed by the same decrementing algorithm, assuming that the counters at lower stages are already initialized. This introduces a complication while backtracking from the decrementing algorithm: we have to figure out whether the call to Dec was from innerzero1 or outerzero1, or from within Dec itself as a recursive call. To handle this, the "backtrack" part of the code below has been updated to check whether (next(init) == 1): if so, we are inside some recursive call to Dec; otherwise, the call to Dec was from one of the loops initializing a higher level.
: begininit: (inc(y))^4; (inc(z))^4; (inc(s))^4; inc(init); inc(stack)
: initialize: if last(init)? goto beginsim else goto outerinit
: outerinit: dec(y); inc(ȳ) /* y is the index of the outer loop */
: innerinit: dec(z); inc(z̄) /* z is the index of the inner loop */
: INC: inc(next(y)); inc(next(z)); inc(next(s))
: innertest1: goto innernonzero1 or innerzero1
: innernonzero1: dec(z); inc(z̄); goto innerinit /* inner loop not yet complete */
: innerzero1: transfer(z, s); inc(stack); dec(stack); goto Dec /* inner loop complete.
Lemma 5.6. Suppose the control state is begininit, all the pointers in all the chains are at stage 1 and all the counters at all stages have the value 0. For any i between 1 and N , there is a run ρ i ending at initialise, such that the pointer is at stage i in all the chains and all the counters at or below stage i have been initialised. If a run starts from begininit as above, it will not reach beginsim unless all the counters have been properly initialized.
Proof. By induction on i. For the base case i = 1, ρ_1 is the run that executes the instructions immediately after begininit and ends at initialise. Now we assume the lemma holds up to i and prove it for i + 1. By the induction hypothesis, there is a run ρ_i ending at initialise, with all the pointers of all the chains at stage i and all the counters at or below stage i initialised. The run ρ_{i+1} is obtained by appending the following sequence to ρ_i. It uses the runs ρ_i constructed in the proof of Lemma 5.5.
: initialise: if last(init)? goto beginsim else goto outerinit /* the pointer of the chain init is at stage i ≠ N; go to outerinit */
: outerinit: dec(y); inc(ȳ)
: innerinit: dec(z); inc(z̄)
: INC: inc(next(y)); inc(next(z)); inc(next(s))
: innertest1: goto innernonzero1 2^(2^i) − 1 times, then goto innerzero1
: innernonzero1: dec(z); inc(z̄); goto innerinit /* inner loop not yet complete */
: innerzero1: transfer(z, s); inc(stack); dec(stack); goto Dec /* follow ρ_i here; since (next(init) == 0) and stack == 1, we can return to outertest1 to continue our ρ_{i+1} */
: outertest1: dec(stack); inc(stack); goto outernonzero1 2^(2^i) − 1 times, then goto outerzero1
: outernonzero1: dec(y); inc(ȳ); goto outerinit
: outerzero1: transfer(y, s); goto Dec /* follow ρ_i here; since (next(init) == 0) and stack == 0, we can return to outerexit1 to continue our ρ_{i+1} */
: outerexit1: Nextstage; inc(init); inc(stack); goto initialise
This completes the induction step, proving the existence of a run ending at initialise as required.
Finally we argue that any run from begininit that does not initialize all the counters properly will get stuck. Indeed, the only way out of the initialization code is to go to beginsim, which is only reachable via initialize. From initialize, we can go to beginsim only when the pointers are at stage N . We finish the argument by showing that any run that visits initialize for the first time with pointers at stage i would have initialized all counters at or below stage i. The only way to go to initialize with pointers at stage i is via begininit (for i = 1) or via outerexit1 (for i > 1). It is clear in the case of i = 1 that the counters at stage 1 are properly initialized. In the case of i > 1, the only way to reach outerexit1 is via Dec. In this case, all runs are forced to iterate the loops at outerinit and innerinit the correct number of times as we have seen in the proof of Lemma 5.5, which will again force all the counters at or below stage i to be properly initialized.
To complete the proof of Theorem 5.4, we finally show how to initialize the first D ≤ n counters of counters and counters̄, which correspond to the counters of the original counter automaton (point (4) of the initialization values listed after Lemma 5.5). This is achieved by replacing the code for INC in the initialization phase by the following one:
: INC: inc(next(y)); inc(next(z)); inc(next(s));
: next(init); if last(init)? then
: inc(counters̄);
: (next(counters̄); inc(counters̄))^(D−1);
: (prev(counters̄))^(D−1);
: prev(init)
This finishes the proof of Theorem 5.4.

5.3. Reasoning About Chain Systems with LRV. Given a chain system A of level 1, we construct a polynomial-size formula in LRV that is satisfiable iff A has a gainy accepting run. Hence, we get a 2expspace lower bound for SAT(LRV). The main idea is to encode runs of chain systems by LRV formulas that access the counters using a binary encoding, so that the formulas can handle exponentially many counters.
We encode a word ρ ∈ δ* that represents an accepting run. For this, we use the alphabet δ of transitions. Note that we can easily simulate the labels δ = {t_1, ..., t_m} with the variables t_0, ..., t_m, where a position carries the label t_i iff the formula t_i := (t_0 ≈ t_i) holds. We build an LRV formula ϕ so that there is an accepting gainy run ρ ∈ δ* of A if, and only if, there is a model σ such that σ |= ϕ and σ encodes the run ρ.
The following are standard counter-blind conditions to check. (a) Every position satisfies t for exactly one t ∈ δ. (b) The first position satisfies (q_0, instr, q) for some q_0 ∈ Q_0, instr ∈ I, q ∈ Q. (c) The last position satisfies (q, instr, q′) for some q ∈ Q, instr ∈ I, q′ ∈ Q_F. (d) There are no two consecutive positions i and i + 1 satisfying (q, u, q′) and (p, u′, p′) respectively with q′ ≠ p.
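For concreteness, conditions (a) and (d) admit direct LTL renderings over the transition labels (these formulas are ours, one of several equivalent choices; G and X denote the always and next operators):

```latex
\mathrm{(a)}\quad \mathbf{G}\, \bigvee_{t \in \delta} \Big( t \wedge \bigwedge_{t' \in \delta,\ t' \neq t} \neg t' \Big)
\qquad\qquad
\mathrm{(d)}\quad \mathbf{G} \bigwedge_{\substack{(q,u,q'),\,(p,u',p') \in \delta \\ q' \neq p}} \neg\big( (q,u,q') \wedge \mathbf{X}\,(p,u',p') \big)
```

Conditions (b) and (c) similarly constrain the first position (a disjunction over initial transitions) and the last position (a disjunction over transitions into Q_F, at the position with no successor).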
We use a variable x, and variables x^α_inc, x^α_dec, x^α_i for every chain α and for every i ∈ [1, f(α)]. Let us fix the bijections χ_α : [0, 2^{f(α)} − 1] → 2^{[1, f(α)]} for every α ∈ [1, n], which assign to each number m the set of 1-bit positions of the representation of m in base 2. We say that a position i in a run is α-incrementing [resp. α-decrementing] if it satisfies (q, u, q′) for some q, q′ ∈ Q and u = inc(α) [resp. u = dec(α)].
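Concretely, χ_α is just the bits-to-set view of a binary counter. A minimal sketch (our function names; following the convention used below that bit 1 is the most significant):

```python
def chi(m: int, f: int) -> frozenset:
    """Map m in [0, 2**f - 1] to the set of positions in [1, f] at which the
    f-bit binary representation of m has a 1 (position 1 = most significant)."""
    assert 0 <= m < 2 ** f
    return frozenset(i for i in range(1, f + 1) if (m >> (f - i)) & 1)

def chi_inv(bits: frozenset, f: int) -> int:
    """Inverse of chi: recover the counter number from its 1-bit positions."""
    return sum(1 << (f - i) for i in bits)
```

For instance, with f = 3 the counter 5 = 101 in base 2 corresponds to the set {1, 3}.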
In the context of a model σ with the properties ((a))–((d)), we say that a position i in a run operates on the α-counter c if χ_α(c) = X_i; note that thus 0 ≤ c < 2^{f(α)}. For every chain α, let us consider the following properties:

Claim 5.8. The chain system A has an accepting gainy run if, and only if, there is a model σ satisfying the properties above.

Proof. (Only if) Suppose we have a model σ that verifies all the properties above. Since, by (1), all the positions have different data values for x^α_dec, it follows that the position j to which property (4) makes reference is unique. Since, also by (1), all the positions have different data values for x^α_inc, the α-decrement corresponds to at most one α-increment position in condition (4). Moreover, it is an α-decrement of the same counter, that is, X_i = X_j (cf. (5.2)). For these reasons, for every α-increment there is a future α-decrement of the same counter which corresponds in a unique way to that increment. More precisely, for every chain α, the function γ^α_σ is well-defined and injective, and whenever γ^α_σ(i) = j we have j > i and X_i = X_j. Consider ρ ∈ δ* of the same length as σ and such that ρ(i) = (q, instr, q′) whenever σ, i |= (q, instr, q′). From conditions (2) and (3), it follows that the counter c^α_i on which position i of ρ operates is equal to χ_α^{−1}(X_i). Therefore, the functions {γ^α_σ}_{α ∈ [1, n]} witness the fact that ρ is a gainy and accepting run of A.
(If) Let us now focus on the converse, by exhibiting a model satisfying the above properties, assuming that A has an accepting gainy run ρ ∈ δ*. We therefore have, for every α ∈ [1, n], an injective function γ^α such that whenever γ^α(i) = j, we have j > i and c^α_i = c^α_j. We build a model σ of the same length as ρ, and we now describe its data values. Let us fix two distinct data values d, e. For any position i, we have σ(i)(t_0) = d; and σ(i)(t_j) = d if the i-th transition in ρ is t_j, and σ(i)(t_j) = e otherwise. In this way we make sure that the properties ((a))–((d)) hold.
We use distinct data values d_0, ..., d_{|σ|−1} and d′_0, ..., d′_{|σ|−1} (all different from d, e). We define the data values of the remaining variables for any position i as follows. Observe that the last two items ensure that properties (2) and (3) hold.
By definition, every position of σ has a different data value for x^α_inc. Let us show that the same holds for x^α_dec. Note that at any position i, x^α_dec has either the data value σ(i′)(x^α_inc) = d_{i′} for some i′ < i, or the data value d′_i. Suppose there were two positions i < j with the same data value for x^α_dec. It is evident that none of (i), (ii), (iii) can hold, since all d_0, ..., d_{|σ|−1}, d′_0, ..., d′_{|σ|−1} are distinct. For this reason, if (iv) holds, it means that i′ = j′, and hence that (γ^α)^{−1}(i) = (γ^α)^{−1}(j), implying that γ^α is not injective, which is a contradiction. Hence, all the positions have different data values for x^α_dec. Therefore, σ has the property (1).
To show that σ has property (4), let i be an α-incrementing position of σ and recall that σ(i)(x^α_inc) = d_i. Note that the position γ^α(i) = j must be α-decrementing on the same counter. By definition of the value of x^α_dec, we have that σ(j)(x^α_dec) must be equal to σ((γ^α)^{−1}(j))(x^α_inc) = σ(i)(x^α_inc) = d_i. Thus, property (4) holds. We complete the reduction by showing that all the properties expressed before can be efficiently encoded in our logic.
Claim 5.9. Properties ((a))–((d)) and (1)–(4) can be expressed by formulas of LRV that can be constructed in polynomial time in the size of A.
Proof. Along the proof, we use the following formulas, for any α ∈ [1, n] and for any i ∈ [1, f(α)], where bit 1 represents the most significant bit. Now, we show how to code each of the properties.
• Properties ((a))-((d)) are easy to express in LRV and present no complications.
• For expressing (1), we force x^α_inc and x^α_dec to have a different data value at every position of the model.
• Property (2) is straightforward to express in LRV. We then proceed by making sure that all the bits before i are preserved, by making every bit greater than or equal to i be a zero, and finally by swapping the bit i.
Hence, for every α ∈ [1, n], the property to check combines the constraints above. The formula expressing the property for decrements of counter pointers (prev(α)) is analogous.
• Finally, we express property (4) by testing, for every α-incrementing position, the following: if the α-increment has some data value d at variable x^α_inc, there must be exactly one future position j where x^α_dec carries the data value d (since every x^α_dec has a different value, by (1)). For this reason, both positions (the α-increment and the α-decrement) operate on the same counter, and thus the formula faithfully expresses property (4).
As a corollary of Claims 5.8 and 5.9 we obtain a polynomial-time reduction from Gainy(1) into the satisfiability problem for LRV(X, F).

6. A Robust Equivalence
We have seen that the satisfiability problem for LRV is equivalent to the control state reachability problem in an exponentially larger VASS. In this section, we evaluate how robust this result is with respect to LRV variants and fragments. We consider infinite data words (instead of finite data words), finite sets of MSO-definable temporal operators (instead of X, X^{−1}, S, U) and also repetitions of pairs of values (instead of repetitions of single values).
6.1. Infinite Words with Multiple Data. So far, we have considered only finite words with multiple data. It is also natural to consider the variant with infinite words, but it is known that this may lead to undecidability: for instance, freeze LTL with a single register is decidable over finite data words whereas it is undecidable over infinite data words [DL09]. By contrast, FO^2 is decidable over finite data words as well as over infinite data words [BMS+06]. Let SAT^ω(·) be the variant of the satisfiability problem SAT(·) in which infinite models of length ω are taken into account instead of finite ones. The satisfaction relation for LRV, PLRV, LRV^⊤, etc. on ω-models is defined accordingly. Proof. (I) The developments of Section 4.1 apply to the infinite case, and since SAT^ω(PLRV^⊤) is shown decidable in [DDG12], we get decidability of SAT^ω(PLRV). (II) First, note that SAT^ω(LRV) is 2expspace-hard. Indeed, there is a simple logarithmic-space reduction from SAT(LRV) into SAT^ω(LRV), which can be performed as for standard LTL. It is sufficient to introduce two new variables x_new and y_new, to state that x_new ≈ y_new is true on a finite prefix of the model (herein x_new ≈ y_new plays the role of a new propositional variable), and to relativize all the temporal operators and obligations to positions on which x_new ≈ y_new holds true.
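Spelled out, the relativization is the standard finite-prefix trick for LTL, sketched here in our notation: abbreviate p := (x_new ≈ y_new), translate homomorphically through Boolean connectives, and set

```latex
\mathrm{rel}(\mathbf{X}\psi) \;=\; \mathbf{X}\big(p \wedge \mathrm{rel}(\psi)\big),
\qquad
\mathrm{rel}(\psi \,\mathbf{U}\, \chi) \;=\; \big(p \wedge \mathrm{rel}(\psi)\big) \,\mathbf{U}\, \big(p \wedge \mathrm{rel}(\chi)\big),
\qquad
\mathrm{rel}(x \approx_{\psi?} y) \;=\; x \approx_{(p \wedge \mathrm{rel}(\psi))?}\, y.
```

The final formula p ∧ rel(φ) ∧ (p U G¬p) forces p to hold on a nonempty finite prefix and nowhere afterwards, so the ω-models of the translation, restricted to their p-prefix, are exactly the finite models of φ.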
Concerning the complexity upper bound, from Section 4.1 we can conclude that there is a polynomial-time reduction from SAT^ω(LRV) into SAT^ω(LRV^⊤). The satisfiability problem SAT^ω(LRV^⊤) is shown decidable in [DDG12], and we can adapt developments from Section 4.2 also to get a 2expspace upper bound for SAT^ω(LRV^⊤). This is the purpose of the rest of the proof.
By analyzing the constructions from [DDG12, Section 7], one can show that a formula φ of LRV^⊤ built over the variables {x_1, ..., x_k} is ω-satisfiable iff there are Z ⊆ P^+({x_1, ..., x_k}), a VASS A^Z_φ = ⟨Q, Z, δ⟩ along with sets Q_0, Q_f ⊆ Q of initial and final states respectively, and a Büchi automaton B_Z such that: (1) ⟨q_0, 0⟩ →* ⟨q_f, 0⟩ in A^Z_φ for some q_0 ∈ Q_0 and q_f ∈ Q_f; (2) B_Z accepts a non-empty language.
By arguments similar to those from Section 4.2, the existence of a run ⟨q_0, 0⟩ →* ⟨q_f, 0⟩ can be checked in 2expspace. Observe that in the construction in [DDG12, Section 7], counters in P^+({x_1, ..., x_k}) \ Z are also updated in A^Z_φ (providing the VASS A_φ), but one can show that this is not needed, because of the optional decrement condition and because B_Z is precisely designed to take care of the future obligations in the infinite word related to the counters in P^+({x_1, ..., x_k}) \ Z. B_Z can be built in exponential time and it is of exponential size in the size of φ. Hence, non-emptiness of B_Z can be checked in expspace. Finally, the number of possible subsets Z is at most double exponential in the size of φ, which allows us to get a nondeterministic algorithm in 2expspace, and hence a 2expspace upper bound by Savitch's Theorem [Sav70].
6.2. MSO-Definable Temporal Operators.
Proof. 2expspace-hardness is inherited from LRV. In order to establish the complexity upper bound, first note that LTL extended with a fixed finite set of MSO-definable temporal operators preserves the nice properties of LTL (see e.g. [GK03]): • Model-checking and satisfiability problems are pspace-complete.
• Given a formula φ from such an extension, one can build a Büchi automaton A_φ accepting exactly the models of φ, and the size of A_φ is in O(2^{p(|φ|)}) for some polynomial p(·) (depending on the finite set of MSO-definable operators). All these results hold because the set of MSO-definable operators is finite and fixed; otherwise the complexity of satisfiability and the size of the Büchi automata are of non-elementary magnitude in the worst case. Moreover, this holds for finite and infinite models.
In order to obtain the 2expspace upper bound, the following properties are now sufficient: (1) Following developments from Section 4.1, it is straightforward to show that there is a logarithmic-space reduction from the satisfiability problem for LRV + {⊕_1, ..., ⊕_N} into the satisfiability problem for LRV^⊤ + {⊕_1, ..., ⊕_N}.
(2) By combining [GK03] and [DDG12, Theorem 4] (see also Appendix A), a formula φ in LRV^⊤ + {⊕_1, ..., ⊕_N} is satisfiable iff ⟨q_0, 0⟩ →* ⟨q_f, 0⟩ for some q_0 ∈ Q_0 and q_f ∈ Q_f in some VASS A_φ such that the number of states is exponential in |φ|. The relatively small size of A_φ is due to the fact that A_φ is built as the product of a VASS checking obligations (of exponential size in the number of variables and in the size of the local equalities) and of a finite-state automaton accepting the symbolic models of φ, of exponential size thanks to [GK03] (see also more details in Section 4.2). By using arguments from Section 4.2, we can then reduce the existence of a run ⟨q_0, 0⟩ →* ⟨q_f, 0⟩ to an instance of the control state reachability problem in some VASS of linear size in the size of A_φ, whence the 2expspace upper bound.
Note that PLRV augmented with MSO-definable temporal operators is decidable too. Indeed, it can be translated into PLRV^⊤ augmented with MSO-definable temporal operators by adapting the developments from Section 4.1. Then, the satisfiability problem for this latter logic can be translated into the reachability problem for VASS as done in [DDG12, Theorem 4], except that the finite-state automata are built according to [GK03].
6.3. The Now Operator or the Effects of Moving the Origin. The satisfiability problem for Past LTL with the temporal operator Now is known to be expspace-complete [LMS02]. The satisfaction relation is parameterised by the current position of the origin, and past-time temporal operators use that position. For instance, one considers i, o ∈ N with o ≤ i ('o' is the position of the origin). The powerful operator Now can obviously be defined in MSO, but not with the above definition, since it requires two free position variables: one refers to the current position of the origin, and past-time operators are interpreted relative to that position.
Proof. Again, 2expspace-hardness is inherited from LRV. In order to get the 2expspace upper bound, we use arguments similar to those from forthcoming Proposition 6.4. Indeed, the decidability proof from [DDG12, Theorem 4] (see also Appendix A) can be adapted to LRV + Now. The only difference is that the finite-state automaton is of double exponential size, see details below. Despite this exponential blow-up, the 2expspace upper bound can be preserved. Let φ be a formula with k variables. There exist a VASS A_dec = ⟨Q, C, δ⟩ and Q_0, Q_f ⊆ Q such that φ is satisfiable iff there are q_f ∈ Q_f and q_0 ∈ Q_0 such that ⟨q_f, 0⟩ →* ⟨q_0, v⟩ for some counter valuation v. Note that the number of counters in A_dec is bounded by 2^k, card(Q) is double exponential in |φ| and the maximal value for an update in a transition of A_dec is k. Indeed, a formula from Past LTL + Now is equivalent to a Büchi automaton of double exponential size [LMS02]. Moreover, deciding whether a state in Q belongs to Q_0 [resp. Q_f] can be done in exponential space in |φ|, and δ can be decided in exponential space too. By using [Rac78] (see also [DJLL09]), we can easily conclude that the length of such a run can be bounded by p(|A_dec| + max(A_dec))^{f(2^k)}, where p is a polynomial, f is a map of double exponential growth and max(A_dec) denotes the maximal absolute value in an update (bounded by k presently). In order to take advantage of the results on VAS (vector addition systems, without states), we use the translation from VASS to VAS introduced in [HP79]: if a VASS has N_1 control states, the maximal absolute value in an update is N_2 and it has N_3 counters, then we can build a VAS (preserving coverability properties) with N_3 + 3 counters in which the maximal absolute value in an update is max(N_2, N_1^2). From A_dec, we can indeed build an equivalent VAS with a number of counters bounded by 2^k + 3 and with a maximal absolute value in an update at most double exponential in |φ|.
So, the length of the run is at most triple exponential in |φ|. Consequently, the satisfiability problem for LRV + Now is in 2expspace.
This contrasts with the undecidability results from [KSZ10, Theorem 5] in the presence of the operator Now. In [KSZ10, Theorem 5], the logic has two sorts of formulas: position formulae and class formulae (a class being a sequence of positions with the same data value). It is a more expressive formalism that can navigate both the word in the usual way and its data classes.

6.4. Bounding the Number of Variables. Given the relationship between the number of variables in a formula and the number of counters needed in the corresponding VASS, we investigate the consequences of fixing the number of variables. Interestingly, this classical restriction has an effect only for LRV^⊤, i.e., when test formulas φ are restricted to ⊤ in x ≈_φ? y. Let LRV_k [resp. LRV^⊤_k, PLRV_k] be the restriction to formulas with at most k variables. In [DDG12, Theorem 5], it is shown that SAT(LRV^⊤_1) is pspace-complete by establishing a reduction into the reachability problem for VASS with linearly bounded counter values. Below, we generalize this result to any k ≥ 1 by using the proof of Theorem 4.12 and the fact that the control state reachability problem for VASS with at most k counters (where k is a constant) is in pspace.

Figure 4: Example of the reduction from SAT(LRV) into SAT(LRV_1), for k = 3, N = 3 and d = 5.

Proposition 6.4. For every k ≥ 1, SAT(LRV^⊤_k) is pspace-complete.
Proof. pspace-hardness is due to the fact that LTL with a single propositional variable is pspace-hard [DS02], which can be easily simulated with the atomic formula x ≈ Xx. Let k ≥ 1 be some fixed value and φ ∈ LRV^⊤_k. In the proof of Theorem 4.12, we have seen that there exist a VASS A_dec = ⟨Q, C, δ⟩ and Q_0, Q_f ⊆ Q such that φ is satisfiable iff there are q_f ∈ Q_f and q_0 ∈ Q_0 such that ⟨q_f, 0⟩ →* ⟨q_0, v⟩ for some counter valuation v. Note that the number of counters in A_dec is bounded by 2^k, card(Q) is exponential in |φ| and the maximal value for an update in a transition of A_dec is k. Moreover, deciding whether a state in Q belongs to Q_0 [resp. Q_f] can be done in polynomial space in |φ|, and δ can be decided in polynomial space too. By using [Rac78] (see also [DJLL09]), we can easily conclude that the length of such a run can be bounded by p(|A_dec| + max(A_dec))^{f(k)}, where p is a polynomial, f is a map of double exponential growth and max(A_dec) denotes the maximal absolute value in an update (bounded by k presently). Since k is fixed, the length of the run is at most exponential in |φ|. Consequently, the following polynomial-space nondeterministic algorithm allows us to check whether φ is satisfiable: guess q_f ∈ Q_f and q_0 ∈ Q_0, and guess on-the-fly a run of length at most p(|A_dec| + max(A_dec))^{f(k)} from ⟨q_f, 0⟩ to some ⟨q_0, v⟩. Counter valuations can be represented in polynomial space too. By Savitch's Theorem [Sav70], we conclude that the satisfiability problem for LRV^⊤_k is in pspace.
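The guess-and-check procedure can be illustrated with a toy deterministic search: below, the nondeterministic on-the-fly guessing is replaced by an explicit bounded exploration of configurations, with the Rackoff-style bound p(|A_dec| + max(A_dec))^{f(k)} abstracted into a plain parameter `bound` (the dictionary encoding of the VASS is ours):

```python
def bounded_run_exists(transitions, start, targets, bound):
    """Search for a VASS run of length at most `bound` from `start` to a
    configuration whose state lies in `targets`.
    transitions: state -> list of (update_vector, successor_state).
    start: (state, tuple_of_counter_values)."""
    stack = [(start[0], start[1], 0)]
    seen = set()
    while stack:
        q, v, steps = stack.pop()
        if q in targets:
            return True
        if steps == bound or (q, v, steps) in seen:
            continue
        seen.add((q, v, steps))
        for update, q2 in transitions.get(q, []):
            v2 = tuple(c + d for c, d in zip(v, update))
            if all(c >= 0 for c in v2):  # a VASS step may not make a counter negative
                stack.append((q2, v2, steps + 1))
    return False
```

For example, with transitions = {"qf": [((1,), "qf"), ((0,), "q0")]}, a run from ("qf", (0,)) to state "q0" exists within 3 steps, whereas a transition that would drive the single counter below zero is blocked.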
This does not imply that LRV_k is in pspace, since the reduction from LRV into LRV^⊤ in Section 4.1 introduces new variables. In fact, it introduces a number of variables that depends on the size of the formula. It turns out that this is unavoidable, and that satisfiability remains 2expspace-hard even with a single variable, by the following reduction.
Proof. The idea of the coding is the following. Suppose we have a formula ϕ ∈ LRV using k variables x_1, ..., x_k and a model σ such that σ |= ϕ.
We will encode σ in a model σ_ϕ that stores in only one variable, say x, the whole model σ restricted to x_1, ..., x_k. To this end, σ_ϕ is divided into N segments s_1 ⋯ s_N of equal length, where N = |σ|. A fresh data value is used as a special constant: suppose that d is a data value that does not occur in σ. Then each segment s_i has length k′ = 2k + 1 and consists of the data values "d d_1 d d_2 ... d d_k d", where d_j = σ(i)(x_j). Figure 4 contains an example for k = 3 and N = 3. In fact, we can force the model to have this shape with LRV_1. Note that with this coding, we can tell that we are between two segments if there are two consecutive equal data values. In fact, we are at a position corresponding to x_i (for i ∈ [1, k]) inside a segment if we are standing at the 2i-th element of a segment, and we can test this with a formula γ_i. Using these formulas γ_i, we can translate any test x_i ≈? x_j into a formula that (1) moves to the position 2i of the segment (the one corresponding to the x_i data value), and (2) tests x ≈_{γ_j}? x. We can do this similarly with all formulas.
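The segment encoding is easy to make concrete. A sketch (our representation: a k-variable model is a list of k-tuples of data values, and we use −1 for the fresh datum d, assuming it occurs nowhere in σ):

```python
def encode(sigma, d=-1):
    """Flatten a k-variable model into the one-variable model sigma_phi:
    position i of sigma becomes the segment d d1 d d2 ... d dk d
    of length 2k + 1, where dj = sigma(i)(x_j)."""
    word = []
    for values in sigma:
        for v in values:
            word.extend([d, v])
        word.append(d)
    return word

def decode(word, k):
    """Recover sigma: the 2j-th element (1-based) of each segment carries x_j."""
    seg = 2 * k + 1
    assert len(word) % seg == 0
    return [tuple(word[i + 2 * j - 1] for j in range(1, k + 1))
            for i in range(0, len(word), seg)]
```

Two consecutive equal data values occur exactly at segment borders, which is the shape the formula many-segments below exploits.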
Let us now explain in more detail how to do the translation, and why it works. Consider the following property of a given position i, multiple of k′, of a model σ_ϕ: for every j ∈ [0, k′ − 1], σ_ϕ(i + j) = σ_ϕ(i) if, and only if, j is even, and either i + k′ ≥ |σ_ϕ| or σ_ϕ(i + k′) = σ_ϕ(i). This property can be easily expressed with a formula segment-k′. Note that this formula tests in particular that we are standing at the beginning of a segment.
It is now straightforward to produce an LRV_1 formula that tests: • σ_ϕ, 0 |= segment-k′; • for every position i such that σ_ϕ(i)(x) = σ_ϕ(i + 1)(x), we have σ_ϕ, i + 1 |= segment-k′. Let us call many-segments the formula expressing this property. Note that the property implies that σ_ϕ is a succession of segments, as in Figure 4. We now give the translation of a formula ϕ of LRV with k variables into a formula ϕ′ of LRV_1 with one variable.
(and similarly for ≈). Now we have to translate x_i ≈? x_j. In our encoding, this amounts to saying that the position corresponding to x_i in the current segment carries the same data value as the position corresponding to x_j in some future segment. Note that the following formula ξ_{i,j} does not exactly encode this property. For example, suppose that we would like to test x_2 ≈? x_3 at the first element of the model σ_ϕ depicted in Figure 4. Although ξ_{2,3} holds, the property is not true: there is no future segment with the data value 1 in the position encoding x_3. In fact, the formula ξ_{i,j} encodes the property correctly only when the current data values of x_i and x_j differ. However, this is not a problem, since when x_i ≈ x_j holds, the formula x_i ≈? x_j is equivalent to x_j ≈? x_j. We can then translate the formula as follows.
Recall that x_i ≉? x_j is not translated, since such formulae can be eliminated. We then define ϕ′ = many-segments ∧ tr(ϕ).
Claim 6.6. ϕ is satisfiable if, and only if, ϕ′ is satisfiable.
Proof. In fact, if σ |= ϕ, then by the discussion above, σ_ϕ |= ϕ′. If, on the other hand, σ′ |= ϕ′ for some σ′, then since σ′ |= many-segments, it has to be a succession of segments of size 2k + 1, and we can recover a model σ of size |σ′|/(2k + 1) where σ(i)(x_j) is the data value of the 2j-th position of the i-th segment of σ′. For this model, we have σ |= ϕ.
This coding can also be extended with past obligations in a straightforward way. Therefore, there is also a reduction from PLRV into PLRV_1.
6.5. The Power of Pairs of Repeating Values. Let us consider a last variant of LRV in which the repetition of tuples of values is possible. Such an extension amounts to introducing additional variables in a first-order setting, which may lead to undecidability, since three variables are enough for undecidability, see e.g. [BDM+11]. However, LRV makes a restricted use of variables, leading to 2expspace-completeness, so there might be hope that LRV augmented with repetitions of pairs of values has a reasonable computational cost. Consider an extension of LRV with atomic formulas of the form (x_1, ..., x_k) ≈_φ? (y_1, ..., y_k), where x_1, ..., x_k, y_1, ..., y_k ∈ VAR. This extends x ≈_φ? y by testing whether the vector of data values of the variables (x_1, ..., x_k) at the current position coincides with that of (y_1, ..., y_k) at a future position.
The semantics are extended accordingly: σ, i |= (x_1, ..., x_k) ≈_φ? (y_1, ..., y_k) iff there exists j such that i < j < |σ|, σ, j |= φ, and σ(i)(x_l) = σ(j)(y_l) for every l ∈ [1, k]. We call this extension LRV^vec. Unfortunately, we can show that SAT(LRV^vec) is undecidable, even when only tuples of dimension 2 are allowed. This is proved by reduction from a variant of Post's Correspondence Problem (PCP), see below. In order to code solutions of PCP instances, we adapt a proof technique used in [BDM+11] for first-order logic with two variables and two equivalence relations on words. However, our proof uses only future modalities (unlike the proof of [BDM+11, Proposition 27]) and no past obligations (unlike the proof of [KSZ10, Theorem 4]). To prove this result, we work with a variant of PCP in which solutions u_{i_1} ⋯ u_{i_n} = v_{i_1} ⋯ v_{i_n} have to satisfy |u_{i_1} ⋯ u_{i_j}| ≤ |v_{i_1} ⋯ v_{i_j}| for every j.
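Operationally, the extended atomic formula is a simple check, as the following sketch shows (our representation: a model is a list of dicts from variable names to data values, and phi is an arbitrary predicate on positions):

```python
def sat_vec(sigma, i, xs, ys, phi):
    """Decide sigma, i |= (x1,...,xk) ≈_phi? (y1,...,yk): some strictly later
    position j satisfies phi and agrees with position i component-wise,
    matching each x_l at i against y_l at j."""
    return any(
        phi(sigma, j) and all(sigma[i][x] == sigma[j][y] for x, y in zip(xs, ys))
        for j in range(i + 1, len(sigma))
    )
```

With k = 1 this degenerates to the usual x ≈_phi? y test; the undecidability below already arises at k = 2.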
Lemma 6.10. The MPCP dir problem is undecidable.
This is a corollary of the undecidability proof for PCP as done in [HMU01]. By reducing MPCP dir to satisfiability for LRV vec , we get undecidability.
Proof. In [HMU01], the halting problem for Turing machines is first reduced to a variation of PCP called modified PCP. In modified PCP, a requirement is that the solution should begin with the pair u_1, v_1 (this requirement can be easily eliminated by encoding it into the standard PCP, but here we find it convenient to work in the presence of a generalisation of this requirement). To ensure that there is a solution of even length whenever there is a solution, we let u_2 = $u_1 and v_2 = $v_1 for a fresh letter $, so that a solution may begin with the second pair instead of the first one, which changes the parity of its length. In modified PCP, |u_1| = 1 and |v_1| ≥ 3. Hence, |u_2| = 2 and |v_2| ≥ 3. The resulting set of strings u_1, ..., u_n, v_1, ..., v_n makes up our instance of MPCP^dir.
In the encoding of the halting problem from the cited work, |u_i| ≤ 2 for any i between 1 and n (this continues to hold even after we add $ to the first pair as above). A close examination of the proof in the cited work reveals that if the modified PCP instance u_1, ..., u_n, v_1, ..., v_n ∈ Σ* has a solution u_{i_1} ⋯ u_{i_m} = v_{i_1} ⋯ v_{i_m}, then for every j with |u_{i_1}| < j < |u_{i_1} ⋯ u_{i_m}|, if the j-th position of v_{i_1} ⋯ v_{i_m} occurs in v_{i_k} for some k, then the j-th position of u_{i_1} ⋯ u_{i_m} occurs in u_{i_{k′}} for some k′ > k (we call this the directedness property). In short, the reason for this is that for any k, if v_{i_1} ⋯ v_{i_k} encodes the first ℓ + 1 consecutive configurations of a Turing machine, then u_{i_1} ⋯ u_{i_k} encodes the first ℓ configurations. Hence, for any letter in v_{i_{k+1}} (which starts encoding the (ℓ + 2)-th configuration), the corresponding letter cannot occur in u_{i_1} ⋯ u_{i_{k+1}} (unless the single string u_{i_{k+1}} encodes the entire (ℓ + 1)-th configuration and starts encoding the (ℓ + 2)-th configuration; this however is not possible, since there are at most 3 letters in u_{i_{k+1}} and encoding a configuration requires at least 4 letters). After the last configuration has been encoded, the length of u_{i_1} ⋯ u_{i_k} starts catching up with the length of v_{i_1} ⋯ v_{i_k} with the help of pairs (u, v) where |u| = 2 and |v| = 1. However, as long as the lengths do not catch up, the directedness property continues to hold, and as soon as they do catch up, we have a solution to the PCP. Only the positions of u_{i_1} and the last position of the solution violate the directedness property.
Given an instance pcp of MPCP^dir, we construct an LRV^vec(X, U) formula φ_pcp such that pcp has a solution iff φ_pcp is satisfiable. To do so, we adapt a proof technique from [BDM+11, KSZ10], but we need to provide substantial changes in order to fit our logic. Moreover, none of the results in [BDM+11, KSZ10] allows us to derive our main undecidability result, since we use neither past-time temporal operators nor past obligations of the form (x, x′) ≈_φ?^{−1} (y, y′). Let Σ̄ = {ā | a ∈ Σ} be a disjoint copy of Σ. For convenience, we assume that each position of an LRV model is labelled by a letter from Σ ∪ Σ̄ (these labels can be easily encoded as equivalence classes of some extra variables). For such a model σ, let σ_Σ (resp. σ_Σ̄) be the model obtained by restricting σ to positions labelled by Σ (resp. Σ̄). If u_{i_1} ⋯ u_{i_m} = v_{i_1} ⋯ v_{i_m} is a solution to pcp, the idea is to construct an LRV model whose projection to Σ ∪ Σ̄ interleaves the solution with its barred copy. To check that such a model σ actually represents a solution, we will write LRV^vec(X, U) formulas to say that "for all j, if the j-th position of σ_Σ is labelled by a, then the j-th position of σ_Σ̄ is labelled by ā".
The main difficulty is to get a handle on the j-th position of σ_Σ, which arises since the LRV model σ is an interleaving of σ_Σ and σ_Σ̄. This is handled by using two variables x and y. Positions 1 and 2 of σ_Σ will have the same value of x, positions 2 and 3 will have the same value of y, positions 3 and 4 will have the same value of x, and so on. Generalising, odd positions of σ_Σ will have the same value of x as the next position of σ_Σ and the same value of y as the previous position of σ_Σ; even positions of σ_Σ will have the same value of x as the previous position of σ_Σ and the same value of y as the next position of σ_Σ. These constraints allow us to chain the positions of σ corresponding to σ_Σ. To easily identify odd and even positions, an additional label is introduced at each position, which can be O or E. These labels encoding the parity status of the positions in σ_Σ will be helpful to write formulae. In the sequence σ_Σ, the d_i's and d′_i's are the data values of the variables x and y respectively. Each label is actually a pair in Σ × {O, O_in, E, E_fi}, recording the letter from Σ and the parity status of the position. The letter O identifies an odd position (a sequence starts here from the first position) and the first position is identified by the special letter O_in. Similarly, the letter E identifies an even position and the last position is identified by the special letter E_fi. In order to define the sequence σ_Σ̄, we consider the letters Ō, Ē, Ō_in and Ē_fi with analogous intentions. So, by way of example, each position of σ_Σ is labelled by two letters: a letter in Σ and a letter specifying the parity of the position.
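The alternating x/y chaining can be generated mechanically; the sketch below (our helper, 1-based positions) produces, for each position of σ_Σ, a pair (x, y) of data values such that x links positions (1, 2), (3, 4), ... and y links positions (2, 3), (4, 5), ..., with x-values and y-values drawn from disjoint ranges so the two chains never collide:

```python
def chain_values(n):
    """Data values (x, y) for positions 1..n of sigma_Sigma:
    x is shared by positions 2m-1 and 2m; y is shared by 2m and 2m+1.
    The remaining endpoints get otherwise unused values."""
    values = []
    for p in range(1, n + 1):
        x = (p + 1) // 2      # pairs (1,2), (3,4), ...
        y = n + 1 + p // 2    # pairs (2,3), (4,5), ...; disjoint from x-range
        values.append((x, y))
    return values
```

Consecutive positions then agree on exactly one of the two variables, which is what lets the proof chain successor positions of σ_Σ.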
We assume that the atomic formula a (for a ∈ Σ) is true at a position j of a model σ iff position j is labelled by the letter a. We denote ⋁_{a ∈ Σ} a by Σ. For a word u ∈ Σ*, we denote by φ^i_u the LRV formula ensuring that, starting from i positions to the right of the current position, the next |u| positions are labelled by the respective letters of u. We enforce a model of the form described above through the following formulas.
(1) The projection of the labelling of σ onto Σ ∪ Σ̄ is in (u_1 v̄_1 + u_2 v̄_2){u_i v̄_i | i ∈ [1, n]}*: to facilitate writing this condition in LRV, we introduce two new variables z^1_b, z^2_b such that at any position they have the same value only if that position is not the starting point of some pair u_i v̄_i.
The same condition for σ_Σ̄, enforced with a formula similar to the one above.
Note that O_in [resp. E_fi] is useful to identify the first odd position [resp. the last even position]. For every position i of σ_Σ̄ labelled by Ō or Ō_in, there exists a future position j > i of σ_Σ̄ labelled by Ē or Ē_fi such that σ_Σ̄(i)(x) = σ_Σ̄(j)(x) (enforced with a formula similar to the one above). (d) For every position i of σ_Σ labelled by E, there exists a future position j > i of σ_Σ labelled by O such that σ_Σ(i)(y) = σ_Σ(j)(y).
For every position i of σ_Σ̄ labelled by Ē, there exists a future position j > i of σ_Σ̄ labelled by Ō such that σ_Σ̄(i)(y) = σ_Σ̄(j)(y) (enforced with a formula similar to the one above).
(3) For any position i of u_{i_1}, the corresponding position j in v_{i_1} ⋯ v_{i_m} (which always happens to be in v_{i_1}, since |u_{i_1}| ≤ 2 and |v_{i_1}| ≥ 3) should satisfy σ(i)(x) = σ(j)(x) and σ(i)(y) = σ(j)(y). In addition, position i is labelled with a ∈ Σ iff position j is labelled with ā ∈ Σ̄.
Given an instance pcp of MPCP dir , the required formula φ pcp is the conjunction of all the formulas above.
Lemma 6.11. Given an instance pcp of MPCP dir , φ pcp is satisfiable iff pcp has a solution.
Proof. Suppose u_{i_1} ⋯ u_{i_m} = v_{i_1} ⋯ v_{i_m} is a solution of pcp. It is routine to check that, with this solution, a model satisfying φ_pcp can be built. Now suppose that φ_pcp has a satisfying model σ. From condition (1), we get a sequence of the required shape. Let σ_Σ (resp. σ_Σ̄) be the model obtained from σ by restricting it to positions labelled with Σ (resp. Σ̄). Construct a directed graph whose vertices are the positions of σ_Σ and which has an edge from i to j iff i < j and σ_Σ(i)(x) = σ_Σ(j)(x) or σ_Σ(i)(y) = σ_Σ(j)(y). We claim that the set of edges of this directed graph represents the successor relation induced on σ_Σ by σ. To prove this claim, we first show that all positions have indegree 1 (except the first one, which has indegree 0) and that all positions have outdegree 1 (except the last one, which has outdegree 0). Indeed, condition (2a) ensures that for any position, both the indegree and the outdegree are at most 1. From conditions (2b), (2c) and (2d), each position (except the last one) has outdegree at least 1. Hence, all positions except the last one have outdegree exactly 1. By the definition of the set of edges, the last position has outdegree 0. If more than one position had indegree 0, this would force some other position to have indegree more than 1, which is not possible. By the definition of the set of edges, the first position has indegree 0 and hence, all other positions have indegree exactly 1. To finish proving the claim (that the set of edges of our directed graph represents the successor relation induced on σ_Σ by σ), we now prove that at any position of σ_Σ except the last one, the outgoing edge goes to the successor position in σ_Σ. If this is not the case, let i be the last position where this condition is violated. The outgoing edge from i then goes to some position j > i + 1. Since the outgoing edges from each position between i + 1 and j − 1 go to the respective successors, position j has indegree 2, which is not possible.
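The degree argument in the claim can be checked mechanically on candidate data values (a sketch; positions are 0-based here, and agreement on x or y defines the edges exactly as in the proof):

```python
def edges_are_successor(pairs):
    """pairs: list of (x, y) data values for the positions of sigma_Sigma.
    Build the directed graph with an edge i -> j (i < j) whenever the two
    positions agree on x or on y, and check that the edge set is exactly
    the successor relation {(i, i + 1)}."""
    n = len(pairs)
    edges = {(i, j)
             for i in range(n) for j in range(i + 1, n)
             if pairs[i][0] == pairs[j][0] or pairs[i][1] == pairs[j][1]}
    return edges == {(i, i + 1) for i in range(n - 1)}
```

For instance, the values [(1, 5), (1, 6), (2, 6), (2, 7)] pass the check, while three positions sharing one x-value fail it, mirroring the violation of condition (2a).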
Next, we prove that there cannot be two positions of σ Σ with the same value for variable x and the same value for variable y. Suppose there were two such positions, and suppose the first one is labeled O (the argument for E is similar). If the second position is also labeled O, then by condition (2c) there is at least one position labeled E with the same value for variable x, so there are three positions with the same value for variable x, violating condition (2a). Hence, the second position must be labeled E. Then, by condition (2d), there is a position after the second position with the same value for variable y. This implies there are three positions with the same value for variable y, again violating condition (2a). Therefore, there cannot be two positions of σ Σ with the same values for both x and y.
Finally, we prove that for every position i of σ Σ̄ , if ā i is its label, then the unique position in σ Σ with the same value of x and y is position i of σ Σ , and it carries the label a i . For 1 ≤ i ≤ |u i 1 |, this follows from condition (3). For i = |u i 1 · · · u im |, this follows from condition (5). The rest of the proof is by induction on i. The base case is already proved, since |u i 1 | ≥ 1. For the induction step, assume the result holds for all positions of σ Σ̄ up to position i. Suppose position i of σ Σ̄ is labeled by Ō (the case of Ē is symmetric). Then, by the induction hypothesis and (2b), position i of σ Σ is labeled by O (or O in , if i = 1). By condition (2c) and the definition of edges in the directed graph that we built, position i + 1 of σ Σ̄ (resp. σ Σ ) has the same value of x as that of position i. We know from the previous paragraph that there is exactly one position of σ Σ with the same value of x and y as that of position i + 1 in σ Σ̄ . This position in σ Σ cannot be i or earlier, by the induction hypothesis. If it is not i + 1 either, then there would be three positions of σ Σ with the same value of x, which violates condition (2a). Hence, the position of σ Σ with the same value of x and y as that of position i + 1 in σ Σ̄ is indeed i + 1, and it carries the label a i+1 by condition (4).
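For concreteness, the notion of solution used in the lemma can be sketched as a small check; the tiles and indices below are illustrative, and conditions specific to MPCP dir (such as the fixed first tile) are not modeled.

```python
# Hedged sketch: a PCP solution i1..im makes the two concatenations coincide.
# Tiles map an index to a pair (u_i, v_i); directedness and the fixed first
# tile of MPCP_dir are deliberately not checked here.
def is_solution(tiles, idxs):
    u = "".join(tiles[i][0] for i in idxs)
    v = "".join(tiles[i][1] for i in idxs)
    return len(idxs) > 0 and u == v

tiles = {1: ("a", "ab"), 2: ("b", "")}
assert is_solution(tiles, [1, 2])       # "a"+"b" == "ab"+""
assert not is_solution(tiles, [1])      # "a" != "ab"
```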
As a conclusion, the satisfiability problem for LRV vec (X, U) is undecidable. The reduction from SAT(LRV) to SAT(LRV ⊤ ) can be easily adapted to yield a reduction from SAT(LRV vec ) to SAT(LRV vec ⊤ ), whence SAT(LRV vec ⊤ ) is undecidable too.

Implications for Logics on Data Words
A data word is an element of (Σ × D) * , where Σ is a finite alphabet and D is an infinite domain. We focus here on first-order logic with two variables, and on a temporal logic.

7.1. Two-variable Logics. We study a fragment of EMSO 2 (+1, <, ∼) on data words and show that its satisfiability problem is in 3expspace, as a consequence of our results on the satisfiability of LRV. The satisfiability problem for EMSO 2 (+1, <, ∼) is known to be decidable and equivalent to the reachability problem for VASS [BDM + 11], with no known primitive-recursive algorithm. Here we exhibit a large fragment with elementary complexity.
Consider the fragment of EMSO 2 (+1, <, ∼) -that is, first-order logic with two variables, with a prefix of existential quantification over monadic relations-where all formulas are of the form ∃X 1 , . . . , X n ϕ, with ϕ generated by a grammar whose atoms range over a ∈ Σ, i ∈ [1, n], and ζ, ζ' ∈ {x, y}. The relation x < y tests that the position y appears after x in the word; +1(x, y) tests that y is the next position after x; and x ∼ y tests that positions x and y have the same data value. We call this fragment forward-EMSO 2 (+1, <, ∼). In fact, forward-EMSO 2 (+1, <, ∼) captures EMSO 2 (+1, <) (i.e., all regular languages on the finite labeling of the data word). 1 However, it seems to be strictly less expressive than EMSO 2 (+1, <, ∼), since forward-EMSO 2 (+1, <, ∼) does not appear to be able to express the property 'there are exactly two occurrences of every data value', which can be easily expressed in EMSO 2 (+1, <, ∼). Yet, it can express 'there are at most two occurrences of every data value' (with ∃X ∀x∀y. (x ∼ y ∧ x < y) → (X(x) ∧ ¬X(y))), and 'there is exactly one occurrence of every data value'. For the same reason, it does not appear to capture EMSO 2 (<, ∼) either.
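The at-most-two property and its monadic witness X can be sanity-checked with a small script; choosing X as the set of first occurrences of each value is our own observation, not something taken from the text.

```python
# Hedged sketch of EX forall x forall y ((x~y and x<y) -> (X(x) and not X(y))):
# a witness X, when one exists, can always be taken to mark the first
# occurrence of each data value.
def sat_with_first_occurrences(data):
    first = {d: data.index(d) for d in set(data)}
    X = set(first.values())
    return all(not (data[i] == data[j]) or (i in X and j not in X)
               for i in range(len(data)) for j in range(i + 1, len(data)))

def at_most_two(data):
    # the property the formula is meant to express
    return all(data.count(d) <= 2 for d in set(data))
```

On a value occurring three times at p1 < p2 < p3, the pairs (p1, p2) and (p2, p3) force both ¬X(p2) and X(p2), so no witness exists; with at most two occurrences the first-occurrence set works.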
Theorem 7.1. The satisfiability problem for forward-EMSO 2 (+1, <, ∼) is in 3expspace.

Proof. Through a standard translation, we can bring any formula of forward-EMSO 2 (+1, <, ∼) into a formula of the form ϕ = ∃X 1 , . . . , X n (∀x∀y χ) ∧ ⋀ k ∀x∃y (x ≤ y ∧ ψ k ) that preserves satisfiability, where χ and all the ψ k 's are quantifier-free formulas, and there are no tests for labels. Furthermore, this is a polynomial-time translation: it is just the Scott normal form of EMSO 2 (+1, <, ∼) [BDM + 11] adapted to forward-EMSO 2 (+1, <, ∼), and can be obtained in the same way.
The translation makes use of: a distinguished variable x that encodes the data values of any data word satisfying ϕ; variables x 0 , . . . , x n that are used to encode the monadic relations X 1 , . . . , X n ; and a variable x prev whose purpose will be explained later on. We now give the translation. To translate ∀x∀y χ, we first bring the formula into the form ⋀ m∈M ¬∃x∃y (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m ), (7.1) where M ⊆ N, every χ x m (resp. χ y m ) is a conjunction of (negations of) atoms of monadic relations on x (resp. y), and χ m = µ ∧ ν with µ ∈ {x=y, +1(x, y), ¬(+1(x, y) ∨ x=y)} and ν ∈ {x ∼ y, ¬(x ∼ y)}.
Claim 7.2. ∀x∀y χ can be translated into an equivalent formula of the form (7.1) in exponential time.
Proof. We can bring the formula into this normal form in exponential time. To this end, we first bring χ into CNF: χ = ⋀ i∈I ⋁ j∈J i ν ij , where every ν ij is an atom or a negation of an atom. Let ν x↔y ij denote ν ij with x and y swapped. Note that ∃x∃y ⋀ j∈J i ¬ν ij is equivalent to ∃x∃y ⋀ j∈J i ¬ν x↔y ij . Now, for every i ∈ I, one can define formulas µ i,1 and µ i,2 (using ν ij and ν x↔y ij ) such that ∃x∃y µ i,1 ∨ ∃x∃y µ i,2 is equivalent to ∃x∃y ⋀ j∈J i ¬ν ij . Hence, ⋀ i∈I ¬ ⋁ j∈{1,2} ∃x∃y (x ≤ y ∧ µ i,j ) ≡ ⋀ i∈I ⋀ j∈{1,2} ¬∃x∃y (x ≤ y ∧ µ i,j ) is equivalent to ∀x∀y χ. Finally, every µ i,j can easily be split into a conjunction of three formulas (one of binary relations, one of unary relations on x, and one of unary relations on y), thus obtaining a formula of the form (7.1). This procedure takes polynomial time once the CNF is computed, and hence exponential time in the worst case.
We define tr(χ x m ) as the conjunction of all the formulas x 0 ≈ x i such that X i (x) is a conjunct of χ x m , and all the formulas ¬(x 0 ≈ x i ) such that ¬X i (x) is a conjunct of χ x m ; we define tr(χ y m ) similarly. If µ = +1(x, y) and ν = x ∼ y, we set tr(∃y (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m )) = x ≈ Xx ∧ tr(χ x m ) ∧ X tr(χ y m ). If µ = (x=y) and ν = x ∼ y, we set tr(∃y (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m )) = tr(χ x m ) ∧ tr(χ y m ). We proceed similarly for µ = +1(x, y), ν = ¬(x ∼ y); and the translation is of course ⊥ (false) if µ = (x=y), ν = ¬(x∼y). The difficult cases are the remaining ones. Suppose µ = ¬(+1(x, y) ∨ x=y) and ν = x∼y. In other words, x is at least two positions before y, and they have the same data value. Observe that the formula tr(χ x m ) ∧ x ≈ tr(χ y m )? x does not encode precisely this case, as it corresponds to the weaker condition x < y ∧ x ∼ y. In order to properly translate this case we make use of the variable x prev , ensuring that it always carries the data value of the variable x at the previous position: prev = G(X⊤ ⇒ x ≈ X x prev ).
We then define tr(∃y (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m )) as tr(χ x m ) ∧ x ≈ (x prev ≈ tr(χ y m )? x)? x prev . Note that by nesting the future obligation twice we ensure that the target position where tr(χ y m ) must hold is at a distance of at least two positions. For ν = ¬(x∼y) we produce a similar formula, replacing the innermost occurrence of ≈ with ≉ in the formula above. We then define tr(∀x∀y χ) as prev ∧ ⋀ m∈M ¬F tr(∃y (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m )). To translate ∀x∃y (x ≤ y ∧ ψ k ) we proceed in a similar way. As before, we bring x ≤ y ∧ ψ k into the form ⋁ m∈M (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m ), in exponential time. We then define tr(∀x∃y (x ≤ y ∧ ψ k )) as prev ∧ G ⋁ m∈M tr(∃y (x ≤ y ∧ χ m ∧ χ x m ∧ χ y m )).
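The effect of nesting the future obligation can be checked on toy models. The evaluator below is our own sketch of the x ≈ φ? y semantics, with x prev precomputed to hold the previous value of x; all names are illustrative.

```python
# Hedged sketch of LRV future obligations: "v1 ≈ phi? v2" holds at i when
# sigma(i)(v1) equals sigma(j)(v2) at some j > i satisfying phi. The nested
# formula x ≈ (xprev ≈ phi? x)? xprev should only match at distance >= 2.
def obligation(sigma, i, v1, phi, v2):
    return any(sigma[i][v1] == sigma[j][v2] and phi(sigma, j)
               for j in range(i + 1, len(sigma)))

def nested(sigma, i, phi):
    inner = lambda s, j: obligation(s, j, "xprev", phi, "x")
    return obligation(sigma, i, "x", inner, "xprev")

def with_prev(xs):  # build a model where xprev(i) = x(i-1)
    return [{"x": x, "xprev": xs[i - 1] if i > 0 else None}
            for i, x in enumerate(xs)]

true = lambda s, j: True
# x repeats only at the adjacent position: the nested formula fails
assert not nested(with_prev([1, 1, 3]), 0, true)
# x repeats two positions later: the nested formula succeeds
assert nested(with_prev([1, 2, 1]), 0, true)
```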
One can show that the translation tr defined above preserves satisfiability. More precisely, it can be seen that: (1) any data word whose data values are the x-projection of a model satisfying tr(ϕ) satisfies ϕ; and, conversely, (2) for any data word satisfying ϕ with a given assignment of X 1 , . . . , X n to the word positions, and for any model σ such that
• σ has the same length as the data word,
• for every position i, σ(i)(x) is the data value of position i of the data word, and σ(i)(x 0 ) = σ(i)(x j ) iff X j holds at position i of the data word, and
• for every position i > 0, σ(i)(x prev ) = σ(i − 1)(x),
we have that σ |= tr(ϕ).
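The three conditions on σ can be sketched as a small encoding procedure; the variable names, the predicate representation, and the pool of fresh values are our illustrative assumptions.

```python
# Hedged sketch: encode a data word with monadic predicates X_1..X_n as a
# model sigma over variables x, x0, x1..xn, xprev, following the conditions
# in the text. Predicates are given as sets of positions.
def encode(data_values, predicates):
    fresh = iter(range(10**6, 10**7))  # values assumed distinct from the data
    sigma = []
    for i, d in enumerate(data_values):
        val = {"x": d}
        x0 = next(fresh)
        val["x0"] = x0
        for j, Xj in enumerate(predicates, start=1):
            # x0 == xj iff position i belongs to X_j
            val[f"x{j}"] = x0 if i in Xj else next(fresh)
        # xprev carries the data value of the previous position
        val["xprev"] = data_values[i - 1] if i > 0 else next(fresh)
        sigma.append(val)
    return sigma
```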
By Corollary 4.13 we can decide the satisfiability of the translation in 2expspace, and since the translation is exponential, this gives us a 3expspace upper bound for the satisfiability of forward-EMSO 2 (+1, <, ∼).
Remark 7.3. The proof above can also be extended to a similar fragment of EMSO 2 (+1, . . . , +k, <, ∼), that is, EMSO 2 (+1, <, ∼) extended with all binary relations of the kind +i(x, y) for every i ≤ k, with the semantics that y is i positions after x. Hence, we also obtain that the satisfiability problem for this logic is decidable in 3expspace. We do not know whether the upper bounds we give for forward-EMSO 2 (+1, <, ∼) and forward-EMSO 2 (+1, . . . , +k, <, ∼) can be improved.
Our result stating that PLRV 1 is equivalent to reachability in VASS (Corollary 6.8), can also be seen as a hardness result for FO 2 (<, ∼, {+k} k∈N ), that is, first-order logic with two variables on data words, extended with all binary relations +k(x, y) denoting that two elements are at distance k. It is easy to see that this logic captures PLRV 1 and hence that it is equivalent to reachability in VASS, even in the absence of an alphabet.
Corollary 7.4. The satisfiability problem for FO 2 (<, ∼, {+k} k∈N ) is as hard as the reachability problem in VASS, even when restricted to having an alphabet Σ = ∅.
7.2. Temporal Logics. Consider a temporal logic with (strict) future operators F = and F ≠ , so that F = ϕ (resp. F ≠ ϕ) holds at some position i of the finite data word if there is some future position j > i where ϕ is true and the data values of positions i and j are equal (resp. distinct). The logic also has 'next' operators X k = and X k ≠ for every k ∈ N, where X k = ϕ (resp. X k ≠ ϕ) holds at position i if ϕ holds at position i + k and the data values of positions i + k and i are equal (resp. distinct). Finally, the logic features a standard until operator U, tests for the labels of positions, and is closed under Boolean operators. We call this logic LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ). There is an efficient satisfiability-preserving translation from LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ) into LRV and back, and hence we have the following.
Proposition 7.7. The satisfiability problem for LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ) is 2expspace-complete.

Proof sketch. For the 2expspace membership, there is a straightforward polynomial-time translation from LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ) into LRV that preserves satisfiability: F = ϕ is translated as x ≈ ϕ′? x and F ≠ ϕ as x ≉ ϕ′? x; X k = ϕ as X k ϕ′ ∧ x ≈ X k x and X k ≠ ϕ as X k ϕ′ ∧ x ≉ X k x; and any test for a label a i as x 0 ≈ x i (where ϕ′ denotes the translation of ϕ).
On the other hand, given a formula φ of LRV using k variables, consider the alphabet {a 1 , . . . , a k } and the formula ψ k that forces the model to be a succession of 'blocks' of length k with labels a 1 · · · a k (hence forcing its length to be a multiple of k). We show how to define a formula tr(φ) of LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ) so that tr(φ) ∧ ψ k is satisfiable if and only if φ is satisfiable. We define tr(Xφ′) as X k tr(φ′), and tr(Fφ′) as F(a 1 ∧ tr(φ′)). For future obligation formulas, we define tr(x i ≈ ⊤? x j ) as X i−1 F = a j if j ≤ i, and as X i−1 (X j−i = F = a j ∨ (F = a j ∧ X j−i ≠ )) otherwise -note that in the second case special care must be taken to ensure that the data value is repeated at a strictly future block. Finally, for the local obligation formulas, we define tr(x i ≈ X t x j ) as X i−1 X t·k−(i−1)+(j−1) = . Note that this can easily be extended to treat inequalities. Thus, there is a polynomial-time reduction from the satisfiability problem for LRV to that of LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ).

In fact, LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ) corresponds to a fragment of the linear-time temporal logic LTL extended with one register for storing and comparing data values. We denote this logic by LTL ↓ 1 ; it was studied in [DL09]. The freeze operator ↓ ϕ stores the current datum in the register and continues the evaluation of the formula ϕ, while the operator ↑ tests whether the current data value is equal to the one stored in the register. When the temporal operators are limited to F, U and X, this logic is decidable with non-primitive-recursive complexity [DL09].
Indeed, LTL(U, F = , F ≠ , {X k = , X k ≠ } k∈N ) is the fragment of LTL ↓ 1 where ↓ and ↑ are only allowed to appear in the forms ↓ F(↑ ∧ ϕ) and ↓ X k (↑ ∧ ϕ) -or with ¬↑ in place of ↑. Remarkably, this restriction allows us to jump from a non-primitive-recursive satisfiability problem down to an elementary 2expspace one.
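To make the preceding definitions concrete, here is a small sketch of the semantics of F = and X k = over a data word, together with a cross-check of the index arithmetic used in the translation of local obligations; all function names are ours and the word below is illustrative.

```python
# Hedged sketch: a data word is a list of (label, datum) pairs.
def F_eq(word, i, phi):
    # F_= phi at i: phi holds at some j > i carrying the same datum as i
    return any(word[j][1] == word[i][1] and phi(word, j)
               for j in range(i + 1, len(word)))

def X_eq(word, i, k, phi):
    # X^k_= phi at i: phi holds at i + k and the data at i and i + k coincide
    return i + k < len(word) and word[i + k][1] == word[i][1] and phi(word, i + k)

w = [("a", 1), ("b", 2), ("a", 1)]
is_a = lambda word, j: word[j][0] == "a"
assert F_eq(w, 0, is_a)        # position 2 is labeled a and repeats datum 1
assert not F_eq(w, 1, is_a)    # datum 2 never reappears
assert X_eq(w, 0, 2, is_a)

def offset(i, j, t, k):
    # distance from cell i of a block (cells 1..k) to cell j of the block
    # t blocks ahead, as used in tr(x_i ≈ X^t x_j)
    return t * k - (i - 1) + (j - 1)

k = 4
for b in range(3):                 # block number
    for i in range(1, k + 1):      # source cell
        for j in range(1, k + 1):  # target cell
            for t in range(1, 3):  # blocks ahead
                assert b * k + (i - 1) + offset(i, j, t, k) == (b + t) * k + (j - 1)
```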

Conclusion
We introduced the logic LRV, which significantly extends the languages from [DDG12], for instance by allowing future obligations of the general form x ≈ φ? y. We have shown that SAT(LRV) can be reduced to the control-state reachability problem in VASS, obtaining a 2expspace upper bound as a consequence. Since LRV can also be viewed as a fragment of a logic introduced in [KSZ10] whose satisfiability problem is equivalent to Reach(VASS), we thereby exhibit an interesting fragment with elementary complexity. The reduction to the control-state reachability problem involves an exponential blow-up, which is unavoidable, as demonstrated by our 2expspace lower bound. To prove this lower bound, we introduced the class of chain systems of level k and proved the (k + 1)expspace-completeness of their control-state reachability problem by extending the proofs from [Lip76,Esp98]. This class of systems is interesting in its own right and could be used to establish other hardness results. We have also shown that the proof technique we used to reduce LRV to the control-state reachability problem does not work in the presence of past obligations. Indeed, the satisfiability problem for PLRV (LRV with past obligations) is as hard as Reach(VASS), which witnesses that past obligations come at a computational cost. Furthermore, note that none of our lower bound proofs involves the past-time operators X −1 and S. Finally, beyond characterising the complexity of several data logics, some of them defined with first-order features (see Section 7), we have provided a new correspondence between data logics and decision problems for VASS. A summary of the results can be found in Figure 5.