Quadratic Word Equations with Length Constraints, Counter Systems, and Presburger Arithmetic with Divisibility

Word equations are a crucial element in the theoretical foundation of constraint solving over strings, which has received a lot of attention in recent years. A word equation relates two words over string variables and constants. Its solution amounts to a function mapping variables to constant strings that equates the left- and right-hand sides of the equation. While the problem of solving word equations is decidable, the decidability of the problem of solving a word equation with a length constraint (i.e., a constraint relating the lengths of words in the word equation) has remained a long-standing open problem. In this paper, we focus on the subclass of quadratic word equations, i.e., those in which each variable occurs at most twice. We first show that the length abstractions of solutions to quadratic word equations are in general not Presburger-definable. We then describe a class of counter systems with Presburger transition relations which capture the length abstraction of a quadratic word equation with regular constraints. We provide an encoding of the effect of a simple loop of the counter systems in the theory of existential Presburger Arithmetic with divisibility (PAD). Since PAD is decidable, we get a decision procedure for quadratic word equations with length constraints for which the associated counter system is flat (i.e., all nodes belong to at most one cycle). We show a decidability result (in fact, also an NP algorithm with a PAD oracle) for a recently proposed NP-complete fragment of word equations called regular-oriented word equations, together with length constraints. Decidability holds when the constraints are additionally extended with regular constraints with a 1-weak control structure.


Introduction
Reasoning about strings is a fundamental problem in computer science and mathematics. The full first order theory over strings and concatenation is undecidable. A seminal result by Makanin [26] (see also [12,18]) shows that the satisfiability problem for the existential fragment is decidable, by showing an algorithm to check satisfiability of word equations. Precisely, a word equation L = R consists of two words L and R over an alphabet of constants and variables. Such an equation is satisfiable if there is a mapping σ from the variables to strings over the constants such that σ(L) and σ(R) are syntactically identical.
An original motivation for studying word equations was to show undecidability of Hilbert's 10th problem (see, e.g., [28]). While Makanin's later result shows that word equations could not, by themselves, show undecidability, Matiyasevich in 1968 considered an extension of word equations with length constraints as a possible route to showing undecidability of Hilbert's 10th problem [28]. A length constraint constrains the solution of a word equation by requiring a linear relationship to hold on the lengths of words in a solution σ. For example, a length constraint might require that a solution maps variable x and variable y to words of the same length. The decidability of word equations with length constraints remains open.
In recent years, reasoning about strings with length constraints has found renewed interest through applications in program verification and reasoning about security vulnerabilities. The focus of most research has been on developing practical string solvers [1,3,6,8,14,15,17,19,24,32,33,34,35,36]. These solvers are sound but make no claims of completeness. Relatively few results are known about the decidability status of strings with length and other constraints (see [10] for an overview of the results in this area). The main idea in most existing decidability results is the encoding of length constraints into Presburger arithmetic [10,16]. However, the length abstraction of a word equation, that is, the set of possible lengths of variables in its solutions, need not be Presburger-definable. (Indeed, this was Matiyasevich's motivation in studying this problem as a way to prove undecidability of Hilbert's 10th problem.) In this paper, we consider the case of quadratic word equations, in which each variable can appear at most twice [13,22], together with length and regularity constraints. For quadratic word equations, there is a simpler decision procedure (called the Nielsen transform or Levi's method) based on a non-deterministic proof tree construction. The technique can be extended to handle regular constraints [13]. However, we show that already for this class (even for a simple equation like xaby = yabx, where x, y are variables and a, b are constants), the length abstraction need not be Presburger-definable. Thus, techniques based on Presburger encodings are not sufficient to prove decidability.
Our first observation in this paper is a connection between the problem of quadratic word equations with length constraints and a class of counter systems with Presburger transitions. Informally, the counter system has control states corresponding to the nodes of the proof tree constructed by Levi's method, and a counter standing for the length of each word variable. Each step of Levi's method may decrease at most one counter. Thus, from any initial state, the counter system terminates. We show that the set of initial counter values which can lead to a successful leaf (i.e., one containing the trivial equation ε = ε) is precisely the length abstraction of the word equation.
Our second observation is that the reachability relation for a simple loop of the counter system can be encoded in the existential theory of Presburger arithmetic with divisibility PAD. The encoding is non-trivial in the presence of regular constraints, and depends on structural results on semilinear sets. As PAD is decidable [21,25], we obtain a technique to symbolically represent the reachability relation for flat counter systems, in which each node belongs to at most one loop.
Moreover, the same encoding shows decidability for word equations with length constraints, provided the proof tree is associated with flat counter systems. In particular, we show that the class of regular-oriented word equations, introduced by [11], has flat proof trees. Thus, the satisfiability problem for quadratic regular-oriented word equations with length constraints is decidable (and in NEXP). While our decidability result is for a simple subclass, this class is already non-trivial without length and regular constraints: satisfiability of regular-oriented word equations is NP-complete [11]. Our result generalizes previous decidability results [10]. Moreover, we believe that the techniques introduced in this paper, such as the connection between acceleration and word equations, and the use of existential Presburger with divisibility, can open the way to more sophisticated decision procedures or tools based on acceleration designed for counter systems.

Preliminaries
General notation: Let $\mathbb{N} = \mathbb{Z}_{\geq 0}$ be the set of all natural numbers. For integers $i \leq j$, we write $[i, j]$ for the set $\{i, \ldots, j\}$, and $[i]$ for $[0, i]$. If $S$ is a set, we use $S^*$ to denote the set of all finite sequences $\gamma = s_1 \ldots s_n$ over $S$. The length $|\gamma|$ of $\gamma$ is $n$. The empty sequence is denoted by $\epsilon$. Notice that $S^*$ forms a monoid with the concatenation operator $\cdot$. If $\gamma'$ is a prefix of $\gamma$, we write $\gamma' \preceq \gamma$. Additionally, if $\gamma' \neq \gamma$ (i.e. $\gamma'$ is a strict prefix of $\gamma$), we write $\gamma' \prec \gamma$. Note that the operator $\preceq$ is overloaded here, but the meaning should be clear from the context.

Words and Automata:
We assume basic familiarity with word combinatorics and automata theory. Fix a (finite) alphabet A. For each finite word w := w 1 . . . w n ∈ A * , we write w[i, j], where 1 ≤ i ≤ j ≤ n, to denote the segment w i . . . w j . We write ǫ for the empty word.
Two words $x$ and $y$ are conjugates if there exist words $u$ and $v$ such that $x = uv$ and $y = vu$. Equivalently, $x = \mathrm{cyc}^k(y)$ for some $k$, where the cyclic permutation operation $\mathrm{cyc} : A^* \to A^*$ is defined as $\mathrm{cyc}(\epsilon) = \epsilon$ and $\mathrm{cyc}(a \cdot w) = w \cdot a$ for $a \in A$ and $w \in A^*$.
Given a nondeterministic finite automaton (NFA) $\mathcal{A} := (A, Q, \Delta, q_0, q_F)$, a run of $\mathcal{A}$ on a word $w$ of length $n$ is a function $\rho : \{0, \ldots, n\} \to Q$ with $\rho(0) = q_0$ that obeys the transition relation $\Delta$. We may also denote the run $\rho$ by the word $\rho(0) \cdots \rho(n)$ over the alphabet $Q$. The run $\rho$ is said to be accepting if $\rho(n) = q_F$, in which case we say that the word $w$ is accepted by $\mathcal{A}$. The language $L(\mathcal{A})$ of $\mathcal{A}$ is the set of words in $A^*$ accepted by $\mathcal{A}$. In the sequel, for $p, q \in Q$ we will write $\mathcal{A}_{p,q}$ to denote the NFA $\mathcal{A}$ with the initial state replaced by $p$ and the final state replaced by $q$.
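The two characterisations of conjugacy above can be compared on small words. The following is a minimal sketch, assuming words are Python strings over single-character letters; the function names `cyc` and `are_conjugates` are illustrative and not part of the paper.

```python
def cyc(w: str) -> str:
    """Cyclic permutation: cyc(eps) = eps and cyc(a.w) = w.a."""
    return w[1:] + w[:1]


def are_conjugates(x: str, y: str) -> bool:
    """Return True iff x = cyc^k(y) for some k >= 0."""
    if len(x) != len(y):
        return False
    w = y
    # len(y) + 1 rotations cover every k, including the empty word case.
    for _ in range(len(y) + 1):
        if w == x:
            return True
        w = cyc(w)
    return False
```

For instance, `are_conjugates("abc", "cab")` holds with u = "ab" and v = "c", since "abc" = uv and "cab" = vu.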
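Word equations: Let $A$ be a (finite) alphabet of constants and $V$ a set of variables. A word equation is an expression $L = R$ with $L, R \in (A \cup V)^*$, and a system of word equations is a finite set $\{L_1 = R_1, \ldots, L_k = R_k\}$ of word equations. A system is called quadratic if each variable occurs at most twice in all. A solution to a system of word equations is a homomorphism $\sigma : (A \cup V)^* \to A^*$ which maps each $a \in A$ to itself and equates the l.h.s. and r.h.s. of each equation, i.e., $\sigma(L_i) = \sigma(R_i)$ for each $i = 1, \ldots, k$.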
For each variable x ∈ V , we shall use |x| to denote a formal variable that stands for the length of variable x. Let L V be the set {|x| | x ∈ V }. A length constraint is a formula in Presburger arithmetic whose free variables are in L V .
A solution to a system of word equations with a length constraint Φ(l x1 , . . . , l xn ) is a homomorphism σ : (A ∪ V ) * → A * which maps each a ∈ A to itself such that σ(L i ) = σ(R i ) for each i = 1, . . . , k and moreover Φ(|σ(x 1 )|, . . . , |σ(x n )|) holds. That is, the homomorphism maps each variable to a word in A * such that each word equation is satisfied, and the lengths of these words satisfy the length constraint.
The satisfiability problem for word equations with length constraints asks, given a system of word equations and a length constraint, whether it has a solution.
We also consider the extension of the problem with regular constraints. For a system of word equations, a variable x ∈ V , and a regular language L ⊆ A * , a regular constraint x ∈ L imposes the additional restriction that any solution σ must satisfy σ(x) ∈ L. Given a system of word equations, a length constraint, and a set of regular constraints, the satisfiability problem asks if there is a solution satisfying the word equation, the length constraints, as well as the regular constraints.
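The definitions of a solution with length and regular constraints can be made concrete with a small checker. This is only a sketch under stated assumptions: each symbol is a single character, the variables are exactly the keys of `sigma`, the length constraint is passed as a Python predicate over the dictionary of lengths (standing in for a Presburger formula), and regular constraints are given as `re` patterns rather than NFAs. The names `apply` and `is_solution` are hypothetical.

```python
import re


def apply(sigma, word):
    """Extend sigma homomorphically: constants are mapped to themselves."""
    return "".join(sigma.get(symbol, symbol) for symbol in word)


def is_solution(sigma, equations, length_constraint=None, regular_constraints=None):
    """Check word equations, then the length constraint, then regular constraints."""
    if any(apply(sigma, lhs) != apply(sigma, rhs) for lhs, rhs in equations):
        return False
    if length_constraint is not None:
        lengths = {x: len(w) for x, w in sigma.items()}
        if not length_constraint(lengths):
            return False
    for x, pattern in (regular_constraints or {}).items():
        if re.fullmatch(pattern, sigma[x]) is None:
            return False
    return True
```

For the equation xaby = yabx, the assignment x = a, y = aaba is accepted together with the length constraint |y| = |x| + 3 and the regular constraint y ∈ (aab)*a.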
In the sequel, for clarity of exposition, we restrict our discussion to a system consisting of a single word equation (w.l.o.g.).
Linear arithmetic with divisibility: Let P be a first-order language with equality, with binary relation symbol ≤, and with terms being linear polynomials with integer coefficients. We write f(x), g(x), etc., for terms in integer variables x = x_1, ..., x_n. Atomic formulas in Presburger arithmetic have the form f(x) ≤ g(x) or f(x) = g(x). The language PAD of Presburger arithmetic with divisibility extends the language P with a binary relation | (for divides). An atomic formula of PAD has the form f(x) ≤ g(x), f(x) = g(x), or f(x) | g(x), where f(x) and g(x) are linear polynomials with integer coefficients. The full first-order theory of PAD is undecidable, but the existential fragment is decidable [21,25].
Note that the divisibility predicate x|y is not expressible in Presburger arithmetic: a simple way to see this is that {(x, y) ∈ N 2 | x|y} is not a semi-linear set.
Counter systems: In this paper, we specifically use the term "counter systems" to mean counter systems with Presburger transition relations (e.g. see [4]). These more general transition relations can be simulated by standard Minsky's counter machines, but they are more useful for coming up with decidable subclasses of counter systems. A counter system $C$ is a tuple $(X, Q, \Delta)$, where $X = \{x_1, \ldots, x_m\}$ is a finite set of counters, $Q$ is a finite set of control states, and $\Delta$ is a finite set of transitions of the form $(q, \Phi(\bar{x}, \bar{x}'), q')$, where $q, q' \in Q$ and $\Phi$ is a Presburger formula with free variables $\bar{x} = (x_1, \ldots, x_m)$ and $\bar{x}' = (x_1', \ldots, x_m')$. A configuration of $C$ is a pair $(q, v) \in Q \times \mathbb{N}^m$.
The semantics of counter systems is given as a transition system. A transition system is a tuple $\mathfrak{S} := \langle S; \to \rangle$, where $S$ is a set of configurations and $\to\ \subseteq S \times S$ is a binary relation over $S$. A path in $\mathfrak{S}$ is a sequence $s_0 \to \cdots \to s_n$ of configurations $s_0, \ldots, s_n \in S$.
A counter system $C$ generates the transition system $\mathfrak{S}_C = \langle S; \to \rangle$, where $S$ is the set of all configurations of $C$, and $(q, v) \to (q', v')$ if there exists a transition $(q, \Phi(\bar{x}, \bar{x}'), q') \in \Delta$ such that $\Phi(v, v')$ is true. In the sequel, we will be needing the notion of flat counter systems [4,5,7,23]. Given a counter system $C = (X, Q, \Delta)$, the control structure of $C$ is an edge-labeled directed graph $G = (V, E)$ with the set $V = Q$ of nodes and the set $E = \Delta$. The counter system $C$ is flat if each node $v \in V$ is contained in at most one simple cycle.
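Flatness of a control structure can be tested directly on the graph. The sketch below relies on a standard graph-theoretic observation rather than on anything stated in the paper: every node lies on at most one simple cycle iff each strongly connected component that contains a cycle has exactly as many internal edges as nodes (i.e., it is itself a single simple cycle). The function name `is_flat` and the edge encoding `(source, label, target)` are assumptions made for illustration.

```python
def is_flat(nodes, edges):
    """edges: list of (source, label, target); parallel edges count separately."""
    succ = {v: set() for v in nodes}
    for p, _label, q in edges:
        succ[p].add(q)

    def reachable(start):
        # Nodes reachable from start by a nonempty path.
        seen, stack = set(), [start]
        while stack:
            v = stack.pop()
            for w in succ[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    reach = {v: reachable(v) for v in nodes}
    for v in nodes:
        if v not in reach[v]:
            continue  # v lies on no cycle at all
        component = {u for u in reach[v] if v in reach[u]}
        inner = sum(1 for p, _l, q in edges if p in component and q in component)
        if inner != len(component):
            return False  # the component contains more than one simple cycle
    return True
```

The quadratic-time reachability computation is chosen for brevity; a linear-time SCC algorithm would serve equally well on larger control structures.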

Solving Quadratic Word Equations
We start by recalling a simple textbook recipe (called Nielsen transformation, a.k.a. Levi's Method) [12,22] for solving quadratic word equations, both for the cases with and without regular constraints. We then discuss the length abstractions of solutions to quadratic word equations, and provide several natural examples that are not Presburger-definable.

Nielsen transformation
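We will define a rewriting relation $E \Rightarrow E'$ between quadratic word equations $E, E'$. Let $E$ be an equation of the form $\alpha w_1 = \beta w_2$ with $w_1, w_2 \in (A \cup V)^*$ and $\alpha, \beta \in A \cup V$. Then, there are several possible $E'$:
- Rules for erasing an empty prefix variable. These rules can be applied if $\alpha \in V$ (symmetrically, $\beta \in V$). In this case, we can nondeterministically guess that $\alpha$ is the empty word $\epsilon$. That is, $E'$ is $E[\epsilon/\alpha]$. The symmetric case of $\beta \in V$ is similar.
- Rules for removing a nonempty prefix. These rules are applicable if each of $\alpha$ and $\beta$ is either a constant or a variable that we nondeterministically guess to be a nonempty word. There are several cases:
  (P1) $\alpha \equiv \beta$ (syntactic equality). In this case, $E'$ is $w_1 = w_2$.
  (P2) $\alpha \in A$ and $\beta \in V$. In this case, $E'$ is $w_1[\alpha\beta/\beta] = \beta\, w_2[\alpha\beta/\beta]$.
  (P3) $\alpha \in V$ and $\beta \in A$. In this case, $E'$ is $\alpha\, w_1[\beta\alpha/\alpha] = w_2[\beta\alpha/\alpha]$.
  (P4) $\alpha, \beta \in V$. In this case, we nondeterministically guess if $\alpha \preceq \beta$ or $\beta \preceq \alpha$. In the former case, the equation $E'$ is $w_1[\alpha\beta/\beta] = \beta\, w_2[\alpha\beta/\beta]$. In the latter case, the equation $E'$ is $\alpha\, w_1[\beta\alpha/\alpha] = w_2[\beta\alpha/\alpha]$.
Note that the transformation keeps an equation quadratic.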
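The transformation yields a decision procedure: E is solvable iff the trivial equation ε = ε can be reached from E by ⇒, and this can be checked in PSPACE. See [12] for a proof. Roughly speaking, the proof uses the fact that each step either decreases the size of the equation, or the length of a length-minimal solution. It runs in PSPACE (in fact, linear space) because each rewriting does not increase the size of the equation.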
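The search for the trivial equation can be sketched as a breadth-first exploration of the rewriting graph. The code below is an illustrative sketch, not the paper's algorithm verbatim: it assumes single-character symbols, encodes an equation as a pair of strings, applies the erasing rule to the first symbol of either side, and uses the substitution rules in the form "replace β by αβ and cancel the common prefix α". It explores the graph deterministically instead of guessing, so it uses more than linear space. The names `successors` and `is_solvable` are hypothetical.

```python
from collections import deque


def successors(equation, variables):
    """One-step rewrites of a (quadratic) equation given as a pair of strings."""
    lhs, rhs = equation
    result = set()
    # Erase a prefix variable that is guessed to be empty.
    for side in (lhs, rhs):
        if side and side[0] in variables:
            x = side[0]
            result.add((lhs.replace(x, ""), rhs.replace(x, "")))
    # Remove a nonempty prefix.
    if lhs and rhs:
        alpha, beta = lhs[0], rhs[0]
        if alpha == beta:
            result.add((lhs[1:], rhs[1:]))
        else:
            if beta in variables:  # beta := alpha beta, then cancel alpha
                sub = alpha + beta
                result.add((lhs[1:].replace(beta, sub),
                            beta + rhs[1:].replace(beta, sub)))
            if alpha in variables:  # alpha := beta alpha, then cancel beta
                sub = beta + alpha
                result.add((alpha + lhs[1:].replace(alpha, sub),
                            rhs[1:].replace(alpha, sub)))
    return result


def is_solvable(lhs, rhs, variables):
    """Breadth-first search for the trivial equation eps = eps."""
    start = (lhs, rhs)
    seen, queue = {start}, deque([start])
    while queue:
        equation = queue.popleft()
        if equation == ("", ""):
            return True
        for nxt in successors(equation, variables):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Termination of the search relies on the input being quadratic: in that case no rewriting step increases the size of the equation, so only finitely many equations are reachable.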

Handling regular constraints
Nielsen transformation easily extends to quadratic word equations with regular constraints (e.g. see [13]). We assume that a regular constraint $x \in L$ is given as an NFA $\mathcal{A}_{p,q}$ representing $L$. [If $q_0$ and $q_F$ are the initial and final states (respectively) of an NFA $\mathcal{A}$, we can be more explicit and write $\mathcal{A}_{q_0,q_F}$ instead of $\mathcal{A}$.] Our rewriting relation $\Rightarrow$ now works over a pair consisting of an equation $E$ and a set $S$ of regular constraints over variables in $E$. Let $E$ be an equation of the form $\alpha w_1 = \beta w_2$ as before. We define $(E, S) \Rightarrow (E', S')$ by extending the definition of $\Rightarrow$ without regular constraints. In particular, it has to be the case that $E \Rightarrow E'$, and we additionally do the following:
- Rules for erasing an empty prefix variable $x$. When applied, ensure that each regular constraint $x \in L$ in $S$ satisfies $\epsilon \in L$. Define $S'$ as $S$ minus all regular constraints of the form $x \in L$.
- Rules for removing a nonempty prefix. For (P1), we set $S'$ to be $S$ minus all regular constraints on the removed prefix. For the remaining rules, we consider the case in which the variable $x = \beta$ is replaced by $\alpha\beta$; the other case is symmetric. For each regular constraint $\beta \in L(\mathcal{A}_{p,q})$, we nondeterministically guess a state $r$, and add $\alpha \in L(\mathcal{A}_{p,r})$ and $\beta \in L(\mathcal{A}_{r,q})$ to $S'$. In the case when $\alpha \in A$, we could immediately perform the check $\alpha \in L(\mathcal{A}_{p,r})$: a positive outcome implies removing this constraint from $S'$, while on a negative outcome our algorithm simply fails on this branch. For any variable $y$ that is distinct from $x$, we add all regular constraints $y \in L$ in $S$ to $S'$.
Note that this is still a PSPACE algorithm because it never creates a new NFA or adds new states to existing NFA in the regular constraints, but rather adds a regular constraint x ∈ L(A p,q ) to a variable x, where A is an NFA that is already in the regular constraint.

Generating all solutions using Nielsen transformation
One result that we will need in this paper is that Nielsen transformation is able to generate all solutions of quadratic word equations with regular constraints. To clarify this, we extend the definition of ⇒ so that each configuration E or (E, S) in the graph of ⇒ is also annotated by an assignment σ of the variables to concrete strings. We write E_1[σ_1] ⇒ E_2[σ_2] if E_1 ⇒ E_2 and σ_2 is the modification of σ_1 according to the operation used to obtain E_2 from E_1. For example, suppose that σ_1(x) = ab and σ_1(y) = abaaab and E_1 := xy = yx and E_2 := E_1[xy/y]. In this case, σ_2(x) = σ_1(x) = ab but σ_2(y) = aaab, since we have taken off the prefix σ_1(x) from σ_1(y). The definition for the case with regular constraints is identical.
The resulting statement, that Nielsen transformation generates all solutions (Proposition 3), immediately follows from the proof of correctness of Nielsen transformation for quadratic word equations [12].

Length abstractions and semilinearity
Given a quadratic word equation $E$ with constants $A$ and variables $V = \{x_1, \ldots, x_k\}$, its length abstraction is defined as $\mathrm{LEN}(E) := \{(|\sigma(x_1)|, \ldots, |\sigma(x_k)|) : \sigma \text{ is a solution to } E\}$, namely the set of tuples of numbers corresponding to lengths of solutions to $E$.
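Example 1. Consider the quadratic equation $E := xaby = yz$, where $V = \{x, y, z\}$ and $A$ contains at least two letters $a$ and $b$. We will show that its length abstraction $\mathrm{LEN}(E)$ can be captured by the Presburger formula $|z| = |x| + 2$. Observe that each $(n_x, n_y, n_z) \in \mathrm{LEN}(E)$ must satisfy $n_z = n_x + 2$ by a length argument on $E$, since $|x| + 2 + |y| = |y| + |z|$. Conversely, we will show that each triple $(n_x, n_y, n_z) \in \mathbb{N}^3$ satisfying $n_z = n_x + 2$ must be in $\mathrm{LEN}(E)$. To this end, we define a solution $\sigma$ to $E$ with these lengths. Let $\sigma(x) := a^{n_x}$ and $w := \sigma(x)ab$, and write $n_y = q(n_x + 2) + r$ with $0 \leq r < n_x + 2$. Splitting $w = vu$ with $|v| = r$, set $\sigma(y) := w^q v$ and $\sigma(z) := uv$. Then $\sigma(x)\,ab\,\sigma(y) = w^{q+1} v = \sigma(y)\sigma(z)$.
However, it turns out that Presburger Arithmetic is not sufficient for capturing length abstractions of quadratic word equations.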
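For the equation xaby = yz of Example 1, counting letters on both sides gives |z| = |x| + 2, and a solution can be built for any prescribed lengths of x and y. The sketch below shows one such construction (take x to be a block of a's, and let y be a prefix of a power of w = x·ab); the function name `example_solution` is hypothetical and the construction is only one of many possible witnesses.

```python
def example_solution(n_x: int, n_y: int):
    """Return (x, y, z) with |x| = n_x, |y| = n_y and x + 'ab' + y == y + z."""
    x = "a" * n_x
    w = x + "ab"                 # |w| = n_x + 2
    q, r = divmod(n_y, len(w))   # n_y = q * |w| + r with 0 <= r < |w|
    v, u = w[:r], w[r:]          # w = v u with |v| = r
    y = w * q + v                # y = w^q v
    z = u + v                    # |z| = |w| = n_x + 2
    return x, y, z
```

Both sides of the equation evaluate to w^(q+1) v, which is why the construction works for every pair of lengths.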
There is a quadratic word equation whose length abstraction is not Presburger-definable.
To this end, we show that the length abstraction of xaby = yabx, where a, b ∈ A and x, y ∈ V , is not Presburger definable.
Lemma 1. The length abstraction $\mathrm{LEN}(xaby = yabx)$ coincides with tuples $(|x|, |y|)$ of numbers satisfying the expression $\varphi(|x|, |y|)$ defined as:
$|x| = |y| \;\vee\; (|x| = 0 \wedge |y| \equiv 0 \pmod 2) \;\vee\; (|y| = 0 \wedge |x| \equiv 0 \pmod 2) \;\vee\; (|x|, |y| > 0 \wedge \gcd(|x| + 2, |y| + 2) > 1)$
Observe that this would imply non-Presburger-definability: for otherwise, since the first three disjuncts are Presburger-definable, the last disjunct would also be Presburger-definable, which is not the case since the property that two numbers are relatively prime is not Presburger-definable. Note, however, that the expression is definable in existential Presburger arithmetic with divisibility.
Let us prove this lemma. Let $S = \mathrm{LEN}(xaby = yabx)$. We first show that given any numbers $n_x, n_y$ satisfying $\varphi(n_x, n_y)$, there are solutions $\sigma$ to $xaby = yabx$ with $|\sigma(\alpha)| = n_\alpha$ for each $\alpha \in \{x, y\}$. If they satisfy the first disjunct in $\varphi$ (i.e. $n_x = n_y$), then set $\sigma(x) = \sigma(y)$ to an arbitrary word $w \in A^{n_x}$. If they satisfy the second disjunct, then the equation becomes $aby = yab$, and so we set $\sigma(x) = \epsilon$ and $\sigma(y) \in (ab)^*$. The same goes for the third disjunct. For the fourth disjunct (assuming the first three disjuncts are false), let $d = \gcd(n_x + 2, n_y + 2)$. Define $\sigma(x), \sigma(y) \in (a^{d-1}b)^*(a^{d-2})$ so that $|\sigma(\alpha)| = n_\alpha$ for $\alpha \in V$. It follows that $\sigma(x)\,ab\,\sigma(y) = \sigma(y)\,ab\,\sigma(x)$.
We now prove the converse. So, we are given a solution σ to xaby = yabx and let u := σ(x), v := σ(y). Assume to the contrary that ϕ(|u|, |v|) is false and that u and v are the shortest such solutions. We have several cases to consider: u = v. Then, |u| = |v|, contradicting that ϕ(|u|, |v|) is false.
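Lemma 1 can be sanity-checked by brute force on small lengths. The sketch below assumes the four-disjunct form of φ described in the proof (equal lengths; one length zero and the other even; both positive with gcd(|x| + 2, |y| + 2) > 1) and enumerates all candidate words over the binary alphabet {a, b}. The names `phi` and `has_solution` are illustrative.

```python
from itertools import product
from math import gcd


def phi(n_x: int, n_y: int) -> bool:
    """The length condition of Lemma 1, written with an explicit gcd."""
    return (n_x == n_y
            or (n_x == 0 and n_y % 2 == 0)
            or (n_y == 0 and n_x % 2 == 0)
            or (n_x > 0 and n_y > 0 and gcd(n_x + 2, n_y + 2) > 1))


def has_solution(n_x: int, n_y: int, alphabet: str = "ab") -> bool:
    """Exhaustively search for words x, y of the given lengths with x ab y = y ab x."""
    for x in map("".join, product(alphabet, repeat=n_x)):
        for y in map("".join, product(alphabet, repeat=n_y)):
            if x + "ab" + y == y + "ab" + x:
                return True
    return False
```

Such a check is of course no substitute for the proof; it only confirms that the formula and the equation agree on the finitely many length pairs that are enumerated.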

Reduction to Counter Systems
In this section, we will provide an algorithm for computing a counter system from (E, S), where E is a quadratic word equation and S is a set of regular constraints. We will first describe this algorithm for the case without regular constraints, after which we show the extension to the case with regular constraints.
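Given the quadratic word equation $E$, we show how to compute a counter system $C(E) = (X, Q, \Delta)$ such that the set of initial counter values from which $C(E)$ can reach the trivial equation $\epsilon = \epsilon$ is precisely the length abstraction of $E$ (Theorem 2). Before defining $C(E)$, we define some notation. Define the following formulas:
$\mathrm{ID}(\bar{x}, \bar{x}') := \bigwedge_{x \in \bar{x}} x' = x$
$\mathrm{SUB}_{y,z}(\bar{x}, \bar{x}') := z \leq y \wedge y' = y - z \wedge \bigwedge_{x \in \bar{x},\, x \neq y} x' = x$
$\mathrm{DEC}_{y}(\bar{x}, \bar{x}') := y > 0 \wedge y' = y - 1 \wedge \bigwedge_{x \in \bar{x},\, x \neq y} x' = x$
Note that the $\neq$ symbol in the guard of $\bigwedge$ denotes syntactic (dis)equality (i.e. not equality in Presburger Arithmetic). We omit mention of the free variables $\bar{x}$ and $\bar{x}'$ when they are clear from the context. We now define the counter system. Given a quadratic word equation $E$ with constants $A$ and variables $V$, we define a counter system $C(E) = (X, Q, \Delta)$ as follows.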
The counters X will be precisely all variables that appear in E, i.e., X := V . The control states are precisely all equations E ′ that can be rewritten from E using Nielsen transformation, i.e., Q := {E ′ : E ⇒ * E ′ }. The set Q is finite (at most exponential in |E|) as per our discussion in the previous section.
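We now define the transition relation $\Delta$. We use $\bar{x}$ to enumerate $V$ in some order. Given $E_1 \Rightarrow E_2$ with $E_1, E_2 \in Q$, we add the transition $(E_1, \Phi(\bar{x}, \bar{x}'), E_2)$ to $\Delta$, where $\Phi$ is defined as follows:
- If $E_1 \Rightarrow E_2$ applies a rule for erasing an empty prefix variable $y \in \bar{x}$, then $\Phi := y = 0 \wedge \mathrm{ID}$.
- If $E_1 \Rightarrow E_2$ applies a rule for removing a nonempty prefix $\alpha$ by substituting $\alpha y$ for a variable $y$, then $\Phi := \mathrm{SUB}_{y,x}$ when $\alpha$ is a variable $x$, and $\Phi := \mathrm{DEC}_y$ when $\alpha \in A$.
Each transition decreases at most one counter and increases none; as observed in the introduction, the counter system therefore terminates from any initial configuration.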

The proof of Theorem 2 immediately follows from Proposition 3 that Nielsen transformation generates all solutions.
Extension to the case with regular constraints: In this extension, we only need to assert that the counter values belong to the length abstractions of the regular constraints, which are effectively semilinear due to Parikh's Theorem [29]. Given a quadratic word equation $E$ with a set $S$ of regular constraints, we define the counter system $C(E, S) = (X, Q, \Delta)$ as follows. Let $C(E) = (X_1, Q_1, \Delta_1)$. Let $X = X_1$. Let $Q$ be the finite set of all configurations reachable from $(E, S)$, i.e., $Q = \{(E', S') : (E, S) \Rightarrow^* (E', S')\}$. Given $(E_1, S_1) \Rightarrow (E_2, S_2)$, we add the transition $((E_1, S_1), \Phi(\bar{x}, \bar{x}'), (E_2, S_2))$ as follows. Suppose that $(E_1, \Phi'(\bar{x}, \bar{x}'), E_2)$ was added to $\Delta_1$ by $E_1 \Rightarrow E_2$. Then $\Phi$ is the conjunction of $\Phi'$ with, for each variable $x$, a unary guard of the form $x \in \mathrm{LEN}(\bigcap_{(x \in L) \in S} L)$.
The size of the NFA for $\bigcap_{(x \in L) \in S} L$ is exponential in the number of constraints of the form $(x \in L)$ in $S$ (of which there are polynomially many). The constraint $x \in \mathrm{LEN}(L)$ is well-known to be effectively semilinear [29], and in fact, using the algorithm of Chrobak-Martinez [9,27,30], we can compute in polynomial time two finite sets $A, A'$ of integers and an integer $b$ such that, for each $n \in \mathbb{N}$, $n \in U := A \cup (A' + b\mathbb{N})$ is true iff $n \in \mathrm{LEN}(L)$. Note that $U$ is a finite union of arithmetic progressions (with period 0 and/or $b$). In fact, each number $a \in A \cup A'$ (resp. the number $b$) is at most quadratic in the size of the NFA, and so it is of polynomial size even when written in unary. Therefore, treating $U$ as an existential Presburger formula $\varphi(x)$ with one free variable (an existential quantifier is needed to guess the coefficient $n$ such that $x = a_i + bn$ for some $i$), the resulting $\Phi'$ is a polynomial-sized existential Presburger formula.
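The shape A ∪ (A′ + bN) of the length set of an NFA can be observed with a naive computation. The sketch below is not the Chrobak-Martinez algorithm cited above and may take exponential time: it iterates the set of states reachable after n letters until a state set repeats, which yields a finite part, a set of offsets, and a period. The names `length_abstraction` and `in_length_set`, and the encoding of transitions as triples, are assumptions made for illustration.

```python
def length_abstraction(transitions, initial, final):
    """Return (finite, offsets, period) with LEN(L) = finite U (offsets + period * N).

    transitions is an iterable of (source, letter, target) triples.
    """
    step = {}
    for p, _letter, q in transitions:
        step.setdefault(p, set()).add(q)
    seen = {}        # state set -> first length at which it occurs
    accepted = []    # accepted[n] is True iff some word of length n is accepted
    current = frozenset([initial])
    n = 0
    while current not in seen:
        seen[current] = n
        accepted.append(final in current)
        current = frozenset(q for p in current for q in step.get(p, ()))
        n += 1
    start, period = seen[current], n - seen[current]
    finite = {k for k in range(start) if accepted[k]}
    offsets = {k for k in range(start, n) if accepted[k]}
    return finite, offsets, period


def in_length_set(n, finite, offsets, period):
    """Membership of n in finite U (offsets + period * N)."""
    return n in finite or any(n >= a and (n - a) % period == 0 for a in offsets)
```

For the NFA with transitions q0 -#-> q1 and q1 -a,b-> q1, the computation returns an empty finite part, the single offset 1, and period 1, i.e. all lengths n ≥ 1.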
As for the case without regular constraints, the proof of Theorem 2 immediately follows from Proposition 3 that Nielsen transformation generates all solutions.
Decidability via Linear Arithmetic with Divisibility

Accelerating a 1-variable-reducing cycle
Consider a counter system $C = (X, Q, \Delta)$ with $Q = \{q_0, \ldots, q_{n-1}\}$, such that for some $y \in X$ the transition relation $\Delta$ consists of precisely the transitions $(q_i, \Phi_i, q_{i+1 \pmod n})$, for each $i \in [n-1]$, where $\Phi_i$ is either $\mathrm{SUB}_{y,z}$ (with $z$ a variable distinct from $y$) or $\mathrm{DEC}_y$. Such a counter system is said to be a 1-variable-reducing cycle.
Lemma 3. There exists a polynomial-time algorithm which, given a 1-variable-reducing cycle $C = (X, Q, \Delta)$ and two states $p, q \in Q$, computes a formula $\varphi_{p,q}(\bar{x}, \bar{x}')$ in existential Presburger with divisibility such that $(p, v) \to^*_C (q, w)$ iff $\varphi_{p,q}(v, w)$ is satisfiable.
This lemma can be seen as a special case of the acceleration lemma for flat parametric counter automata [7] (where all variables other than $y$ are treated as parameters). However, its proof is in fact quite simple. Without loss of generality, we assume that $p = q_0$ and $q = q_i$, for some $i \in \mathbb{N}$. Any path $(q_0, v) \to^*_C (q_i, w)$ can be decomposed into the cycle $(q_0, v) \to^* (q_0, v')$ and the simple path $(q_0, w_0) \to \cdots \to (q_i, w_i)$ of length $i$. Therefore, the reachability relation $(q_0, \bar{x}) \to^*_C (q_i, \bar{x}')$ can be expressed as
$\exists \bar{z}_0, \ldots, \bar{z}_i : \varphi_{q_0,q_0}(\bar{x}, \bar{z}_0) \wedge \bigwedge_{j=0}^{i-1} \Phi_j(\bar{z}_j, \bar{z}_{j+1}) \wedge \bar{z}_i = \bar{x}'$
Thus, it suffices to show that $\varphi_{q_0,q_0}(\bar{x}, \bar{x}')$ is expressible in PAD. Define a multiset $M$ of counter decrements as follows:
- The number of occurrences of the integer constant 1 in $M$ is defined as the number of $i$ such that $\Phi_i = \mathrm{DEC}_y$.
- For each variable $x \in X \setminus \{y\}$, the number of occurrences of $x$ in $M$ is defined as the number of $i$ such that $\Phi_i = \mathrm{SUB}_{y,x}$.
For any variable/constant $e$, we will write $M(e)$ to denote the number of times $e$ appears in $M$. Therefore, for some $n \in \mathbb{N}$ we have $y' = y - n \sum_{e} M(e)\, e$, or equivalently $n \sum_{e} M(e)\, e = y - y'$. The formula $\varphi_{q_0,q_0}$ can be defined as follows:
$\varphi_{q_0,q_0}(\bar{x}, \bar{x}') := y' \leq y \;\wedge\; \big(\sum_{e} M(e)\, e\big) \mid (y - y') \;\wedge\; \bigwedge_{x \in X \setminus \{y\}} x' = x$
Handling unary Presburger guards: Recalling our reduction for the case with regular constraints from Section 4 reveals that we also need unary Presburger guards on the counters. We will show how to extend our aforementioned acceleration lemma to handle such guards. As we will see shortly, we will need a bit of the theory of semilinear sets. As before, our counter system $C = (X, Q, \Delta)$ has control states $Q = \{q_0, \ldots, q_{n-1}\}$, and the control structure is a simple cycle of length $n$, i.e., the transitions in $\Delta$ are precisely $(q_i, \Phi_i, q_{i+1 \pmod n})$ for some Presburger formula $\Phi_i(\bar{x}, \bar{x}')$, for each $i \in [n-1]$. We say that $C$ is 1-variable-reducing with unary Presburger guards if there exists a counter $y \in X$ such that each $\Phi_i$ is of the form $\theta_i \wedge \psi_i$, where $\theta_i$ is either $\mathrm{SUB}_{y,z}$ (with $z$ a variable distinct from $y$) or $\mathrm{DEC}_y$, and $\psi_i$ is a conjunction of formulas of the form $x \in A_i \cup (A_i' + b\mathbb{N})$, where both $A_i$ and $A_i'$ are finite sets of natural numbers and $x \in X$. For each counter $x \in X$, we use $\psi_{i,x}$ to denote the set of conjuncts in $\psi_i$ that refer to the counter $x$.
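For guard-free cycles as in Lemma 3, the divisibility characterisation can be compared against explicit execution. The sketch below assumes the accelerated relation has the form "y′ ≤ y and the total decrement of one iteration divides y − y′" (with the convention that a total decrement of 0 forces y′ = y), and that counters stay in N, so a step that would make y negative is blocked. A cycle is encoded as a list whose entries are either the string "DEC" or the name of the variable z of a SUB step; this encoding and the function names are hypothetical.

```python
def total_decrement(labels, valuation):
    """Sum of the decrements applied to y during one full iteration of the cycle."""
    return sum(1 if label == "DEC" else valuation[label] for label in labels)


def simulate_returns(labels, valuation, y):
    """Values of y observed at q_0, obtained by running the cycle step by step."""
    values = {y}
    while True:
        current = y
        for label in labels:
            dec = 1 if label == "DEC" else valuation[label]
            if dec > current:
                return values        # blocked: the counter must stay in N
            current -= dec
        if current == y:
            return values            # the iteration changes nothing
        y = current
        values.add(y)


def accelerated(labels, valuation, y, y_next):
    """Divisibility test standing in for the accelerated relation on y."""
    total = total_decrement(labels, valuation)
    if y_next > y:
        return False
    if total == 0:
        return y_next == y
    return (y - y_next) % total == 0
```

With the cycle DEC, SUB z, SUB z and z = 2, each iteration removes 5 from y, so from y = 17 the values seen at q_0 are 17, 12, 7 and 2.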

Lemma 4.
There exists a polynomial-time algorithm which, given a 1-variable-reducing cycle with unary Presburger guards $C = (X, Q, \Delta)$ and two states $p, q \in Q$, computes a formula $\lambda_{p,q}(\bar{x}, \bar{x}')$ in existential Presburger with divisibility such that $(p, v) \to^*_C (q, w)$ iff $\lambda_{p,q}(v, w)$ is satisfiable.
Unlike Lemma 3, this lemma does not immediately follow from the results of [7] on flat parametric counter automata. To prove this, let us first take the formula $\varphi_{p,q}(\bar{x}, \bar{x}')$ from Lemma 3 applied to $C'$, which is obtained from $C$ by removing the unary Presburger guards. We can insert these unary Presburger guards into $\varphi_{p,q}$, but this is not enough because we need to make sure that all "intermediate" values of $y$ also satisfy the Presburger guards corresponding to $y$ on that control state. More precisely, let the counter decrement in $\theta_i$ be $\alpha_i$ (which can either be a variable $x$ distinct from $y$ or 1). For $j \in [n-1]$, we use $f_j(\bar{x})$ to denote $\sum_{i=0}^{j} \alpha_i$. Write $f(\bar{x})$ for $f_{n-1}(\bar{x})$. Then, we can write a formula $\eta_{q_0,q_0}$ that strengthens $\varphi_{q_0,q_0}$ with a condition, universally quantified over the number $k$ of completed iterations of the cycle, stating that every intermediate value of $y$ satisfies the corresponding guards. This is a correct expression that captures the reachability relation $(q_0, w) \to^*_C (q_0, w')$, but the problem is that it has a universal quantifier and therefore is not a formula of PAD. To fix this problem, we will need to exploit the semilinear structure of unary Presburger guards. To this end, we first notice that, by taking the big conjunction over $i$ and the big conjunction over $\alpha_i$ out, the formula $\eta_{q_0,q_0}$ is equivalent to a conjunction of formulas each containing a single universally quantified guard. Therefore, it suffices to rewrite each conjunct $C(\bar{x}) := \forall k : y' + (k+1) f(\bar{x}) \leq y \longrightarrow (y' + k f(\bar{x}) + \alpha_i \in A \cup (A' + b\mathbb{N}))$ as an existential Presburger formula, for each $i$ and constraint $(\alpha_i \in A \cup (A' + b\mathbb{N}))$. To this end, let $a := \max A$ and let $N$ denote $|A'|$. We claim that the universal quantifier can be bounded: it suffices to let $k$ range over the values up to $a + N + 1$. Simply put, we distinguish the cases when $y' + k f(\bar{x}) + \alpha_i$ is "small" (i.e., less than the maximum threshold that can keep this number in an arithmetic progression with 0 period), and when this number is "big" (i.e. must be in an arithmetic progression with a nonzero period).
To prove this equivalence, it suffices to show that if $y' + k f(\bar{x}) + \alpha_i \notin A \cup (A' + b\mathbb{N})$ with $k > a + N + 1$ and $y' + k f(\bar{x}) + \alpha_i \leq y$, then we can find $k' \leq a + N + 1$ such that $y' + k' f(\bar{x}) + \alpha_i \notin A \cup (A' + b\mathbb{N})$. Suppose to the contrary that such $k'$ does not exist. Then, since there are $N + 1$ numbers in between $a + 1$ and $a + N + 1$, by the pigeonhole principle there is an arithmetic progression $a' + b\mathbb{N}$ and two different numbers $a + 1 \leq j_1 < j_2 \leq a + N + 1$ such that $y' + j_h f(\bar{x}) + \alpha_i \in a' + b\mathbb{N}$, for $h = 1, 2$. Let $d := (j_2 - j_1)$. Note that $d f(\bar{x})$ denotes the difference between $y' + j_1 f(\bar{x}) + \alpha_i$ and $y' + j_2 f(\bar{x}) + \alpha_i$, and this difference is of the form $mb$, for some positive integer $m$. We now find a number $j \in [a + 1, a + N]$ with $j + qd = k$ for some positive integer $q$. Since $y' + j f(\bar{x}) + \alpha_i \in a'' + b\mathbb{N}$ for some $a'' \in A'$, it must be the case that $y' + (j + qd) f(\bar{x}) + \alpha_i \in a'' + b\mathbb{N}$ for $q \in \mathbb{N}$, contradicting that $y' + k f(\bar{x}) + \alpha_i \notin A \cup (A' + b\mathbb{N})$. We have proven correctness, and what remains is to analyse the size of the formula $\lambda_{q_0,q_0}$. To this end, it suffices to show that each formula $C(\bar{x})$ is of polynomial size. This is in fact the case since there are at most polynomially many numbers in $A$ and $A'$, and all numbers in $A \cup A' \cup \{b\}$ are of polynomial size even when they are written in unary.

An extension to flat control structures and an acceleration scheme
The following generalisation to flat control structures is an easy corollary of Lemmas 3 and 4.

Theorem 4.
There exists a polynomial-time algorithm which, given a flat Presburger counter system C = (X, Q, ∆), each of whose simple cycles is 1-variable-reducing with unary Presburger guards, and two states p, q ∈ Q, computes a formula λ_{p,q}(x̄, x̄′) in existential Presburger with divisibility such that (p, v) →*_C (q, w) iff λ_{p,q}(v, w) is satisfiable.
Indeed, to prove this theorem, we can simply use Lemma 4 to accelerate all cycles, together with the fact that transition relations expressed in existential Presburger with divisibility are closed under composition.

Application to word equations with length constraints
Theorem 4 gives rise to a simple and sound (but not complete) technique for solving quadratic word equations with length constraints: given a quadratic word equation (E, S) with regular constraints, if the counter system C(E, S) is flat and each of its simple cycles is 1-variable-reducing with unary Presburger guards, then apply the decision procedure from Theorem 4. In this section, we show completeness of this method for the class of regular-oriented word equations recently defined in [11], which can be extended with regular constraints given as 1-weak NFA [2]. A word equation is regular if each variable x ∈ V occurs at most once on each side of the equation. Observe that xy = yx is regular, but xxyy = zz is not. It is easy to see that a regular word equation is quadratic. A word equation L = R is said to be oriented if there is a total ordering < on V such that the occurrences of variables on each side of the equation preserve <, i.e., if w = L or w = R and w = w_1 α w_2 β w_3 for some w_1, w_2, w_3 ∈ (A ∪ V)* and α, β ∈ V, then α < β. Observe that xy = yz (i.e. that x and z are conjugates) is oriented, but xy = yx is not oriented. It was shown in [11] that the satisfiability problem for regular-oriented word equations is NP-hard. We show that satisfiability for this class with length constraints is decidable.
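The two syntactic conditions above are easy to test mechanically. The sketch below assumes single-character symbols and an explicitly given set of variables. For orientedness it uses the observation that a suitable total order exists iff the "occurs before on some side" relation between variables is acyclic, which is checked by a topological sort; a variable occurring twice on one side yields a self-loop and is therefore rejected. The function names are illustrative.

```python
def is_regular(lhs, rhs, variables):
    """Each variable occurs at most once on each side."""
    return all(side.count(x) <= 1 for side in (lhs, rhs) for x in variables)


def is_oriented(lhs, rhs, variables):
    """Some total order on the variables is respected by both sides."""
    before = {x: set() for x in variables}
    for side in (lhs, rhs):
        occurrences = [s for s in side if s in variables]
        for i, x in enumerate(occurrences):
            before[x].update(occurrences[i + 1:])
    # Kahn's algorithm: the precedence relation must be acyclic.
    indegree = {x: 0 for x in variables}
    for x in variables:
        for y in before[x]:
            indegree[y] += 1
    ready = [x for x in variables if indegree[x] == 0]
    placed = 0
    while ready:
        x = ready.pop()
        placed += 1
        for y in before[x]:
            indegree[y] -= 1
            if indegree[y] == 0:
                ready.append(y)
    return placed == len(variables)
```

On the examples from the text, xy = yx is regular but not oriented, xy = yz is both, and xxyy = zz is not regular.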

Theorem 5. The satisfiability problem of regular-oriented word equations with length constraints is decidable in nondeterministic exponential time.
This decidability (in fact, an NP upper bound) for the strictly regular-ordered subcase (i.e. each variable occurs precisely once on each side) was proven in [10]. For solving this subcase, it was shown that Presburger Arithmetic is sufficient, but the decidability for the general class of regular-oriented word equations with length constraints remained open; we settle it in this paper. Before proving this theorem, we show a simple lemma that ⇒ preserves regular-orientedness. Its proof can be found in the appendix.
Lemma 6. If E ⇒ E′ and E is regular-oriented, then E′ is also regular-oriented.
We now prove Theorem 5. Let $E := L = R$. We first show that a simple cycle in the control structure of $C(E)$ is of length at most $N = \max\{|L|, |R|\} - 1$. Given a simple cycle $E_0 \Rightarrow E_1 \Rightarrow \cdots \Rightarrow E_n$ with $n > 0$ (i.e. $E_0 = E_n$ and $E_i \neq E_j$ for all $0 \leq i < j < n$), it has to be the case that each rewriting in this cycle applies one of the (P2)-(P4) rules since the other rules reduce the size of the equation. Write $E_i := L_i = R_i$, with $\alpha_i$ and $\beta_i$ the first symbols of $L_i$ and $R_i$ respectively. We assume that $E_1 = E_0[\beta_0\alpha_0/\beta_0]$; the other case will be easily seen to be symmetric. This assumption implies that $\beta_0$ is a variable $y$, and that $L_0 = uyv$ for some words $u, v \in (A \cup V)^*$ (for, otherwise, $|E_1| < |E_0|$). Furthermore, it follows that, for each $i \in [n-1]$, $E_{i+1} = E_i[\beta_i\alpha_i/\beta_i]$, i.e., the counter system $C(E)$ applies either $\mathrm{SUB}_{y,x}$ (in the case when $x = \alpha_i$) or $\mathrm{DEC}_y$ (in the case when $\alpha_i \in A$). An argument by contradiction on a minimal index $i \in [1, n-1]$ violating this then shows that the length of the cycle is at most $|R_0| - 1 \leq |R| - 1$.
Consider the control structure of C(E) as a dag of SCCs. In this dag, each edge from one SCC to the next is size-reducing. Therefore, the maximal length of a path in this dag is |E|. Since the maximal length of a simple path within each SCC is N (from the above analysis), the maximal length of a simple path in the control structure is at most N^2.
Handling regular constraints: It is difficult to extend Theorem 5 to the case with regular constraints because they may introduce nestings of cycles (which breaks the flat control structure) even for regular-oriented word equations. However, we can show that restricting to regular constraints given by 1-weak NFA [2] (i.e., a dag of SCCs, each with at most one state) preserves the flat control structure. The class of 1-weak automata is in fact quite powerful, e.g., when considered as recognisers of languages of ω-words, they capture the subclass of LTL with operators F and G [2]. They have also been used to obtain a decidable extension of infinite-state concurrent systems in term rewriting systems, e.g., see [20,31]. In the context of quadratic word equations, we can use 1-weak NFA to capture the regular constraint x, y ∈ #(a + b)^*, which in conjunction with xz = zy gives rise to a non-Presburger length abstraction. Such an NFA will have two states q_0 and q_1, and transitions q_0 --#--> q_1 and q_1 --a,b--> q_1, where q_0 is an initial state and q_1 a final state.
By virtue of Theorem 4, Lemma 7 implies the decidability claimed in Theorem 6, but it does not imply a nondeterministic exponential time upper bound, since each unary Presburger guard in C(E) will be of the form x ∈ LEN(⋂_{(x∈L)∈S} L). Even though |S| is always of polynomial size, computing this intersection requires a product automata construction, which results in an NFA of exponential size. Therefore, we obtain a nondeterministic double exponential time upper bound (2NEXP), instead of NEXP as in the case without regular constraints.
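The 1-weak condition and the two-state NFA for #(a + b)^* described above can be illustrated with a short sketch. It assumes an NFA is given as a list of (source, letter, target) triples; the representation and the function names are illustrative and are not taken from the paper or from [2]. An NFA is treated as 1-weak when, after discarding self-loops, its transition graph is acyclic, which is the same as saying that every SCC consists of a single state.

```python
def is_one_weak(transitions):
    """True if every SCC of the transition graph is a single state.

    `transitions` is a list of (source, letter, target) triples.
    """
    successors = {}
    for source, _, target in transitions:
        successors.setdefault(source, set())
        successors.setdefault(target, set())
        if source != target:  # self-loops are allowed in a 1-weak NFA
            successors[source].add(target)
    # Kahn's algorithm: the graph is acyclic iff every state can be removed.
    indegree = {state: 0 for state in successors}
    for source in successors:
        for target in successors[source]:
            indegree[target] += 1
    ready = [state for state in successors if indegree[state] == 0]
    removed = 0
    while ready:
        source = ready.pop()
        removed += 1
        for target in successors[source]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    return removed == len(successors)


def accepts(transitions, initial, finals, word):
    """Standard subset simulation of an NFA without epsilon transitions."""
    current = {initial}
    for letter in word:
        current = {q for (p, a, q) in transitions if p in current and a == letter}
    return bool(current & finals)


# The NFA from the text: q0 --#--> q1 and q1 --a,b--> q1.
HASH_AB = [("q0", "#", "q1"), ("q1", "a", "q1"), ("q1", "b", "q1")]
# A two-state cycle, recognising (ab)^* from state p; not 1-weak.
AB_STAR = [("p", "a", "q"), ("q", "b", "p")]

print(is_one_weak(HASH_AB))                      # True
print(is_one_weak(AB_STAR))                      # False
print(accepts(HASH_AB, "q0", {"q1"}, "#ab"))     # True
print(accepts(HASH_AB, "q0", {"q1"}, "a#"))      # False
```

The second automaton shows the kind of regular constraint that falls outside the 1-weak class, because its only cycle passes through two distinct states.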
The proof of Lemma 7 can be found in the appendix.
Remark 1. Our proof of Theorem 6 does not extend to the case when we allow generalised flat NFA (i.e., after mapping all the letters in A to a new symbol '?', the control structure of the NFA is flat) in the regular constraints. This is because a simple cycle involving two or more states will result in a counter system that is no longer flat.
Finally, we mention that the length abstraction of regular-oriented word equations with regular constraints is in general not Presburger-definable (see appendix for proof).

Future Work
One obvious research direction is to study extensions of our techniques to deal with the class of regular (but not necessarily oriented) word equations with length constraints. We believe that this is a key subproblem of the general class of quadratic word equations with length constraints. Finally, we conjecture that the length abstractions of general quadratic word equations with regular constraints can be effectively captured by existential Presburger Arithmetic with divisibility.

APPENDIX

A Proof of Lemma 6
It is easy to see that each rewriting rule preserves regularity. Now, because E′ is regular, to show that E′ := L = R is also oriented it is necessary and sufficient to show that there are no two variables x, y such that x occurs before y in L, but y occurs before x in R. All rewriting rules except for (P2)-(P4) are easily seen to preserve orientedness. Let us write E := αw_1 = βw_2 with α ≠ β, and assume E′ = E[αβ/β]; the case of E′ = E[βα/α] is symmetric. So, β is some variable y. If β does not occur in w_1, then L = w_1 and R = βw_2, and the fact that E is oriented implies that E′ is oriented. So assume that β appears in w_1, say, w_1 = uβv. Then, R = βw_2 and L = uαβv. Thus, if α ∈ A, E′ is oriented because we can use the same variable ordering that witnesses that E is oriented. So, assume α ∈ V. It suffices to show that α occurs at most once in E. For, if α also occurred on the other side of the equation E (i.e., in w_2), then α would precede β on the l.h.s. of E, while β would precede α on the r.h.s. of E, which would show that E is not oriented.
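The case analysed in this proof can be tried out on small equations. The sketch below assumes that the rewriting step has exactly the shape used above: for E := αw_1 = βw_2 with β a variable, every occurrence of β is replaced by αβ and the common leading α is then cancelled. The string encoding and the helper names are illustrative assumptions, and the code only exercises this one case; it is not an implementation of the full rule set (P2)-(P4).

```python
from itertools import combinations


def is_regular_oriented(lhs, rhs, variables):
    """No variable repeats on a side, and no pair of variables is inverted."""
    orders = []
    for side in (lhs, rhs):
        occurrences = [s for s in side if s in variables]
        if len(occurrences) != len(set(occurrences)):
            return False
        orders.append({v: i for i, v in enumerate(occurrences)})
    left, right = orders
    shared = sorted(left.keys() & right.keys())
    return all(
        (left[u] < left[v]) == (right[u] < right[v])
        for u, v in combinations(shared, 2)
    )


def substitute_step(lhs, rhs):
    """Apply E[alpha beta / beta], then cancel the leading alpha on both sides.

    Assumes lhs = alpha w_1 and rhs = beta w_2 with alpha != beta and beta a
    variable, as in the case treated in the proof.
    """
    alpha, beta = lhs[0], rhs[0]
    new_lhs = lhs.replace(beta, alpha + beta)
    new_rhs = rhs.replace(beta, alpha + beta)
    # Both sides now start with alpha, so it can be cancelled.
    return new_lhs[1:], new_rhs[1:]


V = {"x", "y", "z"}
print(substitute_step("xy", "yz"))   # ('xy', 'yz')
print(substitute_step("xz", "yx"))   # ('z', 'yx')
```

On xy = yz the step returns the same equation, which is the kind of self-loop that the counter system tracks; on xz = yx it produces z = yx, which is again regular-oriented, in line with Lemma 6.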

B Proof of Lemma 7
Since C(E, S) is simply C(E) annotated by the regular constraints and how they are modified, it suffices to prove that, for each SCC G in the control structure of C(E) with nodes E_1, ..., E_m, the restriction C′ of C(E, S) to control states of the form (E_i, S′), for some set S′ of regular constraints, is flat, each of its simple cycles is of length O(|E|), and each of its simple paths is of length at most O(|E||V||S|M^3). By the proof of Theorem 5, we know that C(E) is flat, that each of its simple cycles is 1-variable-reducing and of length at most |E|, and that each simple path over the dag of SCCs corresponding to the control structure of C(E) is of length at most O(|E|). Therefore, we may assume that the SCC G is a cycle E_1 → ··· → E_m and that there exists a counter y which is reduced by each transition in G using SUB_{y,z} or DEC_y.
We will next define a partial order ⊴ on sets of regular constraints of the form x ∈ L(A_{p,q}), where x ∈ V and A is an NFA in S. We will write S_1 ⊲ S_2 if S_1 ⊴ S_2 but S_1 ≠ S_2. Before defining ⊴, we note the intuition behind this partial order: each transition ((E′, S′), Φ, (E″, S″)) in the control structure G′ of C′ will ⊴-decrease S′, i.e., S′ ⊴ S″. In particular, this implies that every simple cycle in G′ will consist of the nodes (E_1, H), ..., (E_m, H) for some set H of regular constraints. Therefore, the length of each simple cycle in G′ is at most O(|E|). Since we will see that the length of a simple path in ⊴ is at most O(|V||S|M^3), it will follow that the length of a simple path in G′ is at most O(|E||V||S|M^3).
We write S_1 ⊴ S_2 if:
- The number of regular constraints in S_2 on y (i.e., of the form y ∈ L for some L) is at most the number of regular constraints in S_1 on y.
- For each variable x distinct from y, if x ∈ L is in S_1, then it is also in S_2.
- For each constraint y ∈ L(A_{r,q}) in S_2, there exists a state p that can reach r in A such that the constraint y ∈ L(A_{p,q}) is in S_1.
The fact that ⊴ is a partial order then follows from the fact that each A is 1-weak, i.e., its transition relation gives rise to a partial order on the states of A. To show that the length of a simple path in ⊴ is at most O(|V||S|M^3), observe that we can add at most |S|M^2 regular constraints on a variable x (distinct from y). In addition, if the constraints on y in some S_1 are y ∈ L(A^1_{p_1,q_1}), ..., y ∈ L(A^K_{p_K,q_K}), and if k_i is the length of a maximal simple path in A^i_{p_i,q_i}, then one can change the set of constraints on y in S_1 to any S_2 with S_1 ⊴ S_2 by at most […]. This proves that any simple path in ⊴ is of length O(|V||S|M^3).

C Proof of Proposition 4
We claim that the length abstraction of E := xz = zy with the regular constraint x, y ∈ #(a + b)^* is precisely the set of triples (n_x, n_y, n_z) ∈ ℕ^3 satisfying the formula ϕ(l_x, l_y, l_z) := l_x = l_y ∧ l_x > 0 ∧ l_x | l_z.
Since divisibility is not Presburger-definable, the proposition immediately follows. To show that for each triple n = (n_x, n_y, n_z) satisfying ϕ there exists a solution σ to E and the constraint x, y ∈ #(a + b)^*, simply consider σ with σ(x) = σ(y) = #a^{n_x − 1} and σ(z) = σ(x)^{n_z/n_x}. Conversely, consider a solution σ satisfying xz = zy and x, y ∈ #(a + b)^*. We must have σ(x) = σ(y), since two conjugates in #(a + b)^* must be related by a full cyclic permutation, i.e., they are the same word. We then have |σ(x)| = |σ(y)| > 0. To show that |σ(x)| divides |σ(z)|, let |σ(z)| = q|σ(x)| + r for some q ∈ ℕ and r ∈ [|σ(x)| − 1]. It suffices to show that r = 0. To this end, matching both sides of E, we obtain σ(z) = σ(x)^q w, where w is a prefix of σ(x) of length r. If r > 0, then matching both sides of the equation from the right reveals that the last |σ(y)| − 1 letters on the l.h.s. of σ(E) contain #, which is not the case on the r.h.s. of σ(E), contradicting that σ is a solution to E. Therefore, r = 0, proving the claim.
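The claim can be sanity-checked by brute force on short words. The sketch below enumerates all assignments with x, y ∈ #(a + b)^* of length at most 3 and z over {#, a, b} of length at most 4, collects the length triples of the solutions of xz = zy, and compares them with the triples predicted by ϕ within the same bounds. The bounds and the function names are arbitrary choices made for illustration; the check is evidence for small cases only and is no substitute for the proof.

```python
from itertools import product


def words(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def length_abstraction(max_xy, max_z):
    """Length triples of solutions of xz = zy with x, y in #(a+b)^*."""
    marked = ["#" + w for w in words("ab", max_xy - 1)]
    triples = set()
    for x in marked:
        for y in marked:
            for z in words("#ab", max_z):
                if x + z == z + y:
                    triples.add((len(x), len(y), len(z)))
    return triples


def phi(lx, ly, lz):
    """The formula l_x = l_y and l_x > 0 and l_x | l_z."""
    return lx == ly and lx > 0 and lz % lx == 0


def predicted(max_xy, max_z):
    """All triples within the bounds that satisfy phi."""
    return {
        (lx, ly, lz)
        for lx in range(max_xy + 1)
        for ly in range(max_xy + 1)
        for lz in range(max_z + 1)
        if phi(lx, ly, lz)
    }


print(length_abstraction(3, 4) == predicted(3, 4))  # True
```

Within these bounds both sets contain the same ten triples, for example (2, 2, 4) but not (2, 2, 3), which matches the divisibility condition l_x | l_z.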