Linear Temporal Logic for Regular Cost Functions

Regular cost functions have been introduced recently as an extension of the notion of regular languages with counting capabilities, which retains strong closure, equivalence, and decidability properties. The specificity of cost functions is that exact values are not considered, but only estimated. In this paper, we define an extension of Linear Temporal Logic (LTL) over finite words to describe cost functions. We give an explicit translation from this new logic to two dual forms of cost automata, and we show that the natural decision problems for this logic are PSPACE-complete, as is the case in the classical setting. We then algebraically characterize the expressive power of this logic, using a new syntactic congruence for cost functions introduced in this paper.


Introduction
Since the seminal works of Kleene and of Rabin and Scott, the theory of regular languages has been one of the cornerstones of computer science. Regular languages have many good properties (closure, equivalent characterizations, decidability), which make them central in many situations.
Recently, the notion of regular cost function for words has been presented as a candidate quantitative extension of the notion of regular languages, which retains most of the fundamental properties of the original theory, such as the closure properties, the various equivalent characterizations, and decidability [2]. A cost function is an equivalence class of functions from the domain (words in our case) to N ∪ {∞}, modulo an equivalence relation ≈ which allows some distortion but preserves boundedness over each subset of the domain. The model extends the notion of language in the following sense: one can identify a language with the function mapping each word inside the language to 0, and each word outside the language to ∞. It is a strict extension, since regular cost functions have counting capabilities, e.g., counting the number of occurrences of letters or measuring the length of intervals.
Linear Temporal Logic (LTL), which is a natural way to describe logical constraints over a linear structure, has also been a fertile subject of study, particularly in the context of regular languages and automata [10]. Moreover, quantitative extensions of LTL have recently been successfully introduced. For instance, the model Prompt-LTL introduced in [8] aims at bounding the waiting time of all requests of a formula, and in this sense is quite close to the aim of cost functions.
In this paper, we extend LTL (over finite words) into a new logic with quantitative features (LTL ≤ ), in order to describe cost functions over finite words with logical formulae. We do this by adding a new operator U ≤N : a formula φU ≤N ψ means that ψ holds somewhere in the future, and φ has to hold until that point, except at most N times (we allow at most N "mistakes" of the until formula).

Related works and motivating examples
Regular cost functions are the continuation of a sequence of works that intend to solve difficult questions in language theory. Among several other decision problems, the most prominent example is the star-height problem: given a regular language L and an integer k, decide whether L can be expressed by a regular expression with at most k nested Kleene stars. The problem was solved by Hashiguchi [5] using a very intricate proof, and later by Kirsten [7] using an automaton with counting features.
Finally, also using ideas inspired from [1], the theory of those automata over words has been unified in [2], in which cost functions are introduced, and suitable models of automata, algebra, and logic for defining them are presented and shown equivalent. Corresponding decidability results are provided. The resulting theory is a neat extension of the standard theory of regular languages to a quantitative setting.
On the logic side, Prompt-LTL, introduced in [8], is an interesting way to extend LTL in order to look at boundedness issues, and has already given interesting decidability and complexity results. In the framework of regular cost functions, Prompt-LTL would correspond to a subclass of the temporal cost functions introduced in [3]; in particular it is weaker than LTL ≤ introduced here.

Contributions
It is known from [2] that regular cost functions are the ones recognizable by stabilization semigroups (or, equivalently, stabilization monoids), and from [3] that there is an effective quotient-wise minimal stabilization semigroup for each regular cost function. This model of semigroups extends the standard approach for languages.
We introduce a quantitative version of LTL in order to describe cost functions by means of logical formulas. The idea of this new logic is to bound the number of "mistakes" of Until operators, by adding a new operator U ≤N . The first contribution of this paper is to give a direct translation from LTL ≤ -formulas to B-automata, which is an extension of the classic translation from LTL to Büchi automata for languages. This translation preserves exact values (i.e. not only cost function equivalence), which could be interesting in terms of future applications. We then show that regular cost functions described by LTL ≤ -formulae are the same as the ones computed by aperiodic stabilization semigroups, and this characterization is effective. The proof uses a syntactic congruence for cost functions, introduced in this paper.
This work validates the algebraic approach for studying cost functions, since the analogy extends to syntactic congruence. It also allows a more user-friendly way to describe cost functions, since LTL can be more intuitive than automata or stabilization semigroups to describe a given cost function.
As was done in [3] for temporal cost functions, the characterization result obtained here for LTL ≤ -definable cost functions follows the spirit of Schützenberger's theorem, which links star-free languages with aperiodic monoids [9].

Organisation of the paper
After some notations and a reminder on cost functions, we introduce in Section 3 LTL ≤ as a quantitative extension of LTL, and give an explicit translation from LTL ≤ -formulae to B-automata. We then present in Section 4 a syntactic congruence for cost functions, and show that it indeed computes the minimal stabilization semigroup of any regular cost function. We finally use this new tool to show that LTL ≤ has the same expressive power as aperiodic stabilization semigroups.

Notations
We will note N the set of non-negative integers and N∞ the set N ∪ {∞}, ordered by 0 < 1 < · · · < ∞. If E is a set, E^N is the set of infinite sequences of elements of E (we will not use here the notion of infinite words). Such sequences will be denoted by bold letters (a, b, ...). We will work with a fixed finite alphabet A. The set of words over A is A* and the empty word will be noted ε. The concatenation of words u and v is uv. The length of u is |u|. The number of occurrences of letter a in u is |u|_a. Functions N → N will be denoted by letters α, β, . . ., and will be extended to N ∪ {∞} by α(∞) = ∞.

Cost functions and equivalence
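If L ⊆ A*, we will note χ_L the function defined by χ_L(u) = 0 if u ∈ L, and χ_L(u) = ∞ if u ∉ L. Let F be the set of functions A* → N∞. For f, g ∈ F and α a function N → N, we write f ≤_α g if f ≤ α ∘ g, and f ≈_α g if f ≤_α g and g ≤_α f. Finally, f ≈ g if f ≈_α g for some α. This equivalence relation doesn't pay attention to exact values, but preserves the existence of bounds.
A cost function is an element of F/≈, i.e. an equivalence class for ≈. Cost functions are noted f, g, . . ., and in practice they will always be represented by one of their elements in F.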
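As a small illustration of this equivalence, the following sketch checks on a finite sample of words that two functions dominate each other up to a correction function α. It assumes the usual reading f ≈ g iff f ≤ α ∘ g and g ≤ α ∘ f for some α; the functions `count_a`, `distorted`, `length` and the choice α(n) = 2n + 5 are illustrative assumptions, and a finite sample can only refute domination for one given α, never prove equivalence.

```python
from itertools import product

INF = float("inf")

def count_a(u):
    """f(u) = |u|_a, the number of occurrences of 'a' in u."""
    return u.count("a")

def distorted(u):
    """g(u) = 2|u|_a + 5: different values, same bounded sets as f."""
    return 2 * u.count("a") + 5

def length(u):
    """h(u) = |u|: unbounded on b^n, where f stays at 0."""
    return len(u)

def alpha(n):
    """Illustrative correction function, extended by alpha(inf) = inf."""
    return INF if n == INF else 2 * n + 5

def dominated(f, g, correction, words):
    """Check f <= correction o g on every word of a finite sample."""
    return all(f(u) <= correction(g(u)) for u in words)

# All words over {a, b} of length at most 6 (a sample, not a proof).
WORDS = ["".join(w) for n in range(7) for w in product("ab", repeat=n)]

f_equiv_g = (dominated(count_a, distorted, alpha, WORDS)
             and dominated(distorted, count_a, alpha, WORDS))
h_below_f = dominated(length, count_a, alpha, WORDS)
```

Here `f_equiv_g` holds on the sample, while `h_below_f` fails on the word bbbbbb: no correction function can bound the length by the number of a's, which is the boundedness information that ≈ preserves.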

B-automata
A B-automaton is a tuple ⟨Q, A, In, Fin, Γ, ∆⟩ where Q is the set of states, A the alphabet, In and Fin the sets of initial and final states, Γ the set of counters, and ∆ ⊆ Q × A × ({i, r, c}*)^Γ × Q the set of transitions. Counters have integer values starting at 0, and an action σ ∈ ({i, r, c}*)^Γ performs a sequence of atomic actions on each counter, where atomic actions are either i (increment by 1), r (reset to 0) or c (check the value). In particular, we will note ε the action corresponding to the empty word: doing nothing on every counter. If e is a run, let C(e) be the set of values checked during e on all counters of Γ.

A B-automaton A computes a regular cost function [[A]] via the following semantics: [[A]](u) = inf {sup C(e) : e run of A over u}, with the usual conventions that sup ∅ = 0 and inf ∅ = ∞. There also exists a dual model of B-automata, namely S-automata, which has the same expressive power, but we won't develop this further in this paper. See [2] for more details.
Moreover, as in the case of languages, cost functions can be recognized by an algebraic structure that extends the classic notion of semigroups, called stabilization semigroups. A stabilization semigroup S = ⟨S, ·, ≤, ♯⟩ is a partially ordered set S together with an internal binary operation · and an internal unary operation a ↦ a♯ defined only on idempotent elements (elements a such that a · a = a). The formalism is quite heavy; see the appendix for all details on the axioms of stabilization semigroups and the recognition of regular cost functions.
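The B-automaton semantics given above (infimum over runs of the supremum of checked values) can be sketched by brute-force exploration of all runs. The encoding below is an assumption made for illustration: transitions are tuples (source, letter, action, target), an action is a dict mapping a counter name to a string over i, r, c, and a run is taken to be accepting when it starts in an initial state and ends in a final state.

```python
INF = float("inf")

def run_value(transitions, initial, final, counters, word):
    """Return the inf, over accepting runs on `word`, of the largest checked value."""
    best = INF  # inf of the empty set of runs

    def explore(state, pos, values, checked_max):
        nonlocal best
        if pos == len(word):
            if state in final:
                best = min(best, checked_max)
            return
        for (src, letter, action, dst) in transitions:
            if src != state or letter != word[pos]:
                continue
            new_values = dict(values)
            new_max = checked_max
            # Apply the atomic actions counter by counter, left to right.
            for counter, atoms in action.items():
                for atom in atoms:
                    if atom == "i":
                        new_values[counter] += 1
                    elif atom == "r":
                        new_values[counter] = 0
                    elif atom == "c":
                        new_max = max(new_max, new_values[counter])
            explore(dst, pos + 1, new_values, new_max)

    for q in initial:
        # checked_max starts at 0, matching the convention sup of the empty set = 0.
        explore(q, 0, {c: 0 for c in counters}, 0)
    return best

# Hypothetical one-state, one-counter automaton: increment and check on 'a'.
COUNT_A = [("q", "a", {"g": "ic"}, "q"), ("q", "b", {}, "q")]

value = run_value(COUNT_A, {"q"}, {"q"}, ["g"], "abaab")
```

With this automaton, `value` is the number of occurrences of a in the word. The exploration is exponential in general and only meant to make the definition concrete.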

Quantitative LTL
We will now use an extension of LTL to describe some regular cost functions. This has been done successfully with regular languages, so we aim to obtain the same kind of results. Can we still go efficiently from an LTL-formula to an automaton?

Definition
The first thing to do is to extend LTL so that it can describe cost functions instead of languages. We must add quantitative features, and this will be done by a new operator U ≤N . Unlike in most uses of LTL, we work here over finite words.
Formulas of LTL ≤ (on finite words on an alphabet A) are defined by the following grammar:
φ := a | Ω | φ ∧ φ | φ ∨ φ | Xφ | φUφ | φU ≤N φ
where a ranges over A. Note the absence of negation in the definition of LTL ≤ . The negations have been pushed to the leaves.
a means that the current letter is a; ∧ and ∨ are the classic conjunction and disjunction; Xφ means that φ is true at the next letter; φUψ means that ψ is true somewhere in the future, and φ holds until that point; φU ≤N ψ means that ψ is true somewhere in the future, and φ can be false at most N times before ψ. The variable N is unique, and is shared by all occurrences of the U ≤N operator; Ω means that we are at the end of the word.
We can define ⊤ = (⋁_{a∈A} a) ∨ Ω and ⊥ = ¬⊤, meaning respectively true and false, and ¬a = (⋁_{b≠a} b) ∨ Ω to signify that the current letter is not a.

Semantics
We want to associate a cost function [[φ]] on words to any LTL ≤ -formula φ.
We will say that u, n |= φ (u, n is a model of φ) if φ is true on u with n as valuation for N, i.e. as number of errors for all the U ≤N 's in the formula φ. We finally define
[[φ]](u) = inf {n ∈ N : u, n |= φ}.
We can remark that if u, n |= φ, then for all k ≥ n, u, k |= φ, since the U ≤N operators always appear positively in the formula (that is why we don't allow the negation of an LTL ≤ -formula in general). In particular, for n ∈ N, u, n |= φ holds if and only if n ≥ [[φ]](u). We use LTL ≤ -formulae in order to describe cost functions, so we will always work modulo cost function equivalence ≈.
◮ Remark 4. If φ does not contain any operator U ≤N , φ is a classic LTL-formula computing a language L, and [[φ]] = χ L .
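The semantics can be made concrete with a small evaluator. This is a minimal sketch under stated assumptions: formulas are encoded as nested tuples (an encoding chosen here, not taken from the paper), positions range from 0 to |u| with Ω true exactly at position |u|, Xφ is taken to be false at the end of the word, and [[φ]](u) is read as the least n such that u, n |= φ.

```python
INF = float("inf")

def holds(phi, u, i, n):
    """Decide whether phi holds on u at position i with n allowed mistakes."""
    kind = phi[0]
    if kind == "letter":
        return i < len(u) and u[i] == phi[1]
    if kind == "end":
        return i == len(u)
    if kind == "and":
        return holds(phi[1], u, i, n) and holds(phi[2], u, i, n)
    if kind == "or":
        return holds(phi[1], u, i, n) or holds(phi[2], u, i, n)
    if kind == "next":
        return i < len(u) and holds(phi[1], u, i + 1, n)
    if kind in ("until", "until_le"):
        # Plain Until tolerates no mistake; U<=N tolerates n of them.
        budget = 0 if kind == "until" else n
        mistakes = 0
        for j in range(i, len(u) + 1):
            if holds(phi[2], u, j, n):
                return True
            if not holds(phi[1], u, j, n):
                mistakes += 1
                if mistakes > budget:
                    return False
        return False
    raise ValueError(f"unknown operator: {kind}")

def cost(phi, u):
    """Least n with u, n |= phi, or INF; at most len(u) mistakes can ever occur."""
    for n in range(len(u) + 1):
        if holds(phi, u, 0, n):
            return n
    return INF

# Over the alphabet {a, b}: "not a" is (b or end of word).
NOT_A = ("or", ("letter", "b"), ("end",))
COUNT_A = ("until_le", NOT_A, ("end",))         # (not a) U<=N Omega
ONLY_A = ("until", ("letter", "a"), ("end",))   # a U Omega, no U<=N

n_mistakes = cost(COUNT_A, "abaab")
```

The formula `COUNT_A` counts the occurrences of a, since every a before the end of the word is one mistake. `ONLY_A` contains no U ≤N operator, so its cost is 0 on words of a* and ∞ elsewhere, in line with Remark 4.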

From LTL ≤ to B-Automata
We will now give a direct translation from LTL ≤ -formulae to B-automata, i.e. given an LTL ≤ -formula φ on a finite alphabet A, we want to build a B-automaton recognizing [[φ]]. This construction is adapted from the classic translation from LTL-formulae to Büchi automata [4].
Let φ be an LTL ≤ -formula. We define sub(φ) to be the set of subformulae of φ, and Q = 2^sub(φ) to be the set of subsets of sub(φ).
We want to define a B-automaton A φ = ⟨Q, A, In, Fin, Γ, ∆⟩ recognizing [[φ]]. We set the initial states to be In = {{φ}} and the final ones to be Fin = {∅, {Ω}}. We choose as set of counters Γ = {γ 1 , . . ., γ k }, where k is the number of occurrences of the U ≤N operator in φ. A state is basically the set of constraints we have to verify before the end of the word, so the only two accepting states are the one with no constraint, and the one whose only constraint is to be at the end of the word.
The following definitions are the same as for the classical case (LTL to Büchi automata). An atomic formula is either a letter a ∈ A or Ω. A set Z of formulae is consistent if there is at most one atomic formula in it. A reduced formula is either an atomic formula or a Next formula (of the form Xϕ). A set Z is reduced if all its elements are reduced formulae. If Z is consistent and reduced, we define next(Z) = {ϕ : Xϕ ∈ Z}.
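These definitions translate directly into code. The sketch below assumes a tuple encoding of formulas (`("letter", a)`, `("end",)` for Ω, `("next", ϕ)` for Xϕ, and so on) and represents sets of formulae as frozensets; both choices are illustrative.

```python
def is_atomic(formula):
    """An atomic formula is a letter or Omega (encoded as ("end",))."""
    return formula[0] in ("letter", "end")

def is_reduced_formula(formula):
    """A reduced formula is atomic or of the form X phi."""
    return is_atomic(formula) or formula[0] == "next"

def is_consistent(formulas):
    """A set is consistent if it contains at most one atomic formula."""
    return sum(1 for f in formulas if is_atomic(f)) <= 1

def is_reduced(formulas):
    """A set is reduced if all its elements are reduced formulae."""
    return all(is_reduced_formula(f) for f in formulas)

def next_set(formulas):
    """next(Z) = {phi : X phi in Z}, defined for consistent reduced Z only."""
    if not (is_consistent(formulas) and is_reduced(formulas)):
        raise ValueError("next is only defined on consistent, reduced sets")
    return frozenset(f[1] for f in formulas if f[0] == "next")

# Z = {a, X Omega}: read an 'a', then require the end of the word.
Z = frozenset({("letter", "a"), ("next", ("end",))})
successor = next_set(Z)
```

Here `successor` is the set {Ω}, one of the two final states of the construction.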
We would like to define A φ with Z −→ next(Z) as transitions.
The problem is that next(Z) is not consistent and reduced in general.If next(Z) is inconsistent we remove it from the automaton.If it is consistent, we need to apply some reduction rules to get a reduced set of formulae.This consists in adding ε-transitions (but with possible actions on the counter) towards intermediate sets which are not actual states of the automaton (we will call them "pseudo-states"), until we reach a reduced set.
Let ψ be a maximal (in size) non-reduced formula in Y. We add ε-transitions from Y that replace ψ by its immediate constraints, according to its top-level operator. In these transitions, action r j (resp. ic j ) performs r (resp. ic) on counter γ j and ε on the other counters. The pseudo-states don't (a priori) belong to Q = 2^sub(φ), because we add formulae Xψ for ψ ∈ sub(φ); however, if Z is a reduced pseudo-state, next(Z) will be in Q again, since we remove the new next operators.
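The exact reduction rules are not reproduced in the text above. The following sketch gives one plausible reading, modeled on the classical LTL-to-Büchi reduction and on the actions r j and ic j mentioned here; the rules for U ≤N in particular (reset when ψ's right-hand side is chosen, increment-and-check when a mistake is allowed) are an assumption, as are the tuple encoding of formulas and the `counter_of` map from U ≤N subformulae to counter names.

```python
def size(formula):
    """Number of operator and atom nodes in a tuple-encoded formula."""
    return 1 + sum(size(g) for g in formula[1:] if isinstance(g, tuple))

def reduction_steps(Y, counter_of):
    """Assumed epsilon-transitions from pseudo-state Y, as (action, target) pairs.

    An action is a dict counter -> atomic actions; {} stands for epsilon.
    """
    pending = [f for f in Y if f[0] not in ("letter", "end", "next")]
    if not pending:
        return []  # Y is already reduced
    psi = max(pending, key=size)
    rest = Y - {psi}
    op = psi[0]
    if op == "and":
        return [({}, rest | {psi[1], psi[2]})]
    if op == "or":
        return [({}, rest | {psi[1]}), ({}, rest | {psi[2]})]
    if op == "until":
        # Either the left side holds now and psi is postponed, or the right side holds.
        return [({}, rest | {psi[1], ("next", psi)}), ({}, rest | {psi[2]})]
    if op == "until_le":
        j = counter_of[psi]
        return [
            ({}, rest | {psi[1], ("next", psi)}),   # no mistake at this position
            ({j: "ic"}, rest | {("next", psi)}),    # count one mistake and check
            ({j: "r"}, rest | {psi[2]}),            # right side holds: reset
        ]
    raise ValueError(f"unknown operator: {op}")

PSI = ("until_le", ("letter", "a"), ("end",))
steps = reduction_steps(frozenset({PSI}), {PSI: "g1"})
```

Keying counters by subformula rather than by occurrence is a simplification: two syntactically equal U ≤N subformulae would share a counter in this sketch.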
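The transitions of automaton A φ are then obtained by combining a sequence of ε-transitions with one letter step: when Y ε:σ −→* Z for a state Y ∈ Q and a consistent and reduced set Z whose only possible atomic formula is the letter a, the automaton has a transition from Y to next(Z) reading a with action σ. Here Y ε:σ −→* Z means that there is a sequence of ε-transitions from Y to Z with σ as combined action on counters.
◮ Definition 7. If σ is a sequence of actions on counters, we will call val(σ) the maximal value checked on a counter during σ with 0 as starting value of the counters, and val(σ) = 0 if there is no check in σ. It corresponds to the value of a run of a B-automaton with σ as combined action on the counters.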
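Definition 7 can be computed by a single left-to-right pass. The sketch below assumes that the action on one counter is given as a string over i, r, c, and that a combined action on several counters is a dict from counter names to such strings; these encodings are illustrative.

```python
def val(sigma):
    """Maximal value checked on one counter during sigma, starting from 0."""
    value = 0
    best = 0  # val(sigma) = 0 when sigma contains no check
    for atom in sigma:
        if atom == "i":
            value += 1
        elif atom == "r":
            value = 0
        elif atom == "c":
            best = max(best, value)
        else:
            raise ValueError(f"unknown atomic action: {atom}")
    return best

def val_all(actions):
    """Maximal value checked on any counter, for a dict counter -> sigma."""
    return max((val(sigma) for sigma in actions.values()), default=0)

example = val("iicriic")
```

In `example`, the counter is checked at value 2, reset, and checked again at value 2. A sequence such as iiir has value 0 because nothing is ever checked.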
◮ Lemma 8. Let u = a_1 . . . a_m be a word on A and Y_0 […]
Lemma 8 implies the correctness of the automaton A φ : an accepting run of A φ over u of value N yields u, N |= φ, so [[φ]](u) ≤ [[A φ ]](u). Conversely, let N = [[φ]](u), then u, N |= φ so by definition of A φ , it is straightforward to verify that there exists an accepting run of A φ over u of value ≤ N (each counter γ i doing at most N mistakes relative to the corresponding U ≤N operator). We finally get [[A φ ]] = [[φ]] (and so we have obviously [[A φ ]] ≈ [[φ]]).

Algebraic characterization
We recall that, as in the case of languages, stabilization semigroups recognize exactly regular cost functions, and there exists a quotient-wise minimal stabilization semigroup for each regular cost function [3].
In the standard theory, it is equivalent for a regular language to be described by an LTL-formula, or to be recognized by an aperiodic semigroup. Is it still the case in the framework of regular cost functions? To answer this question we first need to develop a little further the algebraic theory of regular cost functions.

Syntactic congruence
In the standard theory of languages, we can go from a description of a regular language L to a description of its syntactic monoid via the syntactic congruence. Moreover, when the language is not regular, we get an infinite monoid, so this equivalence can be used to "test" regularity of a language.
The main idea behind this equivalence is to identify words u and v if they "behave the same" relative to the language L, i.e. L cannot separate u from v in any context:
u ∼_L v iff for all x, y ∈ A*, (xuy ∈ L ⇔ xvy ∈ L).
The aim here is to define an analog of the syntactic congruence, but for regular cost functions instead of regular languages. Since cost functions look at quantitative aspects of words, the notions of "element" and "context" have to contain quantitative information: we want to be able to say things like "words with a lot of a's behave the same as words with a few a's".
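For languages, the classical syntactic congruence (u and v are equivalent when xuy ∈ L ⇔ xvy ∈ L for all contexts x, y) can be explored by searching for a separating context. The sketch below only enumerates contexts up to a bounded length, so a result of `None` means "not separated within the bound", not a proof of equivalence; the language `even_a` and all function names are illustrative assumptions.

```python
from itertools import product

def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length at most `max_len`, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def separating_context(in_language, u, v, alphabet, max_len):
    """Return a context (x, y) with xuy and xvy on different sides of L, or None."""
    for x in words_up_to(alphabet, max_len):
        for y in words_up_to(alphabet, max_len):
            if in_language(x + u + y) != in_language(x + v + y):
                return (x, y)
    return None

def even_a(word):
    """Illustrative regular language: words with an even number of a's."""
    return word.count("a") % 2 == 0

same_class = separating_context(even_a, "aa", "b", "ab", 3)
witness = separating_context(even_a, "a", "", "ab", 3)
```

For this language, aa and b are never separated (both contain an even number of a's), while a and the empty word are already separated by the empty context.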
That is why we won't define our equivalence over words, but over ♯-expressions, which are a way to describe words with quantitative information.

♯-expressions
We first define general ♯-expressions as in [6] and [3] by just adding an operator ♯ to words in order to repeat a subexpression "a lot of times". This differs from the stabilization monoid definition, in which the ♯-operator can only be applied to specific elements (idempotents).
The set Expr of ♯-expressions on an alphabet A is defined as follows:
e := a ∈ A | ee | e♯
If we choose a stabilization semigroup S = ⟨S, ·, ≤, ♯⟩ together with a function h : A → S, the eval function (from Expr to S) is defined inductively by eval(a) = h(a), eval(ee′) = eval(e) · eval(e′), and eval(e♯) = eval(e)♯ (eval(e) has to be idempotent). We say that e is well-formed for S if eval(e) exists. Intuitively, it means that ♯ was applied to subexpressions that correspond to idempotent elements in S.
If f is a regular cost function, e is well-formed for f iff e is well-formed for the minimal stabilization semigroup of f.
◮ Example 9. Let f be the cost function defined over {a}* by […] The minimal stabilization semigroup of f is {a, aa, (aa)♯, (aa)♯a}, with aa · a = a and (aa)♯a · a = (aa)♯. Hence the ♯-expression aaa(aa)♯ is well-formed for f but the ♯-expression a♯ is not.
The ♯-expressions that are not well-formed have to be removed from the set we want to quotient, in order to get only real elements of the syntactic semigroup.

ω♯-expressions
We have defined the set of ♯-expressions that we want to quotient to get the syntactic equivalence of cost functions. However, we saw that some of these ♯-expressions may not be well-formed for the cost function f we want to study, and therefore do not correspond to an element of the syntactic stabilization semigroup of f. Thus we need to be careful about the stabilization operator, and apply it only to "idempotent ♯-expressions". To reach this goal, we will add an "idempotent operator" ω on ♯-expressions, which will always associate an idempotent element (relative to f) to a ♯-expression, so that we can later apply ♯ and be sure of creating well-formed expressions for f. We define the set Oexpr of ω♯-expressions on an alphabet A:
E := a ∈ A | EE | E^ω | E^ω♯
The intuition behind operator ω is that x^ω is the idempotent obtained by iterating x (which always exists in finite semigroups).
A context C[x] is an ω♯-expression with possible occurrences of a free variable x. Let E be an ω♯-expression, C[E] is the ω♯-expression obtained by replacing all occurrences of x by E in C[x], i.e.

C[E] = C[x][x ← E].
Let C OE be the set of contexts on ω♯-expressions.
We will now formally define the semantics of operator ω, and use ω♯-expressions to get a syntactic equivalence on cost functions, without mistyped ♯-expressions.
◮ Definition 10. If E ∈ Oexpr and k, n ∈ N, we define E(k, n) to be the word E[ω ← k, ♯ ← n], where the exponential is relative to concatenation of words.
◮ Lemma 11. Let f be a regular cost function. There exists K f ∈ N such that for any E ∈ Oexpr, the ♯-expression E[ω ← K f !] is well-formed for f, and we are in one of these two cases:
1. for all k ≥ K f , the set {f(E(k!, n)) : n ∈ N} is bounded: we say that E ∈ f B ;
2. for all k ≥ K f , f(E(k!, n)) tends to ∞ as n tends to ∞: we say that E ∈ f ∞ .
Proof. The proof is a little technical, since we have to reuse the definition of recognition by stabilization semigroups. K f can simply be taken to be the size of the minimal stabilization semigroup of f. ◭
Here, f B and f ∞ are the analogs for regular cost functions of "being in L" and "not being in L" in language theory. But this notion is now asymptotic, since we look at boundedness properties of quantitative information on words. Moreover, f ∞ and f B are only defined here for regular cost functions, since K f might not exist if f is not regular.
◮ Definition 12. Let f be a regular cost function. We write E ≡ f E′ if for every context C[x] ∈ C OE , C[E] ∈ f B ⇔ C[E′] ∈ f B .
◮ Remark 13. If u, v ∈ A* and L is a regular language, then u ∼ L v iff u ≡ χ L v (∼ L being the syntactic congruence of L). In this sense, ≡ is an extension of the classic syntactic congruence on languages.
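Definition 10 turns an ω♯-expression into an ordinary word once k and n are fixed. The sketch below assumes that ω♯-expressions are built from letters, concatenation, E^ω and E^ω♯ (the last one meaning ♯ applied to E^ω), encoded as nested tuples; the encoding and the function name `unfold` are illustrative.

```python
def unfold(expr, k, n):
    """Word E(k, n): replace every omega by exponent k and every # by exponent n."""
    kind = expr[0]
    if kind == "letter":
        return expr[1]
    if kind == "concat":
        return unfold(expr[1], k, n) + unfold(expr[2], k, n)
    if kind == "omega":
        return unfold(expr[1], k, n) * k
    if kind == "omega_sharp":
        # (E^omega)^# becomes (E^k)^n, i.e. k * n copies of the unfolded body.
        return (unfold(expr[1], k, n) * k) * n
    raise ValueError(f"unknown constructor: {kind}")

a, b, c = ("letter", "a"), ("letter", "b"), ("letter", "c")
E = ("concat", ("omega_sharp", ("concat", a, b)), c)  # (ab)^omega# c

word = unfold(E, 2, 3)
```

Here `word` consists of six copies of ab followed by c. Evaluating a cost function on `unfold(E, k, n)` for growing n is the kind of experiment that distinguishes bounded from unbounded behaviour, although a finite experiment can only suggest the answer.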
Now that we have properly defined the equivalence ≡ f over Oexpr, it remains to verify that it is indeed a good syntactic congruence, i.e. Oexpr/≡ f is the syntactic stabilization semigroup of f. Indeed, if f is a regular cost function, let S f = Oexpr/≡ f . We can provide S f with a structure of stabilization semigroup ⟨S f , ·, ≤, ♯⟩.
◮ Theorem 14. S f is the minimal stabilization semigroup recognizing f.
The proof consists basically in a bijection between classes of Oexpr for ≡ f , and elements of the minimal stabilization semigroup as defined in appendix A.7 of [3].

Expressive power of LTL ≤
If f is a regular cost function, we will call S f the syntactic stabilization semigroup of f. A finite semigroup S = ⟨S, ·⟩ is called aperiodic if ∃k ∈ N, ∀s ∈ S, s^(k+1) = s^k. The definition is the same if S is a finite stabilization semigroup.
◮ Remark 15. For a regular cost function f, the statements "f is recognized by an aperiodic stabilization semigroup" and "S f is aperiodic" are equivalent, since S f is a quotient of all stabilization semigroups recognizing f.
◮ Theorem 16. Let f be a cost function described by an LTL ≤ -formula, then f is regular and the syntactic stabilization semigroup of f is aperiodic.
The proof of this theorem is the first application of the syntactic congruence on cost functions.
If φ is an LTL ≤ -formula, we will say that φ verifies property AP if there exists k ∈ N such that for any ω♯-expression E, E^k ≡ [[φ]] E^(k+1), which is equivalent to "[[φ]] has an aperiodic syntactic stabilization semigroup".
With this in mind, we can do an induction on LTL ≤ -formulae: we first show that S Ω and all S a for a ∈ A are aperiodic.
We then proceed to the induction on φ: assuming that ϕ and ψ verify property AP, we show that Xψ, ϕ ∨ ψ, ϕ ∧ ψ, ϕUψ and ϕU ≤N ψ verify property AP.
◮ Theorem 17. Let f be a cost function recognized by an aperiodic stabilization semigroup, then f can be described by an LTL ≤ -formula.
The proof of this theorem is a generalization of the proof of Wilke for aperiodic languages in [11].However difficulties inherent to quantitative notions appear here.
The main issue comes from the fact that in the classical setting, computing the value of a word in a monoid returns a single element.This fact is used to do an induction on the size of the monoid, by considering the set of possible results as a smaller monoid.The problem is that with cost functions, there is some additional quantitative information, and we need to associate a sequence of elements of a stabilization monoid to a single word.Therefore, it requires some technical work to come back to a smaller stabilization monoid from these sequences.
◮ Corollary 18.The class of LTL ≤ -definable cost functions is decidable.
Proof. Theorems 16 and 17 imply that it is equivalent for a regular cost function to be LTL ≤ -definable or to have an aperiodic syntactic stabilization semigroup. If f is given by an automaton or a stabilization semigroup, we can compute its syntactic stabilization semigroup S f (see [3]) and decide if f is LTL ≤ -definable by testing aperiodicity of S f . This can be done simply by iterating each element a of S f at most |S f | times and checking whether it reaches a power a^k such that a^(k+1) = a^k. ◭
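The aperiodicity test described in the proof can be sketched as follows, assuming the finite semigroup is given by its list of elements and a multiplication function; computing the syntactic stabilization semigroup itself is not covered here, and the two example semigroups are illustrative.

```python
def is_aperiodic(elements, mul):
    """Check that every element s reaches a power with s^(k+1) = s^k."""
    size = len(elements)
    for s in elements:
        power = s  # current power s^k, starting with k = 1
        stabilized = False
        for _ in range(size):
            following = mul(power, s)  # s^(k+1)
            if following == power:
                stabilized = True
                break
            power = following
        if not stabilized:
            return False
    return True

# Integers modulo 2 under addition: the element 1 alternates forever.
cyclic = is_aperiodic([0, 1], lambda x, y: (x + y) % 2)

# {0, 1, 2} under addition truncated at 2: every element stabilizes.
truncated = is_aperiodic([0, 1, 2], lambda x, y: min(x + y, 2))
```

Iterating |S| times per element is enough: among the first |S| + 1 powers of an element two must coincide, and in an aperiodic semigroup the first repetition is between consecutive powers.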

Conclusion
We first defined LTL ≤ as a quantitative extension of LTL.We started the study of LTL ≤ by giving an explicit translation from LTL ≤ -formulae to B-automata, which preserves exact values (and not only boundedness properties as it is usually the case in the framework of cost functions).We then showed that the expressive power of LTL ≤ in terms of cost functions is the same as aperiodic stabilization semigroups.The proof uses a new syntactic congruence, which has a general interest in the study of regular cost functions.This result implies the decidability of the LTL ≤ -definable class of cost functions.
As further work, we can try to put ω♯-expressions in a larger framework, by doing an axiomatization of ω♯-semigroups. We can also extend this work to infinite words, and define an analog of Büchi automata for cost functions. To continue the analogy with classic results on languages, we can define a quantitative extension of FO describing the same class as LTL ≤ , and search for analogous definitions of counter-free B-automata and star-free B-regular expressions. The translation from LTL ≤ -formulae to B-automata can be further studied in terms of optimality of the number of counters of the resulting B-automaton.
[…] it is computed by the following one-counter B-automaton on the left-hand side. The cost function u ↦ min {n ∈ N : a^n factor of u} is computed by the nondeterministic one-counter B-automaton on the right-hand side.
[Figure: two one-counter B-automata.]