Regular Cost Functions, Part I: Logic and Algebra over Words

The theory of regular cost functions is a quantitative extension to the classical notion of regularity. A cost function associates to each input a non-negative integer value (or infinity), as opposed to languages which only associate to each input the two values "inside" and "outside". This theory is a continuation of the works on distance automata and similar models. These models of automata have been successfully used for solving the star-height problem, the finite power property, the finite substitution problem, the relative inclusion star-height problem and the boundedness problem for monadic second-order logic over words. Our notion of regularity can be (as in the classical theory of regular languages) equivalently defined in terms of automata, expressions, algebraic recognisability, and by a variant of the monadic second-order logic. These equivalences are strict extensions of the corresponding classical results. The present paper introduces the cost monadic logic, the quantitative extension to the notion of monadic second-order logic we use, and shows that some problems of existence of bounds are decidable for this logic. This is achieved by introducing the corresponding algebraic formalism: stabilisation monoids.


Introduction
This paper introduces and studies a quantitative extension to the standard theory of regular languages of words. It is the only quantitative extension (in which quantitative means that the function described can take infinitely many values) known to the author in which the milestone equivalence for regular languages: accepted by automata = recognisable by monoids = definable in monadic second-order logic = definable by regular expressions
The principle is to use a reduction to the limitedness problem for a form of automata more general than distance automata, called nested distance desert automata. To understand this extension, let us first look again at distance automata: we can see a distance automaton as an automaton that has a counter which is incremented each time a "special" state is encountered. The value attached to a word by such an automaton is the minimum over all accepting runs of the maximal value assumed by the counter. Presented like this, a nested distance desert automaton is nothing but a distance automaton in which multiple counters and resets of the counters are allowed (with a certain constraint of nesting of counters). Kirsten performed a reduction of the star-height problem to the limitedness of nested distance desert automata which is much easier than the reduction of Hashiguchi. He also proves that the limitedness problem of nested distance desert automata is decidable. For this, he generalises the proof methods developed previously by Hashiguchi, Simon and Leung for distance automata. This work closes the story of the star-height problem itself.
The star-height problem is the king among the problems solved using this method. But there are many other (difficult) questions that can be reduced to the limitedness of distance automata and variants. Some of the solutions to these problems paved the way to the solution of the star-height problem.
The finite power property takes as input a regular language L and asks whether there exists some positive integer n such that (L + ε)^n = L*. It was raised by Brzozowski in 1966, and it took twelve years before being independently solved by Simon and Hashiguchi [40,16]. This problem is easily reduced to the limitedness problem for distance automata.
The finite substitution problem takes as input two regular languages L, K, and asks whether it is possible to find a finite substitution σ (i.e., a morphism mapping each letter of the alphabet of L to a finite language over the alphabet of K) such that σ(L) = K. This problem was shown decidable independently by Bala and Kirsten by a reduction to the limitedness of desert automata (a form of automata weaker than nested distance desert automata, but incomparable to distance automata), and a proof of decidability of this latter problem [2,25].
The relative inclusion star-height problem is an extension of the star-height problem introduced and shown decidable by Hashiguchi using his techniques [22]. Still using nested distance desert automata, Kirsten gave another, more elegant proof of this result [29].
The boundedness problem is a problem of model theory. It consists of deciding if there exists a bound on the number of iterations that are necessary for the fixpoint of a logical formula to be reached. The existence of a bound means that the fixpoint can be eliminated by unfolding its definition sufficiently many times. The boundedness problem is usually parameterised by the logic chosen and by the class of models over which the formula is studied. The boundedness problem for monadic second-order formulae over the class of finite words was solved by a reduction to the limitedness problem of distance automata by Blumensath, Otto and Weyer [3].
One can also cite applications of distance automata in speech recognition [37,38], databases [14], and image compression [24]. In the context of verification, Abdulla, Krcàl and Yi have introduced R-automata, which correspond to nested distance desert automata in which the nesting of counters is not required anymore [1]. They prove the decidability of the limitedness problem for this model of automata.
Finally, Löding and the author have also pursued this line of research in the direction of extended models. In [9], the star-height problem over trees has been solved by a reduction to the limitedness problem of nested distance desert automata over trees. The latter problem was shown decidable in the more general case of alternating automata. In [10] a similar attempt was made at deciding the Mostowski hierarchy of non-deterministic automata over infinite trees (the hierarchy induced by the alternation of fixpoints). The authors show that it is possible to reduce this problem to the limitedness problem for a form of automata that unifies nested distance desert automata and parity tree automata. The latter problem is an important open question.
Bojańczyk and the author have introduced the notion of B-automata in [5], a model which closely resembles (and predates) R-automata. The context was to show the decidability of some fragments of the logic MSO+U over infinite words, in which MSO+U is the extension of monadic second-order logic with the quantifier UX.ϕ meaning "for all integers n, there exists a set X of cardinality at least n such that ϕ holds". From the decidability results in this work, it is possible to derive all the other limitedness results over finite words. However, the constructions are complicated and of non-elementary complexity. Nevertheless, the new notion of S-automata was introduced, a model dual to B-automata. Recall that the semantics of distance automata and their variants can be expressed as a minimum over all runs of the maximum of the value taken by counters. The semantics of S-automata is dual: it is defined as the maximum over all runs of the minimum of the value taken by the counters at the moment of their reset. Unfortunately, it is quite hard to compare this work in detail with all the others. Indeed, since it was oriented toward the study of a logic over infinite words, the central automata are in fact ωB- and ωS-automata: automata accepting languages of infinite words that have an infinitary accepting condition constraining the asymptotic behaviour of the counters along the run. This makes these automata very different². Indeed, the automata in [5] accept languages while the automata under study here define functions. For achieving this, the automata use an extra mechanism involving the asymptotic behaviour of counters for deciding whether an infinite word should be accepted or not. This extra mechanism has no equivalent in distance automata, and is in some sense "orthogonal" to the machinery involved in cost functions. For this reason, B-automata and S-automata in [5] are just intermediate objects that do not have all the properties we would like. In particular B- and S-automata in [5] are not equivalent. However, the principle of using two dual forms of automata is an important concept in the theory of regular cost functions.
The study of MSO+U has been pursued in several directions. Indeed, the general problem of the satisfaction of MSO+U is a challenging open problem. One partial result concerns the decision of WMSO+U (the weak fragment in which only quantifiers over finite sets are allowed) which is decidable [4]. However, the techniques involved in this work are not directly related to cost functions.
The proof methods for showing the decidability of the limitedness problem of distance automata and their variants are also of much interest by themselves. While the original proof of Hashiguchi is quite complex, a major advance has been achieved by Leung who introduced the notion of stabilisation [32,33] (see also [41] for an early overview). The principle is to abstract the behaviour of the distance automaton in a monoid, and further describe the semantics of the counter using an operator of stabilisation, i.e., an operator which describes, given an element of the monoid, what would be the effect of iterating it a "lot of times". This key idea was further used and refined by Simon, Leung, Kirsten, Abdulla, Krcàl and Yi. This idea was not present in [5], and this is one explanation for the high complexity of the constructions.
² One must be careful: these automata are not related to B- and S-automata as, say, Büchi automata are related to automata over finite words. We warn the reader that these models cannot be thought of as the extension of cost functions to infinite words.
Another theory related to cost functions is the one developed by Szymon Toruńczyk in his thesis [46]. The author proposes a notion of recognisable languages of profinite words which happen to be equivalent to cost functions. Indeed, profinite words are infinite sequences of finite words (which are convergent in a precise topology, the profinite topology). As such, a single profinite word can be used as a witness that a function is not bounded. Following the principle of this correspondence, one can see a cost function as a set of profinite words: the profinite words corresponding to infinite sequences of words over which the function is bounded. This correspondence makes Toruńczyk's approach equi-expressive with cost functions over finite words as far as decision questions are concerned. Seen like this, this approach can be regarded as the theory of cost functions presented in a more abstract setting. Still, some differences have to be underlined. On one side, the profinite approach, being more abstract, loses some precision. For instance in the present work, we have a good understanding of the precision of the constructions: namely each operation can be performed doing an at most "polynomial approximation"³. On the other side, the presentation in terms of profinite languages eliminates the corresponding annoying details in the development of cost functions: namely there is no more need to control the approximation at each step. Another interesting point is that the profinite presentation points naturally to extensions, which are orthogonal to cost functions, and are highly related to MSO+U. For the moment, the profinite approach has been developed for finite words only. It is not clear for now how easily this abstract presentation can be used for treating more complex models, as has been done for cost functions, e.g., over finite trees [11].
1.2. Survey of the theory. The theory of regular cost functions gives a unified and general framework for explaining all objects, results and constructions presented above (apart from the results in [5] that are of a slightly different nature). It also allows new results to be derived.
Let us describe the contributions in more detail.
Cost functions. The standard notion of language is replaced by the new notion of cost function. For this, we consider mappings from a set E to N ∪ {∞} (in practice E is the set of finite words over some finite alphabet) and the equivalence relation ≈ defined by f ≈ g if: for all X ⊆ E, f restricted to X is bounded iff g restricted to X is bounded. Hence two functions are equivalent if it is not possible to distinguish them using arguments of existence of bounds. A cost function is an equivalence class for ≈. The notion of cost functions is what we use as a quantitative extension to languages. Indeed, every language L can be identified with (the equivalence class of) the function mapping words in L to the value 0, and words outside L to ∞. All the theory is presented in terms of cost functions. This means that all equivalences are considered modulo the relation ≈.
Cost automata. A first way to define regular cost functions is to use cost automata, which come in two flavours, B- and S-automata. The B-automata correspond in their simple form to R-automata [1] and in their simple and hierarchical form to nested distance desert automata in [27,28]. Those are also very close to B-automata in [5]. Following the ideas in [5], we also use the dual variant of S-automata. The two forms of automata, B-automata and S-automata, are equi-expressive in all their variants, an equivalence that we call the duality theorem. Automata are not introduced in this paper.
Stabilisation monoids. The corresponding algebraic characterisation makes use of the new notion of stabilisation monoids. A stabilisation monoid is a finite ordered monoid together with a stabilisation operation. This stabilisation operation expresses what it means to iterate "a lot of times" some element. The operator of stabilisation was introduced by Leung [32,33] and used also by Simon, Kirsten, Abdulla, Krcàl and Yi as a tool for analysing the behaviour of distance automata and their variants. The novelty here lies in the fact that in our case, stabilisation is now part of the definition of a stabilisation monoid. We prove that it is possible to associate unique semantics to all stabilisation monoids. These semantics are represented by means of computations. A computation is an object describing how a word consisting of elements of the stabilisation monoid can be evaluated into a value in the stabilisation monoid. This key result shows that the notion of stabilisation monoid has a "meaning" independent from the existence of cost automata (in the same way a monoid can be used for recognising a language, independently from the fact that it comes from a finite state automaton). This notion of computations is easier to handle than the notion of compatible mappings used in the conference version of this work [6].
Recognisable cost functions. We use stabilisation monoids for defining the new notion of recognisable cost functions. We show the closure of recognisable cost functions under min, max, and new operations called inf-projection and sup-projection (which are counterparts to projection in the theory of regular languages). We also prove that the relation ≈ (in fact the corresponding preorder ≼) is decidable over recognisable cost functions. This decidability result subsumes many limitedness results from the literature. This notion of recognisability for cost functions is equivalent to being accepted by the cost automata introduced above.
Extension of regular expressions. It is possible to define two forms of expressions, B- and S-regular expressions, and show that these are equivalent to cost automata. These expressions were already introduced in [5] in which a similar result was established.
Cost monadic logic. The cost monadic (second-order) logic is a quantitative extension to monadic (second-order) logic. It is for instance possible to define the diameter of a graph in cost monadic logic. The cost functions over words definable in this logic coincide with the regular cost functions presented above. This equivalence is essentially the consequence of the closure properties of regular cost functions (as in the case of regular languages), and no new ideas are required here. The interest lies in the logic itself. Of course, the decision procedure for recognisable cost functions entails decidability results for cost monadic logic. In this paper, cost monadic logic is the starting point of our presentation, and our central decidability result is Theorem 2.1 stating the decidability of this logic.
1.3. Content of this paper. This paper does not cover the whole theory of regular cost functions over words. The line followed in this paper is to start from the logic "cost monadic logic", and to introduce the necessary material for "solving it over words". This requires the complete development of the algebraic formalism.
In Section 2, we introduce the new formalism of cost monadic logic, and show what is required to solve it. In particular, we introduce the notion of cost function, and advocate that it is useful to consider the logic under this view. We state there our main decision result, Theorem 2.1.
In Section 3 we present the underlying algebraic structure: stabilisation monoids. We then introduce computations, and establish the key results of existence (Theorem 3.3) and uniqueness (Theorem 3.4) of the value computed by computations.
In Section 4, we use stabilisation monoids for defining recognisable cost functions. We show various closure results for recognisable cost functions as well as decision procedures. Those results happen to fulfill the conditions required in Section 2 for showing the decidability of cost monadic logic over words.
In Section 5 some arguments are given on the relationship with the models of automata, which are not described in this document, and on how these different notions interact in the big picture.

Logic
2.1. Cost monadic logic. Let us recall that monadic second-order logic (monadic logic for short) is the extension of first-order logic with the ability to quantify over sets (i.e., monadic relations). Formally monadic formulae use first-order variables (x, y, …), and monadic variables (X, Y, …), and it is allowed in such formulae to quantify existentially and universally over both first-order and monadic variables, to use every boolean connective, to use the membership predicate (x ∈ X), and every predicate of the relational structure. We expect from the reader basic knowledge concerning monadic logic.
Example 2.0.1. The monadic formula reach(x, y, X) over the signature containing the single binary predicate edge (signature of a digraph):
reach(x, y, X) ::= x = y ∨ ∀Z ((x ∈ Z ∧ ∀z, z′ ((z ∈ Z ∧ z′ ∈ X ∧ edge(z, z′)) → z′ ∈ Z)) → y ∈ Z)
describes the existence of a path in a digraph from vertex x to vertex y such that all edges appearing in the path end in X. Indeed, it expresses that either the path is empty, or every set Z containing x and closed under taking edges ending in X also contains y.
In cost monadic logic, one uses a single extra variable N of a new kind, called the bound variable. It ranges over non-negative integers. Cost monadic logic is obtained from monadic logic by allowing the extra predicate |X| ≤ N (in which X is some monadic variable and N the bound variable), provided that it appears positively in the formula (i.e., under the scope of an even number of negations). The semantics of |X| ≤ N is, as one may expect, to be satisfied if (the valuation of) X has cardinality at most (the valuation of) N. Given a formula ϕ, we denote by FV(ϕ) its free variables, the bound variable excluded. A formula that has no free variables (it may still use the bound variable) is called a sentence.
We now have to provide a meaning to the formulae of cost monadic logic. We assume some familiarity of the reader with logic terminology. A signature consists of a set of symbols R, S, …. To each symbol is attached a non-negative integer called its arity. A (relational) structure (over the above signature) S = ⟨U_S, R_1^S, …, R_k^S⟩ consists of a set U_S called the universe, and for each symbol R of arity n, of a relation R^S ⊆ U_S^n. Given a set of variables F, a valuation of F (over S) is a mapping v which to each monadic variable X ∈ F associates a set v(X) ⊆ U_S, and to each first-order variable x ∈ F associates an element v(x) ∈ U_S. We denote by v, X = E the valuation v in which X is further mapped to E. Given a cost monadic formula ϕ, a valuation v of its free variables over a structure S and a non-negative integer n, we express by S, v, n |= ϕ the fact that the formula ϕ is satisfied over the structure S with valuation v when the variable N takes the value n. Of course, if ϕ is simply a sentence, we just write S, n |= ϕ. We also omit the parameter n when ϕ is a monadic formula.
The positivity assumption required when using the predicate |X| ≤ N has straightforward consequences. Namely, for all cost monadic formulae ϕ, all relational structures S, and all valuations v, S, v, n |= ϕ implies S, v, m |= ϕ for all m ≥ n.
Instead of evaluating as true or false as done above, we see a formula of cost monadic logic ϕ of free variables F as associating to each relational structure S and each valuation v of the free variables a value in N ∪ {∞} defined by:
[[ϕ]](S, v) = inf{n : S, v, n |= ϕ}.
This value can be either a non-negative integer, or ∞ if no valuation of N makes the formula true. In case of a sentence ϕ, we omit the valuation and simply write [[ϕ]](S). Let us stress the link with standard monadic logic in the following fact:
Fact 2.0.2. For all monadic formulae ϕ, and all relational structures S, [[ϕ]](S) = 0 if S |= ϕ, and [[ϕ]](S) = ∞ otherwise.
As an example, the sentence
diameter ::= ∀x, y ∃X (|X| ≤ N ∧ reach(x, y, X))
defines the diameter of the digraph: indeed, the diameter of a graph is the least n such that for all pairs of vertices x, y, there exists a set of size at most n allowing to reach y from x (recall that in the definition of reach(x, y, X), x does not necessarily belong to X, hence this is the diameter in the standard sense).
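To make this semantics concrete, the following sketch evaluates [[diameter]] on a small digraph by brute force. It is only an illustration under stated assumptions: the names reach, satisfies_diameter and value, as well as the encoding of a digraph as a list of vertex pairs, are choices made here and do not come from the paper, and reach is computed by a graph search instead of a quantification over all sets Z.

```python
from itertools import combinations

def reach(edges, x, y, X):
    """reach(x, y, X): there is a path from x to y whose edges all end in X."""
    seen, stack = {x}, [x]          # the empty path already reaches x itself
    while stack:
        z = stack.pop()
        for (a, b) in edges:
            if a == z and b in X and b not in seen:
                seen.add(b)
                stack.append(b)
    return y in seen

def satisfies_diameter(vertices, edges, n):
    """S, n |= diameter: each pair (x, y) has a witness set X with |X| <= n."""
    candidates = [set(c) for k in range(n + 1)
                  for c in combinations(vertices, k)]
    return all(any(reach(edges, x, y, X) for X in candidates)
               for x in vertices for y in vertices)

def value(vertices, edges):
    """[[diameter]](S): the least n such that S, n |= diameter, or infinity."""
    for n in range(len(vertices) + 1):
        if satisfies_diameter(vertices, edges, n):
            return n
    return float("inf")

# Hypothetical test structures: a directed 4-cycle and a single edge.
cycle = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)])
single_edge = ([0, 1], [(0, 1)])
```

On the directed 4-cycle the farthest pairs are three edges apart, so the value is 3, and satisfaction is monotone in n as the positivity assumption predicts. On the single edge, vertex 0 is not reachable from vertex 1, so no value of N works and the result is infinity.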
From now on, for avoiding some irrelevant considerations, we will consider the variant of cost monadic logic in which a) only monadic variables are allowed, b) the inclusion relation X ⊆ Y is allowed, and c) each relation over elements is raised to a relation over singleton sets. Keeping in mind that each element can be identified with the unique singleton set containing it, it is easy to translate cost monadic logic into this variant. In this presentation, it is also natural to see the inclusion relation as any other relation. We will also assume that the negations are pushed to the leaves of formulae as is usual. Overall a formula can be of one of the following forms:
R(X_1, …, X_n) | ¬R(X_1, …, X_n) | |X| ≤ N | ϕ ∧ ψ | ϕ ∨ ψ | ∃X ϕ | ∀X ϕ
in which ϕ and ψ are formulas, R is some symbol of arity n which can possibly be ⊆ (of arity 2), and X, X_1, …, X_n are monadic variables.
So far, we have described the semantics of cost monadic logic from the standard notion of model. There is another equivalent way to describe the meaning of formulae, by induction on the structure. The equations are disclosed in the following fact.
Fact 2.0.4. Over a structure S and a valuation v, the following equalities hold:
[[R(X_1, …, X_n)]](S, v) = 0 if R^S(v(X_1), …, v(X_n)) holds, and ∞ otherwise;
[[¬R(X_1, …, X_n)]](S, v) = ∞ if R^S(v(X_1), …, v(X_n)) holds, and 0 otherwise;
[[|X| ≤ N]](S, v) = |v(X)|;
[[ϕ ∨ ψ]](S, v) = min([[ϕ]](S, v), [[ψ]](S, v));
[[ϕ ∧ ψ]](S, v) = max([[ϕ]](S, v), [[ψ]](S, v));
[[∃X ϕ]](S, v) = inf{[[ϕ]](S, v, X = E) : E ⊆ U_S};
[[∀X ϕ]](S, v) = sup{[[ϕ]](S, v, X = E) : E ⊆ U_S}.
No property (if not trivial) is decidable for monadic logic in general. Since cost monadic logic is an extension of monadic logic, one cannot expect anything to be better in this framework. However we are interested, as in the standard setting, in deciding properties over a restricted class C of structures. The class C can typically be the class of finite words, of finite trees, of infinite words (of length ω, or beyond) or of infinite trees. The subject of this paper is to consider the case of finite words over a fixed finite alphabet.
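The inductive reading of the semantics can be sketched as a small recursive evaluator over a finite structure. Everything about the encoding is an assumption made for illustration: formulas are nested tuples tagged "rel", "nrel", "card", "or", "and", "exists" and "forall", relations are Python predicates over sets, and quantifiers are evaluated by enumerating all subsets of the universe, which is exponential and only usable on tiny structures.

```python
from itertools import combinations

INF = float("inf")

def subsets(universe):
    """Enumerate all subsets of a finite universe."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)

def evaluate(phi, universe, relations, v):
    """Value of the formula phi under valuation v, by induction on phi."""
    kind = phi[0]
    if kind in ("rel", "nrel"):            # R(X1..Xn) and its negation
        holds = relations[phi[1]](*(v[X] for X in phi[2]))
        if kind == "rel":
            return 0 if holds else INF
        return INF if holds else 0
    if kind == "card":                     # |X| <= N
        return len(v[phi[1]])
    if kind in ("or", "and"):              # min for "or", max for "and"
        left = evaluate(phi[1], universe, relations, v)
        right = evaluate(phi[2], universe, relations, v)
        return min(left, right) if kind == "or" else max(left, right)
    if kind in ("exists", "forall"):       # inf / sup over all sets E
        values = [evaluate(phi[2], universe, relations, {**v, phi[1]: E})
                  for E in subsets(universe)]
        return min(values) if kind == "exists" else max(values)
    raise ValueError(f"unknown formula kind: {kind}")

# The word "abaa" seen as a structure with positions 1..4 (illustrative).
word = "abaa"
universe = range(1, len(word) + 1)
relations = {
    "subset": lambda X, Y: X <= Y,
    # letter predicate raised to singleton sets
    "a": lambda X: len(X) == 1 and word[next(iter(X)) - 1] == "a",
}
# exists X ( forall Y (not a(Y) or Y subset X)  and  |X| <= N )
count_a = ("exists", "X",
           ("and",
            ("forall", "Y", ("or", ("nrel", "a", ("Y",)),
                                   ("rel", "subset", ("Y", "X")))),
            ("card", "X")))
size = ("forall", "X", ("card", "X"))      # forall X |X| <= N
```

With these definitions, `evaluate(count_a, universe, relations, {})` yields the number of occurrences of the letter a in the word, and `evaluate(size, universe, relations, {})` yields its length.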
We are interested in deciding properties concerning the function described by cost monadic formulae over C. But what kind of properties? It is quite easy to see that, given a cost monadic sentence ϕ and n ∈ N, one can effectively produce a monadic formula ϕ_n such that for all structures S, S |= ϕ_n iff [[ϕ]](S) = n (such a translation would be possible even without assuming the positivity requirement in the use of the predicates |X| ≤ N). Hence, deciding questions of the form "[[ϕ]] = n" can be reduced to the standard theory.
Properties that cannot be reduced to the standard theory, and that we are interested in, involve the existence of bounds. One says below that a function f is bounded over some set X if there is some integer n such that f(x) ≤ n for all x ∈ X. We are interested in the following generic problems:
Boundedness: Is the function [[ϕ]] bounded over C?
Divergence: For all n, do only finitely many S ∈ C satisfy [[ϕ]](S) ≤ n?
Domination: For all E ⊆ C, does [[ϕ]] bounded over E imply that [[ψ]] is also bounded over E?
All these questions cannot be reduced (at least simply) to questions in the standard theory. Furthermore, all these questions become undecidable for very standard reasons as soon as the requirement of positivity in the use of the new predicate |X| ≤ N is removed. In this paper, we introduce suitable material for proving their decidability over the class C of words.
One easily sees that the domination question is in fact a joint extension of the boundedness question (if one sets ϕ to be always true, i.e., to compute the constant function 0), and the divergence question (if one sets ψ to be measuring the size of the structure, i.e., ∀X |X| ≤ N). Let us remark finally that if ϕ is a formula of monadic logic, then the boundedness question corresponds to deciding if ϕ is a tautology. If furthermore ψ is also monadic, then the domination consists of deciding whether ϕ implies ψ.
In the following section, we introduce the notion of cost functions, i.e., equivalence classes over functions allowing to omit discrepancies of the function described, while preserving sufficient information for working with the above questions.
2.2. Cost functions. In this section, we introduce the equivalence relation ≈ over functions, and the central notion of cost function.
A correction function α is a non-decreasing mapping from N to N such that α(n) ≥ n for all n. From now on, the symbols α, α′, … implicitly designate correction functions. Given x, y in N ∪ {∞}, x ≼_α y holds if x ≤ α(y), in which α is extended with α(∞) = ∞. For every set E, ≼_α is extended to (N ∪ {∞})^E in a natural way by f ≼_α g if f(x) ≼_α g(x) for all x ∈ E, or equivalently f ≤ α ∘ g. Intuitively, f is dominated by g after it has been "stretched" by α. One also writes f ≈_α g if f ≼_α g and g ≼_α f. Finally, one writes f ≼ g (resp. f ≈ g) if f ≼_α g (resp. f ≈_α g) for some α. A cost function (over a set E) is an equivalence class of ≈ (i.e., a set of mappings from E to N ∪ {∞}).
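As a rough illustration of these definitions, the sketch below checks the pointwise inequality f(x) ≤ α(g(x)) on a finite sample of words. The helper names (dominated, words) and the sample functions are hypothetical. Note the limitation: a finite sample can refute f ≼_α g for a given α, but it cannot establish domination over an infinite set E.

```python
from itertools import product

INF = float("inf")

def dominated(f, g, alpha, sample):
    """Check f(x) <= alpha(g(x)) on a finite sample, with alpha(INF) = INF."""
    def alpha_ext(n):
        return INF if n == INF else alpha(n)
    return all(f(x) <= alpha_ext(g(x)) for x in sample)

def words(alphabet, max_len):
    """All words over the alphabet of length at most max_len."""
    for k in range(max_len + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)

def count_a(u):
    """Number of occurrences of the letter a."""
    return u.count("a")

def chi_even(u):
    """Characteristic mapping of the language of even-length words."""
    return 0 if len(u) % 2 == 0 else INF

def identity(n):
    return n

def double(n):
    return 2 * n

sample = list(words("ab", 6))
```

On this sample, the number of a's is dominated by the length with the identity as correction function, whereas the length is not dominated by the number of a's for the doubling correction function: the word "b" has length 1 and contains no a.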
Some elementary properties of ≼_α are:
The above fact allows one to work with a single correction function at a time. Indeed, as soon as two correction functions α and α′ are involved in the same proof, we can consider the correction function α″ = max(α, α′). By the above fact, it satisfies that f ≼_α g implies f ≼_α″ g, and f ≼_α′ g implies f ≼_α″ g.
Example 2.0.6. Over N × N, maximum and sum are equivalent for the doubling correction function (for short, (max) ≈_{×2} (+)). Indeed, for all x, y ∈ N, max(x, y) ≤ x + y ≤ 2 × max(x, y).
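The two inequalities of Example 2.0.6, and the idea of merging two correction functions into their pointwise maximum, can be checked mechanically on a finite grid of pairs. This is a sanity check under assumptions, not a proof: the helper names (combine, dominated) and the grid size are arbitrary choices made here.

```python
from itertools import product

def identity(n):
    return n

def double(n):
    return 2 * n

def combine(alpha, alpha2):
    """Pointwise maximum of two correction functions."""
    return lambda n: max(alpha(n), alpha2(n))

def dominated(f, g, alpha, sample):
    """Check f(x) <= alpha(g(x)) on a finite sample (finite values only)."""
    return all(f(x) <= alpha(g(x)) for x in sample)

def maximum(pair):
    return max(pair)

def total(pair):
    return sum(pair)

pairs = list(product(range(40), repeat=2))
both = combine(identity, double)

max_below_sum = dominated(maximum, total, identity, pairs)   # max <= x + y
sum_below_max = dominated(total, maximum, double, pairs)     # x + y <= 2 max
```

Both checks succeed, and they still succeed when identity and double are replaced by their pointwise maximum `both`, which is the reason a single correction function suffices in a proof. The identity alone is not enough for the second direction, as the pair (1, 1) shows.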
Our next examples concern mappings from sequences of words to N. We have and for the other direction we use:
The relation ≼ has other characterisations:
Proposition 1. For all f, g from E to N ∪ {∞}, the following items are equivalent: (1) f ≼ g; (2) for all n ∈ N, there exists m ∈ N such that for all x ∈ E, g(x) ≤ n implies f(x) ≤ m; and (3) for all X ⊆ E, g|_X is bounded implies f|_X is bounded.
From (2) to (3). Let X ⊆ E be such that g|_X is bounded. Let n be a bound of g over X. Item (2) states the existence of m such that ∀x ∈ E, g(x) ≤ n → f(x) ≤ m. In particular, for all x ∈ X, we have g(x) ≤ n by choice of n, and hence f(x) ≤ m. Hence f|_X is bounded by m.
The last characterisation shows that the relation ≈ is an equivalence relation that preserves the existence of bounds. Indeed, all this theory can be seen as a method for proving the existence/non-existence of bounds. One can also remark that the questions of boundedness, divergence, and domination presented in the previous section are preserved under replacing the semantics of a formula by an ≈-equivalent function. Furthermore, the domination question can be simply reformulated as [[ψ]] ≼ [[ϕ]].
We conclude this section by some remarks on the structure of the ≼ relation. Cost functions over some set E ordered by ≼ form a lattice. Let us show how this lattice refines the lattice of subsets of E ordered by inclusion. The following elementary fact shows that we can identify a subset of E with the cost function of its characteristic function (given a subset X ⊆ E, one denotes by χ_X its characteristic mapping defined by χ_X(x) = 0 if x ∈ X, and ∞ otherwise): for all X, Y ⊆ E, χ_X ≼ χ_Y if and only if Y ⊆ X.
In this respect, the lattice of cost functions is a refinement of the lattice of subsets of E equipped with the superset ordering. Let us show that this refinement is strict. Indeed, there is only one language L such that χ_L does not have ∞ in its range, namely L = E. However, we will show in Proposition 2 that, as soon as E is infinite, there are uncountably many cost functions which have this property of not using the value ∞.
Proposition 2. If E is infinite, then there exist at least continuum many different cost functions from E to N.
Proof. Without loss of generality, we can assume E countable, and even, up to bijection, that E = N \ {0}. Let p_0, p_1, … be the sequence of all prime numbers. Every n ∈ E is decomposed in a unique way as p_0^{n_0} p_1^{n_1} … in which all n_i's are null but finitely many (with an obvious meaning of the infinite product). For all I ⊆ N, one defines the function f_I from N \ {0} to N for all n ∈ N \ {0} by: f_I(n) = max({0} ∪ {n_i : i ∈ I}). Consider now two different sets I, J ⊆ N. This means (up to a possible exchange of the roles of I and J) that there exists i ∈ I \ J. Consider now the set X = {p_i^k : k ∈ N}. Then, by construction, f_I(p_i^k) = k and hence f_I is not bounded over X. However, f_J(p_i^k) = 0 and hence f_J is bounded over X. It follows by Proposition 1 that f_I and f_J are not equivalent for ≈. We can finally conclude that, since there exist continuum many subsets of N, there are at least continuum many cost functions over E which do not use the value ∞.
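The separating argument of this proof can be tried out numerically. The sketch below assumes that f_I(n) is the largest exponent n_i with i ∈ I in the prime decomposition of n (and 0 when there is none), with primes indexed from p_0 = 2. The function names are illustrative and the trial-division prime generator is only meant for small inputs.

```python
def primes():
    """Yield p_0 = 2, p_1 = 3, p_2 = 5, ... by trial division."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def exponents(n):
    """Map i to n_i for the non-zero exponents of n = p_0^n_0 p_1^n_1 ..."""
    result = {}
    for i, p in enumerate(primes()):
        if n == 1:
            return result
        while n % p == 0:
            result[i] = result.get(i, 0) + 1
            n //= p

def f(I, n):
    """f_I(n): largest exponent n_i with i in I, or 0 if there is none."""
    return max((e for i, e in exponents(n).items() if i in I), default=0)

# I = {1} and J = {0} differ on i = 1, so consider the powers of p_1 = 3.
I, J = {1}, {0}
X = [3 ** k for k in range(6)]
values_I = [f(I, n) for n in X]   # grows with k: unbounded on all powers of 3
values_J = [f(J, n) for n in X]   # constantly 0: bounded on the powers of 3
```

The list `values_I` grows linearly with the exponent while `values_J` stays at 0, which is exactly the behaviour used in the proof to separate f_I from f_J.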

2.3. Solving cost monadic logic over words using cost functions. As usual, we see a word as a structure, the universe of which is the set of positions in the word (numbered from 1), equipped with the ordering relation ≤, and with a unary relation for each letter of the alphabet that we interpret as the set of positions at which the letter occurs. Given a set of monadic variables F, and a valuation v of F over a word u = a_1 … a_k ∈ A*, we denote by ⟨u, v⟩ the word c_1 … c_k over the alphabet A_F = A × {0, 1}^F such that for all positions i, c_i = (a_i, δ_i), in which δ_i maps each X ∈ F to 1 if i ∈ v(X), and to 0 otherwise.
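The encoding of a word together with a valuation as a single word over a product alphabet can be sketched as follows. The assumptions are that each letter is paired with one bit per monadic variable, that the bit is 1 exactly when the position belongs to the set assigned to that variable, and that positions are numbered from 1. The function names encode and decode, and the representation of bit vectors as tuples in a fixed variable order, are hypothetical.

```python
def encode(u, v, variables):
    """Pair each letter a_i of u with one bit per variable: 1 iff i in v[X]."""
    return [(a, tuple(1 if i in v[X] else 0 for X in variables))
            for i, a in enumerate(u, start=1)]

def decode(word, variables):
    """Recover the underlying word and the valuation from an encoded word."""
    u = "".join(a for a, _ in word)
    v = {X: {i for i, (_, bits) in enumerate(word, start=1) if bits[j] == 1}
         for j, X in enumerate(variables)}
    return u, v

# Illustrative valuation over the word "abba": X = {1, 4} and Y = {2}.
variables = ["X", "Y"]
valuation = {"X": {1, 4}, "Y": {2}}
encoded = encode("abba", valuation, variables)
```

Here `encoded` is the list `[("a", (1, 0)), ("b", (0, 1)), ("b", (0, 0)), ("a", (1, 0))]`, and `decode` recovers both the word and the valuation, so the encoding loses no information.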
It is classical that given a monadic formula ϕ with free variables F, the language L_ϕ = {⟨u, v⟩ : u, v |= ϕ} ⊆ A_F* is regular. The proof is done by induction on the formula. It amounts to remarking that the constructions of the logic, namely disjunction, conjunction, negation and existential quantification, correspond naturally to some language-theoretic operations, namely union, intersection, complementation and projection. The base cases are obtained by remarking that the relations of ordering, inclusion, and letter also correspond to regular languages. We use a similar approach. To each cost monadic formula ϕ with free variables F over the signature of words over A, we associate the cost function f_ϕ over A_F* defined by f_ϕ(⟨u, v⟩) = [[ϕ]](u, v).
We aim at solving cost monadic logic by providing an explicit representation for the cost functions f_ϕ. For reaching this goal, we need to define a family of cost functions F that contains suitable constants, and has effective closure properties and decision procedures.
The first assumption we make is the closure under composition with a morphism. I.e., let f be a cost function in F over A* and h be a morphism from B* (B being another alphabet) to A*, we require f ∘ h to also belong to F. In particular, this operation allows us to change the alphabet, and hence to add new variables when required. It corresponds to the closure under inverse morphism for regular languages.
Fact 2.0.4 gives us a very precise idea of the constants we need. The constants correspond to the formulae of the form R(X_1, …, X_n) as well as their negation. As mentioned above, for such a formula ϕ, L_ϕ is regular. Hence, it is sufficient for us to require that the characteristic function χ_L belongs to F for each regular language L. The remaining constants correspond to the formula |X| ≤ N. We have that f_{|X|≤N}(⟨u, X = E⟩) = |E|. This corresponds to counting the number of occurrences of letters from A × {1} in a word over A × {0, 1}. Up to a change of alphabet (thanks to the closure under composition with a morphism) it will be sufficient for us that F contains the function "size" which maps each word u ∈ {a, b}* to |u|_a.
Fact 2.0.4 also gives us a very precise idea of the closure properties we need. We need the closure under min and max for disjunctions and conjunctions. For dealing with existential and universal quantification, we need the new operations of inf-projection and sup-projection. Given a mapping f from A* to N ∪ {∞} and a mapping h from A to B that we extend into a morphism from A* to B* (B being another alphabet), the inf-projection of f with respect to h is the mapping f_inf,h from B* to N ∪ {∞} defined for all v ∈ B* by:
f_inf,h(v) = inf{f(u) : u ∈ A*, h(u) = v}.
Similarly, the sup-projection of f with respect to h is the mapping f_sup,h from B* to N ∪ {∞} defined for all v ∈ B* by:
f_sup,h(v) = sup{f(u) : u ∈ A*, h(u) = v}.
We summarise all the requirements in the following fact.
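The two projection operations can be illustrated by brute force on short words. The sketch assumes a letter-to-letter mapping h, so that the preimages of a word v are obtained by choosing, position by position, a letter of A mapped to the corresponding letter of v. All names (preimages, inf_projection, sup_projection, marks_every_a) are hypothetical, and the conventions inf ∅ = ∞ and sup ∅ = 0 are used for words without preimage.

```python
from itertools import product

INF = float("inf")

def preimages(h, alphabet, v):
    """All words u over the alphabet (tuples of letters) with h(u) = v."""
    choices = [[a for a in alphabet if h(a) == b] for b in v]
    return product(*choices)

def inf_projection(f, h, alphabet, v):
    """Infimum of f over the preimages of v (infinity if there is none)."""
    return min((f(u) for u in preimages(h, alphabet, v)), default=INF)

def sup_projection(f, h, alphabet, v):
    """Supremum of f over the preimages of v (0 if there is none)."""
    return max((f(u) for u in preimages(h, alphabet, v)), default=0)

# Letters of A x {0, 1}; h forgets the bit, as when a variable X is projected.
A = [(c, bit) for c in "ab" for bit in (0, 1)]

def h(letter):
    return letter[0]

def size(u):
    """Number of letters of u carrying the bit 1, i.e. the cardinality of X."""
    return sum(bit for _, bit in u)

def marks_every_a(u):
    """size(u) if every letter a carries the bit 1, and infinity otherwise."""
    return size(u) if all(bit == 1 for c, bit in u if c == "a") else INF
```

For v = "abaa", the inf-projection of size is 0 and its sup-projection is 4, matching the values of ∃X |X| ≤ N and ∀X |X| ≤ N. The inf-projection of marks_every_a is 3, the number of occurrences of a, while its sup-projection is infinite.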
Fact 2.0.8. Let $\mathcal F$ be a class of cost functions over words such that: (1) for all regular languages $L$, $\chi_L$ belongs to $\mathcal F$, (2) $\mathcal F$ contains the cost function "size", (3) $\mathcal F$ is effectively closed under composition with a morphism, min, max, inf-projection and sup-projection, (4) the domination relation $\preccurlyeq$ is decidable over $\mathcal F$. Then the boundedness, divergence and domination problems are decidable for cost monadic logic over words.
The remainder of the paper is devoted to the introduction of the class of recognisable cost functions, and to showing that this class satisfies all the assumptions of Fact 2.0.8. Thus we deduce our main result.
Theorem 2.1. The domination relation is decidable for cost-monadic logic over finite words.
All these results are established in Section 4. However, we first need to introduce the notion of stabilisation monoids, as well as some of its key properties. This is the subject of Section 3.

The algebraic model: stabilisation monoids
The purpose of this section is to describe the algebraic model of stabilisation monoids. This model has, a priori, no relation with the previous section. However, in Section 4, in which we define the notion of a recognisable cost function, we will use this model for describing cost functions.
The key idea, directly inspired by the work of Leung, Simon and Kirsten, is to develop an algebraic notion (the stabilisation monoid) in which a special operator (called the stabilisation, $\sharp$) expresses what happens when we iterate some element "a lot of times". In particular, it says whether or not we should count the number of iterations of this element. The terminology "a lot of times" is very vague, and for this reason such a formalism cannot describe functions precisely. However, it is perfectly suitable for describing cost functions.
The remaining part of the section is organised as follows. We first introduce the notion of stabilisation monoids in Section 3.1, paying special attention to giving it an intuitive meaning. In Section 3.2, we introduce the key notions of computations, under-computations and over-computations, as well as the two central results of existence of computations (Theorem 3.3) and "unicity" of their values (Theorem 3.4). These notions and results form the main technical core of this work. Then Section 3.4 is devoted to the proof of Theorem 3.3, and Section 3.5 to the proof of Theorem 3.4.

Stabilisation monoids.
A semigroup $\mathbf S = \langle S, \cdot\rangle$ is a set $S$ equipped with an associative operation '$\cdot$'. A monoid is a semigroup such that the product has a neutral element 1, i.e., such that $1 \cdot x = x \cdot 1 = x$ for all $x \in S$. Given a semigroup $\mathbf S = \langle S, \cdot\rangle$, we extend the product to products of arbitrary length by defining $\pi$ from $S^+$ to $S$ by $\pi(a) = a$ and $\pi(ua) = \pi(u) \cdot a$. If the semigroup is a monoid of neutral element 1, we further set $\pi(\varepsilon) = 1$. All monoids are semigroups, and conversely it is sometimes convenient to transform a semigroup $\mathbf S$ into a monoid $\mathbf S^1$ simply by the adjunction of a new neutral element 1.
An idempotent in $\mathbf S$ is an element $e \in S$ such that $e \cdot e = e$. We denote by $E(\mathbf S)$ the set of idempotents in $\mathbf S$. An ordered semigroup $\langle S, \cdot, \le\rangle$ is a semigroup $\langle S, \cdot\rangle$ together with an order $\le$ over $S$ such that the product $\cdot$ is compatible with $\le$; i.e., $a \le a'$ and $b \le b'$ implies $a \cdot b \le a' \cdot b'$. An ordered monoid is an ordered semigroup, the underlying semigroup of which is a monoid.
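We are now ready to introduce the new notions of stabilisation semigroups and stabilisation monoids.
Definition 3.1. A stabilisation semigroup $\langle S, \cdot, \le, \sharp\rangle$ is a finite ordered semigroup $\langle S, \cdot, \le\rangle$ together with an operator $\sharp : E(\mathbf S) \to E(\mathbf S)$ (called the stabilisation) such that:
• for all $a, b \in S$ with $a \cdot b \in E(\mathbf S)$ and $b \cdot a \in E(\mathbf S)$, $(a \cdot b)^\sharp = a \cdot (b \cdot a)^\sharp \cdot b$;
• for all $e \le f$ in $E(\mathbf S)$, $e^\sharp \le f^\sharp$;
• for all $e \in E(\mathbf S)$, $e^\sharp \le e$;
• for all $e \in E(\mathbf S)$, $(e^\sharp)^\sharp = e^\sharp$.
It is called a stabilisation monoid if furthermore $\langle S, \cdot\rangle$ is a monoid and $1^\sharp = 1$, in which 1 is the neutral element of the monoid.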
The intuition is that $e^\sharp$ represents the value of $e^n$ when $n$ becomes "very large". Some consequences of the definitions, namely, for all $e \in E(\mathbf S)$, $e^\sharp = e \cdot e^\sharp = e^\sharp \cdot e = e^\sharp \cdot e^\sharp = (e^\sharp)^\sharp$, make perfect sense in this respect: repeating "a lot of $e$'s" is equivalent to seeing one $e$ followed by "a lot of $e$'s", etc. This meaning of $e^\sharp$ is in some sense a limit behaviour. This is an intuitive reason why $\sharp$ is not used for non-idempotent elements. Consider for instance the element 1 in $\mathbb Z/2\mathbb Z$. Iterating it yields 0 at even iterations, and 1 at odd ones. This alternation prevents giving a clear meaning to the result of "iterating a lot of times" the element 1. However, this view is incompatible with the classical view on monoids, in which, by induction, if $e \cdot e = e$, then $e^n = e$ for all $n \ge 1$. The idea in stabilisation monoids is that the product is something that cannot be iterated "a lot of times". For this reason, considering that $e^n = e$ for all $n \ge 1$ is correct for "small values of $n$", but becomes "incorrect" for "large values of $n$". The value of $e^n$ is $e$ if $n$ is "small", and it is $e^\sharp$ if $n$ is "big". Most of the remainder of the section is devoted to the formalisation of this intuition, via the notion of computations.
Even if the material necessary for working with stabilisation monoids has not been yet provided, it is already possible to give some examples of stabilisation monoids that are constructed from an informal idea of their intended meaning.
Example 3.1.1. In this example we start from an informal idea of what we would like to compute, and construct a stabilisation monoid from it. The explanations have to remain informal at this point in the exposition of the theory. However, this example should illustrate how we can already reason easily at this level of understanding.
Imagine you want, among words over a and b, to separate the ones that possess "a lot of occurrences of a's" from the ones that have only "a few occurrences of a's", i.e., imagine you want to describe a stabilisation monoid that "counts" the number of occurrences of a's.
For doing this, we should separate three "kinds" of words:
• the kind of words with no occurrence of $a$; let $b$ be the corresponding element in the stabilisation monoid (since the word $b$ is of this kind),
• the kind of words with at least one occurrence of $a$, but only "a few" such occurrences; let $a$ be the corresponding element in the stabilisation monoid (since the word $a$ is of this kind),
• the kind of words with "a lot of occurrences of $a$'s"; let 0 be the corresponding element in the stabilisation monoid.
The words that we intend to separate (the ones with a lot of $a$'s) are the ones of kind 0. With these three elements known, let us complete the definition of the stabilisation monoid.
Of course, iterating twice, or "many times", words which contain no occurrences of $a$ yields words with no occurrences of the letter $a$. We capture this with the equalities $b = b \cdot b = b^\sharp$. Now words that have at least one $a$, but only a small number of occurrences of $a$, should not be affected by appending $b$ letters to their left or to their right. I.e., we set $a = b \cdot a = a \cdot b$. Furthermore, appending a word with "few $a$'s" to another word with "few $a$'s" also gives a word with "few $a$'s". I.e., we set $a \cdot a = a$.
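However, if we iterate "a lot of times" a word with at least one occurrence of $a$, it yields a word with "a lot of $a$'s". Hence we set $a^\sharp = 0$. We also easily get the remaining equations, $0 \cdot x = x \cdot 0 = 0$ for all $x \in \{b, a, 0\}$ and $0^\sharp = 0$, by inspecting all situations. We reach the following description of the stabilisation monoid:

$$\begin{array}{c|ccc|c} \cdot & b & a & 0 & \sharp \\ \hline b & b & a & 0 & b \\ a & a & a & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}$$

The left part describes the product operation '$\cdot$', while the rightmost column gives the value of stabilisation (in general, this column may be partially defined since stabilisation is defined only for idempotents).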
To complete the definition, we need to define the ordering over $\{b, a, 0\}$. The least we can do is setting $0 \le a$ and $x \le x$ for all $x \in \{b, a, 0\}$. This is mandatory, since by definition of a stabilisation monoid $a^\sharp \le a$, and we have $a^\sharp = 0$. The reader can check that all the properties that we expect from a stabilisation monoid are now satisfied.
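The check left to the reader can also be done mechanically. The Python sketch below encodes the three-element structure of Example 3.1.1 and tests the axioms by brute force. It is a sketch under assumptions: the encoding and the names are ours, and the list of axioms tested (in particular consistency, $(x \cdot y)^\sharp = x \cdot (y \cdot x)^\sharp \cdot y$ whenever $x \cdot y$ and $y \cdot x$ are idempotent, and monotonicity of $\sharp$) is the usual axiomatisation of stabilisation monoids as we read it.

```python
from itertools import product

ELEMS = ["b", "a", "0"]
SHARP = {"b": "b", "a": "0", "0": "0"}          # stabilisation column
LEQ = {(x, x) for x in ELEMS} | {("0", "a")}    # x <= x and 0 <= a

def mul(x, y):
    """Product of Example 3.1.1: b is the unit, 0 is absorbing, a.a = a."""
    if "0" in (x, y):
        return "0"
    return "a" if "a" in (x, y) else "b"

def check_axioms():
    """Return one boolean per axiom, checked exhaustively."""
    idem = [e for e in ELEMS if mul(e, e) == e]
    return {
        "associative": all(
            mul(mul(x, y), z) == mul(x, mul(y, z))
            for x, y, z in product(ELEMS, repeat=3)),
        "product compatible with order": all(
            (mul(x, y), mul(x2, y2)) in LEQ
            for (x, x2) in LEQ for (y, y2) in LEQ),
        "e# is idempotent": all(
            mul(SHARP[e], SHARP[e]) == SHARP[e] for e in idem),
        "e# <= e": all((SHARP[e], e) in LEQ for e in idem),
        "(e#)# = e#": all(SHARP[SHARP[e]] == SHARP[e] for e in idem),
        "# is monotone": all(
            (SHARP[e], SHARP[f]) in LEQ
            for e in idem for f in idem if (e, f) in LEQ),
        "consistency": all(
            SHARP[mul(x, y)] == mul(mul(x, SHARP[mul(y, x)]), y)
            for x, y in product(ELEMS, repeat=2)
            if mul(x, y) in idem and mul(y, x) in idem),
        "1# = 1": SHARP["b"] == "b",             # b plays the role of the unit
    }
```

Calling `check_axioms()` returns a dictionary whose entries are all `True` for this structure; removing the pair `("0", "a")` from `LEQ` makes the entry for $e^\sharp \le e$ fail, which is the reason given above for why $0 \le a$ is mandatory.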
The intuition behind the ordering is that depending on what we mean by "a lot of $a$'s", the same word can be of kind $a$ or of kind 0. For instance the word $a^{100}$ is of kind $a$ if we consider that 100 is "a few", while it is of kind 0 if we consider that 100 is "a lot". For this reason, there is a form of continuum that allows us to go from $a$ to 0. The order $\le$ captures this relationship between elements.
It is sometimes convenient to present a stabilisation monoid by a form of Cayley graph. As in a standard Cayley graph, there is an edge labelled by $y$ going from every vertex $x$ to the vertex $x \cdot y$. Furthermore, there is a double arrow linking every idempotent $x$ to its stabilised version $x^\sharp$.
Example 3.1.2. Imagine we want to compute the size of the longest sequence of consecutive $a$'s in words over the alphabet $\{a, b\}$. Then we would separate four "kinds" of words:
• the kind consisting only of the empty word; let it be 1,
• the kind of words containing only occurrences of $a$, at least one occurrence of it, but only "a few" of them; let the corresponding element be $a$,
• the kind of words containing at least one $b$, but no long sequence of consecutive $a$'s; let the corresponding element be $b$,
• the kind of words that contain a long sequence of consecutive $a$'s; let the corresponding element be 0.
Computing the size of the longest sequence of consecutive $a$'s means identifying the words containing a "long" sequence of this type, i.e., it means separating words of kind 0 from words of kind $a$ or $b$.
The table of product and stabilisation is then naturally the following:

$$\begin{array}{c|cccc|c} \cdot & 1 & a & b & 0 & \sharp \\ \hline 1 & 1 & a & b & 0 & 1 \\ a & a & a & b & 0 & 0 \\ b & b & b & b & 0 & b \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}$$

We complete the definition of this stabilisation monoid by defining the ordering. For this, we let $x \le x$ hold for all $x \in \{1, a, b, 0\}$, and we further set $0 \le a$ since $a^\sharp = 0$. Since $0 = 0 \cdot b$, $0 \le a$ and $a \cdot b = b$, we also need to set $0 \le b$ for ensuring the compatibility of the product with the order. Once more the ordering corresponds to the intuition that there exist words that can be of kind $a$ (e.g., the word $a^{100}$) or $b$ (e.g., the word $ba^{100}$), and that have kind 0 if we change what we mean by "a lot of".

Remark 3.1.3. The notion of stabilisation monoids (or stabilisation semigroups) extends the notion of standard monoids (or semigroups). Many standard results concerning monoids have natural counterparts in the world of stabilisation monoids. For making this relationship more precise, let us describe the canonical way to translate a monoid into a stabilisation monoid. Let $\mathbf M = \langle M, \cdot\rangle$ be a monoid. The corresponding stabilisation monoid is $\langle M, \cdot, =, \mathrm{id}_{E(\mathbf M)}\rangle$. In other words, the monoid is extended with a trivial ordering (the equality), and the stabilisation is simply the identity over idempotents. The reader can easily check that this object indeed respects the definition of a stabilisation monoid.
If we refer to the intuition we gave above, we extend the monoid by an identity stabilisation. This means that for all idempotents, we do not make the distinction between iterating it "a few times" or "a lot of times". Said differently, we never have to count the number of occurrences of the idempotents. This is consistent with the principle that a standard monoid has no counting capabilities.

Remark 3.1.4. The order plays an important role, even if it is sometimes hidden. Let us first remark that given a stabilisation monoid, it may happen that changing the order yields another valid stabilisation monoid (as for ordered monoids). In general, there is a least order such that the structure is a valid stabilisation monoid. It is the intersection of all the "valid orders", and can be computed by a least fix-point. However, there is no maximal "valid order" in general.
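The least fix-point mentioned in Remark 3.1.4 can be sketched as a simple saturation loop. The Python code below is an illustration only: it assumes the product and stabilisation of Example 3.1.2 as we read them (1 neutral, 0 absorbing, $a \cdot a = a$, every other product of $a$'s and $b$'s equal to $b$), and it assumes that the order-related requirements are reflexivity, transitivity, $e^\sharp \le e$, compatibility with the product, and monotonicity of $\sharp$. All names are ours.

```python
ELEMS = ["1", "a", "b", "0"]
SHARP = {"1": "1", "a": "0", "b": "b", "0": "0"}

def mul(x, y):
    """Product of Example 3.1.2 (assumed table)."""
    if x == "1":
        return y
    if y == "1":
        return x
    if "0" in (x, y):
        return "0"
    return "a" if (x, y) == ("a", "a") else "b"

def is_idempotent(x):
    return mul(x, x) == x

def least_order():
    """Saturate the mandatory pairs until nothing new is derived."""
    # Start from reflexivity and from e# <= e for every idempotent e.
    rel = {(x, x) for x in ELEMS}
    rel |= {(SHARP[e], e) for e in ELEMS if is_idempotent(e)}
    while True:
        new = set(rel)
        # Compatibility of the product with the order.
        new |= {(mul(x, y), mul(x2, y2)) for (x, x2) in rel for (y, y2) in rel}
        # Transitivity.
        new |= {(x, z) for (x, y) in rel for (y2, z) in rel if y == y2}
        # Monotonicity of the stabilisation over idempotents.
        new |= {(SHARP[x], SHARP[y]) for (x, y) in rel
                if is_idempotent(x) and is_idempotent(y)}
        if new == rel:
            return rel
        rel = new
```

On this example the loop stops with the reflexive pairs together with $0 \le a$ and $0 \le b$, which is the ordering described in Example 3.1.2. The sketch does not check antisymmetry; if the saturated relation contained both $(x, y)$ and $(y, x)$ for some $x \ne y$, no valid order would exist.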
More interestingly, there exist structures $\langle M, \cdot, \sharp\rangle$ which have no order, which satisfy the definition of a stabilisation monoid except for the rules involving the order, and such that it is not possible to construct an order making them a valid stabilisation monoid. An example is the 10-element structure which would be obtained for describing the property "there is an even number of small maximal segments of $a$'s". But this is what we want. Indeed, a closer inspection reveals that this property does not have the monotonic behaviour that we could use for defining a function. Consider for instance a word of the form $a^1 b a^2 b a^3 b \dots b a^n$, and assume a small maximal segment of consecutive $a$'s means a segment of length at most $m$. Then, if $m$ takes an even value at most equal to $n$, the word should be considered as in the language, while if it takes an odd value at most equal to $n$, the word should be thought of as outside the language. Thus, when $m$ ranges over the interval $\{0, \dots, n\}$, the word is alternately thought of as in the language or outside the language. This is typically a non-monotonic behaviour. Keeping in mind cost monadic logic from the previous section, we see that no formula would be able to express such a property. Requiring an order in the definition of stabilisation monoids rules out such situations.
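The alternation described above is easy to observe numerically. The following Python sketch (function names are ours, chosen for illustration) builds the word $a^1 b a^2 b \dots b a^n$ and evaluates the property "there is an even number of small maximal segments of $a$'s" for every threshold $m$ between 0 and $n$.

```python
def small_segments(word, m):
    """Number of maximal segments of a's whose length is between 1 and m."""
    return sum(1 for seg in word.split("b") if 1 <= len(seg) <= m)

def in_language(word, m):
    """True when the number of small maximal a-segments is even."""
    return small_segments(word, m) % 2 == 0

n = 6
# The word a^1 b a^2 b ... b a^n.
word = "b".join("a" * i for i in range(1, n + 1))

# Membership as the threshold m ranges over 0..n.
membership = [in_language(word, m) for m in range(n + 1)]
```

Exactly $m$ of the segments have length at most $m$, so `membership` alternates between `True` and `False`: the answer is not monotonic in the threshold, which is why no order can be found for the corresponding structure.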
We have seen through the above examples how easy it is to work with stabilisation monoids at an informal level. An important part of the rest of the section is dedicated to providing formal definitions for this informal reasoning. In the above explanations, we worked with the imprecise terminology "a few" and "a lot of". Of course, the value (what we referred to as "the kind" in the examples) of a word depends on the frontier we fix for separating "a few" from "a lot".
We continue the description of stabilisation monoids by introducing the key notion of computations. These objects describe how to evaluate a long "product" in a stabilisation monoid.

3.2. Computations, under-computations and over-computations. Our goal is now to provide a formal meaning for the notion of stabilisation semigroups and stabilisation monoids, avoiding terms such as "a lot" or "a few". More precisely, we develop in this section the notion of computations. A computation is a tree which is used as a witness that a word evaluates to a given value.
We fix for the rest of the section a stabilisation semigroup $\mathbf S = \langle S, \cdot, \sharp, \le\rangle$. We first develop the notion for semigroups, and then see how to use it for monoids in Section 3.3 (we will see that the notions are in close correspondence).
Let us consider a word $u \in S^+$ (it is a word over $S$, seen as an alphabet). Our objective is to define a "value" for this word. In standard semigroups, the "value" of $u$ is simply $\pi(u)$, the product of the elements appearing in the word. But what should the "value" be for a stabilisation semigroup? All the informal semantics we have seen so far were based on the distinction between "a few" and "a lot". This means that the value of the word depends on what is considered as "a few", and what is considered as "a lot". This is captured by the fact that the value is parameterised by a positive integer $n$ which can be understood as a threshold separating what is considered as "a few" from what is considered as "a lot". For each choice of $n$, the word $u$ may have a different value in the stabilisation semigroup.
Let us assume a threshold value $n$ is fixed. We still lack a general mechanism for associating to each word $u$ over $S$ a value in $S$. This is the purpose of computations. Computations are proofs (taking the form of a tree) that a word should evaluate to a given value. Indeed, in the case of usual semigroups, the fact that a word $u$ evaluates to $\pi(u)$ can be witnessed by a binary tree, the leaves of which, read from left to right, yield the word $u$, and such that each inner node is labelled by the product of the labels of its children. Clearly, the root of such a tree is labelled by $\pi(u)$, and the tree can be seen as a proof of correctness for this value.
The notion of $n$-computation that we define now is a variation around this principle. For more ease in its use, it comes in three variants: under-computations, over-computations and computations.

Definition 3.2. An $n$-under-computation $T$ for the word $u = a_1 \dots a_l \in S^+$ is an ordered unranked tree with $l$ leaves, each node $x$ of which is labelled by an element $v(x) \in S$ called the value of $x$, and such that for all nodes $x$ of children $y_1, \dots, y_k$ (read from left to right), one of the following cases holds:
Leaf: $k = 0$, and $v(x) \le a_m$ where $x$ is the $m$-th leaf of $T$ (read from left to right),
Binary node: $k = 2$, and $v(x) \le v(y_1) \cdot v(y_2)$,
Idempotent node: $2 \le k \le n$, and $v(x) \le e$ where $e = v(y_1) = \dots = v(y_k) \in E(\mathbf S)$,
Stabilisation node: $k > n$, and $v(x) \le e^\sharp$ where $e = v(y_1) = \dots = v(y_k) \in E(\mathbf S)$.
An $n$-over-computation is obtained by replacing everywhere "$v(x) \le$" by "$v(x) \ge$". An $n$-computation is obtained by replacing everywhere "$v(x) \le$" by "$v(x) =$", i.e., an $n$-computation is a tree which is at the same time an $n$-under-computation and an $n$-over-computation.
The value of an [under-/over-]computation is the value of its root. We also use the following notations for easily denoting constructions of [under-/over-]computations. Given a non-leaf $S$-labelled tree $T$, denote by $T^i$ the subtree of $T$ rooted at the $i$-th child of the root. For $a$ in $S$, we also denote by $a$ the tree restricted to a single leaf of value $a$. If furthermore $T_1, \dots, T_k$ are also $S$-labelled trees, then $a[T_1, \dots, T_k]$ denotes the tree of root labelled $a$, of degree $k$, and such that $T^i = T_i$ for all $i = 1 \dots k$.
It should be immediately clear that these notions have to be manipulated with care, as shown by the following example. Consider two computations over the stabilisation monoid of Example 3.1.1, the aim of which is to count the number of occurrences of the letter $a$ in a word. Both correspond to the evaluation of the same word. Both correspond to the same threshold value $n$. However, these two computations do not have the same value. We will see below how to compare computations and overcome this problem.
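To make the phenomenon concrete, here is a minimal Python sketch of a checker for $n$-computations over the three-element monoid of Example 3.1.1. It is an illustration under assumptions: trees are encoded as pairs `(value, children)`, and the node conditions are those of an $n$-computation as we read them (binary node; idempotent node with between 2 and $n$ children; stabilisation node with more than $n$ children). The word $aaaa$ and the threshold $n = 3$ are our own choice of example.

```python
SHARP = {"b": "b", "a": "0", "0": "0"}

def mul(x, y):
    """Product of Example 3.1.1: b is the unit, 0 is absorbing, a.a = a."""
    if "0" in (x, y):
        return "0"
    return "a" if "a" in (x, y) else "b"

def leaf(x):
    return (x, [])

def leaves(tree):
    """The word read on the leaves, from left to right."""
    value, children = tree
    if not children:
        return [value]
    return [letter for child in children for letter in leaves(child)]

def is_computation(tree, n):
    """Check that every node satisfies one of the cases (equality variant)."""
    value, children = tree
    if not all(is_computation(child, n) for child in children):
        return False
    k = len(children)
    if k == 0:
        return True                           # leaf: carries its own letter
    vals = [child[0] for child in children]
    if k == 2 and value == mul(vals[0], vals[1]):
        return True                           # binary node
    e = vals[0]
    same_idempotent = all(v == e for v in vals) and mul(e, e) == e
    if same_idempotent and 2 <= k <= n and value == e:
        return True                           # idempotent node
    if same_idempotent and k > n and value == SHARP[e]:
        return True                           # stabilisation node
    return False

n = 3
# Two 3-computations for the same word aaaa.
binary = ("a", [("a", [leaf("a"), leaf("a")]),
                ("a", [leaf("a"), leaf("a")])])
stabilised = ("0", [leaf("a")] * 4)
```

Both trees read the word $aaaa$ on their leaves and both pass the check for $n = 3$, yet the first has value $a$ and the second has value 0. The first one avoids the stabilisation node simply by being taller, which hints at why a bound on the height is needed to obtain meaningful values.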
There is another problem. Indeed, it is straightforward to construct an $n$-computation for some word, simply by constructing a computation which is a binary tree, using neither idempotent nodes nor stabilisation nodes. However, such a computation would of course not be satisfactory since every word $u$ would be evaluated in this way as $\pi(u)$. We do not want that. This would mean that the quantitative aspect contained in the stabilisation has been lost. We need to determine what a relevant computation is in order to rule out such computations.
Thus we need to answer the following questions: (1) What are the relevant computations?
(2) Can we construct a relevant n-computation for all words and all n?
(3) How do we relate the different values that n-computations may have on the same word?
The answer to the first question is that we are only interested in computations of small height, meaning of height bounded by some function of the semigroup. With such a restriction, it is not possible to use binary trees as computations. However, this choice makes the answer to the second question less obvious: does there always exist a computation?

Theorem 3.3. For all words $u \in S^+$ and all non-negative integers $n$, there exists an $n$-computation for $u$ of height at most $3|S|$.

This result is an extension of the forest factorisation theorem of Simon [42] (which corresponds to the case of a semigroup). Its proof, which is independent from the rest of this work, is presented in Section 3.4. The third question remains: how to compare the values of different computations over the same word? An answer to this question in its full generality makes use of under- and over-computations.

Theorem 3.4. For all non-negative integers $p$, there exists a polynomial $\alpha : \mathbb N \to \mathbb N$ such that for all $n$-under-computations over some word $u$ of value $a$ and height at most $p$, and all $\alpha(n)$-over-computations of value $b$ over the same word $u$, $a \le b$.

Remark first that since computations are special instances of under- and over-computations, Theorem 3.4 holds in particular for comparing the values of computations. The proof of Theorem 3.4 is the subject of Section 3.5.
We have illustrated the above results in Figure 2. It depicts the relationship between computations in some idealised stabilisation monoid $\mathbf S$. In this drawing, assume some word over some stabilisation semigroup is fixed, as well as some integer $p \ge 3|S|$. We aim at representing for each $n$ the possible values of an $n$-computation, an $n$-under-computation or an $n$-over-computation for this word of height at most $p$. In all the explanations below, all computations are supposed not to exceed height $p$.
The horizontal axis represents the $n$-coordinate. The values in the stabilisation semigroup being ordered, the vertical axis represents the values in the stabilisation semigroup (for the picture, we assume the values in the stabilisation semigroup to be totally ordered). Thus an $n$-computation (or $n$-under- or $n$-over-computation) is placed at a point of horizontal coordinate $n$ and vertical coordinate the value of the computation.
We can now interpret the properties of the computations in terms of this figure. First of all, under-computations as well as over-computations, as opposed to computations, enjoy certain forms of monotonicity, as shown by the fact below.

Fact 3.4.1. For $m \le n$, all $m$-under-computations are also $n$-under-computations, and all $n$-over-computations are also $m$-over-computations (using the fact that $e^\sharp \le e$ for all idempotents $e$).
Any $n$-under-computation of value $a$ can be turned into an $n$-under-computation of value $b$ for all $b \le a$ (by changing the root label from $a$ to $b$). Similarly, any $n$-over-computation of value $a$ can be turned into an $n$-over-computation of value $b$ for all $b \ge a$.

Fact 3.4.1 is illustrated by Figure 2. It means that over-computations define a left and upward-closed area, while the under-computations define a right and downward-closed area. Hence, in particular, the delimiting lines are non-decreasing. Furthermore, since computations are at the same time over-computations and under-computations, the area of computations lies inside the intersection of under-computations and over-computations. Since the height $p$ is chosen to be at least $3|S|$, Theorem 3.3 provides even more information. Namely, for each value of $n$, there exists an $n$-computation. This means in the picture that the area of computations crosses every column. However, since computations do not enjoy monotonicity properties, the shape of the area of computations can be quite complicated. Finally, Theorem 3.4 states that the frontier of under-computations and the frontier of over-computations are not far from each other. More precisely, if we choose an element $a$ of the stabilisation semigroup, and we draw a horizontal line at altitude $a$, if the frontier of under-computations is above or at $a$ for threshold $n$, then the frontier of over-computations is also above or at $a$ at threshold $\alpha(n)$. Hence the frontier of over-computations is always below the one of under-computations, but it essentially grows at the same speed, with a delay of at most $\alpha$.

Remark 3.4.2. In the case of standard semigroups or monoids (which can be seen as stabilisation monoids or semigroups according to Remark 3.1.3), the notions of computations, under-computations and over-computations coincide (since the order is trivial), and the value of the threshold $n$ also becomes irrelevant. This means that the values of all $n$-[under-/over-]computations over a word $u$ coincide with $\pi(u)$. (Such computations coincide with the "Ramsey factorisations" of the factorisation forest theorem.)

Let us finally remark that Theorem 3.4, which is a consequence of the axioms of stabilisation semigroups, is also sufficient for deducing them. This is formalised by the following proposition.
Proposition 3. Let $\mathbf S = \langle S, \cdot, \le, \sharp\rangle$ be a structure consisting of a finite set $S$, a binary operation $\cdot$ from $S^2$ to $S$, a partial order $\le$, and a mapping $\sharp$ from $S$ to $S$ defined over the idempotents of $S$. Assume furthermore that there exists $\alpha$ such that for all $n$-under-computations for some word $u$ of value $a$ and height at most 3, and all $\alpha(n)$-over-computations over $u$ of value $b$ and height at most 3, $a \le b$. Then $\mathbf S$ is a stabilisation semigroup.

Proof. […] $e^\sharp[e, \dots, e]$ and $f^\sharp[f, \dots, f]$ are respectively a 0-computation for the word $e^{\alpha(0)+1}$ and an $\alpha(0)$-over-computation for the same word. It follows that $e^\sharp \le f^\sharp$. Let us prove that stabilisation is idempotent. Let $e$ be an idempotent. We already know that $(e^\sharp)^\sharp \le e^\sharp$ (this makes sense since we have seen that $e^\sharp$ is idempotent). Let us prove the opposite inequality. Consider the 0-computation […] Let us finally prove the consistency of stabilisation. Assume that both $a \cdot b$ and $b \cdot a$ are idempotents. Let $t_{ab}$ be $(a \cdot b)[a, b]$ (and similarly for $t_{ba}$), i.e., computations for $ab$ and $ba$ respectively. Define now: […]

This result is particularly useful. Indeed, when constructing a new stabilisation semigroup, we usually aim at proving that it "recognises" some function (to be defined in the next chapter). It involves proving the hypothesis of Proposition 3. Thanks to Proposition 3, the syntactic correctness is then for free. This situation occurs in particular in Sections 4.5 and 4.6 when the closure of recognisable cost functions under inf-projection and sup-projection is established.

Specificities of stabilisation monoids.
We have presented so far the notion of computations in the case of stabilisation semigroups. We are in fact interested in the study of stabilisation monoids. Monoids differ from semigroups by the presence of a unit element 1. This element is used for modelling the empty word. We present in this section the natural variant of the notions of computations for the case of stabilisation monoids. As is often the case, results from stabilisation semigroups transfer naturally to stabilisation monoids. The definition is closely related to the one for stabilisation semigroups, and we see through this section that it is easy to go from the notion for stabilisation monoids to the one for stabilisation semigroups, and back. The result is that we use the same name "computation" for the two notions elsewhere in the paper.

Definition 3.5. Let $\mathbf M$ be a stabilisation monoid. Given a word $u \in M^*$, a stabilisation monoid $n$-[under/over]-computation (sm-[under/over]-computation for short) for $u$ is an $n$-[under/over]-computation for some $v \in M^+$, such that it is possible to obtain $u$ from $v$ by deleting some occurrences of the letter 1. All have value 1.
Thus, the definition deals with the implicit presence of arbitrarily many copies of the empty word (the unit) interleaved with a given word. This definition allows us to work in a transparent way with the empty word (this saves us case distinctions in proofs). In particular the empty word has an sm-$n$-computation which is simply 1, of value 1. There are many others, like $1[1, 1, 1, 1[1, 1]]$ for instance.
Since each $n$-computation is also an sm-$n$-computation over the same word, it is clear that Theorem 3.3 can be extended to this situation (only the obvious case of the empty word needs to be treated separately):

Fact 3.5.1. For all words in $M^*$, there exists an sm-$n$-computation of height at most $3|M|$.
The following lemma shows that sm-[under/over]-computations are not more expressive than [under/over]-computations. It is also elementary to prove.

Lemma 3.6. Given an sm-$n$-computation (resp., sm-$n$-under-computation, sm-$n$-over-computation) of value $a$ for the empty word, then $a = 1$ (resp., $a \le 1$, $a \ge 1$).
For all non-empty words $u$ and all sm-$n$-computations $T$ (resp., sm-$n$-under-computations, sm-$n$-over-computations) for $u$ of value $a$, there exists an $n$-computation (resp., $n$-under-computation, $n$-over-computation) for $u$ of value $a$. Furthermore, its height is at most the height of $T$.
Proof. It is simple to eliminate each occurrence of an extra 1 by local modifications of the structure of the sm-computation: replace subtrees of the form $1[1, \dots, 1]$ by 1, subtrees of the form $a[T, 1]$ by $T$, and subtrees of the form $a[1, T]$ by $T$, up to elimination of all occurrences of 1. For the empty word, this results in the first part of the lemma. For non-empty words, the resulting simplified sm-computation is a computation. The argument works identically for the under/over variants.
A corollary is that Theorem 3.4 extends to sm-computations.
Corollary 3.7. For all non-negative integers $p$, there exists a polynomial $\alpha : \mathbb N \to \mathbb N$ such that for all sm-$n$-under-computations over some word $u \in M^*$ of value $a$ and height at most $p$, and all sm-$\alpha(n)$-over-computations of value $b$ over the same word $u$, $a \le b$.
Proof. Indeed, the sm-under-computations and sm-over-computations can be turned into under-computations and over-computations of the same respective values by Lemma 3.6. The inequality holds for these under- and over-computations by Theorem 3.4.
There is a last lemma which is related and will prove useful.

Lemma 3.8. Let $u$ be a word in $M^*$ and $v$ be obtained from $u$ by eliminating some of its 1 letters. Then all $n$-[under/over]-computations for $v$ can be turned into an $n$-[under/over]-computation for $u$ of the same value. Furthermore, the height increase is at most 3.
Proof. Let $v = a_1 \dots a_n$, then $u = u_1 \dots u_n$ for $u_i \in 1^* a_i 1^*$. Let $T$ be the $n$-[under/over]-computation for $v$ of value $a$. It is easy to construct an $n$-[under/over]-computation $T_i$ for $u_i$ of height at most 3 and of value $a_i$. It is then sufficient to plug each $T_i$ into $T$ in place of the $i$-th leaf of $T$.
The consequence of these results is that we can work with sm-[under/over]-computations as with [under/over]-computations. For this reason we shall not distinguish further between the two notions unless necessary.

3.4. Existence of computations: the proof of Theorem 3.3. In this section, we establish Theorem 3.3, which states that for all words $u$ over a stabilisation semigroup $\mathbf S$ and all non-negative integers $n$, there exists an $n$-computation for $u$ of height at most $3|S|$. Remark that the convention in this context is to measure the height of a tree without counting the leaves. This result is a form of extension of the factorisation forest theorem due to Simon [42]:

Theorem 3.9 (Simon [42,43]). Define a Ramsey factorisation to be an $n$-computation in the pathological case $n = \infty$ (i.e., there are no stabilisation nodes, and idempotent nodes are allowed to have arbitrary degree). For all non-empty words $u$ over a finite semigroup $\mathbf S$, there exists a Ramsey factorisation for $u$ of height at most $3|S| - 1$.

Some proofs of the factorisation forest theorem can be found in [30,7,8]. Our proof could follow similar lines. Instead, we try to reuse as many lemmas as possible from the above constructions.
For proving Theorem 3.3, we will need one of Green's relations, namely the $\mathcal J$-relation (while there are five relations in general). Let us fix a semigroup $\mathbf S$. We denote by $\mathbf S^1$ the semigroup extended (if necessary) with a neutral element 1 (this transforms $\mathbf S$ into a monoid). Given two elements $a, b \in S$, $a \le_{\mathcal J} b$ if $a = x \cdot b \cdot y$ for some $x, y \in S^1$. If $a \le_{\mathcal J} b$ and $b \le_{\mathcal J} a$, then $a \mathrel{\mathcal J} b$. We write $a <_{\mathcal J} b$ to denote $a \le_{\mathcal J} b$ and $b \not\le_{\mathcal J} a$. The interested reader can see, e.g., [8] for an introduction to the relations of Green (with a proof of the factorisation forest theorem), or monographs such as [31], [15] or [39] for deep presentations of this theory. Finally, let us call a regular element in a semigroup an element $a$ such that $a \cdot x \cdot a = a$ for some $x \in S^1$.
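For a finite semigroup given by its multiplication, these definitions can be evaluated by brute force. The Python sketch below is an illustration with names of our own choosing: it computes, for each element $b$, the set of elements $x \cdot b \cdot y$ with $x, y \in S^1$, derives the $\mathcal J$-classes from it, and tests regularity. The two sample semigroups ($\mathbb Z/2\mathbb Z$ under addition, and a two-element semigroup in which every product equals $z$) are hypothetical examples, not taken from the text.

```python
def j_below(b, elems, mul):
    """All elements x.b.y with x, y in S^1, i.e. all a with a <=_J b."""
    left = {b} | {mul(x, b) for x in elems}          # x ranges over S^1
    return left | {mul(z, y) for z in left for y in elems}

def j_classes(elems, mul):
    """Partition the semigroup into J-classes."""
    down = {b: j_below(b, elems, mul) for b in elems}
    classes = []
    for a in elems:
        cls = frozenset(b for b in elems if a in down[b] and b in down[a])
        if cls not in classes:
            classes.append(cls)
    return classes

def is_regular(a, elems, mul):
    """a is regular when a.x.a = a for some x in S^1 (x = 1 gives a.a)."""
    return mul(a, a) == a or any(mul(mul(a, x), a) == a for x in elems)

# Z/2Z under addition: a group, hence a single J-class.
Z2 = [0, 1]
def add_mod2(x, y):
    return (x + y) % 2

# A two-element semigroup where every product is z: x is not regular.
NIL = ["x", "z"]
def nil_mul(p, q):
    return "z"
```

In $\mathbb Z/2\mathbb Z$ the two elements are $\mathcal J$-equivalent and both regular, whereas in the second semigroup the classes are $\{x\}$ and $\{z\}$ and the element $x$ is not regular.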
The next lemma gathers some classical results concerning finite semigroups.

Lemma 3.10. Given a $\mathcal J$-class $J$ in a finite semigroup, the following facts are equivalent:
• $J$ contains a regular element,
• there exist $a, b \in J$ such that $a \cdot b \in J$,
• all elements in $J$ are regular,
• all elements in $J$ can be written as $e \cdot c$ for some idempotent $e \in J$,
• all elements in $J$ can be written as $c \cdot e$ for some idempotent $e \in J$.
Such $\mathcal J$-classes are called regular.
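We will use the following technical lemma.

Lemma 3.11. If $f = e \cdot x \cdot e$ for two idempotents $e \mathrel{\mathcal J} f$, then $f = e$.

Proof. We use some standard results concerning finite semigroups. The interested reader can find the necessary material for instance in [39]. Let us just recall that the relations $\le_{\mathcal L}$, $\le_{\mathcal R}$, $\mathcal L$ and $\mathcal R$ are the one-sided variants of $\le_{\mathcal J}$ and $\mathcal J$ ($\mathcal L$ stands for "left" and $\mathcal R$ for "right"). Namely, $a \le_{\mathcal L} b$ if $a = x \cdot b$ for some $x \in S^1$, and $a \le_{\mathcal R} b$ if $a = b \cdot x$ for some $x \in S^1$; the relations $\mathcal L$ and $\mathcal R$ are the corresponding equivalences, and $\mathcal H = \mathcal L \cap \mathcal R$. The proof is very short. By definition $f \le_{\mathcal L} e$ since $e \cdot x \cdot e = f$. Since by assumption $f \mathrel{\mathcal J} e$, we obtain $f \mathrel{\mathcal L} e$ (a classical result in finite semigroups). In a symmetric way $f \mathrel{\mathcal R} e$. Thus $f \mathrel{\mathcal H} e$. Since an $\mathcal H$-class contains at most one idempotent, $f = e$ (it is classical that any $\mathcal H$-class, when containing an idempotent, has a group structure; since groups contain exactly one idempotent element, this is the only one).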
The next lemma shows that the stabilisation operation behaves in a very uniform way inside $\mathcal J$-classes (similar arguments can be found in the works of Leung, Simon and Kirsten).

Lemma 3.12. If $e \mathrel{\mathcal J} f$ are idempotents, then $e^\sharp \mathrel{\mathcal J} f^\sharp$. Furthermore, if $e = x \cdot f \cdot y$ for some $x, y$, then $e^\sharp = x \cdot f^\sharp \cdot y$.
Proof. For the second part, assume e = x • f • y and e J f. Let […] This proves that e J f implies e^♯ ≤_J f^♯. Using symmetry, we obtain e^♯ J f^♯.
Hence, if J is a regular J-class, there exists a unique J-class J^♯ which contains e^♯ for one/all idempotents e ∈ J. If J = J^♯, then J is called stable, otherwise, it is called unstable. The following lemma shows that stabilisation is trivial over stable J-classes.
Lemma 3.13. If J is a stable J-class, then e^♯ = e for all idempotents e ∈ J.
The situation is different for unstable J-classes. In this case, the stabilisation always goes down in the J-order.
Lemma 3.14. If J is an unstable J-class, then e^♯ <_J e for all idempotents e ∈ J.
Proof. Since e^♯ = e • e^♯, it is always the case that e^♯ ≤_J e. Assuming J is unstable means that e J e^♯ does not hold, which in turn implies e^♯ <_J e.
We say that a word u = a_1 … a_n in S^+ is J-smooth, for J a J-class, if u ∈ J^+ and π(u) ∈ J. It is equivalent to say that π(a_i a_{i+1} ⋯ a_j) ∈ J for all 1 ≤ i < j ≤ n. Indeed, for all 1 ≤ i < j ≤ n, a_i J π(a_1 … a_n) ≤_J π(a_i a_{i+1} ⋯ a_j) ≤_J a_i ∈ J. Remark that, according to Lemma 3.10, if J is irregular, J-smooth words have length at most 1. We will use the following lemma from [8] as a black box. This is an instance of the factorisation forest theorem, but restricted to a single J-class.
Lemma 3.15 (Lemma 14 in [8]). Given a finite semigroup S, one of its J-classes J, and a J-smooth word u, there exists a Ramsey factorisation for u of height at most 3|J| − 1.
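The two equivalent characterisations of J-smooth words can be checked by brute force on a small example. The sketch below is only an illustration: it assumes a hypothetical five-element semigroup (pairs (i, j) with i, j ∈ {1, 2} plus a zero, where (i, j) • (k, l) = (i, l) if j = k and zero otherwise), in which the four non-zero elements form a single J-class. The factor test also includes single letters (i = j), which is harmless since the letters are required to lie in J anyway.

```python
from itertools import product

ZERO = "0"
# Hypothetical example semigroup: four "matrix units" and an absorbing zero.
ELEMENTS = [(i, j) for i in (1, 2) for j in (1, 2)] + [ZERO]

def mul(a, b):
    """(i, j) * (k, l) = (i, l) when j == k, and zero otherwise."""
    if a == ZERO or b == ZERO:
        return ZERO
    return (a[0], b[1]) if a[1] == b[0] else ZERO

def j_leq(a, b):
    """a <=_J b: a = x * b * y for some x, y in S^1 (None stands for 1)."""
    extended = [None] + ELEMENTS
    for x, y in product(extended, repeat=2):
        value = b if x is None else mul(x, b)
        value = value if y is None else mul(value, y)
        if value == a:
            return True
    return False

def j_class(a):
    return {b for b in ELEMENTS if j_leq(a, b) and j_leq(b, a)}

def pi(word):
    """Product of a non-empty word."""
    value = word[0]
    for letter in word[1:]:
        value = mul(value, letter)
    return value

def is_j_smooth(word, J):
    """First definition: all letters in J and the product in J."""
    return all(letter in J for letter in word) and pi(word) in J

def all_factors_in_j(word, J):
    """Second definition: the product of every factor lies in J."""
    n = len(word)
    return all(pi(word[i:j + 1]) in J for i in range(n) for j in range(i, n))

J = j_class((1, 1))
```

For instance, (1, 2)(2, 1)(1, 1) is J-smooth, while (1, 2)(1, 2) is not, since its product is the zero.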
Remark that Ramsey factorisations and n-computations differ only in what is allowed for a node of large degree, i.e., above n. That is why our construction makes use of Lemma 3.15 to produce Ramsey factorisations, and then, based on the presence of nodes of large degree, constructs a computation by gluing pieces of Ramsey factorisations together.
Lemma 3.16. Let J be a J-class, u be a J-smooth word, and n be some non-negative integer. Then one of the two following items holds: (1) there exists an n-computation for u of value π(u) and height at most 3|J| − 1, or; (2) there exists an n-computation for some non-empty prefix w of u of value a <_J J and height at most 3|J|.
Proof. Remark that if J is irregular, then u has length 1 by Lemma 3.10, and the result is straightforward. Remark also that if J is stable, then, since the stabilisation is trivial in stable J-classes (Lemma 3.13), every Ramsey factorisation for u of height at most 3|J| − 1 (one exists by Lemma 3.15) is in fact an n-computation for u.
The case of J unstable remains. Let us say that a node in a factorisation is big if its degree is more than n. Our goal is to "correct" the value of big nodes. If there is a Ramsey factorisation for u which has no big node, then it can be seen as an n-computation, and once more the first conclusion of the lemma holds.
Otherwise, consider the least non-empty prefix u′ of u for which there is a Ramsey factorisation of height at most 3|J| − 1 which contains a big node. Let F be such a factorisation and x be a big node in F which is maximal for the descendant relation (there are no other big nodes below). Let F′ be the subtree of F rooted in x. This decomposes u′ into v v′ v″ where v′ is the factor of u′ for which F′ is a Ramsey factorisation. For this v′, it is easy to transform F′ into an n-computation T′ for v′: just replace the label e of the root of F′ by e^♯. Indeed, since there are no other big nodes in F′ than the root, the root is the only place which prevents F′ from being an n-computation. Remark that from Lemma 3.14, the value of T′ is <_J J.
If v is empty, then v′ is a prefix of u, and T′ is an n-computation for it. The second conclusion of the lemma holds.
Otherwise, by the minimality assumption and Lemma 3.15, there exists a Ramsey factorisation T for v of height at most 3|J| − 1 which contains no big node. Both T and T′ being n-computations of height at most 3|J| − 1, it is easy to combine them into an n-computation of height at most 3|J| for vv′. This is an n-computation for vv′, which inherits from T′ the property that its value is <_J J. It proves that the second conclusion of the lemma holds.
We are now ready to establish Theorem 3.3.
Proof. The proof is by induction on the size of a left-right-ideal Z ⊆ S, i.e., S^1 • Z • S^1 ⊆ Z (remark that a left-right-ideal is a union of J-classes). We establish by induction on the size of Z the following induction hypothesis:
IH: for all words u ∈ Z^+ + Z^*S there exists an n-computation of height at most 3|Z| for u.
Of course, for Z = S, this proves Theorem 3.3.
The base case is when Z is empty. Then u has length 1, and a single node tree establishes the induction hypothesis (recall that the convention is that the leaves do not count in the height, and as a consequence a single node tree has height 0).
Otherwise, assume Z non-empty. There exists a maximal J-class J (maximal for ≤_J) included in Z. From the maximality assumption, we can check that Z′ = Z \ J is again a left-right-ideal. Remark also that since Z is a left-right-ideal, it is downward closed for ≤_J. This means in particular that every element a such that a <_J J belongs to Z′.
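The step "remove a maximal J-class and stay an ideal" can be illustrated on a toy example. The following sketch assumes a hypothetical semigroup, the chain {0, 1, 2} under min, in which ≤_J coincides with the usual order and every J-class is a singleton; the function name is also an assumption made for the illustration.

```python
def is_left_right_ideal(Z, elements, mul):
    """True when S^1 * Z * S^1 is included in Z.

    It suffices to check closure under multiplication by one element on
    each side: S*Z in Z and Z*S in Z together imply S*Z*S in Z.
    """
    Z = set(Z)
    return all(mul(x, z) in Z and mul(z, x) in Z for z in Z for x in elements)

# Hypothetical example: the chain 0 < 1 < 2 under min.
CHAIN = [0, 1, 2]

Z = {0, 1, 2}              # the whole semigroup is a left-right-ideal
Z_prime = Z - {2}          # remove the maximal J-class {2}
not_an_ideal = Z - {1}     # removing a non-maximal J-class breaks the property
```

Here Z′ = {0, 1} is again a left-right-ideal, whereas {0, 2} is not, because min(2, 1) = 1 falls outside of it. This is why the proof removes a maximal J-class.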
Claim: We claim (⋆) that for all words u ∈ Z^+ + Z^*S, (1) either there exists an n-computation of height at most 3|J| for u, or; (2) there exists an n-computation of height at most 3|J| for some non-empty prefix of u of value in Z′.
Let w be the longest J-smooth prefix of u. If there exists no such non-empty prefix, this means that the first letter a of u does not belong to J. Two subcases can happen. If u has length 1, this means that u = a, and thus a is an n-computation witnessing the first conclusion of (⋆). Otherwise u has length at least 2, and thus a belongs to Z. Since furthermore it does not belong to J, it belongs to Z′. In this case, a is an n-computation witnessing the second conclusion of (⋆).
Otherwise, according to Lemma 3.16 applied to w, two situations can occur. The first case is when there is an n-computation T for w of value π(w) and height at most 3|J| − 1. There are several sub-cases. If u = w, of course, the n-computation T is a witness that the first conclusion of (⋆) holds. Otherwise, there is a letter a such that wa is a prefix of u. If wa = u, then π(wa)[T, a] is an n-computation for wa of height at most 3|J|, witnessing that the first conclusion of (⋆) holds. Otherwise, a has to belong to Z (because all letters of u have to belong to Z except possibly the last one). But, by maximality of w as a J-smooth prefix, either a ∈ Z′, or π(wa) ∈ Z′. Since Z′ is a left-right-ideal, a ∈ Z′ implies π(wa) ∈ Z′. Then, π(wa)[T, a] is an n-computation for wa of height at most 3|J| and value π(wa) ∈ Z′. This time, the second conclusion of (⋆) holds.
The second case according to Lemma 3.16 is when there exists a prefix v of w for which there is an n-computation of height at most 3|J| of value <_J J. In this case, v is also a prefix of u, and the value of this computation is in Z′. Once more the second conclusion of (⋆) holds. This concludes the proof of Claim (⋆).
As long as the second conclusion of the claim (⋆) applied on the word u holds, this decomposes u into v_1 u′, and we can proceed with u′. In the end, we obtain that all words u ∈ Z^+ + Z^*S can be decomposed into u_1 … u_k such that there exist n-computations T_1, …, T_k of height at most 3|J| for u_1, …, u_k respectively, and such that the values of T_1, …, T_{k−1} all belong to Z′ (but not necessarily the value of T_k). Let a_1, …, a_k be the values of T_1, …, T_k respectively. The word a_1 … a_k belongs to (Z′)^+ + (Z′)^*S. Let us apply the induction hypothesis to the word a_1 … a_k. We obtain an n-computation T for a_1 … a_k of height at most 3|Z′|. By simply substituting T_1, …, T_k for the leaves of T, we obtain an n-computation for u of height at most 3|J| + 3|Z′| = 3|Z|. (Remark once more here that the convention is to not count the leaves in the height. Hence the height after a substitution is bounded by the sum of the heights.)

3.5. Comparing computations: the proof of Theorem 3.4.
We now establish the second key theorem for computations, namely Theorem 3.4, which states that the result of computations is, in some sense, unique. The proof works by a case analysis on the possible ways the over-computations and under-computations may overlap. We perform this proof for stabilisation monoids, thus using sm-computations. More precisely, all statements take as input computations, and output sm-computations, which can then be normalised into non-sm computations. The result for stabilisation semigroups can be derived from it. We fix from now on a stabilisation monoid M.
Lemma 3.17. For all n-over-computations of value a over a word u ∈ M^* of length at most n, π(u) ≤ a.
Proof. By induction on the height of the over-computation, using the fact that an n-over-computation for a word of length at most n cannot contain a stabilisation node.

Proof. By induction on the height of the over-computation.
A sequence of words u_1, …, u_k is called a decomposition of u if u = u_1 … u_k. We say that a non-leaf [under/over]-computation T for a word u decomposes u into u_1, …, u_k if the subtree rooted at the ith child of the root is an [under/over]-computation for u_i, for all i = 1 … k. Our proof will mainly make use of over-computations. For this reason, we introduce the following terminology.
We say that a word u ∈ M^* n-evaluates to a ∈ M if there exists an n-over-computation for u of value a. We will also say that u […] This notion is subject to elementary reasoning such as (a) u n-evaluates to π(u) […]
The core of the proof is contained in the following property:
Lemma 3.19. There exists a polynomial α such that for all u_1, …, […]
From this result, we can deduce Theorem 3.4 as follows.
Proof of Theorem 3.4. Let α be as in Lemma 3.19. Let α^p be the pth composition of α with itself. Let U be an n-under-computation of height at most p for some word u of value a, and T be an α^p(n)-over-computation for u of value b. We want to establish that a ≤ b. The proof is by induction on p. If p = 0, this means that u has length 1, then T and U are also restricted to a single leaf, and the result obviously holds. Otherwise, U decomposes u into u_1, …, u_k. Let […] For all i = 1 … k, we can apply the induction hypothesis on U_i (let us recall that U_i is the sub-under-computation rooted at the ith child of the root of U) and B_i, and obtain that a_i ≤ b_i. Depending on k, three cases have to be separated. If k = 2 (binary node), then […] which is an idempotent. We have e = a_i ≤ b_i for all i = 1 … k. Hence by Lemma 3.17, e ≤ b, which means a ≤ b. If k > n (stabilisation node), we have once more […] which is an idempotent, and such that a ≤ e^♯. This time, by Lemma 3.18, we have e^♯ ≤ b. We obtain once more a ≤ b.
The remainder of this section is dedicated to the proof of Lemma 3.19.
Lemma 3.20. There exists a positive integer K such that for all idempotents e, f, whenever […]
Proof. To each ordered pair i < j, let us associate the color c_{i,j} = (a_i, π(b_i a_{i+1} … b_{j−1})). We now apply the theorem of Ramsey to this coloring, for K sufficiently large, and get that there exist 1 […]
The following lemma will be used for treating the case of idempotent and stabilisation nodes in the proof of Lemma 3.19.
Lemma 3.21. There exists a polynomial β such that, if x_1, y_1, x_2, y_2, …, x_m, y_m (m ≥ 1) are elements of M and e is an idempotent such that x_h • y_h ≤ e for all h = 1 … m, then […]
We first claim (⋆) that there exists i < j such that v = d_i … d_{j−1} n-evaluates to y_i • e • x_j. For this, consider the word u = d_1 … d_{m−1}, and apply Theorem 3.3 for producing an (n + K)-computation U for u of height at most 3|M|. The word u has length m − 1 > β(n) − 1 = (n + K)^{3|M|}. Thus there is a stabilisation node in U, say of degree k > n + K. Let S be a subtree of U rooted at some stabilisation node. Let f be the (idempotent) value of the children of this node, the value of S being f^♯. This subtree corresponds to the factor v = d_i … d_{j−1} of u. We have to show that v n-evaluates to […] Hence we can apply Lemma 3.20, and get that f ≤ a_1 • e • b_k. Since furthermore a_1 = y_i, and b_k is either ≤ x_j or ≤ e • x_j, it follows that v n-evaluates to f ≤ y_i • e • x_j. This concludes the proof of the claim (⋆).
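The counting step used in the claim above (a long word forces a node of large degree) relies on the fact that a tree of height h whose nodes all have degree at most d has at most d^h leaves, with the convention that leaves do not count in the height. A minimal sketch of this arithmetic follows; the function names and the numeric values in the usage are illustrative assumptions.

```python
def max_leaves(max_degree, height):
    """Largest number of leaves of a tree of the given height in which
    every node has at most max_degree children (a single leaf has height 0)."""
    if height == 0:
        return 1
    return max_degree * max_leaves(max_degree, height - 1)

def forces_big_node(word_length, n, K, monoid_size):
    """True when every (n + K)-computation of height at most 3 * monoid_size
    for a word of this length must contain a node of degree above n + K,
    that is, a stabilisation node."""
    return word_length > max_leaves(n + K, 3 * monoid_size)
```

With n = K = 1 and a one-element monoid, the bound is 2^3 = 8: a word of length 9 forces a stabilisation node, a word of length 8 does not.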
Set now […] This proves the second conclusion of the statement.
We are now ready to conclude.
Proof of Lemma 3.19. Let us set α(n) to be (n + 1)β(n) − 1, where β is the polynomial taken from Lemma 3.21. Lemma 3.19 follows from the following induction hypothesis:
Induction hypothesis: For all words u which α(n)-evaluate to b, and all decompositions of […]
Induction parameter: The height of the α(n)-over-computation T witnessing that u α(n)-evaluates to b.
It should be clear that this implies Lemma 3.19 since this means that b […] The essential idea in the proof of the induction hypothesis is that T decomposes the word into v_1, …, v_ℓ, and we have to study all the possible ways the v_i's and the u_j's may overlap. In practice, we will not refer much to T, but simply to how it decomposes the word into v_1, …, v_ℓ. Thus, from now on, let v_1, …, v_ℓ and u_1, …, u_k be decompositions of a word u such that each of the v_i's α(n)-evaluates to a_i and is subject to the application of the induction hypothesis.
Leaves. This means that ℓ = 1. All the u_h's should be empty, but one, say u_h = a where a is the letter labelling the leaf. Three cases can occur depending on h. If h = 1, then u_1, …, u_k obviously n-evaluate to a, 1, …, 1, 1, and 1 … 1 n-evaluate to 1, and we indeed have a […], a, 1, …, 1, and 1 … 1a1 … 1 n-evaluate to a, and we indeed have […]
Binary nodes. If ℓ = 2, then there exist s in 1, …, k and words w, w′ such that […] We can apply the induction hypothesis to both v_1 and v_2. We obtain that u_1, …, u_{s−1}, w, w′, u_{s+1}, …, […]
Idempotent and stabilisation nodes. Assume now that v_1, …, v_ℓ α(n)-evaluate to e, …, e, where e is idempotent. We aim at proving that u_1, …, u_k n-evaluate to b_1, …, b_k, and b […] We rely on a suitable decomposition of the words: there exist 0 […] The best is to present it through a drawing. It is annotated with all the variables that will be used during the proof. The two main rows represent the two possible decompositions of the word into v_i's and u_j's.
Such a decomposition is not unique. It is sufficient to guarantee that each separation between some u_s and some u_{s+1} falls in some v_{i_h}, and that v_{i_h} contains such a separation. We can apply the induction hypothesis on each equation ( ). Hence, it follows that […] Since each v_h α(n)-evaluates to e, each v_h also n-evaluates to e. Now e_h has been chosen such that v_{i_h+1} … v_{i_{h+1}−1} n-evaluates to e_h. Thus from ( ), u_{j_h} n-evaluates for all h = 0 … m to b_{j_h} that we define as b_{j_h} = b_h • e_h • b_h. At this point, we have that (C1) u_1, …, u_k n-evaluate to b_1, …, b_k.
To head toward the conclusion, we will use Lemma 3.21. Thus, let us set x_h to be b_{h−1} and y_h to be c_h • b_h • e_h for all h = 1 … m. We have […] According to (†), x_h • y_h ≤ e, and we can apply Lemma 3.21 to x_1, y_1, x_2, …, x_m, y_m and obtain that (y_1 • x_2) … (y_{m−1} • x_m) n-evaluates to some z subject to the conclusions of the lemma (we will recall these conclusions upon need).

Recognisable cost functions
We have seen in the previous sections the notion of stabilisation monoids, as well as the key technical tools for dealing with them, namely computations, over-computations and under-computations. In particular, we have seen Theorem 3.3 and Theorem 3.4 that state the existence of computations and the "unicity" of their values. In this section, we put these notions in action, and introduce the definition of recognisable cost functions. We will see in particular that the hypothesis of Fact 2.0.8 is fulfilled by recognisable cost functions, and as a consequence the domination problem for cost-monadic logic is decidable over finite words. In particular, the informal reasoning developed in the example such as "a few+a few=a few" now has a formal meaning: the imprecision in such arguments is absorbed in the equivalence up to α of computation trees, and results in the fact that the monoid does not define a unique function, but instead directly a cost function.
Another example is the case of standard regular languages.
Example 4.2.2. Let us recall that a monoid M together with h from A to M and a subset F ⊆ M is said to recognise a language L over A if for all words u, u ∈ L if and only if π(h(u)) ∈ F. The same monoid can be seen, thanks to Remark 3.1.3, as a stabilisation monoid. In this case, thanks to Remark 3.4.2, the same M, h, F recognises the characteristic mapping of L.
An elementary result is also the closure under composition with a morphism. We continue this section by developing other tools for analysing the recognisable cost functions.
4.2. The ♯-expressions.
We now present the notion of ♯-expressions. This provides a convenient notation in several situations. This object was introduced by Hashiguchi for studying distance automata [17]. The ♯-expressions can be seen in two different ways. On one side, a ♯-expression allows one to denote an element in a stabilisation monoid. On the other side, a ♯-expression denotes an infinite sequence of words. Such sequences are used as witnesses, e.g., of the non-existence of a bound for a function (if the function tends toward infinity over this sequence), or of the non-divergence of a function f (if the function is bounded over the sequence). More generally, ♯-expressions will be used as witnesses of non-domination.
In a finite monoid M, given an element a ∈ M, one denotes by a^ω the only idempotent which is a power of a. This element does not exist in general for infinite monoids, while it always does for finite monoids (our case). Furthermore, when it exists, it is unique. In particular in a finite monoid M, a^ω = a^Ω, where Ω is some multiple of |M|!. This is a useful notion since the operator of stabilisation is only defined for idempotents. In a stabilisation monoid, let us denote by a^{ω♯} the element (a^Ω)^♯. As opposed to a^♯, which is not defined if a is not idempotent, a^{ω♯} is always defined. We consider Ω as fixed from now on.
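As an illustration of the idempotent power, the following sketch computes a^ω by enumerating the powers of a, and compares it with a^Ω for Ω = |M|!. The example monoid (integers modulo 10 under multiplication) and all function names are assumptions chosen for the illustration.

```python
from math import factorial

def idempotent_power(a, mul):
    """Return a^omega, the unique idempotent among the powers a, a^2, a^3, ...

    In a finite semigroup the powers of a eventually cycle, and the cycle
    contains exactly one idempotent, so the loop below finds it.
    """
    power = a
    seen = []
    while power not in seen:
        if mul(power, power) == power:
            return power
        seen.append(power)
        power = mul(power, a)
    raise ValueError("no idempotent power found: is the product associative?")

def power(a, k, mul):
    """Compute a^k for k >= 1 by repeated squaring."""
    result = None
    base = a
    while k > 0:
        if k & 1:
            result = base if result is None else mul(result, base)
        base = mul(base, base)
        k >>= 1
    return result

def mul_mod10(a, b):
    """Hypothetical example monoid: integers modulo 10 under multiplication."""
    return (a * b) % 10

OMEGA = factorial(10)  # a multiple of |M|! for the ten-element example
```

For instance, the powers of 2 modulo 10 are 2, 4, 8, 6, 2, …, and 6 is the idempotent among them, so 2^ω = 6.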
A ♯-expression over a set A is an expression composed of letters from A, products, and exponents with ω♯. A ♯-expression E over a stabilisation monoid M denotes a computation in this stabilisation monoid. It naturally evaluates to an element of M, denoted value(E), and called the value of E. A ♯-expression is called strict if it contains at least one occurrence of ω♯.
Given a set A ⊆ M, call ⟨A⟩^♯ the set of values of ♯-expressions over A. Equivalently, it is the least set which contains A and is closed under product and stabilisation of idempotents. One also denotes by ⟨A⟩^{♯+} the set of values of strict ♯-expressions over A.
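The characterisation of this set as a least fixed point suggests a direct saturation procedure. The sketch below is a minimal illustration under stated assumptions: the three-element structure {1, a, 0} (with 1 neutral, a • a = a, 0 absorbing, and stabilisation 1 ↦ 1, a ↦ 0, 0 ↦ 0) is a hypothetical toy example assumed to be a stabilisation monoid, and the function names are not from the paper.

```python
def sharp_closure(generators, mul, stab):
    """Least set containing the generators and closed under product and
    under stabilisation of idempotents."""
    closure = set(generators)
    changed = True
    while changed:
        changed = False
        new = set()
        for a in closure:
            for b in closure:
                new.add(mul(a, b))
            if mul(a, a) == a:       # stabilisation only applies to idempotents
                new.add(stab(a))
        if not new <= closure:
            closure |= new
            changed = True
    return closure

# Hypothetical toy stabilisation monoid over {"1", "a", "0"}.
def toy_mul(x, y):
    if x == "1":
        return y
    if y == "1":
        return x
    if x == "0" or y == "0":
        return "0"
    return "a"

def toy_stab(e):
    return {"1": "1", "a": "0", "0": "0"}[e]
```

Starting from the single generator a, the saturation adds a^♯ = 0 and then stops, giving the set {a, 0}.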
• µ(e^♯) = µ(e)^♯ for all e ∈ E(M) (in this case, µ(e) ∈ E(M′)).
Remark 4.2.4. The n-computations (resp., n-under-computations, n-over-computations) over M are transformed by morphism (applied to each node of the tree) into n-computations (resp., n-under-computations, n-over-computations) over M′. In a similar way, the image under morphism of a ♯-expression over M is a ♯-expression over M′.
We immediately obtain:
Lemma 4.3. For µ a morphism of stabilisation monoids from M to M′, h a mapping from an alphabet A to M and I′ an ideal of M′, we have: […]
Proof. Let us remark first that I = µ^{-1}(I′) is an ideal of M.
Let u be a word in A^*. Let us consider an n-computation over M for h(u) of value a ∈ µ^{-1}(I′). This computation can be transformed by morphism into an n-computation over M′ for (µ • h)(u) of value µ(a) ∈ I′. In a similar way, each n-computation over h(u) of value a ∈ M \ µ^{-1}(I′) can be transformed into an n-computation of value µ(a) ∈ M′ \ I′.
The notion of morphism is intimately related to the notion of product. Given two stabilisation monoids M = ⟨M, •, ♯, ≤⟩ and M′ = ⟨M′, •′, ♯′, ≤′⟩, one defines their product M × M′ over the set of pairs by: (x, x′) • (y, y′) = (x • y, x′ •′ y′), (e, e′)^♯ = (e^♯, e′^{♯′}), and (x, x′) ≤ (y, y′) if and only if x ≤ y and x′ ≤′ y′.
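The componentwise definition of the product can be sketched as follows. This is an illustration only: the two factor structures are hypothetical toy examples (a three-element structure {1, a, 0} with a^♯ = 0 and order 0 ≤ x, assumed to be a stabilisation monoid, and the group Z/2 seen as a stabilisation monoid with trivial stabilisation and equality as order), and the function names are assumptions.

```python
def product_structure(mul1, stab1, leq1, mul2, stab2, leq2):
    """Build the product, stabilisation and order of a product structure,
    all defined componentwise on pairs."""
    def mul(p, q):
        return (mul1(p[0], q[0]), mul2(p[1], q[1]))

    def stab(p):
        # Only meaningful when p is idempotent, i.e. both components are.
        return (stab1(p[0]), stab2(p[1]))

    def leq(p, q):
        return leq1(p[0], q[0]) and leq2(p[1], q[1])

    return mul, stab, leq

# First hypothetical factor: {"1", "a", "0"} with a^sharp = 0.
def toy_mul(x, y):
    if x == "1":
        return y
    if y == "1":
        return x
    return "0" if "0" in (x, y) else "a"

def toy_stab(e):
    return {"1": "1", "a": "0", "0": "0"}[e]

def toy_leq(x, y):
    return x == y or x == "0"

# Second hypothetical factor: Z/2, a classical monoid with trivial stabilisation.
def z2_mul(x, y):
    return (x + y) % 2

def z2_stab(e):
    return e

def z2_leq(x, y):
    return x == y

mul, stab, leq = product_structure(toy_mul, toy_stab, toy_leq,
                                   z2_mul, z2_stab, z2_leq)
```

Since every operation acts componentwise, projecting a pair to its first (or second) component commutes with product and stabilisation, which is the morphism property used in the text.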
As expected, the projection over the first component (resp., second component) is a morphism of stabilisation monoids from M × M′ onto M (resp., onto M′). It follows by Lemma 4.3 that if f is recognised by M, h, I and g by M′, h′, I′, then f is also recognised by M × M′, h × h′, I × M′ and g by M × M′, h × h′, M × I′, in which one sets (h × h′)(a) = (h(a), h′(a)) for all letters a. Thus one obtains:
Lemma 4.4. If f and g are recognisable cost functions over A^*, there exist a stabilisation monoid M, a mapping h from A to M and two ideals I, J such that M, h, I recognises f and M, h, J recognises g.
Proof. Let f, g be recognisable cost functions. According to Lemma 4.4 there exist a stabilisation monoid M, a mapping h from the alphabet A to M and two ideals I, J such that M, h, I recognises f and M, h, J recognises g.
We show that g dominates f if and only if the following (decidable) property holds: […] Thus we have f ≼ g.
Second direction. Let us suppose the existence of a ∈ ⟨h(A)⟩^♯ ∩ I \ J. By definition of ⟨h(A)⟩^♯, there is a ♯-expression E over h(A) of value a. Let F be the ♯-expression over A obtained by substituting for each element x ∈ h(A) some letter c ∈ A such that h(c) = x. According to Proposition 4, f is unbounded over {unfold(F, Ωn) : n ≥ 1}. However, still applying Proposition 4, g is bounded over {unfold(F, Ωn) : n ≥ 1}. This witnesses that g does not dominate f.

4.5. Closure under inf-projection.
We establish the following theorem.
The projection in the classical case (of recognisable languages of finite words) requires a powerset construction (for monoids as for deterministic automata). In our case, the approach is similar. Let z be a mapping from alphabet A to B. The goal of a stabilisation monoid which would recognise the inf-projection by z of a recognisable cost function is to keep track of all the values a computation could have taken for some inverse image by z of the input word. Hence an element of the stabilisation monoid for the inf-projection of the cost function consists naturally of a set of elements in the original monoid.
A closer inspection reveals that it is possible to close these subsets downward, i.e., to consider only ideals. In fact, it is not only possible, but it is even necessary for the construction to go through. Let us describe more formally this construction.
We have to consider a construction of ideals. Let M↓ be the set of ideals of M. One equips M↓ with an order simply by inclusion: […] and with a product as follows: […] Finally, the stabilisation is defined for an idempotent by: […] The resulting structure ⟨M↓, •, ≤, ♯⟩ is denoted M↓. It may seem a priori that our first goal would be to prove that the structure defined in the above way is indeed a stabilisation monoid. In fact, thanks to Proposition 3, this will be for free (see Lemma 4.12 below).
We now prove that M↓ can be used for recognising the inf-projection of a cost function recognised by M. Thus our goal is to relate the (under)-computations in M↓ to the (under)-computations in M. This will provide a semantic link between the two stabilisation monoids. This relationship takes the form of Lemmas 4.10 […]
A similar characterisation holds for the stabilisation of idempotents.
Lemma 4.9. If E is a stable idempotent in M↓ (i.e. such that E = E^♯), then for all a ∈ E there exist b, c, e ∈ E with e idempotent such that a ≤ b • e^♯ • c.
Proof. By definition of E^♯, there is a strict ♯-expression F over E such that a ≤ value(F). Thus it is sufficient to prove, by induction, that every strict ♯-expression F is such that value(F) ≤ b • e^{ω♯} • c for some b, e, c in E. The base case is F = G^{ω♯} where G is a non-strict ♯-expression. In this case value(G) = g ∈ E. It follows that value(G^{ω♯}) = value(G)^{ω♯} ≤ g^ω • g^{ω♯} • g^ω. Thus the induction hypothesis holds. The other case is the product […] Once more the induction hypothesis holds.
Lemma 4.10. Let A_1 … A_k be a word over M↓ and let T be an n-under-computation over A_1 … A_k of height at most p and of value A. For all a ∈ A, there exists an n-under-computation of height 3p and value a for some word a_1 … a_k such that a_1 ∈ A_1, …, a_k ∈ A_k.
Proof. The proof is by induction on p.
Leaf case, i.e., T = A_1. Let a ∈ A ⊆ A_1, then a is an n-computation of value a ∈ A.
Idempotent node. T = F[T_1, …, T_k] for some k ≤ n where F ⊆ E for an idempotent E such that the value of T_i is E for all i. Let a ∈ F ⊆ E. We have a ≤ b • e • c for some b, c, e ∈ E (Lemma 4.8). We then apply the induction hypothesis for b, e, …, e and c on the n-under-computations T_1, …, T_{k−1} and T_k respectively, yielding the n-under-computations t_1, …, t_{k−1} and t_k respectively. The tree a[t_1, (e • c)[e[t_2, …, t_{k−1}], t_k]] is an n-under-computation witnessing that the induction hypothesis holds.
Stabilisation node. T = F[T_1, …, T_k] for some k > n and F ⊆ E^♯ for some idempotent E such that the value of T_i is E for all i. Let a ∈ F ⊆ E^♯. We have a ≤ b • e^♯ • c for some b, e, c ∈ E (Lemma 4.9). We then apply the induction hypothesis for b, e, …, e and c respectively and the computations T_1, …, T_k respectively, yielding the n-under-computations t_1, …, t_k respectively. We conclude by constructing the n-under-computation a[t_1, (e^♯ • c)[e^♯[t_2, …, t_{k−1}], t_k]] (remark that e^♯[t_2, …, t_{k−1}] is a valid under-computation since e^♯ ≤ e).
Idempotent node, i.e., T = F[T_1, …, T_k] for k ≤ n^{3|M|} where T_1, …, T_k share the same idempotent value E ⊆ F. Let t_1, …, t_k be the n-computations of respective values b_1, …, b_k obtained by applying the induction hypothesis on T_1, …, T_k respectively. Furthermore, according to Theorem 3.3, there exists an n-computation t for the word b_1 … b_k of height at most 3|M|. Let a be the value of t. Since E is an idempotent, it is closed under product and stabilisation and contains b_1, …, b_k. It follows that a ∈ E (by induction on the height of t). The induction hypothesis holds using the witness n-computation t{t_1, …, t_k} where t{t_1, …, t_k} is obtained from t by substituting t_i for the ith leaf, for all i = 1 … k.
Stabilisation node, i.e., T = F[T_1, …, T_k] for k > n^{3|M|} where T_1, …, T_k all share the same idempotent value E such that E^♯ ⊆ F. Let t_1, …, t_k be the n-computations of respective values b_1, …, b_k obtained by applying the induction hypothesis on T_1, …, T_k respectively. Furthermore, according to Theorem 3.3, there exists an n-computation t for b_1 … b_k of height at most 3|M|. Since t has height at most 3|M| and has more than n^{3|M|} leaves, it contains at least one node of degree more than n, namely, a stabilisation node. It is then easy to prove by induction on the height of t that the value of t belongs to ⟨E⟩^{♯+}. Thus the n-computation t{t_1, …, t_k} (as defined in the above case) is a witness for the induction hypothesis.
Proof. Consider an n-under-computation T of value A in M↓ for some word A_1 … A_k of height at most p, and some α(n)-over-computation T′ for the same word of value B (with α(n) = α′(n)^{3|M|} where α′ is obtained from Theorem 3.4 applied to M for height at most 3p). We aim at A ≤ B. Indeed, this implies that one can use Proposition 3, and get that M↓ is a stabilisation semigroup (it is then straightforward to prove it a stabilisation monoid).
Let a ∈ A, we aim at a ∈ B, thus proving A ⊆ B, i.e., A ≤ B. By Lemma 4.10, there exists an n-under-computation for some word a_1 … a_k with a_1 ∈ A_1, …, a_k ∈ A_k, of […]
[…] of a product with: […] and of a stabilisation operation by: […] Let us call M↑ the resulting structure. One can remark that ⟨E⟩^{♯+}↑ = ⟨E⟩^♯↑. This was not the case for ideals. The proof is extremely close to the case of inf-projection. However, a careful inspection would show that all computations of bounds, and even some local arguments, need to be modified.
Let us state a simple remark on the structure of idempotents.
Our second preparatory lemma is used for the treatment of stabilisation nodes.
Lemma 4.15. There exists a polynomial α such that for all idempotents E of M↑, all a ∈ E and all α(m) ≤ n, there exists an m-over-computation of height at most 2|M| + 3 of value a over some word over E of length n.
Proof. It is sufficient to prove the result for a single pair E, a, and construct for each such case a polynomial α_{E,a}. Then, since there are finitely many such pairs (E, a), one can choose a polynomial α that is above all the α_{E,a}. This α will witness the lemma for all choices of E and a.
We first claim (⋆) that if b ∈ E then for all n ≥ 1, there exists a word w_n over E of length n such that for all m ≥ 1 there exists an m-over-computation for w_n of height at most 3 of value b. Indeed, by Lemma 4. […]
Consider now a ♯-expression e of value a ∈ E for some idempotent E. Without loss of generality, we can choose it of height at most 2|M|, and consider the word u_m = unfold(e, |M|!(m + 1)) for all m ≥ 1. The length of this word is a polynomial α(m), and there is an m-over-computation of height at most 2|M| of value a for this word. Consider now some n ≥ α(m). The word u_m can be written vb for some b ∈ E. We can apply the above claim to b and n + 1 − α(m), yielding the word w_{n+1−α(m)}. Combining the two m-over-computations, we then naturally obtain an m-over-computation for the word v w_{n+1−α(m)} of height at most 2|M| + 3 and of value a, and this word has length n.
Lemma 4.16. There exists a polynomial α such that for all words A_1 … A_k over M↑ and all α(n)-over-computations T for A_1 … A_k of height at most p of value A and all a ∈ A, there exists an n-over-computation of height at most (2|M| + 3)p and value a for some word a_1 … a_k with a_1 ∈ A_1, …, a_k ∈ A_k.
Proof. The proof is by induction on p. We take the polynomial α of Lemma 4.15.
Leaf case, i.e., T = A_1. Let a ∈ A ⊆ A_1, then a is an n-computation of value a ∈ A.
Stabilisation node. T = F[T_1, …, T_k] for some k > α(n) and F ⊆ E for some idempotent E such that the value of T_i is E for all i. Let a ∈ F ⊆ E. According to Lemma 4.15, there exists a word a_1 … a_k over E and an n-over-computation t for a_1 … a_k of value a and height at most 2|M| + 3. We then apply the induction hypothesis for each of a_1, …, a_k with the computations T_1, …, T_k respectively. This yields n-over-computations t_1, …, t_k respectively. We conclude by constructing the n-over-computation obtained by substituting in t the ith leaf with t_i.
Lemma 4.17. Let A_1 … A_k be a word over M↑ and T be an n-under-computation for A_1 … A_k of height p and value A. For all words u = a_1 … a_k with a_1 ∈ A_1, …, a_k ∈ A_k, there exists an n-computation over a_1 … a_k of value a ∈ A and of height at most 3|M|p.
Proof. The proof is by induction on p.
Leaf case, i.e., T = A and u = a_1 ∈ A_1 ⊆ A. Hence a_1 is a computation satisfying the induction hypothesis.
Idempotent node, i.e., T = F[T_1, …, T_k] for k ≤ n where T_1, …, T_k share the same idempotent value E ⊆ F. Let t_1, …, t_k be the n-computations of respective values b_1, …, b_k obtained by applying the induction hypothesis on T_1, …, T_k respectively. Furthermore, according to Theorem 3.3, there exists an n-computation t for the word b_1 … b_k of height at most 3|M|. Let a be the value of t. Since E is an idempotent, it is closed under product and contains b_1, …, b_k. Since furthermore t does not contain any node of stabilisation, we obtain that a ∈ E (by induction on the height of t). We conclude using the n-computation t{t_1, …, t_k} (obtained from t by substituting t_i for the ith leaf of t) which satisfies the induction hypothesis.
Stabilisation node, i.e., T = F[T_1, …, T_k] for k > n where T_1, …, T_k all share the same idempotent value E ⊆ F. Let t_1, …, t_k be the n-computations of respective values b_1, …, b_k obtained by applying the induction hypothesis on T_1, …, T_k respectively. Furthermore, according to Theorem 3.3, there exists an n-computation t for b_1 … b_k of height at most 3|M| and value a. Since E is a sub-stabilisation monoid of M which contains b_1, …, b_k, a

On the role of automata
In this paper we have developed the algebraic and logical aspects of regular cost functions over finite words. More precisely, we have introduced a logic, cost monadic logic, that is suitable for describing functions, and an algebraic notion, stabilisation monoids, that is suitable for recognising functions up to an equivalence relation ≈. We have shown that the logically defined functions can be translated into equivalent ones recognisable by stabilisation monoids. Decision procedures for several problems involving the existence of upper bounds for functions are derived from this translation.
There are several other ways to approach this theory, a very natural one being through automata. The automata-theoretic presentation happens to be closer to the historical developments. Indeed, the study of distance automata [18], and then of nested distance desert automata [26], was the original motivation. Following ideas from [5], it is convenient to consider two dual forms of automata using counters, called B-automata and S-automata. The first model computes a minimum over all runs of the maximal value taken by the counters, and the second computes a maximum over all runs of the minimum value taken by some counters at some identified places in the run. As for regular languages, these automata happen to have the same expressiveness for describing cost functions as stabilisation monoids.
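The "minimum over all runs of the maximal counter value" semantics of the first model can be illustrated by a minimal Python sketch. It is a simplification under stated assumptions, not the formal definition of B-automata: there is a single counter, the actions are `inc`, `reset` and `nop`, and the value of a run is the peak value reached by the counter. The function name `b_value` and the encoding of transitions as tuples are hypothetical choices made for this illustration.

```python
import math

def b_value(word, transitions, initial, final):
    """Minimum over accepting runs of the peak counter value (one counter).

    transitions: list of (source, letter, action, target), where action is
    "inc", "reset" or "nop" (a simplified, assumed action set).
    Returns math.inf when the word has no accepting run.
    """
    # configs maps (state, counter) -> smallest peak over runs reaching it
    configs = {(q, 0): 0 for q in initial}
    for letter in word:
        nxt = {}
        for (q, c), peak in configs.items():
            for (src, a, action, dst) in transitions:
                if src != q or a != letter:
                    continue
                if action == "inc":
                    c2 = c + 1
                elif action == "reset":
                    c2 = 0
                else:
                    c2 = c
                peak2 = max(peak, c2)
                if peak2 < nxt.get((dst, c2), math.inf):
                    nxt[(dst, c2)] = peak2
        configs = nxt
    return min((peak for (q, _), peak in configs.items() if q in final),
               default=math.inf)

# Deterministic example: the longest block of consecutive a's.
MAX_BLOCK = [("q", "a", "inc", "q"), ("q", "b", "reset", "q")]

# Nondeterministic example: the run chooses at the start which letter to count,
# so the minimum over runs is min(number of a's, number of b's).
MIN_COUNT = [("p", "a", "inc", "p"), ("p", "b", "nop", "p"),
             ("q", "a", "nop", "q"), ("q", "b", "inc", "q")]

print(b_value("aabaaab", MAX_BLOCK, {"q"}, {"q"}))          # 3
print(b_value("aabab", MIN_COUNT, {"p", "q"}, {"p", "q"}))  # 2
```

The second example shows where the minimum over runs matters: each run has a different peak, and the automaton is credited with the smallest one.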
Technically, all the necessary material for proving the equivalence between automata and regular cost functions is already present in this paper. Indeed, in one direction, as is classical for regular languages, automata can be seen as a special fragment of cost monadic logic. Thus, automata can only define regular cost functions. For the converse implication, it is easy to construct a B-automaton guessing under-computations, or an S-automaton guessing over-computations, and use it for describing a recognisable cost function. We have seen all the necessary material for establishing the correctness of these approaches.
Despite this strong connection, there are several reasons for not presenting automata in this document.
A first reason is to emphasize the difference with the theory of regular languages. In the case of languages, the simplest way to show the decidability of monadic logic over words is to use automata. This is no longer the case here. Proving the important results concerning B-automata and S-automata (the central one being the equivalence between the two models, called the duality theorem) is more complicated than developing the theory of stabilisation monoids. In fact, the simplest way to prove the duality theorem is to translate the (say) B-automaton into a stabilisation monoid, and only then into an S-automaton (though other techniques are possible). One explanation for this difference between the theory of regular languages and that of regular cost functions is that B-automata and S-automata cannot be determinised. For these reasons stabilisation monoids form a much simpler model.
A second reason is that we could concentrate even more deeply on the model of stabilisation monoids. In particular, we did not only develop stabilisation monoids for obtaining decision procedures (as all works using stabilisation had done so far), but we proved that a suitably axiomatised notion of stabilisation monoid can be used to recognise a cost function independently of the presence of any cost monadic formula or any automaton. This is reminiscent of the proof, in the theory of regular languages of infinite words, that finite Wilke algebras can be translated in a unique way into ω-semigroups. If we were only interested in decidability questions, the paper could be simplified, and the important Theorem 3.4 omitted.
A third reason is that B-automata and S-automata, which may seem of little use in the light of the previous explanations, are in fact so important that they require a deep study of their own. The importance of automata does not stem from the question of decidability of cost monadic logic over words, but over trees (even finite ones) [11]. Indeed, the situation is reminiscent of the case of regular languages of infinite trees. In that case, proving the decidability of monadic logic over infinite trees can only be achieved using infinite tree automata, and the proof also makes use of the ability to determinise automata over infinite words (see, e.g., the survey [45]). The situation is similar here, and what is important in the study of automata is to disclose a suitable variant of the notion of determinism, called history-determinism [6], and to prove that B-automata and S-automata can be made history-deterministic. These considerations diverge completely from the content of this paper.

Example 2.0.3.
The sentence $\forall X\, |X| \le N$ calculates the size of a structure. More formally, $[\![\forall X\, |X| \le N]\!](S)$ equals $|U_S|$. A more interesting example makes use of Example 2.0.1. Again over the signature of digraphs, the cost monadic sentence:

Definition 3.1.
A stabilisation semigroup $\langle S, \cdot, \le, \sharp \rangle$ is a finite ordered semigroup $\langle S, \cdot, \le \rangle$ together with an operator $\sharp : E(S) \to E(S)$ (called the stabilisation) such that:
• for all $e \le f$ in $E(S)$, $e^\sharp \le f^\sharp$;
• for all $a, b \in S$ with $a \cdot b \in E(S)$ and $b \cdot a \in E(S)$, $(a \cdot b)^\sharp = a \cdot (b \cdot a)^\sharp \cdot b$;
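As a small sanity check of the two axioms listed above, the following Python sketch tests them by brute force on a three-element structure. The structure itself is an assumption made for illustration: elements `b` (neutral), `a`, and `0` (absorbing), with order `0 ≤ a` and stabilisation `b ↦ b`, `a ↦ 0`, `0 ↦ 0`. All names (`mul`, `STAB`, `LEQ`, `check_axioms`) are hypothetical, and only the two axioms quoted in this passage are checked.

```python
from itertools import product

# Assumed toy structure: "b" is neutral, "0" is absorbing, "a" is idempotent.
ELEMS = ["b", "a", "0"]

def mul(x, y):
    """Product of the toy structure."""
    if x == "0" or y == "0":
        return "0"
    if x == "b":
        return y
    if y == "b":
        return x
    return "a"

# Assumed stabilisation on idempotents and assumed order (reflexive, plus 0 <= a).
STAB = {"b": "b", "a": "0", "0": "0"}
LEQ = {(x, x) for x in ELEMS} | {("0", "a")}

def idempotents():
    return [e for e in ELEMS if mul(e, e) == e]

def check_axioms():
    """Return (monotone, shift) for the two axioms quoted in the definition."""
    E = idempotents()
    # Axiom 1: e <= f implies stab(e) <= stab(f).
    monotone = all((STAB[e], STAB[f]) in LEQ
                   for e in E for f in E if (e, f) in LEQ)
    # Axiom 2: (a.b)# = a.(b.a)#.b whenever a.b and b.a are idempotent.
    shift = all(STAB[mul(x, y)] == mul(mul(x, STAB[mul(y, x)]), y)
                for x, y in product(ELEMS, repeat=2)
                if mul(x, y) in E and mul(y, x) in E)
    return monotone, shift

print(check_axioms())  # (True, True)
```

Intuitively, in this toy structure `a` stands for "a few occurrences" and its stabilisation `0` for "many occurrences", which is why the stabilisation of `a` falls below `a` in the order.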

Proof.
Let us first prove that $\cdot$ is compatible with $\le$. Assume $a \le a'$ and $b \le b'$, then $(a \cdot b)[a, b]$ is a $0$-under-computation over $ab$ and $(a' \cdot b')[a', b']$ is an $\alpha(0)$-over-computation over the same word $ab$. It follows that $a \cdot b \le a' \cdot b'$.

Let us now prove that $\cdot$ is associative. Let $a$, $b$, $c$ be three elements. Then $((a \cdot b) \cdot c)[(a \cdot b)[a, b], c]$ is a $0$-computation for the word $abc$, and $(a \cdot (b \cdot c))[a, (b \cdot c)[b, c]]$ is an $\alpha(0)$-computation for the same word. It follows that $(a \cdot b) \cdot c \le a \cdot (b \cdot c)$. The other inequality is symmetric.

Let $e$ be an idempotent. The tree $e^\sharp[e, \dots, e]$ with $2\alpha(0)+2$ children is both a $0$-computation and an $\alpha(0)$-computation over the word $e^{2\alpha(0)+2}$. Furthermore, the tree $(e^\sharp \cdot e^\sharp)[e^\sharp[e, \dots, e], e^\sharp[e, \dots, e]]$, in which each inner node has $\alpha(0)+1$ children, is also both a $0$-computation and an $\alpha(0)$-computation for the same word. It follows that $e^\sharp \cdot e^\sharp = e^\sharp$, i.e., that $\sharp$ maps idempotents to idempotents.

Let us show that $e^\sharp \le e$ for all idempotents $e$. The tree $e^\sharp[e, e, e]$ is a $2$-computation over the word $eee$, and $e[e, e, e]$ is a $\max(3, \alpha(2))$-computation over the same word. It follows that $e^\sharp \le e$.

Let us show that stabilisation is compatible with the order. Let $e \le f$ be idempotents. Then $e^\sharp[e, \dots, e]$ with $\alpha(0)+1$ children and $f^\sharp[$ with $\alpha(0)+1$
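$\alpha(0)+1$ times $t_{ba}, \dots, t_{ba}], b]]$. Then both $t_{(a \cdot b)^\sharp}$ and $t_{a \cdot (b \cdot a)^\sharp \cdot b}$ are at the same time $0$-computations and $\alpha(0)$-computations over the same word $(ab)^{\alpha(0)+2}$, of height at most 3. Since their respective values are $(a \cdot b)^\sharp$ and $a \cdot (b \cdot a)^\sharp \cdot b$, it follows by our assumption that $(a \cdot b)^\sharp = a \cdot (b \cdot a)^\sharp \cdot b$.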

Lemma 3.11.
If $f = e \cdot x \cdot e$ for two idempotents $e \mathrel{\mathcal{J}} f$, then $e = f$.

Lemma 3.13.
If $J$ is a stable $\mathcal{J}$-class, then $e^\sharp = e$ for all idempotents $e \in J$.

Proof. Indeed, we have $e^\sharp = e \cdot e^\sharp \cdot e$ and thus by Lemma 3.11, $e^\sharp = e$.

Lemma 3.18.
For every $n$-over-computation of value $b$ over a word $b_1 \dots b_k$ ($k \ge 1$) such that $e \le b_i$ for all $i$, where $e$ is an idempotent, we have $e \le b$.

a 1 ,
. . .,a k be the values of the children of the root of i, read from left to right.By applying Lemma 3.19 on T and the decomposition u 1 , . . ., u k .We construct the α p−1 (n)-overcomputations B 1 , . . ., B k for u 1 , . . ., u k respectively, and of respective values b 1 , . . ., b k , as well as an α p−1 (n)-over-computation B of value b for b 1 , . . ., b k .
The last inequality can have three origins. Either $m > \beta(n)$, or $e_h = e^\sharp$ (recall $(\dagger)$ stating that $x_h \cdot y_h \le e \cdot e_h$) for some $h = 1, \dots, m$, or $e_0 = e^\sharp$. In the first two cases, by Lemma 3.21, $x_1 \cdot z \cdot y_m \le e^\sharp$, and thus $e_0 \cdot x_1 \cdot z \cdot y_m \le e^\sharp$ (since $e_0$ is either $1$, or $e$, or $e^\sharp$). In the third case, $e_0 \cdot x_1 \cdot z \cdot y_m \le e^\sharp$ since $x_1 \cdot z \cdot y_m \le e$.

Gathering the claims C1, C2, C3, we get that $u_1, \dots, u_k$ $n$-evaluate to $b_1, \dots, b_k$, that $b_2 \dots b_{k-1}$ $n$-evaluates to $c = z \cdot c_m$ and that $b_1 \cdot c \cdot b_k \le e$. This is exactly the induction hypothesis for the idempotent node case. If we further gather C4, we get that if the root node of $T$ is a stabilisation node, $b_1 \cdot c \cdot b_k \le e^\sharp$. Once more the induction hypothesis is satisfied.

Fact 4.2.3.
Let $M, h, I$ recognise a cost function $f$ over $A^*$, and let $z$ be a mapping from another alphabet $B$ to $A$. Then $M, h \circ z, I$ recognises $f \circ z$.

Proof. One easily checks that the computations involved in the definition of $[\![M, h, I]\!]^{+}(z(u))$ are exactly the same as the ones involved in the definition of $[\![M, h \circ z, I]\!]^{+}(u)$.

Corollary 4.5.
If $f$ and $g$ are recognisable, then so are $\max(f, g)$ and $\min(f, g)$.

Proof. According to Lemma 4.4, one assumes $f$ recognised by $M, h, I$ and $g$ by $M, h, J$. Then (for a height fixed to at most $p = 3|M|$) one has:

$$
\begin{aligned}
\max\bigl([\![M, h, I]\!]^{+}_{p},\, [\![M, h, J]\!]^{+}_{p}\bigr)(u)
&= \max\begin{cases}
\sup\{n + 1 : \text{there is an } n\text{-computation for } h(u) \text{ of value in } I\}\\
\sup\{n + 1 : \text{there is an } n\text{-computation for } h(u) \text{ of value in } J\}
\end{cases}\\
&= \sup\{n + 1 : \text{there exists an } n\text{-computation over } h(u) \text{ of value in } I \cup J\}\\
&= [\![M, h, I \cup J]\!]^{+}_{p}(u).
\end{aligned}
$$

Thus $\max(f, g)$ is recognised by $M, h, I \cup J$. In a similar way, $\min(f, g)$ is recognised by $M, h, I \cap J$.

4.4. Decidability of the domination relation
We are now ready to establish the decidability of the domination relation.

Theorem 4.6. The domination relation ($\preccurlyeq$) is decidable over recognisable cost functions.

First direction.
Let us suppose $h(A) \cap I \subseteq J$. Of course, every $n$-computation over $h(u)$ for a word $u$ over $h(A)$ has its value in $h(A)$. It follows that (for heights at most $3|M|$):

$$
\begin{aligned}
[\![M, h, I]\!]^{+}(u)
&= \sup\{n + 1 : \text{there is an } n\text{-computation for } h(u) \text{ of value in } I\}\\
&\le \sup\{n + 1 : \text{there is an } n\text{-computation for } h(u) \text{ of value in } J\}\\
&= [\![M, h, J]\!]^{+}(u).
\end{aligned}
$$
Binary node, i.e., $T = A[T_1, T_2]$. Let $B_1$ and $B_2$ be the respective values of $T_1$ and $T_2$. Let $a \in A \subseteq B_1 \cdot B_2$. By definition of the product, there exist $b_1 \in B_1$ and $b_2 \in B_2$ such that $a \le b_1 \cdot b_2$. By induction hypothesis, there exist $n$-under-computations $t_1$ and $t_2$ of respective values $b_1$ and $b_2$. The $n$-under-computation $a[t_1, t_2]$ satisfies the induction hypothesis.

Lemma 4.11.
There exists a polynomial $\alpha$ such that for all words $A_1 \dots A_k$ over $M{\downarrow}$, every $\alpha(n)$-over-computation $T$ for $A_1 \dots A_k$ of height $p$ and value $A$, and all $a_1 \in A_1, \dots, a_k \in A_k$, there exists an $n$-computation over $a_1 \dots a_k$ of value $a \in A$ and of height at most $3|M|p$.

Proof. The proof is by induction on $p$. Set $\alpha(n) = n^{3|M|}$ for all $n$.

Leaf case, i.e., $T = A$ and $u = a_1 \in A_1 \subseteq A$. Hence $a_1$ is a computation satisfying the induction hypothesis.

Binary node, i.e., $T = A[T_1, T_2]$ where $T_1$ and $T_2$ have respective values $B_1$ and $B_2$ such that $B_1 \cdot B_2 \subseteq A$. One applies the induction hypothesis to $T_1$ and $T_2$, and gets computations $t_1$ and $t_2$, of respective values $b_1 \in B_1$ and $b_2 \in B_2$. The induction hypothesis is then fulfilled with the $n$-computation $(b_1 \cdot b$

14, $b \ge c \cdot f \cdot d$ where $c, f, d$ belong to $E$, and $f$ is idempotent. So if $n = 1$, we take $w = b$. If $n = 2$, we take $w = c(f \cdot d)$. Finally, for $n \ge 3$, there is a natural $m$-over-computation of value $c \cdot f \cdot d$ of height 2 or 3 over the word $c f \dots f d$ (with $f$ repeated $n - 2$ times), which is of length $n$.
Binary node, i.e., $T = A[T_1, T_2]$. Let $B_1$ and $B_2$ be the respective values of $T_1$ and $T_2$. Let $a \in A \subseteq B_1 \cdot B_2$. By definition of the product, there exist $b_1 \in B_1$ and $b_2 \in B_2$ such that $a \ge b_1 \cdot b_2$. By induction hypothesis, there exist $n$-computations $t_1$ and $t_2$ of respective values $b_1$ and $b_2$. The $n$-over-computation $a[t_1, t_2]$ satisfies the induction hypothesis.

Idempotent node. $T = F[T_1, \dots, T_k]$ for some $k \le \alpha(n)$ where $F \subseteq E$ for an idempotent $E$ such that the value of $T_i$ is $E$ for all $i$. Let $a \in F \subseteq E$. We have $a \ge b \cdot e \cdot c$ for some $b, c, e \in E$ (Lemma 4.14). We then apply the induction hypothesis for $b, e, \dots, e$ and $c$ on the computations $T_1, \dots, T_{k-1}$ and $T_k$ respectively, yielding the $n$-under-computations $t_1, \dots, t_{k-1}$ and $t_k$ respectively. We conclude by constructing the $n$-under-computation $a[t_1, (e \cdot c)[e[t_2, \dots, t_{k-1}], t_k]]$.
Binary node, i.e., $T = A[T_1, T_2]$ where $T_1$ and $T_2$ have respective values $B_1$ and $B_2$ such that $B_1 \cdot B_2 \subseteq A$. One applies the induction hypothesis to $T_1$ and $T_2$, and gets computations $t_1$ and $t_2$, of respective values $b_1 \in B_1$ and $b_2 \in B_2$. The induction hypothesis is then fulfilled with the $n$-computation $(b_1 \cdot b_2)[t_1, t_2]$ of value $b_1 \cdot b_2 \in A$.
and $|\cdot|_a \not\preccurlyeq |\cdot|_b$, where $a$ and $b$ are distinct letters, $|\cdot|$ is the function mapping each word to its length and $|\cdot|_a$ the function mapping each word to the number of occurrences of the letter $a$ it contains. Indeed we have $|\cdot|_a \le |\cdot|$, but the set of words $a^*$ is a witness that $|\cdot|_a \preccurlyeq_\alpha |\cdot|_b$ cannot hold, whatever $\alpha$ is. Given words $u_1, \dots, u_k \in \{a, b\}^*$, we have $|u_1 \dots u_k|_a \approx_\alpha \max(|K|, \max$

8. In particular, Item 1 is established as Example 4.2.2. Item 2 is achieved in Example 4.2.1. Item 3 is the subject of Fact 4.2.3, Corollary 4.5 and Theorems 4.7 and 4.13. Finally, Item 4 is established in Theorem 4.6.
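Returning to the letter-counting functions $|\cdot|_a$ and $|\cdot|_b$ discussed in this passage, the role of the witness set $a^*$ can be illustrated by a short Python sketch. It assumes that $f \preccurlyeq_\alpha g$ means $f \le \alpha \circ g$ pointwise for a non-decreasing correction function $\alpha$; the helper names `count_a`, `count_b` and `violates` are hypothetical, and the search is bounded, so it can only exhibit counterexamples, never prove domination.

```python
def count_a(word):
    """Number of occurrences of the letter a (the function |.|_a)."""
    return word.count("a")

def count_b(word):
    """Number of occurrences of the letter b (the function |.|_b)."""
    return word.count("b")

def violates(alpha, max_n=1000):
    """Search the family a^0, a^1, ..., a^max_n for a word w with
    count_a(w) > alpha(count_b(w)); return it, or None if none is found."""
    for n in range(max_n + 1):
        w = "a" * n
        if count_a(w) > alpha(count_b(w)):
            return w
    return None

# On a^n the b-count is always 0, so alpha is only ever evaluated at 0:
# any candidate alpha is defeated as soon as n exceeds alpha(0).
witness = violates(lambda n: n * n + 10)
print(len(witness))  # 11
```

By contrast, `count_a(w) <= len(w)` holds for every word, which matches the inequality $|\cdot|_a \le |\cdot|$ stated above.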
and 4.11 below. Let us first state a simple remark on the structure of idempotents.

Lemma 4.8. If $E$ is an idempotent in $M{\downarrow}$, then for all $a \in E$ there exist $b, c, e \in E$ with $e$ idempotent such that $a \le b \cdot e \cdot c$.

$E$, for all $n \ge 1$, there exist $a_1, \dots, a_n \in E$ such that $a \le a_1 \cdots a_n$. Hence using Ramsey's theorem, for $n$ sufficiently large, there exist 1

Lemma 4.14. If $E$ is an idempotent in $M{\uparrow}$, then for all $a \in E$ there exist $b, c, e \in E$ with $e$ idempotent such that $a \ge b \cdot e \cdot c$.
Proof. As $E = E \cdots E$, for all $n$, there exist $a_1, \dots, a_n \in E$ such that $a \ge a_1 \cdots a_n$. Using Ramsey's theorem, for $n$ sufficiently large, there exist 1