Tameness and the power of programs over monoids in DA

The program-over-monoid model of computation originates with Barrington's proof that the model captures the complexity class $\mathsf{NC^1}$. Here we make progress in understanding the subtleties of the model. First, we identify a new tameness condition on a class of monoids that entails a natural characterization of the regular languages recognizable by programs over monoids from the class. Second, we prove that the class known as $\mathbf{DA}$ satisfies tameness and hence that the regular languages recognized by programs over monoids in $\mathbf{DA}$ are precisely those recognizable in the classical sense by morphisms from $\mathbf{QDA}$. Third, we show by contrast that the well studied class of monoids called $\mathbf{J}$ is not tame. Finally, we exhibit a program-length-based hierarchy within the class of languages recognized by programs over monoids from $\mathbf{DA}$.


Introduction
A program of range n on alphabet Σ over a finite monoid M is a sequence of pairs (i, f) where 1 ≤ i ≤ n and f : Σ → M is a function. This program assigns to each word w_1 w_2 · · · w_n the monoid element obtained by multiplying out in M the elements f(w_i), one per pair (i, f), in the order of the sequence. When an accepting set F ⊆ M is specified, the program naturally defines the language L_n of words of length n assigned an element in F. A program sequence (P_n)_{n∈N} then defines the language formed by the union of the L_n.
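To make the model concrete, here is a minimal evaluator for programs over a monoid, sketched in Python; the monoid, instruction sequence and accepting set below are our own illustrative choices (a parity check over Z/2), not an example from the paper.

```python
def run_program(program, word, mult, identity):
    """Evaluate a program (a list of instructions (i, f)) on `word`.

    `mult` is the monoid multiplication, `identity` its identity element.
    Positions i are 1-based, as in the definition above."""
    result = identity
    for i, f in program:
        result = mult(result, f[word[i - 1]])
    return result

# Illustrative monoid: Z/2 under addition (identity 0).
add_mod2 = lambda x, y: (x + y) % 2

def parity_program(n):
    # One instruction per position; each maps a -> 1 and b -> 0.
    return [(i, {"a": 1, "b": 0}) for i in range(1, n + 1)]

# With accepting set {0}, this p-recognizes the words with an even number of a's.
assert run_program(parity_program(3), "aab", add_mod2, 0) == 0  # accepted
assert run_program(parity_program(2), "ab", add_mod2, 0) == 1   # rejected
```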
A program over M is a generalization of a morphism from Σ* to M, and recognition by a morphism equates with acceptance by a finite automaton. Moving from morphisms to programs has a significant impact on the expressive power, as shown by the seminal result of Barrington [Bar89] that polynomial-length program sequences over the group S_5 capture the complexity class NC^1 (of languages accepted by bounded fan-in Boolean circuits of logarithmic depth).
Barrington's result was followed by several results strengthening the correspondence between circuit complexity and programs over finite monoids. The classes AC^0 ⊂ ACC^0 ⊆ NC^1 were characterized by polynomial-length programs over the aperiodic, the solvable, and all monoids respectively [Bar89, BT88]. More generally, for any variety V of finite monoids one can define the class P(V) of languages recognized by polynomial-length programs over a monoid drawn from V. In particular, if A is the variety of finite aperiodic monoids, then P(A) characterizes the complexity class AC^0 [BT88]. It was further observed that, in a formal sense, understanding the regular languages of P(V) is sufficient to understand the expressive power of P(V) (see [MPT91], but also [Str94] for a logical point of view).
In view of the above results, it is plausible that methods from algebraic automata theory could help separate complexity classes within NC^1. But although partial results in restricted settings were obtained, no breakthrough has been achieved this way.
The reason, of course, is that programs are much more complicated than morphisms: a program can read the letter at an input position more than once, in non-left-to-right order, possibly assigning a different monoid element each time. This complication can be illustrated with the following example. Consider the variety of finite monoids known as J. This is the variety generated by the syntactic monoids of all languages defined by the presence or absence of certain subwords, where u is a subword of v if u can be obtained from v by deleting letters [Sim75]. One deduces that monoids in J are unable to morphism-recognize the language defined by the regular expression (a + b)*ac+. Yet a sequence of programs over a monoid in J recognizes (a + b)*ac+ by the following trick. Consider the language L of all words having ca as a subword but having as subwords neither cca, caa nor cb. Being defined by the occurrence of subwords, L is recognized by a morphism ϕ : {a, b, c}* → M where M ∈ J, i.e., for this ϕ there is an F ⊆ M such that L = ϕ^{-1}(F). Here is the trick: the program of range n over M given by the sequence of instructions (2, ϕ), (1, ϕ), (3, ϕ), (2, ϕ), (4, ϕ), (3, ϕ), (5, ϕ), (4, ϕ), . . ., (n, ϕ), (n − 1, ϕ), using F as accepting set, defines the set of words of length n in (a + b)*ac+. For instance, on input abacc the program outputs ϕ(baabcacc), which is in F, while on inputs abbcc and abacca the program outputs respectively ϕ(babbcbcc) and ϕ(baabcaccac), which are not in F. (See [Gro20, Lemma 4.1] for a full proof of the fact that (a + b)*ac+ ∈ P(J).)

The first part of our paper addresses the question of what the regular languages in P(V) are. As mentioned above, this is the key to understanding the expressive power of P(V).
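Returning to the program over J above: its output on w is ϕ applied to the shuffled word w_2 w_1 w_3 w_2 · · · w_n w_{n−1}, so its behavior can be replayed by testing the subword conditions defining L directly on that shuffled word. The following brute-force check (our own sketch; function names are ours) confirms the claimed agreement with (a + b)*ac+ on all short inputs.

```python
import re
from itertools import product

def is_subword(u, v):
    """True if u can be obtained from v by deleting letters."""
    it = iter(v)
    return all(c in it for c in u)

def zigzag(w):
    """The word read off by the instructions (2,phi),(1,phi),...,(n,phi),(n-1,phi)."""
    return "".join(w[i] + w[i - 1] for i in range(1, len(w)))

def program_accepts(w):
    """Membership of the zigzag word in L: ca as a subword, but none of cca, caa, cb."""
    u = zigzag(w)
    return (is_subword("ca", u)
            and not any(is_subword(x, u) for x in ("cca", "caa", "cb")))

# Agreement with (a+b)*ac+ on all words of length 2 to 5.
target = re.compile(r"[ab]*ac+\Z")
for n in range(2, 6):
    for w in map("".join, product("abc", repeat=n)):
        assert program_accepts(w) == bool(target.match(w))
```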
Observe first that the class L(V) of all languages recognized by a morphism into a monoid in V is trivially included in P(V). It turns out that P(V) always contains more regular languages. For instance, because a program instruction (i, f) operating on a word w is "aware" of i, the program can have a behavior depending on some arithmetic properties of i. Moreover, a program's behavior can in general depend on the length of the input words it treats. So in particular, as far as regular languages are concerned, a program for a given input length can take into account the length of the input modulo some fixed number k in its acceptance set, and each program instruction (i, f) can depend on the value of i modulo k. This can be formalized by assuming without loss of generality that membership can depend on the length of the word w at hand modulo a fixed number k and that each letter in w is tagged with its position modulo k. Regular languages recognized this way are exactly the languages recognized by a stamp (a surjective morphism from Σ* to M, with Σ an alphabet and M a finite monoid) in the variety of stamps V * Mod, where * is the wreath product of stamps and Mod the variety of cyclic stamps into groups. In other words, L(V * Mod) is always included in P(V).
A program over a monoid can also recognize regular languages by changing its behavior arbitrarily depending on bounded-length prefixes and suffixes. To formalize this, we introduce the class EV of stamps that, modulo the beginning and the end of a word, behave essentially like stamps into monoids from V. It is then not too hard to show that L(EV * Mod) is always included in P(V) when V contains non-trivial monoids. Many varieties V are such that P(V) cannot recognize more regular languages than those in L(EV * Mod). This is for example the case for the variety DA, as we will see below.
Our first result characterizes those varieties V having the property that P(V) does not contain "many more" regular languages than does L(EV * Mod). To this end we introduce the notion of tameness for a variety of finite monoids V (Definition 3.9), and our first result shows that a variety of finite monoids V is tame if and only if P(V) ∩ Reg ⊆ L(QEV).
Here, L(QV) is the class of regular languages recognized by stamps in quasi-V. A stamp ϕ from Σ* to M is in quasi-V if, though M might not be in V, its stable monoid induced by ϕ is in V, i.e., there is a number k such that ϕ((Σ^k)*) forms a submonoid of M which is in V.
For tame varieties V we do not know when the inclusion of L(EV * Mod) in L(QEV) is strict. In particular, we do not know when the inclusion in our result is an equality. As usual in this context, we conjecture that equality holds at least for local varieties V.
Our notion of a tame variety differs subtly but fundamentally from the notion of p-variety (program-variety). This notion goes back to Péladeau [Pél90] and can be stated by saying that a variety of finite monoids V is a p-variety whenever any monoid that can be "simulated" by programs over a monoid in V belongs itself to V. Equivalently, V is a p-variety whenever any regular language in P(V) with a neutral letter (a letter which can be inserted and deleted arbitrarily in words without changing their membership in the language) is in fact morphism-recognized by a monoid in V. (The equivalence between the two definitions is claimed without a proof in [Tes03] and [Gro18]; see [PST97] for a proof in one direction, the other direction requiring a standard argument.) While understanding the neutral letter regular languages in P(V) for V ranging over all possible varieties of finite monoids would suffice to solve most open questions about the internal structure of NC^1, for V to be a p-variety does not imply a precise characterization of all the regular languages in P(V). It can be proved that if V is a p-variety then the regular languages in P(V) are all in L(QLV), where LV is the inclusion-wise largest variety of finite semigroups containing all monoids in V and only those monoids. For instance, DA is a p-variety [LTT06], and this implies that P(DA) ∩ Reg ⊆ L(QLDA) as explained above. The latter inclusion is strict, and the correct characterization, namely L(QEDA), requires proving that DA is also tame in our sense. Furthermore, there exist p-varieties for which unexpected (and interesting) things happen when considering program-recognition of regular languages without neutral letter, and this precisely because they aren't tame. For example, P(J) ∩ Reg ⊆ L(QLJ) with strict inclusion, while it will follow from our result that P(J) ∩ Reg ⊈ L(QEJ) (knowing that it is easy to check that L(QEJ) ⊆ L(QLJ)).
The situation for programs over finite semigroups of the form V * D, where V is a variety of finite monoids and D is the variety of finite definite (or right trivial) semigroups, turns out to be much simpler. Indeed, with the necessary adaptations to the notion of p-variety, Péladeau, Straubing and Thérien [PST97] could show that for any p-variety of the form V * D we have P(V * D) ∩ Reg = L(Q(V * D)). Once our notion of tameness is adapted for finite semigroups, it is possible to show that any p-variety of the form V * D is tame. Hence the result of [PST97] mentioned above follows from our result, as for varieties of the form V * D we have, abusing notation, that E(V * D) = V * D. It is to be noted that programs over semigroups in V * D correspond to Straubing's k-programs over monoids in V [Str00, Str01]. It is also possible to prove that regular languages recognized by monoids from a k-program variety V, as for p-varieties, are all in L(QLV). Interestingly, in order to get a tight characterization for the regular languages recognized by k-programs over commutative monoids in [Str01], Straubing determines not only which monoids can be simulated by such k-programs, but, in our terms, which stable stamps those k-programs can "simulate". Our notion of tameness also builds upon stable stamps and, as we have advocated above, subsumes previous definitions of "good behavior" of programs with respect to recognition of regular languages.
Tameness as defined here is also a proper extension of the notion of sp-varieties of monoids (Definition 3.3), a concept introduced in [GMS17] as our initial attempt to capture the expected behavior of programs over small varieties. We will for instance see that the variety of finite commutative monoids is tame but not an sp-variety.
Showing that a variety is tame can be a difficult task. For instance, showing that the variety A is tame amounts to showing that the regular languages in AC^0 are in L(QA), which, as shown by Barrington, Compton, Straubing and Thérien [BCST92], follows from the fact that modulo counting cannot be done in AC^0 (the famous result initially proven by Furst, Saxe and Sipser [FSS84] and independently by Ajtai [Ajt83]). Similarly, much of the structure of NC^1 would in fact be resolved by showing the tameness of certain varieties (see [MPT91, Corollary 4.13], [Str94, Conjecture IX.3.4]).
The present work is motivated by the need to better understand the subtle behaviors of polynomial-length programs over monoids. We focus in this paper on the variety of monoids DA. The importance of DA in algebraic automata theory and its connections with other fields are well established (see [TT02b] for an eloquent testimony). In particular, P(DA) corresponds to languages accepted by decision trees of bounded rank [GT03]. It is also known that regular languages with a neutral letter that are in P(DA) are also in L(DA) [LTT06].
Our second result shows that the variety DA is tame. As it is easy to see that DA is powerful enough to describe prefixes and suffixes of words up to some bounded length, we get that L(EDA) = L(DA). Moreover, because DA is a local variety, QDA = DA * Mod [DP13]. Altogether, the tameness of DA implies that the regular languages in P(DA) are precisely the languages in L(QDA).
Our third result is that, on the other hand, the variety of finite monoids J is not tame, as witnessed by the regular language (a + b)*ac+ discussed above, which is in P(J) but not in L(QEJ). Characterizing the regular languages in P(J) remains an open problem, partially solved in [Gro20].
Our final result concerns P(DA). With C_k the class of languages recognized by programs of length O(n^k) over DA, we prove that the classes C_k form a strict hierarchy. We also relate this hierarchy to another algebraic characterization of DA and exhibit conditions on M ∈ DA under which any program over M can be rewritten as an equivalent subprogram (made of a subsequence of the original sequence of instructions) of length O(n^k), refining a result by Tesson and Thérien [TT02a].
Organization of the paper. In Section 2 we define programs over monoids, p-recognition by such programs and the necessary algebraic background. The definition of tameness for a variety V is given in Section 3, with our first result showing that regular languages in P(V) are included in L(QEV) if and only if V is tame; we also briefly discuss the case of J, which isn't tame. We show that DA is tame in Section 4. Finally, Section 5 contains the hierarchy results about P(DA).

Preliminaries
This section is dedicated to the introduction of the mathematical material used throughout this paper. Concerning algebraic automata theory, we only quickly review the basics and refer the reader to the two classical references of the domain by Eilenberg [Eil74, Eil76] and Pin [Pin86].
General notations. Let i, j ∈ N be two natural numbers. We shall denote by [[i, j]] the set of all natural numbers n ∈ N verifying i ≤ n ≤ j. We shall also denote by [i] the set [[1, i]].

Words and languages. Let Σ be a finite alphabet. We denote by Σ* the set of all finite words over Σ. We also denote by Σ+ the set of all finite non-empty words over Σ, the empty word being denoted by ε. Since all our alphabets and words in this article are finite, we shall not mention it anymore from here on. Given some word w ∈ Σ*, we denote its length by |w| and, for any a ∈ Σ, by |w|_a the number of occurrences of the letter a in w. A language over Σ is a subset of Σ*. A language is regular if it can be defined by a regular expression. Given a language L, its syntactic congruence ∼_L is the relation on Σ* relating two words u and v whenever for all x, y ∈ Σ*, xuy ∈ L if and only if xvy ∈ L. It is easy to check that ∼_L is an equivalence relation and a congruence for concatenation. The syntactic morphism of L is the mapping sending any word u to its equivalence class in the syntactic congruence.
The quotient of a language L over Σ relative to the words u and v is the language, denoted by u^{-1}Lv^{-1}, of the words w such that uwv ∈ L.

Monoids, semigroups and varieties.
A semigroup is a non-empty set equipped with an associative law that we will write multiplicatively. A monoid is a semigroup with an identity. An example of a semigroup is Σ+, the free semigroup over Σ. Similarly, Σ* is the free monoid over Σ. With the exception of free monoids and semigroups, all monoids and semigroups considered here are finite. A morphism ϕ from a semigroup S to a semigroup T is a function from S to T such that ϕ(xy) = ϕ(x)ϕ(y) for all x, y ∈ S. A morphism of monoids additionally requires that the identity is preserved; unless otherwise stated, when we say "morphism", we always mean "monoid morphism". Any morphism ϕ : Σ* → M, for Σ an alphabet and M some monoid, is uniquely determined by the images of the letters of Σ under ϕ. A semigroup T is a subsemigroup of a semigroup S if T is a subset of S equipped with the restricted law of S; the notion of submonoid additionally requires the presence of the identity. A semigroup T divides a semigroup S if T is the image by a semigroup morphism of a subsemigroup of S. Division of monoids is defined in the same way by replacing every occurrence of "semigroup" by "monoid". The Cartesian (or direct) product of two semigroups is simply the semigroup given by the Cartesian product of the two underlying sets equipped with the componentwise law.
A language L over Σ is recognized by a monoid M if there is a morphism h from Σ* to M and a subset F of M such that L = h^{-1}(F). We also say that the morphism h recognizes L.
It is well known that a language is regular if and only if it is recognized by a finite monoid. Actually, as ∼_L is a congruence, the quotient Σ*/∼_L is a monoid, called the syntactic monoid of L, that recognizes L via the syntactic morphism of L. The syntactic monoid of L is finite if and only if L is regular. The quotient Σ+/∼_L is analogously called the syntactic semigroup of L.
A variety of finite monoids is a non-empty class of finite monoids closed under Cartesian product and monoid division. A variety of finite semigroups is defined similarly. When dealing with varieties, we consider only varieties of finite monoids or semigroups, so we will drop the adjective "finite" when talking about those.
An element s of a semigroup is idempotent if ss = s. For any finite semigroup S there is a smallest positive number, called the idempotent power of S and often denoted by ω, such that for every element s ∈ S, s^ω is idempotent.
A general result of Reiterman [Rei82] states that each variety of monoids (or semigroups) can be defined as the class of all finite monoids satisfying some set of identities, for an appropriate notion of identity. In our case, we only use a restricted version of this notion, understanding an identity as a formal equality of terms built from variables using products and ω-powers. A finite monoid is then said to satisfy such an identity whenever the equality holds for any assignment of the variables to elements of the monoid, interpreting the ω-power as the idempotent power of that monoid. For instance, the variety of finite aperiodic monoids A, known as the variety of "group-free" finite monoids (i.e., those that do not contain any non-trivial group as a subsemigroup), is defined by the identity x^ω = x^{ω+1}. The variety of monoids DA is defined by the identity (xy)^ω = (xy)^ω x (xy)^ω. The variety of monoids J is defined by the identities (xy)^ω = (xy)^ω x = y(xy)^ω. One easily deduces that J ⊆ DA ⊆ A.
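These identities can be checked mechanically once a monoid is given explicitly. The following sketch (our own illustration) computes the idempotent power ω and tests the aperiodicity identity x^ω = x^{ω+1} on two two-element monoids: the group Z/2, which fails it, and U_1 = ({0, 1}, ·), which satisfies it.

```python
def power(mult, x, k):
    """x^k in the monoid (k >= 1)."""
    r = x
    for _ in range(k - 1):
        r = mult(r, x)
    return r

def idempotent_power(elements, mult):
    """Smallest omega >= 1 such that x^omega is idempotent for every x."""
    omega = 1
    while not all(mult(power(mult, x, omega), power(mult, x, omega))
                  == power(mult, x, omega) for x in elements):
        omega += 1
    return omega

def is_aperiodic(elements, mult):
    """Check the identity x^omega = x^(omega+1) defining A."""
    omega = idempotent_power(elements, mult)
    return all(power(mult, x, omega) == power(mult, x, omega + 1)
               for x in elements)

z2 = ([0, 1], lambda x, y: (x + y) % 2)   # the group Z/2 (not aperiodic)
u1 = ([0, 1], lambda x, y: x * y)         # U_1: {0, 1} under multiplication

assert idempotent_power(*z2) == 2         # x^2 = 0 for every x in Z/2
assert not is_aperiodic(*z2)              # 1^2 = 0 but 1^3 = 1
assert is_aperiodic(*u1)                  # here omega = 1 already works
```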

Varieties of languages.
A variety of languages is a class of languages over arbitrary alphabets closed under Boolean operations, quotients and inverses of morphisms (i.e., if L is a language in the class over an alphabet Σ, if Γ is some other alphabet and ϕ : Γ* → Σ* is a morphism, then ϕ^{-1}(L) is also in the class).
Eilenberg showed [Eil76, Chapter VII, Section 3] that there is a bijective correspondence between varieties of monoids and varieties of languages: to each variety of monoids V we associate L(V), the variety of languages whose syntactic monoids belong to V, and, conversely, to each variety of languages V we associate M(V), the variety of monoids generated by the syntactic monoids of the languages of V; these correspondences are mutually inverse.
When V is a variety of semigroups, we will denote by L(V) the class of languages whose syntactic semigroup belongs to V. There is also an Eilenberg-type correspondence for an appropriate notion of language varieties, namely ne-varieties (non-erasing varieties) of languages, but we won't present it here. (The interested reader may have a look at [Str02] as well as [PS05, Lemma 6.3].)

Quasi and locally V languages, modular counting and predecessor. If S is a semigroup, we denote by S^1 the monoid S if S is already a monoid, and S ∪ {1} otherwise.
The following definitions are taken from [PS05, CPS06b]. Let ϕ be a surjective morphism from Σ*, for Σ some alphabet, to a finite monoid M: such a morphism is called a stamp. For each k, consider the subset ϕ(Σ^k) of M. As M is finite, there is a k such that ϕ(Σ^{2k}) = ϕ(Σ^k), which implies that ϕ(Σ^k) is a semigroup. The semigroup given by the smallest such k is called the stable semigroup of ϕ, and this k is called the stability index of ϕ. If 1 is the identity of M, then ϕ(Σ^k) ∪ {1} is called the stable monoid of ϕ. If V is a variety of monoids, then we shall denote by QV the class of stamps whose stable monoid is in V, and by L(QV) the class of languages whose syntactic morphism is in QV.
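These definitions are directly computable. Below is a small sketch of ours computing the stability index of the stamp from {a, b}* onto (Z/2, +) that sends both letters to 1, i.e., the syntactic morphism of the even-length words: its stable monoid is trivial even though Z/2 itself is a group, which is precisely the phenomenon the quasi-V classes capture.

```python
def image_of_length_k(letter_images, mult, k):
    """phi(Sigma^k): the set of products of k letter images, k >= 1."""
    current = set(letter_images)
    for _ in range(k - 1):
        current = {mult(x, y) for x in current for y in letter_images}
    return current

def stability_index(letter_images, mult):
    """Smallest k such that phi(Sigma^{2k}) = phi(Sigma^k)."""
    k = 1
    while (image_of_length_k(letter_images, mult, 2 * k)
           != image_of_length_k(letter_images, mult, k)):
        k += 1
    return k

# Stamp onto (Z/2, +) sending both letters a and b to 1 (length parity).
add_mod2 = lambda x, y: (x + y) % 2
letter_images = [1, 1]

k = stability_index(letter_images, add_mod2)
assert k == 2
# Stable semigroup phi(Sigma^2) = {0}; the stable monoid {0} ∪ {0} = {0}
# is the trivial monoid (0 is the identity), so this stamp is in QV for
# every variety V, although its target monoid Z/2 is a non-trivial group.
assert image_of_length_k(letter_images, add_mod2, k) == {0}
```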
For V a variety of monoids, we say that a finite semigroup S is locally V if, for every idempotent e of S, the monoid eSe belongs to V; we denote by LV the class of locally-V finite semigroups, which happens to be a variety of semigroups.
We now define languages recognized by V * Mod and V * D. We do not use the standard algebraic definition via the wreath product, as we won't need it, but instead give a characterization of the languages recognized by such algebraic objects [CPS06a, Til87].
Let V be a variety of monoids. We say that a language over Σ is in L(V * Mod) if it is obtained by a finite combination of unions and intersections of, on the one hand, languages over Σ for which membership of each word only depends on its length modulo some integer k ∈ N_{>0} and, on the other hand, languages L over Σ for which there are a number k ∈ N_{>0} and a language L′ over Σ × {0, . . ., k − 1} whose syntactic monoid is in V, such that L is the set of words w that belong to L′ after adding to each letter of w its position modulo k. Observe that neither V * Mod nor QV are varieties of monoids or semigroups, but classes of stamps that happen to be varieties of stamps of a certain kind, which we won't introduce.
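The tagging of letters with their positions modulo k is easy to make explicit. As an illustration of ours (not an example from the paper), with k = 2 the language (ab)* decomposes as a length condition modulo 2 together with a tagged language whose membership is a letterwise test:

```python
def tag_mod(w, k):
    """Attach to each letter its (0-based) position modulo k."""
    return [(a, i % k) for i, a in enumerate(w)]

def in_tagged_language(tagged):
    # A letterwise test over the alphabet {a, b} x {0, 1}: it is
    # recognized by the two-element aperiodic monoid U_1.
    return all(pair in {("a", 0), ("b", 1)} for pair in tagged)

def in_ab_star(w):
    # (ab)* = {w : tagged word in the tagged language} ∩ {|w| even}
    return in_tagged_language(tag_mod(w, 2)) and len(w) % 2 == 0

assert in_ab_star("") and in_ab_star("abab")
assert not in_ab_star("aba")     # odd length
assert not in_ab_star("ba")      # letters at the wrong positions
```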
Similarly, we say that a language over Σ is in L(V * D) if it is obtained by a finite combination of unions and intersections of, on the one hand, languages over Σ for which membership of each word only depends on its last k letters for some k ∈ N and, on the other hand, languages L over Σ for which there are a number k ∈ N and a language L′ over Σ × Σ^{≤k} (where Σ^{≤k} denotes all words over Σ of length at most k) whose syntactic monoid is in V, such that L is the set of words w that belong to L′ after adding to each letter of w the word composed of the k letters preceding that letter (or fewer near the beginning of w). The variety of semigroups V * D can then be defined as the one generated by the syntactic semigroups of the languages in L(V * D) as defined above.
A variety of monoids V is said to be local if L(V * D) = L(LV). This is not the usual definition of locality, which is phrased in terms of categories, but it is equivalent to it [Til87, Theorem 17.3]. One of the consequences of locality that we will use is that L(V * Mod) = L(QV) when V is local [DP14, Corollary 18], while in general only L(V * Mod) ⊆ L(QV) holds (see [Dar14, Pap14]).
Programs over varieties of monoids. Programs over monoids form a non-uniform model of computation, first defined by Barrington and Thérien [BT88], extending Barrington's permutation branching program model [Bar89]. Let M be a finite monoid and Σ an alphabet. A program P over M is a finite sequence of instructions of the form (i, f) where i is a positive integer and f a function from Σ to M. The length of P is the number of its instructions. A program has range n if all its instructions (i, f) verify 1 ≤ i ≤ n. A program P of range n defines a function from Σ^n, the words of length n, to M as follows. On input w ∈ Σ^n, with w = w_1 · · · w_n, each instruction (i, f) outputs the monoid element f(w_i). A sequence of instructions then yields a sequence of elements of M, and their product is the output P(w) of the program. The only program of range 0, the empty one, always outputs the identity of M.
A language L over Σ is p-recognized by a sequence of programs (P_n)_{n∈N} if for each n, P_n has range n and length polynomial in n and recognizes L ∩ Σ^n, that is, there exists a subset F_n of M such that L ∩ Σ^n is precisely the set of words w of length n such that P_n(w) ∈ F_n. In that case, we also say that L is p-recognized by M. We denote by P(M) the class of languages p-recognized by a sequence of programs (P_n)_{n∈N} over M. If V is a variety of monoids, we denote by P(V) the union of all P(M) for M ∈ V.
The following is a simple fact about P(V). Let Σ and Γ be two alphabets and µ : Σ* → Γ* be a morphism. We say that µ is length-multiplying, or that µ is an lm-morphism, if there is a constant k such that for all a ∈ Σ, the length of µ(a) is k.
Lemma 2.1 ([MPT91, Corollary 3.5]). For V any variety of monoids, P(V) is closed under Boolean operations, quotients and inverse images of lm-morphisms.
Given two range-n programs P, P′ over some monoid M using the same input alphabet Σ, we shall say that P′ is a subprogram, a prefix or a suffix of P whenever P′ is, respectively, a subword, a prefix or a suffix of P, looking at P and P′ as words over [n] × M^Σ.

General results about regular languages and programs
Let V be a variety of monoids. By definition, any regular language recognized by a monoid in V is p-recognized by a sequence of programs over a monoid in V. Actually, since in a program over some monoid in V the monoid element output by each instruction can depend on the position of the letter read, hence in particular on its position modulo some fixed number, it is easy to see that any regular language in L(V * Mod) is p-recognized by a sequence of programs over some monoid in V. We will see in Section 3.2 that programs over some monoid in V can also p-recognize the regular languages that are "essentially V", i.e., that differ from a language in L(V) only on the prefix and suffix of the words.
In this section we characterize those varieties V such that programs over monoids in V do not recognize more regular languages than those mentioned above.
We first recall the definitions and results around p-varieties developed by Péladeau, Tesson, Straubing and Thérien, and then present the definition of sp-varieties that was inspired by their work and studied in the conference version of the present paper. In order to deal with the limitation of sp-varieties, we then define the notion of essentially-V that will be the last ingredient for our definition of tameness. We then provide an upper bound on the regular languages that can be p-recognized by a sequence of programs over a monoid from a tame variety V.
3.1. p- and sp-varieties of monoids. We first recall the definition of p-varieties. These seem to have been originally defined by Péladeau in his Ph.D. thesis [Pél90] and later used by Tesson in his own Ph.D. thesis [Tes03]. The notion of a p-variety has also been defined for semigroups by Péladeau, Straubing and Thérien in [PST97].
Let µ be a morphism from Σ* to a finite monoid M. We denote by W(µ) the set of languages L over Σ such that L = µ^{-1}(F) for some subset F of M. Given a semigroup S, there is a unique morphism η_S : S* → S^1 extending the identity on S, called the evaluation morphism of S. We write W(S) for W(η_S); we define W(M) similarly for any monoid M. It is easy to see that if M ∈ V then W(M) ⊆ P(V). The condition of being a p-variety requires a converse of this observation.

Definition 3.1. A p-variety of monoids is a variety V of monoids such that for any finite monoid M, if W(M) ⊆ P(V) then M ∈ V.
The following result illustrates an important property of p-varieties, when the notion is adapted to varieties of semigroups accordingly.

Proposition 3.2 ([PST97]). Let V * D be a p-variety of semigroups, where V is a variety of monoids. Then P(V * D) ∩ Reg = L(V * D * Mod) (where the latter class is defined in the same way as L(V * Mod)).
It is known that J is a p-variety of monoids [Tes03], but as we have seen in the introduction, P(J) contains languages that are more complicated than those in L(J * Mod) (see the end of this subsection for a proof). In order to capture those varieties for which programs are well behaved, we need a restriction of p-varieties, and this brings us to the following definition.
Definition 3.3. An sp-variety of monoids is a variety V of monoids such that for any finite semigroup S, if W(S) ⊆ P(V) then S^1 ∈ V.
Hence any sp-variety of monoids is also a p-variety of monoids, but the converse is not always true: we will see in Proposition 3.6 below that J is not an sp-variety.
An example of an sp-variety of monoids is the class of aperiodic monoids A. This is a consequence of the result that for any number k > 1, checking whether |w|_a is a multiple of k for w ∈ {a, b}* cannot be done in AC^0 = P(A) [FSS84, Ajt83] (we shall denote the corresponding language over the alphabet {0, 1} by MOD_k). Towards a contradiction, assume there exists a semigroup S such that S^1 is not aperiodic but still W(S) ⊆ P(A). Then there is an x in S such that x^ω ≠ x^{ω+1}. Consider the morphism µ : {a, b}* → S^1 sending a to x^{ω+1} and b to x^ω, and the language L = µ^{-1}(x^ω). It is easy to see that L is the language of all words with a number of a's congruent to 0 modulo k, where k is the smallest number such that x^{ω+k} = x^ω. As x^ω ≠ x^{ω+1}, we have k > 1, so that L ∉ P(A) by [FSS84, Ajt83]. Let η_S : S* → S^1 be the evaluation morphism of S. The morphism ϕ : {a, b}* → S* sending each letter to its image under µ (viewed as a length-one word over the alphabet S) verifies that µ = η_S ∘ ϕ. From W(S) ⊆ P(A) it follows that η_S^{-1}(x^ω) ∈ P(A); since ϕ sends each letter to a letter of S, it is an lm-morphism, and as P(A) is closed under inverses of lm-morphisms by Lemma 2.1, we have L = ϕ^{-1}(η_S^{-1}(x^ω)) ∈ P(A), a contradiction.

The following is the desired consequence of being an sp-variety of monoids.
Proposition 3.4. Let V be an sp-variety of monoids. Then P(V) ∩ Reg ⊆ L(QV).

Proof. Let L be a regular language in P(M) for some M ∈ V. Let M_L be the syntactic monoid of L and η_L its syntactic morphism. Let S be the stable semigroup of η_L; in particular, S = η_L(Σ^k) for some k. We wish to show that S^1 is in V.
We show that W(S) ⊆ P(V) and conclude from the fact that V is an sp-variety that S^1 ∈ V, as desired. Let η_S : S* → S^1 be the evaluation morphism of S. Consider m ∈ S and let L′ = η_S^{-1}(m). We wish to show that L′ ∈ P(V); this implies that W(S) ⊆ P(V) by closure under union (Lemma 2.1).

Let L″ = η_L^{-1}(m). Since m belongs to the syntactic monoid of L and η_L is the syntactic morphism of L, a classical algebraic argument [Pin86, Chapter 2, proof of Lemma 2.6] shows that L″ is a Boolean combination of quotients of L. By Lemma 2.1, we conclude that L″ ∈ P(V).

By definition of S, for any element s of S there is a word u_s of length k such that η_L(u_s) = s. Notice that this is precisely where we need to work with S and not S^1.

Let f : S* → Σ* be the lm-morphism sending s to u_s and notice that L′ = f^{-1}(L″). The result follows by closure of P(V) under inverse images of lm-morphisms (Lemma 2.1).
We don't know whether it is always true that for sp-varieties of monoids V, L(QV) is included in P(V). But we can prove it for local varieties.

Proposition 3.5. Let V be a local sp-variety of monoids. Then P(V) ∩ Reg = L(QV).

Proof. This follows from the fact that for local varieties L(QV) = L(V * Mod) (see [DP14]). The result can then be derived using Proposition 3.4, as we always have L(V * Mod) ⊆ P(V).
As A is local [Til87, Example 15.5] and an sp-variety, it follows from Proposition 3.5 that the regular languages in P(A), hence in AC^0, are precisely those in L(QA), which is the characterization of the regular languages in AC^0 obtained by Barrington, Compton, Straubing and Thérien [BCST92].
We will see in the next section that DA is an sp-variety. As it is also local [Alm96], we get from Proposition 3.5 that the regular languages of P(DA) are precisely those in L(QDA).
As explained in the introduction, the language (a + b)*ac^+ can be p-recognized by a program over J. A simple algebraic argument shows that it is not in L(QJ): just compute the stable monoid of the syntactic morphism of the language, which is equal to the syntactic monoid of the language, which in turn is not in J. Hence, by Proposition 3.4, we have the following result.

Proposition 3.6. J is not an sp-variety of monoids.

Despite Proposition 3.6 providing some explanation for the unexpected relative strength of programs over monoids in J, the notion of an sp-variety of monoids isn't entirely satisfactory.
We say that a monoid is trivial when its underlying set contains a single element. The class of all trivial monoids, which we will denote by I, forms a variety: it is the only variety containing exclusively trivial monoids, so we may call it the trivial variety of monoids.
One observation to be made is that any non-trivial monoid M p-recognizes the language of words over {a, b} starting with an a: for the first position in any word, just send a to any element that is not the identity and b to the identity. This means that for any non-trivial variety of monoids V, we have a(a + b)* ∈ P(V). But since the stable monoid of the syntactic morphism of a(a + b)* is equal to the syntactic monoid of this language, it follows that for any non-trivial variety of monoids V not containing the syntactic monoid of a(a + b)*, we have P(V) ∩ Reg ⊈ L(QV), hence that V is not an sp-variety of monoids.
Therefore, many varieties of monoids actually aren't sp-varieties of monoids simply because of the built-in capacity of programs over any non-trivial monoid to test the first letter of input words. This is for example true for any non-trivial variety containing only groups and for any non-trivial variety containing only commutative monoids. This built-in capacity, additional to programs' ability to do positional modulo counting that underlies the definition of sp-varieties, should be taken into account in the notion we are looking for to capture "good behavior". In order to define our notion of tameness, we first study this extra capacity that is built-in for programs over V and that we call "essentially-V".
3.2. Essentially-V stamps. It is easy to extend our reasoning above to show that given any non-trivial monoid M and given some k ∈ N_{>0}, the language of words over {a, b} having an a in position k, that is (a + b)^{k−1}a(a + b)*, is p-recognized by M, and the same goes for (a + b)*a(a + b)^{k−1}. By generalizing, we can quickly conclude that given any non-trivial variety of monoids V, for any alphabet Σ and any x, y ∈ Σ*, we have that xΣ*y ∈ P(V) by closure of P(V) under Boolean operations, Lemma 2.1. Put informally, p-recognition by monoids taken from any fixed non-trivial variety of monoids allows one to check some constant-length beginning or ending of the input words. Moreover, p-recognition by monoids taken from any fixed non-trivial variety of monoids V also easily allows one to test for membership in L(V) after stripping out some constant-length beginning or ending: that is, languages of the form xL′y with L′ ∈ L(V) and x, y ∈ Σ*. This motivates the definition of essentially-V stamps.
Definition 3.7. Let V be a variety of monoids. Let ϕ : Σ* → M be a stamp and let s be its stability index. We say that ϕ is essentially-V whenever there exists a stamp µ : Σ* → N with N ∈ V such that for all u, v ∈ Σ*, we have that µ(u) = µ(v) implies ϕ(xuy) = ϕ(xvy) for all x, y ∈ Σ^s. We will denote by EV the class of all essentially-V stamps and by L(EV) the class of languages recognized by morphisms in EV.
Informally stated, a stamp ϕ : Σ* → M is essentially-V when it behaves like a stamp into a monoid of V as soon as a sufficiently long beginning and ending of any input word has been fixed. The value for "sufficiently long" depends on ϕ and is adequately given by the stability index s of ϕ, as by definition of s, any word w of length at least 2s can always be made of length between s and 2s − 1 without changing the image by ϕ.
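The stability index itself is easy to compute for small examples. The sketch below (illustrative names, not from the paper) assumes the standard definition of the stability index as the least s ≥ 1 with ϕ(Σ^s) = ϕ(Σ^{2s}), and checks for instance that the syntactic morphism of a(a + b)* has stability index 1.

```python
# Sketch: computing the stability index of a stamp phi : Sigma^* -> M, i.e.
# the least s >= 1 with phi(Sigma^s) = phi(Sigma^{2s}). The monoid is given
# by a multiplication table; all names are illustrative.

def image_of_length(letters, phi, mult, length):
    """The set phi(Sigma^length) for length >= 1."""
    images = {phi[a] for a in letters}
    for _ in range(length - 1):
        images = {mult[m][phi[a]] for m in images for a in letters}
    return images

def stability_index(letters, phi, mult):
    s = 1
    while image_of_length(letters, phi, mult, s) != \
          image_of_length(letters, phi, mult, 2 * s):
        s += 1
    return s

# Syntactic monoid of a(a+b)*: elements {1, a, b}, where x * y = x whenever
# x is not the identity (only the first letter of a word matters).
mult = {'1': {'1': '1', 'a': 'a', 'b': 'b'},
        'a': {'1': 'a', 'a': 'a', 'b': 'a'},
        'b': {'1': 'b', 'a': 'b', 'b': 'b'}}
phi = {'a': 'a', 'b': 'b'}
print(stability_index('ab', phi, mult))  # 1: phi(Sigma) = phi(Sigma^2) = {a, b}
```

The loop terminates because the images ϕ(Σ^s) range over finitely many subsets of M and stabilize.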
Let us start by giving some examples. Consider first the language a(a + b)* over the alphabet {a, b}. Let's take ϕ : {a, b}* → M to be its syntactic morphism: its stability index is equal to 1 and it has the property that for any w ∈ {a, b}*, we have ϕ(aw) = ϕ(a) and ϕ(bw) = ϕ(b). Hence, if we define µ : {a, b}* → {1} to be the obvious stamp into the trivial monoid {1}, we indeed have that for all u, v ∈ {a, b}*, µ(u) = µ(v) implies ϕ(xuy) = ϕ(xvy) for all x, y ∈ {a, b}. In conclusion, the stamp ϕ is essentially-V for any variety of monoids V; in particular, a(a + b)* ∈ L(EI).
Let us now consider the language a(a + b)*b(a + b)*a over the alphabet {a, b} of words starting and ending with an a and containing some b in between. Let ϕ′ : {a, b}* → M be its syntactic morphism: its stability index is equal to 3 and it has the property that for all x, y ∈ {a, b}^+, given any u, v ∈ {a, b}* verifying that the letter b appears in u if and only if it appears in v, it holds that ϕ′(xuy) = ϕ′(xvy). Hence, if we define µ : {a, b}* → N to be the syntactic morphism of the language (a + b)*b(a + b)*, it is direct to see that for all u, v ∈ {a, b}*, µ(u) = µ(v) implies ϕ′(xuy) = ϕ′(xvy) for all x, y ∈ {a, b}^3. So we can conclude that the stamp ϕ′ is essentially-V for any variety of monoids containing the syntactic monoid of (a + b)*b(a + b)*; in particular, a(a + b)*b(a + b)*a ∈ L(EJ). However, note that ϕ′ ∉ EI because we have ϕ′((aaa)a(aaa)) ≠ ϕ′((aaa)b(aaa)).
It is now easy to prove that, as long as V is non-trivial, polynomial-length programs over monoids from V do have the built-in capacity to recognize any language recognized by an essentially-V stamp, i.e. that L(EV) ⊆ P(V).
Proof. Let ϕ : Σ* → M be a stamp in EV. By definition, given the stability index s of ϕ, there exists a stamp µ : Σ* → N with N ∈ V such that for all u, v ∈ Σ*, µ(u) = µ(v) implies ϕ(xuy) = ϕ(xvy) for all x, y ∈ Σ^s. Let F ⊆ M. By definition of µ, given m ∈ N and x, y ∈ Σ^s, we either have that xµ^{−1}(m)y ⊆ ϕ^{−1}(F) or that xµ^{−1}(m)y ∩ ϕ^{−1}(F) = ∅; hence ϕ^{−1}(F) is the union of the languages xµ^{−1}(m)y it contains, for x, y ∈ Σ^s and m ∈ N, together with the words of Σ^{≤2s−1} it contains. We claim that {w} ∈ P(V) for any w ∈ Σ^{≤2s−1} and also that xµ^{−1}(m)y ∈ P(V) for any x, y ∈ Σ^s and m ∈ N. So, by closure of P(V) under Boolean operations, Lemma 2.1, it follows that ϕ^{−1}(F) ∈ P(V). Since this is true for any F, we have that W(ϕ) ⊆ P(V), and as this is itself true for all ϕ, we can conclude that L(EV) ⊆ P(V).
The claim remains to be proven. Let k ∈ N_{>0} and a ∈ Σ. Since V is non-trivial, there exists a non-trivial N′ ∈ V: we shall denote its identity by 1 and by z one of its elements distinct from the identity, chosen arbitrarily. It is easy to see that the language Σ^{k−1}aΣ* is p-recognized by the sequence of programs (P_n)_{n∈N} over N′ such that for all n ∈ N, P_n is the single instruction (k, f) with f(a) = z and f(b) = 1 for every letter b ≠ a when n ≥ k, and the empty program otherwise, with accepting set {z}. We prove the same for Σ*aΣ^{k−1}, using the single instruction (n − k + 1, f) instead. It then follows by closure of P(V) under Boolean operations, Lemma 2.1, that {w} ∈ P(V) for any w ∈ Σ^{≤2s−1} and that xΣ*y ∈ P(V) for any x, y ∈ Σ^s.
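The one-instruction programs used in this claim can be made concrete. In the sketch below (illustrative names), the non-trivial monoid is taken to be (Z/2Z, +) with z = 1, and the program for Σ^{k−1}aΣ* consists of the single instruction (k, f) with accepting set {z}.

```python
# Sketch of the single-instruction program p-recognizing Sigma^{k-1} a Sigma^*
# over the non-trivial monoid (Z/2Z, +), with z = 1. Illustrative names.

def evaluate(program, word):
    acc = 0  # identity of (Z/2Z, +)
    for i, f in program:
        acc = (acc + f(word[i - 1])) % 2
    return acc

def kth_letter_is_a(k, n):
    """P_n for the language Sigma^{k-1} a Sigma^*; accepting set is {1}."""
    if n < k:
        return []  # the empty program outputs the identity 0, rejecting
    return [(k, lambda c: 1 if c == 'a' else 0)]

w = "bbab"
print(evaluate(kth_letter_is_a(3, len(w)), w) == 1)  # True: letter 3 is 'a'
```

The point is that a single query to one position already separates words, which is exactly the built-in power that any non-trivial monoid provides.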
Finally, let m ∈ N. It is direct to show that the language of words w of length at least 2s whose factor between positions s + 1 and |w| − s has image m under µ is p-recognized over N by the sequence of programs (P_n)_{n∈N} where, for n ≥ 2s, P_n = (s + 1, g)(s + 2, g) · · · (n − s, g) with accepting set {m}, where g : Σ → N is defined by g(b) = µ(b) for all b ∈ Σ. We can then conclude that xµ^{−1}(m)y ∈ P(V) for any x, y ∈ Σ^s by closure of P(V) under Boolean operations, Lemma 2.1, and this holds for any m.
We say that a variety of monoids V is tame when every stable stamp ϕ verifying W(ϕ) ⊆ P(V) is essentially-V. Let us first mention that tameness is a generalization of the notion of sp-variety of monoids: every sp-variety of monoids is tame.
Proof. Let V be an sp-variety of monoids and let ϕ : Σ* → M be any stable stamp such that W(ϕ) ⊆ P(V).

Let S = ϕ(Σ^+): as ϕ is stable, we have S = ϕ(Σ). Let ρ : S → Σ be an arbitrary mapping from S to Σ such that ϕ(ρ(s)) = s. Consider η_S : S* → S^1, the evaluation morphism of S: the unique morphism f : S* → Σ* sending each letter s ∈ S to ρ(s) verifies that η_S = ϕ ∘ f. Now, given any F ⊆ S^1, we have η_S^{−1}(F) = f^{−1}(ϕ^{−1}(F)); but since ϕ^{−1}(F) ∈ P(V), and as f is an lm-morphism because it sends each letter of S to a letter of Σ, it follows that η_S^{−1}(F) ∈ P(V) by closure of P(V) under inverses of lm-morphisms, Lemma 2.1. Therefore, W(S) ⊆ P(V). Since V is an sp-variety of monoids, this entails that M = S^1 belongs to V, and therefore ϕ ∈ EV (take µ = ϕ in Definition 3.7). As this is true for any stable stamp ϕ such that W(ϕ) ⊆ P(V), we can conclude that V is tame.
The notion of essentially-V stamps can be adapted to varieties of semigroups in a straightforward way. We can then define a notion of tameness for varieties of semigroups accordingly. The exact same proof as the one above then goes through, allowing us to show that varieties of semigroups of the form V ∗ D are tame.
However, there exist varieties of monoids that are tame but not sp-varieties. We give an example of such a variety in Subsection 3.4.

Programs over monoids taken from tame varieties of monoids have the expected behavior, as we show next.
Let ϕ : Σ* → M be a stamp of stability index s. The stable stamp of ϕ is the unique stamp ϕ′ : (Σ^s)* → M′ such that ϕ′(u) = ϕ(u) for all u ∈ Σ^s, where M′ is the stable monoid of ϕ. For any variety of monoids V, we let QEV be the class of stamps whose stable stamp is essentially-V and, accordingly, we define L(QEV) as the class of languages whose syntactic morphism is in QEV.
Proposition 3.11. A variety of monoids V is tame if and only if P(V) ∩ Reg ⊆ L(QEV).

Proof. Let V be a variety of monoids.

Left-to-right implication. Assume first that V is tame. For this direction, the proof follows the same lines as that of Proposition 3.4.
Let L ∈ P(V) ∩ Reg over some alphabet Σ and let η : Σ* → M be the syntactic morphism of L. For any m ∈ M, a classical algebraic argument [Pin86, Chapter 2, proof of Lemma 2.6] shows that η^{−1}(m) is a Boolean combination of quotients of L, so η^{−1}(m) ∈ P(V) by Lemma 2.1. Now let s be the stability index of η, let M′ be its stable monoid and take η′ : (Σ^s)* → M′ to be the stable stamp of η. The unique morphism f : (Σ^s)* → Σ* such that f(u) = u for all u ∈ Σ^s is an lm-morphism and verifies that η′ = η ∘ f. Hence, for all m′ ∈ M′, we have that η′^{−1}(m′) = f^{−1}(η^{−1}(m′)), so that η′^{−1}(m′) ∈ P(V) by closure of P(V) under inverses of lm-morphisms, Lemma 2.1. Thus, since inverses of monoid morphisms commute with union and P(V) is closed under unions (Lemma 2.1), we can conclude that η′^{−1}(F) ∈ P(V) for all F ⊆ M′, i.e. W(η′) ⊆ P(V).
But as η′ is stable, by tameness of V, this entails that η′ ∈ EV, so that L ∈ L(QEV).
Right-to-left implication. Assume now that P(V) ∩ Reg ⊆ L(QEV) and consider any stable stamp ϕ : Σ* → M such that W(ϕ) ⊆ P(V); we must show that ϕ ∈ EV. For any m ∈ M, we therefore have ϕ^{−1}(m) ∈ L(QEV). Let η_m : Σ* → M_m be the syntactic morphism of the language ϕ^{−1}(m); we thus have η_m ∈ QEV. We first claim that η_m is a stable stamp. To see this, notice first that for all u, v ∈ Σ*, ϕ(u) = ϕ(v) implies η_m(u) = η_m(v), as ϕ^{−1}(m) is recognized by ϕ; since ϕ is stable, i.e. ϕ(Σ) = ϕ(Σ^2), it follows that η_m(Σ) = η_m(Σ^2), so η_m is stable. Since η_m is equal to its stable stamp and η_m ∈ QEV, it follows that η_m ∈ EV. Therefore there exists a stamp µ_m : Σ* → N_m with N_m ∈ V such that for all u, v ∈ Σ*, we have that µ_m(u) = µ_m(v) implies η_m(xuy) = η_m(xvy) for all x, y ∈ Σ. Letting µ be the product stamp sending each u ∈ Σ* to (µ_m(u))_{m∈M}, whose image is a monoid of V, we get that µ(u) = µ(v) implies ϕ(xuy) = ϕ(xvy) for all x, y ∈ Σ, because each language ϕ^{−1}(m) is recognized by its syntactic morphism η_m. In conclusion, µ witnesses the fact that ϕ is essentially-V.
As for the case of sp-varieties of monoids, we don't know whether it is always true that for a tame non-trivial variety of monoids V, L(QEV) is included in P(V). If this were the case, then for tame non-trivial varieties of monoids V we would have P(V) ∩ Reg = L(QEV). We conjecture this to be at least true for varieties of monoids that are local.

Conjecture 3.12. Let V be a local tame variety of monoids. Then P(V) ∩ Reg = L(QEV).
We conclude this subsection by showing that J, which is not an sp-variety of monoids (Proposition 3.6), isn't tame either.

Proposition 3.13. J is not tame.

Proof. To show this, we show that (a + b)*ac^+, which belongs to P(J) by the construction of the introduction, does not belong to L(QEJ).
We first claim that any essentially-J stamp ϕ : Σ* → M of stability index s verifies that there exists some k ∈ N_{>0} such that ϕ(x(uv)^k y) = ϕ(x(uv)^k uy) for all u, v ∈ Σ* and x, y ∈ Σ^s. Indeed, by definition there exists a stamp µ : Σ* → N with N ∈ J such that for all u, v ∈ Σ*, we have that µ(u) = µ(v) implies ϕ(xuy) = ϕ(xvy) for all x, y ∈ Σ^s. If we set ω to be the idempotent power of N, we have that for all u, v ∈ Σ*, µ((uv)^ω) = µ((uv)^ω u) by the identities for J. Hence, we have that ϕ(x(uv)^ω y) = ϕ(x(uv)^ω uy) for all u, v ∈ Σ* and x, y ∈ Σ^s, proving the claim with k = ω.
Let us now consider the syntactic morphism η : {a, b, c}* → M of the language (a + b)*ac^+. As already mentioned for Proposition 3.6, the stable monoid of η is equal to the syntactic monoid M. Moreover, the stability index of η is 2. Therefore, the stable stamp of η is the unique stamp η′ : ({a, b, c}^2)* → M such that η′(u) = η(u) for all u ∈ {a, b, c}^2. By what we have shown just above, since the stability index of η′ is 1, if η′ were essentially-J, there should exist some k ∈ N_{>0} such that η′(x(uv)^k y) = η′(x(uv)^k uy) for all u, v ∈ ({a, b, c}^2)* and x, y ∈ {a, b, c}^2. However, taking x = (aa), u = (bb), v = (aa) and y = (cc), we have η′((aa)((bb)(aa))^k (cc)) ≠ η′((aa)((bb)(aa))^k (bb)(cc)) for all k ∈ N_{>0}, since the first underlying word belongs to (a + b)*ac^+ while the second does not. Therefore, the stable stamp η′ of η is not essentially-J, so we can conclude that (a + b)*ac^+ ∉ L(QEJ).

Since the syntactic monoid of the language a(a + b)* is not commutative, by the discussion at the end of Subsection 3.1, we know that Com is not an sp-variety of monoids. It is, however, tame, as we are going to prove now.
The following lemma then asserts that any stable stamp ϕ such that W(ϕ) ⊆ P(Com) actually verifies the equation of the previous lemma, which allows us to conclude that Com is tame by combining those two lemmas.
Lemma 3.15. Let ϕ : Σ* → M be a stable stamp such that W(ϕ) ⊆ P(Com). Then, for any x, y, e, f ∈ Σ such that ϕ(e) and ϕ(f) are idempotents, equation (3.2) holds.

Proof. Let us first observe that for any program P over some finite commutative monoid N using the input alphabet Σ and of range n ∈ N, there exists a program P′ over N using the same input alphabet and of the same range, of the form P′ = (1, h_1)(2, h_2) · · · (n, h_n), verifying P′(w) = P(w) for all w ∈ Σ^n [Tes03, Example 3.4]. We call P′ a single-scan program.
The assumption that W(ϕ) ⊆ P(Com) thus means that for all F ⊆ M, there exists a sequence (P_{F,n})_{n∈N} of single-scan programs over some N_F ∈ Com that recognizes ϕ^{−1}(F).
So, for all e, f ∈ Σ such that ϕ(e) and ϕ(f) are idempotents, we have that (3.2) holds.
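The single-scan normal form used in this proof is easy to demonstrate over a concrete commutative monoid. In the sketch below (illustrative names, over (Z/3Z, +)), instructions querying the same position are merged into one function per position, which is valid precisely because the monoid is commutative.

```python
# Sketch: collapsing a program over a commutative monoid, here (Z/3Z, +),
# into a single-scan program (1, h_1)(2, h_2)...(n, h_n). Illustrative names.

def evaluate(program, word):
    acc = 0
    for i, f in program:
        acc = (acc + f(word[i - 1])) % 3
    return acc

def single_scan(program, n, alphabet):
    # h_i(a) = sum mod 3 of f(a) over all instructions (i, f) of the program.
    h = [{a: 0 for a in alphabet} for _ in range(n)]
    for i, f in program:
        for a in alphabet:
            h[i - 1][a] = (h[i - 1][a] + f(a)) % 3
    return [(i + 1, (lambda d: lambda c: d[c])(h[i])) for i in range(n)]

# A program visiting positions out of order and repeatedly.
prog = [(2, lambda c: 1 if c == 'a' else 0),
        (1, lambda c: 2),
        (2, lambda c: 1)]
w = "ab"
print(evaluate(prog, w) == evaluate(single_scan(prog, len(w), "ab"), w))  # True
```

Commutativity is what lets the out-of-order instructions be reordered and merged without changing the product.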

The case of DA
In this section, we prove that DA is an sp-variety of monoids, which implies that it is tame. Combined with the fact that DA is local [Alm96], we obtain by Proposition 3.5 that the regular languages of P(DA) are precisely those in L(QDA). The key technical step is the following proposition.

Proposition 4.2. For any k ∈ N, k ≥ 2, none of the languages (c + ab)*, (b + ab)* and b*((ab*)^k)* belongs to P(DA).

Observe that for any k ∈ N, k ≥ 2, the fact that b*((ab*)^k)* ∉ P(DA) is an immediate corollary of the classical result that MOD_k ∉ AC^0 = P(A) [FSS84, Ajt83]. However, we propose a direct semigroup-theoretic proof of the former fact without resorting to the involved proof techniques of the latter result.
Before proving the proposition, we first show that it implies that DA is an sp-variety of monoids. This implication is a consequence of the following lemma, inspired by an observation in [TT02b] stating that non-membership of a given finite monoid M in DA implies non-aperiodicity of M or division of M by (at least) one of two specific finite monoids.

Lemma 4.3. Let S be a finite semigroup such that S^1 ∉ DA. Then one of the languages (c + ab)*, (b + ab)* or b*((ab*)^k)* for some k ∈ N, k ≥ 2, is recognized by a morphism µ : Σ* → S^1, for Σ the appropriate alphabet, verifying µ(Σ^+) ⊆ S.
Proof. We distinguish two cases: the aperiodic one and the non-aperiodic one.
Aperiodic case. Assume first that S^1 is aperiodic. Then, since S^1 ∉ DA, by Lemma 3.2.4 in [Tes98], we have that S^1 is divided by the syntactic monoid of (c*ac*bc*)*, denoted by B_2, or by the syntactic monoid of ((b + c)*a(b + c)*b(b + c)*)*, denoted by U. We treat those two not necessarily distinct subcases separately.

Subcase B_2 divides S^1. It is easily proven that (c + ab)* is recognized by B_2 (actually, its syntactic monoid is isomorphic to B_2): just consider the syntactic morphism η : {a, b, c}* → B_2 of (c*ac*bc*)* and build the morphism ϕ : {a, b, c}* → B_2 sending a to η(a), b to η(b) and c to η(ab).
Subcase U divides S^1. The proof goes the same way as for the first subcase.
It is again easily proven that (b + ab)* is recognized by U: here we consider the syntactic morphism η : {a, b, c}* → U of ((b + c)*a(b + c)*b(b + c)*)* and proceed as in the previous subcase.

Non-aperiodic case. Assume now that S^1 is not aperiodic. Then there is an x in S such that x^ω ≠ x^{ω+1}, for ω ∈ N_{>0} the idempotent power of S^1. Consider the morphism µ : {a, b}* → S^1 sending a to x^{ω+1} and b to x^ω, and the language L = µ^{−1}(x^ω). Let k ∈ N, k ≥ 2, be the smallest positive integer such that x^{ω+k} = x^ω; it cannot be 1 because x^ω ≠ x^{ω+1}. Using this, for all w ∈ {a, b}^+, we have µ(w) = x^{ω·|w|+|w|_a} = x^{ω+(|w|_a mod k)}, where |w| indicates the length of w and |w|_a the number of a's it contains (using that ω is necessarily a multiple of k, as x^ω is idempotent), so that w belongs to L if and only if |w|_a ≡ 0 mod k. Hence, L is the language of all words with a number of a's divisible by k, that is, b*((ab*)^k)*. In conclusion, b*((ab*)^k)* is recognized by µ, verifying µ({a, b}^+) ⊆ S.
Let now S be any finite semigroup such that W(S) ⊆ P(DA). Let η_S : S* → S^1 be the evaluation morphism of S. To show that S^1 is in DA, we assume for the sake of contradiction that it is not the case. Then Lemma 4.3 tells us that one of (c + ab)*, (b + ab)* or b*((ab*)^k)* for some k ∈ N, k ≥ 2, is recognized by a morphism µ : Σ* → S^1, for Σ the appropriate alphabet, such that µ(Σ^+) ⊆ S.
In all cases, we thus have a language L ⊆ Σ* equal to µ^{−1}(Q) for some subset Q of S^1, with the morphism µ sending letters of Σ to elements of S. Consider then the morphism ϕ : Σ* → S* sending each letter a ∈ Σ to µ(a), a letter of S: it verifies that µ = η_S ∘ ϕ. As W(S) ⊆ P(DA), we have that η_S^{−1}(Q) ∈ P(DA); hence, since ϕ is an lm-morphism and P(DA) is closed under inverses of lm-morphisms by Lemma 2.1, we have L = ϕ^{−1}(η_S^{−1}(Q)) ∈ P(DA): a contradiction to Proposition 4.2.

In the remaining part of this section, we prove Proposition 4.2.
Proof of Proposition 4.2. The idea of the proof is the following. We work by contradiction and assume that we have a sequence of programs over some monoid M of DA deciding one of the targeted languages L. Let n be much larger than the size of M, and let P_n be the program running on words of length n. Consider a set ∆ of words such that L ⊆ ∆* (for instance, take ∆ = {c, ab} for L = (c + ab)*). We will show that we can fix a constant (depending on M and ∆ but not on n) number of entries to P_n such that P_n always outputs the same value and there are completions of the fixed entries in ∆*. Hence, if ∆ was chosen so that there is actually a completion of the fixed entries in L and one outside of L, P_n cannot recognize the restriction of L to words of length n. We cannot prove this for all ∆; in particular, it will not work for ∆ = {ab}, and indeed (ab)* is in P(DA). The key property of our ∆ is that after fixing any letter at any position, except maybe for a constant number of positions, one can still complete the word into one within ∆*. This is not true for ∆ = {ab} because after fixing a b in an odd position, all completions fall outside of (ab)*.
We now spell out the technical details.
Let ∆ be a finite non-empty set of non-empty words over an alphabet Σ. Let ⊥ be a letter not in Σ. A mask is a word over Σ ∪ {⊥}. The positions of a mask carrying a ⊥ are called free while the positions carrying a letter in Σ are called fixed. A mask λ′ is a submask of a mask λ if it is formed from λ by replacing some occurrences (possibly zero) of ⊥ by a letter in Σ.

A completion of a mask λ is a word w over Σ that is built from λ by replacing all occurrences of ⊥ by a letter in Σ. Notice that all completions of a mask have the same length as the mask itself. A mask λ is ∆-compatible if it has a completion in ∆*.
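∆-compatibility is decidable by a simple dynamic program over positions. In the sketch below (illustrative names, with '_' standing for ⊥), reach[i] records whether the first i positions of the mask can be completed into a word of ∆*.

```python
# Sketch: checking Delta-compatibility of a mask, '_' standing for the free
# symbol. A mask is Delta-compatible iff some completion lies in Delta^*.

def compatible(mask, delta):
    n = len(mask)
    reach = [False] * (n + 1)  # reach[i]: mask[:i] completable within Delta^*
    reach[0] = True
    for i in range(n):
        if reach[i]:
            for d in delta:
                j = i + len(d)
                if j <= n and all(mask[i + k] in ('_', d[k])
                                  for k in range(len(d))):
                    reach[j] = True
    return reach[n]

# With Delta = {ab}, fixing 'b' at the first (odd) position leaves no
# completion in Delta^*, illustrating why {ab} is not safe.
print(compatible("b___", {"ab"}))       # False
print(compatible("_b_c", {"c", "ab"}))  # True: complete as "abcc"
```

The check runs in time O(|mask| · |∆| · l), where l is the maximal length of a word in ∆.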
The dangerous positions of a mask λ are the positions within distance 2l − 2 of the fixed positions or within distance l − 1 of the beginning or the end of the mask, where l is the maximal length of a word in ∆. A position that is not dangerous is said to be safe and is necessarily free.
We say that ∆ is safe if the following holds: for any ∆-compatible mask λ, any free position i of λ that is not dangerous and any letter a in Σ, the submask of λ constructed by fixing a at position i is ∆-compatible. We have already seen that ∆ = {ab} is not safe. However, our targeted sets ∆ = {c, ab}, ∆ = {b, ab} and ∆ = {a, b} are safe. We always consider ∆ to be safe in the following.
Note that it is important in the definition of safety for ∆ that we fix only safe positions, i.e. positions far apart and far from the beginning and the end of the mask. Indeed, depending on the chosen ∆, there might be words that never appear as factors in any word of ∆*, such as bb when ∆ = {c, ab} or aa when ∆ = {b, ab}, so that fixing a position near an already fixed position to an arbitrary letter in a ∆-compatible mask may result in a mask that has no completion in ∆*. This is why we make sure that safe positions are far from those already fixed and from the beginning and the end of the mask, where "far" depends on the length of the words of ∆.
Finally, we say that a completion w of a mask λ is safe if w belongs to ∆* or is constructed from a completion of λ in ∆* by modifying only letters at safe positions of λ, the dangerous positions remaining unchanged.
Let M be a monoid in DA whose identity we will denote by 1. We will use a version of Green's relations, as often in this setting, to prove the main technical lemma of the current proof. Given two elements u, u′ of M, we say that u ≤_J u′ if there are elements v, v′ of M such that u′ = vuv′; we write u ∼_J u′ if u ≤_J u′ and u′ ≤_J u, and u <_J u′ if u ≤_J u′ but not u ∼_J u′. We say that u ≤_R u′ if there is an element v of M such that u′ = uv; we write u ∼_R u′ if u ≤_R u′ and u′ ≤_R u, and u <_R u′ if u ≤_R u′ but not u ∼_R u′. Dually, we say that u ≤_L u′ if there is an element v of M such that u′ = vu; we write u ∼_L u′ if u ≤_L u′ and u′ ≤_L u, and u <_L u′ if u ≤_L u′ but not u ∼_L u′. Finally, given two elements u, u′ of M, we write u ∼_H u′ if u ∼_R u′ and u ∼_L u′.
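These preorders are directly computable from a multiplication table, which can help when checking small examples by hand. A sketch with illustrative names, following the convention used here that u ≤_R ur:

```python
# Sketch: Green's preorders in a finite monoid given by a multiplication
# table, with the convention that u <=_R ur. Illustrative names.

def r_below(u, v, elems, mult):   # u <=_R v  iff  v = u*r for some r
    return any(mult[u][r] == v for r in elems)

def l_below(u, v, elems, mult):   # u <=_L v  iff  v = r*u for some r
    return any(mult[r][u] == v for r in elems)

def j_below(u, v, elems, mult):   # u <=_J v  iff  v = r*u*s for some r, s
    return any(mult[mult[r][u]][s] == v
               for r in elems for s in elems)

def r_bad(u, r, elems, mult):     # r is R-bad for u iff u <_R ur strictly
    return not r_below(mult[u][r], u, elems, mult)

# Syntactic monoid of a(a+b)*: {1, a, b} with x*y = x for x != 1.
elems = "1ab"
mult = {'1': {'1': '1', 'a': 'a', 'b': 'b'},
        'a': {'1': 'a', 'a': 'a', 'b': 'a'},
        'b': {'1': 'b', 'a': 'b', 'b': 'b'}}
print(r_bad('1', 'a', elems, mult))  # True: nothing maps a back to 1
```

In this example, a and b are ∼_L-equivalent but lie in distinct ∼_R-classes, so this tiny monoid already separates the two preorders.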
We shall use the following well-known fact about these preorders and equivalence relations (see [Pin86, Chapter 3, Proposition 1.4]).
Lemma 4.4. For all elements u and v of M such that u ≤_R v, we have u ∼_J v if and only if u ∼_R v; dually, for u ≤_L v, we have u ∼_J v if and only if u ∼_L v.

From the definition it follows that for all elements u, v, r of M, we have u ≤_R ur and v ≤_L rv. When the inequality is strict in the first case, i.e. u <_R ur, we say that r is R-bad for u. Similarly, r is L-bad for v if v <_L rv. It follows from M ∈ DA that being R-bad or L-bad only depends on the ∼_R or ∼_L class, respectively. This is formalized in the following lemma, which is folklore and used at least implicitly in many proofs involving DA (see for instance [TT02b, proof of Theorem 3]). Since we didn't manage to find the lemma stated and proven in the form below, we include a proof for completeness.

Lemma 4.5. Let u, u′, r ∈ M be such that u ∼_R u′ and ur ∼_R u. Then u′r ∼_R u′. Dually, if v ∼_L v′ and rv ∼_L v, then rv′ ∼_L v′.
Proof. Let u, u′, r ∈ M be such that u ∼_R u′ and ur ∼_R u. This means that there exist v, v′, s ∈ M such that u = u′v′, u′ = uv and urs = u; using that M ∈ DA, one then checks that u′r ∼_R u′, and dually for the L case.

Lemma 4.6. Let λ be a ∆-compatible mask, let P be a program over M whose range is the length of λ, and let u, v ∈ M. Then there exist a ∆-compatible submask λ′ of λ and an element t of M such that uP(w)v = t for all safe completions w of λ′, the number of fixed positions of λ′ exceeding that of λ by a quantity depending only on M and ∆, not on the range of P.

Proof. The proof is by induction on the quadruplet (λ, P, u, v) with respect to the order ≺: we assume that for any quadruplet (λ′, P′, u′, v′) strictly smaller than (λ, P, u, v), the lemma is verified. Consider the following conditions concerning the quadruplet (λ, P, u, v): (a) there does not exist any instruction (x, f) of P such that, for some letter a, the submask λ′ of λ formed by setting position x to a is ∆-compatible and f(a) is R-bad for u; (b) v is not R-bad for u; (c) there does not exist any instruction (x, f) of P such that, for some letter a, the submask λ′ of λ formed by setting position x to a is ∆-compatible and f(a) is L-bad for v; (d) u is not L-bad for v.
We will now do a case analysis based on which of these conditions are violated.
Case 1: condition (a) is violated. So there exists some instruction (x, f) of P such that, for some letter a, the submask λ′ of λ formed by setting position x to a (if it wasn't already the case) is ∆-compatible and f(a) is R-bad for u. Let i be the smallest index of such an instruction. Let P′ be the subprogram of P until, and including, instruction i − 1. Let w be a safe completion of λ. For any instruction (y, g) of P′, as its index is smaller than i, g(w_y) cannot be R-bad for u, so u ∼_R ug(w_y). Hence, by Lemma 4.5, u ∼_R uP′(w) for all safe completions w of λ.
So, because f(a) is R-bad for u, any safe completion w of λ′, which is also a safe completion of λ, is such that u ∼_R uP′(w) <_R uP′(w)f(a) ≤_R uP(w)v by Lemma 4.5, hence uP′(w) <_J uP(w)v by Lemma 4.4. So (λ′, P′, u, 1) ≺ (λ, P, u, v); therefore, by induction we get a ∆-compatible submask λ_1 of λ′ and a monoid element t_1 such that uP′(w) = t_1 for all safe completions w of λ_1.
Let P″ be the subprogram of P starting from instruction i + 1. Notice that, since u ∼_R t_1 (by what we have proven just above), u <_R t_1f(a) (by Lemma 4.5) and t_1f(a)P″(w)v = uP′(w)f(a)P″(w)v = uP(w)v for all safe completions w of λ_1. Hence, (λ_1, P″, t_1f(a), v) is strictly smaller than (λ, P, u, v), and by induction we get a ∆-compatible submask λ_2 of λ_1 and a monoid element t such that t_1f(a)P″(w)v = t for all safe completions w of λ_2.
Thus, any safe completion w of λ_2 is such that uP(w)v = uP′(w)f(a)P″(w)v = t_1f(a)P″(w)v = t. Therefore, λ_2 and t form the desired couple of a ∆-compatible submask of λ and an element of M. We still have to show that |λ_2|_Σ satisfies the desired upper bound. By induction, since (λ′, P′, u, 1) is of height at most h − 1, we have the required bound on |λ_1|_Σ; consequently, by induction again, as (λ_1, P″, t_1f(a), v) is also of height at most h − 1, we obtain the desired upper bound on |λ_2|_Σ.

Case 2: condition (a) is verified but condition (b) is violated, so v is R-bad for u and Case 1 does not apply.
Let w be a safe completion of λ: for any instruction (x, f) of P, as the submask λ′ of λ formed by setting position x to w_x is ∆-compatible (by the fact that ∆ is safe and w is a safe completion of λ), f(w_x) cannot be R-bad for u, since otherwise condition (a) would be violated; so u ∼_R uf(w_x). Hence, by Lemma 4.5, u ∼_R uP(w) for all safe completions w of λ. Notice then that u ∼_R uP(w) <_R uP(w)v (by Lemma 4.5), hence uP(w) <_J uP(w)v (by Lemma 4.4) for all safe completions w of λ. So (λ, P, u, 1) ≺ (λ, P, u, v); therefore we obtain by induction a monoid element t_1 and a ∆-compatible submask λ′ of λ such that uP(w) = t_1 for all safe completions w of λ′. If we set t = t_1v, we get that any safe completion w of λ′ is such that uP(w)v = t_1v = t. Therefore, λ′ and t form the desired couple of a ∆-compatible submask of λ and an element of M.
Moreover, by induction, since (λ, P, u, 1) is of height at most h − 1, we have the desired upper bound.

Case 3: condition (c) is violated. So there exists some instruction (x, f) of P such that, for some letter a, the submask λ′ of λ formed by setting position x to a (if it wasn't already the case) is ∆-compatible and f(a) is L-bad for v.
We proceed as for Case 1 by symmetry.

Case 4: condition (c) is verified but condition (d) is violated, so u is L-bad for v and Case 3 does not apply. We proceed as for Case 2 by symmetry.

Case 5: conditions (a), (b), (c) and (d) are all verified.
As in Cases 2 and 4, using Lemma 4.5, the fact that conditions (a) and (c) are verified implies that u ∼_R uP′(w) and v ∼_L P″(w)v for any prefix P′ of P, any suffix P″ of P and all safe completions w of λ. Moreover, since conditions (b) and (d) are verified, by Lemma 4.5 we get that uP(w)v ∼_R u and uP(w)v ∼_L v for all safe completions w of λ. This implies that (λ, P, u, v) is minimal for ≺ and that h = 0.
Let w_0 be a completion of λ that is in ∆*. Let λ′ be the submask of λ fixing all free dangerous positions of λ using w_0, and let t = uP(w_0)v. Then, for any completion w of λ′, which is a safe completion of λ by construction, we have that uP(w)v ∼_R u ∼_R t and uP(w)v ∼_L v ∼_L t. Thus, uP(w)v ∼_H t for any completion w of λ′. As M is aperiodic, this implies that uP(w)v = t for all completions w of λ′ (see [Pin86, Chapter 3, Proposition 4.2]). Therefore, λ′ and t form the desired couple of a ∆-compatible submask of λ and an element of M. Now, since the number of free positions of λ fixed in λ′, i.e. |λ′|_Σ − |λ|_Σ, is exactly the number of free dangerous positions in λ, and as a position in λ is dangerous if it is within distance 2l − 2 of a fixed position or within distance l − 1 of the beginning or the end of λ, the number of newly fixed positions is at most (4l − 4)|λ|_Σ + 2(l − 1), which satisfies the desired upper bound. This concludes the proof of the lemma.
Setting ∆ = {c, ab} or ∆ = {b, ab}, with Σ the associated alphabet, when applying Lemma 4.6 to the trivial ∆-compatible mask λ of length n containing only free positions, to some program P over M of range n and to u and v equal to 1, the resulting mask λ′ has the property that there is an element t of M such that P(w) = t for any safe completion w of λ′. Since the mask λ′ is ∆-compatible and has a number of fixed positions upper-bounded by (2^h 6l)^{2^h}, where h is the height of (λ, P, u, v), itself upper-bounded by 2·|M|^2, as long as n is big enough, we have a safe completion w_0 ∈ ∆* and a safe completion w_1 ∉ ∆*. Hence, P cannot be part of any sequence of programs p-recognizing ∆*. This implies that (c + ab)* ∉ P(M) and (b + ab)* ∉ P(M). Finally, for any k ∈ N, k ≥ 2, we can prove that b*((ab*)^k)* ∉ P(M) by setting ∆ = {a, b} and completing the mask given by the lemma by setting the letters in such a way that we have the right number of a's modulo k in one case and not in the other.
This concludes the proof of Proposition 4.2, because the argument above holds for any monoid in DA.

A fine hierarchy in P(DA)
The definition of p-recognition by a sequence of programs over a monoid given in Section 2 requires that for each n, the program reading the entries of length n has a length polynomial in n. In the case of P(DA), the polynomial-length restriction is superfluous: any program over a monoid in DA is equivalent to one of polynomial length over the same monoid [TT02a] (in the sense that they recognize the same languages). In this section, we show that this does not collapse further: in the case of DA, programs of length O(n^{k+1}) express strictly more than those of length O(n^k).
Following [GT03], we use an alternative definition of the languages recognized by a monoid in DA. We define by induction a hierarchy of classes of languages SUM_k, where SUM stands for strongly unambiguous monomial. A language L is in SUM_0 if it is of the form A* for some alphabet A. A language L is in SUM_k for k ∈ N_{>0} if it is in SUM_{k−1} or L = L_1aL_2 for some languages L_1 ∈ SUM_i and L_2 ∈ SUM_j and some letter a with i + j = k − 1, such that no word of L_1 contains the letter a or no word of L_2 contains the letter a.
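Strong unambiguity makes membership testing deterministic: in L_1aL_2, the splitting occurrence of a is forced (the first a when a is forbidden in L_1, the last when forbidden in L_2). The sketch below uses an illustrative encoding of our own, not from the paper.

```python
# Sketch: membership in a strongly unambiguous monomial. A language is either
# ('star', A) for A^*, or ('cat', L1, a, L2, side) for L1 a L2, where `side`
# ('left'/'right') tells on which factor the letter a is forbidden.

def member(w, L):
    if L[0] == 'star':
        return all(c in L[1] for c in w)
    _, L1, a, L2, side = L
    i = w.find(a) if side == 'left' else w.rfind(a)  # the split is forced
    return i >= 0 and member(w[:i], L1) and member(w[i + 1:], L2)

# b^* a (a+b)^*, a SUM_1 language: 'a' cannot occur in L1 = b^*.
lang = ('cat', ('star', 'b'), 'a', ('star', 'ab'), 'left')
print(member("bbab", lang))  # True
print(member("bbb", lang))   # False: no 'a' at all
```

Because the split is determined, membership is decided in a single left-to-right (or right-to-left) scan per concatenation level, with no backtracking.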
Gavaldà and Thérien stated without proof that a language L is recognized by a monoid in DA if and only if there is a k ∈ N such that L is a Boolean combination of languages in SUM_k [GT03] (see [Gro18, Theorem 4.1.9] for a proof). For each k ∈ N, we denote by DA_k the variety of monoids generated by the syntactic monoids of the Boolean combinations of languages in SUM_k. It can be checked that, for each k, DA_k forms a variety of monoids recognizing precisely the Boolean combinations of languages in SUM_k: this is what we do in the first subsection.
In the two following subsections, we then give a fine program-length-based hierarchy within P(DA) for this parametrization of DA.

5.1. A parametrization of DA. For each k ∈ N, we denote by SUL_k the class of regular languages that are Boolean combinations of languages in SUM_k; it is a variety of languages, as shown just below. But as DA_k is the variety of monoids generated by the syntactic monoids of the languages in SUL_k, by Eilenberg's theorem, we know that, conversely, all the regular languages whose syntactic monoids lie in DA_k are in SUL_k.
Let us come back to the fact that SUL_k is a variety of languages for any k ∈ N. Closure under Boolean operations is obvious by construction. Closure under quotients and inverses of morphisms is given, respectively, by the following two lemmas, combined with the fact that both quotients and inverses of morphisms commute with Boolean operations.
Given a word u over a given alphabet Σ, we will denote by alph(u) the set of letters of Σ that appear in u.
Lemma 5.1. For all k ∈ N, for all L ∈ SUM_k over an alphabet Σ and u ∈ Σ*, u^{−1}L and Lu^{−1} are both unions of languages in SUM_k over Σ.
Proof.We prove it by induction on k.
Base case: k = 0. Let L ∈ SUM_0 over an alphabet Σ and u ∈ Σ*. This means that L = A* for some A ⊆ Σ. We have two cases: either alph(u) ⊈ A, and then u^{−1}L = Lu^{−1} = ∅; or alph(u) ⊆ A, and then u^{−1}L = Lu^{−1} = A* = L. So u^{−1}L and Lu^{−1} are both unions of languages in SUM_0 over Σ. The base case is hence proved.
Inductive step. Let k ∈ N_{>0} and assume that the lemma is true for all k′ ∈ N, k′ < k.
Let L ∈ SUM_k over an alphabet Σ and u ∈ Σ*. This means that either L is in SUM_{k−1}, and the lemma is proved by applying the inductive hypothesis directly to L and u, or L = L_1aL_2 for some languages L_1 ∈ SUM_i and L_2 ∈ SUM_j and some letter a ∈ Σ with i + j = k − 1, such that either no word of L_1 contains the letter a or no word of L_2 contains the letter a. We shall only treat the case in which a does not appear in any of the words of L_1; the other case is treated symmetrically.
There are again two cases to consider, depending on whether a appears in u or not. If a ∉ alph(u), then it is straightforward to check that u −1 L = (u −1 L 1 )aL 2 and Lu −1 = L 1 a(L 2 u −1 ). By the inductive hypothesis, we get that u −1 L 1 is a union of languages in SUM i over Σ and that L 2 u −1 is a union of languages in SUM j over Σ. Moreover, it is direct to see that no word of u −1 L 1 contains the letter a. By distributivity of concatenation over union, we finally get that u −1 L and Lu −1 both are unions of languages in SUM k over Σ.
If a ∈ alph(u), then let u = u 1 au 2 with u 1 , u 2 ∈ Σ * and a ∉ alph(u 1 ). It is again straightforward to see that u −1 L = u 2 −1 L 2 if u 1 ∈ L 1 and u −1 L = ∅ otherwise, while Lu −1 = L 1 a(L 2 u −1 ) ∪ L 1 u 1 −1 if u 2 ∈ L 2 and Lu −1 = L 1 a(L 2 u −1 ) otherwise. As before, by the inductive hypothesis, we get that L 1 u 1 −1 is a union of languages in SUM i over Σ and that both u 2 −1 L 2 and L 2 u −1 are unions of languages in SUM j over Σ. And, again, by distributivity of concatenation over union, we get that u −1 L and Lu −1 both are unions of languages in SUM k over Σ.
This concludes the inductive step and therefore the proof of the lemma.
Lemma 5.2. For all k ∈ N, all L ∈ SUM k over an alphabet Σ and ϕ : Γ * → Σ * a morphism where Γ is another alphabet, ϕ −1 (L) is a union of languages in SUM k over Γ.
Proof. We prove it by induction on k.
Base case: k = 0. Let L ∈ SUM 0 over an alphabet Σ and ϕ : Γ * → Σ * a morphism where Γ is another alphabet. This means that L = A * for some A ⊆ Σ. It is straightforward to check that ϕ −1 (L) = B * where B = {b ∈ Γ | ϕ(b) ∈ A * }. B * is certainly a union of languages in SUM 0 over Γ. The base case is hence proved.
Inductive step. Let k ∈ N >0 and assume that the lemma is true for all k′ ∈ N with k′ < k.
Let L ∈ SUM k over an alphabet Σ and ϕ : Γ * → Σ * a morphism where Γ is another alphabet. This means that either L is in SUM k−1 , in which case the lemma follows by applying the inductive hypothesis directly to L and ϕ, or L = L 1 aL 2 for some languages L 1 ∈ SUM i and L 2 ∈ SUM j and some letter a ∈ Σ with i + j = k − 1, and such that either no word of L 1 contains the letter a or no word of L 2 contains the letter a. We shall only treat the case in which a does not appear in any of the words of L 1 ; the other case is treated symmetrically.
Let us define B = {b ∈ Γ | a ∈ alph(ϕ(b))} as the set of letters of Γ whose image by ϕ contains the letter a. For each b ∈ B, we shall also let ϕ(b) = u b,1 au b,2 with u b,1 , u b,2 ∈ Σ * and a ∉ alph(u b,1 ). It is not too difficult to see that we then have ϕ −1 (L) = ⋃ b∈B ϕ −1 (L 1 u b,1 −1 ) b ϕ −1 (u b,2 −1 L 2 ). By the inductive hypothesis, by Lemma 5.1 and by the fact that inverses of morphisms commute with unions, we get that ϕ −1 (L 1 u b,1 −1 ) is a union of languages in SUM i over Γ and that ϕ −1 (u b,2 −1 L 2 ) is a union of languages in SUM j over Γ. Moreover, it is direct to see that no word of ϕ −1 (L 1 u b,1 −1 ) contains the letter b for any b ∈ B. By distributivity of concatenation over union, we finally get that ϕ −1 (L) is a union of languages in SUM k over Γ.
This concludes the inductive step and therefore the proof of the lemma.

5.2. Strict hierarchy.
For each k we show there exists a language L k ⊆ {0, 1} * that can be recognized by a sequence of programs of length O(n k ) over a monoid M k in DA k but cannot be recognized by any sequence of programs of length O(n k−1 ) over any monoid in DA.
For a given k ∈ N >0 , the language L k expresses a property of the first k occurrences of 1 in the input word. To define L k , we say that S is a k-set over n for some n ∈ N if S is a set of ordered tuples of k distinct elements of [n]. For any sequence ∆ = (S n ) n∈N of k-sets over n, we set L ∆ = ⋃ n∈N K n,S n , where for each n ∈ N, K n,S n is the set of words of length n over {0, 1} that contain at least k occurrences of 1 and such that the ordered k-tuple of the positions of the first k occurrences of 1 belongs to S n .
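Concretely, membership in K n,S n depends only on the positions of the first k occurrences of 1; a minimal Python sketch (function and variable names are ours) makes this explicit:

```python
def in_K(w, S, k):
    """Is w (a string over {'0','1'}) in K_{n,S}?  S is a set of ordered
    k-tuples of (1-based, distinct) positions; w is in K_{n,S} when it
    contains at least k occurrences of 1 and the positions of its first
    k ones form a tuple of S."""
    ones = [i + 1 for i, c in enumerate(w) if c == '1']
    return len(ones) >= k and tuple(ones[:k]) in S

# Hypothetical example with n = 4 and k = 2: accept the words whose
# first two 1s sit at positions (1,3) or (2,4)
S = {(1, 3), (2, 4)}
```

For instance, 1011 is accepted (its first two 1s are at positions 1 and 3), while 1100 is not.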
On the one hand, we show that for all k there is a monoid M k in DA k such that for all ∆ the language L ∆ is recognized by a sequence of programs over M k of length O(n k ). The proof is done by an inductive argument on k.
On the other hand, we show that for all k there is a ∆ such that for any finite monoid M and any sequence of programs (P n ) n∈N over M of length O(n k−1 ), L ∆ is not recognized by (P n ) n∈N . This is done using a counting argument: for any monoid size i and n big enough, the number of languages of words in {0, 1} n recognized by a program of length at most α · n k−1 over some monoid of size i, for α some constant, is upper-bounded by a number that turns out to be asymptotically smaller than the number of distinct possible K n,S n .
Upper bound. We start with the upper bound. Notice that for any k ∈ N >0 and ∆ = (S n ) n∈N , the set of words of length n in L ∆ is exactly K n,S n . Hence the fact that L ∆ can be recognized by a sequence of programs of length O(n k ) over a monoid in DA k is a consequence of the following proposition.
Proposition 5.3. For all k ∈ N >0 there is a monoid M k ∈ DA k such that for all n ∈ N and all k-sets S n over n, the language K n,S n is recognized by a program over M k of length at most 4n k .
Proof. We first define by induction on k a family of languages Z k over the alphabet Y k = {⊥ l , ⊤ l | 1 ≤ l ≤ k}. For k = 0, Z 0 is {ε}. For k > 0, Z k is the set of words containing ⊤ k and such that the first occurrence of ⊤ k has no ⊥ k to its left, and the factor between the first occurrence of ⊤ k and the first occurrence of ⊥ k or ⊤ k to its right, or the end of the word if there is no such letter, belongs to Z k−1 . A simple induction on k shows that Z k is defined by the expression Z k = Y k−1 * ⊤ k Z k−1 (ε + (⊥ k + ⊤ k )Y k * ).
Fix n. If n = 0, the proposition follows trivially. Otherwise, we define by induction on k a program P k (i, S) for every k-set S over n and every 1 ≤ i ≤ n + 1 that will for the moment output elements of Y k ∪ {ε} instead of outputting elements of M k .
For any k > 0, 1 ≤ j ≤ n and S a k-set over n, let f j,S be the function with f j,S (0) = ε and f j,S (1) = ⊤ k if j is the first element of some ordered k-tuple of S, f j,S (1) = ⊥ k otherwise. We also let g k be the function with g k (0) = ε and g k (1) = ⊥ k . If S is a k-set over n and 1 ≤ j ≤ n, then S|j denotes the (k − 1)-set over n containing the ordered (k − 1)-tuples t such that (j, t) ∈ S.
For k > 0, 1 ≤ i ≤ n + 1 and S a k-set over n, the program P k (i, S) is the following sequence of instructions:
(i, f i,S ) P k−1 (i + 1, S|i) (i, g k ) (i + 1, f i+1,S ) P k−1 (i + 2, S|i+1) (i + 1, g k ) · · · (n, f n,S ) P k−1 (n + 1, S|n) (n, g k ).
In other words, the program guesses the first occurrence j ≥ i of 1, returns ⊥ k or ⊤ k depending on whether it is the first element of an ordered k-tuple in S, and then proceeds for the next occurrences of 1 by induction.
For k = 0, 1 ≤ i ≤ n + 1 and S a 0-set over n (that is, empty or containing ε, the only ordered 0-tuple of elements of [n]), the program P 0 (i, S) is the empty program ε.
A simple computation shows that for any k ∈ N >0 , 1 ≤ i ≤ n + 1 and S a k-set over n, the number of instructions in P k (i, S) is at most 4n k .
A simple induction on k shows that when running on a word w ∈ {0, 1} n , for any k ∈ N >0 , 1 ≤ i ≤ n + 1 and S a k-set over n, P k (i, S) returns a word in Z k iff the ordered k-tuple of the positions of the first k occurrences of 1 starting at position i in w exists and is an element of S.
For any k > 0 and S n a k-set over n, it remains to apply the syntactic morphism of Z k to the output of the functions in the instructions of P k (1, S n ) to get a program over M k of length at most 4n k recognizing K n,S n .

For each possible acceptance set, an input word to the program is accepted if and only if the word over the alphabet M produced by the program belongs to some fixed Boolean combination of languages in SUM k . The idea is then just to keep enough instructions so that membership of the produced word over M in each of these languages does not change.
Recall that if P is a program over some monoid M of range n, then P (w) denotes the element of M resulting from the execution of the program P on w. It will be convenient here to also work with the word over M resulting from the sequence of executions of each instruction of P on w. We denote this word by E P (w).
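In code, the distinction between P (w) and E P (w) might look as follows; the monoid and the program below are toy choices of ours (the additive monoid Z/3Z), purely for illustration.

```python
def eval_program(prog, w, mult, unit):
    """Return (P(w), E_P(w)): the product in the monoid and the word of
    monoid elements produced instruction by instruction."""
    word = [f[w[p - 1]] for p, f in prog]   # E_P(w)
    result = unit
    for m in word:
        result = mult(result, m)            # P(w)
    return result, word

# Toy example over the monoid (Z/3Z, +): each instruction outputs 1 when
# it reads the letter 'b', so P(w) counts the occurrences of 'b' at the
# queried positions modulo 3.
prog = [(1, {'a': 0, 'b': 1}), (2, {'a': 0, 'b': 1}), (3, {'a': 0, 'b': 1})]
```

On input bab, the program produces the word E P (w) = 101 over Z/3Z and the value P (w) = 2.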
The result is a consequence of the following lemma and of the fact that for any acceptance set F ⊆ M , a word w ∈ Σ n (where Σ is the input alphabet) is accepted iff E P (w) ∈ L, where L is a language in SUL k , that is, a Boolean combination of languages in SUM k .
Lemma 5.7.Let Σ be an alphabet, M a finite monoid, and n, k natural numbers.
For any program P over M of range n and any language K over M in SUM k , there exists a subprogram Q of P of length O(n max{k,1} ) such that for any subprogram Q′ of P that has Q as a subprogram, we have for all words w over Σ of length n: E P (w) ∈ K ⇔ E Q′ (w) ∈ K.
Proof. A program P over M of range n is a finite sequence (p i , f i ) of instructions where each p i is a positive natural number which is at most n and each f i is a function from Σ to M . We denote by l the number of instructions of P . For each set I ⊆ [l], we denote by P [I] the subprogram of P consisting of the subsequence of instructions of P obtained after removing all instructions whose index is not in I. In particular, P [1, m] denotes the initial sequence of instructions of P , up to instruction number m.
We prove the lemma by induction on k.
The intuition behind the proof for a program P on inputs of length n and some K 1 γK 2 ∈ SUM k when k ≥ 2 is as follows. We assume that K 1 does not contain any word with the letter γ; the other case is done symmetrically. Consider the subset of all indices I γ ⊆ [l] that correspond, for a fixed letter a and a fixed position p in the input, to the first instruction of P that would output the element γ when reading a at position p. We then have that, given some w as input, E P (w) ∈ K 1 γK 2 if and only if there exists i ∈ I γ such that the element at position i of E P (w) is γ, E P [1,i−1] (w) ∈ K 1 and E P [i+1,l] (w) ∈ K 2 . The idea is then that if we set I to contain I γ as well as all indices obtained by induction for P [1, i − 1] and K 1 and for P [i + 1, l] and K 2 , we would have that for all w, E P (w) ∈ K 1 γK 2 if and only if E P [I] (w) ∈ K 1 γK 2 , where E P [I] (w) is E P (w) in which only the elements at indices in I have been kept.
The intuition behind the proof when k < 2 is essentially the same, but without induction. We now spell out the details of the proof, starting with the inductive step.
Inductive step. Let k ≥ 2 and assume the lemma proved for all k′ < k. Let n be a natural number, P a program over M of range n and length l, and K any language over M in SUM k . If K ∈ SUM k−1 , we are done by the inductive hypothesis. Otherwise, by definition, K = K 1 γK 2 for γ ∈ M and some languages K 1 ∈ SUM k 1 and K 2 ∈ SUM k 2 over M with k 1 + k 2 = k − 1. Moreover, either γ does not occur in any of the words of K 1 or it does not occur in any of the words of K 2 . We only treat the case where γ does not appear in any of the words of K 1 ; the other case is treated similarly by symmetry.
Observe that when n = 0, we necessarily have P = ε, so that the lemma is trivially proven in that case. We now assume n > 0.
For each 1 ≤ p ≤ n and each a ∈ Σ, consider within the sequence of instructions of P the first instruction of the form (p, f ) with f (a) = γ, if it exists. We let I γ be the set of indices of these instructions for all a and p. Notice that the size of I γ is in O(n).
For all i ∈ I γ , we let J i,1 be the set of indices of the instructions within P [1, i − 1] appearing in the subprogram of P [1, i − 1] obtained by induction for P [1, i − 1] and K 1 , and J i,2 be the same for P [i + 1, l] and K 2 .
We now let I be the union of I γ and of the sets J i,1 and J′ i,2 = {j + i | j ∈ J i,2 } for all i ∈ I γ . We claim that Q = P [I] has the desired properties.
First notice that by induction the sizes of J i,1 and J i,2 for all i ∈ I γ are in O(n max{k−1,1} ) = O(n k−1 ), and because the size of I γ is linear in n, the size of I is in O(n k ) = O(n max{k,1} ) as required.
Let Q′ be a subprogram of P that has Q as a subprogram: it means that there exists some set I′ ⊆ [l] containing I such that Q′ = P [I′ ].
Now take w ∈ Σ n . Assume first that E P (w) ∈ K. Let i be the position in E P (w) of the label γ witnessing the membership in K, and let (p i , f i ) be the corresponding instruction of P ; in particular, f i (w p i ) = γ. Because γ does not occur in any word of K 1 , for all j < i such that p j = p i we cannot have f j (w p j ) = γ. Hence i ∈ I γ . By induction we have that E P [1,i−1][J] (w) ∈ K 1 for any set J ⊆ [i − 1] containing J i,1 and E P [i+1,l][J] (w) ∈ K 2 for any set J ⊆ [l − i] containing J i,2 . Hence, if we set I′ 1 = {j ∈ I′ | j < i} as the subset of I′ of elements less than i and I′ 2 = {j − i | j ∈ I′ , j > i} as the subset of I′ of elements greater than i translated by −i, we have E P [1,i−1][I′ 1 ] (w) ∈ K 1 and E P [i+1,l][I′ 2 ] (w) ∈ K 2 , so that E Q′ (w) ∈ K 1 γK 2 = K.
Assume conversely that E Q′ (w) ∈ K, and let i ∈ I′ be the index of the instruction producing the label γ witnessing the membership in K; as before, f i (w p i ) = γ. We claim that i ∈ I γ . If not, this shows that there is an instruction (p j , f j ) with j < i, j ∈ I′ , p j = p i and f j (w p j ) = γ; but then this γ would appear before position i in E Q′ (w), contradicting the fact that γ cannot occur in any word of K 1 . Hence i ∈ I γ , and by induction, from E P [1,i−1][I′ 1 ] (w) ∈ K 1 and E P [i+1,l][I′ 2 ] (w) ∈ K 2 , we deduce that E P [1,i−1] (w) ∈ K 1 and E P [i+1,l] (w) ∈ K 2 . So we have E P (w) ∈ K as desired.
Base case. There are two subcases to consider.
Subcase k = 1. Let n be a natural number, P a program over M of range n and length l, and K any language over M in SUM 1 .
If K ∈ SUM 0 , we can conclude by referring to the subcase k = 0. Otherwise K = A 1 * γA 2 * for γ ∈ M and some alphabets A 1 ⊆ M and A 2 ⊆ M . Moreover, either γ ∉ A 1 or γ ∉ A 2 . We only treat the case where γ does not belong to A 1 ; the other case is treated similarly by symmetry.
We use the same idea as in the inductive step.
Observe that when n = 0, we necessarily have P = ε, so that the lemma is trivially proven in that case. We now assume n > 0.
For each 1 ≤ p ≤ n, each α ∈ M and each a ∈ Σ, consider within the sequence of instructions of P the first and the last instruction of the form (p, f ) with f (a) = α, if they exist. We let I be the set of indices of these instructions for all a, α and p. Notice that the size of I is in O(n) = O(n max{k,1} ).
We claim that Q = P [I] has the desired properties. We just showed that it has the required length.
Let Q′ be a subprogram of P that has Q as a subprogram: it means that there exists some set I′ ⊆ [l] containing I such that Q′ = P [I′ ].
Take w ∈ Σ n . Assume first that E P (w) ∈ K. Let i be the position in E P (w) of the label γ witnessing the membership in K, and let (p i , f i ) be the corresponding instruction of P ; in particular, f i (w p i ) = γ. Because γ ∉ A 1 , the witnessing factorization gives f j (w p j ) ∈ A 1 for all j < i, so in particular we cannot have f j (w p j ) = γ when p j = p i ; hence i is the index of the first instruction of the form (p i , f ) with f (w p i ) = γ, so that i ∈ I ⊆ I′ . As E P [I′ ] (w) is a subword of E P (w) containing the γ at index i, it follows that E P [I′ ] (w) ∈ A 1 * γA 2 * = K.
Assume conversely that E P [I′ ] (w) ∈ K, and let i ∈ I′ be the index of the instruction producing the first γ in E P [I′ ] (w); since γ ∉ A 1 , it is the γ witnessing the membership in K. Consider any j ∈ [l] with j < i and suppose f j (w p j ) ∉ A 1 . Then the index j 0 of the first instruction of the form (p j , f ) with f (w p j ) = f j (w p j ) satisfies j 0 ∈ I ⊆ I′ and j 0 ≤ j < i, so that the letter f j (w p j ) appears before γ in E P [I′ ] (w), contradicting either the membership of the prefix in A 1 * or the fact that i produces the first γ. Hence for all j < i, f j (w p j ) ∈ A 1 . By symmetry, using the last instructions, we have that for all j > i, f j (w p j ) ∈ A 2 , showing that E P (w) ∈ A 1 * γA 2 * = K as desired.
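For this subcase, the first-and-last selection of instructions is easy to test exhaustively on small programs. The following Python sketch (the monoid elements, the program and all names are toy choices of ours) keeps, for every position, input letter and output element, the first and last matching instructions, and checks that membership in A 1 * γA 2 * is preserved.

```python
from itertools import product

def E(prog, w):
    """The word E_P(w) over M produced by program prog on w."""
    return [f[w[p - 1]] for p, f in prog]

def in_A1gA2(word, A1, g, A2):
    """Brute-force membership test in A1* g A2*."""
    return any(word[i] == g
               and all(x in A1 for x in word[:i])
               and all(x in A2 for x in word[i + 1:])
               for i in range(len(word)))

def keep_first_last(prog, sigma, M):
    """Indices of the first and last instructions of the form (p, f)
    with f(a) = alpha, for every position p, letter a and alpha in M."""
    I = set()
    positions = range(1, max(q for q, _ in prog) + 1)
    for p, a, alpha in product(positions, sigma, M):
        hits = [j for j, (q, f) in enumerate(prog) if q == p and f[a] == alpha]
        if hits:
            I.update({hits[0], hits[-1]})
    return sorted(I)

# A fixed range-2 program over output letters M = {'x','y','g'} and input
# alphabet {'a','b'}; K = {'x'}* g {'x','y','g'}*, so that g is not in A1.
M, sigma = ['x', 'y', 'g'], 'ab'
f0 = {'a': 'x', 'b': 'g'}
f1 = {'a': 'g', 'b': 'y'}
f2 = {'a': 'x', 'b': 'x'}
prog = [(1, f0), (2, f1), (1, f0), (1, f0), (2, f1), (1, f2)]
I = keep_first_last(prog, sigma, M)
```

On this example, I drops one of the repeated instructions, and the membership test agrees on P and on P [I] for every input word of length 2.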
Subcase k = 0. Let n be a natural number, P a program over M of range n and length l, and K any language over M in SUM 0 . Then K = A * for some alphabet A ⊆ M . We again use the same idea as before.
Observe that when n = 0, we necessarily have P = ε, so that the lemma is trivially proven in that case. We now assume n > 0.
For each 1 ≤ p ≤ n, each α ∈ M and each a ∈ Σ, consider within the sequence of instructions of P the first instruction of the form (p, f ) with f (a) = α, if it exists. We let I be the set of indices of these instructions for all a, α and p. Notice that the size of I is in O(n) = O(n max{k,1} ).
We claim that Q = P [I] has the desired properties. We just showed that it has the required length.
Let Q′ be a subprogram of P that has Q as a subprogram: it means that there exists some set I′ ⊆ [l] containing I such that Q′ = P [I′ ].
Take w ∈ Σ n . Assume first that E P (w) ∈ K. As E P [I′ ] (w) is a subword of E P (w), it follows directly that E P [I′ ] (w) ∈ A * = K as desired.
Assume finally that E P [I′ ] (w) ∈ K. If there is an instruction (p j , f j ), with j ∈ [l] and f j (w p j ) ∉ A, then either j ∈ I′ and we get a direct contradiction with the fact that E P [I′ ] (w) ∈ A * = K, or j ∉ I′ and the first instruction of the form (p j , f ) with f (w p j ) = f j (w p j ) gives a smaller index j′ ∈ I ⊆ I′ with the same property, contradicting again the fact that E P [I′ ] (w) ∈ A * = K. Hence for all j ∈ [l], f j (w p j ) ∈ A, showing that E P (w) ∈ A * = K as desired.

Conclusion
We introduced a notion of tameness, particularly relevant to the analysis of programs over monoids from "small" varieties. The main source of interest in tameness is Proposition 3.11, stating that a variety of monoids V is tame if and only if the class of regular languages p-recognized by programs over monoids from V is included in the class L(QEV). A first question that arises is for which V those two classes of regular languages are equal. We could not rule out the possibility that for some tame non-trivial V, L(QEV) \ P(V) ≠ ∅. We conjecture that if V is local then, abusing notation, QEV = EV ∗ Mod, by analogy with QV equating V ∗ Mod in that case; as L(EV ∗ Mod) ⊆ P(V) holds unconditionally (because V cannot be trivial if it is local [Til87, p. 134]), under our conjecture P(V) ∩ Reg = L(QEV) would hold for local tame varieties V (this is Conjecture 3.12). Concretely, we have obtained the technical result that DA is a tame variety using semigroup-theoretic arguments. We have given A and Com as further examples of tame varieties. Our proof that A is tame needed the fact that MOD m ∉ AC 0 for all m ≥ 2, so it would be interesting to prove A tame "purely algebraically", independently from the known combinatorial arguments [Ajt83, FSS84, Hås86] and those based on approximating circuits by polynomials over some finite field [Raz87, Smo87]. But tameness of A is actually equivalent to MOD m ∉ AC 0 for all m ≥ 2 by Proposition 3.11, thus confronting us with the challenging task of reproving significant circuit complexity results by relying mainly on new semigroup-theoretic arguments. Such a breakthrough is still to be made.
By contrast, we have shown that J is not tame. So programs over monoids from J p-recognize "more regular languages than expected". A natural question to ask is what these regular languages in P(J) are. Partial results in that direction were obtained in [Gro20].
To conclude, we should add, in fairness, that the progress reported here does not in any obvious way bring us closer to separations of major subclasses of NC 1 . Our concrete contributions here largely concern P(DA) and P(J), classes that lie well within AC 0 . But this work does uncover new ways in which a program can or cannot circumvent the limitations imposed by the algebraic structure of the underlying monoid available to it.

3.4.
The example of finite commutative monoids. The variety Com of finite commutative monoids is defined by the identity xy = yx, and L(Com) is the class of languages that are Boolean combinations of languages of the form {w ∈ Σ * | |w| a ≡ k mod p} for k ∈ [[0, p − 1]] and p prime, or {w ∈ Σ * | |w| a = k} for k ∈ N, with Σ any alphabet and a ∈ Σ (see [Eil76, Chapter VIII, Example 3.5]).
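Membership in such a Boolean combination reduces to counting letters; a minimal Python illustration (the combination chosen below is a hypothetical example of ours):

```python
def mod_count(w, a, p, k):
    """w is in {v | |v|_a = k mod p}."""
    return w.count(a) % p == k

def exact_count(w, a, k):
    """w is in {v | |v|_a = k}."""
    return w.count(a) == k

# Example Boolean combination over the alphabet {a, b}: an even number
# of a's, but not exactly two b's.
def in_L(w):
    return mod_count(w, 'a', 2, 0) and not exact_count(w, 'b', 2)
```

For instance, aab and bbb are in this language, while aabb and abb are not.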