Asynchronous wreath product and cascade decompositions for concurrent behaviours

We develop new algebraic tools to reason about concurrent behaviours modelled as languages of Mazurkiewicz traces and asynchronous automata. These tools reflect the distributed nature of traces and the underlying causality and concurrency between events, and can be said to support true concurrency. They generalize the tools that have been so efficient in understanding, classifying and reasoning about word languages. In particular, we introduce an asynchronous version of the wreath product operation and we describe the trace languages recognized by such products (the so-called asynchronous wreath product principle). We then propose a decomposition result for recognizable trace languages, analogous to the Krohn-Rhodes theorem, and we prove this decomposition result in the special case of acyclic architectures. Finally, we introduce and analyze two distributed automata-theoretic operations. One, the local cascade product, is a direct implementation of the asynchronous wreath product operation. The other, global cascade sequences, although conceptually and operationally similar to the local cascade product, translates to a more complex asynchronous implementation which uses the gossip automaton of Mukund and Sohoni. This leads to interesting applications to the characterization of trace languages definable in first-order logic: they are accepted by a restricted local cascade product of the gossip automaton and 2-state asynchronous reset automata, and also by a global cascade sequence of 2-state asynchronous reset automata. Over distributed alphabets for which the asynchronous Krohn-Rhodes theorem holds, a local cascade product of such automata is sufficient and this, in turn, leads to the identification of a simple temporal logic which is expressively complete for such alphabets.


Introduction
Algebraic automata theory, that is, the use of algebraic tools and notions such as monoids, morphisms and varieties, has been very successful in classifying recognizable word languages, from both the theoretical and the algorithmic points of view, offering structural descriptions of, say, logically defined classes of languages and decision algorithms for membership in these classes [Eil76,Str94,Pin19]. The purpose of this paper is to extend some of this approach to the concurrent setting. We are particularly interested in decomposition results, in the spirit of the Krohn-Rhodes theorem, and their applications to the study of first-order definable trace languages.
Let us first specify our model of concurrency. Words represent sequential behaviours: a sequence of letters models a sequence of events, occurring on a single process. In a concurrent setting involving multiple processes, we work with the well established (Mazurkiewicz) traces [Maz77,DR95]: a trace represents a concurrent behaviour as a labelled partial order which captures the distribution of events across processes, as well as causality and concurrency between them.
The notion of a recognizable trace language is also very well established: a set of traces is recognizable if the set of all the words representing these traces is a regular language. A key contribution, due to Zielonka, is the description of an automata-theoretic model for the acceptance of recognizable trace languages, namely asynchronous automata [Zie87]. These automata, with their local state sets (one for each process), are natural distributed devices, which run on input traces in a distributed fashion, respecting the underlying causality and concurrency between events. More precisely, when working on an event during a run on an input trace, an asynchronous automaton updates only the local states of the processes participating in that event; the other processes remain oblivious to the occurrence of this event. Zielonka's theorem states that every recognizable trace language is accepted by an asynchronous automaton.
Early results seemed to indicate that the algebraic approach could be neatly transferred to recognizable trace languages, with the monoid-theoretic definition of recognizability matching the operational model of asynchronous automata (this is Zielonka's theorem mentioned above [Zie87]) and the characterization of star-free and first-order definable trace languages (Guaiana et al. [GRS92], Ebinger and Muscholl [EM96]) in terms of aperiodic monoids. Very few significant results in this direction have emerged since, and in particular no strong Krohn-Rhodes-like decomposition results.
There are indeed deep, technical obstacles to this approach, which are discussed in some detail in [Sar22,AGSW21]. Our first contribution is the introduction of better suited notions of asynchronous transformation monoids, asynchronous morphisms and asynchronous wreath products. Because these notions closely adhere to the distributed nature of traces, we obtain important results, such as a so-called asynchronous wreath product principle, describing the languages recognized by an asynchronous wreath product of asynchronous transformation monoids. Moreover, just as (ordinary) transformation monoids model DFAs and their wreath product models the cascade product of DFAs, our asynchronous wreath product of asynchronous transformation monoids can be implemented as a local cascade product of asynchronous automata; in its purely automata-theoretic form, the local cascade product is a direct distributed implementation of the asynchronous wreath product operation.

Both the asynchronous transducer and the global-state labelling function are asynchronous computing devices, but the latter carries more information. This is reflected in the different ways these functions can be implemented. The composition of the asynchronous transducers of (appropriately defined) asynchronous automata, as we explained, translates directly to the local cascade product of these automata. In contrast, an additional ingredient is required to implement a global cascade sequence by means of an asynchronous automaton, that is, to make the global information carried by ζ A known to local states. This ingredient is the restricted cascade product with the gossip automaton.
Organization. We now briefly describe the organization of the paper. Section 2 summarizes the necessary notions on distributed alphabets, traces, transformation monoids, recognizable trace languages and asynchronous automata, including the statement of Zielonka's theorem.
We begin Section 3 with a brief account of the wreath product operation on transformation monoids, the notion of simulation and the Krohn-Rhodes theorem. We introduce our notions of asynchronous transformation monoids (Section 3.2), directly in the spirit of Zielonka's asynchronous automata, asynchronous morphisms (Section 3.3) and asynchronous wreath products (Section 3.4). Asynchronous transducers are used in Section 3.5 to state and prove the asynchronous wreath product principle (Theorem 3.24). The question of the implementation of the asynchronous wreath product, in the form of the local cascade product of asynchronous automata, is discussed in Section 3.6. Finally, in Section 3.7, we formulate and briefly discuss the asynchronous Krohn-Rhodes property of distributed alphabets.
Section 4 is entirely dedicated to the proof that the asynchronous Krohn-Rhodes property holds for distributed alphabets over an acyclic architecture (Theorem 4.2). The next section explores the applications of the notions of asynchronous wreath product and local cascade product to the special case of first-order definable languages: the characterization by LocTL[S i ] of the trace languages recognized by asynchronous wreath products of localized resets (Theorem 5.6), which shows that this logic is expressively complete over aKR distributed alphabets (Theorem 5.7); and the characterization of the first-order definable trace languages by means of the restricted local cascade product of the gossip automaton with a local cascade product of localized asynchronous reset automata (Theorem 5.14). This characterization is obtained using a new local temporal logic LocTL[Y i ≤ Y j , S i ] which is proved to be expressively complete in Theorem 5.5.
The notions of global-state labelling function associated with an asynchronous automaton, and of global cascade sequences as computing and accepting devices are introduced in Section 6, where we prove the characterization of first-order definable languages in terms of global cascade sequences of asynchronous localized reset automata (Theorem 6.12).
Section 7 completes the operational point of view on the latter results by presenting an asynchronous implementation of global cascade sequences (Theorems 7.3 and 7.7). Finally, Section 8 outlines an intriguing question left open by this work.
The results in this paper are an elaboration and an extension of those presented at CONCUR 2020 [AGSW20]. Proofs of several key results, e.g. Theorem 4.2, Theorem 5.6 and Theorem 6.12, are unavailable in the conference proceedings and are included here. The complete Section 7, discussing the implementation of a global cascade sequence as an asynchronous automaton, is a technical addition that is only briefly alluded to in our conference paper. Finally, Section 5 includes significant new technical contributions that go beyond the conference version.


Preliminaries

2.1. Basic notions in trace theory. Let P be a finite set of agents/processes. If P is clear from the context, we use the simpler notation {X i } to denote a P-indexed family {X i } i∈P .
A distributed alphabet over P is a family Σ = {Σ i }, where the Σ i are non-empty finite sets that may overlap one another. Let Σ = ⋃ i∈P Σ i . The location function loc : Σ → 2^P is defined by setting loc(a) = {i ∈ P | a ∈ Σ i }. Note that from Σ and loc, one can reconstruct the distributed alphabet, and hence we also use the notation (Σ, loc) for a distributed alphabet. The corresponding trace alphabet is the pair (Σ, I), where I is the independence relation I = {(a, b) ∈ Σ × Σ | loc(a) ∩ loc(b) = ∅} induced by (Σ, loc). The corresponding dependence relation is D = (Σ × Σ) \ I.
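To make these definitions concrete, the following is a minimal Python sketch; the processes, letters and their locations are illustrative assumptions, not taken from the examples of this paper.

```python
# Illustrative distributed alphabet over processes P = {1, 2, 3}.
Sigma_i = {1: {"a", "c"}, 2: {"b", "c"}, 3: {"b"}}

# Sigma is the union of the local alphabets.
Sigma = set().union(*Sigma_i.values())

def loc(x):
    """Location function: the set of processes on whose alphabet x occurs."""
    return frozenset(i for i, letters in Sigma_i.items() if x in letters)

# Independence relation: pairs of letters with disjoint locations.
I = {(x, y) for x in Sigma for y in Sigma if not (loc(x) & loc(y))}
# Dependence relation: its complement in Sigma x Sigma.
D = {(x, y) for x in Sigma for y in Sigma} - I
```

Note that every letter is dependent on itself, since its location meets itself.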
A Σ-labelled poset is a structure t = (E, ≤, λ) where E is a set, ≤ is a partial order on E and λ : E → Σ is a labelling function. For e, e′ ∈ E, define e ⋖ e′ if and only if e < e′ and, for each e″ with e ≤ e″ ≤ e′, either e″ = e or e″ = e′. For X ⊆ E, let ↓X = {y ∈ E | y ≤ x for some x ∈ X}. For e ∈ E, we abbreviate ↓{e} by simply ↓e. We also write ⇓e = ↓e \ {e} for the strict past of e.
This operation, henceforth referred to as trace concatenation, gives Tr(Σ) a monoid structure. A trace t is said to be a prefix of a trace t′ if there exists t″ such that t′ = t t″.
Observe that, with a (resp. b) denoting the singleton trace whose only event is labelled a (resp. b), if (a, b) ∈ I then ab = ba in Tr(Σ). A basic result in trace theory gives a presentation of the trace monoid as a quotient of the free word monoid Σ * by the congruence ∼ I ⊆ Σ * × Σ * generated by ab ∼ I ba for (a, b) ∈ I. See [DR95] for more details.
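The congruence ∼ I can be decided by computing a canonical linearization: repeatedly extract the least-labelled minimal event of the remaining word, an event being minimal when no earlier letter is dependent on it. The following Python sketch uses an illustrative independence relation (only a and b independent), which is an assumption for this example.

```python
# Symmetric independence relation (an assumption): only a and b commute.
INDEP = {("a", "b"), ("b", "a")}

def normal_form(word):
    """Lexicographic normal form: repeatedly output the least letter
    among the minimal events of the remaining trace."""
    w = list(word)
    out = []
    while w:
        # An event is minimal if every earlier letter is independent of it.
        minimal = [p for p in range(len(w))
                   if all((w[q], w[p]) in INDEP for q in range(p))]
        p = min(minimal, key=lambda p: w[p])  # least label among minimal events
        out.append(w.pop(p))
    return "".join(out)

def trace_equivalent(u, v):
    """u ~_I v iff they have the same normal form."""
    return normal_form(u) == normal_form(v)
```

Two events with the same label are dependent, so the minimum above is unambiguous.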
Proposition 2.1. The canonical morphism from Σ * to Tr(Σ), sending a letter a ∈ Σ to the trace a, factors through the quotient monoid Σ * /∼ I and induces an isomorphism from Σ * /∼ I to the trace monoid Tr(Σ).

Let t = (E, ≤, λ) ∈ Tr(Σ). The elements of E are referred to as events in t and, for an event e in t, loc(e) abbreviates loc(λ(e)). Further, let i ∈ P. The set of i-events in t is E i = {e ∈ E | i ∈ loc(e)}. This is the set of events in which process i participates. It is clear that E i is totally ordered by ≤.
Note that, if we restrict the trace t to a downward-closed subset of events c = ↓c, we get another trace (c, ≤, λ) which is a prefix of t. In fact, every prefix of t arises this way, and we often identify prefixes with downward-closed sets of events. Examples of prefixes are defined by the empty set, E itself and, more importantly, ↓e or ⇓e, for every event e ∈ E.
2.2. Recognizable trace languages. A map from a set X to itself is called a transformation of X. The set F(X) of all such transformations forms a monoid under function composition. A transformation monoid (tm, for short) is a pair (X, M ) where M is a submonoid of F(X).

Example 2.2. Let X = {1, 2} and let r 1 , r 2 be the constant maps on X, mapping each element to 1 and 2, respectively. Then M = {id X , r 1 , r 2 } is a monoid. We denote the tm (X, M ) by U 2 .

Let N be a monoid and let T = (X, M ) be a tm. By a morphism ϕ from N to T , we mean a (monoid) morphism ϕ : N → M . We abuse the notation and also write this as ϕ : N → T .
Remark 2.4. Morphisms from free word monoids to transformation monoids almost correspond to deterministic and complete automata, the only difference being that an automaton also includes an initial state and a set of final states. More precisely, let Σ be a finite alphabet, T = (X, M ) be a tm and ϕ : Σ * → T be a morphism. In the corresponding automaton, X is the set of states and, for each a ∈ Σ, ϕ(a) ∈ M ⊆ F(X) defines the deterministic and complete transition function for letter a. Conversely, a deterministic and complete automaton A over Σ defines a tm T = (X, M ) and a surjective morphism ϕ : Σ * → M as follows: X is the set of states of A and, for each a ∈ Σ, ϕ(a) ∈ F(X) is the transformation of X induced by the transition function for letter a in A. We obtain a morphism ϕ : Σ * → F(X) and we let M = ϕ(Σ * ) be the submonoid of F(X) generated by {ϕ(a)} a∈Σ , the transition monoid of A. For instance, the tm U 2 defined in Example 2.2 corresponds to the DFA in Figure 1 below, and the induced morphism is given by ϕ(a) = r 1 , ϕ(b) = r 2 and ϕ(c) = id X .

We now fix a distributed alphabet (Σ, loc). Let ϕ : Tr(Σ) → M be a morphism to a monoid M . We note that, if (a, b) ∈ I, then ab = ba in Tr(Σ) and hence ϕ(a) and ϕ(b) commute in M . In fact, in view of Proposition 2.1, any function ϕ : Σ → M with the property that ϕ(a) and ϕ(b) commute for every (a, b) ∈ I can be uniquely extended to a morphism from Tr(Σ) to M .
Transformation monoids can be naturally used to recognize trace languages. We say that a trace language L ⊆ Tr(Σ) is recognized by a tm T = (X, M ) if there exists a morphism ϕ : Tr(Σ) → T , an initial element x in ∈ X and a final subset X fin ⊆ X such that L = {t ∈ Tr(Σ) | ϕ(t)(x in ) ∈ X fin }. A trace language is said to be recognizable if it is recognized by a finite tm (see [DR95,Die90]).
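As an illustration, the following sketch implements recognition by the tm U 2 via the morphism of Remark 2.4. The distributed alphabet below (with loc chosen so that (b, c) ∈ I) is an assumption picked to make ϕ(b) and ϕ(c) commute.

```python
# Transformations of X = {1, 2}, written as dictionaries.
ID, R1, R2 = {1: 1, 2: 2}, {1: 1, 2: 1}, {1: 2, 2: 2}

PHI = {"a": R1, "b": R2, "c": ID}          # morphism on letters (Remark 2.4)
LOC = {"a": {1, 2}, "b": {1}, "c": {2}}    # assumed locations; (b, c) independent

def apply_word(word, x):
    """phi(w)(x), reading w left to right."""
    for letter in word:
        x = PHI[letter][x]
    return x

def commuting_ok():
    """phi extends to Tr(Sigma) iff phi(a), phi(b) commute for (a, b) in I."""
    for x in PHI:
        for y in PHI:
            if not (LOC[x] & LOC[y]):
                assert all(PHI[y][PHI[x][s]] == PHI[x][PHI[y][s]] for s in (1, 2))
    return True

def recognizes(word, x_in=1, X_fin=frozenset({2})):
    """Membership in the language recognized with initial 1, final {2}."""
    return apply_word(word, x_in) in X_fin
```

In particular, the two linearizations bc and cb of the same trace are both accepted.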
Remark 2.5. A morphism ϕ : Tr(Σ) → (X, M ) from the trace monoid into a tm can also be viewed, by Proposition 2.1, as a morphism from the free monoid Σ * with the additional property that ϕ(a) and ϕ(b) commute for every (a, b) ∈ I. The automaton corresponding to ϕ : Σ * → (X, M ) (cf. Remark 2.4) thus has the property that ab and ba have the same state transitions for all (a, b) ∈ I. An automaton with this property is called a diamond automaton. It is a classical sequential automata model for recognizing trace languages, since all linearizations of a trace reach the same destination state starting from any given state. For instance, the DFA in Figure 1 is a diamond automaton for the distributed alphabet under consideration. Here (b, c) ∈ I and indeed, the words bc and cb have identical state transitions in the DFA. Note however that the runs of the DFA on the two linearizations are different, namely, 1 → 1 → 2 for cb and 1 → 2 → 2 for bc.
2.3. Asynchronous automata. Recognizability of trace languages can also be seen as an automata-theoretic notion that is concurrent in nature. In the upcoming definition of an asynchronous automaton, the set of states is structured as a P-indexed family of finite non-empty sets {S i } i∈P . The elements of S i are called the local i-states, or the local states of process i. If P is a non-empty subset of P, a P -state is a map s : P → ⋃ i∈P S i such that s(j) ∈ S j for every j ∈ P . We denote by S P the set of all P -states and we call S = S P the set of global states. 2 If P ⊆ P and s ∈ S P then s P denotes the restriction of s to P . We use the shorthand −P to indicate the complement of P in P. We sometimes split a global state s ∈ S as (s P , s −P ) ∈ S P × S −P . If a ∈ Σ, we talk about a-states to mean loc(a)-states and we write S a for S loc(a) . If a ∈ Σ, loc(a) ⊆ P and s is a P -state, we write s a for s loc(a) .
Finally, we use the following notion of extension. If P ⊆ P and f is a transformation of S P , the extension of f to S is the transformation g ∈ F(S) such that (g(s)) P = f (s P ) and (g(s)) −P = s −P . In other words, if s = (s P , s −P ) ∈ S, then g((s P , s −P )) = (f (s P ), s −P ).
We observe that f is entirely determined by g and P . Extensions of transformations of S P are called P -maps (see Section 3.2 for more details on P -maps).
Asynchronous automata were introduced by Zielonka for concurrent computation on traces [Zie87]. An asynchronous automaton A over (Σ, loc) is a tuple ({S i } i∈P , {δ a } a∈Σ , s in ) where
• S i is a finite non-empty set of local i-states for each process i;
• for a ∈ Σ, δ a : S a → S a is a (complete) transition function on a-states;
• s in ∈ S is an initial global state.
2 Note that we can naturally identify S P with the product ∏ i∈P S i and, in particular, S with the product of all the S i . Similar to P-indexed families, we will follow the convention of writing {Y a } to denote the Σ-indexed family {Y a } a∈Σ .
Observe that an a-transition of A reads and updates only the local states of the agents which participate in a. As a result, actions which involve disjoint sets of agents are processed concurrently by A. For a ∈ Σ, let ∆ a : S → S be the extension of δ a : S a → S a . Clearly, if (a, b) ∈ I then ∆ a and ∆ b commute. Hence, the (global) transition functions {∆ a } induce a trace morphism t → ∆ t from Tr(Σ) to F(S). We denote by A(t) the global state reached when running A on t, A(t) = ∆ t (s in ).
Let L ⊆ Tr(Σ) be a trace language. We say that L is accepted by A if there exists a subset S fin ⊆ S of final global states such that L = {t ∈ Tr(Σ) | A(t) ∈ S fin }.
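A run of an asynchronous automaton on any linearization of a trace can be sketched as follows; the two-process alphabet and the mod-2 transition functions are toy assumptions for illustration.

```python
# Assumed locations: a is local to process 1, b to process 2, c synchronizes both.
LOC = {"a": (1,), "b": (2,), "c": (1, 2)}

def delta(letter, astate):
    """Toy transitions: each participating process flips its bit (counts mod 2)."""
    return tuple((s + 1) % 2 for s in astate)

def run(word, s_in):
    """Run the automaton on a linearization, starting from global state s_in."""
    s = dict(s_in)  # global state: process -> local state
    for x in word:
        procs = LOC[x]
        new = delta(x, tuple(s[i] for i in procs))  # read/update only loc(x)
        for i, v in zip(procs, new):
            s[i] = v
    return s
```

Since a and b touch disjoint processes, the runs on ab and ba reach the same global state, as the commutation of the extended transitions Δ a and Δ b guarantees.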
The fundamental theorem of Zielonka [Zie87] states that a trace language is recognizable if and only if it is accepted by some asynchronous automaton (see [Muk12] for another proof of the theorem).
Asynchronous labelling functions. We will also use asynchronous automata to decorate events of a trace with extra information computed by the automaton. This is similar to the notion of locally computable functions defined in [MS97]. It extends to traces the notion of sequential letter-to-letter word transducers. When dealing with a trace t = (E, ≤, λ), we wish to preserve the underlying poset (E, ≤) and enrich the labels with extra information.
Formally, let (Σ, loc) be a distributed alphabet and Γ be a finite set. The alphabet Σ × Γ can be equipped with a distributed structure over P by letting (Σ × Γ) i = Σ i × Γ. As a result, the location of (a, γ) ∈ Σ × Γ is simply loc(a); thus we unambiguously reuse the notation loc for the location function of Σ × Γ, and use the notation Tr(Σ × Γ) to denote the set of traces over this distributed alphabet. A Γ-labelling function is a map θ : Tr(Σ) → Tr(Σ × Γ) such that, for t = (E, ≤, λ) ∈ Tr(Σ), we have θ(t) = (E, ≤, (λ, µ)), i.e., the map θ adds a new label µ(e) ∈ Γ to each event e in t.
For instance, given i ∈ P, we may consider the {0, 1}-labelling function θ i which decorates each event e of a trace t with µ i (e) = 1 if the strict causal past of e contains some event on process i, i.e., if E i ∩ ⇓e = ∅, and µ i (e) = 0 otherwise.
Formally, an asynchronous Γ-transducer is a pair Â = (A, {µ a }) where A = ({S i }, {δ a }, s in ) is an asynchronous automaton and, for each a ∈ Σ, µ a : S a → Γ is an output function. On an input trace t = (E, ≤, λ), the transducer decorates each event e with label a by µ a (s a ), where s a is the a-state reached by A just before executing e; the result is a trace Â(t) ∈ Tr(Σ × Γ). Given a Γ-labelling function θ, we say that Â computes (or implements) θ if for every t ∈ Tr(Σ), Â(t) = θ(t). We also say that an asynchronous automaton A computes θ if some choice of output functions {µ a } turns A into a transducer computing θ. Notice that a Γ-labelling function θ is defined on all input traces, hence an asynchronous automaton which computes θ must be complete in the sense that it admits a run on all traces from Tr(Σ) and it does not use an acceptance condition: when considering an asynchronous automaton A which computes a Γ-labelling function, we always assume that all global states of A are accepting.
For instance, the above {0, 1}-labelling function θ i can be computed by an asynchronous {0, 1}-transducer. We need two states {0, 1} for each process. Initially, all processes start in state 0. When the first event e occurs on process i, all processes in loc(e) switch to state 1: for all a ∈ Σ i , δ a is the constant map sending all states in S a to (1, . . . , 1). Then, the information is propagated via synchronizing events: for all b ∈ Σ \ Σ i , the map δ b sends (0, . . . , 0) to itself and all other states to (1, . . . , 1). It is easy to add output functions {µ a } in order to compute θ i .
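The construction just described can be sketched directly; the alphabet below (loc(a) = {1}, loc(b) = {2}, loc(c) = {1, 2}) and the choice i = 1 are assumptions for the example.

```python
LOC = {"a": {1}, "b": {2}, "c": {1, 2}}
I_PROC = 1  # the distinguished process i

def transduce(word):
    """Process each letter of a linearization; output pairs (letter, mu)."""
    state = {1: 0, 2: 0}  # all processes start in local state 0
    out = []
    for x in word:
        procs = LOC[x]
        # mu(e) = 1 iff some participant already knows of an i-event,
        # i.e. the strict causal past of e contains an event on process i.
        mu = 1 if any(state[p] == 1 for p in procs) else 0
        out.append((x, mu))
        if I_PROC in procs or mu == 1:
            # first i-event, or propagation via a synchronizing event:
            # all participating processes switch to local state 1.
            for p in procs:
                state[p] = 1
    return out
```

On the linearization bca, the event c has only b (on process 2) in its past, so it is labelled 0, while a sees c (which involves process 1) in its causal past.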
There exists a canonical (most general) function computed by an asynchronous automaton A = ({S i }, {δ a }, s in ), called the asynchronous transducer of A and denoted χ A . This function, which was already defined in [AS04], simply adds to an event e the local state information of A before executing e. Formally, letting Γ A = ⋃ a∈Σ S a , it is implemented by taking each output function µ a to be the identity function from S a to Γ A . Notice that traces in χ A (Tr(Σ)) have labels from {(a, s a ) | a ∈ Σ, s a ∈ S a } ⊆ Σ × Γ A . We denote by Σ × S the set {(a, s a ) | a ∈ Σ, s a ∈ S a }, and consider it a distributed alphabet, as it naturally inherits the location function of Σ × Γ A , that is, loc((a, s a )) = loc(a). Clearly, all Γ-labelling functions computed by A are abstractions of χ A .

Asynchronous structures and decomposition problems
This section is devoted to the development of new algebraic asynchronous structures. Our main interest is in transferring from the algebraic theory of word languages to trace languages the results and methods which rely on the wreath product operation: the wreath product principle (see [Str94]) and the Krohn-Rhodes theorem (see [Eil76]). We present a new algebraic framework for the setting of traces, and introduce an appropriate asynchronous wreath product operation; an asynchronous wreath product principle is also established in the realm of traces. This allows us to pose the question of a meaningful distributed analogue of the Krohn-Rhodes theorem, which we then partially resolve in the remainder of this article.
3.1. Wreath product in sequential setting. We first recall the definitions of division, simulation and wreath products in the context of transformation monoids, see [Eil76]. A tm T = (X, M ) divides a tm T ′ = (Y, N ) if there exists a surjective map f : Y → X and a surjective morphism π : N ′ → M defined on a submonoid N ′ of N , such that π(n)(f (y)) = f (n(y)) for all n ∈ N ′ and all y ∈ Y , see Figure 2 (left).
Finally, given morphisms ϕ : Σ * → T = (X, M ) and ψ : Σ * → T ′ = (Y, N ), we say that ψ simulates ϕ if there exists a surjective map f : Y → X such that ϕ(a)(f (y)) = f (ψ(a)(y)) for all a ∈ Σ and all y ∈ Y , see Figure 2 (right). Note that, to keep the pictures simple, the automata are not complete: all missing transitions go to an implicit sink state (state s for automaton A and s′ for automaton B). In particular, X = {q 1 , q a , q b , q ab , s} and M is generated by the transition functions ϕ(a) and ϕ(b) defined by A. We can check that ψ simulates ϕ using a suitable surjective map f : Y → X. The same map f , along with π : N ′ → M defined by π(ψ(w)) = ϕ(w), shows that T divides T ′.

For sets U and V , we denote the set of all functions from U to V by F(U, V ).
Definition 3.4 (Wreath Product). Let T 1 = (X, M ) and T 2 = (Y, N ) be two tm's. We define the wreath product of T 1 and T 2 to be the tm T = T 1 ≀ T 2 = (X × Y, M × F(X, N )) where, for m ∈ M and f ∈ F(X, N ), (m, f ) represents the following transformation on X × Y : (m, f )(x, y) = (m(x), f (x)(y)). One verifies that the product of (m 1 , f 1 ) and (m 2 , f 2 ) in T is (m 1 m 2 , f ) where, for every x ∈ X, f (x) = f 1 (x) f 2 (m 1 (x)) (throughout, products of transformations are read left to right: (gh)(x) = h(g(x))).

It is well known [Eil76] that the wreath product operation is associative on transformation monoids. The celebrated Krohn-Rhodes theorem [KR65] (see [Str94,DKS12] for different proofs), in its division and its simulation formulations, is as follows.
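Operationally, the wreath product corresponds to the cascade product of automata, where the second automaton reads the current state of the first together with the input letter. A sketch follows; both toy DFAs are assumptions for illustration.

```python
def cascade_run(word, d1, d2, x0, y0):
    """Cascade product run. d1 : (x, a) -> x' drives the first automaton;
    d2 : (y, (x, a)) -> y' lets the second read the letter together with
    the state of the first BEFORE its a-transition -- exactly the
    information a function in F(X, N) encodes."""
    x, y = x0, y0
    for a in word:
        # simultaneous assignment: d2 sees the old value of x
        x, y = d1[(x, a)], d2[(y, (x, a))]
    return x, y

# Toy instance: the first DFA tracks the parity of a's; the second flips
# its bit whenever the first is currently in state 1.
d1 = {(x, l): (1 - x if l == "a" else x) for x in (0, 1) for l in "ab"}
d2 = {(y, (x, l)): y ^ x for y in (0, 1) for x in (0, 1) for l in "ab"}
```

The simultaneous assignment is the design point: it matches the action (m, f)(x, y) = (m(x), f(x)(y)), where the second component is transformed using the first component's state before the step.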
Theorem 3.5 (Krohn-Rhodes Theorem). Let ϕ : Σ * → T = (X, M ) be a morphism into a finite tm. Then ϕ is simulated by a morphism ψ : Σ * → T ′, where the tm T ′ is the wreath product of finitely many transformation monoids which are either copies of U 2 or of the form (G, G) for some non-trivial simple group G dividing M . In particular, every finite transformation monoid (X, M ) divides a wreath product of the form above.
Directly interpreting these results in terms of morphisms from trace monoids to ordinary tm's leads to major technical difficulties (see [Sar22,AGSW21]). For instance, division of transformation monoids does not imply simulation of morphisms from the trace monoid to the tm's, because the trace monoid is not a free monoid. Also, the crucial wreath product principle, which describes the word languages recognized by a wreath product of two transformation monoids in terms of the word languages recognized by the individual tm's, breaks down when working with trace languages and morphisms from the trace monoid. This is primarily because the principle relies on a sequential transducer that takes into account the runs of an automaton on words; but in a diamond automaton, different linearizations of a trace may produce different runs, albeit ending in the same final state (see Remark 2.5).
3.2. Asynchronous transformation monoids. We now introduce a new algebraic framework to discuss recognizability for trace languages, which is more consistent with the distributed nature of the alphabet (Σ, loc) and with the notion of asynchronous automata. This point of view resolves the issues mentioned in the last subsection.
Asynchronous transformation monoids are defined as follows: an asynchronous transformation monoid (atm, for short) is a pair T = ({S i }, M ) where {S i } is a P-indexed family of finite non-empty sets of local states and M is a submonoid of F(S), with S the corresponding set of global states. Note that an atm T = ({S i }, M ) naturally induces the tm (S, M ), and that one can view T as the tm (S, M ) equipped with an additional structure which depends on P. We abuse notation and write T also for this tm.
More precisely, a crucial feature of the definition of an atm is that it makes a clear distinction between local and global states. While the underlying transformations operate on global states, we will be interested in global transformations that are essentially "induced" by a particular subset P of processes, that is, P -maps in the sense of Section 2.3.
It is worth pointing out at this stage that a transformation g ∈ F(S) such that g(s) is of the form (s′ P , s −P ) for every s ∈ S is not necessarily a P -map. This condition merely says that the (−P )-component of a global state is not updated by g; the update of the P -component might still depend on the (−P )-component.
The following lemma provides a characterization of P -maps; we skip the easy proof.

Lemma 3.8. Let f ∈ F(S) and P ⊆ P. Then f is a P -map if and only if, for all s, s′ ∈ S, we have (f (s)) −P = s −P and, whenever s P = s′ P , (f (s)) P = (f (s′)) P .

A simple but crucial observation regarding P -maps is recorded in the following lemma.
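The defining property of a P-map (it is the extension of a transformation of S P , in the sense of Section 2.3: the (−P)-components are untouched and the P-update depends only on the P-components) can be checked by brute force on a finite state space. A sketch, with a toy three-process state space as an assumption:

```python
from itertools import product

PROCS = (1, 2, 3)
LOCAL = {1: (0, 1), 2: (0, 1), 3: (0, 1)}  # toy local state sets
STATES = [dict(zip(PROCS, v)) for v in product(*(LOCAL[i] for i in PROCS))]

def is_P_map(g, P):
    """Check that g in F(S) is the extension of some transformation of S_P."""
    for s in STATES:
        t = g(s)
        if any(t[i] != s[i] for i in PROCS if i not in P):
            return False                      # touches a (-P)-component
        for s2 in STATES:
            if all(s[i] == s2[i] for i in P):
                t2 = g(s2)
                if any(t[i] != t2[i] for i in P):
                    return False              # P-update peeks at (-P)
    return True

def flip1(s):       # flips process 1's bit: a {1}-map
    t = dict(s); t[1] = 1 - t[1]; return t

def copy31(s):      # copies process 3's bit into process 1
    t = dict(s); t[1] = t[3]; return t
```

Here copy31 leaves the components outside {1} untouched, yet is not a {1}-map, illustrating the caveat discussed above; it is, however, a {1, 3}-map.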
Lemma 3.9. Let f, g ∈ F(S) and let P, P ′ ⊆ P. If f is a P -map, g is a P ′-map and P ∩ P ′ = ∅, then f g = gf .
Proof. Suppose that f and g are the extensions of some f ′ ∈ F(S P ) and g′ ∈ F(S P ′ ), respectively. Let Q be the complement of P ∪ P ′ in P. We can denote, unambiguously, a global state s ∈ S as s = (s P , s P ′ , s Q ). We then have f (g(s)) = f ((s P , g′(s P ′ ), s Q )) = (f ′(s P ), g′(s P ′ ), s Q ) = g((f ′(s P ), s P ′ , s Q )) = g(f (s)). This shows that f and g commute.
3.3. Asynchronous morphisms. We now introduce particular morphisms from the trace monoid Tr(Σ) to asynchronous transformation monoids. An asynchronous morphism from Tr(Σ) to an atm T = ({S i }, M ) is a morphism ϕ : Tr(Σ) → M such that ϕ(a) is an a-map for every a ∈ Σ.

It is important to observe that not every morphism ϕ : Tr(Σ) → M defines an asynchronous morphism: ϕ(a) and ϕ(b) necessarily commute whenever (a, b) ∈ I, but ϕ(a) need not be an a-map. An elementary yet fundamental result about asynchronous morphisms is stated in Lemma 3.11 below.
Lemma 3.11. Let T = ({S i }, M ) be an atm. Further, let ϕ : Σ → M be such that ϕ(a) is an a-map for every a ∈ Σ. Then ϕ can be uniquely extended to an asynchronous morphism from Tr(Σ) to T .
Proof. The map ϕ uniquely extends to a morphism from the free monoid Σ * to M . By Proposition 2.1, Tr(Σ) is the quotient of Σ * by the relations of the form ab = ba where (a, b) ∈ I, so we only need to show that ϕ(a) and ϕ(b) commute. Indeed, (a, b) ∈ I means that loc(a) ∩ loc(b) = ∅. As ϕ(a) and ϕ(b) are an a-map and a b-map, respectively, the result follows from Lemma 3.9.
A very important example of an asynchronous morphism is given by the transition morphism of an asynchronous automaton. Let A = ({S i }, {δ a }, s in ) be an asynchronous automaton over (Σ, loc). For each a ∈ Σ, let ∆ a ∈ F(S) be the extension of the local transition function δ a , an a-map by definition. Let also M A be the submonoid of F(S) generated by the ∆ a (a ∈ Σ). By Lemma 3.11 the map a → ∆ a extends to an asynchronous morphism ϕ A from Tr(Σ) to the atm T A = ({S i }, M A ). We say that ϕ A is the transition (asynchronous) morphism of A and that T A is its transition atm.

We record the following lemma, whose proof is immediate.

Lemma 3.13. Given an asynchronous automaton A = ({S i }, {δ a }, s in ) over (Σ, loc), its transition atm T A and its transition asynchronous morphism ϕ A : a trace language is accepted by A if and only if it is recognized by T A via ϕ A , with s in as the initial state.
We also record the converse construction. Let T = ({S i }, M ) be an atm, let s in ∈ S be a global state and let ϕ : Tr(Σ) → T be an asynchronous morphism. For each a ∈ Σ, ϕ(a) is an a-map: let δ a be the (uniquely determined) transformation of S a of which ϕ(a) is an extension. Finally let A ϕ = ({S i }, {δ a }, s in ). Then A ϕ is an asynchronous automaton (over (Σ, loc)) and the following lemma is easily verified.
Lemma 3.14. Given an atm T = ({S i }, M ), a global state s in ∈ S and an asynchronous morphism ϕ : Tr(Σ) → T , the asynchronous automaton A ϕ is effectively constructible. Moreover, ϕ is the asynchronous transition morphism of A ϕ .
A trace language L ⊆ Tr(Σ) is recognized by T via ϕ (with initial state s in ) if and only if it is accepted by A ϕ .
Thus Zielonka's theorem can be rephrased to state that a trace language is recognizable if and only if it is recognized by an asynchronous morphism to an atm. We will see a more precise rephrasing in Section 3.7 below (Theorem 3.33).
3.4. Asynchronous wreath product. We adapt the definition of wreath product (see Section 3.1) to the setting of asynchronous transformation monoids.
An important observation is that the tm associated with the atm T 1 as T 2 is the wreath product of the transformation monoids (S, M ) and (Q, N ) associated with T 1 and T 2 respectively. In particular, the composition law on M × F(S, N ) is the same as in Definition 3.4. The associativity of the asynchronous wreath product operation follows immediately.
In addition, if s, s′ ∈ S and q, q′ ∈ Q are such that s P = s′ P and q P = q′ P , we have f (s) = f (s′), which concludes the proof.
3.5. Asynchronous wreath product principle. The classical wreath product principle [Str94] is a critical result that defines the importance and utility of wreath product structures in formal language theory. We give here analogous results which exploit the distributed structure of asynchronous automata and asynchronous transformation monoids. In fact we recover the classical principle as a special case when there is only one process.
Let T = ({S i }, M ) be an atm and ϕ : Tr(Σ) → T be an asynchronous morphism. We associate with T the alphabet Σ × S = {(a, s a ) | a ∈ Σ, s a ∈ S a }, where each letter a is decorated with local a-state information of T . Recall that this alphabet can be viewed as a distributed alphabet by letting (Σ × S) i (i ∈ P) be the set of letters (a, s a ) ∈ Σ × S such that a ∈ Σ i . In other words, loc((a, s a )) = loc(a). The choice of an initial global state s in ∈ S induces the following transducer over traces: the asynchronous transducer associated with ϕ and s in is the map χ : Tr(Σ) → Tr(Σ × S) which maps a trace t = (E, ≤, λ) to the trace (E, ≤, λ′) over Σ × S where, for each event e with λ(e) = a, we let λ′(e) = (a, s a ) with s = ϕ(t′)(s in ) and t′ the prefix ⇓e of t.
It is immediately verified that, if A = ({S i }, {δ a }, s in ) is an asynchronous automaton and ϕ is its transition morphism, then the transducer χ associated with ϕ and s in coincides with χ A , the asynchronous transducer of A defined in Section 2.3.
Figure 4. Asynchronous transducer output on a trace.
Example 3.18. Let ϕ be the first asynchronous morphism in Example 3.12, fix a global initial state s in , and let χ be the corresponding asynchronous transducer. Figure 4 shows (automata-style) the computation of the asynchronous morphism ϕ on the trace t = abc ∈ Tr(Σ) (by showing local process states before and after each event), and the resulting trace χ(t) ∈ Tr(Σ × S).
The following lemma is a straightforward consequence of the definition of the asynchronous transducer.

Lemma 3.19. Let ϕ : Tr(Σ) → T be an asynchronous morphism to an atm T = ({S i }, M ), let s in ∈ S and let χ be the associated asynchronous transducer. Let t ∈ Tr(Σ), a ∈ Σ and s = ϕ(t)(s in ). Then the trace χ(ta) ∈ Tr(Σ × S) factors as χ(ta) = χ(t)(a, s a ).
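Lemma 3.19 yields a simple letter-by-letter computation of χ on any linearization: decorate each event with the a-state reached after its strict past, then apply the transition. A sketch with a toy automaton (the alphabet and the mod-2 transitions are assumptions):

```python
# Assumed locations over two processes.
LOC = {"a": (1,), "b": (2,), "c": (1, 2)}

def chi(word, s_in):
    """Compute the asynchronous transducer output on a linearization,
    using the factorization chi(ta) = chi(t)(a, s_a) of Lemma 3.19."""
    s = dict(s_in)
    out = []
    for a in word:
        s_a = tuple(s[i] for i in LOC[a])   # a-state before executing the event
        out.append((a, s_a))                # decorate the event with s_a
        for i in LOC[a]:                    # then perform the a-transition
            s[i] = (s[i] + 1) % 2           # toy transition: flip each bit
    return out
```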
We now define a notion of asynchronous wreath product of asynchronous morphisms defined on trace monoids.
Lemma 3.21. If ϕ and ψ are as in the above definition, then ϕ as ψ induces an asynchronous morphism to the atm T as T ′ .
The following technical result establishes an important connection between asynchronous transducers and the asynchronous wreath product of asynchronous morphisms. It is crucially used in the proof of the asynchronous wreath product principle.
Proposition 3.22. For each t ∈ Tr(Σ), let π 1 (t) and π 2 (t) be the first and second component projections of (ϕ as ψ)(t) ∈ M × F(S, N ). Then π 1 (t) = ϕ(t) and π 2 (t)(s in ) = ψ(χ(t)).
We can now state and prove both directions of what we term the asynchronous wreath product principle.
Proof. Let ψ : Tr(Σ × S) → T ′ = ({Q i }, N ) be an asynchronous morphism recognizing L, with q in ∈ Q as the initial global state and Q fin ⊆ Q as the set of final global states. Then t ∈ χ −1 (L) if and only if ψ(χ(t))(q in ) ∈ Q fin . By Proposition 3.22, this holds if and only if (ϕ as ψ)(t)(s in , q in ) ∈ S × Q fin , which concludes the proof that ϕ as ψ recognizes χ −1 (L).
Theorem 3.24. Let T 1 = ({S i }, M ) and T 2 = ({Q i }, N ) be atms and let L ⊆ Tr(Σ) be a trace language recognized by an asynchronous morphism η : Tr(Σ) → T 1 as T 2 , with initial global state (s in , q in ). For each a ∈ Σ, let η(a) = (m a , f a ). Mapping each a ∈ Σ to m a defines an asynchronous morphism ϕ : Tr(Σ) → T 1 . Let χ be the local asynchronous transducer associated to ϕ and s in . Then L is a finite union of languages of the form U ∩ χ −1 (V ), where U ⊆ Tr(Σ) is recognized by ϕ and V ⊆ Tr(Σ × S) is recognized by an asynchronous morphism to T 2 .
Proof. By Definition 3.15, each η(a) = (m a , f a ) is an a-map and, by Lemma 3.16, m a ∈ M is an a-map (of T 1 ) and f a : S → N is such that, for each s ∈ S, f a (s) ∈ N is an a-map (of T 2 ) which depends only on s a . In particular, f a : S → N may be viewed as a map f a : S a → N . Below we will use f a in this sense.
It follows, by Lemma 3.11, that the map a → m a extends to an asynchronous morphism ϕ : Tr(Σ) → T 1 . Similarly, mapping (a, s a ) ∈ Σ × S to f a (s a ) defines an asynchronous morphism ψ : Tr(Σ × S) → T 2 and we have η = ϕ as ψ.
By definition of recognizability, L is the union of a finite family of languages recognized by η with initial global state (s in , q in ) and a single final global state and we can, without loss of generality, assume that L is recognized with a single final global state, say, (s fin , q fin ).
Let π 1 (t) and π 2 (t) be the first and second component projections of η(t), for each t ∈ Tr(Σ). Thus t ∈ L if and only if η(t)(s in , q in ) = (s fin , q fin ). Proposition 3.22, applied to η = ϕ as ψ, shows that this is equivalent to ϕ(t)(s in ) = s fin and ψ(χ(t))(q in ) = q fin .
Let now U ⊆ Tr(Σ) be recognized by the asynchronous morphism ϕ with initial and final states s in and s fin , and let V ⊆ Tr(Σ × S) be recognized by ψ with initial and final states q in and q fin . Then t ∈ L if and only if t ∈ U and χ(t) ∈ V , that is, L = U ∩ χ −1 (V ), which completes the proof.
Remark 3.25. Note that the asynchronous wreath product principle, when restricted to a single process, corresponds exactly to the sequential wreath product principle.
Example 3.26. Consider the distributed alphabet Σ = ({a, b}, {b, c}, {c}) over P = {p 1 , p 2 , p 3 } from Example 3.12. We define an asynchronous morphism η from Tr(Σ) into the asynchronous wreath product U 2 [p 1 ] as U 2 [p 3 ]. Denoting the (isomorphic) monoids of U 2 [p 1 ] and U 2 [p 3 ] by U 2 , we know by Definition 3.15 that η(a) = (m a , f a ) ∈ U 2 × F(S, U 2 ). Further, by Lemma 3.16, f a can be described as a function in F(S a , U 2 ). It is clear from the description of η below that η(a) (resp. η(b) and η(c)) is an a-map (resp. b-map and c-map). Therefore η extends to an asynchronous morphism.
The naturally derived asynchronous morphisms are ϕ : Tr(Σ) → U 2 [p 1 ] and ψ : Tr(Σ × S) → U 2 [p 3 ]. Since U 2 [p 1 ] is localized, we only need to describe ϕ on Σ p 1 letters (the other letters being mapped to id), and similarly for ψ. Note that since there is no letter on which processes p 1 and p 3 synchronize, and the information passed by U 2 [p 1 ] via re-labelling of events is 'local', it cannot be utilised by U 2 [p 3 ]; hence the asynchronous morphism η to the asynchronous wreath product structure U 2 [p 1 ] as U 2 [p 3 ] essentially reduces to an asynchronous morphism into the direct product U 2 [p 1 ] × U 2 [p 3 ].
3.6. Local cascade product. Now we present an automata-theoretic view of the asynchronous wreath product.
This operation on asynchronous automata corresponds exactly to the asynchronous wreath product defined in Section 3.4, in the sense of the following statement.
Proof. Let δ̂ a , δ̂ (a,s a ) and ∆̂ a denote the extensions to global states (S, Q and R, respectively) of the maps δ a ∈ F(S a ), δ (a,s a ) ∈ F(Q a ) and ∆ a ∈ F(R a ). By definition of transition morphisms (Section 3.3), for each a ∈ Σ and s ∈ S, we have ϕ(a) = δ̂ a and ψ(a, s a ) = δ̂ (a,s a ) , while the transition morphism of A 1 • A 2 maps a to ∆̂ a .
The correspondence between the asynchronous wreath product of asynchronous transformation monoids and the local cascade product of asynchronous automata established in Proposition 3.28 induces an automata-theoretic version of the asynchronous wreath product principle, using the asynchronous transducers of the asynchronous automata involved.
More precisely, a run of the local cascade product A 1 • A 2 on a trace t can be understood as follows. One first views A 1 as an asynchronous computing device, namely the asynchronous transducer Â 1 (see Section 2.3), which computes χ A 1 . Its run on t outputs the trace χ A 1 (t) ∈ Tr(Σ × S) (see Figure 4) and one then runs A 2 on that trace, leading to a state q ∈ Q. Note that, due to the asynchronous nature of the computing device, as soon as A 1 has finished working on some event (of t), A 2 can start working on the 'same' event (of χ A 1 (t)). Further, A 1 and A 2 work asynchronously and can 'simultaneously' process concurrent events. Finally, the run of A 1 • A 2 on t takes the initial state (s in , q in ) to the pair (s, q), where s = A 1 (t) and q is as above. Note also that the local cascade product is associative. The following local cascade product principle, which relies on associativity, is the announced rephrasing of the asynchronous wreath product principle in automata-theoretic terms.
Consider asynchronous automata A = A 1 • A 2 • A 3 and B; then A • B is also a local cascade product. A 1 , A 2 , A 3 and B are all localized asynchronous automata, having two local states in processes 1, 2, 3, and 3 respectively, and a single local state in each of the remaining processes. The automata are described in Figure 6; note that due to its localized structure, A 1 is completely described by the transitions induced by the letters of Σ 1 on the local states of process 1. A similar statement holds for A 2 , A 3 , and B. For simplicity, in the extended alphabet letters for A 2 , A 3 and B, we have only displayed the non-trivial state information.
It is not difficult to see that the asynchronous transducer Â labels process 1 events (resp. process 2 events, and process 3 events) by p 2 (resp. q 2 and s 2 ) if and only if that event has an a-labelled event in its past. Since B detects the existence of the letter (d, s 2 ) by changing its state, A • B can recognize, for instance, the language "there exists a d that has an a in its past".
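The decomposition of a cascade run described above can be sketched in Python (all data is hypothetical and simplified from the example: a first automaton localized at process 1 detects a past a, and a second automaton reads the decorated trace):

```python
# Sketch of the run decomposition (toy data): A1, localized at process 1,
# remembers whether an a occurred; its transducer decorates each event with
# A1's local states; A2 then reads the decorated trace and flags, on
# process 2, any d-event whose decoration says "a already seen".

def run(letters, loc, delta, state, transduce=False):
    state = dict(state)
    out = []
    for a in letters:
        procs = loc[a]
        s_a = tuple(state[p] for p in procs)
        if transduce:
            out.append((a, s_a))
        for p, v in zip(procs, delta[a](s_a)):
            state[p] = v
    return (state, out) if transduce else state

# A1: a on {1}, d on {1, 2}; process 1 flips to 1 after an a.
loc1 = {"a": (1,), "d": (1, 2)}
d1 = {"a": lambda s: (1,), "d": lambda s: s}      # d leaves A1 unchanged

def d2(letter):
    # A2's transition on a decorated letter: only a d-event decorated with
    # A1-state 1 on process 1 sets the flag on process 2.
    a, s_a = letter
    if a == "d" and s_a[0] == 1:
        return lambda q: (q[0], 1)
    return lambda q: q

# Run A1 as a transducer on t = ad, then run A2 on chi_{A1}(t).
s1, chi = run(["a", "d"], loc1, d1, {1: 0, 2: 0}, transduce=True)
q = {1: 0, 2: 0}
for letter in chi:
    a, _ = letter
    procs = loc1[a]
    q_a = tuple(q[p] for p in procs)
    for p, v in zip(procs, d2(letter)(q_a)):
        q[p] = v
# The cascade A1 ∘ A2 on t reaches the pair (s1, q); here q[2] == 1,
# witnessing "some d has an a in its past".
```

Running the same code on the trace da leaves the flag at 0, since the d-event is then decorated with A1-state 0.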
To conclude this section, we relate local cascade products with the Γ-labelling functions introduced in Section 2.3.
Figure 6. Cascade product of localized reset asynchronous automata.
Proof. Let γ : Σ × S → Σ × Γ be the map given by γ(a, s a ) = (a, µ a (s a )). We also denote by γ the extension of this map to a morphism from Tr(Σ × S) to Tr(Σ × Γ). By definition, θ = γ • χ A . In particular θ −1 (L) = χ −1 A (γ −1 (L)). Let B ′ be the automaton on alphabet Σ × S obtained from B by keeping the same state sets and global initial and accepting states, and letting the transition δ (a,s a ) be equal to the transition δ (a,µ a (s a )) of B. It is clear that B ′ accepts γ −1 (L) and the conclusion follows by Theorem 3.23.
3.7. Questions of decomposition. Most of the known characterizations of interesting classes of recognizable trace languages (e.g., star-free languages, languages definable in first-order logic or in a global or local temporal logic, see [GRS92,EM96,DG02,DG06]) are in terms of morphisms into ordinary transformation monoids or of syntactic monoids (equivalently, of the transition monoids of certain canonical minimal diamond-automata). In contrast, Zielonka's theorem simulates a morphism into a tm by an asynchronous morphism into an atm. We seek an asynchronous version of the Krohn-Rhodes theorem that would be similar in spirit: starting with a morphism into a tm, we ask whether it can be simulated by an asynchronous morphism into an asynchronous wreath product of 'simpler' asynchronous transformation monoids. A positive resolution in this form would help us lift the sequential or diamond automata-theoretic characterizations of first-order definable trace languages mentioned above to asynchronous automata-theoretic characterizations (cf. Theorem 5.14 below).
We would also like the statement of the proposed asynchronous version of the Krohn-Rhodes theorem to coincide with the classical sequential Krohn-Rhodes Theorem (Theorem 3.5) when the underlying distributed alphabet involves only one process.
We first extend the notion of simulation (Definition 3.1) to trace morphisms as follows.
If ψ happens to be an asynchronous morphism to an atm T , we say that ψ simulates ϕ if this is the case when ψ is viewed as a morphism to the tm underlying T .
We note that Zielonka's fundamental theorem [Zie87], already mentioned in Sections 2.2 and 3.3, can actually be rephrased as follows (see [Muk12]).
Theorem 3.33 (Zielonka Theorem). Every morphism ϕ : Tr(Σ) → T to a finite tm is simulated by an asynchronous morphism ψ : Tr(Σ) → T ′ to an atm.
Recall the atm U 2 [p] defined in Example 3.8, an asynchronous analogue at process p ∈ P of the tm U 2 . If G is a group, recall that (G, G) is a tm and let G[p] be its asynchronous analogue at process p (again, see Example 3.8).
Definition 3.34. We say that a distributed alphabet (Σ, loc) has the asynchronous Krohn-Rhodes property ((Σ, loc) is aKR, for short) if every morphism ϕ : Tr(Σ) → T to a tm T = (X, M ) is simulated by an asynchronous morphism to an atm T which is the asynchronous wreath product of asynchronous transformation monoids of the form U 2 [p] or G[p], where p ∈ P and G is a simple group dividing M .
It is an interesting question to characterize which distributed alphabets are aKR, or even whether all are. While we are not able to answer this question, we show in Section 4 that acyclic architectures are aKR. In Section 5, we show that a weaker property holds when we restrict our attention to morphisms from Tr(Σ) to aperiodic transformation monoids. See also the discussion in Section 8.
In view of our discussion so far, it is clear that establishing that a distributed alphabet is aKR amounts to a simultaneous generalization, for this particular distributed alphabet, of the Krohn-Rhodes theorem (Theorem 3.5) and of Zielonka's theorem (Theorem 3.33).

The case of acyclic architectures
Definition 4.1. The communication graph of a distributed alphabet {Σ i } i∈P is the undirected graph G = (P, E) where E = {(i, j) ∈ P × P | i = j and Σ i ∩ Σ j = ∅}. An acyclic architecture is a distributed alphabet whose communication graph is acyclic.
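Definition 4.1 can be sketched in a few lines of Python (toy data, and the function names are ours): building the communication graph of a distributed alphabet and testing acyclicity with a union-find pass.

```python
# Sketch (toy data): the communication graph of Definition 4.1 and a
# union-find acyclicity test.

def communication_graph(loc):
    """loc: letter -> set of processes.  Edge {i, j}, i != j, iff i and j
    share a letter."""
    edges = set()
    for procs in loc.values():
        ps = sorted(procs)
        for k in range(len(ps)):
            for l in range(k + 1, len(ps)):
                edges.add(frozenset((ps[k], ps[l])))
    return edges

def is_acyclic(processes, edges):
    parent = {p: p for p in processes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for e in edges:
        i, j = tuple(e)
        ri, rj = find(i), find(j)
        if ri == rj:          # edge inside one component: a cycle closes
            return False
        parent[ri] = rj
    return True

# Acyclic architecture: a on {1,2}, b on {2,3} gives the path 1 - 2 - 3;
# adding c on {1,3} closes a cycle (cf. the alphabet used in Lemma 5.8).
loc = {"a": {1, 2}, "b": {2, 3}}
acyclic = is_acyclic({1, 2, 3}, communication_graph(loc))
loc_c = dict(loc, c={1, 3})
cyclic = not is_acyclic({1, 2, 3}, communication_graph(loc_c))
```

In an acyclic communication graph with at least one edge, some process has degree at most 1; such a 'leaf' process is the starting point of the induction in the proof of Theorem 4.2 below.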
Observe that if (Σ, loc) is an acyclic architecture, then no action is shared by more than two processes. We note that Zielonka's theorem admits a simpler proof in this case [KM13]. Our objective in this section is to establish the following result.
To prove Theorem 4.2, we need several technical lemmas. We first define a notion of wreath product of trace morphisms into tm's. In general, the resulting map may not extend to a morphism from the trace monoid (see [Sar22,AGSW21] for an example). However, it does so under a technical condition.
We now show how to simulate a morphism from a trace monoid to a tm, by the wreath product of morphisms to transformation monoids that are simpler.
Proof. Observe that Σ 0 ⊆ Σ p . The letters in Σ 0 are mutually dependent, that is, Σ * 0 is in fact a submonoid of Tr(Σ). Let N be the submonoid of M generated by ϕ(Σ 0 ); we can view N as N = ϕ(Σ * 0 ). For each n ∈ N , let n̄ be the constant transformation of N which maps every element to n. The set N̄ = N ∪ {n̄ | n ∈ N } is easily verified to be a submonoid of F(N ).
Let ϕ 1 : Σ → (N, N̄ ) be defined letter by letter as follows. If a and b are independent letters, then they cannot both be in Σ p , so at least one of ϕ 1 (a) and ϕ 1 (b) is the identity, and hence ϕ 1 (a) and ϕ 1 (b) commute. It follows that ϕ 1 naturally extends to a morphism ϕ 1 : Tr(Σ) → (N, N̄ ), which is the identity on the submonoid generated by Σ \ Σ p . We note that if a group G divides N̄ , then it also divides N . Indeed, suppose that τ is a morphism from a submonoid N ′ ⊆ N̄ onto G. Each element of the form n̄ (n ∈ N ) in N ′ is idempotent, and the only idempotent of the group G is its identity, so τ (n̄) = 1 G . In particular, the restriction of τ to the submonoid N ′′ = N ′ ∩ N has the same range as τ , that is, G is a quotient of N ′′ and hence G ≺ N .
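The closure step in this proof can be checked mechanically on a toy monoid. The following sketch (our notation; N is the two-element monoid {1, x} with xx = x) verifies that the right translations together with the constant maps n̄ form a submonoid of F(N ), and that the constants are idempotent, which is the key to the group-division argument above:

```python
# Toy verification: in F(N), the right translations by elements of N
# together with all constant maps form a submonoid, and every constant
# map is idempotent (hence maps to the identity under any morphism onto
# a group).

from itertools import product

N = ["1", "x"]                       # the two-element monoid with x*x = x
def mult(m, n):
    return "1" if (m, n) == ("1", "1") else "x"

trans = {("r", m): {n: mult(n, m) for n in N} for m in N}   # n -> n*m
consts = {("c", m): {n: m for n in N} for m in N}           # the maps "n-bar"
Nbar = {**trans, **consts}

def compose(f, g):                   # first apply f, then g
    return {n: g[f[n]] for n in N}

tables = {tuple(sorted(f.items())) for f in Nbar.values()}
closed = all(tuple(sorted(compose(f, g).items())) in tables
             for f, g in product(Nbar.values(), repeat=2))
idempotent = all(compose(c, c) == c for c in consts.values())
# Both closure and idempotency hold for this toy N.
```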
Observe that, if w = a 1 . . . a r , i is maximal such that a i ∈ Σ p \ Σ 0 (i = 0 if w has no letter in Σ p \ Σ 0 ) and w ′ is the projection of a i+1 · · · a r onto Σ * 0 , then ϕ 1 (w) = ϕ(w ′ ). In other words, ϕ 1 (w) is the evaluation under ϕ of the word read by process p since its last joint action with a neighbour. Now let us make Σ × N a distributed alphabet (over P) by letting loc((a, n)) = loc(a) for each (a, n) ∈ Σ × N . Let also ϕ 2 be defined on the letters of Σ × N . If (a, n) and (b, n ′ ) are independent letters, that is, if a and b are independent in Σ, then ϕ(a) and ϕ(b) commute because ϕ is defined on Tr(Σ). Moreover, either a, b ∉ Σ p , or exactly one of a and b is in Σ p . In the first case, ϕ 2 (a, n) = ϕ(a) and ϕ 2 (b, n ′ ) = ϕ(b) commute.
The penultimate equality follows from the fact that the generators of N commute with ϕ(a), since p ∉ loc(a). In all cases, f (η(a)(n, x)) = ϕ(a)(f (n, x)). This completes the proof.
Our next lemma shows how to combine simulations by asynchronous morphisms, under certain technical assumptions.
Proof (of Theorem 4.2). The proof proceeds via induction on the number of processes. The base case, with only one process (and hence no distributed structure on the alphabet), is the Krohn-Rhodes theorem (Theorem 3.5).
Let now P = {1, 2, . . . , k}, with k ≥ 2, and assume that the theorem holds for acyclic architectures with less than k processes. Since the communication graph is acyclic, there exists a 'leaf' process which communicates with at most one other process. Without loss of generality, let this leaf process be 1, and its only neighbouring process be 2 (if process 1 has no neighbour, then process 2 can be any other process). We let Σ 0 be the set of letters a such that loc(a) = {1}. By our assumptions, loc(a) = {1, 2} for every a ∈ Σ 1 \ Σ 0 .
Note that no two letters of Σ 1 are independent, so that the submonoid of Tr(Σ) generated by Σ 1 is freely generated by Σ 1 : we write it Σ * 1 . Let ϕ ′ 1 : Σ * 1 → (X 1 , M 1 ) be the restriction of ϕ 1 to Σ * 1 . The Krohn-Rhodes theorem shows that ϕ ′ 1 is simulated by a morphism ψ 1 : Σ * 1 → T , where T is a wreath product of transformation monoids of the form U 2 and (G, G), where G is a nontrivial simple group dividing M 1 (and hence, by Lemma 4.5, dividing M ).
Next, we extend ψ 1 to a morphism ψ 1 : Tr(Σ) → T by letting ψ 1 (a) = id for each letter a ∉ Σ 1 . Consider the atm T [1] defined in Example 3.8. It is easily verified that ψ 1 : Tr(Σ) → T [1] induces an asynchronous morphism (Lemma 3.11), which simulates ϕ 1 , and that T [1] is an asynchronous wreath product of asynchronous transformation monoids of the required form (that is, of the form U 2 [1] or G[1] for a group G dividing M ).
Suppose that letters (a, s a ) and (b, s b ) are independent in Σ ′ . We first verify that they are independent in Σ as well. If this is not the case, then loc(a) ∩ loc(b) = {1}. However, as we noticed before, letters whose location contains 1 without being reduced to {1} (as is the case for all letters a such that some (a, s a ) ∈ Σ ′ ) have location {1, 2}, and this means that loc(a) ∩ loc(b) = {1, 2}, a contradiction. It follows that Tr(Σ ′ ) is a submonoid of Tr(Σ × S), and we now consider the restriction ϕ ′ 2 of ϕ 2 to Tr(Σ ′ ).
By induction hypothesis, ϕ ′ 2 is simulated by an asynchronous morphism ψ 2 : Tr(Σ ′ ) → T ′ , where T ′ is an asynchronous wreath product of asynchronous transformation monoids of the form U 2 [p] or G[p] for some simple group G dividing M and some p ∈ P \ {1}. These asynchronous transformation monoids can again be trivially extended over P (by adding a singleton state set for process 1). The corresponding extension of T ′ , denoted T ′ [↑P], is an asynchronous wreath product of the desired form. Moreover ψ 2 can also be extended over Σ × S by mapping any letter in (Σ × S) \ Σ ′ to the identity. It is easily verified that ψ 2 is an asynchronous morphism which simulates ϕ 2 .
We can now apply Lemma 4.7 to conclude the proof.

Temporal logics, first-order trace languages & local cascade products
Recall that star-free trace languages, aperiodic trace languages (those recognized by an aperiodic monoid) and first-order definable trace languages coincide, by the combined results of Guaiana, Restivo and Salemi [GRS92] and Ebinger and Muscholl [EM96]. A consequence of Theorem 4.2 is that any aperiodic trace language over an acyclic architecture, and indeed over any aKR distributed alphabet, is recognized by an asynchronous wreath product of 2-state asynchronous transformation monoids of the form U 2 [p] (see Example 3.8). Proposition 3.28 then shows that it is accepted by a local cascade product of localized two-state reset asynchronous automata. For convenience, we denote these automata by U 2 [p] as well: U 2 [p] has state set {S i }, where S p = {1, 2} and each S i (i = p) is a singleton, and transitions as follows. The alphabet Σ p contains two disjoint subsets R 1 and R 2 that reset the states in S p to 1 and 2, respectively. All remaining letters act as the identity, in particular letters in Σ p \ (R 1 ∪ R 2 ). In this section, we aim at generalizing this result to any distributed alphabet. Our route towards this utilizes yet another formalism used to classify trace languages, that of temporal logics.
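The reset automata U 2 [p] just described admit a very short sketch (toy Python; the alphabet and the reset sets R 1 , R 2 are made up):

```python
# Sketch of the localized reset automaton U2[p]: letters in R1 reset
# process p to state 1, letters in R2 reset it to 2, and every other
# letter acts as the identity.

def make_u2(p, R1, R2):
    def step(state, a):
        if a in R1:
            return {**state, p: 1}
        if a in R2:
            return {**state, p: 2}
        return state                  # identity on all remaining letters
    return step

u2 = make_u2(1, R1=set(), R2={"a"})   # an "a seen" detector on process 1
s = {1: 1}
for a in "bab":
    s = u2(s, a)
# s[1] == 2: some a occurred; on a word without a, the state stays 1.
```

Such two-state resets are the asynchronous building blocks used throughout this section.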

Local temporal logics. We first introduce a process-based past-oriented local temporal logic over traces, called LocTL[Y i ≤ Y j , Y i , S i ].
This logic, as well as some of its fragments, turns out to be as expressive as first-order logic over traces, see Theorem 5.5 below. Furthermore, we have chosen the logic in a way that facilitates showing the correspondence with local cascade products (see the proofs of Theorem 5.6 and Corollary 5.12). Turning to the semantics of LocTL[Y i ≤ Y j , Y i , S i ], each event formula is evaluated at an event of a trace t ∈ Tr(Σ), say, t = (E, ≤, λ). For any event e ∈ E and process i ∈ P, we denote by e i the maximal i-event in the strict past of e, when it exists. The letter formula a holds at e if λ(e) = a, and the constant Y i ≤ Y j holds at e if e i and e j both exist and e i ≤ e j . Further, t, e |= Y i α if e i exists and t, e i |= α, and t, e |= α S i β if e ∈ E i and there exists f ∈ E i such that f < e, t, f |= β and, for all g ∈ E i , f < g < e implies t, g |= α.
Note that Y i is a modality expecting an argument α, whereas a and Y i ≤ Y j are basic constant formulas. Trace formulas are evaluated on traces by interpreting the Boolean connectives in the natural way and by letting t |= ∃ i α if there exists a maximal i-event e in t such that t, e |= α.
We will also consider various fragments of LocTL[Y i ≤ Y j , Y i , S i ]. A trace language L ⊆ Tr(Σ) is definable in the logic (or in one of its fragments) if there is a trace formula β of the logic (resp. of the corresponding fragment) such that L = {t ∈ Tr(Σ) | t |= β}.
In Theorem 5.5, the three local temporal logics LocTL[Y i ≤ Y j , Y i , S i ], LocTL[Y i ≤ Y j , S i ] and LocTL[Y i , S i ] are shown to be equally expressive. To this end, we will use the derived constants (Y i = Y j ) and (Y i < Y j ). Note that the constant (Y i ≤ Y i ) is equivalent to Y i ⊤ and simply means that there are i-events in the strict past of the current event.
For i ∈ P, m ≥ 1 and an event formula α, we define the derived formulas
Y 1 i α = ∨ j∈P ((Y i = Y j ) ∧ (⊥ S j α)), Y m+1 i α = ∨ j∈P ((Y i < Y j ) ∧ ((Y i < Y j ) S j (Y m i α))).
Remark 5.1. Notice that an event satisfying a formula of the form β S j γ must be a j-event. Hence, we may restrict the use of (Y i = Y j ) and (Y i < Y j ) to be at j-events only and still get an expressively complete logic.
Lemma 5.2. Consider a trace t ∈ Tr(Σ). For any event e in the trace, any process i, and any natural number m, if t, e |= Y m i α then t, e |= Y i α.
Proof. The proof is by induction on m. The base case is when m = 1. Suppose that t, e |= (Y i = Y j ) ∧ (⊥ S j α) for some j ∈ P. Due to (Y i = Y j ), we know that e i , e j exist and e i = e j . Now, from ⊥ S j α, we get that e ∈ E j and t, e j |= α. We deduce that t, e i |= α and t, e |= Y i α. For the inductive step, let m ≥ 1 and suppose that t, e |= (Y i < Y j ) ∧ ((Y i < Y j ) S j (Y m i α)) for some j ∈ P. Due to (Y i < Y j ), we know that e i , e j exist and e i < e j . Now, from t, e |= (Y i < Y j ) S j (Y m i α), we deduce that e ∈ E j and there exists an event f ∈ E j in the strict past of e such that t, f |= Y m i α and at all j-events between e and f , the formula (Y i < Y j ) is true. By induction, we get t, f |= Y i α. Let f = f k < f k−1 < · · · < f 1 < f 0 = e be the sequence of j-events between f and e. For all 0 ≤ ℓ < k, we have (f ℓ ) j = f ℓ+1 and t, f ℓ |= (Y i < Y j ). We deduce by induction on ℓ that e i = (f ℓ ) i for all 0 ≤ ℓ ≤ k. This is clear when ℓ = 0 since e = f 0 , and for the inductive step it follows from e i = (f ℓ ) i < (f ℓ ) j = f ℓ+1 . Finally, e i = f i and, since t, f |= Y i α, we get t, e i |= α as desired.
Lemma 5.3. Conversely, if t, e |= Y i α then t, e |= Y m i α for some 1 ≤ m ≤ |P|.
Proof. Assume that t, e |= Y i α, i.e., e i exists and t, e i |= α. Since e i is in the strict past of e, there exists f such that e i ⋖ f ≤ e, where ⋖ denotes the immediate successor relation. It is well-known that there are m ≥ 1, a sequence of events f = f 1 < f 2 < · · · < f m = e and processes j ℓ such that j ℓ ∈ loc(f ℓ−1 ) ∩ loc(f ℓ ) for all 1 < ℓ ≤ m. Moreover, we may assume that the processes j 2 , . . . , j m are pairwise distinct, and also different from i since e i ⋖ f . We get m ≤ |P|.
We show by induction on ℓ that e i = (f ℓ ) i and t, f ℓ |= Y ℓ i α. For the base case ℓ = 1, since e i ⋖ f 1 we have e i = (f 1 ) i and we find j 1 ∈ loc(e i ) ∩ loc(f 1 ). We get t, f 1 |= (Y i = Y j 1 ) ∧ (⊥ S j 1 α). Therefore, t, f 1 |= Y 1 i α and we are done. Consider now 1 ≤ ℓ < m and assume that e i = (f ℓ ) i and t, f ℓ |= Y ℓ i α. Since j ℓ+1 ∈ loc(f ℓ ) ∩ loc(f ℓ+1 ), we obtain t, f ℓ+1 |= Y ℓ+1 i α, which concludes this proof.
Proof. From Lemmas 5.2 and 5.3 we see that Y i α is equivalent to the disjunction ∨ 1≤m≤|P| Y m i α. This concludes the proof.
We now establish the expressive completeness of our past-oriented local temporal logics. This crucially depends on the expressive completeness of a process-based pure future local temporal logic proved by Diekert and Gastin in [DG06]. It is unknown whether the fragment LocTL[S i ] is as expressive as LocTL[Y i , S i ] in general; see Theorem 5.7 for a partial result.
Theorem 5.5. Let (Σ, loc) be a distributed alphabet over P. Over Tr(Σ), first-order logic and the local temporal logics LocTL[Y i ≤ Y j , Y i , S i ], LocTL[Y i ≤ Y j , S i ] and LocTL[Y i , S i ] are expressively equivalent.
Proof. First, from the semantics of LocTL[Y i ≤ Y j , Y i , S i ], every formula of these logics translates directly into first-order logic. Conversely, we first show that LocTL[Y i , S i ] is expressively complete. In [DG06], Diekert and Gastin give a process-based pure future local temporal logic which they show is expressively equivalent to first-order logic over traces with a unique minimal event. We use the past-oriented duals of their modalities, with semantics: t, e |= α S i α ′ if there exists f ∈ E i such that f ≤ e and t, f |= α ′ and, for all g ∈ E i , f < g ≤ e implies t, g |= α. This shows that LocTL[Y i , S i ] is expressively complete over prime traces (traces with a unique maximal event). More precisely, for each first-order sentence ϕ, there is an event formula ϕ̂ in LocTL[Y i , S i ] such that for all prime traces t we have t |= ϕ iff t, max(t) |= ϕ̂. Adsul and Sohoni [AS02,Ads04] showed that any first-order sentence over traces can be equivalently expressed in a normal form which is a boolean combination of first-order sentences evaluated only in the process views of traces. More precisely, for a trace t = (E, ≤, λ), let t i denote the trace induced by the restriction to ↓E i . Given any first-order sentence ϕ, by [AS02] there exist a natural number n and sentences ϕ i,m (i ∈ P, 1 ≤ m ≤ n) such that t |= ϕ iff for some m, for each i ∈ P, we have t i |= ϕ i,m . Consider the equivalent event formulas ϕ̂ i,m : since each nonempty process view t i is prime, the trace formulas built from the ∃ i ϕ̂ i,m express ϕ in LocTL[Y i , S i ].
To conclude that the languages accepted by A are LocTL[S i ]-definable, we only need to show that if L 2 is LocTL[S i ]-definable over the alphabet Σ × S, then χ −1 (L 2 ) is LocTL[S i ]-definable over (Σ, loc). This is done by structural induction on LocTL[S i ]-formulas over Σ × S. For an event formula α of LocTL[S i ] over Σ × S, we provide an event formula α̂ over (Σ, loc) such that for any trace t ∈ Tr(Σ) and any event e in t, we have t, e |= α̂ if and only if χ(t), e |= α. The non-trivial case here is the base case of a letter formula α = (a, s a ). If p ∉ loc(a), then we let α̂ = a.
If instead p ∈ loc(a), the formula α̂ additionally recovers, via S p , the local p-state s a from the most recent resetting letter. We now establish the converse implication. For any LocTL[S i ] event formula α, we construct an asynchronous automaton A α , which is a local cascade product of copies of U 2 [p], and which is such that for any trace t and event e of t, each local state [A α (↓e)] i (i ∈ loc(e)) completely determines whether t, e |= α. A α is constructed by structural induction on the LocTL[S i ] event formula α.
Base case: Suppose that α = a ∈ Σ. We let S i be a singleton for all i ∉ loc(a), and S i = {⊤, ⊥} for all i ∈ loc(a). For any global state s, if for all i ∈ loc(a) we have s i = ⊥ (resp. ⊤), then we write s = ⊥ (resp. s = ⊤). We let s in = ⊥. The local transition δ a is a reset to ⊤ and the transitions δ b (b ≠ a) are resets to ⊥. This construction ensures that, for all i ∈ loc(a), we have [A α (↓e)] i = ⊤ if and only if t, e |= α. It is also easy to see that A α is a local cascade product of U 2 [p] for p ∈ loc(a).
Inductive case: The non-trivial case is α = β S j γ. By inductive hypothesis, we have constructed automata A β and A γ as local cascade products of copies of U 2 [p]. We form the local cascade product A = U 2 [j] • Â β • Â γ , where U 2 [j] with initial state ⊥ is such that all letters from Σ j reset the state to ⊤, and Â β (resp. Â γ ) simply lifts A β (resp. A γ ) to the appropriate input alphabet by ignoring the local state information provided in the local cascade product. Hence, A is a local cascade product of copies of U 2 [p] which simultaneously provides the truth values of β and γ at any event and remembers whether some j-event already occurred. Let χ be the associated local asynchronous transducer.
We construct B = ({Q i }, {δ (a,s a ) }, q in ) over Σ × S such that A • B is the required asynchronous automaton. Let Q i = {⊤, ⊥} for all i ∈ P. Again, we denote a global state q as ⊥ (resp. ⊤) if q i = ⊥ (resp. ⊤) for all i ∈ P. We let the initial state be q in = ⊥. For any a ∉ Σ j , we let the local transition δ (a,s a ) be the reset to ⊥. By assumption, if a j-event e of the trace χ(t) is labelled (a, s a ), then [s a ] j determines whether some j-event occurred in the past of e and, if this is the case, the truth values of β and γ at the previous j-event e j . When ξ is a boolean combination of β and γ, we write [s a ] j |= Y j ξ if, according to the j-state of s a , there is a previous j-event e j and ξ is true at e j . The transition for a ∈ Σ j is then given by a case analysis on [s a ] j ; the transitions make sense if we recall the identity β S j γ ≡ ⊥ S j (γ ∨ (β ∧ (β S j γ))). Note that in the last two cases of this analysis, δ (a,s a ) is the identity transformation on process j states. Hence, the process j update is realised by some U 2 [j]. The other processes of loc(a) can update their states mimicking the process j state update, once they also have the truth value of α = β S j γ at the previous j-event e j , which is made available at event e by the above U 2 [j]'s state. In view of this, it is easy to verify that B is a local cascade product of U 2 [j] followed by copies of U 2 [p] for p ≠ j.
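The identity β S j γ ≡ ⊥ S j (γ ∨ (β ∧ (β S j γ))) underlying these transitions can be illustrated by a one-pass evaluation of the since modality along the j-events of a trace (a Python sketch with our own function name; inputs are the truth values of β and γ at the successive j-events):

```python
# One-pass evaluation of (beta S_j gamma) along the j-events of a trace,
# following the unfolding beta S_j gamma == bottom S_j (gamma or
# (beta and (beta S_j gamma))).

def since_on_process(beta_vals, gamma_vals):
    out = []
    prev = False  # value of gamma or (beta and since) at the previous j-event
    for b, g in zip(beta_vals, gamma_vals):
        since = prev   # bottom S_j (...) only looks at the previous j-event
        out.append(since)
        prev = g or (b and since)
    return out

# Example: gamma holds at the first j-event, beta bridges the second, so
# the since formula holds at the second and third j-events.
vals = since_on_process([True, True, False], [True, False, False])
# vals == [False, True, True]
```

This single-pass recurrence is exactly what a two-state reset on process j can maintain, given the decorated truth values of β and γ.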
To conclude the proof, we have to handle trace formulas of LocTL[S i ], which are the sentences defining trace languages. So consider a trace formula ∃ j α where α is an event formula in LocTL[S i ]. Let A α be the local cascade product of copies of U 2 [p] constructed above. As in the inductive case above, we also use a copy of U 2 [j] which remembers whether some j-event already occurred in the past. Let t = (E, ≤, λ) ∈ Tr(Σ) and let s = (U 2 [j] • A α )(t). The local state s j allows us to determine whether E j = ∅, thanks to U 2 [j], and, when E j ≠ ∅, whether the maximal event of E j satisfies α, thanks to A α .
We are now ready to give new characterizations of first-order definable trace languages over aKR distributed alphabets. Let (Σ, loc) be an aKR distributed alphabet and let L ⊆ Tr(Σ) be a trace language. The following assertions are equivalent:
(1) L is definable in first-order logic.
(2) L is accepted by a counter-free diamond-automaton (or, recognized by an aperiodic monoid).
(3) L is accepted by a local cascade product of copies of U 2 [p] (or, recognized by an asynchronous wreath product of atms of the form U 2 [p]).
(4) L is accepted by a counter-free asynchronous automaton (or, recognized by an aperiodic asynchronous transformation monoid).
Proof. By [EM96], first-order definability coincides with recognizability by an aperiodic monoid. By Theorem 5.6, LocTL[S i ]-definability coincides with acceptance by a local cascade product of asynchronous reset automata of the form U 2 [p], or equivalently with recognizability by an asynchronous wreath product of atm's of the form U 2 [p].
If (Σ, loc) is aKR, recognizability by an aperiodic monoid implies recognizability by an asynchronous wreath product of asynchronous transformation monoids of the form U 2 [p]. The converse implication follows from the easy verification that an asynchronous wreath product of asynchronous transformation monoids of the form U 2 [p] is an aperiodic atm (that is, the associated global tm is aperiodic). Putting these equivalences together, and keeping in mind the correspondences between (asynchronous) automata and (asynchronous) morphisms into (asynchronous) transformation monoids, we get the desired result.

Cascade decomposition for LocTL[Y i ≤ Y j , S i ]
We turn now to the logic LocTL[Y i ≤ Y j , S i ] and its relation with local cascade products. Here, we seek a decomposition result which is valid for all distributed alphabets, and not only for those that are known to be aKR. Unfortunately, we do not know whether all aperiodic trace languages can be accepted by local cascade products of copies of U 2 [p]. It is interesting to notice first that the argument in the proof of Theorem 5.6 cannot be lifted to the logic LocTL[Y i ≤ Y j , S i ]. More precisely, when α = Y i ≤ Y j , the automaton A α specified in this proof cannot, in general, be obtained as a local cascade product of copies of U 2 [p]. To explain this formally, we use the notion of Γ-labelling functions computed by asynchronous automata, defined in Section 2.3.
Given an event formula α, let θ α be the {0, 1}-labelling function decorating each event with the truth value of α: θ α (t) = (E, ≤, (λ, µ α )) where, for all e ∈ E, we have µ α (e) = 1 if t, e |= α and µ α (e) = 0 otherwise. In the proof of Theorem 5.6, we have constructed for each formula α ∈ LocTL[S i ] an asynchronous automaton A α which computes θ α , and which is a local cascade product of copies of U 2 [p]. Lemma 5.8 below shows that this cannot be extended to LocTL[Y i ≤ Y j , S i ]. The lemma is a slight modification of an example from Adsul and Sohoni [AS04].
Lemma 5.8. Let (Σ, loc) be the distributed alphabet over P = {1, 2, 3} with loc(a) = {1, 2}, loc(b) = {2, 3} and loc(c) = {1, 3}, and let α be the formula Y 1 ≤ Y 3 . There is no aperiodic asynchronous automaton over (Σ, loc) which computes θ α . In particular, θ α cannot be computed by a local cascade product of copies of U 2 [p].
Proof. Suppose, for a contradiction, that an aperiodic asynchronous automaton A computes θ α . As A is aperiodic, there exists n such that, starting at the initial global state s in , the traces (ab) n and (ab) n+1 reach the same global state, say s = (s 1 , s 2 , s 3 ) = A((ab) n ) = A((ab) n+1 ). It follows that the trace ab fixes s. As the transition function of a (resp. b) does not change the local state of process 3 (resp. process 1), the a-transition at s must lead to a global state of the form s ′ = (s 1 , s ′ 2 , s 3 ). In particular, s ′ is the global state reached on input (ab) n a. Now, consider the traces t = (ab) n c and t ′ = (ab) n ac. The c-event in t satisfies α = Y 1 ≤ Y 3 whereas the c-event in t ′ does not. Since s c = (s 1 , s 3 ) = s ′ c , this contradicts the fact that µ c computes the truth value of α at c-events.
On the other hand, the gossip automaton, one of the most important tools in the theory of asynchronous automata, due to Mukund and Sohoni [MS97], computes all the constants of LocTL[Y i ≤ Y j , S i ]. Let Γ = {0, 1} P×P and θ Y be the Γ-labelling function which decorates each event e of a trace t with the truth values of all constants Y i ≤ Y j , i.e., for all i, j ∈ P, µ Y i,j (e) = 1 if t, e |= Y i ≤ Y j and µ Y i,j (e) = 0 otherwise. Since the events referred to by the modalities Y i are called primary events, we call θ Y the primary order labelling function.
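For intuition, the primary order labels can be computed directly from a linearization by maintaining causal pasts. The brute-force Python sketch below is ours; unlike the gossip automaton, it uses unbounded centralized memory, but it makes the definition of θ Y concrete:

```python
# Brute-force sketch computing the primary order labels: for each event e,
# the set of pairs (i, j) such that e_i and e_j exist and e_i <= e_j,
# where e_i is the maximal i-event in the strict past of e.

def primary_labels(letters, loc, processes):
    past = []      # past[e] = causal past of event e, including e
    last = {}      # last[p] = most recent p-event seen so far
    labels = []
    for e, a in enumerate(letters):
        down = {e}
        for p in loc[a]:
            if p in last:
                down |= past[last[p]]
        past.append(down)
        strict = down - {e}
        # e_i = maximal i-event in the strict past (i-events are totally ordered)
        prim = {i: max((f for f in strict if i in loc[letters[f]]), default=None)
                for i in processes}
        labels.append({(i, j) for i in processes for j in processes
                       if prim[i] is not None and prim[j] is not None
                       and prim[i] in past[prim[j]]})
        for p in loc[a]:
            last[p] = e
    return labels

# The alphabet of Lemma 5.8: a on {1,2}, b on {2,3}, c on {1,3}.
loc = {"a": {1, 2}, "b": {2, 3}, "c": {1, 3}}
# In abc the c-event satisfies Y_1 <= Y_3; in abac it does not.
```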
Note that, as a consequence of Lemma 5.8, the gossip automaton is not aperiodic in general, a fact that was already established in [AS04].
In view of Theorems 5.6 and 5.9, the following lemma will help relate LocTL[Y i ≤ Y j , S i ]-definable languages and local cascade products over arbitrary distributed alphabets.

Lemma 5.10. Let Γ = {0, 1} P×P and let θ Y : Tr(Σ) → Tr(Σ × Γ) be the primary order labelling function. (1) If L ⊆ Tr(Σ) is LocTL[Y i ≤ Y j , S i ]-definable over (Σ, loc), then L = θ Y −1 (L') for some LocTL[S i ]-definable language L' over (Σ × Γ, loc). (2) If L' ⊆ Tr(Σ × Γ) is LocTL[Y i ≤ Y j , S i ]-definable over (Σ × Γ, loc), then θ Y −1 (L') is LocTL[Y i ≤ Y j , S i ]-definable over (Σ, loc).

Proof. We simply write θ for the primary order labelling function θ Y .

(1) For each event formula α ∈ LocTL[Y i ≤ Y j , S i ] over Σ, we construct by structural induction a formula α̂ ∈ LocTL[S i ] over Σ × Γ such that, for all traces t ∈ Tr(Σ) and all events e in t, we have t, e |= α if and only if θ(t), e |= α̂. The interesting base cases are the constants. For i, j ∈ P, we let Γ i,j = {γ ∈ Γ | γ i,j = 1} and we translate the constant Y i ≤ Y j into the letter formula ⋁ (a,γ)∈Σ×Γ i,j (a, γ). The inductive cases are trivial, for instance the translation of ¬α is ¬α̂.

(2) Given an event formula α ∈ LocTL[Y i ≤ Y j , S i ] over Σ × Γ, we construct an event formula α̂ ∈ LocTL[Y i ≤ Y j , S i ] over Σ such that, for all traces t ∈ Tr(Σ) and all events e in t, we have θ(t), e |= α if and only if t, e |= α̂. Again, the construction is by structural induction and the interesting cases are the letter constants. For (a, γ) ∈ Σ × Γ, we translate the letter (a, γ) into a ∧ ⋀ γ i,j =1 (Y i ≤ Y j ) ∧ ⋀ γ i,j =0 ¬(Y i ≤ Y j ). The other cases are trivial, e.g., the translation of Y i α is Y i α̂.

Since the gossip automaton G computes θ Y and all LocTL[S i ]-definable languages over (Σ × Γ, loc) can be accepted by a local cascade product of copies of asynchronous reset automata of the form U 2 [p] (Theorem 5.6), we deduce from Lemma 5.10 (1) and Proposition 3.31 that all LocTL[Y i ≤ Y j , S i ]-definable languages over (Σ, loc) can be accepted by a local cascade product of the gossip automaton G followed by copies of asynchronous reset automata. Now, as we saw, the gossip automaton exhibits a non-aperiodic behaviour in general. In order to get a converse of the above statement, we introduce a restricted version of the local cascade product.
Let A = ({S i }, {δ (a,γ) }, s in ) be an asynchronous automaton over (Σ × Γ, loc). The θ Y -restricted local cascade product G • r A is an asynchronous automaton which runs G on an input trace t ∈ Tr(Σ), and runs A on θ Y (t). Formally, G • r A = ({R i }, {∆ a }, r in ) over (Σ, loc), where R i = Υ i × S i for i ∈ P, r in = (v in , s in ) and, for a ∈ Σ and (υ a , s a ) ∈ R a , ∆ a ((υ a , s a )) = (∇ a (υ a ), δ (a,µ a (υ a )) (s a )). Notice that, in the definition of the transition function of this restricted cascade product, the a-state υ a of G has been abstracted to µ a (υ a ) ∈ Γ. Notice also that languages accepted by G • r A are also accepted by G • A' where A' is the extension of A to the suitable alphabet Σ × S.

Lemma 5.11. Let A be an asynchronous automaton over (Σ × Γ, loc) accepting the language L' ⊆ Tr(Σ × Γ). Then G • r A accepts the language θ Y −1 (L').

Proof. The first component of G • r A runs G on the input trace t ∈ Tr(Σ) and accepts, since all global states of G are accepting. The second component runs A on θ Y (t). Therefore, t is accepted by G • r A if and only if θ Y (t) is accepted by A.

Corollary 5.12. A trace language L ⊆ Tr(Σ) is LocTL[Y i ≤ Y j , S i ]-definable over (Σ, loc) if and only if it is accepted by a θ Y -restricted local cascade product G • r A where A is a local cascade product of asynchronous reset automata of the form U 2 [p].

Proof. Assume first that L is LocTL[Y i ≤ Y j , S i ]-definable over (Σ, loc). Then L = θ Y −1 (L') for some LocTL[S i ]-definable language L' over (Σ × Γ, loc) by Lemma 5.10 (1). By Theorem 5.6, L' is accepted by an asynchronous automaton A which is a local cascade product of asynchronous reset automata of the form U 2 [p]. Since L = θ Y −1 (L'), the left-to-right implication follows from Lemma 5.11. Conversely, assume that L is accepted by G • r A where A is a local cascade product of asynchronous reset automata of the form U 2 [p]. By Lemma 5.11, we have L = θ Y −1 (L') where L' is a trace language over (Σ × Γ, loc) accepted by A. By Theorem 5.6, L' is LocTL[S i ]-definable over (Σ × Γ, loc). Finally, Lemma 5.10 (2) implies that L is LocTL[Y i ≤ Y j , S i ]-definable over (Σ, loc).

Remark 5.13. In Lemma 5.11 and Corollary 5.12, as well as in Theorem 5.14 below, we may replace the gossip automaton G by any asynchronous automaton computing the primary order labelling function θ Y .
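Operationally, one a-transition of G • r A can be sketched as follows (a minimal sketch with hypothetical arguments: the gossip transition ∇ a , the Γ-abstraction µ a and A's transition table δ are passed in explicitly).

```python
def restricted_step(nabla_a, mu_a, delta, upsilon_a, s_a, a):
    """One a-transition of the theta_Y-restricted product G o_r A:
    Delta_a((u, s)) = (nabla_a(u), delta[(a, mu_a(u))](s)).
    A never sees the gossip state itself, only its Gamma-abstraction."""
    gamma = mu_a(upsilon_a)
    return nabla_a(upsilon_a), delta[(a, gamma)](s_a)
```

The point of the abstraction is that A only reads the letter a enriched with µ a (υ a ) ∈ Γ, never the gossip state υ a itself.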
We close this section with the following theorem, summarizing our characterization of first-order definable trace languages using local cascade products.
Theorem 5.14. Let (Σ, loc) be a distributed alphabet and let L ⊆ Tr(Σ) be a trace language. The following are equivalent:
(1) L is recognized by an aperiodic monoid.
(2) L is star-free.
(3) L is definable in first-order logic.
(4) L is LocTL[Y i ≤ Y j , S i ]-definable.
(5) L is accepted by a θ Y -restricted local cascade product G • r A where G is the gossip automaton and A is a local cascade product of copies of asynchronous reset automata of the form U 2 [p].
(6) L is accepted by a θ Y -restricted local cascade product G • r A where G is the gossip automaton and A is an aperiodic asynchronous automaton.
Proof. As mentioned before, Guaiana, Restivo and Salemi [GRS92] established the equivalence of (1) and (2), and Ebinger and Muscholl [EM96] proved that these are equivalent to (3). The equivalence between (3) and (4) is Theorem 5.5, and Corollary 5.12 gives the equivalence with (5). The equivalence with (6) was first proved by Adsul and Sohoni [AS04]. We can obtain it as follows. First, (5) implies (6) since a local cascade product of aperiodic asynchronous automata is again aperiodic. Next, we show that (6) implies (4) as in the proof of Corollary 5.12. Assume that L is accepted by G • r A where A is an aperiodic asynchronous automaton. By Lemma 5.11, we have L = θ Y −1 (L') where L' is a trace language over (Σ × Γ, loc) accepted by A. Since (1) implies (4), the language L' accepted by A is LocTL[Y i ≤ Y j , S i ]-definable over (Σ × Γ, loc). Finally, Lemma 5.10 (2) implies that L is LocTL[Y i ≤ Y j , S i ]-definable over (Σ, loc).

6. Global cascade sequences, the related principle and temporal logics

We have established a direct correspondence between asynchronous wreath products of asynchronous transformation monoids and local cascade products of asynchronous automata (Proposition 3.28 and Theorem 3.29). In particular, over aKR distributed alphabets, any asynchronous automaton is simulated by a local cascade product of our proposed localised two-state reset and localised permutation asynchronous automata. This yields the same benefits as in the theory of word languages (see Section 5, in particular Theorem 5.7, for a concrete example). However, we do not know which distributed alphabets with non-acyclic architecture are aKR. Despite this, we have been able to obtain decomposition results (see Theorem 5.14) for first-order definable trace languages over general architectures.
This has been possible thanks to the expressive completeness of LocTL[Y i ≤ Y j , S i ] (Theorem 5.5) and the primary order labelling function θ Y which allows us to reason about LocTL[Y i ≤ Y j , S i ]-definability over (Σ, loc) in terms of LocTL[S i ]-definability over (Σ × Γ, loc) (see Lemma 5.10).
We have also ruled out, in Lemma 5.8, the possibility of an aperiodic asynchronous automaton computing θ Y . This led us to the restricted version of the local cascade product used in the characterization of Theorem 5.14, which circumvents the non-aperiodic behaviour of the gossip automaton computing θ Y .
In this section, we propose global cascade sequences as a new model for accepting trace languages. This model is built using asynchronous automata and lets us pose automata-theoretic and language-theoretic decomposition questions in the same spirit as with the local cascade product. Its definition and acceptance condition are inspired by the operational point of view of the local cascade product, and are natural from an automata-theoretic viewpoint. Further, it supports a global cascade principle in the same vein as the local cascade principle supported by local cascade products.
Later in the section, we show that global cascade sequences of localised two-state reset asynchronous automata accept exactly the first-order definable trace languages. In fact, we establish that LocTL[Y i , S i ]-definability matches acceptability by a global cascade sequence of copies of U 2 [p] and use the expressive completeness of LocTL[Y i , S i ] from Theorem 5.5. This allows a characterization of first-order logic purely in terms of an asynchronous cascade of copies of U 2 [p], albeit using global cascade sequences. The new characterization remains in the realm of aperiodic asynchronous devices and can be considered intrinsic in the spirit of the 'first-order = aperiodic' slogan. Of course, all this comes at a cost! What we lose, bringing in global cascade sequences instead of local cascade products in decomposition results, is the nice correspondence to an algebraic operation. In Section 7, we show how to construct an asynchronous automaton that realizes a global cascade sequence, making use, again, of the gossip automaton.
Let A = ({S i }, {δ a }, s in ) be an asynchronous automaton over (Σ, loc) and χ A be the asynchronous transducer computed by A. Recall that χ A (t) preserves the underlying poset of events of t and, at each event, records the previous local states of the processes participating in that event. We now introduce a natural variant of χ A called the global-state labelling function, where we record at each event e the best global state that causally precedes e. This is the best global state that the processes participating in the current event are (collectively) aware of.
To be more precise, we first set some additional notation. Recall that the alphabet Σ × S can be equipped with a distributed structure (over P) by letting loc((a, s)) = loc(a), that is, (Σ × S) i = Σ i × S for all i ∈ P. The global-state labelling function ζ A : Tr(Σ) → Tr(Σ × S) maps t = (E, ≤, λ) to ζ A (t) = (E, ≤, (λ, σ)) where σ(e) = A(⇓e) for each event e ∈ E. We refer to (Σ, loc) as the input alphabet of A and (Σ × S, loc) as its output alphabet.
Example 6.4. Figure 7 shows the image by the global-state labelling function ζ, for the same asynchronous automaton A (or asynchronous morphism ϕ) and trace t as in Example 3.18. Note the difference from Figure 4. For example, here the p 3 -event has process p 1 state 2 in its label (which is the best process p 1 state in its causal past) even though process p 1 and process p 3 never interact directly.
Figure 7. Global-state labelling function output on a trace ζ(t).

As ζ A (t) carries more information than χ A (t) (Remark 6.3), one can view ζ A as an information-theoretic generalization of χ A . However, unlike χ A , it is not clear a priori whether it can be computed by an asynchronous automaton. We will return to this important issue in Section 7. At the moment, we simply extend the operational point of view of local cascade products (see Figure 5) using global-state labelling functions instead of asynchronous transducers.
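The definition of ζ A can be prototyped directly from the trace semantics, with no asynchronous implementation: track each process's view as a set of event indices and re-run A on the strict causal past of every event. A brute-force Python sketch (our own encoding of traces as linearisations):

```python
def global_state_labelling(word, loc, delta, init, procs):
    """zeta_A on a linearised trace: decorate each event e with A(causal past of e).
    view[p] holds the indices of the events in process p's current view; the
    strict past of an event is the union of its participants' views, and the
    original word order linearises any such (downward-closed) set."""
    view = {p: set() for p in procs}
    out = []
    for k, a in enumerate(word):
        past = set().union(*(view[p] for p in loc[a]))
        s = dict(init)                      # re-run A on the past, in word order
        for idx in sorted(past):
            s = delta(s, word[idx])
        out.append((a, s))
        for p in loc[a]:
            view[p] = past | {k}
    return out
```

With A simply counting each process's events on the three-process architecture used earlier, the c-event of abc is decorated with the full count vector of its past even though its two processes only meet at c.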
Definition 6.5. A global cascade sequence (in short, gcs) A seq is a sequence (A 1 , A 2 , . . . , A n ) of asynchronous automata such that, for 1 ≤ i < n, the input alphabet of A i+1 is the output alphabet of A i . The input alphabet of A 1 is called the input alphabet of A seq and the output alphabet of A n is called the output alphabet of A seq .
We associate a global-state labelling function ζ Aseq from traces over the input alphabet of A seq to traces over the output alphabet of A seq , namely the composition ζ Aseq = ζ A 1 ζ A 2 · · · ζ An of the global-state labelling functions of the A i . For instance, if A seq = (A 1 , A 2 ) then ζ Aseq (t) = ζ A 2 (ζ A 1 (t)).
It is important to observe that a gcs A seq is not an asynchronous automaton. A gcs is simply a cascade of a sequence of compatible automata which are connected via global-state labelling mechanisms. The following lemma is immediate.

Lemma 6.6. Let A seq and B seq be global cascade sequences such that the input alphabet of B seq is the output alphabet of A seq . Then the concatenation of the two sequences is a global cascade sequence whose global-state labelling function is ζ Aseq ζ Bseq .

The concatenation of global cascade sequences, as in Lemma 6.6, is denoted by C seq = A seq · B seq . This is an associative operation, as verified in the following lemma.
Lemma 6.7. Let A seq , B seq , C seq be global cascade sequences such that the input alphabet of B seq is the output alphabet of A seq , and the input alphabet of C seq is the output alphabet of B seq . Then (A seq · B seq ) · C seq = A seq · (B seq · C seq ) and ζ (Aseq·Bseq) ζ Cseq = ζ Aseq ζ (Bseq·Cseq) .
We now show that one can view a gcs as an acceptor of trace languages in a natural way. We begin with some notation.
For an automaton A, the set of global states of A is denoted by gs(A). Given a global cascade sequence A seq = (A 1 , . . . , A n ), we refer to the set gs(A 1 ) × . . . × gs(A n ) as the global states of A seq and denote it as gs(A seq ). Similarly the set of P -states of A seq is the cartesian product of the sets of P -states of its constituent asynchronous automata. Given a P -state s = (s 1 , . . . , s n ) of A seq , and P ⊆ P , we let s P = ((s 1 ) P , . . . , (s n ) P ) be the natural restriction of s to P .
Given F ⊆ gs(A seq ), we define the language accepted by A seq with accepting set F as L(A seq , F ) = {t ∈ Tr(Σ) | A seq (t) ∈ F }, where A seq (t) = (A 1 (t 0 ), . . . , A n (t n−1 )) with t 0 = t and t k = ζ A k (t k−1 ) for 1 ≤ k < n. A language L ⊆ Tr(Σ) is said to be accepted by A seq if there exists a subset F ⊆ gs(A seq ) such that L = L(A seq , F ). See Figure 8.
The following global cascade sequence principle is an easy consequence of the definitions.

Theorem 6.9. Let A seq and B seq be global cascade sequences, let (Σ, loc) be the input alphabet of A seq , and suppose that the output alphabet of A seq is the input alphabet of B seq , say (Π, loc). Let C seq = A seq · B seq . Then any language L ⊆ Tr(Σ) accepted by C seq is a finite union of languages of the form U ∩ ζ −1 Aseq (V ) where U ⊆ Tr(Σ) is accepted by A seq , and V ⊆ Tr(Π) is accepted by B seq .
Building on the simple observation in Remark 6.3 that global-state labelling functions are information-theoretic generalizations of local asynchronous transducers, we now show that a local cascade product can be realized by an appropriate global cascade sequence.
An asynchronous automaton B whose transitions read letters of the form (a, s a ), as in a local cascade product, naturally gives rise to another asynchronous automaton B̂ (with the same state sets, etc.) operating over (Σ × S, loc): the transition of B̂ on a letter (a, s) ∈ Σ × S is defined to be the transition of B on the letter (a, s a ), where s a is the restriction of the global state s to loc(a). We abuse the notation slightly in the following lemma and denote B̂ also by B.
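The passage from B to B̂ is a simple wrapper, sketched below with assumed signatures (δ B takes the joint state of loc(a) together with the letter).

```python
def hat(delta_B, loc):
    """Lift B's transition on (a, s_a) to a transition of B-hat on (a, s):
    the global state s is first restricted to the processes of a."""
    def delta_hat(q_a, a, s):
        s_a = {p: s[p] for p in loc[a]}   # forget the components outside loc(a)
        return delta_B(q_a, a, s_a)
    return delta_hat
```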
Lemma 6.10. The action of a local cascade product A = A 1 • . . . • A n can be simulated by the gcs A seq = (A 1 , . . . , A n ) in the following sense: for any trace t and any process i, [A(t)] i = [A seq (t)] i .

Proof. To be completely rigorous, the gcs in question is A seq = (A 1 , Â 2 , . . . , Â n ). We prove the lemma by induction on n. If n = 1, the statement is trivially true. For the inductive step, we assume that the lemma holds for A' = A 1 • . . . • A n−1 and A' seq = (A 1 , Â 2 , . . . , Â n−1 ). Fix a trace t. For any event e in the trace and any process i, by induction hypothesis, [A'(⇓e)] i = [A' seq (⇓e)] i . Hence if the label of the event e in ζ A' seq (t) is (a, s) (for some s ∈ gs(A' seq )), then the label of the same event in χ A' (t) corresponds to (a, s a ) (see Remark 6.3). By construction, the transition on (a, s a ) in the local cascade product component A n is the same as that on (a, s) in the corresponding global cascade sequence component Â n . Hence there is a natural correspondence between the run of A n (in A) over χ A' (t) and the run of Â n (in A seq ) over ζ A' seq (t), and we can conclude that [A(t)] i = [A seq (t)] i .

An important language-theoretic consequence of Lemma 6.10 is that every language accepted by A = A 1 • . . . • A n is also accepted by the gcs A seq = (A 1 , . . . , A n ).
We now come to the main result of this section, which relates the logic LocTL[Y i , S i ] with global cascade sequences of localized reset automata U 2 [p]. Recall that Theorem 5.7 gives an exact correspondence between first-order definable trace languages and local cascade products of U 2 [p] asynchronous automata for aKR distributed alphabets. We generalize this language-theoretic decomposition result to any distributed alphabet, using a global cascade sequence of the same distributed resets instead of a local cascade product.
Theorem 6.11. A trace language is defined by a LocTL[Y i , S i ] formula if and only if it is accepted by a global cascade sequence of asynchronous reset automata of the form U 2 [p].
The proof of Theorem 6.11 follows the same structure and re-uses elements of the proof of Theorem 5.6.

Proof. First consider a global cascade sequence of asynchronous reset automata of the form U 2 [p]; we proceed by induction on its length, writing it (by associativity, Lemma 6.7) as the concatenation of U 2 [p] and a shorter gcs B seq . Let ζ : Tr(Σ) → Tr(Σ × S) be the global-state labelling function associated with U 2 [p] and its initial state, say, 1. By the global cascade principle (Theorem 6.9), any language accepted by the sequence is a finite union of languages of the form L 1 ∩ ζ −1 (L 2 ) where L 1 ⊆ Tr(Σ) is accepted by U 2 [p], and L 2 ⊆ Tr(Σ × S) is accepted by B seq . We have seen in the proof of Theorem 5.6 that L 1 is LocTL[S i ]-definable over the alphabet (Σ, loc).

By induction hypothesis, L 2 is LocTL[Y i , S i ]-definable over the alphabet (Σ × S, loc) and we need to prove that ζ −1 (L 2 ) is LocTL[Y i , S i ]-definable over (Σ, loc). This is done by structural induction on the LocTL[Y i , S i ]-formula over (Σ × S, loc) defining L 2 . For a LocTL[Y i , S i ] event formula α over (Σ × S, loc), we construct a LocTL[Y i , S i ] event formula α̂ over (Σ, loc) such that, for any trace t ∈ Tr(Σ) and any event e in t, we have t, e |= α̂ if and only if ζ(t), e |= α. The non-trivial case here is the base case of a letter formula from (Σ × S, loc). The inductive cases are trivial, for instance the translation of Y i α is Y i α̂.
Towards establishing the converse implication, we construct, for any LocTL[Y i , S i ] event formula α, a gcs A α from reset asynchronous automata of the form U 2 [p], which is such that for any trace t, event e of t and process i ∈ loc(e), the local state [A α (↓e)] i determines whether t, e |= α. Again, this is done by structural induction on the LocTL[Y i , S i ] event formula α. The case of LocTL[S i ]-formulas, in view of Lemma 6.10, is already handled in (the proof of) Theorem 5.6 and we only need to deal with the inductive case where α is of the form α = Y j β.
By induction, a gcs of reset automata A β has been constructed, which provides the truth value of β at any event. We construct an asynchronous automaton B = ({Q i }, {δ (a,s) }, q in ) over (Σ × S, loc) such that A α = A β · U 2 [j] · B, as follows. The middle factor U 2 [j] remembers whether some j-event has already occurred. Concerning B, we let Q i = {⊤, ⊥} for all i ∈ P and, again, we denote a P-state q by ⊥ (resp. ⊤) if q i = ⊥ (resp. q i = ⊤) for all i ∈ P. We let the initial state be q in = ⊥. Let ζ be the global-state labelling function associated with A β · U 2 [j]. Let t ∈ Tr(Σ) be a trace, e be an event in t, and (a, s) be the label of e in ζ(t). Write e j for the last j-event in ⇓e if it exists, i.e., if ⇓e ∩ E j ≠ ∅. The local state s j determines whether e j exists, and in this case whether it satisfies β, written s j ⊨ Y j β. The transition functions of B are: δ (a,s) = reset to ⊤ if s j ⊨ Y j β, and δ (a,s) = reset to ⊥ otherwise.
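The behaviour of B is pure resetting, so it can be sketched in a few lines (here sat is an assumed oracle reading 's j ⊨ Y j β' off the global-state component of the label; the encoding is ours).

```python
def run_B(events, loc, procs, sat):
    """Run the reset automaton B over zeta-labelled events (a, s): every
    process of loc(a) is reset to True iff the last j-event recorded in s
    satisfied beta, as reported by the assumed oracle sat(s)."""
    q = {p: False for p in procs}      # initial state q_in = bottom everywhere
    states = []
    for a, s in events:
        val = sat(s)
        for p in loc[a]:
            q[p] = val                 # unconditional reset of loc(a)
        states.append(dict(q))
    return states
```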
It is easy to see that B is a gcs of copies of U 2 [p], one for each process p ∈ P. These reset automata work independently of each other: each U 2 [p] depends only on the global-state information from A β · U 2 [j] provided by ζ. Hence, B is also a local cascade product of these U 2 [p]. This completes the proof.
Our subsequent result crucially uses the expressive completeness of LocTL[Y i , S i ] and its proof is immediate from Theorems 5.5 and 6.11. It is best seen as an addition to the several characterizations of first-order definable trace languages presented in Theorem 5.14.
Theorem 6.12. Let (Σ, loc) be a distributed alphabet and let L ⊆ Tr(Σ) be a trace language. Then L is definable in first-order logic if and only if L is accepted by a global cascade sequence of asynchronous reset automata of the form U 2 [p].

Asynchronous implementation of a global cascade sequence
We have already noted in Section 6 that the global-state labelling function ζ A associated with an asynchronous automaton A is not an abstraction of the asynchronous transducer of A, that is, it cannot be directly computed by A. However, we will show how to construct an asynchronous automaton A G which computes ζ A using the gossip automaton.
Recall that the gossip automaton G = ({Υ i }, {∇ a }, v in ) keeps track of the truth values of all the constants of LocTL[Y i ≤ Y j , S i ]: it computes the primary order labelling function θ Y : Tr(Σ) → Tr(Σ × Γ) which decorates each event e of a trace t with the truth values γ ∈ Γ = {0, 1} P×P of all constants Y i ≤ Y j , i.e., for all i, j ∈ P, γ i,j = 1 if t, e |= Y i ≤ Y j and γ i,j = 0 otherwise.
Construction of A G . Roughly speaking, in the automaton A G , each process keeps track of its local gossip state in the gossip automaton and the best global state of A that it is aware of. In fact, A G is realised as a θ Y -restricted local cascade product of the gossip automaton and an asynchronous automaton A g (the global-state detector) derived from A, where each process in A g keeps track of the best global state of A that it is aware of. When processes synchronize in A g , they use the Γ-labelling information to correctly update the best global state that they are aware of at the synchronizing event.
So A G = G • r A g where A g is an asynchronous automaton over (Σ × Γ, loc) derived from A. Therefore, for the construction, we only need to describe A g . Recall that A = ({S i }, {δ a }, s in ). Then A g = ({Q i }, {δ (a,γ) }, q in ) where Q i = S for all i ∈ P, and q in = (s in , . . . , s in ). Before defining the transitions, we define for (a, γ) ∈ Σ × Γ the function globalstate (a,γ) : Q a → S as follows: globalstate (a,γ) (q a ) = s ∈ S where, for each i ∈ P,
s(i) = q a (j)(i) if there exists j ∈ loc(a) such that γ i,j = 1, and
s(i) = s in (i) otherwise.
Note that, when γ i,j = 1 at some event e of a trace θ Y (t), we have t, e |= Y i ≤ Y j and process j has the latest information about process i. Hence, the function globalstate (a,γ) determines the best global state that the processes in loc(a) are collectively aware of. We define the local transition functions of A g by δ (a,γ) (q a ) = q' a where, for all i ∈ loc(a), we set q' a (i) = ∆ a (globalstate (a,γ) (q a )).
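The two definitions above can be transcribed directly; in this sketch γ is a dict over process pairs, q a maps each process of loc(a) to its A g -state (itself a global state of A), and ∆ is A's extended transition table (all encodings are ours).

```python
def globalstate(a, gamma, q_a, loc, procs, s_in):
    """Best global state of A that the processes of a collectively know:
    any j in loc(a) with gamma[(i, j)] = 1 holds the latest i-information;
    if no such j exists, process i has not moved from its initial state."""
    s = {}
    for i in procs:
        holders = [j for j in loc[a] if gamma[(i, j)] == 1]
        s[i] = q_a[holders[0]][i] if holders else s_in[i]
    return s

def delta_Ag(a, gamma, q_a, loc, procs, s_in, Delta):
    """(a, gamma)-transition of the global-state detector A_g: all processes
    of loc(a) jointly move to the a-successor of the reconstructed state."""
    s2 = Delta[a](globalstate(a, gamma, q_a, loc, procs, s_in))
    return {i: s2 for i in loc[a]}
```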
Recall that ∆ a is the extension of the local transition function δ a of A to global states in S. In order to prove the correctness of the construction, we first introduce some notation. Let t = (E, ≤, λ) ∈ Tr(Σ) and i ∈ P. Then ↓ i (t), the i-view of t, is defined by ↓ i (t) = ↓E i . It is easy to see that if ↓ i (t) ≠ ∅, then there exists e ∈ E i such that ↓ i (t) = ↓e. We note that ↓ i (t) is a trace prefix of t and it represents the knowledge of agent i about t.
The next lemma shows that in A G , each process keeps track of the best global state of A that it is aware of.

Lemma 7.1. Let t ∈ Tr(Σ) and let A G (t) = (υ, q). Then, for every i ∈ P, q i = A(↓ i (t)).
Proof. Note that, as A G = G • r A g , A G (t) = (υ, q) implies that G(t) = υ and A g (θ Y (t)) = q. We prove the lemma by induction on the size of t = (E, ≤, λ), that is, on |E|. The base case of the empty trace is easy and skipped.
Consider t' = ta, so that θ Y (t') = θ Y (t)(a, γ), and let (υ, q) = A G (t) and (υ', q') = A G (t'). Clearly A g (θ Y (t)) = q, A g (θ Y (t')) = q' and δ (a,γ) (q a ) = q' a . By definition of the (a, γ)-transition function of A g , we have q' i = ∆ a (globalstate (a,γ) (q a )) for each i ∈ loc(a). Observe that, by the local nature of a-transition functions of asynchronous automata, we have q' i = q i for i ∉ loc(a).
By induction, for every i ∈ P, q i = A(↓ i (t)). If i ∉ loc(a), then ↓ i (t') = ↓ i (t). As, in this case, we also have q' i = q i , we are done by the induction hypothesis. Now we let i ∈ loc(a). Let e correspond to the last occurrence of a in t'. Then process i participates in e and, as a result, ↓ i (t') = ↓e. We first study the global state s = A(⇓e) ∈ S. For i ∈ P, let e i be the maximal i-event in ⇓e if it exists, i.e., if ⇓e ∩ E i ≠ ∅. Fix a process i ∈ P. If e i does not exist, then s(i) = s in (i). Otherwise there exists j ∈ loc(a) such that e i ≤ e j . Therefore s(i) = [A(⇓e)] i = [A(↓ j (t))] i = q j (i), where the last equality is obtained using the induction hypothesis q j = A(↓ j (t)). From Theorem 5.9, we have γ i,j = 1 (as θ Y (t') = θ Y (t)(a, γ)) and, using the definition of the function globalstate (a,γ) , it follows that s = globalstate (a,γ) (q a ). Now, with s' = A(↓e), we see that s' = ∆ a (s) = ∆ a (globalstate (a,γ) (q a )). As already observed, q' i = ∆ a (globalstate (a,γ) (q a )) for i ∈ loc(a). The proof of the inductive step is now complete, as for i ∈ loc(a), q' i = s' = A(↓e) = A(↓ i (t')).
We now show that the asynchronous automaton A G simulates A. More precisely, we define a simulation map f : Υ × Q → S by f (υ, q) = s where, for i ∈ P, s(i) = q i (i).
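The simulation map discards the gossip component and reads each process's own entry of its best-known global state; a one-line sketch (names are ours):

```python
def simulation_map(q, procs):
    """f(upsilon, q) = s with s(i) = q_i(i): process i's entry about itself
    is always up to date, since i participates in every i-event."""
    return {i: q[i][i] for i in procs}
```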
Lemma 7.2. The action of A can be simulated by A G as, for t ∈ Tr(Σ), A(t) = f (A G (t)).

Proof. Write A G (t) = (υ, q). By Lemma 7.1, q i = A(↓ i (t)) for each i ∈ P, and [A(t)] i = [A(↓ i (t))] i since the local state of process i can change only at i-events. Hence f (A G (t))(i) = q i (i) = [A(t)] i for all i ∈ P.
An important consequence of Lemma 7.2 is that any language accepted by A is also accepted by A G : if L is accepted by A via F ⊆ S, then it is accepted by A G via f −1 (F ). Note that, as f −1 (F ) = Υ × F' where F' ⊆ gs(A g ), the accepting set f −1 (F ) is solely determined by F and does not depend on the state reached by the gossip automaton.
Finally, we establish that A G computes the global-state labelling function ζ A . For this, we fix the Γ-transducer Ĝ = ({Υ i }, {∇ a }, v in , {ν Y a : Υ a → Γ}) which computes θ Y . Now we are ready to extend A G to an S-transducer Â G = (A G , {ξ a }). We define ξ a : Υ a × Q a → S as follows: ξ a (υ a , q a ) = globalstate (a,ν Y a (υ a )) (q a ).

Theorem 7.3. The transducer Â G computes the global-state labelling function ζ A .
Proof. Let t = (E, ≤, λ) ∈ Tr(Σ) and t' = θ Y (t) = (E, ≤, (λ, µ)) ∈ Tr(Σ × Γ). Fix an event e ∈ E with λ(e) = a and µ(e) = γ. Note that ζ A decorates the event e with the additional information s = A(⇓e). On the other hand, with A G (t) = (υ, q), the transducer Â G decorates the event e with the additional information ξ a (υ a , q a ) = globalstate (a,ν Y a (υ a )) (q a ). By Theorem 5.9, Ĝ computes θ Y and we have ν Y a (υ a ) = γ. Therefore it remains to show that s = globalstate (a,γ) (q a ) to complete the proof. But we have already seen in the proof of Lemma 7.1 (inductive step applied with t = ⇓e and t' = ↓e) that s = globalstate (a,γ) (q a ).

Now we are ready to realize a global cascade sequence as an asynchronous automaton. We associate with a gcs A seq = (A 1 , . . . , A n ) an asynchronous automaton A G seq and an asynchronous transducer Â G seq , whose constructions extend the constructions of A G and Â G from A in a natural fashion. In particular, A G seq is a local cascade product G • r (A g 1 • . . . • A g n ). The first component in this product is the gossip automaton which computes θ Y and the first cascade product is θ Y -restricted. In the subsequent components, each process keeps track of the best global state it is aware of for the corresponding automaton in A seq . In other words, we use the earlier construction for each component automaton but keep a single copy of the gossip automaton.
Let (Σ, loc) be the input (distributed) alphabet of A 1 (or that of A seq ) with total alphabet Σ. Note that the input alphabet for A j is the output alphabet of A j−1 . We abuse the notation and use the total alphabet instead of the distributed alphabet. The