A Theory of Formal Choreographic Languages

We introduce a meta-model based on formal languages, dubbed formal choreographic languages, to study message-passing systems. Our framework allows us to generalise standard constructions from the literature and to compare them. In particular, we consider notions such as global view, local view, and projections from the former to the latter. The correctness of local views projected from global views is characterised in terms of a closure property. We consider a number of communication properties -- such as (dead)lock-freedom -- and give conditions on formal choreographic languages to guarantee them. Finally, we show how formal choreographic languages can capture existing formalisms; specifically we consider communicating finite-state machines, choreography automata, and multiparty session types. Notably, formal choreographic languages, differently from most approaches in the literature, can naturally model systems exhibiting non-regular behaviour.


Introduction
Choreographic models of message-passing systems are becoming increasingly popular both in academia [BB11, BZ07, CHY12] and industry [KBR+05, OMG11, Bon18]. These models advocate two complementary views of communicating systems. The global view can be thought of as a holistic description of the interactions carried out by a number of participants. Dually, the local views describe the expected communication behaviour of each participant in isolation.
As discussed in Section 9, the literature offers various choreographic models. Here, we introduce formal choreographic languages (FCL) as a meta-model to formalise message-passing systems; existing choreographic models can be conceived as specifications of FCLs. Essentially, this results in the definition of global and local languages. The words of a global language (g-language for short) consist of (possibly infinite) sequences of interactions; an interaction has the form A→B:m and represents the fact that participant A sends message m to participant B, and participant B receives it. The words of a local language (l-language for short) consist of (possibly infinite) sequences of actions; actions can take two forms, A B?m and A B!m, respectively representing that participant B receives message m from A and that participant A sends message m to B.
Such languages provide an abstract description of the possible runs of a system in terms of sequences of interactions at the global level, executed through synchronous message-passing at the local level. A word w in a global language then represents a possible run expected of a communicating system. For instance, we can model a continuation of w as another word w′ such that w·w′ is also in the language. Also, w induces an expected "local" behaviour on each participant A; in fact, the behaviour of A is obtained by projecting w, that is by ignoring the interactions in w not involving A while retaining only the output and input actions performed by A.
Our language-theoretic treatment is motivated mainly by the need for a general setting immune to syntactic restrictions. This naturally leads us to consider e.g., choreographies represented by context-free languages (cf. Example 3.12). In fact, we strive for generality; basically prefix-closure is the only requirement we impose on FCLs. (We discuss some implications of relaxing prefix-closure in Section 10.) The gist is that, if a sequence of interactions or of communications is an observable behaviour of a system, any prefix of the sequence should be observable as well. This allows us to consider partial executions as well as "complete" ones, which can be formalised in terms of maximal words (namely words without continuations). We admit infinite words to account for diverging computations, ubiquitous in communication protocols.
Some g-languages cannot be faithfully executed by distributed components. This can be illustrated with a simple example; consider a g-language containing only the word A→B:m·C→D:n and its prefixes. Such a language, in our interpretation of word concatenation as a "strict sequencing" operator (see e.g. [CM13, SD19] for non-strict interpretations of sequencing), does specify that the interactions A→B:m and C→D:n must occur only in the given order. Clearly, this is not possible if the participants are distributed or act concurrently because C and D cannot be aware of when the interaction between A and B takes place.
Contributions & structure. We summarise below our main contributions.
Section 2 introduces FCL (g-languages in Definition 2.2, l-languages in Definition 2.4) and adapts standard constructions from the literature. In particular, we render communicating systems and choreographies as, respectively, sets of l-languages (Definition 2.6) and g-languages, while we take inspiration for projections from choreographies and multiparty session types. We consider synchronous interactions; as discussed in Section 10, the asynchronous case is left for future work.
Vol. 19:3 A THEORY OF FORMAL CHOREOGRAPHIC LANGUAGES 9:3
Section 3 considers correctness of communicating systems with respect to choreographies (the communicating system "executes" an interaction only if it is specified) and completeness (the communicating system "executes" at least the specified interactions). An immediate consequence of our constructions is the completeness of systems projected from g-languages (Corollary 3.2). Correctness is trickier and requires introducing closure under unknown information (CUI, cf. Definition 3.3). Intuitively, a g-language is CUI if it contains extensions of words with a single interaction whose participants cannot distinguish the extended word from other words of the language. Theorem 3.7 characterises correctness of projected systems in terms of CUI.
Section 4 shows how FCLs allow us to capture many relevant communication properties in a fairly uniform way.
Section 5 proposes branch-awareness (Definition 5.3) to ensure the communication properties defined in Section 4 (Theorem 5.7). Intuitively, branch-awareness requires each participant to "distinguish" words where its behaviour differs. Notably, we separate the condition for correctness (CUI) from the one ensuring the communication properties (branch-awareness). Most approaches in the literature instead combine them into a single condition, which takes names such as well-branchedness or projectability [HLV+16]. Thus, these single conditions are stronger than each of CUI and branch-awareness.
We illustrate the generality of FCLs on three case studies (cf. Sections 6 to 8), respectively taken from communicating finite-state machines [BZ83], choreography automata [BLT20] and multiparty session types [SD19]. We remark that FCLs can capture protocols that cannot be represented by regular g-languages, such as the "task dispatching" protocol in Example 3.12. To the best of our knowledge, this kind of protocol cannot be formalised in other approaches.
Section 9 compares with related work while Section 10 draws some conclusions and discusses future work. This paper is a revised and extended version of [BLT22]; the main differences are: (i) the results on decidability of CUI and branch-awareness in Section 7.2 are new; (ii) Section 6 yields a new case study; (iii) Section 7 has been significantly extended; (iv) full proofs have been added; (v) the text has been largely revised in order to improve readability.

Formal Choreographic Languages
We briefly recall a few basic notions used throughout the paper and fix some notation. Let Σ be an alphabet (i.e., a set of symbols).
We define the set of finite and infinite words on Σ as
  Σ^∞ = Σ^⋆ ∪ Σ^ω
where Σ^⋆ is the usual Kleene-star closure on Σ and Σ^ω is the set of infinite words on Σ, that is maps from natural numbers to Σ (aka ω-words [Sta97]). The concatenation operator is denoted as · and ε is its neutral element. If w is infinite then w·w′ = w. We write a_0·a_1·a_2·... for the word mapping i to a_i ∈ Σ for all natural numbers i.
We shall deal with languages on particular alphabets (footnote 1), namely the alphabets of interactions and actions whose definition (see below) we borrowed from [BLT20].
Definition 2.1 (Interactions and actions alphabets). Let P be a set of participants (ranged over by A, B, X, ...) and M (ranged over by m, x, ...) a set of messages, such that P and M are disjoint. We define
  Σ_int = { A→B:m | A ≠ B ∈ P, m ∈ M }, ranged over by α, β, ...
  Σ_act = { A B!m, A B?m | A ≠ B ∈ P, m ∈ M }
We call Σ_int and Σ_act, respectively, alphabet of interactions and alphabet of actions (over P and M).
Our results are independent of the sets P and M we consider and hence we do not further specify them. Words on Σ^∞_int (ranged over by w, w′, ...) are called interaction words while those on Σ^∞_act (ranged over by v, v′, ...) are called words of actions. Hereafter z, z′, ... range over Σ^∞_int ∪ Σ^∞_act, and we use L to range over subsets of both Σ^∞_int and Σ^∞_act, the intended alphabet being clear from the context. Function ptp(·) yields the set of participants involved in an interaction or in an action and it is defined on Σ_int ∪ Σ_act as follows:
  ptp(A→B:m) = ptp(A B!m) = ptp(A B?m) = { A, B }
This function extends naturally to (sets of) words. The subject of A B!m is the sender A and the subject of A B?m is the receiver B.
We summarise in Table 1 all the name conventions for variables ranging over the sets defined above.
Footnote 1. These alphabets may be infinite; formal languages over infinite alphabets have been studied, e.g., in [ABB80].
A global language specifies the expected interactions of a system while a local language specifies the communication behaviour of a participant.
Definition 2.2 (Global language). A global language (g-language for short) is a prefix-closed language L on Σ int such that ptp(L) is finite.
Example 2.3. For any finite subset Σ of Σ_int, the set Σ^⋆ is a g-language; notice that, if P is infinite, then Σ^⋆_int is not a g-language since it encompasses infinitely many participants. The set
  L = pref({ C→A:w·A→B:g, C→B:w·A→B:g, C→A:w·C→B:w })
is a g-language (which will be used later on in other examples). It formally describes the following interaction protocol involving Alice, Bob and Carol: Carol can decide to ask either Bob or Alice to work and after that Alice gossips with Bob, unless also Bob is asked to work after Alice is. ⋄
Definition 2.4 (Local language). A local language (l-language for short) is a prefix-closed language L on Σ_act such that ptp(L) is finite. An l-language is A-local if its words have all actions with subject A.
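The prefix-closure requirement of Definitions 2.2 and 2.4 can be illustrated on finite languages. The following is a minimal sketch, assuming (as an encoding choice not made in the paper) that an interaction A→B:m is a triple (sender, receiver, message) and that a finite word is a tuple of such triples; the names pref, is_prefix_closed and ptp mirror the notation of the text, but their implementations are illustrative only.

```python
from typing import Iterable, Set, Tuple

Interaction = Tuple[str, str, str]   # assumed encoding of A→B:m as (A, B, m)
Word = Tuple[Interaction, ...]       # finite words only

def pref(words: Iterable[Word]) -> Set[Word]:
    """Prefix closure of a set of finite words (the empty tuple plays the role of ε)."""
    return {w[:i] for w in words for i in range(len(w) + 1)}

def is_prefix_closed(lang: Set[Word]) -> bool:
    """A language is prefix-closed when it coincides with its prefix closure."""
    return pref(lang) == lang

def ptp(lang: Iterable[Word]) -> Set[str]:
    """Participants occurring in the words of a language."""
    return {p for w in lang for (a, b, _m) in w for p in (a, b)}

# The g-language of Example 2.3.
CA, CB, AB = ("C", "A", "w"), ("C", "B", "w"), ("A", "B", "g")
L = pref({(CA, AB), (CB, AB), (CA, CB)})
```

Under this encoding, L contains six words (including ε) and involves the three participants A, B and C.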
Example 2.5. The set L = { ε, C A!w, C B!w, C A!w·C B!w } is a C-local language; in particular, it specifies the local behaviour of Carol with respect to the g-language L of Example 2.3. ⋄
As discussed in the Introduction, l-languages give rise to communicating systems.
Definition 2.6 (Communicating system). Let P ⊆ P be a finite set of participants. A (communicating) system over P is a map S = (L_A)_{A∈P} assigning, to each participant A ∈ P, an A-local language L_A ≠ { ε } such that ptp(L_A) ⊆ P.
By projecting a g-language L on a participant A we obtain the A-local language describing the sequence of actions performed by A in the interactions involving A in the words of L. Definition 2.7 below recasts in our setting the notion of projection used in several choreographic formalisms, e.g., in [CHY12,HYC16].
Definition 2.7 (Projection). The projection on A of an interaction B→C:m is computed by the function ↓ : Σ_int × P → Σ_act ∪ { ε } defined by
  (B→C:m)↓A = B C!m if A = B,
  (B→C:m)↓A = B C?m if A = C,
  (B→C:m)↓A = ε otherwise,
and extended homomorphically to interaction words and g-languages. The projection of a g-language L, written L↓, is the communicating system (L↓A)_{A∈ptp(L)}.
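Projection lends itself to a direct executable reading on finite words. The sketch below is only an illustration under assumed encodings: an interaction A→B:m is the triple (A, B, m), the actions A B!m and A B?m are the 4-tuples (A, B, "!", m) and (A, B, "?", m), and project_language is a hypothetical helper name for the map L ↦ L↓.

```python
from typing import Dict, Iterable, Set, Tuple

Interaction = Tuple[str, str, str]    # assumed encoding of A→B:m
Action = Tuple[str, str, str, str]    # assumed encoding of A B!m / A B?m
GWord = Tuple[Interaction, ...]
LWord = Tuple[Action, ...]

def pref(words: Iterable[GWord]) -> Set[GWord]:
    """Prefix closure of a set of finite words."""
    return {w[:i] for w in words for i in range(len(w) + 1)}

def project(word: GWord, p: str) -> LWord:
    """Projection of a finite interaction word on participant p."""
    out = []
    for (a, b, m) in word:
        if p == a:
            out.append((a, b, "!", m))   # p is the sender
        elif p == b:
            out.append((a, b, "?", m))   # p is the receiver
        # interactions not involving p are erased (they project to ε)
    return tuple(out)

def project_language(lang: Set[GWord]) -> Dict[str, Set[LWord]]:
    """The system L↓: one local language per participant of lang."""
    participants = {p for w in lang for (a, b, _m) in w for p in (a, b)}
    return {p: {project(w, p) for w in lang} for p in participants}

# The g-language of Example 2.3 and its projected system.
CA, CB, AB = ("C", "A", "w"), ("C", "B", "w"), ("A", "B", "g")
L = pref({(CA, AB), (CB, AB), (CA, CB)})
system = project_language(L)
```

With these encodings, system["C"] is the C-local language of Example 2.5.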
Example 2.8. Let L be the g-language in Example 2.3. By Definition 2.7, we have
  L↓A = { ε, C A?w, C A?w·A B!g, A B!g }
  L↓B = { ε, A B?g, C B?w, C B?w·A B?g }
  L↓C = { ε, C A!w, C B!w, C A!w·C B!w }
The latter is the l-language in Example 2.5. ⋄
We consider a synchronous semantics of communicating systems, similarly to other choreographic approaches such as [BZ07, CHY12, DGJ+15, SD19]. Intuitively, a choreographic word is in the semantics of a system S iff its projection on each participant A yields a word in the local language assigned by S to A.
Definition 2.9 (Semantics). Given a system S over P, the set
  ⟦S⟧ = { w ∈ Σ^∞_int | ptp(w) ⊆ P and for all A ∈ P, w↓A ∈ S(A) }
is the (synchronous) semantics of S.
Notice that the above definition coincides with the join operation in [FBS04], used in realisability conditions for an asynchronous setting.
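For systems whose local languages are finite, the synchronous semantics can be enumerated explicitly. The sketch below is an illustration under assumptions, not the paper's construction: it uses the tuple encodings of interactions and actions introduced for illustration, it relies on local languages being prefix-closed (so that the semantics is prefix-closed and can be explored by extending words one interaction at a time), and it terminates only when every local language is finite.

```python
from typing import Dict, Iterable, Set, Tuple

Interaction = Tuple[str, str, str]    # assumed encoding of A→B:m
Action = Tuple[str, str, str, str]    # assumed encoding of A B!m / A B?m
GWord = Tuple[Interaction, ...]
LWord = Tuple[Action, ...]
System = Dict[str, Set[LWord]]

def pref(words: Iterable[tuple]) -> Set[tuple]:
    return {w[:i] for w in words for i in range(len(w) + 1)}

def project(word: GWord, p: str) -> LWord:
    out = []
    for (a, b, m) in word:
        if p == a:
            out.append((a, b, "!", m))
        elif p == b:
            out.append((a, b, "?", m))
    return tuple(out)

def project_language(lang: Set[GWord]) -> System:
    participants = {p for w in lang for (a, b, _m) in w for p in (a, b)}
    return {p: {project(w, p) for w in lang} for p in participants}

def semantics(system: System) -> Set[GWord]:
    """Finite words whose projection on every participant is in its local language."""
    # Candidate interactions: one per output action found in some local language.
    alphabet = {(a, b, m) for lang in system.values() for v in lang
                for (a, b, kind, m) in v
                if kind == "!" and a in system and b in system}
    sem: Set[GWord] = set()
    todo = [()]                       # ε belongs to every prefix-closed local language
    while todo:
        w = todo.pop()
        if w in sem:
            continue
        sem.add(w)
        for alpha in alphabet:
            ext = w + (alpha,)
            if all(project(ext, p) in system[p] for p in system):
                todo.append(ext)
    return sem

CA, CB, AB = ("C", "A", "w"), ("C", "B", "w"), ("A", "B", "g")
L = pref({(CA, AB), (CB, AB), (CA, CB)})   # Example 2.3
sem = semantics(project_language(L))
```

On the g-language of Example 2.3, the enumeration yields eight words, two of which (A→B:g and C→A:w·C→B:w·A→B:g) are not in L.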
By the finiteness condition on the number of participants in a g-language (Definition 2.2) we immediately get the following.
Fact 2.10. Let L be a g-language. Then L ⊆ Σ^⋆_int implies ⟦L↓⟧ ⊆ Σ^⋆_int.
Example 2.11. For the system L↓ in Example 2.8, we have
  ⟦L↓⟧ = pref({ C→A:w·A→B:g, C→B:w·A→B:g, A→B:g, C→A:w·C→B:w·A→B:g }) ⋄
Two interactions α and β are independent (in symbols α ∥ β) when ptp(α) ∩ ptp(β) = ∅. Informally, a language L describing the behaviour of a system, and containing a word w, also contains all the words obtained from w by swapping independent interactions; we say that L is concurrency closed. The notion of concurrency closure in our setting is a delicate one because of the possible presence of infinite words. One in fact has to allow infinitely many swaps of independent interactions while preventing interactions from disappearing by being pushed infinitely far away. Technically, we consider Mazurkiewicz's traces [Maz86] on Σ_int with independence relation α ∥ β:
Definition 2.12 (Concurrency closure). Let ∼ be the reflexive and transitive closure of the relation ≡ on finite interaction words defined by w·α·β·w′ ≡ w·β·α·w′ where α ∥ β. The relation ∼ extends to infinite interaction words by letting w ∼ w′ iff w ≪ w′ and w′ ≪ w, where w ≪ w′ iff for each finite prefix w_1 of w there are a finite prefix w′_1 of w′ and a g-word ŵ ∈ Σ^⋆_int such that w_1·ŵ ∼ w′_1. A g-language L is concurrency closed if it coincides with its concurrency closure, namely L = { w ∈ Σ^∞_int | there is w′ ∈ L such that w ∼ w′ }.
As expected from the discussion above, the semantics of systems is naturally concurrency closed since in a distributed setting independent events can occur in any order. Indeed, relation ∼ can be characterised as follows.
Lemma 2.13. Given a g-language L and two words w 1 , w 2 ∈ L, w 1 ∼ w 2 iff w 1 ↓ A = w 2 ↓ A for each A ∈ ptp(L).

Therefore we have
Proposition 2.14. Let S be a system. Then ⟦S⟧ is concurrency closed.
Proof. Trivial, since closure under swap does not change the projection by Lemma 2.13.
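The characterisation of ∼ in Lemma 2.13 can be explored on finite words. The following sketch, under the assumed tuple encodings of interactions and actions, computes the class of a finite word under swaps of adjacent independent interactions and compares membership in that class with equality of all projections; swap_class, same_projections and agrees_on_permutations are hypothetical helper names, and infinite words are out of scope.

```python
from collections import deque
from itertools import permutations

def project(word, p):
    """Projection of a finite interaction word (triples) on participant p."""
    out = []
    for (a, b, m) in word:
        if p == a:
            out.append((a, b, "!", m))
        elif p == b:
            out.append((a, b, "?", m))
    return tuple(out)

def independent(x, y):
    """Two interactions are independent when they share no participant."""
    return not ({x[0], x[1]} & {y[0], y[1]})

def swap_class(word):
    """All finite words reachable by swapping adjacent independent interactions."""
    seen, queue = {word}, deque([word])
    while queue:
        w = queue.popleft()
        for i in range(len(w) - 1):
            if independent(w[i], w[i + 1]):
                v = w[:i] + (w[i + 1], w[i]) + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return seen

def same_projections(w1, w2):
    """Equality of the projections of two words on every participant they mention."""
    parts = {p for w in (w1, w2) for (a, b, _m) in w for p in (a, b)}
    return all(project(w1, p) == project(w2, p) for p in parts)

def agrees_on_permutations(word):
    """Compare swap-equivalence with projection equality on all reorderings of word."""
    cls = swap_class(word)
    return all((v in cls) == same_projections(word, v) for v in permutations(word))

AB, CD, BC = ("A", "B", "m"), ("C", "D", "n"), ("B", "C", "k")
```

For instance, A→B:m·C→D:n and C→D:n·A→B:m are related by one swap and have the same projections, whereas A→B:m·B→C:k and B→C:k·A→B:m differ on the projection on B.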
The intuition that g-languages, equipped with the projection and semantic functions of Definition 2.7 and Definition 2.9, do correspond to a natural syntax and semantics for the abstract notion of choreography, can be strengthened by showing that these functions form a Galois connection.
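Let us define G = { L | L is a g-language } and 𝕊 = { S | S is a system }. Moreover, given S, S′ ∈ 𝕊, we define S ⊆ S′ if S(A) ⊆ S′(A) for each A.
Proposition 2.15. The functions (·)↓ and ⟦·⟧ form a Galois connection between (G, ⊆) and (𝕊, ⊆); namely, they are monotone and, for all L ∈ G and S ∈ 𝕊, L↓ ⊆ S if and only if L ⊆ ⟦S⟧.
Proof. Note that (·)↓ and ⟦·⟧ are trivially monotone by their definitions. (⟹) We first observe that
  ⟦L↓⟧ = { w | w↓A ∈ L↓A for all A } ⊆ { w | w↓A ∈ S(A) for all A } = ⟦S⟧
where the equalities above hold by Definition 2.9 and the inclusion holds by hypothesis (L↓ ⊆ S). Hence, given a word w ∈ L, we get w ∈ ⟦L↓⟧ by construction and w ∈ ⟦S⟧ by the above.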
(⟸) We have to show that L↓A ⊆ S(A) for each A. Let hence w_A ∈ L↓A. By definition there is w ∈ L with w↓A = w_A. Then w ∈ ⟦S⟧ by hypothesis and hence, by Definition 2.9, w_A = w↓A ∈ S(A).
Notice that, by Proposition 2.15, L↓ ⊆ S can be understood as "L can be realised by S" according to the notion of realisability frequently used in the literature, namely that all behaviours of the choreography are possible for the system.
A closure operator is a function cl whose domain and codomain are ordered sets and such that cl is monotone (x ≤ y ⟹ cl(x) ≤ cl(y)), extensive (x ≤ cl(x)), and idempotent (cl(x) = cl(cl(x))). It is well-known that, given a Galois connection (f^⋆, f_⋆), cl = f_⋆ ∘ f^⋆ is a closure operator. In our setting ⟦(·)↓⟧ is a closure operator, hence the above boils down to the following corollary:
Corollary 2.16. For all g-languages L, L′ ∈ G,
  monotonicity: L ⊆ L′ ⟹ ⟦L↓⟧ ⊆ ⟦L′↓⟧,
  extensiveness: L ⊆ ⟦L↓⟧,
  idempotency: ⟦L↓⟧ = ⟦⟦L↓⟧↓⟧.

Correctness and Completeness
A g-language specifies the expected communication behaviour of a system made of several components. Definition 2.6 formalises such systems in terms of l-languages. In the present section we deal with properties relating a communicating system with a specification (i.e., a g-language). In particular, we first introduce correctness and completeness of a communicating system with respect to a g-language. The latter property follows from the Galois connection discussed in the previous section. Correctness, instead, is characterised in terms of a closure property, CUI; this requires handling continuity. We highlight the expressiveness of CUI g-languages by showing that there exist non-regular CUI g-languages.
Definition 3.1 (Correctness and completeness). Let L be a g-language. A system S is correct with respect to L if ⟦S⟧ ⊆ L and it is complete with respect to L if ⟦S⟧ ⊇ L.
Correctness and completeness are related to existing notions. For instance, in the literature on multiparty session types (see, e.g., the survey [HLV + 16]) correctness is analogous to subject reduction and completeness to session fidelity.
Notice that, by Proposition 2.15, we can interpret L↓ ⊆ S as a characterisation for completeness of S with respect to L. Hence, an immediate result of the Galois connection defined in Section 2 is that any system projected from a g-language is complete. In fact, completeness coincides with the extensiveness property of the closure operator associated to our Galois connection.
9:8 F. Barbanera
Corollary 3.2. For any g-language L, L↓ is complete with respect to L.
It is easy to check that a similar result does not hold for correctness. If we consider the g-language L of Example 2.3, we have that, as shown in Example 2.11, C→A:w·C→B:w·A→B:g ∈ ⟦L↓⟧ but C→A:w·C→B:w·A→B:g ∉ L. That is, ⟦L↓⟧ ⊈ L.
Characterising correctness for projected systems. Can we identify conditions on g-languages to ensure correctness of their projections? The answer to this question is positive since correctness can be characterised as a closure property.
Definition 3.3 (CUI). A g-language L is closed under unknown information, in symbols cui(L), if for all finite words w_1·α, w_2·α ∈ L with the same final interaction α = A→B:m ∈ Σ_int we have w·α ∈ L for all w ∈ L such that w↓A = w_1↓A and w↓B = w_2↓B.
Intuitively, participants cannot distinguish words with the same projection on their role. Hence, if two participants A and B find words w_1 and w_2 compatible with another word w, and interaction A→B:m can occur after both w_1 and w_2, then it should be enabled also after w. Indeed, A cannot know whether the current word is w or w_1 and, likewise, B cannot tell apart w and w_2. Hence, L should encompass w·A→B:m because, after the execution of w, A and B are willing to take A→B:m, which can thus happen at the system level. Closure under unknown information (CUI for short) lifts this requirement to the level of g-languages.
Example 3.4. The language L in Example 2.3 is not CUI because it contains the words
  w_1·α = C→A:w·A→B:g
  w_2·α = C→B:w·A→B:g
  w = C→A:w·C→B:w
and A cannot distinguish between w_1 and w while B cannot distinguish between w_2 and w; nonetheless w·A→B:g = C→A:w·C→B:w·A→B:g ∉ L. Notice that, as shown after Corollary 3.2, L ⊉ ⟦L↓⟧. ⋄
The language in Example 3.4 is not the semantics of any system; in fact, languages obtained as semantics of a communicating system are always CUI.
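Since CUI is stated on g-languages only, it can be checked by brute force on a finite g-language. The sketch below, under the assumed triple encoding of interactions, collects every word w·α required by Definition 3.3 that is missing from the language; cui_missing is a hypothetical helper name and the cubic enumeration is meant for small examples only.

```python
def pref(words):
    """Prefix closure of a set of finite words."""
    return {w[:i] for w in words for i in range(len(w) + 1)}

def project(word, p):
    """Projection of a finite interaction word (triples) on participant p."""
    out = []
    for (a, b, m) in word:
        if p == a:
            out.append((a, b, "!", m))
        elif p == b:
            out.append((a, b, "?", m))
    return tuple(out)

def cui_missing(lang):
    """Words w·α demanded by closure under unknown information but absent from lang."""
    missing = set()
    for w1a in lang:
        if not w1a:
            continue
        alpha = w1a[-1]
        a, b, _m = alpha
        for w2a in lang:
            if not w2a or w2a[-1] != alpha:
                continue
            for w in lang:
                # a cannot tell w from w1, b cannot tell w from w2
                if (project(w, a) == project(w1a[:-1], a)
                        and project(w, b) == project(w2a[:-1], b)
                        and w + (alpha,) not in lang):
                    missing.add(w + (alpha,))
    return missing

def is_cui(lang):
    return not cui_missing(lang)

CA, CB, AB = ("C", "A", "w"), ("C", "B", "w"), ("A", "B", "g")
L = pref({(CA, AB), (CB, AB), (CA, CB)})   # Example 2.3
```

On the language of Example 2.3 the check reports two missing words: the one discussed in Example 3.4 and A→B:g, obtained with w = ε, w_1 = C→B:w and w_2 = C→A:w.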

Proposition 3.5 (Semantics is CUI). For all systems S, ⟦S⟧ is CUI.
Proof. Let S be a system over P. In order to show closure of ⟦S⟧ under unknown information, let us take words w_A·A→B:m, w_B·A→B:m, w′ ∈ ⟦S⟧ such that w_A↓A = w′↓A and w_B↓B = w′↓B. By Definition 3.3, we have to show that w′·A→B:m ∈ ⟦S⟧. This in turn, by definition of synchronous semantics, amounts to showing that (w′·A→B:m)↓X ∈ S(X) for each X ∈ P. If X ∉ { A, B } then (w′·A→B:m)↓X = w′↓X ∈ S(X), since w′ ∈ ⟦S⟧. If X = A then, by hypothesis and definition of projection, we have (w′·A→B:m)↓A = w′↓A·A B!m = w_A↓A·A B!m = (w_A·A→B:m)↓A ∈ S(A), where the membership holds since w_A·A→B:m ∈ ⟦S⟧. The case X = B is symmetric.
The next property connects finite and infinite words in a language; it corresponds to the closure under the limit operation used in ω-languages [Eil76, Sta97].
Definition 3.6 (Continuity). A language L on an alphabet Σ is continuous if z ∈ L for all z ∈ Σ^ω such that pref(z) ∩ L is infinite.
This notion of continuity, besides being quite natural, is the most suitable for our purposes among the possible ones [Red86]. Intuitively, a language L is continuous if an ω-word is in L when infinitely many of its approximants (i.e., finite prefixes) are in L. A g-language L is standard or continuous (sc-language, for short) if either L ⊆ Σ^⋆_int or L is continuous. Notice that, for prefix-closed languages, for all z ∈ Σ^ω we have that pref(z) ∩ L is infinite if and only if every finite prefix of z belongs to L.
Closure under unknown information characterises correct projected systems.
Theorem 3.7 (Characterisation of correctness). If L↓ is correct with respect to L then cui(L) holds. If L is an sc-language and cui(L) then L↓ is correct with respect to L.
Proof. We prove the first implication. In order to show closure of L under unknown information, let us take words w_A·A→B:m, w_B·A→B:m, w′ ∈ L such that w_A↓A = w′↓A and w_B↓B = w′↓B. By Definition 3.3, we have to show that w′·A→B:m ∈ L. Thanks to correctness, it is enough to show that w′·A→B:m ∈ ⟦L↓⟧. This in turn, by definition of synchronous semantics, amounts to showing that (w′·A→B:m)↓X ∈ L↓X for each X ∈ ptp(L), which follows as in the proof of Proposition 3.5.
We now prove the second implication. By Definition 3.1, we have to show that ⟦L↓⟧ ⊆ L; we proceed by contradiction. Fix a word w ∈ ⟦L↓⟧ \ L. The only possible cases are:
L ⊆ Σ^⋆_int: By Fact 2.10, ⟦L↓⟧ does not contain infinite g-words. Hence, w is finite and we can take its longest prefix w′ that belongs to L. Let α = A→B:m be the interaction immediately following w′ in w. We can choose w_A, w_B ∈ L such that w_A↓A = w↓A and w_B↓B = w↓B. (Recall that by Definition 2.9 for each X ∈ P there is a word w_X ∈ L such that w↓X = w_X↓X.) Take the shortest prefixes w′_A and w′_B of w_A and w_B respectively such that w′_A↓A = (w′·α)↓A and w′_B↓B = (w′·α)↓B. By minimality, both w′_A and w′_B end with α, and the words preceding α in them have the same projections as w′ on A and on B, respectively. Now, by w′ ∈ L and the definition of CUI, we infer that w′·A→B:m ∈ L against the hypothesis that w′ was the longest such prefix of w.
L is continuous: If the set { w′ ∈ Σ^⋆_int | w′ ⪯ w and w′ ∈ L } is finite then we take the longest g-word in it and the proof is as in the previous case. Otherwise we have a contradiction because, by continuity, w ∈ L.
Notice that CUI is defined in terms of g-languages only, hence checking CUI does not require building the corresponding system. This allows us to study CUI on specific classes of g-languages. For instance, we can show that CUI is decidable on a class of languages accepted by Büchi automata (cf. Section 7). An interesting observation is that strengthening the precondition of Definition 3.3 with the additional requirement w_1 = w_2 would invalidate Theorem 3.7. Indeed, L ∪ { A→B:g } with L the language in Example 2.3 would become CUI but not correct.
The next example shows that the continuity condition in Theorem 3.7 is necessary for languages containing infinite g-words.
Example 3.8 (Continuity matters). The CUI language L does contain an infinite word but it is not continuous. The projection of L is not correct because its semantics contains the g-word A→B:l·B→C:n·(C→D:n)^ω. This word does not belong to L, yet it is in the semantics since the projections of C and D can exchange infinitely many messages n, due to the infinite g-word of L, regardless of whether A and B exchange l or r. ⋄
Notice that, since L ⊆ ⟦L↓⟧ always holds, Theorem 3.7 implies that cui(L) characterises the languages L such that L = ⟦L↓⟧. Besides, the following corollary descends from Theorem 3.7.
Corollary 3.9. If L is an sc-language, ⟦L↓⟧ is the smallest CUI sc-language containing L.
Proof. Let cl(L) = ⟦L↓⟧. Given an sc-language L, cui(cl(L)) holds by Proposition 3.5. Moreover, it is not difficult to check that if L is an sc-language, so is cl(L). L ⊆ cl(L) holds by extensiveness of cl. Now, in order to show that cl(L) is smaller than or equal to any CUI sc-language containing L, let us consider any sc-language L′ such that cui(L′) and L ⊆ L′. By monotonicity of cl, we have that cl(L) ⊆ cl(L′) and, by Theorem 3.7, cl(L′) = L′. So cl(L) ⊆ L′.
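The closure cl(L) = ⟦L↓⟧ used in the proof above can be computed for finite g-languages, which allows a sanity check of extensiveness, idempotency and CUI on the running example. This is a sketch under assumed tuple encodings of interactions and actions; semantics explores words by one-step extension and terminates only on finite local languages, and cl and is_cui are hypothetical helper names.

```python
def pref(words):
    return {w[:i] for w in words for i in range(len(w) + 1)}

def project(word, p):
    out = []
    for (a, b, m) in word:
        if p == a:
            out.append((a, b, "!", m))
        elif p == b:
            out.append((a, b, "?", m))
    return tuple(out)

def project_language(lang):
    participants = {p for w in lang for (a, b, _m) in w for p in (a, b)}
    return {p: {project(w, p) for w in lang} for p in participants}

def semantics(system):
    """Finite words all of whose projections belong to the local languages."""
    alphabet = {(a, b, m) for lang in system.values() for v in lang
                for (a, b, kind, m) in v
                if kind == "!" and a in system and b in system}
    sem, todo = set(), [()]
    while todo:
        w = todo.pop()
        if w in sem:
            continue
        sem.add(w)
        for alpha in alphabet:
            ext = w + (alpha,)
            if all(project(ext, p) in system[p] for p in system):
                todo.append(ext)
    return sem

def cl(lang):
    """The closure of a finite g-language: semantics of its projection."""
    return semantics(project_language(lang))

def is_cui(lang):
    """Brute-force check of Definition 3.3 on a finite g-language."""
    for w1a in lang:
        for w2a in lang:
            if not w1a or not w2a or w1a[-1] != w2a[-1]:
                continue
            alpha = w1a[-1]
            a, b, _m = alpha
            for w in lang:
                if (project(w, a) == project(w1a[:-1], a)
                        and project(w, b) == project(w2a[:-1], b)
                        and w + (alpha,) not in lang):
                    return False
    return True

CA, CB, AB = ("C", "A", "w"), ("C", "B", "w"), ("A", "B", "g")
L = pref({(CA, AB), (CB, AB), (CA, CB)})   # Example 2.3
closure = cl(L)
```

On Example 2.3, the closure strictly extends L, is CUI, and is left unchanged by a second application of cl.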
CUI ensures that continuous g-languages are concurrency closed.
Proposition 3.10. If L is an sc-language and cui(L), then L is concurrency closed.
Proof. Thanks to Theorem 3.7, L = ⟦L↓⟧, hence it is concurrency closed thanks to Proposition 2.14.
Hence, an sc-language cannot be CUI unless it is concurrency closed.
As recalled before, in many choreographic formalisms (such as [BDCLT21, HLV+16, CDYP16, BBO12, FBS04]) the correctness and completeness of a projected system, namely L = ⟦L↓⟧ (together with some forms of liveness and deadlock-freedom properties), is guaranteed by well-branchedness conditions. Most of such conditions guarantee, informally speaking, that participants reach consensus on which branch to take when choices arise. For instance, a well-branchedness condition could be that, at each choice, there is a unique participant deciding the branch to follow during a computation and that such participant informs each other participant. Such a condition is actually not needed to prove L = ⟦L↓⟧, as shown by the example below.
Example 3.11. The g-language L of Example 2.11 is CUI, without being well-branched in the above sense. Indeed, after the interaction C→A:w, there is a branching in the projected system, since both the interactions C→B:w and A→B:g can be performed. However, these interactions do not have the same sender.
A key merit of our model is its generality and expressiveness. We show this with an example of a non-regular CUI g-language whose projected system is correct and complete by Theorem 3.7 and Corollary 3.2, respectively.
Example 3.12 (A non-regular CUI g-language). We consider a task dispatching service where, as soon as a Server communicates its availability, a Dispatcher sends a task to S. The server either processes the task directly and sends back the resulting data to D or sends the task to participant H for some pre-processing, aiming at resuming it later on. Indeed, after communicating a result to D, the server can resume a previous task (if any) from H, process it, and send the result to D. The server eventually stops by sending s to both D and H; this can happen only when all dispatched tasks have been processed. The task dispatching scenario above is faithfully captured by the g-language L obtained by prefix-closing the (non-regular) language generated by the following context-free grammar.
The language in Example 3.12 is non-regular since it has the same structure as a language of well-balanced parentheses. Remarkably, this implies that the g-language cannot be expressed in most of the other choreographic models in the literature. There are models going beyond regular languages for both the binary (e.g., [Dar14, TV16]) and the multi-party case (e.g., [JY20]). These approaches are based on process algebraic methods. An interesting future research direction would be to study whether the multi-party models give rise to CUI and branch-aware languages. The argument used to show cui(L) in Example 3.12 proves the following.
Proposition 3.13. If there exists a participant involved in all the interactions of a g-language L then cui(L) holds.

Communication Properties
Besides correctness and completeness, other properties could be of interest for message-passing systems. For instance, one would like to ensure that participants eventually interact, if they are willing to. More generally, we are interested in some relations between the interactions in a system and the communication actions of its participants. We consider a number of properties, defined as follows.
Harmonicity (HA) requires that each sequence of communications that a participant is able to perform can be executed in some computation of the system.
The remaining communication properties rely on the notion of maximal computation (cf. beginning of Section 2).
Lock-freedom (LF) requires that if a participant has pending communications to make on an ongoing computation, then there is a continuation of the computation involving that participant.
Strong lock-freedom (SLF) requires that if a participant has pending communications to make on an ongoing computation, then each maximal continuation of the computation involves that participant.
Starvation-freedom (SF) requires that if a participant has pending communications to make on an ongoing computation, then each infinite continuation of the computation involves that participant.
Deadlock-freedom (DF) requires that in all completed computations each participant has no pending actions.
Definition 4.5 (Deadlock-freedom). A system S on P is deadlock free if, for each finite and maximal word w ∈ ⟦S⟧ and participant A ∈ P, w↓A is maximal in S(A).
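For systems with finite local languages, deadlock-freedom can be checked by enumeration. The sketch below implements Definition 4.5 directly; the lock-freedom check follows the informal description given above and the way the property is used in the proofs of this section, so its exact formulation is an assumption. Both example systems (stuck and fixed) are hypothetical and use the assumed tuple encoding of actions; properties that quantify over infinite words (SF, SLF) are out of scope.

```python
def pref(words):
    return {w[:i] for w in words for i in range(len(w) + 1)}

def project(word, p):
    out = []
    for (a, b, m) in word:
        if p == a:
            out.append((a, b, "!", m))
        elif p == b:
            out.append((a, b, "?", m))
    return tuple(out)

def semantics(system):
    """Finite semantics of a system with finite, prefix-closed local languages."""
    alphabet = {(a, b, m) for lang in system.values() for v in lang
                for (a, b, kind, m) in v
                if kind == "!" and a in system and b in system}
    sem, todo = set(), [()]
    while todo:
        w = todo.pop()
        if w in sem:
            continue
        sem.add(w)
        for alpha in alphabet:
            ext = w + (alpha,)
            if all(project(ext, p) in system[p] for p in system):
                todo.append(ext)
    return sem

def is_maximal(word, lang):
    """A finite word is maximal in a finite language if it has no proper extension there."""
    return not any(len(u) > len(word) and u[:len(word)] == word for u in lang)

def deadlock_free(system):
    sem = semantics(system)
    return all(is_maximal(project(w, p), system[p])
               for w in sem if is_maximal(w, sem) for p in system)

def lock_free(system):
    """Assumed reading of LF: a pending participant can act in some continuation."""
    sem = semantics(system)
    for w in sem:
        for p in system:
            pending = not is_maximal(project(w, p), system[p])
            if pending and not any(u[:len(w)] == w and project(u[len(w):], p)
                                   for u in sem):
                return False
    return True

# Hypothetical system: B wants to forward m to C, but C only accepts n.
stuck = {
    "A": pref({(("A", "B", "!", "m"),)}),
    "B": pref({(("A", "B", "?", "m"), ("B", "C", "!", "m"))}),
    "C": pref({(("B", "C", "?", "n"),)}),
}
# Same system with C accepting m.
fixed = dict(stuck, C=pref({(("B", "C", "?", "m"),)}))
```

In the stuck system the word A→B:m is finite and maximal in the semantics while B still has a pending output, so the system is neither deadlock-free nor lock-free under these readings; the fixed variant passes both checks.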
Except for harmonicity, these properties appear in the literature under different names in various contexts. For instance, the notion of lock-freedom in [BDCLT21] corresponds to ours, which in turn corresponds to the notion of liveness in [LNTY17, KS10] in a channel-based synchronous communication setting. Likewise, the notion of strong lock-freedom in [SD19] corresponds to ours and, under fair scheduling, to the notion of lock-freedom in [Kob02]. As a final example, the definition of deadlock-freedom, in its (equivalent) contrapositive form, coincides with the notion of progress as defined for synchronous processes in [Pad13, GJP+19]. Harmonicity, introduced in the present paper, ensures that no behaviour of a participant can be taken out from a system without affecting the overall behaviour of the system itself. Notice that the converse of harmonicity, ⟦S⟧↓A ⊆ S(A), holds by construction.
The next proposition highlights the relations among our properties.
Proof.
SLF ⟹ LF: Let A ∈ P, let w ∈ ⟦S⟧ be finite, and let w↓A be not maximal in S(A). By SLF we can infer that w is not maximal in ⟦S⟧ either. Otherwise, for w′ = ε, the word w·w′ would be maximal in ⟦S⟧ and, by SLF, we would get ε = w′↓A ≠ ε, a contradiction. So, since w is not maximal in ⟦S⟧, there exists w′ ≠ ε such that w·w′ is maximal in ⟦S⟧. We hence immediately get w′↓A ≠ ε by SLF.
LF ⟹ DF: Let us assume S to be lock-free and, by contradiction, not deadlock-free. Then there is a finite and maximal word w ∈ ⟦S⟧ and a participant A such that w↓A is not maximal in S(A). Since S is lock-free, by definition, there is a word w′ such that w·w′ ∈ ⟦S⟧ and w′↓A ≠ ε. Hence w′ ≠ ε and therefore w is not maximal in ⟦S⟧, contrary to our assumption.
It is immediate to check that S is vacuously deadlock-free, since there is no finite maximal word in ⟦S⟧. However, S is not lock-free. It is enough to take w = ε, which is a finite word in ⟦S⟧ such that w↓C = ε is not maximal in L_C. However, there is no w′ such that w·w′ ∈ ⟦S⟧ and w′↓C ≠ ε.
and consider the communicating system S = (L_X)_{X∈{ A,B,C }} whose synchronous semantics is
It is immediate to check that S is vacuously deadlock-free, since there is no finite maximal word in ⟦S⟧. However, S is not starvation-free. It is enough to take w = A→B:m, which is a finite word in ⟦S⟧ such that w↓C = ε is not maximal in L_C, but for w′ = (A→B:m)^ω we have that w·w′ ∈ ⟦S⟧ and w′↓C = ε.
It is easy to check that S is lock-free. However, S is not harmonic, since B A?m·A B!y ∈ S(A) and there is no w ∈ ⟦S⟧ such that w↓A = B A?m·A B!y.
and that S is harmonic. However, S is not lock-free. In fact, by taking w = C→A:m ∈ ⟦S⟧, we have that w↓A is not maximal in L_A, since C A?m·A C!m = w↓A·A C!m ∈ L_A. However, there is no word w′ such that w·w′ ∈ ⟦S⟧ and w′↓A ≠ ε.
SF ⇏ LF: Immediate, since otherwise, by LF ⟹ DF proved above, we would get SF to imply DF, which we showed not to hold.
It is also not difficult to check that S is lock-free. However, it is not starvation-free. In fact, by taking the non-maximal word w = ε, we have that for the infinite word w′ = (A→B:m)^ω and for the participant C, w↓C is not maximal and w·w′ ∈ ⟦S⟧. However, w′↓C = ε.
It is also not difficult to check that S is harmonic. However, it is not starvation-free. In fact, by taking the non-maximal word w = ε, we have that for the infinite word w′ = (A→B:m)^ω and for the participant C, w↓C is not maximal and w·w′ ∈ ⟦S⟧. However, w′↓C = ε.
S is deadlock-free, since the only finite maximal word is B→A:m, whose projections on A and B are both maximal. However, S is not harmonic, since A B!y ∈ S(A) and there is no w ∈ ⟦S⟧ such that w↓A = A B!y.
It is easy to check that S is harmonic. However, S is not deadlock-free, since for the maximal and finite word w = A→B:m ∈ ⟦S⟧ we have that w↓C (= ε) is not maximal in L_C.
DF ∧ SF ⟹ SLF. To show SLF, let w ∈ ⟦S⟧ be finite, let w↓A be non-maximal in S(A) for A ∈ P, and let w′ be such that w·w′ is maximal in ⟦S⟧. We have to show that w′↓A ≠ ε. We distinguish two cases:
• w′ is finite: then w·w′ is finite and maximal, so by DF its projection (w·w′)↓A is maximal in S(A). Since w↓A is not maximal, we can infer that w′ ≠ ε and w′↓A ≠ ε.
• w′ is infinite: we get w′↓A ≠ ε immediately by SF.

SLF ⟹ DF ∧ SF. To show DF by contradiction, assume that there is a finite maximal word w ∈ ⟦S⟧ such that w↓A is non-maximal in S(A) for some A ∈ P. By SLF, we have w′↓A ≠ ε for each w′ such that w·w′ is maximal in ⟦S⟧. Since we assumed w to be maximal, the only possible w′ is ε, contradicting w′↓A ≠ ε. To show SF by contradiction, assume that there is a finite word w ∈ ⟦S⟧ such that w↓A is non-maximal in S(A) for some A ∈ P, and that there exists an infinite w′ such that w·w′ ∈ ⟦S⟧ and w′↓A = ε. Since w′ is infinite, w·w′ is trivially maximal in ⟦S⟧ and we get a contradiction with SLF.

Communication Properties by Construction
Harmonicity (cf. Definition 4.1) is the only property guaranteed on any system obtained via projection. This is a simple consequence of Corollary 3.2.
Corollary 5.1. If L is a g-language then L↓ is harmonic.
Proof. By Corollary 3.2, L ⊆ ⟦L↓⟧. Now, by monotonicity of projection, we get L↓ ⊆ ⟦L↓⟧↓, that is, harmonicity of L↓.
To ensure the other properties on a system L↓ we need to require some conditions on the g-language L. Basically, we will strengthen CUI, which is too weak. For instance, cui(L) implies neither deadlock-freedom nor lock-freedom for L↓.

Informally, L is CUI because C can ascertain which of its last actions to execute from the first input. So, Corollary 3.2 and Theorem 3.7 ensure that L = ⟦L↓⟧. However, L↓ is not deadlock-free. In particular, w ∈ L = ⟦L↓⟧ is a deadlock since it is a finite maximal word whose projection on B, namely w↓B = A B?m, is not maximal in L↓B because w′↓B = A B?m·B C!m ∈ L↓B. By Proposition 4.6, L↓ is not lock-free either. ⋄

In many models (cf. [HLV+16]), a condition called well-branchedness is required in order to ensure, besides other properties, the correctness of L↓. We identify a notion weaker than well-branchedness, which by analogy we dub branch-awareness (BA for short).

Definition 5.3 (Branch-awareness). A participant X ∈ P distinguishes two g-words w₁, w₂ ∈ Σ_int^∞ if w₁↓X ≠ w₂↓X and w₁↓X ⊀ w₂↓X and w₂↓X ⊀ w₁↓X. A g-language L on P is branch-aware if each X ∈ P distinguishes all maximal words in L whose projections on X differ.
Condition w₁↓X ≠ w₂↓X in Definition 5.3 is not strictly needed to define BA, but it makes the notion of 'distinguishes' more intuitive. Equivalently, as shown in Proposition 5.5 below, a participant X distinguishes two branches if, after a common prefix, X is actively involved in both branches, performing different interactions.

Proposition 5.5. Participant X distinguishes two g-words w₁, w₂ ∈ Σ_int^∞ iff there are w′₁·α₁ ⪯ w₁ and w′₂·α₂ ⪯ w₂ such that w′₁↓X = w′₂↓X and α₁↓X ≠ α₂↓X.

Proof. Trivial.
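As an illustration of Definition 5.3, the following minimal Python sketch checks whether a participant distinguishes two g-words. It covers finite words only (the definition also applies to infinite ones), and the encoding of interactions as triples, the function names, and the sample words are assumptions made for this example rather than notation from the paper.

```python
from typing import List, Tuple

Interaction = Tuple[str, str, str]   # (sender, receiver, message), read as sender→receiver:message
Action = Tuple[str, str, str, str]   # (sender, receiver, "!" or "?", message)


def project(word: List[Interaction], x: str) -> List[Action]:
    """Projection of a finite g-word on participant x."""
    actions = []
    for sender, receiver, message in word:
        if x == sender:
            actions.append((sender, receiver, "!", message))
        elif x == receiver:
            actions.append((sender, receiver, "?", message))
        # interactions not involving x are erased
    return actions


def is_strict_prefix(u: list, v: list) -> bool:
    return len(u) < len(v) and v[: len(u)] == u


def distinguishes(x: str, w1: List[Interaction], w2: List[Interaction]) -> bool:
    """x distinguishes w1 and w2 if their projections on x differ
    and neither projection is a strict prefix of the other."""
    p1, p2 = project(w1, x), project(w2, x)
    return p1 != p2 and not is_strict_prefix(p1, p2) and not is_strict_prefix(p2, p1)


# Hypothetical branches: B acts differently from its very first action.
w1 = [("A", "B", "m"), ("A", "C", "x")]
w2 = [("B", "C", "m"), ("A", "C", "y")]
# Hypothetical branches where C cannot tell whether to stop or to wait for an input.
w3 = [("A", "B", "m")]
w4 = [("A", "B", "m"), ("B", "C", "n")]

print(distinguishes("B", w1, w2))
print(distinguishes("C", w3, w4))
```

In the second pair the projection of w3 on C is a strict prefix of the projection of w4 on C, which is exactly the situation that branch-awareness rules out for maximal words.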
The notions of well-branchedness in the literature [HLV+16] additionally impose that α₁↓X and α₂↓X in the above proposition are input actions, except for a (unique) participant (a.k.a. the selector) which is required to have different outputs. In our case the notion of selector corresponds to the "first" participant that distinguishes two words. Also in our case a selector must be involved in each branch but, due to the perfect symmetry of input and output actions in synchronous communications, its involvement can happen through input or output actions. This is illustrated by the following example.

The participant B immediately distinguishes w and w′ via its first actions, that is, the input from A and the output to C. Hence, B is the selector of the branch made of w and w′. Notice that A and C also distinguish these two words; however, this happens "later". Finally, observe that the same would hold if we replaced B→C:m with C→B:m in w′. ⋄

In our theory, BA is not needed for correctness, but it is nevertheless useful to prove the communication properties presented in Section 4.
Theorem 5.7 (Consequences of BA). Let L be a branch-aware and CUI sc-language. Then L↓ satisfies all the properties in Definitions 4.1 to 4.5.
Proof. Let L be a branch-aware sc-language such that cui(L) holds. We prove the properties separately.

Harmonicity. Immediate by Corollary 5.1.

Lock-freedom. By contradiction, assume that L↓ is not lock-free. By Definition 4.2 and cui(L), it follows that there exist a participant A ∈ P and a finite g-word w ∈ L such that

By Corollary 5.1, L↓ is harmonic. Hence, by the above and cui(L), there exists w′′ ∈ L such that

This means that, by taking a maximal extension of w and a maximal extension of w′′·ŵ, we would get two non branch-aware words in L, contradicting our hypothesis of L being branch-aware.

Deadlock-freedom. Immediate by Proposition 4.6.

Starvation-freedom. By contradiction, assume that L↓ is not starvation-free. By Definition 4.4 and cui(L), it follows that there exist a participant A ∈ P and a finite g-word w ∈ L such that
• w↓A is not maximal in L↓A;
• there is an infinite word w′ such that w·w′ ∈ L and w′↓A = ε.
Now, by harmonicity of L↓ (Corollary 5.1), non-maximality of w↓A and cui(L), it follows that there exist w′′ ∈ L and a finite word ŵ such that

The above means that, by taking w·w′ and any maximal extension of w′′·ŵ, we would get two maximal words in L which A cannot distinguish, contradicting our hypothesis of L being branch-aware.

Strong lock-freedom. Immediate since SLF = SF ∧ DF.
Example 5.8 (Task dispatching and branch-awareness). In order to show that the g-language L in Example 3.12 is branch-aware, we first notice that each maximal word in

It is not difficult to show that branch-awareness actually characterises SLF for systems obtained by projecting CUI languages.
Proposition 5.10 (Branch-awareness characterises SLF). A CUI g-language L is branch-aware iff L↓ is strongly lock-free.
Proof. Necessity follows from Theorem 5.7, while for sufficiency we reason by contraposition. Assume that L is not branch-aware. Then, by Definition 5.3, there exist a participant X and two maximal words w₁, w₂ ∈ L such that their projections on X differ and either w₁↓X ≺ w₂↓X or w₂↓X ≺ w₁↓X. Let us consider the first case (the second is analogous). Let now w′₁ and w′₂ be the maximal prefixes (hence finite) of, respectively, w₁ and w₂ such that w′₁↓X = w′₂↓X. We distinguish the following two cases.
• w′₁ = w₁: in this case L↓ is not deadlock-free since, for the finite maximal word w₁, we have that w₁↓X is not maximal. So, by Proposition 5.9, L↓ is not strongly lock-free either.
• w′₁ ≺ w₁: in this case w′₁ is a non-maximal finite word such that w₁ = w′₁·w′′₁ is maximal, w₁↓X is non-maximal and w′′₁↓X = ε, that is, L↓ is not strongly lock-free.

CFSMs and Choreography Languages
Communicating finite-state machines (CFSMs) have been introduced in [BZ83] as a convenient model to analyse message-passing protocols. Basically, a CFSM is a finite-state automaton (FSA), defined below, whose transitions are actions.
Definition 6.1 (Finite state automaton (FSA)). A finite state automaton (FSA) is a tuple A = ⟨S, q₀, Λ, →⟩ where
• S is a finite set of states (ranged over by q, s, . . .) and q₀ ∈ S is the initial state;
• Λ is a finite set of labels (ranged over by l, λ, . . .);
• → ⊆ S × Λ × S is a set of transitions.

The set of reachable states of
Remark 6.2. Our definition of FSA omits the set of accepting states since we consider only FSAs where each state is accepting. ⋄

Following the assumption in Remark 6.2, we define the language L(A) as the union of the finite words accepted by A in the classical sense and the infinite words accepted by A considered as a Büchi automaton² where all states are accepting.

A CFSM is an FSA labelled in Σ_act ∪ {ε}, where ε is not a symbol in Σ_act; it overloads the notation for the empty string to represent internal transitions of CFSMs, as usual in automata theory.
Remark 6.3. FSAs, and consequently CFSMs, can be deterministic or not. Deterministic FSAs have no label ε, and transitions from the same state are pairwise different. Given a non-deterministic FSA one can build a deterministic FSA generating the same language. We will assume that each CFSM is deterministic since we are interested in languages and non-deterministic CFSMs can be determinised while preserving their language.
Notice that it is necessary to adopt finer notions of equivalence, such as bisimilarity [Par81], in order to tell apart deterministic FSAs from non-deterministic ones. It could be interesting to investigate the possibility of including the notion of nondeterminism in FCLs in the future. ⋄

A CFSM is local to a participant A (A-local for short) if all its transitions have subject A; we will consider communicating systems where the behaviour of each participant A is specified by an A-local CFSM. Formally, given a finite set P ⊆ P of participants, a system of CFSMs is a map (M_A)_{A∈P} assigning an A-local CFSM M_A to each participant A ∈ P such that any participant occurring in a label of a transition of M_A is in P.
The synchronous behaviour of a system of CFSMs (M A ) A∈P has been defined in [BLT20] as any FSA where states are maps assigning a state in M A to each A ∈ P and transitions are labelled by interactions. Intuitively, given a configuration s, if M A and M B have respectively denotes the update of f on x with y. The next definition is a slightly different version of the one in [BLT20], where the semantics of a system of CFSMs S is the FSA (|S| ) defined below rather than its language as in our case.
Definition 6.4 (Synchronous semantics of systems of CFSMs). Let S = (M A ) A∈P be a system of CFSMs where M A = ⟨S A , q 0A , Σ act , → A ⟩ for each participant A ∈ P. A synchronous configuration for S is a map s = (q A ) A∈P assigning a local state q A ∈ S A to each A ∈ P.
The synchronous semantics of S is the g-language L((|S|)) where (|S|) = ⟨S, s₀, Σ_int, →⟩ is defined as follows:
• S is the set of configurations of S, as defined above, and s₀ : A ↦ q₀A for each A ∈ P is the initial configuration of S

As we will see in Section 7, the synchronous semantics in Definition 6.4 is a choreography automaton [BLT20]. Note that this is not the case for the asynchronous semantics of communicating systems, which is in general a transition system with infinitely many configurations.
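The construction of (|S|) can be sketched in Python as follows. The sketch assumes the usual synchronous rule, in which a configuration s moves to s[A ↦ q′, B ↦ r′] with label A→B:m whenever M_A can fire the output A B!m from s(A) to q′ and M_B can fire the matching input A B?m from s(B) to r′. The tuple encoding of actions, the function name, and the restriction to reachable configurations are choices made for this example only.

```python
from collections import deque


def synchronous_product(machines, initial):
    """Reachable part of the synchronous product of a system of CFSMs.

    machines: dict mapping each participant to a set of transitions
              (state, (sender, receiver, kind, message), next_state),
              where kind is "!" for outputs and "?" for inputs.
    initial:  dict mapping each participant to its initial state.
    """
    participants = sorted(machines)
    index = {p: i for i, p in enumerate(participants)}
    s0 = tuple(initial[p] for p in participants)
    configs, transitions = {s0}, set()
    queue = deque([s0])
    while queue:
        s = queue.popleft()
        for a in participants:
            for (q, (snd, rcv, kind, msg), q_next) in machines[a]:
                if q != s[index[a]] or kind != "!":
                    continue
                # look for the matching input in the receiver's machine
                for (r, label, r_next) in machines[rcv]:
                    if r == s[index[rcv]] and label == (snd, rcv, "?", msg):
                        t = list(s)
                        t[index[a]], t[index[rcv]] = q_next, r_next
                        t = tuple(t)
                        transitions.add((s, (snd, rcv, msg), t))
                        if t not in configs:
                            configs.add(t)
                            queue.append(t)
    return participants, s0, configs, transitions


# Hypothetical two-participant system: A sends m to B once.
machines = {
    "A": {(0, ("A", "B", "!", "m"), 1)},
    "B": {(0, ("A", "B", "?", "m"), 1)},
}
participants, s0, configs, transitions = synchronous_product(machines, {"A": 0, "B": 0})
print(transitions)
```

Configurations are stored as tuples ordered by participant name, so the product has finitely many states whenever the machines do.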
An immediate relation between systems of CFSMs and our communicating systems is that, given a system of CFSMs S = (M_X)_{X∈P}, we can define the abstract system corresponding to S as Ŝ = (L(M_X))_{X∈P}. Unsurprisingly, the semantics of S and Ŝ coincide.

Proof. The proof shows the commutativity of the following diagram, where S is the set of systems of CFSMs, L is the set of communicating systems (cf. Definition 2.6), C is the set of c-automata, and G is the set of global languages.
Let M_X = ⟨S_X, q₀X, Σ_act, →_X⟩ and let (|S|) = ⟨S, s₀, Σ_int, →⟩ be as in Definition 6.4. We prove the equality by proving separately the following two inclusions, for all S.

L((|S|)) ⊆ ⟦Ŝ⟧. By definition of Ŝ, it is enough to show that, for any w ∈ Σ_int^∞, if w ∈ L((|S|)), then w↓X ∈ L(M_X) for any X ∈ P. Let w ∈ L((|S|)) and proceed by coinduction on the paths of machine (|S|).
If w = ε the thesis follows immediately. Otherwise

⟦Ŝ⟧ ⊆ L((|S|)). For this case we have to prove that, for all w ∈ Σ_int^∞, if w ∈ ⟦Ŝ⟧ then w ∈ L((|S|)). Let w ∈ ⟦Ŝ⟧ and proceed by coinduction. If w = ε the thesis follows immediately. Otherwise, let w = (A→B:m)·w′. Since w↓A = (A B!m)·w′↓A, w↓B = (A B?m)·w′↓B and, for each X ∈ P \ {A, B}, w↓X = w′↓X, and since, for each X ∈ P, we have w↓X ∈ L(M_X), it follows, by definition of recognised word and by Definition 2.9, that w′ ∈ ⟦Ŝ′⟧, with

• S is deadlock-free if in none of its reachable configurations without outgoing transitions there exists A ∈ P willing to communicate.
The next definition formalises properties of systems of CFSMs.
Definition 6.6 (Communication properties of systems of CFSMs [BLT20]). Let S = (M X ) X∈P be a system of CFSMs.
• Liveness: S is live if for each configuration s ∈ R((|S| )) and each A ∈ P such that s(A) has some outgoing transition in M A , there exists a run of (|S| ) from s including a transition involving A. • Lock freedom: a configuration s ∈ R((|S| )) is a lock if there is A ∈ P with an outgoing transition from s(A) in M A and there exists a run of (|S| ) starting from s, maximal with respect to prefix order and containing no transition involving A. System S is lock-free if for each s ∈ R((|S| )), s is not a lock. • Deadlock freedom: a configuration s ∈ R((|S| )) is a deadlock if s has no outgoing transitions in (|S| ), yet there exists A ∈ P such that s(A) has an outgoing transition in M A . System S is deadlock-free if for each s ∈ R((|S| )), s is not a deadlock.
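The deadlock clause of Definition 6.6 lends itself to a direct check once the reachable configurations of (|S|) are available. The following Python sketch takes them as explicit inputs; the data layout (configurations as tuples ordered like `participants`, local transitions as triples) and the function name are assumptions of this example.

```python
def deadlock_configurations(participants, configs, transitions, machines):
    """Return the configurations that are deadlocks in the sense of Definition 6.6.

    participants: list fixing the order of components in a configuration.
    configs:      reachable configurations of the product, as tuples of local states.
    transitions:  transitions of the product, as (config, label, config) triples.
    machines:     dict mapping each participant to its local transitions
                  (state, label, next_state).
    """
    has_outgoing = {s for (s, _, _) in transitions}
    deadlocks = []
    for s in configs:
        if s in has_outgoing:
            continue  # the system can still move from s
        for i, p in enumerate(participants):
            # some participant still has a local transition it cannot fire
            if any(q == s[i] for (q, _, _) in machines[p]):
                deadlocks.append(s)
                break
    return deadlocks


# Hypothetical system: A wants to send m to B, but B has no matching input.
machines = {"A": {(0, ("A", "B", "!", "m"), 1)}, "B": set()}
print(deadlock_configurations(["A", "B"], {(0, 0)}, set(), machines))
```

A system is deadlock-free in the sense of Definition 6.6 exactly when this function returns an empty list on its reachable configurations.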
It is the case that lock-freedom, strong lock-freedom, and deadlock-freedom of Ŝ (in the sense of Definitions 4.3 to 4.5) respectively imply liveness, lock-freedom, and deadlock-freedom of S, as stated in Proposition 6.8 below, which relies on the following lemma.
Lemma 6.7. Let S = (M X ) X∈P be a system of CFSMs. If w ∈ L((|S| )) is recognised by a run of (|S| ) from s 0 to s then, for each A ∈ P, w↓ A ∈ L(M A ) and w↓ A is recognised by a run of M A from s 0 (A) to s(A).
Proof. By induction on the length of w using Definition 6.4.

Proposition 6.8. For all systems of CFSMs S = (M_X)_{X∈P}:
(i) Ŝ is lock-free iff S is live;
(ii) Ŝ is strongly lock-free iff S is lock-free;
(iii) Ŝ is deadlock-free iff S is deadlock-free.
Proof. (i) ⇒. Let Ŝ be lock-free. Following Definition 6.6, in order to show the liveness of S, let us consider s ∈ R((|S|)), and let s(A) −a→ be a transition in M_A (for a certain A ∈ P). Let now w be the trace of a run of (|S|) from s₀ to s. By Proposition 6.5, w ∈ ⟦Ŝ⟧ and, by Lemma 6.7, w↓A ∈ L(M_A) and w↓A is the trace of a run from s₀(A) to s(A) in M_A. Now, to prove that S is live, we have to show that there exists a run of (|S|) from s such that one of its transitions has a component transition from s(A). We have that w↓A·a ∈ L(M_A), hence w↓A is not maximal in Ŝ(A). By lock-freedom of Ŝ, there is w′ such that w·w′ ∈ ⟦Ŝ⟧ and w′↓A ≠ ε. By Proposition 6.5, w′ is also the trace of a run of (|S|) from s and, by Lemma 6.7 (considering the system of CFSMs S′ = (M′_X)_{X∈P} where each M′_X is like M_X but with s(X) as initial state), such a run has a transition from s(A) as component transition.

(i) ⇐. By contraposition, assume that Ŝ is not lock-free. Then there exist a participant A and a word w ∈ ⟦Ŝ⟧ such that
• w↓A is not maximal in Ŝ(A); and
• for all w′, w·w′ ∈ ⟦Ŝ⟧ implies w′↓A = ε.
Let s be the configuration in (|S|) reached by recognising w. Since w↓A is not maximal, we get, by Lemma 6.7 and determinism of the automata in S, that s(A) has at least one outgoing transition. If S were live, then there would be a run of (|S|) from s including a transition involving A, and hence a word w·w′ ∈ L((|S|)) = ⟦Ŝ⟧ with w′↓A ≠ ε, contradicting the second condition above. So S is not live and we are done.

The proofs of (ii) and (iii) are similar to that of (i).
Remark 6.9. Proposition 6.8 does not hold if we consider non-deterministic CFSMs. For instance, let us consider the following system of CFSMs.
The corresponding communicating system is Ŝ = (L_X)_{X∈{A,B}} where

It is easy to check that Ŝ is deadlock-free. However, S is not, since the system can reach the stuck configuration (q₁, q₅) and there is an outgoing transition from q₁ in M_A. Likewise, the communicating system made of M_A and either of the following machines

Choreography Automata and Choreography Languages
We advocated choreography automata (c-automata) [BLT20] as an expressive and flexible model of global specifications. Essentially c-automata are finite-state automata (FSAs) whose transitions are labelled with interactions. This yields an immediate connection between g-languages and c-automata in terms of the languages the latter accept. This section explores such connection, as well as the connection between the projection operation on c-automata in [BLT20] and the projection on g-languages.
7.1. The g-languages of c-automata. Choreography automata (c-automata for short, ranged over by CA, CB, etc.) are defined in [BLT20] as deterministic FSAs with labels in the set Σ int of interactions. (Observe that the set of participants occurring in a c-automaton is necessarily finite.) We can see c-automata as a tool for specifying g-languages as shown by the next proposition.
Proof. To show that L(CA) is a g-language we need to show prefix closure. It follows from the fact that all the states are accepting: if a word w is in L(CA) then any prefix of w can be generated by taking the corresponding prefix of the computation generating w.
We now need to show that the language is continuous. We need to show that L(CA) contains an infinite word if it contains all its finite prefixes. Note that, thanks to determinism, words which are prefixes one of the other are generated by computations which are prefixes one of the other as well. Let w be an infinite word whose prefixes are in L(CA). The infinite run obtained as the limit of the runs generating the prefixes of w generates w. This concludes the proof.
Interestingly, c-automata have an immediate relation with CFSMs [BZ83], introduced in the previous section, which can be adopted to model the local behaviour of distributed components. Indeed, the local behaviour of a participant of a c-automaton CA can be algorithmically obtained directly from CA. Formally:³

Definition 7.2 (Projection of c-automata [BLT20]). Let CA = ⟨S, q₀, Σ_int, →⟩ be a c-automaton and A be a participant. The projection of CA on A is the CFSM CA↓A obtained by determinising up-to-language equivalence the intermediate automaton

The projection of CA, written CA↓, is the system of CFSMs (CA↓A)_{A∈P}.
The l-language of a projection of a c-automaton CA coincides with the projection of the language of CA: Proposition 7.3. If CA is a c-automaton on P then for all X ∈ P, L(CA)↓ X = L(CA↓ X ).
Proof. By definition of projection and since determinisation preserves the language, we have L(CA↓ X ) = L(A X ), where A X is the intermediate automaton (cf. Definition 7.2). Since A X is obtained by taking the projection of every transition in CA, L(A X ) = L(CA)↓ X . The thesis follows by transitivity.
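A possible reading of the projection of a c-automaton, following the proof above, is sketched below in Python: every transition label is projected on the chosen participant (interactions not involving it become ε, encoded here as `None`), and the result is determinised with a standard subset construction over ε-closures. The encoding, the function names, and the use of frozensets as states of the determinised machine are assumptions of this example, not the construction of [BLT20] verbatim.

```python
def project_label(label, x):
    """Project an interaction (sender, receiver, message) on participant x."""
    sender, receiver, message = label
    if x == sender:
        return (sender, receiver, "!", message)
    if x == receiver:
        return (sender, receiver, "?", message)
    return None  # ε: x is not involved


def project_automaton(transitions, q0, x):
    """Project a c-automaton on x and determinise the intermediate automaton."""
    inter = {(q, project_label(l, x), q2) for (q, l, q2) in transitions}

    def closure(states):
        # states reachable through ε-transitions only
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for (p, l, p2) in inter:
                if p == q and l is None and p2 not in seen:
                    seen.add(p2)
                    stack.append(p2)
        return frozenset(seen)

    start = closure({q0})
    dfa, seen, todo = set(), {start}, [start]
    while todo:
        current = todo.pop()
        labels = {l for (p, l, _) in inter if p in current and l is not None}
        for l in labels:
            target = closure({p2 for (p, l2, p2) in inter if p in current and l2 == l})
            dfa.add((current, l, target))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, dfa


# Hypothetical c-automaton: A→B:m, then C→D:n, then A→B:k.
ca = {(0, ("A", "B", "m"), 1), (1, ("C", "D", "n"), 2), (2, ("A", "B", "k"), 3)}
start, dfa = project_automaton(ca, 0, "A")
print(sorted(dfa, key=str))
```

In the example the interaction between C and D disappears from the machine of A, whose second state is the set {1, 2} of states of the c-automaton that A cannot tell apart.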
Observing that CA↓ is ε-free, the LTS of CA↓ is a c-automaton and its language coincides with the g-language of the system (L(CA↓ X )) X∈P thanks to Proposition 6.5.
The notion of well-formedness we provided in [BLT20] was aimed at guaranteeing correctness and completeness as well as the communication properties of projected systems (cf. Definition 6.6). We discovered later on that our notion of well-formedness in [BLT20] was flawed: it does not correctly handle the interplay between concurrent transitions and choices. This is shown by the following example.

is well-formed according to [BLT20, Def. 4.12]. However, the system CA↓ admits the run C→B:r C→D:n, which is not a word accepted by CA. ⋄

³ Overloading the projection operator of Definition 2.7 does not introduce confusion and avoids the introduction of further notation.

In our setting, the problem of the above CA is that L(CA) is not CUI. In fact, by setting w₁ = A→B:m and w₂ = C→D:n, we have that w₁·C→B:r, w₂·C→B:r ∈ L(CA). Now, by taking w = ε, we have that w₁↓C = w↓C and w₂↓B = w↓B, but w·C→B:r = C→B:r ∉ L(CA).
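The CUI violation just described can be replayed mechanically on a small finite language. The Python sketch below searches, by brute force, for witnesses w₁, w₂, w and α = A→B:m such that w₁·α and w₂·α belong to the language, w↓A = w₁↓A and w↓B = w₂↓B, yet w·α does not belong to it. Since the c-automaton of the example is not reproduced here, the language used is a hypothetical fragment: the prefix closure of the two words mentioned in the text. All names are illustrative.

```python
from itertools import product


def project(word, x):
    """Projection of a finite g-word (tuple of (sender, receiver, message)) on x."""
    return tuple(
        (a, b, "!" if x == a else "?", m) for (a, b, m) in word if x in (a, b)
    )


def prefix_closure(words):
    return {w[:i] for w in words for i in range(len(w) + 1)}


def cui_violations(lang):
    """List the witnesses (w1, w2, w, alpha) showing that a finite language is not CUI."""
    lang = set(lang)
    alphabet = {alpha for w in lang for alpha in w}
    witnesses = []
    for alpha in alphabet:
        sender, receiver, _ = alpha
        extendable = [w for w in lang if w + (alpha,) in lang]
        for w1, w2 in product(extendable, extendable):
            for w in lang:
                if (project(w, sender) == project(w1, sender)
                        and project(w, receiver) == project(w2, receiver)
                        and w + (alpha,) not in lang):
                    witnesses.append((w1, w2, w, alpha))
    return witnesses


AB, CD, CB = ("A", "B", "m"), ("C", "D", "n"), ("C", "B", "r")
# Hypothetical fragment of L(CA): only the two words named in the text and their prefixes.
lang = prefix_closure({(AB, CB), (CD, CB)})
print(((AB,), (CD,), (), CB) in cui_violations(lang))
```

The witness found has w = ε, as in the text: C has seen nothing of w₁ and B has seen nothing of w₂, so in the projected system both are ready to perform C→B:r immediately.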

7.2. Deciding CUI and Branch-Awareness. In this section we show that CUI and BA are decidable when restricting to c-languages associated to c-automata. The approach of the present paper can hence be used to prove the communication properties considered in [BLT20] (namely those in Definition 6.6) by showing the corresponding ones in the FCL setting.
We begin by unveiling an interesting interplay between c-automata and FCLs. Languages accepted by c-automata are continuous closures of regular languages, so making conditions like CUI and branch-awareness decidable.
Theorem 7.5. CUI is decidable on languages accepted by c-automata.
Proof. Let L be a language accepted by a c-automaton CA. We show that ¬cui(L) is decidable by reducing the problem to a search in the FSAs CA, CA↓ A , and CA↓ B .
By definition of CUI we have ¬cui(L) iff there are an interaction α = A→B:m and finite words w₁, w₂, w ∈ L such that

This amounts to finding a state q of CA and two states⁴ Q_A and Q_B, respectively in CA↓A and CA↓B, such that

In order to get a decision procedure for ba(L), we observe that the following equivalences hold. ¬ba(L) iff there are X ∈ P and two maximal words

there are two maximal words w′₁ ∈ L([CA]_p) and w′₂ ∈ L([CA]_q) such that w′₁↓X = ε and w′₂↓X ≠ ε.

Intuitively, the first equivalence above states that a participant X cannot distinguish whether a computation halts with word w₁ or continues as w₂. Transferring this condition to automata yields the last equivalence, which basically requires that L([A_X]_p) ∩ L([A_X]_q) ≠ ∅. Then CA has two paths from its initial state, reaching p and q respectively, with two words whose projections on X coincide. Therefore there are two runs, from p and q respectively, such that X does not occur in the run from p while it occurs in the run from q, as required by condition b).
We now notice that condition a) is decidable because the intersection of ω-regular languages is computable. In order to decide condition b), instead, we can proceed as follows. We perform a breadth-first search on [A_X]_p, stopping either when a state does not have outgoing transitions or when it has already been visited. Condition b) holds iff the resulting tree has a path from p to a leaf made of ε-transitions only. Finiteness of automata guarantees that this procedure terminates. Likewise we can check the existence of a word w′₂ ∈ L([CA]_q) such that w′₂↓X ≠ ε (as required by condition b)).
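The first half of condition b) asks for a maximal run from p whose projection on X is empty. The Python sketch below is one way to search for such a run; it deliberately swaps the breadth-first search described above for a depth-first search with cycle detection, and it rests on the following reading, which is an assumption of this example: a silent maximal run is either a finite path of ε-transitions ending in a state without outgoing transitions, or an infinite run looping through ε-transitions only. ε is encoded as `None`, and the function name is illustrative.

```python
def has_silent_maximal_run(transitions, p):
    """Check whether a maximal run made of ε-transitions only starts from state p.

    transitions: set of (state, label, next_state); label None stands for ε.
    """
    outgoing = {}
    for (q, label, q2) in transitions:
        outgoing.setdefault(q, []).append((label, q2))

    on_path, explored = set(), set()

    def visit(q):
        if not outgoing.get(q):
            return True      # no outgoing transition at all: the finite run is maximal
        if q in on_path:
            return True      # ε-cycle: an infinite run with empty projection
        if q in explored:
            return False     # already explored without success
        on_path.add(q)
        found = any(visit(q2) for (label, q2) in outgoing[q] if label is None)
        on_path.discard(q)
        explored.add(q)
        return found

    return visit(p)


# Hypothetical intermediate automata for some participant X.
stops_silently = {(0, None, 1)}
must_act = {(0, None, 1), (1, ("A", "X", "?", "m"), 2)}
loops_silently = {(0, None, 0), (0, ("A", "X", "?", "m"), 1)}
print(has_silent_maximal_run(stops_silently, 0))
print(has_silent_maximal_run(must_act, 0))
print(has_silent_maximal_run(loops_silently, 0))
```

Termination follows from the finiteness of the automaton, since each state is fully explored at most once.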

Global Types as Choreographic Languages
The global types of [SD19] are our last case study. Informally, a global type A → B : {m_i.G_i}_{1≤i≤n} specifies a protocol where participant A must send to B a message m_i for some 1 ≤ i ≤ n and then, depending on which m_i was chosen by A, the protocol continues as G_i. Global types and multiparty sessions are defined in [SD19] in terms of the following grammars for, respectively, pre-global types, pre-processes, and pre-multiparty sessions (we adapted some of the notation to our setting):

where we assume messages m_i to be pairwise distinct in the sets {m_i.G_i}_{1≤i≤n} and {m_i.P_i}_{1≤i≤n}, whose elements we call branches. The first two grammars are interpreted coinductively, that is, their solutions are both minimal and maximal fixpoints (the latter corresponding to infinite trees). A pre-global type G (resp. pre-process P) is a global type (resp. process) if its tree representation is regular, namely it has finitely many distinct sub-trees. A multiparty session (MPS for short) is a pre-multiparty session such that (a) in A ▷ P, participant A does not occur in process P and (b) in A₁ ▷ P₁ | . . . | A_n ▷ P_n, participants A_i are pairwise different.

The semantics of global types is the LTS induced by the following two rules:

Rule (GT2) allows out-of-order execution, namely interaction A→B:m between participants A and B can happen even if an interaction between other participants syntactically occurs before, provided that A→B:m occurs in all branches. As usual, we let G

The semantics for MPSs is the LTS defined by the following rule

(1) applies only if the messages in Λ′ include those in Λ, which is the case for MPSs obtained by projection, defined below.
Definition 8.1 (Projection [SD19, Definition 3.4]). The projection of G on a participant X such that the depths of its occurrences in G are bounded is the partial function G ↾ X coinductively defined by end ↾ X = 0 and, for a global type G = A → B : {m_i.G_i}_{1≤i≤n}, by:

The global type G is projectable⁵ if G ↾ X is defined for all participants X of G, in which case G ↾ denotes the corresponding MPS.
The projection on X is partial because if we have a choice where X is not involved in the first interaction then only the last clause in Definition 8.1 could apply, but this clause requires X to start each branch with an input action from the same sender S.
Let L(G) be the language coinductively defined as follows:

The g-language L(G) associated to a global type G is the concurrency and prefix closure of L(G), that is

We define the l-language L(B ▷ P) associated to a named process B ▷ P as the prefix closure of L(B ▷ P) which, letting ⋆ ∈ {?, !}, is defined by

The symmetry between senders and receivers in CUI and branch-awareness allows for an immediate generalisation of the projection in Definition 8.1 by extending the last case with the clause: if X ∉ {A, B}, n > 1, and for all 1 ≤ i ≤ n, G_i ↾ X = S!Λ_i

Corollary 8.6 still holds for this generalised definition of projection.

Related Work
The use of formal language theories for the modelling of concurrent systems dates back to the theory of traces [Maz86]. A trace is an equivalence class of words that differ only for swaps of independent symbols. Closure under concurrency corresponds, on finite words, to forming traces, as we noted after Definition 2.12. An extensive literature has explored a notion of realisability whereby a language of traces is realisable if it is accepted by some class of finite-state automata. Relevant results in this respect are the characterisations in [Zie87,Dub86] (and the optimisation in [GM06]) for finite words and the ones in [Gas90,Gas91,GPZ91] for infinite ones. A key difference of our framework with respect to this line of work is that we aim at stricter notions of realisability: in our context it is not enough that the runs of the language may be faithfully executed by a certain class of finite-state automata. Rather, we are interested in identifying conditions on the g-languages that guarantee well-behaved executions in "natural" realisations.
Other abstract models of choreographies, such as Conversation protocols (CP) [FBS04] and c-automata [BLT20], have some relation with our FCL. We discussed in depth the relation with c-automata in Section 7.
CP, probably the first automata-based model of choreographies, are non-deterministic Büchi automata whose alphabet resembles a constrained variant of our Σ int . A comparison with the g-languages accepted by CPs is not immediate as CPs are based on asynchronous communications (although some connections are evident as noted below Definition 2.9).
Other proposals ascribable to choreographic settings (cf. [HLV + 16]) define global views that can be seen as g-languages. We focus on synchronous approaches because our current theory needs to be extended to cope with asynchrony.
In [BZ07, LGMZ08] the correctness of implementations of choreographies (called choreography conformance) is studied in a process algebraic setting. The other communication properties we consider here are not discussed there.
The notion of choreography implementation in [BZ07] corresponds to our correctness plus a form of existential termination. It is shown that one can decide whether a system is an implementation of a given choreography, since both languages are generated by finite-state automata, hence language inclusion and existential termination are decidable.
In [LGMZ08] three syntactic conditions (connectedness, unique points of choice and causality safety) ensure bisimilarity (hence trace equivalence) between a choreography and its projection. Connectedness rules out systems which are not concurrency closed, while we conjecture that unique points of choice and connectedness together imply our CUI and BA. Causality safety, immaterial in our case, is needed in [LGMZ08] due to the existence of an explicit parallel composition operator in their process algebra.

Many multiparty session type systems [HLV+16] have two levels of types (global and local) and one implementation level (local processes). This is the case also for synchronous session type systems such as [KY14, DGJ+15]. The approach in [SMD22] instead merges the two levels of types into a single one to allow for composition, while having a level of local processes for implementation. Our approach, like the session type systems in [SD19,BDCLT21], considers only (two) abstract descriptions, g-languages and l-languages. The literature offers several behavioural types featuring correctness-by-construction principles through conditions (known as projectability or well-branchedness) more demanding than ours. For instance, relations similar to those in Section 8 can be devised for close formalisms, such as [BDCLT21], whose notion of projection is more general than the one in [SD19], yet its notion of projectability still implies CUI and BA.
There is a connection between CUI and the closure property CC2 [AEY03] on message-sequence charts (MSCs) [ITU96]. On finite words CC2 and CUI coincide. Actually, CUI can be regarded as a step-by-step way to ensure CC2 on finite words. The relations between our properties and CC3, also used in MSCs, are still under scrutiny.

Concluding Remarks
We introduced formal choreographic languages as a general and abstract theory of choreographies based on formal languages. In this theory we recast known properties and constructions such as projections from global to local specifications.
One of our contributions is the characterisation of systems' correctness in terms of closure under unknown information. Other communication properties can be ensured by additionally requiring branch awareness. A synopsis of our main contributions is in Figure 10.1.
We demonstrated the versatility of FCL by considering three existing models. We showed some relations between FCL and two automata models, communicating finite-state machines [BZ83] and c-automata [BLT20]. These models are close to FCL, given that the relations retrace the well-known connection between automata and formal language theories. The last model captured with FCL is the variant of MPSTs presented in [SD19]. Being based on behavioural types, this model radically differs from FCL.
Future work. Our investigation proposes a new point of view for choreography formalisms and the related constructions. As such, a number of extensions and improvements need to be analysed, to check how they may fit in our setting. We list below the most relevant.

Our framework considers only point-to-point synchronous communications. Possible generalisations could contemplate other interaction models, such as those in [CMSY17,Wad14] and those based on asynchronous communications. Another possible generalisation is to consider nominal FCLs in the line of nominal languages [KST12,BBKL12,KMPS15]. While the general approach should apply, it is not immediate how to extend CUI in order to characterise correctness for an asynchronous semantics. This is somehow confirmed by the results in [Alu05, AEY01] on the realisability of MSCs, showing that in the asynchronous setting this is a challenging problem.
A second direction is analysing how to drop prefix-closure, so allowing for specifications where the system (and single participants) may stop their execution at some points but not at others; a word would hence represent a complete computation, not only a partial one.
A further direction would unveil the correspondence between closure properties and subtyping relations used in many multiparty session types approaches.
Additionally, it would be good to understand whether the composition and decomposition operators or the partial multiparty sessions defined in, respectively, [BDCLT21] and [SMD22], for multiparty session types can be rephrased in the more general setting of FCLs.
Another intriguing research direction is the application of FCLs to capture properties that are not usually considered in behavioural type frameworks. Specifically, the notions of receptiveness and responsiveness have been defined in [tBCHK17,tBCHP21,tBHK20] to formalise the properties of communicating systems where some components can always succeed in sending or receiving messages. An interesting initial question is whether receptiveness and responsiveness can be characterised in terms of FCLs. A starting point to address it could be to give FCL models for the dynamic logic used in [tBCHP23] to characterise receptiveness.
Finally, our approach strives for generality neglecting efficiency and practical applicability. An interesting research direction is to identify classes of languages ensuring that our properties, such as CUI, could be checked efficiently.