A Model of Cooperative Threads

We develop a model of concurrent imperative programming with threads. We focus on a small imperative language with cooperative threads which execute without interruption until they terminate or explicitly yield control. We define and study a trace-based denotational semantics for this language; this semantics is fully abstract but mathematically elementary. We also give an equational theory for the computational effects that underlie the language, including thread spawning. We then analyze threads in terms of the free algebra monad for this theory.


Introduction
In the realm of sequential programming, semantics, whether operational or denotational, provides a rich understanding of programming constructs and languages, and serves a broad range of purposes. These include, for instance, the study of verification techniques and the reconciliation of effects with functional programming via monads. With notorious difficulties, these two styles of semantics have been explored for concurrent programming, and, by now, a substantial body of work provides various semantic accounts of concurrency. Typically, that work develops semantics for languages with parallel-composition constructs and various communication mechanisms.
Surprisingly, however, that work provides only a limited understanding of threads. It includes several operational semantics of languages with threads, sometimes with operational notions of equivalence, e.g., [BMT92, PR97, Jef97, JR05]; denotational semantics of those languages seem to be much rarer, and to address message passing rather than shared-memory concurrency, e.g., [FH99, Jef95]. Yet threads are in widespread use, often in the context of elaborate shared-memory systems and languages for which a clear semantics would be beneficial.
In this paper, we investigate a model of concurrent imperative programming with threads. We focus on cooperative threads which execute, without interruption, until they either terminate or else explicitly yield control. Non-cooperative threads, that is, threads with preemptive scheduling, can be seen as threads that yield control at every step. In this sense, they are a special case of the cooperative threads that we study.
Cooperative threads appear in several systems, programming models, and languages. Often without much linguistic support, they have a long history in operating systems and databases, e.g., [SQL07]. Cooperative threads also arise in other contexts, such as Internet services and synchronous programming [AHT02, BCZ03, Bou06, Bou07, AZ06]. Most recently, cooperative threads are central in two models for programming with transactions, Automatic Mutual Exclusion (AME) and Transactions with Isolation and Cooperation (TIC) [IB07, SKB07]. AME is one of the main starting points for our research. The intended implementations of AME rely on software transactional memory [ST95] for executing multiple cooperative threads simultaneously. However, concurrent transactions do not appear in the high-level operational semantics of the AME constructs [ABH08]. Thus, cooperative threads and their semantics are of interest independently of the details of possible transactional implementations.
We define and study three semantics for an imperative language with primitives for spawning threads, yielding control, and blocking execution.
• We obtain an operational semantics by a straightforward adaptation of previous work. In this semantics, we describe the meaning of a whole program in terms of small-step transitions between states in which spawned threads are kept in a thread pool. This semantics serves as a reference point.
• We also define a more challenging compositional denotational semantics. The meaning of a command is a prefix-closed set of traces. Prefix-closure arises because we are primarily interested in safety properties, that is, in "may" semantics. Each trace is roughly a sequence of transitions, where each transition is a pair of stores, and a store is a mapping from variables to values. We establish adequacy and full-abstraction theorems with respect to the operational semantics. These results require several non-trivial choices in the definition of the denotational semantics.
• Finally, we define a semantics based on the algebraic theory of effects. More precisely, we give an equational theory for the computational effects that underlie the language, and analyze threads in terms of the free algebra monad for this theory. This definition is more principled and systematic; it explains threads with standard semantic structures, in the context of functional programming. As we show, furthermore, we obtain our denotational semantics as a special case.
Section 2 introduces our language and Section 3 defines its operational semantics. Section 4 develops its denotational semantics. Section 5 presents our adequacy and full-abstraction theorems (Theorems 5.10 and 5.15). Section 6 concerns the algebraic theory of effects and the analysis of the denotational semantics in this monadic setting (Theorem 6.4). Section 7 concludes.

The Language
Our language is an extension of a basic imperative language with assignments, sequencing, conditionals, and while loops (IMP [Win93]). Programs are written in terms of a finite set of variables Vars, whose values are natural numbers. In addition to those standard constructs, our language includes:
• A construct for executing a command in an asynchronous thread. Informally, async C forks off the execution of C. This execution is asynchronous, and will not happen if the present thread keeps running without ever yielding control, or if the present thread blocks without first yielding control.
• A construct for yielding control. Informally, yield indicates that any pending thread may execute next, as may the current thread.
• A construct for blocking. Informally, block halts the execution of the entire program, even if there are pending threads that could otherwise make progress.
We define the syntax of the language in Figure 1. We do not detail the constructs on numerical and boolean expressions, which are as usual.
Figure 2 gives an illustrative example. It shows a piece of code that spawns the asynchronous execution of x := 0, then executes x := 1 and yields, then resumes but blocks unless the predicate x = 0 holds, then executes x := 2. The execution of x := 0 may happen once the yield statement is reached. With respect to safety properties, the conditional blocking amounts to waiting for x = 0 to hold. More generally, AME's blockUntil b can be written if b then skip else block.

Figure 2: async x := 0; x := 1; yield; if x = 0 then skip else block; x := 2
More elaborate uses of blocking are possible too, and supported by lower-level semantics and actual transactional implementations [IB07, ABH08]. In those implementations, blocking may cause a roll-back and a later retry at an appropriate time. We regard roll-back as an interesting aspect of some possible implementations, but not as part of the high-level semantics of our language, which is the subject of this work. Thus, our language is basically a fragment of the AME calculus [ABH08]. It omits higher-order functions and references. It also omits "unprotected sections" for non-cooperative code, particularly legacy code. Non-cooperative code can however be modeled as code with pervasive calls to yield (at least with respect to the simple, strong memory models that we use throughout this paper; cf. [GMP06]). See Section 7 for further discussion of possible extensions to our language.

Operational Semantics
We give an operational semantics for our language. Despite some subtleties, this semantics is not meant to be challenging. It is given in terms of small-step transitions between states. Accordingly, we define states, evaluation contexts, and the transition relation.
3.1. States. As described in Figure 3, a state Γ = ⟨σ, T, C⟩ consists of the following components:
• a store σ which is a mapping of the given finite set Vars of variables to a set Value of values, which we take to be the set of natural numbers;
• a finite sequence of commands T which we call the thread pool;
• a distinguished active command C.
We write σ[x ↦ n] for the store that agrees with σ except at x, which is mapped to n. We write σ(b) for the boolean denoted by b in σ, and, similarly, σ(e) for the natural number denoted by e in σ. We write T.T′ for the concatenation of two thread pools T and T′.

Evaluation Contexts.
As usual, a context is an expression with a hole [ ], and an evaluation context is a context of a particular kind. Given a context 𝒞 and an expression C, we write 𝒞[C] for the result of placing C in the hole in 𝒞. We use the evaluation contexts defined by the grammar:

Figure 4: Transition rules of the abstract machine.

3.3. Steps. A transition Γ −→ Γ′ takes an execution from one state to the next. Figure 4 gives rules that specify the transition relation. According to these rules, when the active command is skip, a command from the pool becomes the active command. It is then evaluated as such until it produces skip, yields, or blocks. No other computation is interleaved with this evaluation. Each evaluation step produces a new state, determined by decomposing the active command into an evaluation context and a subexpression that describes a computation step (for instance, a yield or a conditional).
In all cases at most one rule applies. In two cases, no rule applies. The first is when the active command is skip and the pool is empty; this situation corresponds to normal termination. The second is when the active command is blocked, in the sense that it has the form E[block]; this situation is an abnormal termination.
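The abstract machine described above can be prototyped in a few lines. The following Python sketch is only illustrative: the transition rules of Figure 4 are not reproduced in the text, so the rules below are assumptions inferred from the prose (evaluation contexts of the form E ::= [ ] | E; C, loops unfolded into conditionals, async appending to the pool, and yield moving the rest of the active command to the pool). The tuple encoding of commands and the names decompose, successors, outcomes, and FIG2 are hypothetical.

```python
SKIP, BLOCK, YIELD = ("skip",), ("block",), ("yield",)

def seq(*cmds):
    """Right-nested sequential composition of one or more commands."""
    out = cmds[-1]
    for c in reversed(cmds[:-1]):
        out = ("seq", c, out)
    return out

def decompose(cmd):
    """Split cmd into (redex, plug) with cmd == plug(redex).
    Assumed evaluation contexts: E ::= [ ] | E; C."""
    if cmd[0] == "seq" and cmd[1] != SKIP:
        redex, plug = decompose(cmd[1])
        rest = cmd[2]
        return redex, (lambda c: ("seq", plug(c), rest))
    return cmd, (lambda c: c)

def successors(store, pool, active):
    """States reachable in one transition; the empty list if no rule applies."""
    if active == SKIP:
        # Any thread of the pool may become the active command.
        return [(store, pool[:i] + pool[i + 1:], pool[i]) for i in range(len(pool))]
    redex, plug = decompose(active)
    kind = redex[0]
    if kind == "block":
        return []  # E[block]: the whole program halts
    if kind == "yield":
        return [(store, pool + (plug(SKIP),), SKIP)]
    if kind == "async":
        return [(store, pool + (redex[1],), plug(SKIP))]
    if kind == "assign":
        _, x, expr = redex
        return [({**store, x: expr(store)}, pool, plug(SKIP))]
    if kind == "if":
        _, b, c, d = redex
        return [(store, pool, plug(c if b(store) else d))]
    if kind == "while":  # assumed unfolding rule
        _, b, c = redex
        return [(store, pool, plug(("if", b, ("seq", c, redex), SKIP)))]
    if kind == "seq":    # redex has the form skip; D
        return [(store, pool, plug(redex[2]))]
    raise ValueError(kind)

def outcomes(store, pool, active):
    """(status, final store) pairs over all maximal executions (loop-free programs only)."""
    nxt = successors(store, pool, active)
    if not nxt:
        status = "done" if active == SKIP else "blocked"
        return {(status, tuple(sorted(store.items())))}
    result = set()
    for state in nxt:
        result |= outcomes(*state)
    return result

# The code of Figure 2.
FIG2 = seq(("async", ("assign", "x", lambda s: 0)),
           ("assign", "x", lambda s: 1),
           YIELD,
           ("if", lambda s: s["x"] == 0, SKIP, BLOCK),
           ("assign", "x", lambda s: 2))

print(outcomes({"x": 5}, (), FIG2))
```

Under these assumed rules, the Figure 2 program has two maximal executions: one in which the spawned assignment runs after the yield and the program terminates normally with x = 2, and one in which the main thread resumes first and blocks with x = 1.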

Denotational Semantics
Next we give a compositional denotational semantics for the same language. Here, the meaning of a command is a prefix-closed set of traces, where each trace is roughly a sequence of transitions, and each transition is a pair of stores.
The use of sequences of transitions goes back at least to Abrahamson's work [Abr79] and appears in various studies of parallel composition [AP93, HdeBR94, Bro96, Bro02]. However, the treatment of threads requires some new non-trivial choices. For instance, transition sequences, as we define them, include markers to indicate not only normal termination but also the return of the main thread of control. Moreover, although these markers are similar, they are attached to traces in different ways, one inside pairs of stores, the other not. Such details are crucial for adequacy and full abstraction. Also crucial to full abstraction is minimizing the information that the semantics records. More explicit semantics will typically be more transparent, for instance, in detailing that a particular step in a computation causes the spawning of a thread, but will consequently fail to be fully abstract.
Section 4.1 is an informal introduction to some of the details of the semantics. Section 4.2 defines transition sequences and establishes some notation. Sections 4.3 and 4.4 define the interpretations of commands and thread pools, respectively. Section 4.5 discusses semantic equivalences.
4.1. Informal Introduction. As indicated above, the meaning of a command will be a prefix-closed set of traces, where each trace is roughly a sequence of transitions, and each transition is a pair of stores. Safety properties, which pertain to what "may" happen, are closed under prefixing, hence the prefix-closure condition. Intuitively, when the meaning of a command includes a trace (σ₁, σ′₁)(σ₂, σ′₂) …, we intend that the command may start executing with store σ₁, transform it to σ′₁, yield, then resume with store σ₂, transform it to σ′₂, yield again, and so on.
In particular, the meaning of block will consist of the empty sequence ε. The meaning of yield; block will consist of the empty sequence ε plus every sequence of the form (σ, σ), where σ is any store. Here, the pair (σ, σ) is a "stutter" that represents immediate yielding.
If the meaning of a command C includes (σ₁, σ′₁) … (σₙ, σ′ₙ) and the meaning of a command D includes (σ′ₙ, σ″ₙ) …, one might naively expect that the meaning of C; D would contain (σ₁, σ′₁) … (σₙ, σ″ₙ) …, which is obtained by concatenation plus a simple local composition between (σₙ, σ′ₙ) and (σ′ₙ, σ″ₙ). Unfortunately, this naive expectation is incorrect. In a trace (σ₁, σ′₁)(σ₂, σ′₂) …, some of the pairs may represent steps taken by commands to be executed asynchronously. Those steps need not take place before any further command D starts to execute.
Accordingly, computing the meaning of C; D requires shuffling suffixes of traces in C with traces in D. The shuffling represents the interleaving of C's asynchronous work with D's work. We introduce a special return marker "Ret" in order to indicate how the traces in C should be parsed for this composition. In particular, when C is of the form C₁; async (C₂), any occurrence of "Ret" in the meaning of C₂ will not appear in the meaning of C. The application of async erases any occurrence of "Ret" from the meaning of C₂; intuitively, this is because C₂ does not return control to its sequential context.
For example, the meaning of the command Ret) for every σ and σ ′ .On the other hand, the meaning of the command ) for every σ and σ ′ .The different positions of the marker Ret correspond to different junction points for any commands to be executed next.
If the meaning of C contains u(σₙ, σ′ₙ Ret)u′ and the meaning of D contains (σ′ₙ, σ″ₙ)v, then the meaning of C; D contains u(σₙ, σ″ₙ)w, where w is a shuffle of u′ and v. Notice that the marker from u(σₙ, σ′ₙ Ret)u′ disappears in this combination. The marker in u(σₙ, σ″ₙ)w, if present, comes from (σ′ₙ, σ″ₙ)v. An analogous combination applies when the meaning of C contains u(σₙ, σ′ₙ Ret)u′ and the meaning of D contains (σ′ₙ, σ″ₙ Ret)v (a trace that starts with a transition with a marker). Moreover, if the meaning of C contains a trace without any occurrence of the marker Ret, then this trace is also in the meaning of C; D: the absence of a marker makes it impossible to combine this trace with traces from D.
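The combination rule just described can be illustrated at the level of individual traces. The Python sketch below is a simplification under stated assumptions: a transition is encoded as a hypothetical triple (start, end, is_return), Done markers are not handled, and the names interleavings and compose are illustrative only. It computes the traces of C; D that arise from one trace of C and one trace of D.

```python
from itertools import combinations

def interleavings(u, v):
    """All order-preserving merges (plain shuffles) of the tuples u and v."""
    n, m = len(u), len(v)
    result = set()
    for positions in combinations(range(n + m), n):
        chosen = set(positions)
        us, vs = iter(u), iter(v)
        result.add(tuple(next(us) if i in chosen else next(vs)
                         for i in range(n + m)))
    return result

def compose(c, d):
    """Traces of C; D obtained from trace c of C and trace d of D.
    Transitions are triples (start, end, is_return); Done is ignored here."""
    k = next((i for i, t in enumerate(c) if t[2]), None)
    if k is None:
        return {tuple(c)}          # no Ret marker: c cannot be combined with d
    if not d or d[0][0] != c[k][1]:
        return set()               # d must start in the store where c returned
    # Local composition of (s_n, s'_n Ret) with the first transition of d;
    # the marker of the result, if any, comes from d.
    joined = (c[k][0], d[0][1], d[0][2])
    return {tuple(c[:k]) + (joined,) + w
            for w in interleavings(tuple(c[k + 1:]), tuple(d[1:]))}

# u (s2, s3 Ret) u' composed with (s3, s6) v
c = (("s0", "s1", False), ("s2", "s3", True), ("s4", "s5", False))
d = (("s3", "s6", False), ("s7", "s8", True))
for trace in sorted(compose(c, d)):
    print(trace)
```

In the example, the transition ("s4", "s5", False) that follows the return transition of c plays the role of asynchronous work, so it may occur either before or after the remaining transition of d.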
An additional marker, "Done", ends traces that represent complete normally terminating executions. Thus, the meaning of skip will consist of the empty sequence ε and every sequence of the form (σ, σ Ret) plus every sequence of the form (σ, σ Ret)Done. Contrast this with the meaning of yield; block given above.
It is possible for a trace to contain a Ret marker but not a Done marker. Thus, the meaning of async (block) will contain the empty sequence ε plus every sequence of the form (σ, σ Ret), but not (σ, σ Ret)Done.
More elaborately, the meaning of the code of Figure 2 will contain all traces of the form where we write σ[n] as an abbreviation for σ[x ↦ n]. These traces model normal termination after taking the true branch of the conditional if x = 0 then x := 2 else block. The meaning will also contain all prefixes of those traces, which model partial executions, including those that take the false branch of the conditional and terminate abnormally. The two markers are somewhat similar. However, note that (σ, σ′ Ret) is a prefix of (σ, σ′ Ret)Done, but (σ, σ′) is not a prefix of (σ, σ′ Ret). Such differences are essential.

Transitions and Transition Sequences.
A plain transition is a pair of stores (σ, σ′). A return transition is a pair of stores (σ, σ′ Ret) in which the second is adorned with the marker Ret. A transition is a plain transition or a return transition.
A main-thread transition sequence (hereunder simply: transition sequence) is a finite (possibly empty) sequence, beginning with a sequence of transitions, of which at most one (not necessarily the last) is a return transition, and optionally followed by the marker Done if one of the transitions is a return transition. We write TSeq for the set of transition sequences.
A pure transition sequence is a finite sequence of plain transitions, possibly followed by a marker Done. Note that such a sequence need not be a transition sequence. It is proper if it is not equal to Done. We write PSeq for the set of pure transition sequences, and PPSeq for the subset of the proper ones.
We use the following notation:
• We typically let u, v, and w range over transition sequences or pure transition sequences, and let t range over non-empty ones.
• We write u ≤p v for the prefix relation between sequences u and v (for both kinds of sequences, pure or not). For example, as mentioned above, we have that (σ, σ′ Ret) ≤p (σ, σ′ Ret)Done, but (σ, σ′) ≰p (σ, σ′ Ret).
• A set P is prefix-closed if whenever u ≤p v ∈ P then u ∈ P. We write P↓ for the least prefix-closed set that contains P.
• For a non-empty sequence of transitions t, we write fst(t) for the first store of the first transition of t.
• For a transition sequence u, we write u^c for the pure transition sequence obtained by cleaning u, which means removing the Ret marker, if present, from u.
• We let τ range over stores and stores with return markers.
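The notation above lends itself to a direct encoding, which may help in checking small examples. In the Python sketch below, the representation is an assumption made for illustration: a transition is a triple (start, end, is_return), a sequence is a tuple of transitions optionally ending with the string "Done", and the function names are hypothetical.

```python
DONE = "Done"

def is_prefix(u, v):
    """u ≤p v: u is an initial segment of v. A Ret marker is part of its
    transition, so a plain transition never matches a return transition."""
    return len(u) <= len(v) and tuple(v[:len(u)]) == tuple(u)

def prefix_closure(P):
    """The least prefix-closed set containing P."""
    return {tuple(v[:k]) for v in P for k in range(len(v) + 1)}

def clean(u):
    """u^c: remove the Ret marker, if present, from u."""
    return tuple(t if t == DONE else (t[0], t[1], False) for t in u)

def fst(t):
    """First store of the first transition of a non-empty sequence t."""
    return t[0][0]

ret = (("s", "s1", True),)     # (σ, σ′ Ret)
plain = (("s", "s1", False),)  # (σ, σ′)

print(is_prefix(ret, ret + (DONE,)))    # (σ, σ′ Ret) ≤p (σ, σ′ Ret)Done
print(is_prefix(plain, ret))            # (σ, σ′) is not a prefix of (σ, σ′ Ret)
print(prefix_closure({ret + (DONE,)}))  # ε, (σ, σ′ Ret), (σ, σ′ Ret)Done
print(clean(ret + (DONE,)))
```

Taking σ′ = σ, the prefix closure computed in the example has the same shape as the meaning of skip described in Section 4.1, restricted to a single store.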

Interpretation of Commands.
Preliminaries. We let Proc be the collection of the non-empty prefix-closed sets of transition sequences, and let Pool be the collection of the non-empty prefix-closed sets of pure transition sequences. Under the subset partial ordering, Proc and Pool are both ω-cpos (i.e., partial orders with sups of increasing sequences) with least element {ε}. We interpret commands as elements of Proc. We use Pool as an auxiliary ω-cpo; below it also serves for the semantics of thread pools. We also let AProc be the sub-ω-cpo of Pool of all non-empty prefix-closed sets of proper pure transition sequences. We think of such sets as modeling asynchronous threads, spawned by an active thread; the difference from Pool is that the latter also contains an element that models the empty thread pool.
We define a continuous cleaning function (−)^c : Proc → AProc by:

P^c = {u^c | u ∈ P}

(Continuous functions are those preserving all sups of increasing sequences.)
We define the set u ⊲⊳ v of shuffles of a pure transition sequence u with a sequence v, whether a transition sequence or a pure transition sequence, as follows:
• If neither finishes with Done, their set of shuffles is defined as usual for finite sequences.
• If u does not finish with Done, then a shuffle of u and v Done is a shuffle of u and v. Similarly, if v does not finish with Done, then a shuffle of u Done and v is a shuffle of u and v.
• A shuffle of u Done and v Done is a shuffle of u and v followed by Done.
If both u and v are pure transition sequences then so is every element of u ⊲⊳ v; if u is a pure transition sequence and v is a transition sequence, then every element of u ⊲⊳ v is a transition sequence.
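The three clauses of the shuffle definition can be sketched as follows. The encoding is an illustrative assumption: sequences are tuples whose last element may be the string "Done", and the helper names plain_shuffles and shuffles are hypothetical.

```python
DONE = "Done"

def plain_shuffles(u, v):
    """Usual shuffles (order-preserving merges) of two finite sequences."""
    if not u:
        return {tuple(v)}
    if not v:
        return {tuple(u)}
    return ({(u[0],) + w for w in plain_shuffles(u[1:], v)} |
            {(v[0],) + w for w in plain_shuffles(u, v[1:])})

def shuffles(u, v):
    """Shuffles of u and v, following the treatment of Done in the text:
    the result ends with Done only if both arguments do."""
    u_done = len(u) > 0 and u[-1] == DONE
    v_done = len(v) > 0 and v[-1] == DONE
    core = plain_shuffles(u[:-1] if u_done else u,
                          v[:-1] if v_done else v)
    if u_done and v_done:
        return {w + (DONE,) for w in core}
    return core

a = ("s0", "s1", False)  # a plain transition
b = ("s2", "s3", False)  # another plain transition

print(shuffles((a, DONE), (b,)))       # Done is dropped: v has not finished
print(shuffles((a, DONE), (b, DONE)))  # both finished: Done is kept
```

The design choice reflected here is that a combined execution counts as complete only when both of its components are complete.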
Lemma 4.1.For any u,v, and w where either: • all three are pure transition sequences, or • u and v are pure transition sequences, and w is a transition sequence we have: We define a continuous composition function Composition is associative with two-sided unit, given by: * maps a command to a non-empty prefix-closed set of transition sequences.We define it in Figure 5. There, the interpretation of loops relies on the following approximations: The 0-th approximant corresponds to divergence, which here we identify with blocking.
We straightforwardly extend the semantics to contexts, so that
4.4. Interpretation of Thread Pools. As an auxiliary definition, it is important to have also an interpretation of thread pools as elements of Pool. We develop one in this section.
We define the set of right shuffles u ⊲ v of a pure transition sequence u with a transition sequence v by setting The use of the notation async for both a unary and a binary operation is a slight abuse, though in line with the algebraic theory of effects: see the discussion in Section 6.In this regard note the equality async(P ) • Q = async(P, Q) (and the equality [[yield]] • P = d(P ) points to the corresponding relationship between d and [[yield]]).
4.4.2. Interpretation. We define the semantics of thread pools by:
Proof. For the first part, one shows for all pure transition sequences u and v and transition sequences w that: To this end, one proceeds by cases on w, using Lemma 4.1. The second part is obvious.
4.5. Equivalences. An attractive application of denotational semantics is in proving equivalences and implementation relations between commands. Such denotational proofs tend to be simple calculations. Via adequacy and full-abstraction results (of the kind established in Section 5), one then obtains operational results that would typically be much harder to obtain directly by operational arguments.
As an example, we note that we have the following equivalence:

[[async (C; yield; D)]] = [[async (C; async (D))]]

This equivalence follows from three facts:
This particular equivalence is interesting for two reasons:
• It models an implementation strategy (in use in AME) where, when executing C; yield; D, the yield causes a new asynchronous thread for D to be added to the thread pool.
• It illustrates one possible, significant pitfall in more explicit semantics. As discussed above, such a semantics might detail that a particular step in a computation causes the spawning of a thread. More specifically, it might extend transitions with an extra trace component: a triple (σ, u, τ) might represent a step from σ to τ that spawns a thread that contains the trace u. With such a semantics, the meanings of async (C; yield; D) and async (C; async (D)) would be different, since they have different spawning behavior.
Many other useful equivalences hold. For instance, we have: trivially. For every C, we also have: and, for every C and D, we have: Thus, the semantics does not distinguish an infinite loop which never yields from immediate blocking. On the other hand, we have: The command while (0 = 0) do yield generates unbounded sequences of stutters (σ, σ). Similarly, we have:
Alternative semantics that would distinguish while (0 = 0) do skip from block or that would identify while (0 = 0) do yield with block and yield; yield with yield are viable, however. We briefly discuss those variants and others in Section 7.
We leave as subjects for further research the problems of axiomatizing and of deciding equivalence and implementation relations, and the related problem of program verification, perhaps restricted to subsets of the language, even, for example, to the subset with just composition, spawning, and yielding. There is a large literature on axiomatization and decidability in concurrency theory; see, e.g., [AI07] for discussion and further references. Also, recent results on the automatic verification of asynchronous programs appear rather encouraging [JM07, GMR09]; some of their ideas might be applicable in our setting.
4.6. Two Extensions. Trace-based semantics can also be given for variants and enhancements of our basic imperative language. Here we illustrate this point by considering two such enhancements, which illustrate the use of Ret and Done. Section 7 briefly considers other possible language features.
4.6.1. finish. While cleaning maps a transition sequence to a proper pure transition sequence, a marking function maps a proper pure transition sequence to a transition sequence. For a proper pure transition sequence u, we define u^m by: Thus, u^m includes a marker Ret only if u contains a marker Done (that is, if u corresponds to a terminating execution); the marker Ret is on the last transition of u^m, intuitively indicating that control is returned to the sequential context when execution terminates. Much as for cleaning, we extend marking to non-empty prefix-closed sets of proper pure transition sequences: (−)^m : AProc → Proc. Using this extension, we can define the meaning of a construct finish, inspired by that of the X10 language [CGA05, SJ05]. We set: The intent is that finish C executes C and returns control when all activities spawned by C terminate. For instance, in finish (async (x := 0)); x := 1, the assignment x := 1 will execute only after x := 0 is done. In contrast, in async (x := 0); x := 1, the assignments have the opposite ordering. However, finish (async (x := 0)) is not equivalent to x := 0, but rather to yield; x := 0. Beyond this simple example, finish can be applied to more complex commands, possibly with nested forks, and ensures that all the activities forked terminate before returning control.
4.6.2. Parallel Composition. The definition of parallel composition relies on familiar themes: the use of shuffling, and the decomposition of parallel composition into two cases. The cases correspond to whether the left or the right argument of parallel composition takes the first step.
We define parallel composition at the level of transition sequences by letting u || u ′ and u || l u ′ be the least sets that satisfy prefix-closure and the following clauses: Extending this function to − || − : Proc × Proc → Proc we can define the meaning of a parallel-composition construct: The reader may verify that parallel composition, as defined here, has the expected properties, for instance that it is commutative and associative with unit skip.It is also worth noting that (under mild assumptions on the available expressions) the binary nondeterministic choice operator ∪ considered in Section 6.1 is definable from parallel composition.The converse also holds, under restricted circumstances: if all occurrences of yield in C and D occur inside an async then we have:

Adequacy and Full Abstraction
In this section we establish that the denotational semantics of Section 4 coincides with the operational semantics of Section 3, and is fully abstract.
The adequacy theorem (Theorem 5.10), which expresses the coincidence, says that the traces that the denotational semantics predicts are exactly those that can happen operationally. These traces may in general represent the behavior of a command in a context. As a special case, the adequacy theorem applies to runs, which are essentially traces that the command can produce on its own, i.e., with an empty context. This special case is spelled out in Corollary 5.11, which states that the runs that the denotational semantics predicts are exactly those that can happen operationally.
The full-abstraction theorem (Theorem 5.15) states that two commands C and D have the same set of traces denotationally if, and only if, they produce the same runs in combination with every context. In particular, observing runs, we cannot distinguish C and D in any context. Note that, given Corollary 5.11, we may equivalently speak of runs denotationally or operationally. We comment on other possible notions of observation, and the corresponding full-abstraction results, below.
Section 5.1 defines runs precisely. Sections 5.2 and 5.3 present our adequacy and full-abstraction results, respectively.
5.1. Runs. A pure transition sequence generates a run if, however it can be written as u(σ, σ′)(σ″, σ‴)v, we have σ′ = σ″. If w = (σ₁, σ₂)(σ₂, σ₃) … (σₙ₋₁, σₙ) is such a pure transition sequence, we set run(w) = σ₁ … σₙ and run(w Done) = σ₁ … σₙ Done. A transition sequence u generates a run if u^c does, and then we set run(u) = run(u^c).
If a pure transition sequence u generates a run, then it can easily be recovered from run(u): the run σ₁ … σₙ maps back to (σ₁, σ₂) … (σₙ₋₁, σₙ) and the run σ₁ … σₙ Done maps back to (σ₁, σ₂) … (σₙ₋₁, σₙ)Done. Since each non-empty run contains at least two elements, this definition applies when n = 0 and n ≥ 2. We write runs(P) for the set of runs generated by (pure) transition sequences in P.
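The correspondence between pure transition sequences and runs can be checked on small examples with the following Python sketch. It rests on an assumption about the reading of the definition: a pure transition sequence generates a run exactly when each transition starts in the store where the previous one ended. Pure transitions are encoded as pairs, a trailing "Done" string marks termination, and the names run and from_run are hypothetical.

```python
DONE = "Done"

def run(u):
    """The run generated by the pure transition sequence u, or None if
    consecutive transitions do not chain (assumed reading of the definition)."""
    done = len(u) > 0 and u[-1] == DONE
    steps = u[:-1] if done else u
    for left, right in zip(steps, steps[1:]):
        if left[1] != right[0]:
            return None
    stores = ()
    if steps:
        stores = (steps[0][0],) + tuple(t[1] for t in steps)
    return stores + ((DONE,) if done else ())

def from_run(r):
    """Recover the pure transition sequence from a run with 0 or >= 2 stores."""
    done = len(r) > 0 and r[-1] == DONE
    stores = r[:-1] if done else r
    steps = tuple(zip(stores, stores[1:]))
    return steps + ((DONE,) if done else ())

w = (("a", "b"), ("b", "c"), DONE)
print(run(w))                        # stores a, b, c followed by Done
print(from_run(run(w)) == w)         # the sequence is recovered from its run
print(run((("a", "b"), ("c", "d")))) # does not chain, so no run is generated
```

A run thus records only the stores visited, which is why a non-empty run always has at least two elements.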

Adequacy.
Lemma 5.1.The following equalities hold: Proof.The first part is immediate from the semantics of block and the definition of composition.The second part holds as * is a unit for composition.The third part follows from the facts that async(P ) • Q = async(P, Q) and that composition is associative with unit * .
For the fourth part, using the third part one sees that it is enough to show that for every E we have: [ Proof.We calculate: Proof.Immediate from the definition of async.
The next lemma applies when C is neither skip nor blocked.
and we are done.
In the case where C instead has the form E[yield], we have σ ′ = σ, T ′ = T.E[skip], C ′ = skip and, again using Lemma 5.1, we calculate: and we are done.
In the next case, C has the form E[x := e], and we have Otherwise, C has one of the forms E[if b then C else D] or E[while b do C] and we proceed much as in the previous case.
Proof.This follows from Lemmas 5.3 and 5.4.
For the proof of the converse of this lemma, we proceed by an induction on the size of loop-free commands.We then extend to general commands by expressing their semantics in terms of the semantics of their approximations by loop-free commands.The size of a loop-free command is defined by structural recursion: and the conclusion follows.The other cases are straightforward.
Next we define the approximants C (i) of a command C by induction on i and structural recursion on C, beginning with the case where C has one of the forms skip, block, x := e, or yield, when C (i) = C, and continuing with: (2) For any command D: Proof.The first part is evident using the monotonicity of the semantics of the program constructors and the semantic of loops.For the second part, we proceed by structural induction on D. All cases are straightforward, using the continuity of the program constructors, except for loops where we calculate: We can now establish the converse of Lemma 5.5.
, where now D is not loop-free.By Lemma 5.7 . The desired conclusion follows immediately, using Lemma 5.6.Lemma 5.9.
(1) For any proper non-empty pure transition sequence u, (σ, In the case where u is proper the conclusion follows from Lemma 5.1.In the case where u is Done we see from the definition of The following Adequacy Theorem for pure transition sequences is an immediate consequence of Lemmas 5.8 and 5.9: As a corollary we obtain an adequacy theorem for runs: 5.3.Full Abstraction.The first lemma in the proof of full abstraction bounds the nondeterminism of commands in semantic terms.
Lemma 5.12.For all C, u, and σ, the set Proof.More generally, we prove that for all T , C, u = (σ 1 , τ 1 ) . . .(σ n−1 , τ n−1 ), and The proof is by induction on n.The proof relies on adequacy; a purely semantic proof might be possible but seems harder.
In case n = 1, we are done, with a unique choice for τ₁. Otherwise, we conclude by induction hypothesis.
• If C is blocked, then n = 0, by Lemma 5.2, so this case is vacuous.
• If C is neither skip nor blocked, then Lemma 5.8 implies that τ₁ is unique. In case n = 1, we are done, with a unique choice for τ₁. Otherwise, Lemma 5.8 also implies that (σ₂, τ₂) … (σₙ, τ) ∈ [[T′]] for a unique T′. As in the case of skip, the desired conclusion follows by induction hypothesis.
Intuitively, Lemma 5.12 is useful because it implies that, at any point, there are certain steps that a command cannot take, and in proofs those steps can be used as unambiguous, visible markers of activity by the context. This lemma is somewhat fragile: it does not hold once one adds to the language either the nondeterministic choice operator considered in Section 6.1 or the parallel composition operator of Section 4.6.2. It follows that neither of these operators is definable in the language. An alternative argument that does not use the lemma relies on fresh variables instead. The fresh variables permit an alternative definition of the desired markers.
Full-abstraction results invariably require some notion of observation.Let us write obs(P ) for the observations that we make on P ∈ Proc.Equational full abstraction is that

[[C]] = [[D]] if and only if, for every context 𝒞, we have obs([[𝒞[C]]]) = obs([[𝒞[D]]]).
In other words, two commands have the same meaning if and only if they yield the same observations in every context of the language. The stronger inequational full abstraction is that [[C]] ⊆ [[D]] if and only if, for every context 𝒞, we have obs([[𝒞[C]]]) ⊆ obs([[𝒞[D]]]). The difficult part of this equivalence is usually the implication from right to left: that if, for every context 𝒞, we have obs([[𝒞[C]]]) ⊆ obs([[𝒞[D]]]), then [[C]] ⊆ [[D]].
One possible candidate for obs(P ) is P c . This notion of observation can be criticized as too fine-grained. Nevertheless, we find it useful to prove full abstraction for this notion of observation, with the following lemma. We first need some auxiliary definitions for its proof, and the lemma that follows. Given two stores σ and σ ′ , we define:
• a boolean expression check(σ) as the conjunction of the formulas x = n for every variable x, where n is the natural number σ(x) (so check(σ) is true in σ and false elsewhere);
• a command goto(σ) as the sequence of assignments x := n for every variable x, where n is the natural number σ(x);
• a command (σ σ ′ ) as if check(σ) then goto(σ ′ ) else block;
• a command (σ σ ′ σ ′′ ) as (σ σ ′ ); yield; (σ ′ σ ′′ ); yield.
These definitions exploit the fact that the set of variables is finite. However, with more care, analogous definitions could be given otherwise, by focusing on the set of variables relevant to the programs under observation.
If w ≠ w c , then w is of the form u(σ, σ ′ Ret)v. We let C = [ ]; (σ ′ σ ′′ ) where σ ′′ does not appear in u or v and u(σ, σ ′′ ) ∉ Q (so, by prefix-closure, u(σ, σ ′′ )v ∉ Q). Such a choice of σ ′′ is always possible by Lemma 5.12. Thus, [[C]](P ) contains u(σ, σ ′′ Ret)v, and , and that this is because some sequence w ′ is in [[C]](Q) and w ′ c = u(σ, σ ′′ )v. By the definition of the semantics of sequential composition, this could arise in one of the following ways: and σ ′′ occurs as the second store of a return transition in either u ′ or v ′ . This contradicts the requirement that σ ′′ does not appear in u or v.
• w ′ = u(σ, σ ′′ )v, w ′ ∈ Q, and w ′ does not have a return transition. This contradicts the requirement that u(σ, σ ′′ ) ∉ Q.
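The auxiliary definitions check(σ), goto(σ), and the guarded command built from them can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: stores are modeled as dictionaries over a fixed finite set of variables, the function name `transition` is hypothetical, and `None` is used as a stand-in for block; none of this is part of the semantics itself.

```python
from typing import Callable, Dict, Optional

Store = Dict[str, int]  # finitely many variables, each holding a natural number


def check(sigma: Store) -> Callable[[Store], bool]:
    """Boolean expression check(sigma): true in sigma and false in every other store."""
    return lambda current: all(current[x] == n for x, n in sigma.items())


def goto(sigma: Store) -> Callable[[Store], Store]:
    """Command goto(sigma): the sequence of assignments x := sigma(x)."""
    def run(current: Store) -> Store:
        result = dict(current)
        for x, n in sigma.items():
            result[x] = n  # one assignment x := n
        return result
    return run


def transition(sigma: Store, sigma2: Store) -> Callable[[Store], Optional[Store]]:
    """if check(sigma) then goto(sigma2) else block; None stands in for block (assumption)."""
    def run(current: Store) -> Optional[Store]:
        return goto(sigma2)(current) if check(sigma)(current) else None
    return run


s0 = {"x": 0, "y": 1}
s1 = {"x": 2, "y": 3}
print(transition(s0, s1)(s0))  # the guard holds, so the store becomes s1
print(transition(s0, s1)(s1))  # the guard fails, so the command blocks
```

The sketch relies on the finiteness of the variable set in the same way as the definitions do: check(σ) compares every variable, and goto(σ) assigns every variable.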
Another possible candidate for obs(P ) is runs(P ). Runs record more than mere input-output behavior, but much less than entire execution histories. We therefore find them attractive for our purposes. The following lemma connects runs to cleaning.
we assume that P c ⊈ Q c and prove that there exists For this, choose a sequence w ∈ P c but w ∉ Q c , in order to derive a contradiction. First, suppose that w is of the form (σ 1 , σ ′ 1 ) . . . (σ n , σ ′ n ), with n > 0. We let C be async [ ]; mesh(w), where mesh(w) is the command n ) where the stores σ ′′ i are all different from one another and from all other stores in w, and are such that i is always possible by Lemma 5.12. Since [[mesh(w)]] contains the transition sequence: )Done we obtain that [[C]](P ) contains the transition sequence: Suppose that this run is also in runs([[C]](Q)). Therefore, there exists )Done which we call w ′′ , or with a prefix of w ′′ . We analyze the origin of the transitions in the shuffle:
• The transitions (σ i , σ ′ i ) must all come from w ′ , since each of the transitions in w ′′ contains one of the stores σ ′′ j and, by choice, these are different from σ i and σ ′ i .
• Suppose that, up to some i − 1 < n, w ′ starts like w, in other words it starts as (σ 1 , σ ′ 1 ) . . . (σ i−1 , σ ′ i−1 ). Suppose further that, in the shuffle up to this point, each transition (σ j , σ ′ j ) is followed immediately by the corresponding transitions (σ ′ j , σ ′′ j )(σ ′′ j , σ j+1 ) from w ′′ . We argue that this remains the case up to n.
this transition comes from w ′′ .
− One step further, in order to derive a contradiction, we suppose that the transition (σ ′′ i−1 , σ i ) comes from w ′ . So w ′ starts: since, as noted above, the last transition here must come from w ′ . The next transition in the shuffle is (σ ′ i , σ ′′ i ). By the choice of σ ′′ i , we have that (σ 1 , σ ′ 1 ) . . .
n , σ ′′ n ) comes from w ′′ , not from w ′ . In sum, w ′ = w, and therefore w ∈ Q c , contradicting our assumption that w ∉ Q c .
Next, suppose that w is of the form (σ 1 , σ ′ 1 ) . . . (σ n , σ ′ n ) Done. With the same C, we obtain that [[C]](P ) contains the transition sequence: )Done which generates the run: . Again, by the choice of σ ′′ 1 , . . ., σ ′′ n , this can be the case only if w is in Q c . (The argument for the contradiction may actually be simplified in this case, because of the marker Done.)
We obtain the following Full-abstraction Theorem:

Proof. The implication from [[C]] ⊆ [[D]] is an immediate consequence of the compositionality of the semantics (Proposition 4.2). The converse follows from Lemmas 5.13 and 5.14.
Coarser-grained definitions of obs(P ) may sometimes be appropriate. For those, we expect that full abstraction will typically require additional closure conditions on P , such as closure under suitable forms of stuttering and mumbling, much as in our work and Brookes's on parallel composition [AP93,Bro96].

Algebra
The development of the denotational semantics in Section 4 is ad hoc, in that the semantics is not related to any systematic approach. In this section we show how it fits in with the algebraic theory of effects [PP02, PP03, HPP06, PP08, PP09].
In the functional programming approach to imperative languages, commands have unit type, 1. Then, taking the monadic point of view [BHM02], they are modeled as elements of T (1) for a suitable monad T on, say, the category of ω-cpos and continuous functions. For parallelism one might look for something along the lines of the resumptions monad [HP79, CM93, HPP06].
In the algebraic approach to computational effects [PP02, HPP06], one analyses the monads as free algebra monads for a suitable equational or Lawvere theory L (here meaning in the enriched sense, so that inequations are allowed, as are families of operations continuously parameterized over an ω-cpo). The operations of the theory (for example a binary choice operation in the case of nondeterminism) are thought of as effect constructors in that they create the effects at hand.
As discussed in [HP79], resumptions are generally not fully abstract when their domain equation is solved in a category of cpos. If, instead, it is solved in a category of semilattices, increased abstraction may be obtained. The situation was analyzed from the algebraic point of view in [HPP06]. It was shown there that resumptions arise by combining a theory for stores [PP02] with one for nondeterminism, one for nontermination, and one for a unary operation d thought of as suspending computation. The difference between solving the equation in a category of semilattices or cpos essentially amounts to whether or not one asks that d, and the other operations, commute with nondeterminism.
In [Bro96], Brookes, using an apparently different and mathematically elementary trace-based approach, succeeded in giving a fully abstract semantics for a language of the kind considered in [HP79]. However, in [Jef95], Jeffrey showed that trace-based models of concurrent languages can arise as solutions to domain equations in a category of semilattices, thereby relating the two approaches.
We propose here to identify the suspension operation d with the operation of the same name introduced in Section 4.3; indeed this identification was the origin of the definition of yield given there, and it is natural to further identify yield as the generic effect [PP03] corresponding to the suspension operation. These identifications are justified by Corollary 6.5, below, and the discussion following it.
In Section 6.1 we carry out an algebraic analysis of resumptions. We show in Theorem 6.1 that, imposing the commutations with nondeterminism just discussed, they do indeed correspond to a traces model, provided one uses the Hoare or lower powerdomain. (This powerdomain is a natural choice as we consider only "may" semantics in this paper, and elements of such powerdomains are Scott closed, so downwards-closed, a natural generalization of our prefix-closedness condition.) The proof makes the link between domain equations and traces.
The missing ingredient in an algebraic analysis of Proc is then an account of async. In the denotational semantics of any command of the form async C, all Ret marking is lost from the meaning of C, because of the application of the cleaning function, − c ; further all the sequences in [[C]] c are proper. We propose to treat async as a generic effect, parameterized by an element of AProc (which will be [[C]] c ).
In order to give the equations for the async operation it will, as one may expect, be useful to first have an algebraic analysis of AProc; we carry out this analysis in Section 6.2. It turns out, as detailed in Theorem 6.2, that AProc is similar to, but not quite, a resumptions ω-cpo. Finally, we analyze processes in Section 6.3, showing, in Theorem 6.4, that a process is a kind of "double-thread": more precisely, a resumption that returns not only a value but also an element of AProc.
6.1. Resumptions. Our theory L Res for resumptions follows [HPP06] but is somewhat modified, as we are interested only in "may" semantics and as we wish to allow infinitely proceeding processes. The theory is a combination of several constituent theories which we now consider successively.
The Lawvere theory L S of stores can be presented via a family of unary operations update x,n and a family of "N-ary" operations lookup x (x ∈ Vars, n ∈ N). (An N-ary operation is a countably infinitary operation whose arguments are indexed by the natural numbers.) For any computation γ, update x,n (γ) is read as the computation that first updates x to n and then proceeds as γ; for any N-indexed collection (γ n ) n of computations, lookup x ((γ n ) n ) is read as the computation that proceeds as γ n if x has value n in the current store.
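The informal reading of update and lookup can be illustrated by a small executable sketch. It makes a strong simplifying assumption that is not part of the theory: a computation is modeled as a Python function from a start store to a final store, with no nondeterminism, suspension, or nontermination. The names `Comp` and `stop` are hypothetical, and the N-indexed family (γ n ) n is represented as a function from numbers to computations.

```python
from typing import Callable, Dict

Store = Dict[str, int]
Comp = Callable[[Store], Store]  # assumption: a computation maps a start store to a final store


def stop(s: Store) -> Store:
    """Hypothetical terminal computation: return the store unchanged."""
    return s


def update(x: str, n: int, gamma: Comp) -> Comp:
    """update_{x,n}(gamma): set x to n, then proceed as gamma."""
    def run(s: Store) -> Store:
        s2 = dict(s)
        s2[x] = n
        return gamma(s2)
    return run


def lookup(x: str, family: Callable[[int], Comp]) -> Comp:
    """lookup_x((gamma_n)_n): proceed as gamma_n, where n is the current value of x."""
    return lambda s: family(s[x])(s)


# The assignment x := y + 1, written with the two operations.
incr = lookup("y", lambda n: update("x", n + 1, stop))

start = {"x": 0, "y": 4}
print(incr(start))  # {'x': 5, 'y': 4}
```

In this simplified model one can also check instances of the familiar store law that a lookup immediately after an update of the same variable sees the updated value, that is, update x,n (lookup x ((γ m ) m )) = update x,n (γ n ).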
The Lawvere theory L H for nondeterminism is that of the lower (aka Hoare) powerdomain, presented using a binary nondeterministic choice operation ∪; the Lawvere theory L Ω for nontermination is the theory of a least element, presented using a constant Ω; and the Lawvere theory L d for suspension is that of a unary operation d, with no equations. See [PP02,HPP06] for more details of these theories, including an account of the equations for stores and for Hoare powerdomains.
For resumptions, continuing to follow [HPP06], we wish the operations of L S to commute with those of L H and L Ω (which automatically commute with each other) and it is also natural to have d commute with nondeterministic choice, but not with the operations of L S , as we wish to model interruption points, and not with Ω, as we want to be able to model infinitely proceeding processes. We therefore define: and let T Res be the associated monad. (For any two theories L and L ′ presented using disjoint signatures, the theories L + L ′ and L ⊗ L ′ can be presented using the union of the signatures of L and L ′ and, in the former case, by the union of their equations and, in the latter case, by the union of their equations together with additional equations that say that each operation of each theory commutes with each operation of the other.)
We now give an elementary trace-based picture of T Res (P ) for sufficiently general ω-cpos P . Let Q be a partial order. A Q-transition is a pair of states (σ, σ ′ x) in which the second is marked with an element x of Q; we let τ range over stores and stores marked with an element of Q. A basic Q-transition sequence is a non-empty sequence consisting of plain transitions optionally followed by a Q-transition. Let ≤ Q be the least preorder on the set of basic Q-transition sequences which contains the prefix relation ≤ p and is such that, for any x, y in Q, if x ≤ y then u(σ, σ ′ x) ≤ Q u(σ, σ ′ y). One has that ≤ Q is a partial order and that u ≤ Q v holds iff:
We need a few notions concerning ideals in partial orders. An ideal I in a partial order Q is a downwards-closed subset of Q; for any subset X of Q we write X ↓ for the least ideal including X, viz {x ∈ Q | ∃y ∈ X. x ≤ y}; and for any x ∈ Q we write x↓ for {x}↓. Downwards-closed sets, i.e., ideals, provide a suitable generalization of prefix-closed sets when passing from sequences to general partial orders.
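To see how ideals specialize to prefix-closed sets, the following sketch computes X ↓ for the prefix order on a small, hypothetical universe of sequences. The letters "a" and "b" stand in for transitions, the universe is cut off at length 3 so that everything is finite, and the function names are illustrative assumptions rather than notation from the text.

```python
from itertools import product


def is_prefix(u, v):
    """Prefix order on sequences represented as tuples."""
    return len(u) <= len(v) and v[:len(u)] == u


def down_closure(xs, universe, leq):
    """X↓ = {x in universe | there is y in xs with x <= y}."""
    return {x for x in universe if any(leq(x, y) for y in xs)}


def is_ideal(candidate, universe, leq):
    """An ideal is a downwards-closed subset of the universe."""
    return all(x in candidate
               for y in candidate for x in universe if leq(x, y))


# Hypothetical finite universe: non-empty sequences of length at most 3
# over two transitions, written "a" and "b" for brevity.
universe = {seq for n in (1, 2, 3) for seq in product("ab", repeat=n)}

generated = down_closure({("a", "b", "a")}, universe, is_prefix)
print(sorted(generated))  # [('a',), ('a', 'b'), ('a', 'b', 'a')]
```

Here the ideal generated by a single sequence is exactly its set of non-empty prefixes, which is the prefix-closure used in the trace semantics.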
An ideal I is directed if it is nonempty and any two elements of the ideal have an upper bound in the ideal. An ideal is denumerably generated if I = X ↓ for some denumerable X ⊆ I. We write I ↑ ω (Q), respectively I ω (Q), for the collection of all denumerably generated directed ideals of Q, respectively all denumerably generated ideals of Q, and we partially order them by subset; I ↑ ω (Q) is an ω-cpo, indeed it is the free such over Q; and I ω (Q) is the free ω-cpo with all finite sups over Q: it follows that it is also the free such ω-cpo over
Let Q-BTrans be the set of basic Q-transition sequences, partially ordered as above. One can view I ω (Q-BTrans) as an L Res -model with the following definitions of the operations, where now we use l to range over Vars: (We skip over the small difference between the notion of an L Res -model and of an algebra satisfying equations.)
We write ωCpo and ωSL for, respectively, the category of ω-cpos and the category of ω-cpos with all finite sups. For any poset P , its lifting P ⊥ is the poset obtained from P by freely adjoining a least element ⊥; its elements are (0, x), for x ∈ P , and ⊥, and they are ordered in the evident way. If P has all sups of increasing ω-chains, i.e., is an ω-cpo (respectively has finite sups), so does P ⊥ . For any object a of any given category, and any set X, we write X ⊗ a and a X for, respectively, the X-fold sum and product of a with itself, assuming they exist. The category ωSL has countable biproducts, given by the usual cartesian product of posets, and it is convenient to identify X ⊗ L with L X , for countable sets X.
The next theorem shows that the algebraic notion of resumptions can indeed be characterized in trace-based terms, specifically as ideals of basic Q-transition sequences.
Theorem 6.1. Viewed as an L Res -model, and, for any continuous f : is given by: By Theorem 1 of [PP02] the free algebra monad for L S over ωSL is T S = (S ⊗ −) S , where we abbreviate Store to S (the theorem depends on the set of variables being finite). The definitions of the operations (update l,n ) T S (L) and (lookup l ) T S (L) of an algebra T S (L) are given by Proposition 1 of [PP02]; the unit (η T S ) L at L is the canonical map L → (S ⊗ L) S . So, by Corollary 2 of [HPP06], for any poset Q, L is the solution of the following "domain equation" in ωSL: L ≅ (S ⊗ (L ⊥ + I ω (Q))) S (6.1) by which we mean the initial ω-cpo with finite sups L and map The morphism (update l,n Now, since countable copowers and powers coincide in ωSL, Equation (6.1) can be rewritten as: As I ω : Pos → ωSL is a left adjoint, where Pos is the category of posets, it preserves all colimits; I ω also commutes with lifting. So there is an isomorphism: for any poset R. So, again using that I ω preserves all colimits, we can solve Equation (6.2) by first solving the equation: in the category Pos, and then applying I ω . To do that, one takes R to be the least set such that R = S × (S × (R ⊥ + Q)) and then imposes the evident inductively defined partial order on it. The solution of Equation (6.2) is then given by taking L = I ω (R) and α = β −1 .
We now have an expression of L as I ω (R), as well as definitions of (update l,n ) L , (lookup l ) L , d L , and the unit. So, given the initial discussion above, we see that L forms the free model of L Res over I ↑ ω (R) with unit: x ∈ I} and with operations: There is an evident isomorphism of partial orders θ Res : R ≅ Q-BTrans, given recursively by: θ Res ((σ, (σ ′ , inl((0, u) = (σ, σ ′ x) This induces an isomorphism I ω (R) ≅ I ω (Q-BTrans) of ω-cpos, and so the free such model is also carried by I ω (Q-BTrans). Using this, and the above definitions of the operations and unit for I ω (R), one then verifies that the operations and unit for I ω (Q-BTrans) are as required.
As regards the formula for the Kleisli extension, that f † (η T Res ) I ↑ ω (Q) = f is evident and that the purported extension is a morphism of models of L Res is a calculation.
There is a related programming language phenomenon. Denotationally, we have the inclusion: [[(async (yield; block) As in the proof of the full-abstraction theorem, one can distinguish [[yield; block]] from [[skip]] using a sequential context; however, this context is not available when the command is within an async.
To solve this difficulty we take the theory of asynchronous threads L AProc to be L Res extended by a new constant halt and the equation: d(Ω) ≤ halt. We can turn AProc into a model of L AProc by defining operations as follows: We write T AProc for the monad associated to the theory L AProc . The next theorem shows that the variant theory L AProc indeed captures AProc. First we need some notation.
• For every sequence of plain transitions u = (σ 1 , σ ′ 1 ) . . . (σ n , σ ′ n ) we define a unary derived operation a u by:
• For every sequence of plain transitions u and σ, σ ′ ∈ Store, we define two constants u and u(σ, σ ′ )Done by:
Note that u AProc = u I ω (Q) = u ↓, where, for example, u AProc is the interpretation of u in AProc; further u(σ, σ ′ )Done AProc = u(σ, σ ′ )Done ↓. Below we may confuse a constant or operation with its interpretation in a specific algebra A, e.g., writing u or a u rather than u A or (a u ) A , provided that the intended algebra can be understood from the context.
Theorem 6.2. AProc is the initial L AProc -model, i.e., it is T AProc (0).
Proof. We begin by examining the connection between I ω ({Done}) and AProc. By Theorem 6.1, I ω ({Done}) is the free model of L Res over {Done}. So f : {Done} → AProc has a unique extension to a morphism f † : I ω ({Done}) → AProc of L Res -models, where f (Done) = def halt AProc . We now show that: from which it follows that f † is onto. It is enough to show that f † (u ↓) = θ AProc (u) ↓, which holds as, for any u not containing Done, we calculate that where, in both cases, the second equality holds as f † is a morphism of L Res -models.
Let L be a model of L AProc . We have to show there is a unique morphism h : AProc → L. For uniqueness, let h, h ′ be such morphisms. Then both h • f † and h ′ • f † are morphisms of L Res models from I ω ({Done}) to L, extending the map Done → halt L . So, as there is only one such map, h • f † = h ′ • f † , and hence h = h ′ , as f † is onto.
For existence, define the map θ : PPSeq → L by: θ(u) = (a u ) L . Using the fact that L is a model of L AProc , particularly the axiom d(Ω) ≤ halt, one has that θ is monotonic. One can then define a continuous map h : AProc → L by: with the sup on the right existing as I is denumerable. Let g be the unique morphism of L Res models from I ω ({Done}) to L, extending the map Done → halt L .
We have that h • f † = g, as, for any u not containing Done, we may calculate that: and that As h • f † = g, and f † and g are morphisms of L Res models, and f † is onto, h is automatically a morphism of L Res models. For example, for the preservation of d, given I ∈ AProc, choose J ∈ I ω ({Done}) such that f † (J) = I and calculate that: Further, h preserves halt as h(halt AProc ) = θ(halt AProc ) = halt L . We therefore have that h is a morphism of L AProc -models, which concludes the proof.
One can go on and obtain a general view of the monad T AProc using a suitable notion of (proper) pure Q-transition sequences. However, we omit the details as they are not needed for an account of processes.
There is another possible proof of Theorem 6.2 along the lines of that of Theorem 6.1. First one notes that to have a model of L AProc in ωCpo is to have a model of L S in ωSL, with carrier L, say, together with a morphism d : L ⊥ → L and an element halt ∈ L such that d(Ω) ≤ halt. It is not hard to see that to have such a morphism and element is to have a morphism (L + I ω (𝟙)) ⊥ → L, where 𝟙 is the one-point partial order.
One then sees that the carrier of the initial such model is given by the solution of the domain equation: L ≅ (S ⊗ (L + I ω (𝟙)) ⊥ ) S and that this can be solved by first solving the corresponding equation in Pos and then setting L = I ω (R). The rest of the proof proceeds as expected.
Equally, there should be an elementary proof of Theorem 6.1, which, like that of Theorem 6.2, makes use of definability. The more conceptual proofs have the advantage of showing, via domain equations, the origins of the two kinds of transition sequences and their ordering.
6.3. Processes. We turn to our algebraic account of Proc. The signature of our theory of processes, L Proc , is that for L Res together with two families of unary operation symbols async P and yield to P , where P is in AProc. The first of these corresponds to the function of the same name defined above, but restricted to asynchronous threads. The second corresponds to a slightly different version of async in which the first action is that of the thread spun off, rather than that of the active command. We often find it convenient to write async P t and yield to P t as, respectively, P ⊲ t and P ⊳ t, thinking of them as right and left shuffles.
We begin with a theory L Spawn for async and yield to which involves the other operations. The first group of equations for L Spawn concerns commutation with ∪: The second group of equations concerns the interaction of async with the other operations of L Proc (except for ⊳): where we write P ⊲⊳ x for the "left action" (P ⊲ x) ∪ (P ⊳ x). The first three state that P ⊲ − commutes with another operation; the next concerns the interaction of async with suspension and brings in yield to; the last reduces two occurrences of async to one. The third, and last, group of equations is for the interaction of yield to with the other operations of L AProc : (update l,n ) AProc (P ) ⊳ x = update l,n (P ⊳ x) The first three assert that − ⊳ x acts homomorphically with respect to an operation; the next concerns the interaction with suspension; and the last concerns what happens when asynchronous threads halt. Finally we add an inequation:
We take the equations of L Proc to be those of L Spawn , i.e., the equations are the ones just given for async and yield to, together with those of L Res . One would naturally have expected L Proc also to have an equation with left-hand side P ⊲ (P ′ ⊳ x); indeed, we could have added the equation: P ⊲ (P ′ ⊳ x) = P ′ ⊳ (P ⊲⊳ x) However, this equation is redundant as it can be proved from the others using the algebraic induction principle of "Computational Induction" described in [PP08]. (One proceeds by such an induction on P ′ , with a subinduction on P .) The inequation is somewhat inelegant: a possible improvement would be to use Pool rather than restricting to asynchronous threads. This would give the possibility of a version of halt, to denote Done ↓, such that the equations halt ⊲ x = halt ⊳ x = x held, making the inequation redundant.
Let T Proc be the monad associated to the theory L Proc . We now aim to give a picture of T Proc (I ↑ ω (Q)) like the one we gave of T Res (I ↑ ω (Q)). Take the partial order Q-Trans of the Q-transition sequences to be that of the basic (Q × PSeq)-transition sequences. Note that one can regard Q-transition sequences as elements of a kind of "double thread" in which the first thread returns a value together with a second (asynchronous) thread.
We show that Q-Proc = def I ω (Q-Trans) carries the free model of L Proc on I ↑ ω (Q). We view Q-Proc as an L Res -model as in Section 6.1. In order to give async and yield to, we first mutually recursively define the incomplete right and left shuffles u ⊲ v and u ⊳ v in Q-Proc of a proper pure transition sequence u with a Q-transition sequence v, by: where, for any pure transition sequence w, w − is w less any occurrence of Done, and writing u ⊲⊳ v for the incomplete shuffles (u ⊳ v) ∪ (u ⊲ v) of u and v, and: where, in the last line, u is required to be proper. (Recall that an incomplete shuffle of two sequences is a shuffle of two of their prefixes, equivalently a prefix of a shuffle of them.) Both ⊲ and ⊳ are monotonic operations.
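The two descriptions of incomplete shuffles recalled above, as shuffles of prefixes and as prefixes of shuffles, can be compared on small examples. The sketch below is a simplification: sequences are plain strings whose letters stand for transitions, and the markers Done and the Q-markings are ignored. All function names are illustrative assumptions.

```python
def shuffles(u: str, v: str) -> set:
    """All interleavings of u and v that keep the letters of each in order."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({u[0] + w for w in shuffles(u[1:], v)} |
            {v[0] + w for w in shuffles(u, v[1:])})


def prefixes(w: str) -> set:
    """All prefixes of w, including the empty one and w itself."""
    return {w[:i] for i in range(len(w) + 1)}


def shuffles_of_prefixes(u: str, v: str) -> set:
    """Incomplete shuffles, described as shuffles of a prefix of u with a prefix of v."""
    return {w for p in prefixes(u) for q in prefixes(v) for w in shuffles(p, q)}


def prefixes_of_shuffles(u: str, v: str) -> set:
    """Incomplete shuffles, described as prefixes of a shuffle of u and v."""
    return {p for w in shuffles(u, v) for p in prefixes(w)}


u, v = "ab", "cde"
print(shuffles_of_prefixes(u, v) == prefixes_of_shuffles(u, v))  # True
```

The agreement reflects the two directions of the argument: a prefix of an interleaving interleaves a prefix of each sequence, and an interleaving of two prefixes can be extended to a full interleaving by appending what remains.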
Then, for P ∈ AProc and I ∈ Q-Proc, we put: (async Proc ) P (I) = In the following we make use of the notation introduced in Section 6.2.
Lemma 6.3. For any proper pure transition sequence u, the equation
Proof. The proof is by induction on the length of u. In the case where u = ε, we have u ↓= Ω AProc , and in the equational theory we have Ω AProc ⊳ Ω = Ω, as required.
Our main algebraic theorem characterizes free models of a natural equational theory for resumptions with thread-spawning in terms of a kind of double-thread.
Theorem 6.4. Viewed as an L Proc -model, I ω (Q-Trans) is the free model over is given by:
Proof. To show that I ω (Q-Trans) is the free algebra over I ↑ ω (Q) with unit as above, we must show that for any L Proc -model A and any continuous function f : I ↑ ω (Q) → A there is a unique morphism h : I ω (Q-Trans) → A of models of L Proc such that the following diagram commutes: We begin by showing uniqueness. To that end, fix A and f , and let h be a morphism such that the diagram commutes. Define g : This is a good definition, with monotonicity being established using the inequation for ⊲. We have f = gα and (η T Proc ) I ↑ ω (Q) = (η T Res ) I ↑ ω (Q×PSeq) α where α : We then have that the following diagram commutes: as we may calculate, for u = Done, that: and, for u = Done, that: Q×PSeq) = g, and so h = h ′ , as h and h ′ are morphisms of models of L Res (being morphisms of models of L Proc ).
For existence we are again given A and f and wish to construct a suitable h. To that end, with g and α as before, take h to be the T Res -extension of g. Then we have h Q×PSeq) α = gα = f and so it remains to prove that h preserves async and yield to.
As regards the preservation of async, since it is continuous, preserves ∪ in each argument, and is strict in its second argument, it suffices to establish preservation for individual transition sequences. That is, it suffices to show, for all proper pure transition sequences u and all v in Q-Trans, that: where here, and below, we omit ↓'s, writing, e.g., u and v rather than u↓ and v↓.
As regards the preservation of yield to, since it is continuous and preserves ∪ in each argument, it suffices to show, for all proper pure transition sequences u and all v in Q-Trans that: For the last of these three equations, as h(Ω) = Ω, using Lemma 6.3, we see that it is enough to show that h(u − ) = u − , and this holds as h is a homomorphism of models of L Res .
The proof of the first two equations is a simultaneous induction on the sum of the lengths of u and v, invoking L Proc equations on A as necessary. We begin with the first equation. In the first case, we consider v = (σ, σ ′ ). Here, on the one hand, we have: using the fact that h is a homomorphism for the last equality, and, on the other, we have: For the next case we consider v = (σ, σ ′ (x, u ′ )). Here, on the one hand we have: and, on the other hand, we have: For the last case for the first equation we have v = (σ, σ ′ )v ′ , with v ′ in Q-Trans, and we calculate: applying the induction hypothesis in the second line.
Turning to the second equation, the first case we consider is where u = ε, and we have: The second case is where u = (σ, σ ′ )Done and we have: The last case is where u = (σ, σ ′ )u ′ , with u ′ a proper pure transition sequence, and we have: applying the induction hypothesis to obtain the fourth equality.
Finally, the formula for the Kleisli extension follows from the construction of h, using the Kleisli formula of Theorem 6.1.
As in the case of resumptions, one can go further and obtain a closely related, if less elementary, picture of T Proc (P ) for arbitrary P .
Note that the proof of Theorem 6.4 is elementary, making use of definability in a similar way to the proof of Theorem 6.2. However, unlike in the cases of Theorems 6.1 and 6.2, we do not know any conceptual proof of Theorem 6.4. The difficulty is that the theory of processes L Proc , particularly the part concerning ⊳ and ⊲, seems somewhat ad hoc, and is not built up in a standard way from simpler theories. There is surely more to be understood here.
Nonetheless, with Theorem 6.4 available, we are in a position to give our algebraic account of Proc. There is an isomorphism θ One then has an isomorphism of ω-cpos θProc : I ω (Q-Trans) ≅ Proc given by: θProc (I) = θ Proc (I) ∪ {ε}. It follows that Proc can be seen as the free model of L Proc over the terminal ω-cpo {Ret}, as we now spell out. First, define the set of left shuffles u ⊳ v of a pure transition sequence u with a transition sequence v by setting v} Then, we have:
Corollary 6.5. Equip Proc with the following operations: Then θProc : I ω (Q-Trans) ≅ Proc is an isomorphism of L Proc -models, and Proc is the free model of L Proc over {Ret}, with unit (η Proc ) {Ret} : {Ret} → Proc given by: The Kleisli extension of a map f : {Ret} → Proc is given by:
Proof. The proof is a calculation using Theorem 6.4. The following equations are useful: θProc where u is a proper pure transition sequence and v is a {Ret}-transition sequence.
As we now see, the algebraic view also determines the semantics of our language. This achieves our aim of placing cooperative threads within the algebraic approach to effects, thereby justifying the previous, more ad hoc, account.
First, we have that [[skip]] = (η Proc ) {Ret} (Ret) and that P • Q = (Ret → Q) † (P ), so the Kleisli structure determines the semantics of skip and composition, just as one would expect from the monadic point of view.
Next, the update and lookup operations, together with the assumed primitive natural number and boolean functions, determine the semantics of assignments, conditionals, and while loops. The operations are equivalent to two generic effects, of assignment and reading: One can use the reading generic effect to give the semantics of numerical expressions as elements of T Proc (N); with that, one can give the semantics of assignments, using the assignment generic effect, standard monadic means, and θProc . Similarly, one can use the reading generic effect to give the semantics of boolean expressions as elements of T Proc (B), where B = def {true, false}; with that one can give the semantics of conditionals and while loops, again using standard monadic means and θProc (as well as least fixed-points for while loops). Continuing, the d operation is that of the algebra; and block is modeled by Ω Proc . Finally, the semantics of spawning is determined by async together with the cleaning function − c : Proc → AProc. It turns out that the latter is also determined by algebraic means. Specifically, one can regard AProc as a model of L Res as in Section 6.2 (so we ignore halt) and then extend it to a model of L Proc as follows. First for any proper pure transition sequences u and v we define u ⊲ v ∈ AProc inductively on v by: where, in the last line, v is required to be proper. Then we put: (async AProc ) P (Q) = u∈P,v∈Q u ⊲ v and (yield to AProc ) P (Q) = (async AProc ) Q (P ). With these definitions, − c is the extension of the map Ret → halt AProc to Proc.
In the converse direction one can consider adding missing algebraic operations to the language, for example adding ∪ and yield to via constructs C or D and yield to C. The latter construct is to the binary yield to as async is to the binary async. It generalizes yield, which is equivalent to yield to skip. Its operational semantics is given by the rule: σ, T, E[yield to C] −→ σ, T.E[skip], C One may debate the programming usefulness of such additional constructs, but they do allow one to express the equations used for the algebraic characterizations at the level of commands. For example, the equation P ⊲ d(x) = d(P ⊲⊳ x) becomes: (async C); yield; D = yield; ((async C); D or (yield to C); D)
6.4. Dendriform Algebras and Modules. We have found it useful to employ various forms of shuffle: sometimes we shuffle two things of the same kind with each other, e.g., two pure transition sequences with each other; and sometimes we shuffle two things of different kinds with each other, e.g., a pure transition sequence with a transition sequence.
We have further found it useful to break down such shuffles into left and right shuffles, e.g., in the case of the left and right shuffles of asynchronous processes with processes; indeed we employ a uniform notation, writing ⊳, ⊲, and ⊲⊳ for left shuffles, right shuffles, and (ordinary) shuffles, respectively. Our algebraic account of threads has further involved a number of equations concerning the interaction of these shuffle operations with each other and with other operations.
Shuffle operations and their algebra have been studied in a variety of settings. In particular, Loday's dendriform algebras [Lod01,FG08] provide a wide-ranging general notion of left and right shuffling of two things of the same kind with each other. Foissy's dendriform A-modules [Foi07] provide the corresponding notion of action: left or right shuffling a thing of one kind with a thing of another kind. We next relate our treatment to these general concepts, thereby placing our various shuffle operations and our equations for them in a standard algebraic context.
Let R be a given commutative semiring (with no requirement for a 0 or a 1). Then a dendriform dialgebra is an R-module A equipped with two binary bilinear operations ⊳ and ⊲ such that, for all x, y, z ∈ A: where x ⊲⊳ y = def x ⊳ y + x ⊲ y; it is commutative if x ⊳ y = y ⊲ x always holds. Then (A, ⊲⊳) is a semigroup in the category of R-modules, equivalently ⊲⊳ is an associative bilinear operation; it is commutative if the dialgebra is.
Given a dendriform algebra A, a dendriform A-module is an R-module M equipped with two binary bilinear operations ⊳, ⊲ : A × M → M such that, for all a, b ∈ A and
In all our examples we take R to be the natural two-element semiring over B; join semilattices with a zero form B-modules (setting true x = x and false x = 0). As a first example, consider the B-module of the collection of all languages, i.e., all sets of strings over a given alphabet, not containing ε. This is a commutative dialgebra, taking ⊳ to be the left shuffle operation, and ⊲ to be the right one; ⊲⊳ is then the usual shuffle operation.
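The language example can be checked mechanically on small instances. The sketch below is an illustration only: it assumes that the left shuffle u ⊳ v takes its first letter from u and that the right shuffle u ⊲ v takes its first letter from v, in line with the reading of ⊳ and ⊲ as left and right shuffles, and it lifts both to languages (finite sets of non-empty strings) by union. The function names are hypothetical.

```python
def shuffles(u: str, v: str) -> set:
    """All interleavings of u and v that keep the letters of each in order."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({u[0] + w for w in shuffles(u[1:], v)} |
            {v[0] + w for w in shuffles(u, v[1:])})


def left(u: str, v: str) -> set:
    """Left shuffle of non-empty strings: the first letter comes from u (assumed convention)."""
    return {u[0] + w for w in shuffles(u[1:], v)}


def right(u: str, v: str) -> set:
    """Right shuffle of non-empty strings: the first letter comes from v (assumed convention)."""
    return {v[0] + w for w in shuffles(u, v[1:])}


def both(u: str, v: str) -> set:
    """Union of the left and right shuffles."""
    return left(u, v) | right(u, v)


def lift(op, A: set, B: set) -> set:
    """Extend a string operation to languages, bilinearly with respect to union."""
    return {w for u in A for v in B for w in op(u, v)}


A, B, C = {"ab", "b"}, {"c"}, {"dd"}
print(lift(left, A, B) == lift(right, B, A))       # commutativity: True
print(lift(both, A, B) == lift(shuffles, A, B))    # left plus right is the usual shuffle: True
# A law of the shape (A ⊳ B) ⊳ C = A ⊳ (B ⊲⊳ C), checked on this instance: True
print(lift(left, lift(left, A, B), C) == lift(left, A, lift(both, B, C)))
```

Excluding ε matters here: both `left` and `right` inspect the first letter of one argument, so they are only defined on non-empty strings.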
The semilattice of asynchronous processes AProc forms a commutative dendriform B-algebra, setting:
It follows that Proc also forms a dendriform AProc-module, using the definitions of the left and right shuffling given in Corollary 6.5. Algebraically, the first group of equations for L Spawn states the bilinearity of the two module operations. The second group contains the second of the three module equations. The equation a ⊲ (b ⊳ x) = b ⊳ (a ⊲⊳ x), generalizing one considered above, holds in any module over a commutative dendriform algebra. To account for the other two module equations algebraically, one would need an algebraic treatment of the dendriform algebra operations on AProc. These operations are effect deconstructors rather than effect constructors. An account of unary deconstructors has been given in [PP09], but a satisfactory treatment of binary ones remains to be found; we therefore leave further algebraic treatment to future work.

Conclusion
A priori, the properties and the semantics of threads in general, and of cooperative threads in particular, may not appear obvious. In our opinion, a huge body of incorrect multithreaded software and a relatively small literature both support this point of view. With the belief that mathematical foundations could prove beneficial, the main technical goal of our work is to define and elucidate the semantics of threads. For instance, semantics can serve for validating reasoning principles; our work is only a preliminary, but encouraging, step in this respect.
Our initial motivation was partly practical: we wanted to understand and further the AME programming model and similar ones. We also saw an opportunity to leverage developments in trace-based denotational semantics and in the algebraic theory of effects, and to extend their applicability to threads. As our results demonstrate, the convergence of these three lines of work proved interesting and fruitful.
We focus on a particular small language with constructs for threads. Several possible extensions may be considered. These include constructs for parallel composition, nondeterministic choice, higher-order functions, and thread-joining. More speculatively, they also include generalized yields, of the kind that arise in the algebraic theory of effects, as discussed in Section 6. Importantly, our monadic treatment of threads indicates how to add higher-order functions to the semantics.
Our results mostly carry over to these extensions. In some cases, small changes or restrictions are required. In particular, the full-abstraction proof with nondeterministic choice would use fresh variables; the one for higher-order functions might require standard limitations on the order of functions, cf. [Jef95]. Thus, our approach seems to be robust and indeed, as in the case of higher-order functions, helpful in accounting for a range of language features. Further, our algebraic analysis of the thread monad links it to the broader theme of the algebraic treatment of effects. In that regard, as the discussion after Theorem 6.4 indicates, there is clearly still further understanding to be gained.
Another possible direction for further work is the exploration of alternative semantics. For instance, we could switch from the "may" semantics that we study to "must" semantics. We could also define alternative notions of observation. As suggested in Section 5.3, some of the coarser notions of observation might require closure conditions, such as closure under suitable forms of stuttering and under mumbling. These may correspond to suitable axioms on the suspension operator d, as alluded to in [Plo06]: we conjecture that stuttering corresponds to d(d(x)) ≤ d(x) and that mumbling corresponds to d(x) ≥ x.
It would also be interesting to consider finer notions of observation that distinguish blocking from divergence. To this end we could add constructs such as orElse [HMP05] and, in the semantics, treat blocking as a kind of exception. Finally, we could revisit lower-level semantics with explicit optimistic concurrency and roll-backs, of the kind employed in the implementation of AME.

[[C]] : Proc → Proc is a continuous function on Proc. This function is defined by induction on the form of C, with the usual clauses of the definition of [[•]] plus [[[ ]]](P) = P.

Lemma 4.3. For all P, Q ∈ Pool and R ∈ Proc we have: (1) async
As composition is associative with unit *, this is equivalent to showing that, for every C, we have: [[yield; C]] c = async([[C]] c ) c, which follows immediately, expanding the definitions. The proof of the fifth part is a straightforward verification.
Lemma 5.2. If C is blocked then, for all T, [[T, C]] = {ε}.
Proof. We divide into cases according to the form of C. In the case where C has the form E[skip; D] we have σ′ = σ, T′ = T and C′ = E[D]. So, by Lemma 5.1, we have [[T′, C′]] = [[T, C]], and we are done. In the case where C instead has the form E[async D], we have σ′ = σ, T′ = T.D and C′ = E[skip] and we calculate:
then so is C′ and, further, |C′| < |C|.
The approximation relation C D between loop-free commands C and general commands D is defined to be the least such relation closed under all non-looping program constructs and such that, for any b, C, D, and i ≥ 0: block D C D (while b do C) i (while b do D) This relation is extended to thread pools and contexts in the obvious way: we write T T′ and C C′ for these extensions.
Lemma 5.6. Suppose that T U, C D, and, further, that σ, T, C −→ a σ′, T′, C′. Then, for some U′, D′ with T′ U′ and C′ D′, σ, U, D −→ a * σ′, U′, D′.
Proof. One first notes that, for any C, D, if E[C] D then D has the form E′[D′] where E E′ and C D′. The proof then divides into cases according to the rule used to show that σ, T, C −→ a σ′, T′, C′. For example, suppose we have C = E[if b then C_1 else C_2] and σ(b) = true. We know that D must have the form E′[D′] where E E′ and (if b then C_1 else C_2) D′. Suppose now that D′ has the form while b do D″. Then we must have, for some i ≥ 0, that C_1 = C″; (while b do C″) i where C″ D″. But then we observe that σ
Proof. We begin by proving this for loop-free commands C. The proof is by induction on the size of C. If C is skip we have σ, T, skip −→ a * σ, T, skip and the conclusion follows, as, by Lemma 5.3, (σ, σ′)u ∈ [[T, skip]] c iff σ′ = σ and u ∈ [[T]] c. If C is blocked, the conclusion holds trivially, by Lemma 5.2. If C is neither skip nor blocked we have σ, T, C −→ a σ″, T″, C″ (and then C″ is loop-free and |C″| < |C|). Then, by Lemma 5.4, (σ, σ

Finally, having established the claim for sequences of length n for sets of the form [[T, C]], we consider sequences of length n in a set of the form [[T]]. Suppose that T consists of C_1, ..., C_k. A transition sequence v in [[T]] is a shuffle of transition sequences in [[C_1]], ..., [[C_k]], each of length at most n. The finiteness property for [[T]] follows from the fact that there are only finitely many possible ways of decomposing v as a shuffle.
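The counting step behind this finiteness argument can be illustrated by a short sketch. The Python fragment below is an assumption-laden illustration, not part of the formal development: the name `decompositions` is hypothetical, a transition sequence is modelled as an arbitrary finite Python sequence, and a decomposition is obtained by assigning each position of v to one of k components. There are at most k^n such assignments for a sequence of length n, hence only finitely many decompositions.

```python
from itertools import product


def decompositions(v, k):
    """Yield every way of splitting the sequence v into k subsequences.

    Each position of v is assigned an owner in range(k); the j-th component
    collects, in order, the items owned by j. By construction, v is one of
    the shuffles of the k components that are produced.
    """
    for owners in product(range(k), repeat=len(v)):
        parts = [[] for _ in range(k)]
        for item, owner in zip(v, owners):
            parts[owner].append(item)
        yield tuple(tuple(p) for p in parts)


# A sequence of length 3 split between 2 components has 2**3 candidate decompositions.
all_splits = list(decompositions("abc", 2))
assert len(all_splits) == 2 ** 3
```

Only the decompositions whose components actually lie in the respective sets [[C_1]], ..., [[C_k]] matter for the argument; the sketch merely bounds how many candidates there can be.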
Proof. Letting P = [[C]] and Q = [[D]], we assume that P ⊈ Q and prove that there exists C such that [[C]](P) c ⊈ [[C]](Q) c. For this, choose a sequence w in P but not in Q. If w = w c, then we can take C to be [ ]. Therefore, for the rest of the proof, we consider the case w ≠ w c.
Proc ) P (I) = ⋃ u∈P, v∈I u ⊳ v ∪ {u − | u ∈ P, u = ε}
If I is not empty we have: (yield to Proc ) P (I) = ⋃ u∈P, v∈I u ⊳ v
With these additional operations, Q-Proc is a model of L Proc.
yield to AProc ) P (Q)
P ⊲ AProc Q = (async AProc ) P (Q)
One then has that Q-Proc forms a dendriform AProc-module, setting:
P ⊳ Q-Proc I = (yield to Proc ) P (I)
P ⊲ Q-Proc I = (async Proc ) P (I)
Models of L Res in ωCpo correspond to models of L S in ωSL together with a morphism d′ : L ⊥ → L, where L is the carrier of the model. (Such morphisms are equivalent to ω-continuous maps on L which preserve binary sups, but not necessarily ⊥.) The carrier L of the model of L Res is that of the model of L S in ωSL; it is necessarily an ω-cpo with all finite lubs. The L S operations on L become those of the model of L S in ωSL, and the map d : L → L extends uniquely to a morphism on L ⊥, obtaining the required map d′. This correspondence extends straightforwardly to an equivalence of categories. So, as I ω (Q) is the free ω-cpo with finite sups over the ω-cpo I ↑ ω (Q), we seek the free structure (L, (update l,n