RPO, Second-order Contexts, and Lambda-calculus

First, we extend the Leifer-Milner RPO theory, by giving general conditions to obtain IPO labelled transition systems (and bisimilarities) with a reduced set of transitions, and possibly finitely branching. Moreover, we study the weak variant of the Leifer-Milner theory, by giving general conditions under which the weak bisimilarity is a congruence. Then, we apply this extended RPO technique to the lambda-calculus, endowed with lazy and call by value reduction strategies. We show that, contrary to process calculi, one can deal directly with the lambda-calculus syntax and apply the Leifer-Milner technique to a category of contexts, provided that we work in the framework of weak bisimilarities. However, even in the case of the transition system with minimal contexts, the resulting bisimilarity is infinitely branching, due to the fact that, in standard context categories, parametric rules such as the beta-rule can be represented only by infinitely many ground rules. To overcome this problem, we introduce the general notion of second-order context category. We show that, by carrying out the RPO construction in this setting, the lazy observational equivalence can be captured as a weak bisimilarity equivalence on a finitely branching transition system. This result is achieved by considering an encoding of lambda-calculus in Combinatory Logic.


Introduction
Recently, much attention has been devoted to deriving labeled transition systems and bisimilarity congruences from reactive systems, in the context of process languages and graph rewriting [Sew02, LM00, SS03, GM05, BGK06, BKM06, EK06]. In the theory of process algebras, the operational semantics of CCS was originally given via a labeled transition system (lts), while more recent process calculi have been presented via reactive systems plus structural rules. Reactive systems naturally induce behavioral equivalences which are congruences w.r.t. contexts, while lts's naturally induce bisimilarity equivalences with coinductive characterizations. However, such equivalences are not congruences in general, or else it is a heavy, ad-hoc task to prove that they are congruences.
Generalizing [Sew02], Leifer and Milner [LM00] presented a general categorical method for deriving a transition system from a reactive system, in such a way that the induced bisimilarity is a congruence. The labels in Leifer-Milner's transition system are those contexts which are minimal for a given reaction to fire. Minimal contexts are identified via the categorical notion of relative pushout (RPO). Leifer-Milner's central result guarantees that, under a suitable categorical condition, the induced bisimilarity is a congruence w.r.t. all contexts.
In the literature, some case studies have been carried out, especially in the setting of process calculi, for testing the expressivity of Leifer-Milner's approach. Some difficulties have arisen in applying the approach directly to such languages, viewed as Lawvere theories, because of structural rules. To overcome this problem, two different approaches have been considered. The first approach consists in using more complex categorical constructions, where structural rules are accounted for explicitly [Lei01, SS03, SS05]. In the second approach, intermediate encodings have been considered in graph theory, for which the approach of "borrowed contexts" has been developed [EK06], and in Milner's bigraph theory.
Here structural rules are avoided, since structurally equivalent terms are equated in the target language.
Moreover, the following further issues have arisen in applying Leifer-Milner's technique.
(i) Leifer-Milner's bisimilarity is still redundant, and many labels have to be eliminated a posteriori, by ad-hoc reasoning. Thus general results are called for, in order to reduce the complexity of the bisimilarity a priori.
(ii) In some cases it is useful to consider weak variants of Leifer-Milner's technique. However, for the weak bisimilarity we only have a partial congruence result, stating that such bisimilarity is a congruence w.r.t. a certain class of contexts. Nevertheless, in many concrete cases, the weak bisimilarity turns out to be a full congruence. Thus it will be useful to study general conditions under which this happens.
(iii) When Leifer-Milner's technique is applied in the standard setting of term and context categories (Lawvere theories), the rules in the rewriting system cannot be represented parametrically, but only at a ground level through an (infinite) series of possible instantiations. As a consequence, the bisimilarity turns out to be infinitely branching. In [KSS05], a generalization of Leifer-Milner's technique for dealing with parametric rules has been introduced. This approach is rather complex and not completely satisfactory. An alternative approach (which is considered in the present paper) consists in studying second-order versions of term and context categories, which allow parametric representations of rewriting rules, and carrying out Leifer-Milner's technique in this setting.
In this paper, we address all the above issues. In particular, in the first part of the paper, we extend Leifer-Milner's theory, by providing general results for reducing the complexity of the bisimilarity, and by studying conditions under which the weak bisimilarity is a full congruence. Then, we focus on the prototypical example of reactive system given by the λ-calculus, endowed with lazy and call by value (cbv) reduction strategies. We show that, in principle, contrary to most of the case studies considered in the literature, one could deal directly with the λ-calculus syntax and apply Leifer-Milner's technique to the category of term contexts induced by the λ-terms, provided that we work in the setting of weak bisimilarities. Applying our general results, we get quite economical weak bisimilarities which are congruences and we recover exactly both lazy and cbv contextual equivalences. As a by-product, we also get an alternative proof of the Context Lemma for the lazy case. However, the bisimilarities that we obtain are still infinitely branching. This is mainly due to the fact that, in the category of contexts, the β-rule cannot be described parametrically, but it needs to be described extensionally using an infinite set of pairs of ground terms. In order to overcome this problem, we consider the combinatory logic and we introduce the general notion of category of second-order term contexts, which provides a solution to the third issue above. Our main result amounts to the fact that, by carrying out Leifer-Milner's construction in this setting, the lazy contextual equivalence can be captured as a weak bisimilarity equivalence on a (finitely branching) transition system, while for the cbv case, the finitely branching transition system induces a bisimilarity which is strictly included in the contextual equivalence. Technically, these results are achieved by considering an encoding of the lazy (cbv) λ-calculus in KS Combinatory Logic (CL), endowed with a lazy (cbv) reduction strategy, and by showing that the lazy (cbv) contextual equivalence on λ-calculus can be recovered as a lazy (cbv) equivalence on CL. It is necessary to consider such an encoding, since the approach of second-order context categories proposed in this paper works for reaction rules which are "local", that is, the reaction does not act on the whole term, but only locally. But the substitution operation on λ-calculus is not local.
Finally, the correspondence results obtained in this paper about the observational equivalences on λ-calculus and CL are interesting per se and, although natural and ultimately elementary, had not appeared previously in the literature.
Summary. In Section 2, we summarize the theory of reactive systems of [LM00]. In Section 3, we extend such theory with new general results about weak bisimilarity, and about the "pruning" of Leifer-Milner's lts and the induced bisimilarity. In Section 4, we present the λ-calculus together with lazy and cbv reduction strategies and observational equivalences, and we discuss the RPO approach applied to the λ-calculus endowed with a structure of context category. In Section 5, we focus on Combinatory Logic (CL), we show how to recover on CL the lazy and cbv strategies and observational equivalences, and we discuss the RPO approach applied to CL, viewed as a context category. In Section 6, we introduce the notion of second-order context category, and we apply the RPO approach to CL viewed as a second-order rewriting system, thus obtaining a characterization of the lazy observational equivalence as a weak bisimilarity on a finitely branching lts. Final remarks and directions for future work appear in Section 7.
The present paper extends [DHL08]. The main new contribution of the present paper is the extension of Leifer-Milner's theory, which appears in Section 3. This allows us to deal with the λ-calculus in the subsequent sections in a smoother way, to get stronger results about the lts and the induced bisimilarity, both for the lazy and for the cbv case, and also to provide an alternative proof of the Context Lemma in the lazy case.
Acknowledgments. The authors thank the referees for many useful comments, which helped in greatly improving the paper.

The Theory of Reactive Systems
In this section, we summarize the theory of reactive systems proposed in [LM00] to derive lts's and bisimulation congruences from a given reduction semantics. Moreover, we discuss weak variants of Leifer-Milner's bisimilarity equivalence.
The theory of [LM00] is based on a categorical formulation of the notion of reactive system, whereby contexts are modeled as arrows of a category, terms are arrows having as domain 0 (a special object which denotes no holes), and reaction rules are pairs of terms.
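Definition 2.1 (Reactive System). A reactive system C consists of:
• a category C;
• a distinguished object 0 of C;
• a composition-reflecting subcategory D of reactive contexts;
• a set of pairs $R \subseteq \bigcup_{I \in C} C(0, I) \times C(0, I)$ of reaction rules.
The reactive contexts are those in which a reaction can occur. By composition-reflecting we mean that $dd' \in D$ implies $d, d' \in D$.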
Reactive systems on term languages can be viewed as a special case of reactive systems in the sense of Leifer-Milner by instantiating C as a suitable category of terms and contexts, also called the (free) Lawvere category [LM00]. In this view, we often call terms the arrows with domain 0, and contexts the other arrows.
From the set of reaction rules one generates the reaction relation by closing them under all reactive contexts:
Definition 2.2 (Reaction Relation). Given a reactive system with reactive contexts D and reaction rules R, the reaction relation → is defined by: t → u iff t = dl, u = dr for some d ∈ D and $\langle l, r \rangle \in R$.
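As a minimal illustration of Definition 2.2, the following Python sketch closes a finite set of ground rules under a finite set of reactive contexts. It is only a toy under stated assumptions: terms are plain strings, a unary context is a string with one hole written `_`, and the names `plug` and `reaction_relation` are hypothetical helpers, not notions from [LM00].

```python
from typing import Iterable, Set, Tuple

HOLE = "_"  # hypothetical marker for the single hole of a context


def plug(context: str, term: str) -> str:
    """Compose a unary context d with a term l, yielding the term dl."""
    assert context.count(HOLE) == 1, "contexts are assumed to be unary"
    return context.replace(HOLE, term)


def reaction_relation(reactive_contexts: Iterable[str],
                      rules: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """All pairs (dl, dr) with d a reactive context and (l, r) a reaction rule."""
    contexts = list(reactive_contexts)
    return {(plug(d, l), plug(d, r)) for d in contexts for (l, r) in rules}


# Toy instance: one ground rule and two reactive contexts
# (the identity context "_" and the applicative context "(_ c)").
rules = [("(a b)", "b")]
contexts = ["_", "(_ c)"]
steps = reaction_relation(contexts, rules)
```

With these toy inputs, `steps` contains the rule itself (closed under the identity context) and its instance under `(_ c)`.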
The behavior of a reactive system is expressed as an unlabeled transition system. On the other hand, many useful behavioral equivalences are only defined for lts's. The passage from reactive systems to lts's is obtained as follows.
Definition 2.3 (Context Labeled Transition System). Given a reactive system C, the associated context lts is defined as follows:
• states: arrows t : 0 → I in C, for any I;
• transitions: $t \xrightarrow{c}_C u$ iff c ∈ C and ct → u (i.e., ct and u are in the reaction relation).
In the case of a reactive system defined on a category of contexts, a state is a term t, and an associated label is a context c such that ct reduces. In the following, we will consider also lts's obtained by reducing the set of transitions of the context lts. In the sequel, we will use the word lts to refer to any such lts obtained from a context lts.
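The context lts of Definition 2.3 can be sketched in the same toy style. The code below is an illustration only: it assumes string terms, unary contexts with hole `_`, a finite set of candidate contexts, and a reaction relation given extensionally as a set of pairs; all names are hypothetical.

```python
from typing import Iterable, Set, Tuple

HOLE = "_"  # hypothetical marker for the hole of a unary context


def plug(context: str, term: str) -> str:
    """Insert `term` into the hole of `context`, i.e. compute ct."""
    return context.replace(HOLE, term, 1)


def context_transitions(term: str,
                        candidate_contexts: Iterable[str],
                        reaction: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Pairs (c, u) such that the term has a transition labelled c to u,
    that is, ct -> u holds in the given reaction relation."""
    return {(c, u)
            for c in candidate_contexts
            for (source, u) in reaction
            if plug(c, term) == source}


# Toy reaction relation: the rule (a b) -> b, also closed under "(_ c)".
reaction = {("(a b)", "b"), ("((a b) c)", "(b c)")}
labels = context_transitions("a", ["_", "(_ b)", "((_ b) c)"], reaction)
```

In this toy instance the term `a` has no transition under the identity context, while both `(_ b)` and the larger context `((_ b) c)` label a transition.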
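(i) A symmetric relation $R \subseteq \bigcup_{I \in C} C(0, I) \times C(0, I)$ on the states of the lts is a bisimulation if, whenever a R b and $a \xrightarrow{f} a'$, there exists b′ such that $b \xrightarrow{f} b'$ and a′ R b′.
(ii) We call bisimilarity the largest bisimulation.
(iii) The bisimilarity on the context lts is called context bisimilarity $\sim_C$.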
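On a finite lts, the largest bisimulation can be computed by starting from the full relation and discarding pairs that violate the transfer condition. The following Python sketch shows this naive refinement; it assumes the standard transfer condition for bisimulations and an lts given as a hypothetical dictionary `step` from states to sets of `(label, target)` pairs. It is not an algorithm taken from the paper, whose lts's are in general infinite.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

State = Hashable
Lts = Dict[State, Set[Tuple[str, State]]]


def bisimilarity(states: Iterable[State], step: Lts) -> Set[Tuple[State, State]]:
    """Greatest bisimulation on a finite lts, by iterated removal of bad pairs."""
    all_states = list(states)
    relation = {(p, q) for p in all_states for q in all_states}

    def matches(p: State, q: State) -> bool:
        # Every move of p must be answered by an equally labelled move of q
        # whose target is still related to the target of p.
        return all(
            any(lq == lp and (p2, q2) in relation for (lq, q2) in step.get(q, ()))
            for (lp, p2) in step.get(p, ())
        )

    changed = True
    while changed:
        changed = False
        for (p, q) in list(relation):
            if not (matches(p, q) and matches(q, p)):
                relation.discard((p, q))
                changed = True
    return relation


# Toy lts: p and q both perform "a" and stop, r performs "b" and stops.
toy = {"p": {("a", "p1")}, "q": {("a", "q1")}, "r": {("b", "r1")}}
result = bisimilarity(["p", "q", "r", "p1", "q1", "r1"], toy)
```

The refinement is quadratic in the number of pairs per pass; partition-refinement algorithms are faster, but the naive version follows the coinductive definition more directly.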
It is easy to check that the context bisimilarity is a congruence w.r.t. all contexts, i.e., if $a \sim_C b$, then for any context c, $ca \sim_C cb$. However, intuitively only those contexts which contain the minimal amount of information for a reaction to fire are relevant, while the others are redundant. Moreover, often context bisimilarity gives an equivalence which is too coarse, as we will see also in this paper. Thus, in [LM00], the authors proposed a categorical criterion for identifying the "smallest context allowing a reaction". They defined relative pushouts (RPOs), of which idem relative pushouts (IPOs) are a special case. One can define a lts using IPOs. Leifer-Milner's central result consists in showing that, under a suitable categorical condition, such lts is well-behaved, in the sense that the induced bisimilarity is a congruence. The following is a fundamental lemma stating a property of IPO squares.
(i) If the two squares of Fig. 2(ii) are IPOs so is the outer rectangle.
(ii) If the outer rectangle and the left square of Fig. 2(ii) are IPOs so is the right square.
From the above lemma Leifer and Milner derived their central result:
Theorem 2.9 ([LM00]). Let C be a reactive system having redex RPOs. Then the IPO bisimilarity $\sim_I$ is a congruence w.r.t. all contexts, i.e., if $a \sim_I b$ then for all c of the appropriate type, $ca \sim_I cb$.
2.1. Weak Bisimilarity. For dealing with the λ-calculus, it will be useful to consider the weak versions of the context and IPO lts's defined above, together with the corresponding notions of weak bisimilarities.
One can proceed in general, by defining a weak lts from a given lts:
Definition 2.10 (Weak lts and Bisimilarity). Let $\xrightarrow{\alpha}$ be a lts, and let τ be a label (identifying an unobservable action).
(i) We define the weak lts $\stackrel{\alpha}{\Longrightarrow}$ by
The following lemma provides a coinduction "up-to" principle, which will be useful in the sequel:
Lemma 2.12. Let $\xrightarrow{\alpha}$ be a lts and let $\stackrel{\alpha}{\Longrightarrow}$ be the corresponding weak lts. The induced weak bisimilarity is the greatest symmetric relation R s.t.:
Proof. Let us call "bisimulation up-to" a relation R as in the statement of the lemma. In order to prove the claim, it is sufficient to prove that, if R is a bisimulation up-to, then $R^*$ is a bisimulation. Let R be a bisimulation up-to. First, one can easily check that (aR and, by what we have proved before,
For dealing with the λ-calculus, we will consider a notion of weak IPO bisimilarity, where the identity context is unobservable. Such notions of weak IPO bisimilarities are not congruences w.r.t. all contexts, in general; however, as observed in [LM00] (end of Section 5), they are congruences at least w.r.t. reactive contexts:
Theorem 2.13. Let C be a reactive system having redex RPOs. Then the weak IPO bisimilarity $\approx_I$, where the identity context is unobservable, is a congruence w.r.t. reactive contexts.
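To make the passage from a lts to its weak version concrete, the following Python sketch saturates a finite lts with respect to an unobservable label. It assumes the usual construction, in which a weak τ-move is any finite sequence of τ-steps and a weak α-move is an α-step preceded and followed by such sequences; the clause actually used in Definition 2.10 may be phrased differently. The label name `TAU` and the dictionary encoding of the lts are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable
Lts = Dict[State, Set[Tuple[str, State]]]

TAU = "tau"  # hypothetical name for the unobservable label


def tau_closure(state: State, step: Lts) -> Set[State]:
    """States reachable from `state` by zero or more tau steps."""
    seen = {state}
    stack = [state]
    while stack:
        current = stack.pop()
        for (label, target) in step.get(current, ()):
            if label == TAU and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def weak_transitions(state: State, step: Lts) -> Set[Tuple[str, State]]:
    """Weak moves of `state`: tau* for the label tau, tau* a tau* otherwise."""
    result: Set[Tuple[str, State]] = set()
    for middle in tau_closure(state, step):
        result.add((TAU, middle))
        for (label, target) in step.get(middle, ()):
            if label != TAU:
                for final in tau_closure(target, step):
                    result.add((label, final))
    return result


# Toy lts: p --tau--> q --a--> r --tau--> s
toy = {"p": {(TAU, "q")}, "q": {("a", "r")}, "r": {(TAU, "s")}}
weak = weak_transitions("p", toy)
```

A weak bisimilarity is then an ordinary bisimilarity computed on the saturated lts, which is how the weak IPO bisimilarity with unobservable identity context can be read.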

Extending the Theory of Reactive Systems
In this section, we present some original results concerning the lts obtained by the RPO construction. These results concern two issues:
• Weak bisimilarity: Since in the λ-calculus the weak bisimilarity is the equivalence to be used, we present some general conditions ensuring that the weak bisimilarity, on the lts obtained by an IPO construction, is a congruence w.r.t. all contexts.
• Pruning the lts tree: In order to obtain a feasible lts, i.e., a lts with a reduced set of transitions, possibly finitely branching, it is often necessary to prune the lts obtained by an IPO construction. We present some general conditions that allow us to prune the IPO lts without modifying the induced (weak) bisimilarity.
We present our results in two different versions. The first one is quite simple, but it does not apply to our particular case, so we present a second version that is more involved but suits our needs. We choose to present the simple first version of our results as an introduction to the second one, and also because it can have applications in modeling languages different from the λ-calculus.
Some preliminary definitions are necessary.
Definition 3.1. Given a lts obtained by the IPO construction:
• Given a set of labels L, the L-restricted IPO lts is the lts obtained by removing from the IPO lts all transitions not labeled by elements in L. We denote by $\approx_L$ the weak bisimilarity induced by the L-restricted IPO lts.
• We denote by R the set of labels that are reactive contexts. We denote by $\approx_R$ the weak bisimilarity induced by the R-restricted IPO lts.
• In a reactive system, we say that the family of IPO transitions with label $f : I_0 \to I_1$ is definable by contexts if there exists a list of contexts $e_1, \ldots, e_h : I_0 \to I_1$ such that, for all $t : 0 \to I_0$, we have that: ∀i.
Intuitively, a family of IPO transitions with label $f : I_0 \to I_1$ is definable by contexts if f is an IPO for any arrow $t : 0 \to I_0$ and the IPO transitions on f can be described by contexts, that is, they do not modify the internal structure of the term t.
where the outermost rectangle is the IPO inducing the transition $ct \xrightarrow{f}_I t'$, namely $t' = d'dr$ with $\langle l, r \rangle$ a reaction rule, while the left square is a RPO of the redex square. By Lemma 2.8, the IPO pasting property, we have that also the right-hand square of the diagram is an IPO.
There are two cases to consider: (i) If the context $f'$ is definable by contexts, since $t \xrightarrow{f'}_I dr$, there exists a context e such that $dr = et$ and $t' = d'et$; it follows that $u \xrightarrow{f'}_I eu$. That is, there exist a reaction rule $\langle l_1, r_1 \rangle$ and a reactive context $d_1$ s.t. $eu = d_1 r_1$, and the left-hand square of the following diagram is an IPO.
Since the right-hand square is an IPO, by the IPO pasting property, Lemma 2.8, also the outermost rectangle is an IPO. It follows that cu reactive, then so is also the context $d'f'$ (composition of reactive contexts) and the context c (reactive contexts are composition-reflecting). Moreover, by the definition of bisimilarity, there exists $u_0$ such that u
For dealing with the λ-calculus, we present a second proposition that is similar in spirit to Proposition 3.2, although it is not a direct generalization. The second proposition considers both the category of unary linear term contexts and a category of "multi-holed" linear term contexts. The category of unary contexts is the most suitable for the IPO construction, while the category of multi-holed contexts is useful to represent some transitions (in the lts) through insertions of terms in suitable contexts.
The following definition formalizes the relation existing between the two categories of contexts.
• The objects of D are finite lists of objects of C different from 0.
• By identifying 0 with the empty list $\langle\rangle$, and any other object I in C with the singleton list $\langle I \rangle$, C is a full subcategory of D.
In the spirit of the previous remark we will call unary (single-holed) contexts the arrows in C (with domain different from 0), and multi-holed contexts the arrows in D. Two other definitions are necessary.
Definition 3.4. Given a reactive system C on a category C, and a category D, list extension of C: (i) we define a multi-holed context $g : \langle I_0, \ldots, I_n \rangle \to I$ IPO uniform if for any context $f : I \to J$ appearing as label in the IPO lts, there exists a list of multi-holed contexts $g_1 : \langle I_{1,0}, \ldots, I_{1,n_1} \rangle \to J, \ldots, g_h : \langle I_{h,0}, \ldots, I_{h,n_h} \rangle \to J$, and a list of functions $l_1 : \{0, \ldots, n_1\} \to \{0, \ldots, n\}, \ldots, l_h : \{0, \ldots, n_h\} \to \{0, \ldots, n\}$ such that, for any n-tuple of C terms $t_0 : 0 \to I_0, \ldots, t_n : 0 \to I_n$, we have that:
Intuitively, a context g is IPO uniform if the behavior w.r.t. the IPO reaction of the term $g(t_{l_i(0)} \otimes \ldots \otimes t_{l_i(n_i)})$ does not depend on the terms $t_{l_i(0)}, \ldots, t_{l_i(n_i)}$. We remark that the notion of "uniform" is not a generalization of the notion of "definable by contexts".
Proposition 3.5.Let C be a reactive system having redex RPOs.
(i) The weak IPO bisimilarity $\approx_I$ (with the identity IPO context unobservable) is a congruence if there exists a category D, list extension of C, such that any (multi-holed) context $g : \langle I_0, \ldots, I_n \rangle \to I$ is either IPO uniform or it has a reactive index (or both).
(ii) Moreover, if the reaction relation is deterministic, i.e., any term can react in at most one possible way, then the relation $\approx_I$ coincides with $\approx_R$.
Proof. Here we present only the proof of point (ii). The proof of point (i) is almost identical and can be derived from the present proof by substituting the relation $\approx_R$ with $\approx_I$, and by simplifying some steps. By repeating the same arguments used at the beginning of the proof of Proposition 3.2, it is sufficient to prove that the relation is contained in the weak bisimilarity. By Lemma 2.12, it is sufficient to show that for any The proof is by double induction on the number of steps of the transition $g(t_0 \otimes \ldots \otimes t_n) \stackrel{f}{\Longrightarrow}_I t$, and on the number n of holes in the list context g. The basic case is when There are two cases to consider:
(i) The context g is IPO-uniform: in this case there exists a context $e : \langle I'_0, \ldots, I'_{n'} \rangle \to J_1$ and a function $l : \{0, \ldots, n'\} \to \{0, \ldots, n\}$ such that $t' = e(t_{l(0)} \otimes \ldots \otimes t_{l(n')})$ and . By application of the inductive hypothesis, on a smaller number of transition steps, there exists u s.t. $e(u_{l(0)} \otimes \ldots \otimes u_{l(n')}) \stackrel{f''}{\Longrightarrow}_I u$ with $t\,S^*\,u$, and from which the claim follows.
(ii) The context g has a reactive index i; for the sake of simplicity, assume i = 0. Consider the arrow , by inductive hypothesis, on the number of holes in the multi-holed contexts, there exists u such that g To obtain the claim, it remains to prove that there exists More generally we prove that for any reactive context $g_o : J_0 \to J_1$, any IPO context $f : J_1 \to J_2$, and any pair of terms The proof is by induction on the number of steps in the transition $g''(t_0)$ The basic case is when the reaction is of zero steps; in this case there is nothing to prove.
For the inductive case consider the following diagram of IPO squares defining the first reaction in the chain We need to consider two cases. The first one is where $f'$ is a reactive context ($f' \in \{f, Id\}$). Since reactive contexts are composition-reflecting, then also the IPO context $f''$ is reactive. By the definition of bisimilarity, $u_o \stackrel{f''}{\Longrightarrow}_I u_i$ with $u_i \approx_R dr$. By reactivity of $g_o$, using suitable IPO pasting diagrams, we can prove $g_o(u_o)$ o , we obtain the claim. The second case is where $f'$ is a non-reactive context ($f' = f$). Since reactive contexts are composition-reflecting, then also the IPO context $f''$ is non-reactive and therefore, by hypothesis, IPO uniform. Notice that the context Id is an IPO context for the term $f''(t_o)$; by the IPO uniformity of $f''$, Id is an IPO context also for $f''(u_o)$ and there exists a list context g Notice that, if the reduction relation is deterministic, two terms that reduce one to the other via τ transitions are weakly bisimilar. It follows that , from which we derive the claim.
Remark 3.6. Propositions 3.2 and 3.5 above, about congruence of the weak IPO bisimilarity, are more closely related than they appear at first glance. On the one hand, by exploiting the fact that the composition of a non-reactive context with any context gives a non-reactive context, one can show that, if the non-reactive IPOs are definable by contexts, then any non-reactive context is IPO-uniform. Note that the condition of "definability by contexts" is in general simpler to verify than that of "IPO-uniformity", and so we prefer to present the given formulation of Proposition 3.2. On the other hand, it would be possible to extend the notion of "definability by contexts" to the case of list extension categories; however, to this aim it would be necessary to present a series of new definitions, needed to lift the IPO construction to the list extension categories. For the sake of simplicity, we prefer to avoid the introduction of these further notions.

The Lambda Calculus
First, we recall the λ-calculus syntax together with lazy and cbv reduction strategies and observational equivalences. Then, we show how to apply the RPO technique to λ-calculus, viewed as a context category, and we discuss some problematic issues. As usual, λ-terms are taken up-to α-conversion, and application associates to the left. We consider the standard notions of β-rule and $\beta_V$-rule: Definition 4.2.
A reduction strategy on the λ-calculus determines, for each term which is not a value, a suitable β-redex appearing in it to be contracted. The lazy and cbv reduction strategies are defined on closed λ-terms as follows: (i) The lazy strategy $\to_l \subseteq \Lambda^0 \times \Lambda^0$ reduces the leftmost β-redex, not appearing within a λ-abstraction. Formally, $\to_l$ is defined by the rules: (ii) The call by value strategy $\to_v \subseteq \Lambda^0 \times \Lambda^0$ reduces the leftmost $\beta_V$-redex, not appearing within a λ-abstraction. Formally, $\to_v$ is defined by the following rules: where V is a closed value, i.e., a λ-abstraction.
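The two strategies can be sketched as one-step reduction functions on closed terms. The Python code below is a hedged illustration rather than the paper's formal rules: it assumes the usual reading of "leftmost redex not under a λ-abstraction" (for cbv, the function part is evaluated before the argument), and it uses a hypothetical tuple encoding of terms, `("var", x)`, `("lam", x, body)` and `("app", f, a)`.

```python
from typing import Optional, Tuple

Term = Tuple  # ("var", name) | ("lam", name, body) | ("app", fun, arg)


def subst(term: Term, name: str, value: Term) -> Term:
    """Compute term[value/name]; `value` is assumed closed, so no capture occurs."""
    tag = term[0]
    if tag == "var":
        return value if term[1] == name else term
    if tag == "lam":
        # An inner binder with the same name shadows `name`.
        if term[1] == name:
            return term
        return ("lam", term[1], subst(term[2], name, value))
    return ("app", subst(term[1], name, value), subst(term[2], name, value))


def is_value(term: Term) -> bool:
    """Values of closed terms are the lambda-abstractions, for both strategies."""
    return term[0] == "lam"


def step_lazy(term: Term) -> Optional[Term]:
    """One lazy step, or None if the term is a value."""
    if term[0] != "app":
        return None
    fun, arg = term[1], term[2]
    if fun[0] == "lam":
        return subst(fun[2], fun[1], arg)       # contract the head redex
    reduced = step_lazy(fun)
    return None if reduced is None else ("app", reduced, arg)


def step_cbv(term: Term) -> Optional[Term]:
    """One call-by-value step, or None if the term is a value."""
    if term[0] != "app":
        return None
    fun, arg = term[1], term[2]
    if not is_value(fun):
        reduced = step_cbv(fun)
        return None if reduced is None else ("app", reduced, arg)
    if not is_value(arg):
        reduced = step_cbv(arg)
        return None if reduced is None else ("app", fun, reduced)
    return subst(fun[2], fun[1], arg)           # argument is already a value


# (λx.λy.x) ((λz.z) (λw.w)): the strategies choose different redexes.
I = ("lam", "z", ("var", "z"))
J = ("lam", "w", ("var", "w"))
K = ("lam", "x", ("lam", "y", ("var", "x")))
example = ("app", K, ("app", I, J))
```

On `example`, the lazy step substitutes the unevaluated argument, while the cbv step first reduces the argument to a value.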
We denote by $\to^*_\sigma$ the reflexive and transitive closure of a strategy $\to_\sigma$, for σ ∈ {l, v}, by $Val_\sigma$ the set of values, i.e., the set of terms on which the reduction strategy halts (which coincides with the set of λ-abstractions in both cases), and by $M \Downarrow_\sigma$ the fact that there exists $V \in Val_\sigma$ such that $M \to^*_\sigma V$. As we will see in Section 4.2 below, each strategy defines a (deterministic) reactive system on λ-terms in the sense of Definition 2.1. To this aim, it is useful to notice that the above reduction strategies can be alternatively determined by specifying suitable sets of reactive contexts (see Remark 4.5 below), which are subsets of the following unary contexts, i.e., contexts with a single hole:
Definition 4.4 (Unary Contexts). Let P ∈ Λ. The unary contexts are: The closed unary contexts are the unary contexts with no free variables.
Each strategy induces an observational (contextual) equivalence à la Morris on closed terms, when we consider programs as black boxes and only observe their "halting properties".
Definition 4.6 (σ-observational Equivalence). Let $\to_\sigma$ be a reduction strategy and let $M, N \in \Lambda^0$. The observational equivalence $\approx_\sigma$ is defined by The definition of $\approx_\sigma$ can be extended to open terms by considering closing (by-value) substitutions, i.e., for M, N ∈ Λ s.t. $FV(M, N) \subseteq \{x_1, \ldots, x_n\}$, we define:
Remark 4.7. Often in the literature, the observational equivalence is defined by considering multi-holed contexts. However, it is easy to see that the two notions of observational equivalence, obtained by considering just unary or all multi-holed contexts, coincide.
The problem of reducing the set of contexts in which we need to check the behavior of two terms has been widely studied in the literature. In particular, for both strategies in Definition 4.3 above, a Context Lemma holds, which allows us to restrict ourselves to applicative contexts of the shape $[\ ]\vec{P}$ ($[\ ]\vec{V}$), where $\vec{P}$ ($\vec{V}$) denotes a list of closed terms (values). Let us denote by $\approx^{app}_\sigma$ the observational equivalence which checks the behavior of terms only in applicative (by-value) contexts. This admits a coinductive characterization as follows: − an applicative lazy bisimulation if the following holds: − an applicative cbv bisimulation if the following holds: (ii) The applicative equivalence $\approx^{app}_\sigma$ is the largest applicative bisimulation.
The following is a well-known result [AO93, EHR92]: By the Context Lemma, the class of contexts in which we have to check the behavior of terms is smaller; however, it is still infinite, thus the applicative bisimilarity is infinitely branching. In the following, we will study alternative coinductive characterizations of the observational equivalences, arising from the application of Leifer-Milner's technique.
4.2. Lambda Calculus as a Reactive System. Both lazy and cbv λ-calculus can be endowed with a structure of reactive system in the sense of Definition 2.1, by considering corresponding context categories.
Definition 4.10 (Lazy, cbv λ-reactive Systems). $C^\lambda_\sigma$, for σ ∈ {l, v}, consists of
• the category whose objects are 0, 1, where the morphisms from 0 to 1 are the closed terms (up-to α-equivalence), the morphisms from 1 to 1 are the unary closed contexts (up-to α-equivalence), and composition is context insertion;
• the subcategory of reactive contexts is determined by the reactive contexts for the lazy and cbv strategy, respectively, presented in Remark 4.5;
• the (infinitely many) reaction rules are $(\lambda x.M)N \to_{\beta_\sigma} M[N/x]$, for all M, N, where
The above definition is well-posed; in particular, the subcategory of reactive contexts is composition-reflecting. One can easily check that the reactive system $C^\lambda_\sigma$ has redex RPOs; this fact can be proved by rephrasing the corresponding proof for the category of term contexts of [Sew02].
Here the fact that we consider only closed terms and closed contexts is essential.
The IPO contexts of a closed term for the lazy and cbv reactive systems are summarized in the second columns of the tables in Fig. 3. Intuitively, such contexts are minimal for the given reduction to fire.Vice versa, contexts different from the ones above are not IPO; e.g.
where R is not a value and C 1 [M ] is a value.
that the weak context bisimilarity, where the identity context [ ] is unobservable, equates all closed terms.The appropriate notion is that of weak IPO bisimilarity, which, as we will see, turns out to capture exactly the lazy and cbv equivalences.
It is interesting to observe that also the observational equivalence and the applicative bisimilarity can be characterized as weak bisimilarities on suitable context lts's.In fact it is easy to prove that the observational equivalence ≈ σ coincides with the weak bisimilarity on a restriction of the context lts built on C λ σ , defined by M Similarly, the applicative equivalence can be characterized by considering only applicative contexts in the lts.
In the following we will show that all these lts's induce the same notion of equivalence.Moreover, using the results of Section 3, we will show that the set of IPO contexts in the weak IPO bisimilarity to be considered can be significantly simplified.Then, from the fact that the weak IPO lts is the smallest of the ones above, it follows that it induces the simplest proofs that two terms are bisimilar.Now, let us denote by ≈ σI , for σ ∈ {l, v}, the lazy/cbv weak IPO bisimilarity, where the identity context is unobservable.In order to prove that ≈ σI is a congruence w.r.t.all contexts, we need to consider the category D λ σ , list extension of C λ σ , where the objects are finite lists 1, . . ., 1 , and an arrow 1, . . ., m is a m-tuple of possibly closed multi-holed contexts C 1 , . . ., C m with n holes all together.Multi-holed contexts are defined by Then, in the lazy case one can show that any closed multi-holed context either is IPO uniform or it is of the shape then clearly the first hole is reactive.Otherwise, it is of the shape In the first case, the reduction (if any) involves only P or at most P C 1 [ ], where C 1 [ ] together with the term put in the holes, plays only a passive role as argument.In the latter case, since the term put in the holes is closed, again it will be not affected by the substitution induced by the reduction.Similarly, for the cbv case, all the multi-holed contexts are IPO uniform, apart from the contexts ranging on the following grammar, which have a reactive hole: where C is a closed multi-holed context.Moreover, the reduction relation is obviously deterministic.Thus, by applying Proposition 3.5, we have: Corollary 4.12.
(i) For all M, N ∈ Λ 0 , for any closed unary context C[ ], (ii) Moreover where ≈ σR denotes the weak IPO bisimilarity where only reactive contexts are considered (see the third columns in the tables of Fig. 3).
Now, we are left to prove that the IPO bisimilarity coincides with the original observational equivalence.Notice that, in the above proposition, we also provide a new alternative proof of the Context Lemma for the lazy case.Proof.For the lazy case, we proceed by proving the following chain of inclusions: The first inclusion, ≈ l ⊆ ≈ app l , holds by definition.The third inclusion, ≈ lR ⊆ ≈ lI , follows by Corollary 4.12(ii).The others are proved as follows: ⇒ M ′ , hence also there The above argument provides a new proof of the Context Lemma.
For the cbv case, considering the applicative equivalence ≈ app v does not help, but one can prove directly:
• ≈ v ⊆ ≈ vR . One can easily check that ≈ v is a "weak IPO reactive bisimulation", using the fact that ≈ v is closed under β-reduction.
• ≈ vR ⊆ ≈ vI . Immediate by Corollary 4.12(ii).
From M ≈ vI N , by Corollary 4.12(i), we have ⇒ M ′ , hence also there
Remark 4.14. Corollary 4.12(ii) allows us to reduce the set of IPO contexts to be considered in the IPO bisimilarities. For the lazy case, only applicative contexts need to be considered (see the first table in Figure 3), while for the cbv case, the set of reactive IPO contexts is larger (see the second table in Figure 3). However, also for the cbv case, one can prove that applicative (by-value) IPO contexts are sufficient. We omit the details.
Proposition 4.13 above gives us interesting characterizations of lazy and cbv observational equivalences, in terms of lts's where the labels are significantly reduced. However, such lts's (and bisimilarities) are still infinitely branching: e.g., λx.M has an IPO transition labelled [ ]P for every P ∈ Λ 0 . This is due to the fact that the context categories underlying the reactive systems C λ l and C λ v allow only for a ground representation of the β-rule through infinitely many ground rules. In order to overcome this problem, one should look for alternative categories which allow for a parametric representation of the β-rule as (λx.X)Y → X[Y /x], where X, Y are parameters. To this aim, we introduce the category of second-order term contexts (see Section 6 below). However, as we will see, this approach works only if the reaction rules are "local", that is, they do not act on the whole term, but only locally. In particular, the operation of substitution on the λ-calculus is not local and thus it is not describable by a finite set of reaction rules. To avoid this problem, in the following section we consider encodings of the λ-calculus into Combinatory Logic (CL) endowed with suitable strategies and equivalences, which turn out to correspond to lazy and cbv equivalences.

Combinatory Logic
In this section, we focus on Combinatory Logic [HS86] with Curry's combinators K, S, and we study its relationships with the λ-calculus endowed with lazy and cbv reduction strategies. An interesting result that we prove is that we can define suitable reduction strategies on CL-terms, inducing observational equivalences which correspond to lazy and cbv equivalences on λ-calculus. As a consequence, we can safely shift our attention from the reactive system of λ-calculus to the simpler reactive system of CL. In this section, we apply the Leifer-Milner construction to CL viewed as a (standard) context category, and we study weak versions of context and IPO bisimilarities. Our main result is that we can recover lazy and cbv observational equivalences as weak IPO equivalences on CL * , a variant of standard CL. Here the approach is first-order, thus the IPO equivalences are still infinitely branching. However, the results in this section are both interesting in themselves, and useful for our subsequent investigation of Section 6, where CL is viewed as a second-order rewriting system, and a characterization of the lazy observational equivalence as a finitely branching IPO bisimilarity is given.
In [Sew02], a construction similar to the Leifer-Milner construction has been applied to the Combinatory Logic case. However, in that paper, the question of whether the weak bisimilarity on the derived lts is a congruence was left open. In this paper, using Proposition 3.5, we can answer that question positively.
Definition 5.1 (Combinatory Terms). The set of combinatory terms is defined by: (CL ∋) M ::= x | K | S | M M , where K, S are combinators. Let CL 0 denote the set of closed CL-terms.

5.1. Correspondence with the λ-calculus. Let Λ(K, S) denote the set of λ-terms built over constants K, S. The following is a well-known encoding:
Definition 5.2 (λ-encoding). Let T : Λ(K, S) → CL be the transformation defined as follows: T In particular, if we restrict the domain of T to Λ, we get an encoding of λ-terms into CL. Vice versa, there is a natural embedding of CL into the λ-calculus E : CL → Λ: The following lemma holds:
Proof. First, one can easily prove that, if M is λ-free, then ET (λx.M ) = σ λx.M (by induction on M ). Then, using the fact that T (M ) is λ-free for all M , by definition of T , one gets that T 2 (M ) = T (M ) for all M .
5.1.1. Lazy/cbv observational equivalence on CL. Usually, the set of combinatory terms is endowed with the following reaction rules: KM N → M and SM N P → (M P )(N P ). We will also consider a cbv version of the above rules, reducing CL redexes only when the arguments are values, i.e., terms on the following grammar: The cbv rules are the following:
Definition 5.4 (Lazy/cbv Reduction Strategy on CL).
(i) The lazy reduction strategy → l ⊆ CL 0 × CL 0 reduces the leftmost outermost CL-redex. Formally:
Definition 5.5 (Unary Contexts on CL). The set of unary contexts on CL is defined by Alternatively, we could define the lazy strategy → l as the closure of the standard CL-reaction rules under the following reactive contexts (which coincide with the applicative ones): Similarly, we could define the cbv strategy → v as the closure of the cbv reaction rules under the following reactive contexts: Let ↓ σ denote the convergence relation on CL, for σ ∈ {l, v}.
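The lazy strategy described above can be illustrated with a minimal Python sketch. It assumes the standard rules KMN → M and SMNP → (MP)(NP) and fires them only at the head of the application spine, i.e. under applicative contexts. The term encoding (nested tuples) and the names `app`, `spine`, `step_lazy` and `eval_lazy` are illustrative choices, not notation from the paper, and the `fuel` bound is only a practical stand-in for divergence.

```python
# Closed CL terms: "K", "S", or ("app", f, a). Encoding is an assumption.
K, S = "K", "S"

def app(*ts):
    """Left-nested application: app(a, b, c) stands for (a b) c."""
    t = ts[0]
    for u in ts[1:]:
        t = ("app", t, u)
    return t

def spine(t):
    """Split a term into its head combinator and the list of its arguments."""
    args = []
    while isinstance(t, tuple):
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def step_lazy(t):
    """One lazy step: fire the redex at the head of the spine, if any.
    Returns None when no head redex exists (the term is a lazy value)."""
    head, args = spine(t)
    if head == K and len(args) >= 2:
        m, _n, *rest = args
        return app(m, *rest)                      # K M N -> M
    if head == S and len(args) >= 3:
        m, n, p, *rest = args
        return app(app(app(m, p), app(n, p)), *rest)  # S M N P -> (M P)(N P)
    return None

def eval_lazy(t, fuel=1000):
    """Iterate step_lazy; return the value reached, or None if fuel runs out."""
    for _ in range(fuel):
        nxt = step_lazy(t)
        if nxt is None:
            return t
        t = nxt
    return None
```

For instance, `eval_lazy(app(S, K, K, K))` returns `K`, since SKK behaves as the identity, while a term such as SII(SII) with I = SKK exhausts any finite fuel.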
Definition 5.6 (Lazy/cbv Equivalence on CL). (i Now we proceed to prove Theorem 5.7 (⇒).Assuming M ≈ l N , we have to prove that, for all closing P , T (M ) . By Lemmata 5.8, 5.10, using the fact that ≃ l is a congruence, we have T (M [E( P )/ x]) ≃ l T (N [E( P )/ x]).By Lemma 5.11, T (M )[T E( P )/ x] ≃ l T (N )[T E( P )/ x], hence by Lemma 5.11, using the fact that ≃ l is a congruence, we have T (M )[ P / x] ≃ l T (N )[ P / x].
In order to prove Theorem 5.7 (⇐), assume T (M ) ≃ l T (N ). We have to prove that, for all closing P , M
5.2. The First-order Approach: CL as a Context Category. We endow CL with a structure of reactive system in the sense of [LM00], by considering the context category of closed unary contexts:
Definition 5.12 (Lazy, cbv CL Reactive Systems). C 1 σ , for σ ∈ {l, v}, consists of:
• the context category whose objects are 0, 1, where the morphisms from 0 to 1 are the closed terms, the morphisms from 1 to 1 are the closed unary contexts, and composition is context substitution;
• the subcategory of reactive contexts is determined by the reactive contexts for the lazy and cbv strategy, respectively, presented in Definition 5.4;
• the reaction rules are the standard CL reduction rules for the lazy case, and the cbv reduction rules for the cbv case.
Lemma 5.13. The reactive systems C 1 σ have redex RPOs.
One can easily check that the IPO contexts are the following.
• Lazy. The IPO contexts for a given term M are:
− [ ] P , where P has the minimal length for the top-level reaction of M to fire,
For M not a value, the following contexts are IPOs: For M value, the following contexts are IPOs: , where i is the minimum number of arguments necessary for the top-level reaction of M to fire,
− [ ]V 1 . . .V i P , where P is not a value, and i, possibly 0, is less than the minimum number of arguments necessary for the top-level reaction of M to fire,
− V C[ ]V 1 . . .V i where V and C[M ] are values and i + 1 is the minimum number of arguments necessary for the top-level reaction of V to fire, in more detail: , where V and C[M ] are values, P is not a value, and i + 1 is less than the minimum number of arguments necessary for the top-level reaction of V to fire, in more detail:
For any term M , the following contexts are IPOs: where P is not a value and C[ ] is any context.
For any of the above contexts there is a reduction rule which applies, and the context is minimal for the given reduction to fire.By case analysis, one can show that all the other contexts are not IPO contexts.
The strong versions of context and IPO bisimilarities are too fine, since, as in the λ-calculus case, they take into account reduction steps, and tell apart β-convertible terms. Thus we consider weak variants of such equivalences, where the identity context [ ] is unobservable. Weak context bisimilarity is too coarse, since it equates all terms. However, we will prove that the weak IPO bisimilarity "almost" coincides with the lazy/cbv equivalence. Moreover, we will show how to recover the exact correspondence by considering a suitable variant of CL.
First of all, let ≃ σI , for σ ∈ {l, v}, denote the lazy/cbv weak IPO bisimilarity obtained by considering the identity context as unobservable. Similarly to the case of the λ-calculus, we can define a list extension category by taking the category of multi-holed contexts. In this category all contexts with no reactive indexes are IPO uniform. In the lazy case, the contexts with a reactive index are of the shape [ ]C 1 [ ] . . .C k [ ] (with the leftmost hole being reactive), and the remaining ones have no reactive indexes and are IPO uniform. For the cbv case, one can show that the multi-holed contexts with a reactive index are given by the grammar: where C[ ] is any closed multi-holed context. Thus, by Proposition 3.5(i), we have:
Proposition 5.14. For all M, N ∈ CL 0 , for any closed unary context C[ ], M ≃ σI N =⇒ C[M ] ≃ σI C[N ].
The rest of this section is devoted to comparing the lazy/cbv weak IPO bisimilarity ≃ σI with the lazy/cbv equivalence ≃ σ on CL defined in Definition 5.6. The following lemma can be easily proved by coinduction, using Proposition 5.14.
Proof. We prove that ≃ σI is a lazy/cbv bisimulation on CL. Let M ≃ σI N . If M ↓ σ , then also N ↓ σ , since a convergent term has different IPO-transitions from a divergent term. We are left to prove that for all P , M P ≃ σI N P . But this follows from Proposition 5.14.
However, the converse inclusion ≃ σ ⊆ ≃ σI does not hold, since for instance K ≃ σ S(KK)(SKK), because, e.g. for the lazy case, for all P , S(KK)(SKK)P → * KP . But → I . The problem, which was already noticed in [Sew02], arises since the equivalence ≃ σI tells apart terms whose top-level combinators expect a different number of arguments to reduce. In order to overcome this problem, we consider an extended calculus, CL * , where the combinators K and S become unary, at the price of adding new intermediate combinators and intermediate reductions (the reactive contexts are the ones in Definition 5.12).
• Rules: Notice that the calculus in the above definition is well-defined, since the set of terms is closed under the reaction rules.One can define lazy/cbv reduction strategies on CL * as in Definition 5.4, or as the closures of the reaction rules under the following reactive contexts: Definition 5.17 (CL * Reactive Contexts).
Let ≃ * σ be the lazy/cbv equivalence defined on CL * , similarly to Definition 5.6 for CL. There is a trivial embedding of CL-terms into CL * . Moreover, one can easily check that, when restricted to terms of CL, ≃ * σ coincides with ≃ σ . Analogously to the CL case, we define the reactive system over CL * . In the context category, the unary closed contexts are defined by the grammar where M is a closed term. Notice that, under the above definition, expressions like K ′ [ ] do not represent unary closed contexts. In defining the IPO transitions, it is important to observe that C[M ] is a value iff M is a value and C[ ] is the identity context [ ]. Let us denote by ≃ * σI the weak IPO bisimilarity obtained by considering the lazy/cbv reactive system over CL * . Since CL * -terms expect at most one argument, the IPO contexts for CL * are simpler than the ones for CL, and they are summarized in Figure 4.
Similarly to the previous case, one can consider the multi-holed contexts category as a list extension category. In this category all contexts are either IPO uniform or have a reactive index. Moreover, the reduction relation is deterministic. Thus Proposition 3.5 applies and we have: where R is not a value, V is a value, C[ ] is a generic unary context. By Proposition 5.18(ii) above, the weak IPO equivalence can be significantly simplified. Namely, in the lazy case, we obtain the weak IPO bisimilarity ≃ lR , where only applicative IPO contexts are considered (see Figure 4). In the cbv case, Proposition 5.18 allows us to restrict ourselves to contexts of the shape [ ], [ ]P, V [ ] (see Figure 4). However, one can prove that also in this case we can consider only applicative by-value contexts. We skip the details of this proof.
As a consequence of Theorem 5.7 and Theorem 5.19 above, we can recover the lazy/cbv observational equivalence on λ-terms as weak IPO bisimilarity on CL * .
Proposition 5.20. For all M, N ∈ Λ 0 , M ≈ σ N ⇐⇒ T (M ) ≃ * σI T (N ).
However, such notions of weak IPO bisimilarities still suffer from the problem of being infinitely branching, since the IPO contexts are [ ], [ ]P for the lazy case, and [ ], [ ]V for the cbv case, for all P, V ∈ (CL * ) 0 . This problem will be solved in the next section, where we introduce the notion of second-order context category, and we endow CL * with such a structure.

Second-order Term Contexts
The definition of term context category [LM00] can be generalized to a definition of second-order term context category. The generalization is obtained by extending the term syntax with function (second-order) variables, that is, variables not standing for terms but instead for functions on terms. The formal definition is the following:
Definition 6.1 (Category of Second-order Term Contexts). Let Σ be a signature for a term language. The category of second-order term contexts over Σ is defined by: objects are finite lists of naturals n 1 , . . ., n k ; an arrow m 1 , . . ., m h → n 1 , . . ., n k is a k-tuple t 1 , . . ., t k , where the term t i is defined over the signature Σ ∪ {F m 1 1 , . . ., F m h h } ∪ {X i,1 , . . ., X i,n i }, where F m i i is a function variable of arity m i , and X i,j is a ground variable. The category of second-order linear term contexts is the subcategory whose arrows are n-tuples of terms that contain exactly one use of each function variable F m i i and of each ground variable X i,j . The category of second-order function-linear term contexts, T * 2 (Σ), is the subcategory whose arrows are n-tuples of terms that contain exactly one use of each function variable F m i i and in which, moreover, no function variable appears inside the argument of another function variable.
Remark. Notice that the above definition of second-order linear term contexts is different from that given in the conference version of the present paper, [DHL08]. The modification was necessary because the original definition was incorrect (second-order linear contexts were not closed under composition).
In the following we are going to use just a subcategory of the category of second-order function-linear term contexts; at this point, however, we prefer to present the original idea of second-order term contexts in its full generality.
Example 6.2. Given the signature of natural numbers {0, S, +}, examples of second-order linear contexts representing arrows in 2, 0 → 0, 2 are: Note that the last context is not function-linear. Examples of second-order function-linear contexts are: None of the above contexts is linear. Examples of second-order contexts that are neither function-linear nor linear are: Intuitively, an arrow in 2, 0 → 0, 2 represents a pair of contexts containing two holes F 2 1 , F 0 2 , where F 2 1 is a hole that must be filled by a term representing a function with two arguments, while F 0 2 is a hole that must be filled by a term representing a function with no arguments, i.e., a ground term. The first context in the pair 2, 0 → 0, 2 represents a function with no arguments, while the second context represents a function with two arguments X 2,1 , X 2,2 .
One can check that the standard category of term contexts over Σ coincides with the subcategory whose objects are the lists containing only copies of the natural number 0; in fact this subcategory uses function variables with no arguments and the ground variables do not appear.
The identity arrow on the object n 1 , . . ., n k is: In order to define composition in the categories of second-order term contexts, it is convenient to consider the λ-closure of the tuple of terms representing arrows and to define arrow composition through β-reduction.

To give an example, the composition between In other words, the composition is given by a j-tuple of expressions t i in which every function variable G l is substituted by the corresponding expression s l , with the ground variables of s l substituted by the corresponding parameters of G l in t i .
Note that the identity morphism is defined as a λ-term implementing the identity function, while composition of morphisms is defined by function composition in the λ-setting. Given this correspondence, it is easy to prove that the categorical properties for the identity hold, while the associativity of composition essentially follows from the uniqueness of the normal form.
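The description of composition given above (each function variable is replaced by the corresponding term, whose ground variables are in turn replaced by the arguments of that function variable) can be sketched in Python. The tuple encoding of terms, the use of positional indices for variables, and the names `subst_ground`, `plug` and `compose` are assumptions made for illustration; in particular, the identity arrows used in the usage note, F(X 1 , . . ., X n ), are an assumed reading of the identity, whose defining formula is not reproduced here.

```python
# Term encoding (illustrative):
#   ("var", j)            ground variable number j of the current component
#   ("fvar", l, args)     function variable number l applied to a tuple of args
#   ("op", symbol, args)  a symbol of the signature applied to a tuple of args

def subst_ground(term, args):
    """Simultaneously replace each ground variable ("var", j) by args[j]."""
    tag = term[0]
    if tag == "var":
        return args[term[1]]
    # "op" and "fvar" nodes have the same shape: rebuild with new children.
    return (tag, term[1], tuple(subst_ground(c, args) for c in term[2]))

def plug(term, s):
    """Replace each occurrence ("fvar", l, args) in term by s[l], with the
    ground variables of s[l] instantiated by the (already plugged) args."""
    tag = term[0]
    if tag == "var":
        return term
    kids = tuple(plug(c, s) for c in term[2])
    if tag == "fvar":
        return subst_ground(s[term[1]], kids)
    return ("op", term[1], kids)

def compose(t, s):
    """Composite of the arrows s (applied first) and t, both tuples of terms."""
    return tuple(plug(ti, s) for ti in t)
```

As a usage example over the signature {0, S, +}: with t = G(S(X)) and s = S(F) + Y, `compose` yields S(F) + S(X); composing with F(X) on either side leaves an arrow unchanged, which matches the identity laws mentioned in the text.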
Finally, one needs to prove that composition preserves linearity and function-linearity. As far as linearity is concerned, it is a well-known result that linear λ-terms are closed under β-reduction. From this fact one can immediately prove that second-order linear contexts are closed under composition.
Preservation of function-linearity can be proved similarly. First we generalize the notion of function-linearity to λ-terms, stating that a function-linear λ-term is a typed λ-term with constants, where
• all the variables and constants have either a ground type or a first-order function type;
• each bound function variable (e.g. F ) appears exactly once in the term, and only inside the arguments of constants (e.g. S(F (0) + 0)), or inside the arguments of λ-expressions having a second-order function type (e.g. (λG.λY.G(Y ) + Y )(λX.F (X + S(0)))). That is, no function variable appears inside the argument of an expression that has first-order function type and is not a constant (e.g. G(S(F (0)) + 0) and (λX.X + X)(F (0))).
It is straightforward to prove that function-linear λ-terms are closed under β-reduction and that, given two function-linear second-order contexts, the term whose β-normal form defines composition is a function-linear λ-term. From this the claim follows.
The main general result on second-order term contexts is the following:
Proposition 6.3. For any signature Σ, in the category of second-order (linear) (function-linear) term contexts over Σ, any commuting square having as initial vertex the empty list ε has an RPO.
Proof. First we present the proof for the special case useful in this paper, namely we consider the restricted category containing as objects the lists with at most one element. Given two arrows with domain the empty list, t 1 : ε → n 1 and t 2 : ε → n 2 , and two arrows s 1 : n 1 → m , s 2 : n 2 → m completing t 1 and t 2 into a commuting square (s 1 •t 1 = s 2 •t 2 : ε → m ), the corresponding RPO for this commuting square is inductively defined on the structures of s 1 , s 2 . There are several cases to consider:
(i) s 1 = c 1 (s 1,1 , . . ., s 1,k 1 ) and s 2 = c 2 (s 2,1 , . . ., s 2,k 2 ), with c 1 , c 2 function symbols in the signature Σ. Necessarily c 1 = c 2 (and k 1 = k 2 ). We have to consider in which subterms of s 1 and s 2 the function variables, F n 1 1 and F n 2 2 , appear. If F n 1 1 and F n 2 2 appear in corresponding subterms, that is, there is an i such that all occurrences of F n 1 1 are in s 1,i and all occurrences of F n 2 2 are in s 2,i , then we have that s 1,i and s 2,i , together with t 1 , t 2 , form a commuting square, and the RPO, inductively defined, for this second commuting square, immediately induces the RPO for s 1 and s 2 . The subcase where F n 1 1 and F n 2 2 do not appear in corresponding subterms is treated at point (iii).
(ii) s 1 = F n 1 1 (s 1,1 , . . ., s 1,n 1 ) and s 2 = F n 2 2 (s 2,1 , . . ., s 2,n 2 ), and, for the general case, F n 1 1 , F n 2 2 not appearing in the subterms s h,i . In this case, we have that that is, there is a unifier, i.e., a substitution making t 1 and t 2 equal. Consider the most general unifier (mgu) for t 1 and t 2 ; this is given by tuples of terms, s ′ 1,1 , . . ., s ′ 1,n 1 and
(iii) In this point we consider all the remaining cases, that is, where: s 1 = c 1 (s 1,1 , . . ., s 1,k 1 ), s 2 = c 2 (s 2,1 , . . ., s 2,k 2 ) and either F n 1 1 and F n 2 2 do not appear in corresponding subterms, or c 1 = F n 1 1 or c 2 = F n 2 2 . Let us consider the term s ′ 1 obtained from s 1 by substituting any maximal subterm s o not containing F n 1 1 by a ground variable X so . For example, if )), and analogously for the term s 2 . Let s ′′ 1 = s ′ 1 •t 1 , and s ′′ 2 = s ′ 2 •t 2 . Now we have that: ] that is, there exists a unifier for s ′′ 1 and s ′′ 2 ; we can consider the most general unifier, given by a pair of tuples of terms s ′ 1, l 1 , . . ., s 1, lm 1 and s 2, j 1 , . . ., s 2, jm 2 . By repeating the arguments used at point (ii), we have that ] form an RPO.
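Points (ii) and (iii) both rely on the existence of a most general unifier. As a reminder of how such a unifier can be computed, here is a minimal Python sketch of standard first-order unification with an occurs check. It is only the textbook algorithm on ordinary terms, not the RPO construction of the proof, and the term encoding and the names `walk`, `occurs`, `unify` and `resolve` are illustrative assumptions.

```python
# Terms: ("var", name) or ("op", symbol, tuple_of_subterms). Illustrative only.

def walk(t, sub):
    """Follow variable bindings in sub until an unbound variable or an op."""
    while t[0] == "var" and t[1] in sub:
        t = sub[t[1]]
    return t

def occurs(x, t, sub):
    """True if variable x occurs in t under the bindings of sub."""
    t = walk(t, sub)
    if t[0] == "var":
        return t[1] == x
    return any(occurs(x, c, sub) for c in t[2])

def unify(a, b, sub=None):
    """Return a most general unifier (as a dict of bindings) or None."""
    sub = {} if sub is None else sub
    a, b = walk(a, sub), walk(b, sub)
    if a == b:
        return sub
    if a[0] == "var":
        return None if occurs(a[1], b, sub) else {**sub, a[1]: b}
    if b[0] == "var":
        return unify(b, a, sub)
    if a[1] != b[1] or len(a[2]) != len(b[2]):
        return None  # clash of symbols or arities
    for x, y in zip(a[2], b[2]):
        sub = unify(x, y, sub)
        if sub is None:
            return None
    return sub

def resolve(t, sub):
    """Apply the bindings of sub exhaustively to t."""
    t = walk(t, sub)
    if t[0] == "var":
        return t
    return ("op", t[1], tuple(resolve(c, sub) for c in t[2]))
```

For example, unifying app(K, X) with app(Y, S) binds Y to K and X to S, after which both sides resolve to app(K, S); unifying K with S fails, as does unifying X with app(X, K) because of the occurs check.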
The proof for the general case is now almost immediate. The RPO for the square formed by the arrows s 1,1 , . . ., s 1,k : n 1,1 , . . ., n 1,j 1 → m 1 , . . ., m k and s 2,1 , . . ., s 2,k : n 2,1 , . . ., n 2,j 2 → m 1 , . . ., m k for 1 ≤ i ≤ k into a sequence. In turn, the RPO for these diagrams can be obtained by essentially repeating the construction presented for the unary case. Finally, it is immediate to prove that the presented construction preserves linearity and function-linearity of arrows.
The above proposition holds also for the case of linear second-order contexts, and the proof remains almost the same.
6.1. CL * as Second-order Rewriting System. In this section, we consider the second-order context category for the combinatory calculus CL * and we show that the weak IPO lazy bisimilarity thus obtained coincides with the lazy observational equivalence on λ-calculus, while for the cbv case we get a finer equivalence. Interestingly, the second-order open bisimilarity gives a uniform characterization also on open terms.
Note that the terms of CL are defined by the signature Σ CL = {K, S, app}, where app is the binary operation of application that is usually omitted.So the term SKK actually stands for app(app(S, K), K).
First we deal with the lazy case, then we will sketch also the cbv case.
6.1.1. The Lazy Second-order Reactive System.
Definition 6.4 (Lazy Second-order Reactive System on CL * ). The lazy second-order reactive system C 2 * l consists of:
• the function-linear category whose objects are the lists with at most one element, and whose arrows ε → n are the terms of CL * with, at most, n (first order) metavariables, and whose arrows m → n are the second-order contexts defined by: M n
• the reactive contexts are all the second-order applicative contexts of the shape
Second-order contexts as defined above can be represented by C[F (M 1 , . . ., M m )], where C[ ] is a unary first-order context on CL * (with metavariables). To maintain the notation for contexts used in Sections 4, 5, in the sequel a second-order context C[F (M 1 , . . ., M m )] : m → n will be more conveniently written as C[ ] θ , where θ is a substitution s.t. θ(X i ) = M i for all i = 1, . . ., m, moreover we write M
Example: Let M = XM 1 . Some of the IPO reductions of M are the following: In general, the IPO contexts are summarized in Figure 5.
Using Proposition 3.5, we can prove that the weak IPO bisimilarity ≃ 2 * lI is a congruence, and it has a simpler characterization in terms of applicative contexts. Namely, we can consider as list extension category the category of all function-linear term contexts. In the alternative notation, a second-order linear term context can be written as C[ θ 1 , . . ., θn ], where C[ 1 , . . ., n ] is a first-order multi-holed context and θ 1 , . . ., θ n are n substitutions, each one acting on the term put in the corresponding hole. By repeating the arguments for the first-order case, one can show that any second-order linear term context either is IPO uniform or has a reactive index. Then, by Proposition 3.5, we have:
Proposition 6.6.
(i) For all terms of CL * M, N , for any substitution θ and for any (possibly open) first- lR , where ≃ 2 * lR denotes the weak IPO bisimilarity, where only reactive IPO contexts are considered (see Figure 5).
By Proposition 6.6(ii) above, the notion of IPO bisimilarity turns out to be much simpler, but it is still infinitely branching (when the term is of the shape XP 0 P we have infinitely many IPO contexts [ ] {A Y /X} ). However, one can prove that also the contexts [ ] {A Y /X} , for any | Y | ≥ 1, can be eliminated. This requires an "ad-hoc" reasoning: lF N }, where M ⌢ N means that M and N are KS-convertible.
Finally, we are left to prove that the second-order weak IPO bisimilarity exactly recovers the lazy observational equivalence. More generally, we will prove that the two equivalences coincide on open terms. Namely, we can view open terms with n free variables as arrows from ε to n (by identifying variables with metavariables). Thus we have directly a notion of equivalence on open terms. We will show that this equivalence coincides with the usual extension to open terms of the observational equivalence by substitution. This gives a uniform finitely branching characterization of the observational equivalence on all (closed and open) terms.
Proposition 6.8. For all M, N ∈ Λ, M ≈ l N ⇐⇒ T (M ) ≃ 2 * lI T (N ).
Proof of Proposition 6.8. We will show that ≃ 2 * lI coincides with the natural extension to open terms of the first-order IPO bisimilarity ≃ * lI of Section 5.2.
Definition 6.9. Let ≃ * lI be the extension of ≃ * lI to open terms of CL * defined by, for all M, N CL * -terms such that F
Proof. The proof follows from the fact that ∀θ. M θ → * l M ′ θ and ≃ * lI is closed under → l .
Lemma 6.12. ≃ * lR ⊆ ≃ 2 * lR .
6.1.2. The Cbv Second-order Reactive System. The main difference between the cbv and the lazy case is that the variables in the cbv case are meant to represent values; consequently, cbv substitutions have to map variables into values. First of all, the values on CL * are defined by:
Definition 6.13 (Cbv Second-order Reactive System on CL * ). The cbv second-order reactive system C 2 * v consists of:
• the function-linear category whose objects are the lists with at most one element, and whose arrows ε → n are the terms of CL * with, at most, n (first order) metavariables, and whose arrows m → n are the second-order contexts defined, briefly, by: where the values V 1 , . . ., V m and the term N are built using n variables.
• the reactive contexts are defined by
• the reaction rules are
By Proposition 6.3, we have: As in the lazy case, a second-order context C : m → n will be more conveniently denoted by C[ ] θ , where C[ ] is a unary first-order context and θ is a cbv substitution, i.e., s.t. θ(X i ) is a value, for all i = 1, . . ., m.
According to our definition, there are terms that are neither values nor reducible (they do not contain any redex); the term XY is an example. A term M of this kind can be transformed into a reducible one by substituting a single specific variable with a value. We call a variable of this kind a critical variable.
Definition 6.15. The critical variable of a second-order term M , Cr(M ), if it exists, is recursively defined by:
Cr(V ) = ∅ ,
Cr(XV ) = X ,
Cr(V M ) = Cr(M ) , if M is not a value,
Cr(M N ) = Cr(M ) , if M is not a value.
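The recursive clauses of Definition 6.15 translate directly into code once a notion of value is fixed. The Python sketch below takes the value predicate as a parameter. The default `is_value` is only a stand-in, assuming plain K and S (values being variables and combinators applied to values, with fewer arguments than they need to fire) rather than the CL * values of the paper, whose grammar is not reproduced here. The tuple encoding and all function names are illustrative.

```python
# Terms: ("var", name), ("const", "K" or "S"), ("app", m, n). Illustrative.

def spine(t):
    """Head of an application spine together with its list of arguments."""
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def is_value(t):
    """ASSUMED values: X | K | S | K V | S V | S V V (not the paper's CL*)."""
    head, args = spine(t)
    if not all(is_value(a) for a in args):
        return False
    if head[0] == "var":
        return len(args) == 0
    needed = {"K": 2, "S": 3}[head[1]]   # arguments needed for the rule to fire
    return needed > len(args)

def critical_variable(t, is_value=is_value):
    """Cr(t) following Definition 6.15; None plays the role of the empty case."""
    if is_value(t) or t[0] != "app":
        return None                       # Cr(V) is empty
    m, n = t[1], t[2]
    if m[0] == "var" and is_value(n):
        return m[1]                       # Cr(X V) = X
    if is_value(m):
        # Cr(V M) = Cr(M) when M is not a value; V V' here is a redex.
        return None if is_value(n) else critical_variable(n, is_value)
    return critical_variable(m, is_value)  # Cr(M N) = Cr(M), M not a value
```

With these assumptions, XY, K(XY) and (XY)Z all have critical variable X, while KX (a value) and KXY (a redex) have none.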
The second-order IPO contexts for cbv are summarized in Figure 7.In that figure, the symbol R ranges over most general reducible terms.That is, any reducible term can be obtained by instantiating the variables of a term contained in that grammar.The symbol T is used to represent general terms; remember that variables represent general values.
As for the previous case, by Proposition 3.5 and by considering as list extension category the category of all by-value function-linear term contexts, we have: It is important to notice that the reactive IPO contexts provide directly a finitely branching lts for the cbv combinatory logic (notice that, contrary to the lazy case, for the cbv case The problem arises from the fact that in the second-order cbv bisimilarity we observe the existence of a critical variable, while in the contextual equivalence we do not.

Final Remarks and Directions for Future Work
There are several other attempts to deal with parametric rules in the literature. In his seminal paper [Sew02], Sewell presents two different constructions, one based on ground reaction rules and the other based on parametric rules. The RPO construction can be seen as a categorical account of the ground rules construction. Parametric rules, in the form they are defined in [Sew02], do not have an obvious categorical presentation. In [KSS05], the authors introduce the notion of luxes to generalize the RPO approach to cases where the rewriting rules are given by pairs of arrows having a domain different from 0. Luxes can be seen as a categorical account of the parametric rules approach of Sewell. When instantiated to the category of contexts, the luxes approach allows one to express rewriting rules formed not by pairs of ground terms but by pairs of contexts (open terms), thus allowing parametricity. Compared to our approach, based on the notion of second-order context, the approach of luxes is more abstract and it can be applied to a wider range of cases (categories). However, if we compare the two approaches in the particular case of context categories, we find that the luxes approach has a more restricted way to instantiate a given parametric rule. This restriction results in a not completely satisfactory treatment of the λ-calculus. It remains an open question whether the notion of second-order context can be replaced by a more abstract and general one. This would allow one to recover the extra generality of luxes. A possible alternative approach for dealing with the λ-calculus in Leifer-Milner's RPO setting is that of using suitable encodings in the (bi)graph framework [Mil06]. However, we feel that our term solution based on second-order context categories and CL is simpler and more direct. Alternatively, in place of CL, one could also consider a λ-calculus with explicit substitutions, in order to obtain a convenient encoding of the β-rule, allowing for a representation as a second-order reactive system. This is an experiment to be done. Here we have chosen CL, since it is simpler; moreover, the correspondence between the standard λ-calculus and the one with explicit substitutions deserves further study. We have considered lazy and cbv strategies; however, other strategies, e.g. head and normalizing, could also be dealt with, possibly at the price of some complications due to the fact that such strategies are usually defined on open terms. It would also be interesting to explore non-deterministic strategies on λ-calculus.
Definition 2.5 (RPO/IPO). (i) Let C be a category and let us consider the commutative diagram in Fig. 1(i). Any tuple I 5 , e, f, g which makes the diagram in Fig. 1(ii) commute is called a candidate for (i). A relative pushout (RPO) is the smallest such candidate, i.e., it satisfies the universal property that given any other candidate I 6 , e ′ , f ′ , g ′ , there exists a unique mediating morphism h : I 5 → I 6 such that both diagrams in Fig. 1(iii) and Fig. 1(iv) commute. (ii) A commutative square such as diagram (i) in Fig. 1 is an idem pushout (IPO) if I 4 , c, d, id I 4 is its RPO.
Definition 2.6 (IPO Transition System). (1) States: arrows t : 0 → I in C, for any I; (2) Transitions: t c −→ I dr iff d ∈ D, ct = dl, l, r ∈ R and the diagram in Fig. 1(i) is an IPO.
where τ −→ * denotes the reflexive and transitive closure of τ −→. (ii) Let us call weak bisimilarity the bisimilarity induced by the weak lts. The above definition differs from the one proposed in [LM00], where, in case α = τ , . We cannot use the latter, since it discriminates λ-terms which are equivalent in the usual semantics. The following easy lemma gives a useful characterization of the weak bisimilarity, whereby any α −→-transition is mimicked by a α =⇒-transition:
Lemma 2.11. Let α −→ be an lts and let α =⇒ be the corresponding weak lts. The induced weak bisimilarity is the greatest symmetric relation R s.t.: and R * denotes the reflexive and transitive closure of R.
dr. Since c is reactive and squares of the form by composition of IPO squares (and by induction), it is easy to prove that $cu \xrightarrow{Id}{}^{*}_{I}\; cu_1 \xrightarrow{f}_{I} d'u_2 \xrightarrow{Id}{}^{*}_{I}\; d'u_0$, which implies the claim.
4.1. Syntax, Reduction Strategies, Observational Equivalences.

Definition 4.1 (Syntax). The set of λ-terms Λ is defined by (Λ ∋) M ::= x | MM | λx.M, where x ∈ Var and Var is an infinite set of variables. Let FV(M) denote the set of free variables in M, and let us denote by Λ^0 the set of closed λ-terms.
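The grammar above and the notions of free variable and closed term can be sketched in Python as follows. This is only an illustrative encoding: the class names `Var`, `App`, `Lam` and the helpers `free_vars` and `is_closed` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """A variable x."""
    name: str


@dataclass(frozen=True)
class App:
    """An application M N."""
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    """An abstraction λx.M."""
    param: str
    body: "Term"


Term = Union[Var, App, Lam]


def free_vars(term):
    """Compute FV(M) by structural recursion on the term."""
    if isinstance(term, Var):
        return frozenset({term.name})
    if isinstance(term, App):
        return free_vars(term.fun) | free_vars(term.arg)
    if isinstance(term, Lam):
        # The bound variable is removed from the free variables of the body.
        return free_vars(term.body) - {term.param}
    raise TypeError(f"not a lambda-term: {term!r}")


def is_closed(term):
    """A term is closed when it has no free variables."""
    return not free_vars(term)


identity = Lam("x", Var("x"))                   # λx.x, a closed term
open_term = Lam("x", App(Var("x"), Var("y")))   # λx.xy, with y free
```

Here `is_closed(identity)` holds, while `open_term` has `y` as its only free variable and is therefore not a closed term.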
C[ ]P, for terms of the shape λx.M, is not IPO if C[ ] is different from λx.C_1[ ] and [ ], because otherwise the reduction can fire already in C[ ].

The strong versions of context and IPO bisimilarities are too fine, since they take into account reaction steps, and tell apart β-convertible terms. Trivially, I and II, where I = λx.x, are equivalent neither in the context bisimilarity nor in the IPO bisimilarity, since $I \not\xrightarrow{[\,]}$, while $II \xrightarrow{[\,]}$ (both in the lazy and cbv case). On the other hand, one can easily check

Lazy IPO lts's

term          IPO contexts                   reactive IPO contexts
λx.M          [ ]P, (λx.C[ ])P, PC[ ]        [ ]P
(λx.M)N P     [ ], (λx.C[ ])P, PC[ ]         [ ]

Figure 5: Second-order IPO contexts for the lazy CL*.

Figure 6: Finitely branching second-order IPO contexts for the lazy CL*.
then there are two cases.

(i) $M = C\vec{M}$, for a combinator $C$ on CL*. Then θ = ∅, and for any closing θ and closed $\vec{P}$ such that $|\vec{X}| = |\vec{P}|$, $M\theta\vec{P} \to_I M''$ and $M'' = M'\theta[\vec{P}/\vec{X}]$. Since $M\theta \simeq^{*}_{lR} N\theta$, then $N\theta\vec{P} \Rightarrow_I N''$ and $M'' \simeq^{*}_{lR} N''$. There are two subcases: either $\vec{X} = [\,]$ or $\vec{X} = X$. In the first subcase, we have $M \to_I M'$ (second-order) and $N \Rightarrow N$ (second-order), thus by Lemma 6.11 $M' \simeq^{*}_{lR} N$, and hence $(M', N) \in R$. In the second subcase, i.e., $\vec{X} = X$, $M$ is a value different from a variable; then one can check that also $N$ must reduce to a value different from a variable, thus $N \stackrel{[\,]_{\emptyset} X}{\Longrightarrow} N'$ and $N'' = N'\theta[P/X]$. Thus $M' \simeq^{*}_{lR} N'$, and hence $(M', N') \in R$.

(ii) $M = X\vec{M}$. Since for any closing θ, $M\theta \simeq^{*}_{lR} N\theta$, then also $N \stackrel{[\,]_{\theta} \vec{X}}{\Longrightarrow}_I N'$. Moreover, for any θ closing $M\theta$, $N\theta$, for any $\vec{P}$ such that $|\vec{P}| = |\vec{X}|$, we have $M\theta\theta\vec{P} \to_I M''$, $N\theta\theta\vec{P} \to_I N''$, $M'' = M'\theta[\vec{P}/\vec{X}]$, $N'' = N'\theta[\vec{P}/\vec{X}]$. Thus for all θ

Figure 7: Second-order IPO contexts for cbv CL*.
IPO contexts of the shape $[\,]_{\{A\vec{Y}/X\}}$, for $|\vec{Y}| \geq 1$, do not exist, since substitutions have to map variables into values).

The cbv weak IPO bisimilarity turns out to be strictly included in the cbv contextual equivalence. Namely, if we consider T(λx.x) = SKK and T(λxy.xy) = S[S(KS)(S(KK)(SKK))][S(S(KS)(KK))(KK)], then $T(\lambda x.x) \approx_v T(\lambda xy.xy)$; however, $T(\lambda x.x) \not\simeq^{2*}_{vI} T(\lambda xy.xy)$, because $T(\lambda xy.xy) \stackrel{[\,]_{\emptyset} X}{\Longrightarrow} S''(K'X)(S''KK) \xrightarrow{[\,]_{\emptyset} Y}$, while T(λx.x)

Consider the relation $S = \{ \langle ct, cu \rangle \mid t \approx_R u,\ c \text{ context} \}$. It is immediate that $\approx_I \subseteq \approx_R$, and from this, $\approx_I \subseteq \{ \langle ct, cu \rangle \mid t \approx_I u,\ c \text{ context} \} \subseteq S$. If we also prove the inclusion $S \subseteq \approx_I$, then all relations are equal and $\approx_I$ coincides with its contextual closure, i.e., it is a congruence. By Lemma 2.11, in order to prove $S \subseteq \approx_I$ it is sufficient to show that, for any $\langle ct, cu \rangle \in S$, if $ct \xrightarrow{f}_I t'$ then there exists $u'$ s.t. $cu \xrightarrow{f}$

$t_1$ and $t_2$ into a commuting square that is also an RPO; in fact, any other pair of arrows completing $t_1$ and $t_2$ into a commuting square and factorizing the original one needs to be of the form F n 1 ′′, with the two sequences $s''_{1,i}$ and $s''_{2,i}$ defining a unifier for $t_1$, $t_2$. The unique arrow factorizing the two commuting squares is F m ′ ($s'''_1, \ldots, s'''_{m'}$), where s ′′′