Relating Apartness and Bisimulation

A bisimulation for a coalgebra of a functor on the category of sets can be described via a coalgebra in the category of relations, of a lifted functor. A final coalgebra then gives rise to the coinduction principle, which states that two bisimilar elements are equal. For polynomial functors, this leads to well-known descriptions. In the present paper we look at the dual notion of "apartness". Intuitively, two elements are apart if there is a positive way to distinguish them. Phrased differently: two elements are apart if and only if they are not bisimilar. Since apartness is an inductive notion, described by a least fixed point, we can give a proof system to derive that two elements are apart. This proof system has derivation rules, and two elements are apart if and only if there is a finite derivation (using the rules) of this fact. We study apartness versus bisimulation in two separate ways. First, for weak forms of bisimulation on labelled transition systems, where silent (tau) steps are included, we define an apartness notion that corresponds to weak bisimulation and another apartness that corresponds to branching bisimulation. The rules for apartness can be used to show that two states of a labelled transition system are not branching bisimilar. To support the apartness view on labelled transition systems, we cast a number of well-known properties of branching bisimulation in terms of branching apartness and prove them. Next, we also study the more general categorical situation and show that indeed, apartness is the dual of bisimilarity in a precise categorical sense: apartness is an initial algebra and gives rise to an induction principle. In this analogy, we include the powerset functor, which gives a semantics to non-deterministic choice in process theory.


Introduction
Bisimulation is a standard way of looking at indistinguishability of processes, labelled transitions, automata and streams, etc. These structures all have in common that they can be seen as coalgebraic: the elements are not built inductively, using constructors, but they are observed through "destructors" or "transition maps". The coinduction principle states that two elements that have the same observations are equal, when mapped to a "final" model. A bisimulation is a relation that is preserved along transitions: if two elements are bisimilar, and we perform a transition, then we either get two new bisimilar elements, or we get equal outputs (in case our observation is a basic value). Two elements are bisimilar if [...] This is a derivation system in the traditional (inductive) sense: a judgment holds if there is a finite derivation (so no infinite or circular derivations) that has that judgment as its conclusion. To show that the apartness view on LTSs is fruitful, we use the derivation system for branching apartness to show that the branching apartness relation is co-transitive and satisfies the apartness stuttering property. (These notions will be dealt with in Section 3.1.) These imply the stuttering property and the transitivity for branching bisimulation, properties that are known to be subtle to prove. (See [Bas96].) We also indicate how the derivation system can be used as an algorithm for proving branching apartness of two states in an LTS and we define and discuss the notion of rooted branching apartness which is the dual of rooted branching bisimulation.
The second part switches to a more abstract categorical level. It is restricted however to functors on the category of sets. First, the standard coalgebraic approach is recalled, in which a bisimulation is a coalgebra itself, for a lifting of the functor involved to the category of relations. This can be applied in particular to polynomial functors and yields familiar descriptions of bisimulation.
Next, apartness is described in an analogous manner. It does not use the category Rel of relations, nor its usual opposite Rel^op, but a special "fibred" opposite Rel^fop. A special lifting of a functor to Rel^fop is described, via negation as a functor ¬ : Rel → Rel^fop. An apartness relation is then defined as a coalgebra of the lifted functor (to Rel^fop). This set-up then guarantees that a relation R is a bisimulation iff ¬R is an apartness relation. Moreover, there is an analogue of the coinduction principle, stating that two states of a coalgebraic system are apart iff they are non-equal when mapped to the final coalgebra. A significant conclusion from this analysis is: bisimilarity is the greatest fixed point in a partial order of relations. But apartness is the least fixed point in that order. This means that apartness can be established in a finite number of steps. Hence it can be described via a system of proof rules. This, in the end, is the main reason why apartness can be more amenable than bisimulation.
We should emphasize that the two parts of this paper are really "apart" since there is no overlap. There is quite a bit of work on dealing with weak/branching bisimulation in a coalgebraic setting (see e.g. [SdVW09,BK17,Bre15,BMP15,GP14]), but there is no generic, broadly applicable approach. In this paper we are not solving this longstanding open problem. We have separate descriptions of weak/branching apartness (in the first part) and of categorical apartness (in the second part). The only hope that we can offer at this stage is that apartness might provide a fresh perspective on a common approach.
To clarify some terminology and relate the corresponding notions of the bisimulation view and the apartness view, we give the following table. A bisimulation relation models an equality of processes or process terms, whereas an apartness relation models an inequality, so R will be a bisimulation (of some type) if and only if ¬R is an apartness (of that same type). Bisimilarity is the largest bisimulation relation, which means that it is a coinductively defined concept. Apartness is the smallest apartness relation, which means that it is an inductively defined concept. A bisimulation should be (at least) an equivalence relation, meaning that it satisfies reflexivity, symmetry and transitivity. The dual notions are irreflexivity, symmetry and co-transitivity, which together are usually called "apartness" in the literature. To avoid confusion, we have introduced the terminology "proper apartness" for a relation that satisfies irreflexivity, symmetry and co-transitivity. In process theory, bisimulation is not an equivalence relation by definition, so neither is an apartness a "proper apartness" by definition. There is really some work to do, so therefore it is important to single out these notions. A relation R is a congruence in case it is preserved by application of operators: if R(x, y), then R(f(x), f(y)) for any operator f. The dual notion is strong extensionality, but in the "apartness view", this is not a property of the relation but of the operator. Operator f is strongly extensional (for apartness relation Q) if Q(f(x), f(y)) implies Q(x, y). (Intuitively: if f(x) and f(y) are different, then x and y should be different.)
1.1. Contents of the sections. In Section 2, we introduce bisimulation and apartness for streams and for deterministic automata, as preparation for more general/complicated cases.
In Section 3, we discuss weak and branching bisimulation and apartness and we indicate the potential use of reasoning with apartness instead of bisimulation. In Section 4 we recap the coalgebraic treatment of bisimulation for coalgebras in the category Set as a coalgebra in the category Rel. In Section 5 we introduce the dual case and give a coalgebraic treatment of apartness, as the opposite of bisimulation. For completeness, we give, in the Appendix, a syntactic treatment of the general picture of Section 2, where we have a general type of coalgebras for which we define bisimulation and apartness.
Special Thanks. We dedicate this article to Jos Baeten on the occasion of his retirement. Much of Jos' research has centered around process theory and process algebra, where various forms of bisimulation equivalence have always played a central role. As a math student, before going to the USA to do a PhD on a topic in the intersection of recursion theory and set theory, Jos was part of the Dutch "school" on constructive mathematics, and we think that it is nice to see that 'apartness', a notion which originates from constructive mathematics, also has a natural place in the study of process (non-)equivalence. The first author in particular would like to thank Jos for the years he has worked at the Technical University Eindhoven in the Formal Methods group, led by Jos, the many things he has learned during this period and the pleasant cooperation on topics of science, education and organisation. Thanks Jos!

Bisimulation and apartness for streams and deterministic automata
We start from the coalgebra of streams over an alphabet A and the coalgebra of DAs (Deterministic Automata) over A, for which we illustrate the notions of bisimulation and apartness. We work in the category Set of sets and functions. The coalgebra of streams over A is given by a function c = ⟨h, t⟩ : K → A × K, where we associate every s ∈ K with a stream by letting h(s) ∈ A denote the head of s and t(s) ∈ K the tail of s.
Definition 2.1. Let A be a fixed set/alphabet. A coalgebraic map c = ⟨h, t⟩ : K → A × K gives rise to the following notions of bisimulation for c and apartness for c. (1) A relation R ⊆ K × K is a c-bisimulation if it satisfies the following rule. That two states s_1, s_2 ∈ K are c-bisimilar, notation s_1 ↔_c s_2, is defined by s_1 ↔_c s_2 := ∃R ⊆ K × K (R is a c-bisimulation and R(s_1, s_2)).
(2) A relation Q ⊆ K × K is a c-apartness if it satisfies the following rules. That two states s_1, s_2 ∈ K are c-apart, notation s_1 #_c s_2, is defined by s_1 #_c s_2 := ∀Q ⊆ K × K (if Q is a c-apartness, then Q(s_1, s_2)).
Before we prove some generalities about bisimulation and apartness, we now first treat the example of deterministic automata, DAs. A DA over A is given by a set of states, K, a transition function δ : K × A → K and a function f : K → {0, 1} denoting whether q ∈ K is a final state or not. We write 2 for {0, 1} and we view, as usual in coalgebra, a DA as a coalgebra c : K → K^A × 2, consisting of two maps c = ⟨δ, f⟩ with δ : K → K^A and f : K → 2. We use the standard notation for automata and write q →_a q′ if δ(q)(a) = q′ and q ↓ if f(q) = 1.
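For reference, the rules of Definition 2.1 can be sketched as follows. This is a hedged reconstruction: the exact form is an assumption, chosen to agree with the informal description of bisimulation in the Introduction and with the definition of s_1 #_c s_2 as membership in every c-apartness.

```latex
% Assumed form of the c-bisimulation rule for streams, with c = <h, t>
\[
\frac{R(s_1, s_2)}{h(s_1) = h(s_2) \;\wedge\; R(t(s_1), t(s_2))}
\]
% Assumed form of the two c-apartness rules for streams
\[
\frac{h(s_1) \neq h(s_2)}{Q(s_1, s_2)}
\qquad
\frac{Q(t(s_1), t(s_2))}{Q(s_1, s_2)}
\]
```

Under this reading, taking the contrapositive of the bisimulation rule for R yields exactly the two apartness rules for ¬R.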
We now introduce the notions of bisimulation and apartness for DAs. The first is well-known, the second less so. These notions can be defined in a canonical way for a large set of functors on Set. This we will describe categorically in Section 5. In the Appendix, we will give an outline in logical-syntactic terms.
Definition 2.2. Let A be an alphabet and let K be a set of states. A coalgebraic map c = ⟨c_1, c_2⟩ : K → K^A × 2 gives rise to the following notions of bisimulation for c and apartness for c.
(1) A relation R ⊆ K × K is a c-bisimulation if it satisfies the following rule.
(2) A relation Q ⊆ K × K is a c-apartness if it satisfies the following rules.
As usual, rules are "schematic" in the free variables that occur in them, so the left rule represents a separate rule for each a ∈ A. That two states q_1, q_2 ∈ K are c-apart, notation q_1 #_c q_2, is defined by q_1 #_c q_2 := ∀Q ⊆ K × K (if Q is a c-apartness, then Q(q_1, q_2)).
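Since q_1 #_c q_2 holds exactly when the pair lies in every c-apartness, on a finite DA it can be computed as a least fixed point. The sketch below assumes the two apartness rules have the usual form (different outputs make two states apart; apart a-successors make two states apart). The function name and the four-state automaton are hypothetical and only serve as illustration.

```python
from itertools import product


def da_apartness(states, alphabet, delta, final):
    """Least relation closed under the two assumed DA apartness rules."""
    # Base rule (assumed): states with different outputs are apart.
    apart = {(p, q) for p, q in product(states, repeat=2) if final[p] != final[q]}
    changed = True
    while changed:
        changed = False
        for p, q in product(states, repeat=2):
            if (p, q) in apart:
                continue
            # Step rule (assumed): apart a-successors make p and q apart.
            if any((delta[p][a], delta[q][a]) in apart for a in alphabet):
                apart.add((p, q))
                changed = True
    return apart


# Hypothetical four-state DA over {a}; only q3 is final.
STATES = ["q0", "q1", "q2", "q3"]
ALPHABET = ["a"]
DELTA = {"q0": {"a": "q1"}, "q1": {"a": "q3"}, "q2": {"a": "q3"}, "q3": {"a": "q3"}}
FINAL = {"q0": 0, "q1": 0, "q2": 0, "q3": 1}

APART = da_apartness(STATES, ALPHABET, DELTA, FINAL)
```

In this hypothetical automaton q1 and q2 are never added to the relation, so they are not apart, while q0 is apart from every other state.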
In case the coalgebra c is clear from the context, we will omit it. In DAs, two states are bisimilar if and only if they are not apart, which can easily be observed in the following example.
A bisimulation is given by q_1 ∼ q_2. It can be shown that q_0 # q_3 because for every apartness Q we have the derivation given on the right.
We see that "being c-apart", being the smallest relation satisfying specific closure properties, is an inductive property. This implies that the closure properties yield a derivation system for proving that two elements are c-apart. This will be further explored in the next section. In the example, we are basically using this: we have proven q_0 # q_3 by giving a derivation.
A relation Q is usually (e.g. see [TvD88], Chapter 8) called an apartness relation if it is irreflexive, symmetric and co-transitive. As we have already used the terminology "apartness relation" for the dual of a bisimulation relation, we shall, for the present paper, refer to these as "proper apartness relations".
It is easy to see that inequality on a set is a proper apartness relation. The following is a standard fact that relates equivalence relations and proper apartness relations.
Lemma 2.5. For R a relation, R is an equivalence relation if and only if ¬R is a proper apartness relation.
Proof. The only interesting property to check is that R is transitive iff ¬R is co-transitive. If ¬R(x, y) and R(x, z), then ¬R(z, y) by transitivity of R, so we have ¬R(x, y) ⟹ ¬R(x, z) ∨ ¬R(z, y). The other way around, suppose R(x, y) and R(y, z) and ¬R(x, z). Then ¬R(x, y) ∨ ¬R(y, z) by co-transitivity of ¬R, contradiction, so R(x, z).
Bisimulation and apartness for DAs and streams can be defined by induction over the structure of the functor F : Set → Set that we consider the coalgebra for. In the case of DAs, we have c : K → F(K) with F(X) = X^A × 2 and for streams, we have c : K → F(K) with F(X) = A × X. The general definition in category-theoretic terms can be found in Section 4. A purely logical-syntactic presentation can be found in the Appendix.
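Lemma 2.5 can also be checked exhaustively on a small carrier. The following sketch enumerates all 512 relations on a three-element set and compares "R is an equivalence relation" with "¬R is a proper apartness relation". All names are illustrative, and the check is a sanity test, not a replacement for the proof.

```python
from itertools import product

X = range(3)
PAIRS = list(product(X, repeat=2))


def is_equivalence(R):
    refl = all((x, x) in R for x in X)
    symm = all((y, x) in R for (x, y) in R)
    trans = all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)
    return refl and symm and trans


def is_proper_apartness(Q):
    irrefl = all((x, x) not in Q for x in X)
    symm = all((y, x) in Q for (x, y) in Q)
    # Co-transitivity: Q(x, y) implies Q(x, z) or Q(z, y), for every z.
    cotrans = all((x, z) in Q or (z, y) in Q for (x, y) in Q for z in X)
    return irrefl and symm and cotrans


def all_relations():
    """Yield every subset of X x X as a set of pairs."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {pair for pair, bit in zip(PAIRS, bits) if bit}


def complement(R):
    return set(PAIRS) - R


LEMMA_HOLDS = all(
    is_equivalence(R) == is_proper_apartness(complement(R)) for R in all_relations()
)
```

The enumeration is exponential in the number of pairs, so it is only meant for very small carriers.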
Lemma 2.6. We have the following result relating bisimulation and apartness for the case of DAs and streams (but it also applies to the general case treated in the Appendix).
(1) R is a bisimulation if and only if ¬R is an apartness.
(2) The relation ↔ is the union of all bisimulations, ↔ = ⋃{R | R is a bisimulation}, and it is itself a bisimulation.
(3) The relation # satisfies # = ⋂{Q | Q is an apartness relation}, and is thus the intersection of all apartness relations; it is itself also an apartness relation.
The other items are easily verified: if R 1 and R 2 are bisimulations, then R 1 ∪ R 2 is also a bisimulation, and if Q 1 and Q 2 are apartness relations, then Q 1 ∩ Q 2 is also an apartness relation.
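The three items of Lemma 2.6 can be illustrated by brute force on a tiny DA. The sketch assumes the usual closure conditions for DAs (a bisimulation relates only states with equal output and is preserved by every a-step; an apartness contains every pair with different outputs or with apart a-successors). The three-state automaton is hypothetical.

```python
from itertools import product

# Hypothetical DA over {a}: 0 -> 1, 1 -> 1, 2 -> 2; only state 2 is final.
STATES = [0, 1, 2]
ALPHABET = ["a"]
DELTA = {0: {"a": 1}, 1: {"a": 1}, 2: {"a": 2}}
FINAL = {0: 0, 1: 0, 2: 1}
PAIRS = list(product(STATES, repeat=2))


def is_bisimulation(R):
    # Assumed rule: related states agree on output and have related successors.
    return all(
        FINAL[p] == FINAL[q]
        and all((DELTA[p][a], DELTA[q][a]) in R for a in ALPHABET)
        for (p, q) in R
    )


def is_apartness(Q):
    # Assumed rules: every pair forced by an output or successor difference is in Q.
    return all(
        (p, q) in Q
        for (p, q) in PAIRS
        if FINAL[p] != FINAL[q]
        or any((DELTA[p][a], DELTA[q][a]) in Q for a in ALPHABET)
    )


def all_relations():
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {pair for pair, bit in zip(PAIRS, bits) if bit}


RELATIONS = list(all_relations())
# Item (1): R is a bisimulation iff its complement is an apartness.
ITEM_1 = all(is_bisimulation(R) == is_apartness(set(PAIRS) - R) for R in RELATIONS)
# Items (2) and (3): union of all bisimulations, intersection of all apartnesses.
BISIMILAR = set().union(*(R for R in RELATIONS if is_bisimulation(R)))
APART = set(PAIRS).intersection(*(Q for Q in RELATIONS if is_apartness(Q)))
```

On this example the union of all bisimulations identifies states 0 and 1, and the intersection of all apartness relations is its complement.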
Remark 2.7. A relation R is a c-bisimulation in case it satisfies a specific closure property that is given in Definitions 2.1, 2.2 via a rule that R should satisfy. Similarly, there is a closure property that defines when Q is a c-apartness (also given via a rule that Q should satisfy).
To prove that s and t are c-bisimilar, we need to find an R that satisfies the rules for c-bisimulations such that R(s, t) holds. Dually, to prove that s and t are c-apart, we need to show that Q(s, t) holds for every Q that satisfies the rules for c-apartness. This means that we can use the rules for being a c-apartness as the derivation rules of the proof system for proving s #_c t: we have s #_c t if and only if there is a finite derivation of s #_c t using these rules.
So, for apartness, the rules that define "Q is a c-apartness" can be used as the derivation rules for proving s #_c t. This is obviously not the case for bisimilarity. There the rules just represent the closure properties that R should satisfy to be a c-bisimulation.
In Sections 4 and 5 we will give a more general categorical picture of bisimulation and apartness on coalgebras.
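The remark that apartness comes with finite derivations can be made concrete for DAs. The sketch below records, for every apart pair, the rule instance that first derived it, and reads off a distinguishing word from that derivation. It assumes the usual two rules (different outputs; apart a-successors). The function names and the example automaton are hypothetical.

```python
def derive_apartness(states, alphabet, delta, final):
    """Map each apart pair to the rule instance that first derived it."""
    proof = {}
    for p in states:
        for q in states:
            if final[p] != final[q]:
                proof[(p, q)] = ("output",)  # axiom: outputs differ
    changed = True
    while changed:
        changed = False
        for p in states:
            for q in states:
                if (p, q) in proof:
                    continue
                for a in alphabet:
                    succ = (delta[p][a], delta[q][a])
                    if succ in proof:  # premise was derived earlier
                        proof[(p, q)] = ("step", a, succ)
                        changed = True
                        break
    return proof


def distinguishing_word(proof, pair):
    """Follow the finite derivation of `pair` down to an axiom."""
    word = []
    while proof[pair][0] == "step":
        _, a, pair = proof[pair]
        word.append(a)
    return word


def accepts_after(delta, final, state, word):
    for a in word:
        state = delta[state][a]
    return final[state]


# Hypothetical DA over {a}; only q3 is final.
STATES = ["q0", "q1", "q2", "q3"]
ALPHABET = ["a"]
DELTA = {"q0": {"a": "q1"}, "q1": {"a": "q3"}, "q2": {"a": "q3"}, "q3": {"a": "q3"}}
FINAL = {"q0": 0, "q1": 0, "q2": 0, "q3": 1}

PROOF = derive_apartness(STATES, ALPHABET, DELTA, FINAL)
WORD = distinguishing_word(PROOF, ("q0", "q1"))
```

Each recorded premise was derived strictly earlier, so following the proof always terminates. No such finite witness exists for bisimilarity, where one has to exhibit a whole relation instead.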
2.1. Apartness in constructive mathematics. The notion of apartness is standard in constructive real analysis and goes back to Brouwer, with Heyting giving the first axiomatic treatment in [Hey27]. (See also e.g. [TvD88] Chapter 8.) The observation is that, if one reasons in constructive logic, the primitive notion for real numbers is apartness: if two real numbers are apart, this can be positively decided in a finite number of steps, just by computing better and better approximations until one positively knows an ε-distance between them. Then equality on real numbers is defined as the negation of apartness: x = y := ¬(x # y).
As a matter of fact, one can start from apartness and define equality using its negation, and then build up the real numbers axiomatically from there. This is done in [GN00], where an axiomatic description of real numbers is given and it is shown how Cauchy sequences over the rationals form a model of that axiomatization, all in a constructive setting, i.e. without using the excluded middle rule. If one assumes apartness # to be a proper apartness (as in our Definition 2.4), the defined equality is an equivalence relation.
In the setting of the present paper, these constructive issues do not play a role, because we reason classically. There is one point to make, which is the issue of congruence, which has been studied in depth in the context of process theory [BBR10,Fok00]. Then the question is if, in a theory of terms describing processes, with a notion of bisimilarity describing a semantic equivalence of the terms as labelled transition systems, bisimulation is preserved by the operators of the theory. Simply put: if q_1 ↔ p_1 and q_2 ↔ p_2, is it the case that f(q_1, q_2) ↔ f(p_1, p_2)? In constructive analysis, if one starts from apartness and defines equality as its negation, the corresponding notion is strong extensionality.
It is easily checked that, if one defines an equivalence relation ∼ as the negation of #, then strong extensionality implies congruence with respect to ∼. So, if we wish to deal with process theories in terms of apartness, we will have to require operations and relations to be strongly extensional. It turns out that weaker forms of bisimulation (e.g. branching bisimulation) are not congruences, and therefore one considers rooted branching bisimulation. In Section 3.2 we will briefly study its complement, rooted branching apartness and the connection between congruence and strong extensionality.

Weak and branching bisimulation
We now apply the techniques that we have seen before to weak and branching bisimulation. We do not give a categorical treatment, because the functors proposed for weak [SdVW09] and branching [BK17] bisimulation are not so easy to work with. Instead, we use the definition of "bisimulation" (for a specific type of system) to directly define the notion of "apartness" as its negation, and thereby we define a derivation system for apartness. Then, two states s and t are (weakly, branching) apart iff they are not (weakly, branching) bisimilar. We also apply our definitions in a simple example to show how apartness (and thereby the absence of a bisimulation) can be proved.
We also rephrase some known results about branching bisimulation in terms of apartness, notably we reprove the stuttering property for branching bisimulation and the fact that branching bisimulation is an equivalence relation by rephrasing these results in terms of branching apartness. In the known proofs of these results, the notion of semi-branching bisimulation is used. Here we use a notion of semi-branching apartness for similar purposes. Finally we look into applications of the derivation system for actually deriving that two states in an LTS are branching apart (and therefore not branching bisimilar) and we suggest some new rules, using both apartness and bisimulation, that may be useful for analyzing algorithms for branching bisimulation.
Vol. 17:3 APARTNESS AND BISIMULATION 15:9
The systems we focus on are labelled transition systems, LTSs. An LTS is a tuple (X, A_τ, →), where X is a set of states, A_τ = A ∪ {τ} is a set of actions (containing the special "silent action" τ), and → ⊆ X × A_τ × X is the transition relation. We write q_1 →_u q_2 for (q_1, u, q_2) ∈ → and we write ↠_τ to denote the reflexive transitive closure of →_τ. So q_1 ↠_τ q_2 if q_1 →_τ … →_τ q_2 in zero or more τ-steps.
Convention 3.1. We will reserve q_1 →_a q_2 to denote a transition with an a-step with a ∈ A (so a ≠ τ).
First we recapitulate the standard definitions of labelled transition system and weak and branching bisimulation. We do this in a "rule" style. The standard definition of R ⊆ X × X being a weak bisimulation relation is that we have, for all q, p, q′ ∈ X and all a ∈ A, [...], and also the symmetric variants of these two properties. Many rules in the rest of this paper have symmetric variants, like branching bisimulation above. We will not give these explicitly, but just refer to them as the "symmetric variants" of the rules.
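On a finite LTS the relation ↠_τ can be computed directly from the transition relation. The following minimal sketch represents an LTS as a set of triples (source, action, target) and uses the string "tau" for the silent action. Both choices are assumptions made for illustration.

```python
TAU = "tau"


def tau_closure(states, transitions):
    """Map each state q to the set of states reachable by zero or more tau-steps."""
    reach = {q: {q} for q in states}  # zero steps: q reaches itself
    changed = True
    while changed:
        changed = False
        for q in states:
            for (x, u, y) in transitions:
                if u == TAU and x in reach[q] and y not in reach[q]:
                    reach[q].add(y)
                    changed = True
    return reach


# Hypothetical LTS: s -tau-> s1 -tau-> s2 -a-> s
STATES = ["s", "s1", "s2"]
TRANSITIONS = {("s", TAU, "s1"), ("s1", TAU, "s2"), ("s2", "a", "s")}
REACH = tau_closure(STATES, TRANSITIONS)
```

Visible steps such as the a-step from s2 are ignored, in line with Convention 3.1, so s2 reaches only itself.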
We will rephrase the properties of weak/branching bisimulation (equivalently) as rules. These look uncommon for bisimulation, but will turn out to be useful when we look at their inverse, apartness.
Definition 3.2. A relation R ⊆ X × X on an LTS (X, A_τ, →) is a weak bisimulation relation if the following two rules and their symmetric variants hold for R.
The states q, p are weakly bisimilar, notation q ↔_w p, if and only if there exists a weak bisimulation relation R such that R(q, p).
A relation R ⊆ X × X is a branching bisimulation relation if the following two rules and their symmetric variants hold for R.
The states q, p are branching bisimilar, notation q ↔_b p, if and only if there exists a branching bisimulation relation R such that R(q, p).
It is well-known that weak bisimulation is really weaker than branching bisimulation (if s ↔_b t, then s ↔_w t, but in general not the other way around) and that various efficient algorithms for checking branching bisimulation exist ([GV90,JGKW20]). Here we wish to analyze these notions by looking at their opposite: weak apartness and branching apartness.
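The gap between the two notions can be observed on a small example. The sketch below computes the largest weak and the largest branching bisimulation on a finite LTS by repeatedly removing pairs that violate the transfer condition. It assumes the standard textbook transfer conditions for both notions. The LTS encodes the well-known pair a.(τ.b + c) and a.(τ.b + c) + a.b, with hypothetical state names.

```python
from itertools import product

TAU = "tau"


def tau_closure(states, trans):
    reach = {q: {q} for q in states}
    changed = True
    while changed:
        changed = False
        for q in states:
            for (x, u, y) in trans:
                if u == TAU and x in reach[q] and y not in reach[q]:
                    reach[q].add(y)
                    changed = True
    return reach


def bisimilarity(states, trans, kind):
    """Largest bisimulation of the given kind ("weak" or "branching")."""
    reach = tau_closure(states, trans)
    out = {q: [(u, y) for (x, u, y) in trans if x == q] for q in states}

    def answered(R, q, u, q1, p):
        # Can p answer the step q -u-> q1, relative to the candidate R?
        if kind == "branching":
            if u == TAU and (q1, p) in R:
                return True
            return any((q, p1) in R and (q1, p2) in R
                       for p1 in reach[p] for (v, p2) in out[p1] if v == u)
        if u == TAU:
            return any((q1, p1) in R for p1 in reach[p])
        return any((q1, p3) in R
                   for p1 in reach[p] for (v, p2) in out[p1] if v == u
                   for p3 in reach[p2])

    R = set(product(states, repeat=2))
    changed = True
    while changed:
        changed = False
        for (q, p) in list(R):
            if (q, p) in R and not all(answered(R, q, u, q1, p) for (u, q1) in out[q]):
                R -= {(q, p), (p, q)}  # keep the candidate symmetric
                changed = True
    return R


# P = a.(tau.b + c) and Q = a.(tau.b + c) + a.b, sharing the states M, B, Z.
STATES = ["P", "Q", "M", "B", "Z"]
TRANS = {("P", "a", "M"), ("M", TAU, "B"), ("M", "c", "Z"),
         ("B", "b", "Z"), ("Q", "a", "M"), ("Q", "a", "B")}

WEAK = bisimilarity(STATES, TRANS, "weak")
BRANCHING = bisimilarity(STATES, TRANS, "branching")
```

Here P and Q end up weakly bisimilar but not branching bisimilar. The step Q →_a B can only be answered by P →_a M, and M is not related to B because M can still do a c-step.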

H. Geuvers and B. Jacobs
Vol. 17:3
Definition 3.3. Given a labelled transition system (X, A_τ, →), we say that Q ⊆ X × X is a weak apartness relation in case the following rules hold for Q.
The states q and p are weakly apart, notation q #_w p, if for all weak apartness relations Q, we have Q(q, p).
The relation of "being weakly apart" is itself a weak apartness relation: it is the smallest weak apartness relation, so we have an inductive definition of "being weakly apart", using a derivation system. We express this explicitly in the following Corollary to the Definition.
Corollary 3.4. Given a labelled transition system (X, A_τ, →), and q, p ∈ X, we have q #_w p if and only if this can be derived using the following derivation rules.
Remark 3.5 (Also see Remark 2.7). The notions of weak bisimulation and weak apartness are defined using closure properties that a relation should satisfy. As weak apartness is an inductive notion, the rules that define the closure property for weak apartness can be used as the derivation rules of a proof system to derive q #_w p. More precisely: we have q #_w p if and only if this can be derived using a finite derivation with the rules of Corollary 3.4. Again, this is not the case for weak bisimilarity.
We now define the notion of branching apartness.
Definition 3.6. Given a labelled transition system (X, A_τ, →), we say that Q ⊆ X × X is a branching apartness in case the following rules hold for Q. The states q and p are branching apart, notation q #_b p, if for all branching apartness relations Q, we have Q(q, p).
Again, being branching apart is an inductive definition (it is the smallest branching apartness relation), so we have a derivation system. We express this explicitly in the following Corollary to the Definition, where again Remark 3.5 applies.
Corollary 3.7. Given a labelled transition system (X, A_τ, →), and q, p ∈ X, we have q #_b p if and only if this can be derived using the following derivation rules.
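Read as an algorithm, the derivation system amounts to a least fixed point computation on a finite LTS. The sketch below assumes the rules have the following form, which is a reconstruction and not a quotation: (in_b) from q →_a q′ and ∀p′, p″ (p ↠_τ p′ →_a p″ ⟹ q #_b p′ ∨ q′ #_b p″) conclude q #_b p; (in_bτ) is the same for a τ-step with the extra premise q′ #_b p; (symm) swaps the two sides. The state names are hypothetical.

```python
TAU = "tau"


def tau_closure(states, trans):
    reach = {q: {q} for q in states}
    changed = True
    while changed:
        changed = False
        for q in states:
            for (x, u, y) in trans:
                if u == TAU and x in reach[q] and y not in reach[q]:
                    reach[q].add(y)
                    changed = True
    return reach


def branching_apartness(states, trans):
    """Least relation closed under the assumed rules (in_b), (in_btau), (symm)."""
    reach = tau_closure(states, trans)

    def moves(p, u):
        # All (p1, p2) with p ->>tau p1 -u-> p2.
        return [(p1, p2) for p1 in reach[p]
                for (x, v, p2) in trans if x == p1 and v == u]

    apart = set()
    changed = True
    while changed:
        changed = False
        for (q, u, q1) in trans:
            for p in states:
                if (q, p) in apart:
                    continue
                if u == TAU and (q1, p) not in apart:
                    continue  # (in_btau) also requires q1 # p
                if all((q, p1) in apart or (q1, p2) in apart
                       for (p1, p2) in moves(p, u)):
                    apart |= {(q, p), (p, q)}  # conclusion, closed under (symm)
                    changed = True
    return apart


# Two hypothetical comparisons: s = tau.a versus r = a, and u = tau.a + b versus v = a + b.
STATES = ["s", "s1", "s2", "r", "r2", "u", "u1", "u2", "u3", "v", "v2", "v3"]
TRANS = {("s", TAU, "s1"), ("s1", "a", "s2"), ("r", "a", "r2"),
         ("u", TAU, "u1"), ("u1", "a", "u2"), ("u", "b", "u3"),
         ("v", "a", "v2"), ("v", "b", "v3")}

APART = branching_apartness(STATES, TRANS)
```

In the first comparison the τ-step is inert and no derivation of s #_b r exists. In the second, v #_b u1 is derived first (u1 cannot answer the b-step of v), after which (in_bτ) applied to the step from u to u1 gives u #_b v.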
Remark 3.8 (A note on symmetry). In the rules, e.g. of Definition 3.6 and Corollary 3.7, there is a choice of adding symmetry as a rule, or adding symmetric variants of the rules. In our presentation, we choose to add symmetry as a rule. In the literature on bisimulation, it is standard to add symmetric variants of the rules, and then it can be shown that the relations themselves are symmetric. To be clear, the symmetric variants of the rules of Corollary 3.7 would be as follows.
and then one can prove that (without rule (symm)), the relation #_b is symmetric.
In the following, we will regularly prove properties about an apartness relation by induction on the derivation and then of course it matters which rules one has chosen. We found that having symmetry as a rule, and not a slightly informal "symmetric duplication" of all rules, is clearer and more concise. In fact, for the proofs that are given below, it doesn't really matter what rules one has chosen: symmetry as a rule, or symmetry "built in" by adding the symmetric variants of the rules. The induction proofs that follow are mostly symmetric in either side of the apartness sign, with one notable exception, and that is the stuttering property, Lemma 3.21.
We now show how to use apartness on a few simple well-known examples. We show how we can derive that two states are branching apart (i.e. not branching bisimilar) by giving a derivation of this fact using the rules for #_b.
Example 3.9. We describe two LTSs from [DV95] that serve as examples to show the difference between weak and branching bisimulation. We apply our apartness definitions to show the difference between #_w and #_b. The LTS on the left consists of states {s, s_1, s_2, s_3, s_4, r, r_1, r_2, r_3} and the point is that s #_b r, while s ↔_w r. The LTS on the right consists of states {q, q_1, q_2, q_3, q_4, q_5, p, p_1, p_2, p_3, p_4} and the point is that q #_b p.
In the LTS on the left, we have s #_b r_1, because s can do a d-step, while r_1 cannot. Therefore, s #_b r, because s →_c s_2 and the only possible c-step from r is r ↠_τ r_1 →_c r_3, and s #_b r_1. Given that we now have a derivation system, we can also give a derivation of s #_b r.
On the other hand we have s ↔_w r. This can be seen by the weak bisimulation ∼ given by the following equivalence classes: {s, r}, {s_1, r_1}, {s_2, s_4, r_3}, {s_3, r_2}. This is indeed a weak bisimulation following Definition 3.2. A different way to prove s ↔_w r is by showing ¬(s #_w r), which can be achieved by proving that there is no derivation of s #_w r. This is more involved, as we have to reason about all possible derivations of s #_w r. The only relevant candidate is below, which fails on finding a derivation of s_2 #_w s_3 (which does not exist).
In the LTS on the right, we have q_5 #_b p_1, because q_5 cannot do an e-step. Therefore, q #_b p, because q →_c q_5 and the only c-step from p leads to p_1 and q_5 #_b p_1. Also here, we can give a derivation.
The notions of weak, resp. branching, apartness and weak, resp. branching, bisimulation relate in the standard way we have seen before in Section 2: R is a weak (branching) apartness if and only if ¬R is a weak (branching) bisimulation. This also implies that we can transfer properties from (weak/branching) bisimulation to (weak/branching) apartness and vice versa. In the next Section, we show how we can use apartness to prove results about bisimulation. We now summarize the results that relate bisimulation and apartness in a couple of Lemmas.
Lemma 3.10. A relation R over an LTS is a weak (resp. branching) bisimulation if and only if ¬R is a weak (resp. branching) apartness.
Proof. The proofs are by some standard logical manipulations, similar to the proof of Lemma 2.6. To simplify the work, it is easiest to first replace the rule (symm) by the "symmetric variants" of the other rules, as discussed in Remark 3.8.
We have ↔_w = ⋃{R | R is a weak bisimulation} and similarly for ↔_b, and it is straightforward to verify that ↔_w is itself a weak bisimulation (and similarly for ↔_b). For apartness we have the same result: #_w = ⋂{Q | Q is a weak apartness}, and similarly for #_b. The last part of the Lemma follows from [...]. This results in the following Lemma.
(2) #_w (resp. #_b) is the smallest weak (resp. branching) apartness.
3.1. Using apartness to prove results about bisimulation. The first result we prove is that weak apartness is included in branching apartness, which implies the well-known result that branching bisimulation is included in weak bisimulation. The interesting aspect is that we prove these results by induction (on the derivation). Then we will prove co-transitivity of branching apartness (which implies transitivity of branching bisimulation). We introduce semi-branching apartness as a means to prove a stuttering property and some other basic properties (for semi-branching apartness), from which we can conclude that semi-branching and branching apartness are the same, from which we derive co-transitivity.
Proof. By induction on the derivation of s #_w t, where we distinguish cases according to the last rule.
[...], which are the hypotheses for the rule (in_bτ), so we conclude q #_b p by the rule (in_bτ).

It is well-known from the literature that the relations ↔_w and ↔_b are equivalence relations. For ↔_w, the proof is in [Mil89]. For ↔_b, the proof is remarkably subtle, as it is not the case in general that, if R_1 and R_2 are branching bisimulations, then R_1 • R_2 is a branching bisimulation. In [Bas96] the transitivity of ↔_b is proven (and thereby that ↔_b is an equivalence relation), using the notion of semi-branching bisimulation. In [GW96,Bas96], semi-branching bisimulation is also used to prove the so-called stuttering property. The results from those papers can also be cast in terms of apartness, which we will do now. We prove that ↔_b is an equivalence relation by proving that #_b is a proper apartness relation and using the fact that ↔_b is the complement of #_b. Similarly we prove an apartness stuttering property for #_b and conclude the stuttering property for ↔_b from that. It turns out that, for proving co-transitivity of #_b (and also stuttering) we need a notion of semi-branching apartness, which is comparable to the complement of the notion of semi-branching bisimulation of [GW96,Bas96] (but slightly different). We introduce those notions first.
Definition 3.13. A relation Q ⊆ X × X is a semi-branching apartness in case the following derivation rules hold for Q. (So (in_sbτ) replaces the rule (in_bτ).) The states q and p are semi-branching apart, notation q #_sb p, if for all semi-branching apartness relations Q, we have Q(q, p).
So the rules (symm) and (in_b) are the same as for branching apartness of Definition 3.6, and only the rule for τ-steps has been modified. Note that in particular, to derive Q(q, p) from q →_τ q′, we need to prove Q(q′, p) first.
Corollary 3.14. The states q and p are semi-branching apart, q #_sb p, if this can be derived from the following rules.

    p #_sb q
    -------- (symm)
    q #_sb p

    q →_τ q′    q′ #_sb p    ∀p′, p″ (p ↠_τ p′ →_τ p″ ⟹ q #_sb p′ ∨ (q′ #_sb p″ ∧ q′ #_sb p′))
    ------------------------------------------------------------------------------------------ (in_sbτ)
    q #_sb p

We also define the dual (complement) notion of a semi-branching bisimulation relation.
Definition 3.15. A relation R ⊆ X × X is a semi-branching bisimulation relation if the following two derivation rules and the symmetry rule hold for R.
The states q, p are semi-branching bisimilar, notation q ↔_sb p, if and only if there exists a semi-branching bisimulation relation R such that R(q, p).
It can again be shown that Q is a semi-branching apartness if and only if ¬Q is a semi-branching bisimulation. Using this and the fact that #_sb is the smallest semi-branching apartness and ↔_sb is the largest semi-branching bisimulation, we obtain that q #_sb p ⇔ ¬(q ↔_sb p).
Our definition of semi-branching bisimulation is slightly different from the one in [Bas96] and [GW96], but it can be shown that they are equivalent.
The rest of this section will be devoted to proving the co-transitivity of #_b (and thereby that ↔_b is an equivalence relation) in the following steps.
(1) We prove that q #_sb p ⟹ q #_b p: Lemma 3.16.
(2) We prove a number of basic Lemmas for #_sb; typically useful results we would also like to have for #_b, but cannot obtain directly for #_b: Lemma 3.17 and Corollary 3.19.
(3) We prove the apartness stuttering property for #_sb: Lemma 3.21.
(4) We prove that q #_b p ⟹ q #_sb p, using the apartness stuttering property, and we conclude that #_b = #_sb: Lemma 3.22.
(5) We prove co-transitivity for #_b, using the basic lemmas mentioned above.
Many of the proofs will proceed by induction on the derivation, where we use the apartness as an inductively defined relation (defined via derivation rules). For one of the basic Lemmas under (2) we will move over to the "bisimulation view", as the result seems easier to obtain there.
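The equality #_b = #_sb targeted by these steps can be tested on small examples. The sketch below computes both relations as least fixed points and differs only in the premise used for τ-steps. The rule shapes are assumptions: (in_bτ) asks for q #_b p′ ∨ q′ #_b p″, while (in_sbτ) asks for q #_sb p′ ∨ (q′ #_sb p″ ∧ q′ #_sb p′). The LTS and all names are hypothetical.

```python
TAU = "tau"


def tau_closure(states, trans):
    reach = {q: {q} for q in states}
    changed = True
    while changed:
        changed = False
        for q in states:
            for (x, u, y) in trans:
                if u == TAU and x in reach[q] and y not in reach[q]:
                    reach[q].add(y)
                    changed = True
    return reach


def apartness(states, trans, semi):
    """Least (semi-)branching apartness under the assumed rule shapes."""
    reach = tau_closure(states, trans)

    def moves(p, u):
        return [(p1, p2) for p1 in reach[p]
                for (x, v, p2) in trans if x == p1 and v == u]

    apart = set()
    changed = True
    while changed:
        changed = False
        for (q, u, q1) in trans:
            strict = semi and u == TAU  # only the tau-rule differs
            for p in states:
                if (q, p) in apart:
                    continue
                if u == TAU and (q1, p) not in apart:
                    continue  # both tau-rules require q1 # p
                if all((q, p1) in apart
                       or ((q1, p2) in apart and (not strict or (q1, p1) in apart))
                       for (p1, p2) in moves(p, u)):
                    apart |= {(q, p), (p, q)}  # conclusion plus (symm)
                    changed = True
    return apart


# Hypothetical LTS with tau-steps on both sides of several comparisons.
STATES = ["s", "s1", "s2", "r", "r2", "u", "u1", "u2", "u3", "v", "v2", "v3"]
TRANS = {("s", TAU, "s1"), ("s1", "a", "s2"), ("r", "a", "r2"),
         ("u", TAU, "u1"), ("u1", "a", "u2"), ("u", "b", "u3"),
         ("v", "a", "v2"), ("v", "b", "v3")}

BRANCHING = apartness(STATES, TRANS, semi=False)
SEMI = apartness(STATES, TRANS, semi=True)
```

The premise of the semi-branching rule is the stronger one, so SEMI ⊆ BRANCHING is immediate, which mirrors Lemma 3.16. The converse inclusion is the part that needs the stuttering argument.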
Lemma 3.16. For all states q, p, q # sb p =⇒ q # b p.
Proof. By induction on the derivation of q # sb p. The only interesting case is when the last rule applied is (in sbτ ).
We have q →τ q′ and by induction hypothesis q′ #b p and ∀p′, p″ (p ↠τ p′ →τ p″ ⟹ q′ #b p″ ∨ (q #b p′ ∧ q #b p″)). To apply rule (in bτ) and conclude q #b p we need to prove ∀p′, p″ (p ↠τ p′ →τ p″ ⟹ q #b p′ ∨ q′ #b p″). Let p′, p″ be such that p ↠τ p′ →τ p″. From the induction hypothesis we have two cases.

H. Geuvers and B. Jacobs
Vol. 17:3

We first state two simple derivable rules that are nevertheless convenient to make explicit for use in further proofs.
Lemma 3.17. The following two derived rules hold for #sb. (1)
Proof. For the proof of (1), assume (a) p ↠τ t, . Then by rule (in sbτ), we find q #sb p, so we may assume that (a) p ↠τ t is non-empty and we have (e) p ↠τ p′ →τ t.
We use (e) in (d), taking t for p″, and find that q′ #sb t or q #sb p′ ∧ q #sb t. In the latter case we have q #sb t and we are done. In case q′ #sb t, to prove q #sb t, we apply rule (in sbτ). We need to show that ∀t′, t″ (t ↠τ t′ →τ t″ ⟹ q′ #sb t″ ∨ (q #sb t′ ∧ q #sb t″)), which follows from p ↠τ p′ →τ t and (d).
The proof of (2) is similar, but slightly simpler.
Lemma 3.18. The following two derived rules hold for ↔sb (and as a matter of fact they hold for any semi-branching bisimulation relation). (1)
Proof. Assuming q0 ↠τ q has the shape q0 →τ q1 … →τ qn = q, the proof proceeds by induction on n. We only treat (1), because (2) is similar (but slightly simpler).
• (n = 0) We need to show that the following holds which is immediate by (bis sbτ ) • (n > 0) We need to show that the following holds By the induction hypothesis, we find ).
Corollary 3.19. The following two derived rules hold for #sb. (1)
Proof. Immediately from Lemma 3.18 by taking the complement.
In the literature on branching bisimulation, the "stuttering property" refers to the following property for a relation R, that we depict as a rule here.
So, if in a τ -path, the first and the last state are bisimilar with p, then all states in between are bisimilar with p. In [GW96] (and also in other papers), the stuttering property is proved for ↔ b . We cast this property in terms of apartness.
Definition 3.20. A relation Q satisfies the apartness stuttering property if the following rule holds for Q.

\[
\frac{r \twoheadrightarrow_\tau q \twoheadrightarrow_\tau t \qquad Q(q, p)}{Q(r, p) \lor Q(t, p)}\;(\mathrm{stut})
\]

The equivalence between Q satisfying the apartness stuttering property and ¬Q satisfying the stuttering property of 3.1 should be clear. Another way of phrasing the stuttering property for bisimulations, e.g. in [DV95], is as follows.
The apartness variation of this property is Property 3.3 follows easily from the apartness stuttering property of Definition 3.20, using irreflexivity of Q.
Lemma 3.21. The relation #sb (semi-branching apartness) satisfies the apartness stuttering property.
Proof. By induction on the proof of q #sb p. There are four cases to consider: either q #sb p was derived by rule (in sbτ) or (in b), or p #sb q was derived by rule (in sbτ) or (in b), and then q #sb p was derived by symmetry (symm).
• Case q #sb p was derived using rule (in sbτ). So we have the premises of rule (in sbτ) for q #sb p. Then we conclude r #sb p using Corollary 3.19 (1), and so r #sb p ∨ t #sb p.
• Case q #sb p was derived using rule (in b). So we have the premises of rule (in b) for q #sb p. Then we conclude r #sb p using Corollary 3.19 (2), and so r #sb p ∨ t #sb p.
• Case p #sb q was derived using rule (in sbτ). So we have

\[
\frac{p \to_\tau p' \qquad p' \mathrel{\#_{sb}} q \qquad \forall q', q''\,\bigl(q \twoheadrightarrow_\tau q' \to_\tau q'' \implies p' \mathrel{\#_{sb}} q'' \lor (p \mathrel{\#_{sb}} q' \land p \mathrel{\#_{sb}} q'')\bigr)}{p \mathrel{\#_{sb}} q}\;(\mathrm{in}_{sb\tau})
\]

Then we conclude p #sb t using Lemma 3.17 (1), and so r #sb p ∨ t #sb p.
• Case p #sb q was derived using rule (in b). So we have the premises of rule (in b) for p #sb q. Then we conclude p #sb t using Lemma 3.17 (2), and so r #sb p ∨ t #sb p.

Lemma 3.22. For all states q, p: q #b p ⟹ q #sb p. Consequently, #b = #sb.
Proof. We prove q #b p ⟹ q #sb p by induction on the derivation of q #b p, using the apartness stuttering property. We conclude #b = #sb using Lemma 3.16.
For the induction we only treat the case where q #b p has been derived using the rule (in bτ), as the other cases are immediate. So we have q →τ q′ and by induction hypothesis we have (a) q′ #sb p and (b) ∀p′, p″ (p ↠τ p′ →τ p″ ⟹ q #sb p′ ∨ q′ #sb p″). To be able to apply the rule (in sbτ) to conclude q #sb p, we need to prove ∀p′, p″ (p ↠τ p′ →τ p″ ⟹ q′ #sb p″ ∨ (q #sb p′ ∧ q #sb p″)).
Let p′, p″ be such that p ↠τ p′ →τ p″. Using (b) we have two cases.
• Case q #sb p′. Then, by p ↠τ p′ →τ p″ and the stuttering property (Lemma 3.21), we have q #sb p ∨ q #sb p″. In case q #sb p, we are done, because that is exactly what we had to prove in the end; in case q #sb p″ we have q′ #sb p″ ∨ (q #sb p′ ∧ q #sb p″) and we are done.
• Case q′ #sb p″. Then q′ #sb p″ ∨ (q #sb p′ ∧ q #sb p″) and we are done.
As a consequence of this Lemma, Corollary 3.19 and Lemma 3.17 also apply to branching apartness, # b .
Lemma 3.23. The relation #b is co-transitive: for all q, p, r: q #b p ⟹ q #b r ∨ r #b p.
Proof. We prove q #b p ⟹ ∀r (q #b r ∨ r #b p) by induction on the derivation of q #b p, using the properties we have proved before about #b and #sb.
• Case q # b p was derived using (in bτ ).
Let r be a state. If (a) q′ #b r and (b) ∀r′, r″ (r ↠τ r′ →τ r″ ⟹ q #b r′ ∨ q′ #b r″), then q #b r and we are done. Otherwise, ¬(q′ #b r) or ∃r′, r″ (r ↠τ r′ →τ r″ ∧ ¬(q #b r′) ∧ ¬(q′ #b r″)).
If ¬(q′ #b r), we apply induction on q′ #b p to derive q′ #b r ∨ r #b p, from which we conclude r #b p and we are done.
In the other case we consider r , r with (d) ¬(q # b r ) and (e) ¬(q # b r ). We will prove that r # b p. Let p , p be such that p τ p → τ p . (If there are no such p , p , then r # b p due to Corollary 3.19 (1) and the fact that r # b p, which follows from induction on q # b p, which yields q # b r ∨ r # b p, but we know ¬(q # b r ) from (e).) Then Then by induction q # b r ∨ p # b r , so p # b r by (e) and so p # b r ∨ p # b r . So r # b p and we apply Corollary 3.19 (1), to conclude r # b p.
• Case q # b p was derived using (in b ).

Let r be a state. If ∀r , r (r τ r → a r =⇒ q # b r ∨ q # b r ), then q # b r and we are done. Otherwise, consider r , r with (d) ¬(q # b r ) and (e) ¬(q # b r ). We will prove that r # b p. Let p , p be such that p τ p → a p . (If there are no such p , p , then r # b p due to Corollary 3.19 (2).) Then by induction q # b r ∨ p # b r , so p # b r by (e) and so p # b r ∨ p # b r . So r # b p and we apply Corollary 3.19 (2), to conclude r # b p.
The co-transitivity is the crucial property for showing that # b is a proper apartness relation.
Theorem 3.24. The relation # b is a proper apartness relation (in the sense of Definition 2.4).
Proof. We need to verify irreflexivity, symmetry and co-transitivity. Symmetry is built in and co-transitivity has been proved in Lemma 3.23. For irreflexivity, consider the shortest derivation of q #b q (for some q). If this is derived using rule (in b), we have q →a q′ and q #b q ∨ q′ #b q′, which means that there is a shorter derivation of a reflexivity, contradiction. If this is derived using rule (in bτ), we have q →τ q′ and q #b q ∨ q′ #b q′, which again means that there is a shorter derivation of a reflexivity, contradiction. So there is no derivation of q #b q for any q.
Corollary 3.25. The relation ↔ b is an equivalence relation.
Proof. Immediately from Theorem 3.24 using the fact that ↔ b is the complement of # b .
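Spelled out, the step from co-transitivity of #b to transitivity of ↔b is a contraposition. The display below is only a sketch of this standard argument; it uses nothing beyond Lemma 3.23 and the fact that ↔b is the complement of #b.

```latex
\begin{align*}
        & q \mathrel{\underline{\leftrightarrow}_b} r \;\wedge\; r \mathrel{\underline{\leftrightarrow}_b} p \\
\iff\;  & \neg(q \mathrel{\#_b} r) \;\wedge\; \neg(r \mathrel{\#_b} p) \\
\iff\;  & \neg\bigl(q \mathrel{\#_b} r \;\vee\; r \mathrel{\#_b} p\bigr) \\
\implies\; & \neg(q \mathrel{\#_b} p)
          && \text{(contrapositive of co-transitivity, Lemma 3.23)} \\
\iff\;  & q \mathrel{\underline{\leftrightarrow}_b} p .
\end{align*}
```

Reflexivity and symmetry of ↔b follow in the same way from irreflexivity and symmetry of #b.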
3.2. Using branching apartness.
Further research has to establish whether the notion of apartness is really useful in the study and analysis of labelled transition systems. In the previous section we have shown how to use apartness in the meta-theory of branching bisimulation to give some new proofs for known properties. In follow-up research we would like to analyze well-known algorithms for checking branching bisimulation, as in [JGKW20], and possibly develop variations on those algorithms. One way to decide branching bisimilarity of states in a finite LTS is by deciding branching apartness. In the present section, we give some ideas of what an algorithm for deciding branching apartness could look like and we also give some variations of the rules for branching apartness, also combined with branching bisimulation, that might prove useful. At the end of this section, we briefly mention rooted branching apartness as the complement of rooted branching bisimulation. Rooted branching bisimulation is a congruence [GW96,Fok00], while branching bisimulation is not. For apartness this means that operations are strongly extensional with respect to rooted branching apartness, while they are not with respect to branching apartness.
An obvious algorithm to decide q #b p is to try to find a derivation of q #b p in a structured way and to conclude that q ↔b p holds in case such a derivation cannot be found. It may look as if, for LTSs with loops, this could lead to an infinite search process. But this can be avoided if we look for a shortest derivation and keep track of goals that we have already encountered. If we encounter the goal again, we can conclude it is not provable. Also, some of the goals will be disjunctions of apartness assertions, like q #b p′ ∨ q′ #b p″. In that case we will search for a proof of q #b p′ and for a proof of q′ #b p″, in parallel, and we conclude as soon as we have found a proof of one of them. To clarify this point a bit better we show two pairs of LTSs with loops and how a proof of branching apartness is found for the first pair, and a proof of branching bisimilarity for the second pair.
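For a finite LTS, the same relation can also be computed bottom-up. The Python sketch below does not implement the goal-directed search with loop detection described above; it swaps in a naive saturation that keeps applying the rules (symm), (in b) and (in bτ) until no new pair can be added, which yields the least relation closed under them. It assumes the rules read: from q →a q′ and ∀p′, p″ (p ↠τ p′ →a p″ ⟹ q # p′ ∨ q′ # p″) conclude q # p, with the additional premise q′ # p when a = τ. The names `TAU`, `tau_closure` and `branching_apartness`, and the encoding of an LTS as a list of triples, are illustrative choices and not taken from the paper.

```python
TAU = "tau"  # hypothetical label used here for the silent step


def tau_closure(states, trans):
    """reach[p] = set of states reachable from p by zero or more tau steps."""
    reach = {s: {s} for s in states}
    changed = True
    while changed:
        changed = False
        for (s, a, t) in trans:
            if a != TAU:
                continue
            for p in states:
                if s in reach[p] and t not in reach[p]:
                    reach[p].add(t)
                    changed = True
    return reach


def branching_apartness(states, trans):
    """Least set of pairs closed under (symm), (in b) and (in b-tau)."""
    reach = tau_closure(states, trans)
    apart = set()

    def all_replies_fail(q, q2, a, p):
        # For every p ->>tau p1 -a-> p2 we need q # p1 or q2 # p2.
        for p1 in reach[p]:
            for (s, b, p2) in trans:
                if s == p1 and b == a:
                    if (q, p1) not in apart and (q2, p2) not in apart:
                        return False
        return True

    changed = True
    while changed:
        changed = False
        for (q, a, q2) in trans:
            for p in states:
                if (q, p) in apart:
                    continue
                # Extra premise of (in b-tau): the tau-successor is apart from p.
                if a == TAU and (q2, p) not in apart:
                    continue
                if all_replies_fail(q, q2, a, p):
                    apart.add((q, p))
                    apart.add((p, q))  # rule (symm)
                    changed = True
    return apart
```

Calling `branching_apartness(states, trans)` on a finite LTS returns the set of apart pairs; two states are then branching bisimilar exactly when their pair is absent from the result. The saturation is quadratic in the number of state pairs per round and is meant only to make the rules concrete, not to compete with partition-refinement algorithms such as those in [JGKW20].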
Example 3.26. We give 4 LTSs with loops. In the first two LTSs, we have q 0 # b p 0 , which is established by the derivation below.
Observe that an algorithm would have to go through all possible d-steps from q 0 and "replay" them from p 0 . We have chosen the "successful" d-step that leads to a derivation of q 0 # b p 0 . Similarly in proving q 0 # b p 2 , we have chosen the successful d-step, q 0 → d q 1 . When proving a disjunction, an algorithm would have to try to prove both parts of the disjunction in parallel. We have only shown the successful one.
For the third and fourth LTS, we have q ↔ b p, so we want to show that ¬(q # b p). This is achieved by trying to find the shortest derivation and observing there is none. This search leads to the following derivation.
Note that this is the complete search tree for a derivation of q # b p, where we have stopped at a branch as soon as we find a goal that we have already encountered. Therefore we fail at the goal q # b p ∨ q # b p, because both q # b p and q # b p have already been encountered.
We now look into some variations on the rules for branching apartness.
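Lemma 3.27. The following alternative in b-rule is sound for #b.

\[
\frac{q \to_a q' \qquad \forall p', p''\,\bigl(p \twoheadrightarrow_\tau p' \to_a p'' \implies p \mathrel{\#_b} p' \lor q' \mathrel{\#_b} p''\bigr)}{q \mathrel{\#_b} p}\;(\mathrm{in}^A_b)
\]

Proof. Assume we have q →a q′ and ∀p′, p″ (p ↠τ p′ →a p″ ⟹ p #b p′ ∨ q′ #b p″). We need to prove q #b p. Suppose ¬(q #b p). We want to apply the original in b-rule, so we need to prove the hypothesis of that rule, which is ∀p′, p″ (p ↠τ p′ →a p″ ⟹ q #b p′ ∨ q′ #b p″). Let p′, p″ be such that p ↠τ p′ →a p″. By our assumption there are two cases.
• Case p #b p′. Then q #b p ∨ q #b p′ by co-transitivity. We know from our assumption that ¬(q #b p), so q #b p′ and so q #b p′ ∨ q′ #b p″ and we are done.
• Case q′ #b p″. Then q #b p′ ∨ q′ #b p″ and we are done.
So we can apply the original in b-rule and conclude q #b p. This contradicts our assumption ¬(q #b p), so we conclude q #b p.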
We conjecture that the rule (in A b ) is also complete for proving # b , that is: if we replace rule (in b ) with rule (in A b ) we can derive the same apartness judgments. If we write # A b for the system with rule (in b ) replaced by rule (in A b ), Lemma 3.27 states that q # A b p =⇒ q # b p. For the proof of completeness, q # b p =⇒ q # A b p, it seems we need to prove co-transitivity of # A b first. Using the notion of apartness, we can also add some rules that combine apartness and bisimulation and that may be useful in analyzing or developing new algorithms for checking branching bisimulation, as in [JGKW20].
Lemma 3.28. The following two rules are sound for proving branching apartness # b .
Proof. The proof is immediate from the fact that # b = ¬ ↔ b and Corollary 3.7.
In the literature, the rules concerning bisimulation are often depicted in a diagram for better memorization. The two rules above can be depicted as follows.
On the left: Suppose that q # b p and for all p , p with Similarly, we introduce an adapted rule for branching bisimulation.
Lemma 3.29. The following rule is sound for branching bisimulations. If R is a branching bisimulation, then the following rule (and its symmetric variant) holds.
Proof. The proof is immediate from the definition of branching bisimulation (Definition 3.2) and the fact that, if R is a branching bisimulation, then q # b p =⇒ ¬R(q , p).
The notion of rooted branching bisimulation has been introduced to recover the failure of the congruence property for branching bisimulation [GW96,BBR10,Fok00]. If two labelled transition systems are branching bisimilar and one composes them both with a third one, one expects the newly obtained LTSs to be branching bisimilar again. But this is not the case. Of course, this also depends on the notion of composition and therefore this problem is usually cast in terms of process terms and a process operator f , where congruence means that if q 1 ↔ b p 1 and q 2 ↔ b p 2 , then f (q 1 , q 2 ) ↔ b f (p 1 , p 2 ). Here, the bisimulation equivalence should be understood as being between the interpretation of the process terms as LTSs. A main example of non-congruence arises from the non-deterministic choice operator +. A process q + p can non-deterministically choose for q and do a step in q or for p and do a step in p. The LTS for q + p arises from joining the LTS for q with the one for p at the root node. This leads to the non-congruence exemplified below.
Example 3.30. For the two LTSs on the left we have q 0 ↔ b p 0 . If we compose them via non-deterministic choice with an LTS that does just a c-step, we get the two LTSs on the right, for which we have ¬(q 0 ↔ b p 0 ). For the LTSs on the right, a derivation of q 0 # b p 0 is as follows.
To remedy the non-congruence, the notion of rooted branching bisimulation has been introduced.
Definition 3.31 [GW96]. A relation R on an LTS is a rooted branching bisimulation if the following properties hold.
• R is a branching bisimulation.
• R satisfies the rule (bis rb), where x ranges over Aτ.
The states q and p are rooted branching bisimilar, notation q ↔rb p, if there is a rooted branching bisimulation R such that R(q, p).
As a complement we define that the relation Q on an LTS is a rooted branching apartness if the following properties hold.
• Q is a branching apartness.
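• Q satisfies the rule (in rb), where x ranges over Aτ,

\[
\frac{q \to_x q' \qquad \forall p'\,\bigl(p \to_x p' \implies q' \mathrel{\#_b} p'\bigr)}{Q(q, p)}\;(\mathrm{in}_{rb})
\]

The states q and p are rooted branching apart, notation q #rb p, if for all rooted branching apartness relations Q, we have Q(q, p).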
It is easy to see that q ↔rb p if and only if ¬(q #rb p) and that q #rb p if and only if this is derivable using the rules (in b), (in bτ) and (in rb).
Example 3.32. As a continuation of Example 3.30, we can now show that both for the two LTSs on the right and also for the two LTSs on the left, we have q 0 # rb p 0 . For the pair on the left, the derivation is simply as follows.
To understand how congruence is regained in general and relate that to strong extensionality, assume we have an operation + on LTSs that represents non-deterministic choice, which means to join the root-nodes of two LTSs. So process terms of the form q + p have the following behavior as LTS.
The interpretation of q + p as an LTS should be understood as the union of the LTSs arising from q and p that are joined at their root node. That rooted branching bisimulation is a congruence for the + operator means that, if q1 ↔rb p1 and q2 ↔rb p2, then (q1 + q2) ↔rb (p1 + p2). We will look into this property from the "apartness side", where we want + to be strongly extensional for #rb: if (q1 + q2) #rb (p1 + p2), then q1 #rb p1 ∨ q2 #rb p2. We can prove strong extensionality of + as follows. Assume that (q1 + q2) #rb (p1 + p2). This means that there is a state r and x ∈ Aτ with (q1 + q2) →x r and ∀r′ ((p1 + p2) →x r′ ⟹ r #b r′). The transition (q1 + q2) →x r either arises from q1 →x r or from q2 →x r. In case q1 →x r we have the derivation below. In case q2 →x r we have a symmetric situation that we don't depict here.
which, together with q 1 → x r, using rule (in rb ) gives q 1 # rb p 1 . In case q 2 → x r we have a symmetric situation, so we can conclude q 1 # rb p 1 ∨ q 2 # rb p 2 .

Coalgebras and lifting
This section recalls some basic facts about the description of bisimulations on coalgebras in terms of lifting of the functor from sets to relations. We write Rel for the category of binary relations R ⊆ X × X. A morphism f : (R ⊆ X × X) → (S ⊆ Y × Y) in Rel is a function f : X → Y between the underlying sets satisfying (x1, x2) ∈ R ⟹ (f(x1), f(x2)) ∈ S. This can equivalently be expressed via the existence of a function f′ : R → S that commutes with f × f and the inclusions of R and S. We can also describe this situation via an inclusion R ⊆ (f × f)^{-1}(S). There is an obvious functor Rel → Set which sends a relation R ⊆ X × X to its underlying set X. The poset P(X × X) of relations on a set X is often called the "fibre over X", since it is mapped by this functor to X.
In this setting we restrict ourselves to (endo)functors F : Set → Set, with associated category CoAlg(F ) of coalgebras. There is a standard way to "lift" such a functor F from Set to Rel in a commuting diagram, as on the left below.

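\[
\begin{array}{ccc}
\mathbf{Rel} & \xrightarrow{\;\mathrm{Rel}(F)\;} & \mathbf{Rel} \\
\downarrow & & \downarrow \\
\mathbf{Set} & \xrightarrow{\quad F \quad} & \mathbf{Set}
\end{array}
\]

For R ⊆ X × X one obtains Rel(F)(R) ⊆ F(X) × F(X) via:

\[
\mathrm{Rel}(F)(R) = \{\, (u, v) \mid \exists w \in F(R).\; F(r_1)(w) = u \text{ and } F(r_2)(w) = v \,\}.
\]

Here we write the inclusion map R ↪ X × X as a pair ⟨r1, r2⟩ : R → X × X. It is not hard to see that Rel(F) is functorial: for a morphism f : R → S in Rel, the function F(f) is a morphism Rel(F)(R) → Rel(F)(S). A bisimulation for a coalgebra c : X → F(X) is a relation R ⊆ X × X for which c is a map in the category Rel of the form c : R → Rel(F)(R). Thus we may consider the category of coalgebras CoAlg(Rel(F)) as the category of bisimulations for F-coalgebras.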
For (Kripke) polynomial functors the generic form of relation lifting (4.3) specializes to well-known formulas, see [Jac16] for details.
(1) Rel(id) = id;
(2) Rel(A) = Eq(A), where A on the left is the constant-A functor;

(1) Let F be the functor F(X) = A × X for streams, as in Definition 2.1. We can then describe the coalgebra c : Y → A × Y as a pair c = ⟨h, t⟩ for a "head" function h : Y → A and a "tail" function t : Y → Y. The lifting is:
Rel(F)(R) = {((a1, y1), (a2, y2)) | a1 = a2 and R(y1, y2)} = {((a, y1), (a, y2)) | R(y1, y2)}.
Thus, ⟨h, t⟩ being a map R → Rel(F)(R) in Rel corresponds to the usual definition of bisimulation: R(y1, y2) ⟹ h(y1) = h(y2) and R(t(y1), t(y2)).
(2) For deterministic automata one uses the functor F(X) = X^A × 2, where 2 = {0, 1}, as in Definition 2.2. A coalgebra is again a tuple c = ⟨c1, c2⟩ : Y → Y^A × 2, with the following standard notation: y →a y′ ⇔ c1(y)(a) = y′ and y↓ ⇔ c2(y) = 0.
(3) We now use a functor F (X) = P X + X × A × X and investigate the associated form of bisimulation. We show that it resembles the formulation that we have seen earlier for weaker forms of bisimulation. So let us start with a coalgebra c : Y → P Y + Y × A × Y and write: x → 1 y ⇔ κ 1 y ∈ c(x) and x → 2 y → a z ⇔ κ 2 (y, a, z) ∈ c(x).
These clauses show that this categorical way of capturing silent steps (via →1 and →2) means that the numbers of silent steps in both coordinates must be equal. This differs in an important way from weak/branching bisimulation, where a single silent step on one coordinate can be mimicked by multiple (zero or more) silent steps in the other coordinate. We refer to the literature for different ways of bringing the two approaches closer together [SdVW09,BK17,Bre15,BMP15,GP14].
The next result gives a concrete description of what is captured abstractly in [HJ98].
Proposition 4.4. The equality functor Eq : Set → Rel restricts to Eq : CoAlg(F ) → CoAlg(Rel(F )), for each functor F , and preserves final coalgebras. This implies that bisimilar elements become equal when mapped to the final coalgebra.
Proof. Let c : X → F(X) be an arbitrary coalgebra. Applying the equality functor to it yields a Rel(F)-coalgebra c : Eq(X) → Eq(F(X)) = Rel(F)(Eq(X)). Now let ζ : Z → F(Z), an isomorphism, be the final F-coalgebra. Let c : R → Rel(F)(R) be a coalgebra, so that R is a bisimulation on c : X → F(X). We write X/R for the quotient of X under (the least equivalence relation containing) R, with quotient map q_R : X → X/R. Then q_R : R → Eq(X/R) in Rel, giving a map F(q_R) ∘ c : R → Rel(F)(Eq(X/R)) = Eq(F(X/R)). This means that there is a unique coalgebra c/R : X/R → F(X/R) for which q_R is a coalgebra homomorphism from c to c/R. As a result, the unique coalgebra homomorphism g from c to the final coalgebra ζ factors through q_R. Then, for (x, x′) ∈ R, q_R(x) = q_R(x′), and therefore g(x) = g(x′). This means that g is a Rel(F)-coalgebra homomorphism R → Eq(Z). By finality of ζ : Z → F(Z) there can be at most one such homomorphism.
We conclude this section with the following observation. As we have seen, R is a bisimulation for a coalgebra c when R ⊆ (c × c)^{-1}(Rel(F)(R)). Thus, the bisimilarity ↔c, that is, the greatest bisimulation on c, can be obtained as the greatest post-fixed point (final coalgebra) of the monotone operator R ⟼ (c × c)^{-1}(Rel(F)(R)).
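For a finite stream coalgebra (F(X) = A × X) this greatest post-fixed point can be computed by iterating the operator downwards from the full relation. The following is a minimal Python sketch under that assumption; the coalgebra is given by two hypothetical dictionaries `h` (head) and `t` (tail) over the same finite state set, and the function names are illustrative only.

```python
def is_stream_bisimulation(h, t, R):
    """Check R <= (c x c)^{-1} Rel(F)(R) for the stream coalgebra c = <h, t>."""
    return all(h[x] == h[y] and (t[x], t[y]) in R for (x, y) in R)


def stream_bisimilarity(h, t):
    """Greatest post-fixed point, obtained by shrinking the full relation."""
    R = {(x, y) for x in h for y in h}
    while True:
        # One application of the operator, restricted to the current relation.
        step = {(x, y) for (x, y) in R
                if h[x] == h[y] and (t[x], t[y]) in R}
        if step == R:
            return R
        R = step
```

Because the state set is finite and each round can only remove pairs, the loop terminates, and the relation it returns passes `is_stream_bisimulation`.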

Apartness
The opposite Rel^op of the category of relations contains relations as objects with reversed arrows. We are going to use a different category Rel^fop which is the "fibred opposite", where the order relations in the fibres are reversed. This is an instance of a more general construction [Jac99, Defn. 1.10.11].
Definition 5.1. The category Rel^fop has binary relations as objects. A morphism f : (R ⊆ X × X) → (S ⊆ Y × Y) in Rel^fop is a function f : X → Y with (f × f)^{-1}(S) ⊆ R. This means that (f(x1), f(x2)) ∈ S implies (x1, x2) ∈ R. There is an obvious forgetful functor Rel^fop → Set, given by (R ⊆ X × X) ⟼ X.
(1) We shall write ¬ : Rel → Rel^fop for the negation functor, where ¬R = {(x, x′) | (x, x′) ∉ R}. On morphisms we have ¬(f) = f.
(2) We write nEq := ¬ ∘ Eq : Set → Rel^fop for the inequality functor, sending a set X to nEq(X) = {(x, x′) | x ≠ x′}.
It is easy to see that (Rel^fop)^fop = Rel and that ¬ is a functor Rel → Rel^fop: if R ⊆ (f × f)^{-1}(S) then (f × f)^{-1}(¬S) = ¬(f × f)^{-1}(S) ⊆ ¬R. Moreover, inequality is a functor since for f : X → Y, f(x) ≠ f(x′) implies x ≠ x′. Similarly, ¬ can be described as a functor ¬ : Rel^fop → Rel. Clearly, the composite ¬ ∘ ¬ : Rel → Rel^fop → Rel is the identity functor, and similarly for ¬ ∘ ¬ : Rel^fop → Rel → Rel^fop.
We have the following analogue of Lemma 4.2 for opposite relation lifting.
(1) A relation R ⊆ Y × Y is an apartness relation for the functor F(X) = A × X when there is a coalgebra ⟨h, t⟩ : R → Rel^fop(F)(R) in Rel^fop. This amounts to: h(x1) ≠ h(x2) ⟹ R(x1, x2) and R(t(x1), t(x2)) ⟹ R(x1, x2). Apartness # is the least relation R for which these two implications hold. In such a situation one commonly writes these implications as rules:

\[
\frac{h(x_1) \neq h(x_2)}{x_1 \mathrel{\#} x_2}
\qquad
\frac{t(x_1) \mathrel{\#} t(x_2)}{x_1 \mathrel{\#} x_2}
\]

Alternatively, x1 # x2 iff h(t^n(x1)) ≠ h(t^n(x2)) for some n ∈ N.
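The characterisation of stream apartness by a differing head after n tail steps can be turned into a small search procedure. The sketch below assumes a finite stream coalgebra, encoded by hypothetical dictionaries `h` and `t`; finiteness is what allows the search to stop once a pair of states repeats. The function name `stream_apart` is illustrative.

```python
def stream_apart(h, t, x1, x2):
    """Return the least n with h(t^n(x1)) != h(t^n(x2)), or None if no such n exists.

    A returned n is a finite witness of apartness; None means the two
    states have the same observations forever.
    """
    seen = set()
    n = 0
    while (x1, x2) not in seen:
        if h[x1] != h[x2]:
            return n
        seen.add((x1, x2))
        x1, x2 = t[x1], t[x2]
        n += 1
    # The pair of states repeats, so no difference will ever show up.
    return None
```

The returned number corresponds to the height of a derivation of x1 # x2: n applications of the tail rule followed by one application of the head rule.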
Definition 5.4. Let c : X → F (X) be an arbitrary coalgebra.
(1) An apartness relation for c is a relation R on X for which c is a coalgebra c : R → Rel fop (F )(R). This means that: Thus, R is an apartness relation iff ¬R is a bisimulation relation.
The category CoAlg(Rel^fop(F)) thus has apartness relations as objects. In order to find the apartness relation on a coalgebra c : X → F(X) we need to find the greatest post-fixed point (final coalgebra) of the mapping R ⟼ (c × c)^{-1}(¬Rel(F)(¬R)) in the poset (P(X × X), ⊇) with opposite order. A crucial observation is that this is the least pre-fixed point (initial algebra) in P(X × X) with the usual inclusion order ⊆. Elements of such an initial algebra can typically be constructed in a finite number of steps. This corresponds to the idea that finding a difference in behaviour of a coalgebra can be done in finitely many steps, although you may not know how many steps, whereas showing indistinguishability of behaviour involves all steps.
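To illustrate how such a least pre-fixed point is reached in finitely many steps, here is a Python sketch for deterministic automata (F(X) = X^A × 2) on a finite state set. It assumes the apartness rules for automata take the usual form: two states are apart if exactly one of them is accepting, or if for some letter a their a-successors are apart. The function name and the dictionary encoding of the transition function are illustrative assumptions.

```python
def dfa_apartness(states, alphabet, delta, accepting):
    """Least relation closed under the two apartness rules for a DFA.

    delta maps (state, letter) to the successor state.
    """
    # Base rule: the output (accepting or not) differs.
    apart = {(q, p) for q in states for p in states
             if (q in accepting) != (p in accepting)}
    changed = True
    while changed:
        changed = False
        for q in states:
            for p in states:
                if (q, p) in apart:
                    continue
                # Step rule: some pair of a-successors is already apart.
                if any((delta[q, a], delta[p, a]) in apart for a in alphabet):
                    apart.add((q, p))
                    changed = True
    return apart
```

Every pair in the result was added after finitely many rounds, which mirrors a finite derivation of apartness; the pairs that are never added form the bisimilarity relation.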
Proposition 5.5. The inequality functor nEq : Set → Rel fop restricts to nEq : CoAlg(F ) → CoAlg(Rel fop (F )), for each functor F , and preserves final coalgebras. This implies that elements that are apart become non-equal (different) when mapped to the final coalgebra.
Proof. We assume a final F -coalgebra ζ : Z → F (Z). Let c : R → Rel fop (F )(R) describe an apartness relation R ⊆ X × X, for a coalgebra c : X → F (X). The latter has a unique

Conclusion and Further directions
In this paper we have explored the notion of "apartness" from a coalgebraic perspective, as the negation of bisimulation. We have shown what this means concretely in the simple cases of streams and deterministic automata and in the cases of weak and branching bisimulation for labelled transition systems. We have also given a general categorical treatment of apartness as the negation of bisimulation. An important contribution of this view is that it yields a logic for proving that two states are apart, that is, for proving that they are not bisimilar. This applies to the general situation for coalgebras of a polynomial functor, but also to the specific situation of weak and branching bisimulation.

It would be interesting to see if apartness can be applied in the analysis or description of algorithms for computing bisimulation, in particular branching bisimulation [JGKW20,GV90]. In existing algorithms, apartness clearly plays a role, as they are described in terms of a collection of "blocks" that is refined. States in different blocks are apart, so these algorithms refine an apartness until in the limit, the finest possible apartness is reached. Our approach is not intended to replace bisimulation with apartness. In fact, we view a combined approach as most promising, e.g. as we have in Lemma 3.29. In the period of submission and revision of the present paper, the concept of apartness has already been picked up and applied to develop a new approach to active automata learning [VGRW21], which underlines the potential fruitfulness of the concept.

Appendix A. Syntax and logic for bisimulation and apartness
As we have seen, the treatment of bisimulation and apartness in Section 2 can be generalized using category theory. We could also have chosen a purely logical-syntactic approach, which we briefly summarize here. As a matter of fact, the treatment in Sections 4 and 5 can be seen as a semantics of the syntax we introduce here.
We start with a description of the systems we want to deal with, which are basically a slight generalization of polynomial functors. A coalgebra has a carrier and a destructor operation. We describe the "types" of the possible range of a destructor.
Definition A.1. Let a set At of symbols for atomic types be given and let X be a special symbol (a type variable) not in At.
A destructor signature is a pair (d, σ) with d : X → σ.
A coalgebra for the destructor signature (d, σ) is a K ∈ Set and h : K → F (K) (the destructor), where F is the functor on Set defined from X → σ(X) in the obvious way.
We have seen examples for this definition in 2.2, 2.1, 4.3 and 5.3.
Definition A.2. Let K be a coalgebra for the signature of Definition A.1 with destructor h : K → F (K).
(1) A relation R ⊆ K × K is called a bisimulation (or a coinduction assumption) on K if the following holds for all x, y ∈ K.
R(x, y) ⇒ h(x) Here, the relation B ↔ R is defined by induction on the structure of B as follows. • B = D ∈ At, then x B ↔ R y := Eq D (x, y), equality on D.
• B = 1, then x B ↔ R y is true.
• B = 2 B 1 , then x B ↔ R y := (∀u(x(u) = 1 =⇒ ∃v(y(v) = 1 ∧ u (2) Two elements x, y ∈ K are called bisimilar, notation x ↔ y, if there exists a bisimulation R for which R(x, y) holds. Thus: The condition that R ⊆ K × K should satisfy in order to be a bisimulation can be viewed as a derivation rule. This gives an alternative way to say that R is a bisimulation.
Lemma A.4. The relation ↔ is itself a bisimulation relation, and therefore ↔ is the largest bisimulation relation.
Proof. This is a standard fact about bisimulations that is easily verified for this more general setting. The crucial property is that if x B ↔ R y for some bisimulation R, then x B ↔ ↔ y.
Similar to the definition of bisimulation (Definition A.2), we can give a general definition of apartness for a coalgebra of a destructor signature.
Definition A.5. Let again K be a coalgebra for the signature of Definition A.1 with destructor h : K → F (K).
(1) A relation Q ⊆ K × K is called an apartness on K if the following holds for all x, y ∈ K. h(x) Here, the relation B # Q is defined by induction on the structure of B as follows.
• B = D ∈ At, then x B # Q y := ¬Eq D (x, y), in-equality on D.
• B = 1, then x B # Q y is false.
• B = 2 B 1 , then x B # Q y := (∃u(x(u) = 1 ∧ ∀v(y(v) = 1 =⇒ u
(2) Two elements x, y ∈ K are called apart, notation x # y, if for all apartness relations Q we have Q(x, y). Thus:
Just as for bisimulation, we can phrase the property that a relation should satisfy in order to be an apartness in the form of a derivation rule.
Remark A.6. A relation Q ⊆ K × K is an apartness if the following derivation rule holds for Q.
h(x) Lemma A.7. The relation # is itself an apartness relation, and therefore # is the smallest apartness relation.
Proof. That the rule of Remark A.6 is sound for # is easily verified by observing that, if x B # # y, then x B # Q y for any apartness Q.
As x # y is the smallest apartness relation, the rule of Remark A.6 gives a complete derivation system for proving x # y for x, y ∈ K.
We can summarize the relations between apartness and bisimulation as follows. This is the general statement of Lemma 2.6. The proof is simply checking all the properties.