Lowerbounds for Bisimulation by Partition Refinement

We provide time lower bounds for sequential and parallel algorithms deciding bisimulation on labelled transition systems that use partition refinement. For sequential algorithms this is $\Omega((m+n)\log n)$ and for parallel algorithms this is $\Omega(n)$, where $n$ is the number of states and $m$ is the number of transitions. The lowerbounds are obtained by analysing families of deterministic transition systems, ultimately with two actions in the sequential case, and one action for parallel algorithms. For deterministic transition systems with one action, bisimilarity can be decided sequentially with fundamentally different techniques than partition refinement. In particular, Paige, Tarjan, and Bonic give a linear algorithm for this specific situation. We show, exploiting the concept of an oracle, that this approach does not help to develop a faster generic algorithm for deciding bisimilarity. A similar situation arises for parallel algorithms, where these techniques may be applied as well.


Introduction
Strong bisimulation [Par81, Mil80] is the gold standard for equivalence on labelled transition systems (LTSs). Deciding bisimulation equivalence among the states of an LTS is a crucial step for tool-supported analysis and model checking of LTSs. The well-known and widely used partition refinement algorithm of Paige and Tarjan [PT87] has a worst-case upperbound O(m log n) for establishing the bisimulation equivalence classes. Here, n is the number of states and m is the number of transitions in an LTS.
The algorithm of Paige and Tarjan seeks to find, starting from an initial partition, via refinement steps, the coarsest stable partition, which in fact is built from the bisimulation equivalence classes that are looked for. The algorithm achieves the O(m log n) complexity by restricting the amount of work for refining blocks and moving states. When refining, the splitting blocks are investigated using an intricate bookkeeping trick. Only the smaller parts of a block that are to be moved to a new block are split off, leaving the bulk of the original block in its place. These specific ideas go back to [Hop71] and make the difference with the earlier O(mn) algorithm of Kanellakis and Smolka [KS90]. The algorithms by Kanellakis-Smolka and Paige-Tarjan, with the format of successive refinements of an initial partition until a fixpoint is reached, have been leading for variations and generalisations for deciding specific forms of (strong) bisimilarity, see e.g. [Buc99, DPP04, GVV18, WDMS20, JGKW20].
We are interested in the question whether the Paige-Tarjan algorithm is computationally optimal. A lowerbound for a related problem is provided in [BBG17], which studies colour refinement of graphs. Colour refinement computes, given a graph and an initial colouring, a minimal consistent colouring such that every two equally coloured nodes have, for every colour, the same number of neighbours of that colour. More specifically, in that paper it is proven that for a family of graphs with n nodes and m edges, finding the canonical coarsest stable colouring is in Ω((m + n) log n). However, the costs for computations on graphs for colour refinement are charged differently than those for partition refinement for bisimulation on LTSs. The former takes edges between blocks of uniformly coloured nodes into account, the latter focuses on the size of newly created blocks of states. In [BBG17] it is described how the family of graphs underlying the lowerbound for colour refinement can be transformed into a family of Kripke structures for which computing bisimulation is Ω((m + n) log n) when counting the number of edges.
In this paper we follow a different approach to obtain a lowerbound. We define the concept of a partition refinement algorithm and articulate the complexity in terms of the number of states that are moved. In particular, we define the notion of a valid refinement sequence, which has its counterpart in iteration sequences for colour refinement. Then, we introduce a family of (deterministic) LTSs, called bisplitters, for which we show that computing bisimulation requires Ω(n log n) work. The family of n log n-hard LTSs that we use to establish the lowerbound involves an action set of log n actions. Building on this result and exploiting ideas borrowed from [PTB85] to extend the bisimulation classes for the states in the end structures, i.e. cycles, to the states of the complete LTS, we provide another family of (deterministic) LTSs that have two actions only. Then we argue that for the two-action case the complexity of deciding bisimulation is Ω((m + n) log n). We want to stress that the families involved consist of deterministic LTSs.
Recently, a linear time algorithm for bisimilarity was proposed for a PRAM (Parallel Random Access Machine) using max(n, m) processors [MGH+21]. This algorithm also employs partition refinement. This naturally raises the question whether the algorithm is optimal, or whether it can fundamentally be improved. We answer the question in the present paper by showing an Ω(n) lowerbound for parallel algorithms employing partition refinement, using a family of deterministic transition systems with one action label.
We obtain our lowerbound results assuming that algorithms use partition refinement. However, one may wonder if a different approach than partition refinement can lead to a faster decision procedure for bisimulation. For the specific case of deterministic LTSs with a singleton action set and state labelling, Robert Paige, Robert Tarjan and Robert Bonic propose a sequential algorithm [PTB85] that uses linear time. We refer to it as Roberts' algorithm. In [CRS08] it is proven that partition refinement à la Hopcroft has a lowerbound of Ω(n log n) in this case. Concretely, this means that in the one-letter case Roberts' algorithm achieves its essentially better performance by using a completely different technique than partition refinement to determine the bisimulation equivalence classes.
Crucial for Roberts' algorithm is the ability to identify, in linear time, the bisimilarity classes of cycles. In this paper we show that if the alphabet consists of at least two actions, a rapid decision on 'cycles' as in [PTB85] will not help to improve on the Paige-Tarjan algorithm for general LTSs. We argue that the speciality in the algorithm of [PTB85], viz. the ability to quickly decide the bisimilarity of the states on a cycle, can be captured by means of a stronger notion, namely an oracle that provides the bisimulation classes of the states of a so-called 'end structure', the counterpart in the multiple-action setting of a cycle in the single-action setting. The oracle can be consulted to refine the initial partition with respect to the bisimilarity on the end structures of the LTS for free. We show that for the class of sequential partition refinement algorithms enhanced with an oracle as described, thus encompassing the algorithm of [PTB85], the Ω((m + n) log n) lowerbound persists for action sets with at least two actions.
For parallel algorithms a similar situation occurs for deterministic Kripke structures: an O(log n) parallel algorithm exists [JR94] to determine the bisimulation equivalence classes. This algorithm also necessarily employs techniques that go beyond partition refinement. We believe that these techniques cannot be used either to fundamentally improve the complexity of determining bisimilarity on LTSs, but leave the proof as an open question.
The document is structured as follows. In Section 2 we give the necessary preliminaries on the problem. A recap of the linear algorithm of [PTB85] is provided in Section 3. Next, we introduce the family of LTSs B_k, for which we show in Section 4 that deciding bisimilarity is Ω(n log n) for the class of partition refinement algorithms and for which we establish in Section 5 an Ω(n log n) lowerbound for the class of partition refinement algorithms enhanced with an oracle for end structures. In Section 6 we introduce the family of deterministic LTSs C_k, each involving two actions only, to take the number of transitions m into account and establish an Ω((m + n) log n) lowerbound for partition refinement with and without an oracle for end structures. In Section 7 we provide the Ω(n) lowerbound for parallel refinement algorithms. In Section 8 we discuss the differences and similarities with the lowerbound results on colour refinement of [BBG17]. We wrap up with concluding remarks.
Note The present paper is an extension of the conference paper [GMV21] that appeared in the proceedings of CONCUR 2021.

Preliminaries
Given a set of states S, a partition of S is a set of subsets of states π ⊆ 2^S such that ∅ ∉ π, for all B, B′ ∈ π either B ∩ B′ = ∅ or B = B′, and ⋃_{B ∈ π} B = S. The elements of a partition are referred to as blocks. A partition π of S induces an equivalence relation =_π ⊆ S × S, where for two states s, t ∈ S, s =_π t iff the states s and t are in the same block, i.e. there is a block B ∈ π such that s, t ∈ B. A partition π′ of S is a refinement of a partition π of S iff for every block B′ ∈ π′ there is a block B ∈ π such that B′ ⊆ B. It follows that each block of π is the disjoint union of blocks of π′. The refinement is strict if π′ ≠ π. The common refinement of two partitions π and π′ is the partition with blocks { B ∩ B′ | B ∈ π, B′ ∈ π′, B ∩ B′ ≠ ∅ }.

Definition 2.1. A labelled transition system with initial partition (LTS) is a four-tuple L = (S, A, →, π_0) where S is a finite set of states, A is a finite alphabet of actions, → ⊆ S × A × S is a transition relation, and π_0 is a partition of the set of states S. A labelled transition system with initial partition is called deterministic if the transition relation is a total function S × A → S.
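As an aside (not part of the paper's development), the common refinement just defined can be computed directly from its description; a minimal Python sketch, with partitions represented as collections of frozensets:

```python
def common_refinement(pi1, pi2):
    """Common refinement of two partitions of the same set: its blocks are
    the non-empty intersections B ∩ B' with B in pi1 and B' in pi2."""
    return {B & C for B in pi1 for C in pi2 if B & C}

# two partitions of {1, 2, 3, 4}; their common refinement is all singletons
pi1 = [frozenset({1, 2}), frozenset({3, 4})]
pi2 = [frozenset({1, 3}), frozenset({2, 4})]
print(sorted(sorted(b) for b in common_refinement(pi1, pi2)))  # [[1], [2], [3], [4]]
```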
Given an LTS L = (S, A, →, π_0), states s, t ∈ S, and an action a ∈ A, we write s −a→ t instead of (s, a, t) ∈ →. For a deterministic LTS L, we may use L(s, a) to denote the unique state t of L such that s −a→ t. We say that s reaches t via a iff s −a→ t. A state s reaches a set U ⊆ S via action a iff there is a state in U that is reached by s via a, notation s −a→ U. A set of states V ⊆ S is called stable under a set of states U ⊆ S iff for all actions a, either all states in V reach U via a, or no state in V reaches U via a. Thus, a set of states V is not stable under U iff for two states s and t in V and an action a it holds that s −a→ U and not t −a→ U. A partition π is stable under a set of states U iff each block B ∈ π is stable under U. A partition π is called stable iff it is stable under all its blocks. So, for any two blocks B and C of π and any action a ∈ A, either each state s of B has an a-transition to C or no state s of B has an a-transition to C.

Following [Par81, Mil80], given an LTS L, a symmetric relation R ⊆ S × S is called a bisimulation relation iff for all (s, t) ∈ R and a ∈ A, we have that s −a→ s′ for some s′ ∈ S implies that t −a→ t′ for some t′ ∈ S such that (s′, t′) ∈ R. In the setting of the present paper, as we incorporate the initial partition in the definition of an LTS, bisimilarity is slightly non-standard. For a bisimulation relation R, we additionally require that it respects the initial partition π_0 of L, i.e. (s, t) ∈ R implies s =_{π_0} t. Two states s, t ∈ S are called (strongly) bisimilar for L iff a bisimulation relation R exists with (s, t) ∈ R, notation s ↔_L t.
Bisimilarity is an equivalence relation on the set of states of L. We write [s]_{↔_L} for the bisimulation equivalence class of the state s in L.
Note that for a deterministic LTS with a set of states S and initial partition π_0 = {S}, we have that π_0 itself already represents bisimilarity, contrary to LTSs in general.
Partition refinement algorithms for deciding bisimilarity on LTSs start with the initial partition π_0, which is subsequently repeatedly refined until a stable partition is reached. Thus, unstable blocks are replaced by several smaller blocks. The stable partition that is reached happens to be the coarsest stable partition of the LTS refining π_0 and coincides with bisimilarity [KS90, PT87].
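The refinement loop just described can be sketched in a few lines of Python. This is a naive version for illustration only (it does not use the smaller-half bookkeeping of Paige and Tarjan); the LTS is assumed to be given as a dict `trans` mapping (state, action) pairs to sets of successor states:

```python
def refine(trans, pi0):
    """Naive partition refinement: repeatedly split blocks on their
    successor signatures until the partition is stable, starting from pi0."""
    actions = {a for (_, a) in trans}
    pi = [frozenset(B) for B in pi0]
    while True:
        block_of = {s: i for i, B in enumerate(pi) for s in B}
        new_pi = []
        for B in pi:
            groups = {}
            for s in B:
                # signature of s: which current blocks s reaches, per action
                sig = frozenset((a, block_of[t]) for a in actions
                                for t in trans.get((s, a), ()))
                groups.setdefault(sig, set()).add(s)
            new_pi.extend(frozenset(g) for g in groups.values())
        if len(new_pi) == len(pi):   # no block was split: partition is stable
            return new_pi
        pi = new_pi

# states 1 and 2 both step to 3 and stay together; 3 is split off
print(refine({(1, 'a'): {3}, (2, 'a'): {3}}, [{1, 2, 3}]))
```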
Below we define so-called valid refinement sequences. An algorithm is called a partition refinement algorithm iff every run of the algorithm is reflected by a valid refinement sequence (π_0, ..., π_n). All the lowerbounds that we provide apply to algorithms producing valid partition sequences, which virtually all known bisimulation algorithms do, and as such this is the core definition in this paper.
A partition sequence (π_0, ..., π_n) is valid when the direct successor π_i of a partition π_{i−1} in the sequence is obtained by splitting one or more unstable blocks in π_{i−1} using only information available in π_{i−1}. Furthermore, the last partition in the sequence, the partition π_n, is stable. If a block B of π_{i−1} is replaced in π_i because it is not stable under a block B′ of π_{i−1}, then B′ is referred to as a splitter block.

Definition 2.2. Let L = (S, A, →, π_0) be an LTS, and π a partition of S. A refinement π′ of π is called a valid refinement with respect to L iff the following criteria hold.
(a) π′ is a strict refinement of π. (b) If s =_π t for s, t ∈ S, then (i) s =_{π′} t, or (ii) a state s′ ∈ S exists such that s −a→ s′ for some a ∈ A and, for all t′ ∈ S such that t −a→ t′, it holds that s′ ≠_π t′, or the other way around with t replacing s.
A sequence of partitions Π = (π_0, ..., π_n) is called a valid partition sequence iff every successive partition π_i, for 0 < i ≤ n, is a valid refinement of π_{i−1}, and, moreover, the partition π_n is stable.

When a partition π is refined into a partition π′, states that are in the same block but can reach different blocks can lead to a split of the block into smaller subsets, say k subsets. This means that a block B ∈ π is split into blocks B_1, ..., B_k ∈ π′. The least amount of work is done for this operation if we create new blocks for the least number of states. That means, if B ∈ π is split into B_1, ..., B_k ∈ π′ and B_1 is the biggest block, then the states of B_2, ..., B_k are moved to new blocks and the states of B_1 remain in the current block that was holding B. Therefore, we define the refinement costs rc for the refinement π′ of π by

rc(π, π′) = Σ_{B ∈ π} ( |B| − max{ |B′| : B′ ∈ π′, B′ ⊆ B } ).

For a sequence of refinements Π = (π_0, ..., π_n) we write rc(Π) for Σ_{i=1}^{n} rc(π_{i−1}, π_i). For an LTS L, we have rc(L) = min{ rc(Π) | Π a valid refinement sequence for L }.
Note that this complexity measure is different from the one used in [BBG17], which counts transitions. Our complexity measure rc is bounded from above by the former.
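For concreteness, the cost measure rc can be computed directly from two consecutive partitions; a small Python sketch (ours, not from the paper), with partitions as lists of frozensets:

```python
def rc(pi, pi_next):
    """Refinement cost of refining pi into pi_next: for every block B of pi,
    the number of states moved out of B, i.e. |B| minus the size of the
    largest sub-block that B is split into."""
    cost = 0
    for B in pi:
        sub = [C for C in pi_next if C <= B]   # the blocks B is split into
        cost += len(B) - max(len(C) for C in sub)
    return cost

# splitting {1,2,3,4} into {1,2,3} and {4} moves only the single state 4
print(rc([frozenset({1, 2, 3, 4})],
         [frozenset({1, 2, 3}), frozenset({4})]))  # -> 1
```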
In various examples below we characterise the states of LTSs by sequences of bits. The set of bits is denoted as B = {0, 1}. Bit sequences of length k are written as B^k, and bit sequences of length up to and including k as B^{≤k}. The complement of a bit b is denoted by b̄. Thus 0̄ = 1 and 1̄ = 0. For two bit sequences σ, σ′, we write σ ⪯ σ′ to indicate that σ is a prefix of σ′ and write σ ≺ σ′ iff σ is a strict prefix of σ′. For a bit sequence σ ∈ B^k, for any i, j ≤ k, we write σ[i] to indicate the bit at position i, starting from position 1. We write σ[i..j] to indicate the subword from position i to position j. Occasionally we use, for a bit sequence σ, the notation σB^k to denote { σσ′ | σ′ ∈ B^k }, the set of all bit sequences of length |σ| + k having σ as prefix.

Roberts' algorithm
Most algorithms to determine bisimulation for an LTS use partition refinement. However, there are a few notable exceptions to this. For the class of deterministic LTSs that have a singleton action alphabet, deciding the coarsest stable partition, i.e. bisimilarity, requires linear time only; a linear algorithm is due to Robert Paige, Robert Tarjan, and Robert Bonic [PTB85], which we therefore aptly refer to as Roberts' algorithm.
The algorithm of [PTB85] exploits the specific structure of a deterministic LTS with one action label. An example of such a transition system is depicted in Figure 1, where the action label itself has been suppressed and the initial partition is indicated by single/double circled states. In general, a deterministic LTS with one action label can be characterised as a directed graph, possibly with self-loops, consisting of a number of cycles of one or more states together with root-directed trees with their roots on a cycle. Below we refer to a cycle with the trees connected to it as an end structure. In a deterministic LTS with one action label, each state belongs to a unique end structure; it is on a cycle or has a unique directed path leading to a cycle.
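Locating the cycles of such a functional graph (every state has exactly one successor) is standard; a possible Python sketch (our own illustration, not the data structures of [PTB85]), with the LTS given as a dict `f` from each state to its unique successor:

```python
def cycle_states(f):
    """Return the set of states that lie on a cycle of the functional
    graph f (each state has exactly one outgoing transition)."""
    on_cycle = set()
    status = {}                      # 0 = on current path, 1 = finished
    for s in f:
        path = []
        while s not in status:       # walk forward until a known state
            status[s] = 0
            path.append(s)
            s = f[s]
        if status[s] == 0:           # met the current path: a new cycle
            on_cycle.update(path[path.index(s):])
        for t in path:
            status[t] = 1
    return on_cycle

# 2 -> 3 -> 2 is a cycle; 1 and 4 sit on a tree leading into it
print(sorted(cycle_states({1: 2, 2: 3, 3: 2, 4: 1})))  # -> [2, 3]
```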
In brief, Roberts' algorithm for deterministic LTSs with one action label can be described as follows (see [PTB85] for more details).
(1) As a preparatory step, find all the end structures of the LTS, i.e. detect all cycles, and all root-oriented trees leading to cycles.

(2) Observe that each state s on a cycle encodes a sequence of blocks, viz. the sequence starting from the block the state is in, and the blocks encountered when following the transitions, up to the state on the cycle that leads back to s. This sequence of blocks forms a word w over the alphabet of the initial partition, where each block of the initial partition is a symbol of this alphabet. The word w can be uniquely written as v^k with v of minimal length and k > 0. The string v is referred to as the repeating prefix of the state s.
We consider the repeating prefixes of all states on the cycle and identify the lexicographically least repeating prefix v. This can be done in linear time in the size of the cycle using a string matching algorithm due to Knuth, Morris, and Pratt [KMP77]. The lexicographically least repeating prefix v and the minimal number of transitions that is required to reach a state t from a state that has v as repeating prefix determine the bisimulation equivalence class of the state t. We encode this bisimulation equivalence class by the corresponding rotation of the prefix v. This way the bisimulation class is established for all states on all cycles. By comparing least repeating prefixes, bisimilarity across cycles can be detected.
(3) By a backward calculation along the path leading from a state up in a tree down to its root on a cycle, the bisimilarity equivalence classes for the remaining states can subsequently be determined in linear time as well. The root of a tree is a state on the cycle and therefore has been assigned a string, hence a bisimulation class. We assign to a child the string of the parent prepended with the symbol of the initial class of the child.
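Step (2) above hinges on the repeating prefix and its canonical rotation. A Python sketch of both (our own illustration; the minimum-rotation step here is naive and quadratic, whereas [PTB85] stays linear):

```python
def repeating_prefix(w):
    """Least repeating prefix v of a non-empty word w, so that w = v^k with
    v as short as possible, via the KMP failure function: the shortest
    period of w is len(w) - fail[-1] whenever that period divides len(w)."""
    fail, j = [0] * len(w), 0
    for i in range(1, len(w)):
        while j > 0 and w[i] != w[j]:
            j = fail[j - 1]
        if w[i] == w[j]:
            j += 1
        fail[i] = j
    p = len(w) - fail[-1]
    return w[:p] if len(w) % p == 0 else w

def canonical(w):
    """Canonical form of the cyclic word read along a cycle: the
    lexicographically least rotation of the least repeating prefix."""
    v = repeating_prefix(w)
    return min(v[i:] + v[:i] for i in range(len(v)))

# two cycles reading 'abab' and 'baba' induce the same canonical form
print(canonical('abab'), canonical('baba'))  # -> ab ab
```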
Example. The deterministic LTS of Figure 1 has a single end structure, viz. the cycle formed by the states c_1 to c_6, and five trees: tree T_1 with leaf s_14 and rooted in c_1, tree T_2 with leaves s_22 and s_23 rooted in c_2, tree T_3 with leaves s_31 and s_32 rooted in c_3, and the trees T_4 and T_5.

Roberts' algorithm solves the so-called single function coarsest partition problem in O(n) for a set of n elements. A striking result is that any algorithm that is based on partition refinement requires Ω(n log n), as witnessed in [BC04, CRS08], where it is shown that the partition refinement algorithm of Hopcroft [Hop71] cannot do better than Ω(n log n). Thus, Roberts' algorithm must use other techniques than partition refinement. Below we come back to this observation, showing that it is not possible to use the ideas in Roberts' algorithm to come up with a linear algorithm for computing bisimilarity for a class of LTSs that either includes nondeterministic LTSs, or allows LTSs to involve more than one action label.

B_k is Ω(n log n) for partition refinement
In this section we introduce a family of deterministic LTSs called bisplitters B_k, for k ≥ 1, on which the cost of any partition refinement algorithm is Ω(n log n), where n is the number of states. Building on the family of B_k's, we propose in Section 6 a family of LTSs C_k for which the cost of partition refinement is Ω((n + m) log n), where m is the number of transitions.
Definition 4.1. For k ≥ 1, the bisplitter B_k is defined as the LTS B_k = (B^k, A_k, →_k, π_0^k) with action alphabet A_k = {a_1, ..., a_{k−1}}, where the set of states B^k is the set of all bit strings of length k, the transitions are given, for σ ∈ B^k and 1 ≤ i < k, by

σ −a_i→ σ if σ[i+1] = 0, and σ −a_i→ σ[1..i−1] σ̄[i] 0^{k−i} if σ[i+1] = 1,

and the initial partition is π_0^k = {0B^{k−1}, 1B^{k−1}}.

We see that the bisplitter B_k has 2^k states, viz. all bit strings of length k, and k−1 different actions. The LTS B_k is deterministic. Each state has exactly one outgoing transition for each action a_i, 1 ≤ i < k. Thus, B_k has (k−1)2^k transitions: (i) a self-loop for bitstring σ with label a_i if the i+1-th bit σ[i+1] of σ equals 0; (ii) otherwise, i.e. when bit σ[i+1] equals 1, the bitstring σ has a transition for label a_i to the bitstring that equals the first i−1 bits of σ, flips the i-th bit of σ, and has k−i many 0's following. The initial partition π_0^k distinguishes the bit strings starting with 0 from those starting with 1. Drawings of the first three bisplitters B_1 to B_3 are given in Figure 2. We see in the picture of B_3, for example, for the bitstring σ = 101, an a_1-transition to itself, as σ[2] = 0.

Also note that B_3 contains two copies of B_2. In the copies, the action label a_1 of B_2 maps to the action label a_2 in B_3, and each state associated with a bitstring σ ∈ B^2 produces two copies in B_3; one copy is obtained by the mapping σ → 0σ and the other copy is obtained by the mapping σ → 1σ. In general, bisplitter B_k is twice embedded in bisplitter B_{k+1} via the mappings σ → 0σ and σ → 1σ from B^k to B^{k+1} for the states, using the mapping a_i → a_{i+1} from A_k to A_{k+1} for the action labels. Note that initial partitions are not respected.

Definition 4.2. For any string σ ∈ B^{≤k}, we define the prefix block B_σ of B_k to be the block B_σ = σB^{k−|σ|} = { σθ | θ ∈ B^{k−|σ|} }.

The following lemma collects a number of results related to prefix blocks that we need in our complexity analysis for computing bisimilarity for the bisplitters.
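The transition function of Definition 4.1 is easily rendered executable; a Python sketch (ours), with bit strings as Python strings and the 1-indexed conventions of the text:

```python
def bisplitter_step(sigma, i):
    """a_i-transition of the bisplitter B_k from state sigma (1 <= i < k):
    a self-loop if bit i+1 of sigma is 0; otherwise keep the first i-1
    bits, flip bit i, and append k-i zeros."""
    k = len(sigma)
    assert 1 <= i < k
    if sigma[i] == '0':              # sigma[i] is bit i+1 in 1-indexed terms
        return sigma
    flipped = '0' if sigma[i - 1] == '1' else '1'
    return sigma[:i - 1] + flipped + '0' * (k - i)

# the example from the text: 101 has an a_1 self-loop since its bit 2 is 0
print(bisplitter_step('101', 1), bisplitter_step('101', 2))  # -> 101 110
```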
Proof. (a) Initially, for π_0^k = {B_0, B_1}, both its blocks are prefix blocks by definition. We prove that, if a partition π_i, for 0 ≤ i < n, consists of prefix blocks only, then all blocks in π_{i+1} are prefix blocks as well.
Assume, to arrive at a contradiction, that there is a block B ∈ π_{i+1} that is not a prefix block. Because π_{i+1} is a refinement of π_i, we have B ⊆ B_σ for some prefix block B_σ ∈ π_i. This means that σ is a common prefix of all elements of B. We can choose θ such that σθ is the longest common prefix of all elements of B. Since every singleton of B_k is a prefix block, B is not a singleton. This implies that |σθ| < k and that there are elements σ_1 and σ_2 of B such that σθ0 is a prefix of σ_1 and σθ1 is a prefix of σ_2. Because B is not a prefix block by assumption, there must exist a string τ ∈ B^k with prefix σθ such that τ ∉ B. Obviously, we have either (i) σθ0 is a prefix of τ, or (ii) σθ1 is a prefix of τ. We will show that in both these cases τ in fact belongs to B, thus arriving at a contradiction.
(i) Suppose σθ0 is a prefix of τ. We argue that τ and σ_1 belong to the same block in π_{i+1} since, for each a_j, 1 ≤ j < k, the target states σ_1′ and τ′ of the transitions σ_1 −a_j→ σ_1′ and τ −a_j→ τ′ belong to the same block of π_i. There are three cases:

• j < |σθ|: The a_j-target of a bit string is determined by its first j+1 bits. Since σ_1 and τ share the prefix σθ0, they agree on their first j+1 bits, and we have σ_1′ = τ′. So, they clearly belong to the same block of π_i.

• j = |σθ|: Since σ_1[j+1] = τ[j+1] = 0, we have σ_1′ = σ_1 and τ′ = τ, and hence both σ_1′ and τ′ belong to B_σ.

• j > |σθ|: In this case, for a string of the form σθρ, an a_j-transition leads to a string of the form σθρ′. In particular this means that if j > |σθ| and σ_1 −a_j→ σ_1′ and τ −a_j→ τ′, then σθ is a prefix of both σ_1′ and τ′, and σ_1′ and τ′ belong to B_σ in π_i.

(ii) Now, suppose σθ1 is a prefix of τ. We argue that τ and σ_2 belong to the same block in π_{i+1} because, for each a_j (where 1 ≤ j < k), the transitions σ_2 −a_j→ σ_2′ and τ −a_j→ τ′ lead to the same block of π_i. Also here there are three cases:

• j < |σθ|: Similar as for (i).

• j = |σθ|: Since σ_2[j+1] = τ[j+1] = 1, both a_j-targets consist of the first j−1 bits shared by σ_2 and τ, followed by the flipped j-th bit and 0's, so σ_2′ = τ′, and they are clearly in the same block of π_i.

• j > |σθ|: Similar as for (i).

Thus, both in case (i) and in case (ii) we see that we must have τ ∈ B, contradicting the choice of τ.
Thus B_σ is not stable, and hence neither is π_i.
(c) We show that for a prefix block B_σ ∈ π_i, a bit b ∈ B and all θ, θ′ ∈ B^{k−(|σ|+1)}, the states σ_1 = σbθ and σ_2 = σbθ′ are not split by any action a_j, for 1 ≤ j < k, and thus are in the same block of π_{i+1}. Pick j, 1 ≤ j < k. If j ≤ |σ| then the a_j-targets of σ_1 and σ_2 are determined by their shared first j+1 bits: either both transitions are self-loops, with the targets σ_1 and σ_2 in the same block B_σ of π_i, or the targets coincide; so σ_1 and σ_2 don't split for a_j. If j > |σ| then both a_j-targets σ_1′ and σ_2′ belong to B_σ, and σ_1 and σ_2 don't split for a_j either.
With the help of the above lemma, clarifying the form of the partitions in a valid refinement sequence for the bisplitter family, we are able to obtain a lowerbound for any algorithm exploiting partition refinement to compute bisimilarity.
Theorem 4.4. For every valid partition refinement sequence Π for the bisplitter B_k, we have rc(Π) ∈ Ω(n log n), where n is the number of states of B_k.

Proof. By Lemma 4.3, every partition in a valid refinement sequence for B_k consists of prefix blocks, and a refinement step can split a prefix block B_σ only into the two prefix blocks B_σ0 and B_σ1, at the cost of half the size of B_σ. Since the final, stable partition consists of singletons, the splits run through all k−1 levels of prefix blocks below π_0^k, and each level moves 2^{k−1} states in total. With n the number of states of B_k, we have that n = 2^k, thus k−1 = log(n/2). Hence, rc(Π) ≥ (n/2) log(n/2), which is in Ω(n log n). Thus, for every valid partition refinement sequence Π for B_k we have rc(Π) ∈ Ω(n log n).
In particular this bound applies to the valid refinement sequence of minimal cost, and hence we conclude rc(B_k) ∈ Ω(n log n).

B_k is Ω(n log n) for partition refinement with an oracle
In the previous section we have shown that computing bisimilarity with partition refinement for the family of bisplitters is Ω(n log n). The bisplitters are deterministic LTSs but have growing action sets. For the corner case of deterministic transition systems with a singleton action set, Roberts' algorithm discussed in Section 3 establishes bisimilarity in O(n). Linearity was obtained by the trick of calculating the (lexicographically) least repeating prefix on the cycles in the transition system.
One may wonder whether an approach different from partition refinement can provide linear performance for establishing bisimulation equivalence classes of transition systems with non-degenerate action sets. In order to capture the approach of [PTB85], we augment the class of partition refinement algorithms with an oracle. At the start of the algorithm the oracle can be consulted to identify the bisimulation classes for designated states, viz. for those that are in a so-called end structure, the counterpart of the cycles in Roberts' algorithm. This results in a refinement of the initial partition; partition refinement then starts from the updated partition.
Thus, we can ask the oracle to provide the bisimulation classes of all states in an end structure of the input LTS, also including bisimilar states of the LTS not in an end structure. This yields a new partition, viz. the common refinement of the initial partition, on the one hand, and the partition induced by the bisimulation equivalence classes as given by the oracle and the complement of their union, on the other hand. Hence, the work that remains to be done is establishing the bisimulation equivalence classes, with respect to the initial partition, for the states not bisimilar to any state in an end structure.
We will establish that a partition refinement algorithm that can consult an oracle cannot improve upon the complexity of computing bisimulation by partition refinement. We first formally define the notion of an end structure of an LTS as well as the associated notion of an end structure partition.
Like the cycles exploited in Roberts' algorithm, an LTS can have multiple end structures. The end structure partition π_es consists of all the bisimilarity equivalence classes of L that include at least one state of an end structure, completed with blocks holding the remaining states, if non-empty. So, for every state s of an end structure, the end structure partition has identified all states that are bisimilar to state s and separates s and its bisimilar states from the rest of the LTS. The other states are assigned in the end structure partition to blocks just as the initial partition does.
Example. In the LTS of Figure 1 the cycle of c_1 to c_6 is the only end structure. All states have a path to the cycle, hence every non-empty set that is closed under transitions will contain the cycle. Had the LTS contained any isolated states, these would have been end structures by themselves. Thus, consultation of the oracle leads to a refinement of the initial partition π_0.

Next we enhance the notion of a partition refinement algorithm. Now, an oracle can be consulted for the states in the end structures. In this approach, the initial partition is replaced by a partition in which all bisimilarity equivalence classes of states in end structures are split off from the original blocks.
Definition 5.3. A partition refinement algorithm with end structure oracle yields for an LTS L = (S, A, →, π_0) a valid refinement sequence Π = (π_0′, π_1, ..., π_n) where π_0′ is the end structure partition of L. The partition π_0′ is called the updated initial partition of L.
As Roberts' algorithm witnesses, in the case of a singleton action set the availability of an end structure oracle results in an algorithm with linear asymptotic performance. In the remainder of this section we confirm that in the case of more action labels the end structure oracle does not help. The next lemma states that the amount of work required for the bisplitter B_k by a partition refinement algorithm enhanced with an oracle dealing with end structures is at least the amount of work needed by a partition refinement algorithm without oracle for the bisplitter B_{k−2}.
Proof. Observe that there are only two end structures in B_k, viz. the singletons of the two states 0^k and 10^{k−1}. Since all other states can reach 0^k or 10^{k−1}, these states are not in an end structure: Choose σ ∈ B^k, σ ≠ 0^k, 10^{k−1}. Then σ is of the form b0^j1θ for some b ∈ B, j ≥ 0 and θ ∈ B^*. For j = 0 we have σ −a_1→ b̄0^{k−1}, which is either 0^k or 10^{k−1}; for j > 0 we have σ −a_{j+1}→ b0^{j−1}10^{k−(j+1)}, while b0^{j−1}10^{k−(j+1)} reaches 0^k or 10^{k−1} by induction. By Lemma 4.3, every state σ ∈ B^k of B_k has its own bisimulation equivalence class {σ}. It follows that the updated initial partition π_0′ consists of the blocks {0^k}, B_0 \ {0^k}, {10^{k−1}}, and B_1 \ {10^{k−1}}.

Let Π = (π_0′, π_1, ..., π_n) be a valid refinement sequence for B_k with the updated initial partition. To construct Π′ from Π, we use the partial projection function p : B^k ⇀ B^{k−2} that removes the prefix 11 from a bitstring and is undefined if 11 is not a prefix. That means p(11σ) = σ for all σ ∈ B^{k−2} and p(σ′) is undefined for σ′ ∉ 11B^{k−2}. A partition π of B^k is projected to a partition of B^{k−2} by projecting all the blocks of π and ignoring empty results, thus p(π) = { p(B) | B ∈ π, p(B) ≠ ∅ } with p(B) = { p(σ) | σ ∈ B ∩ 11B^{k−2} }.

First, note that p(π_0′) = {B^{k−2}}, i.e. the unit partition of B^{k−2} consisting of the prefix block B_ε only. Second, we remove repeated partitions from the sequence (p(π_0′), p(π_1), ..., p(π_n)) to obtain a subsequence Π′, say Π′ = (ρ_0, ρ_1, ..., ρ_ℓ). Thus, for some order-preserving surjection q : {1, ..., n} → {1, ..., ℓ} it holds that p(π_i) = p(π_{i′}) iff q(i) = q(i′), and ρ_j = p(π_i) if q(i) = j, for 1 ≤ i ≤ n, 1 ≤ j ≤ ℓ.

We have ρ_1 = {0B^{k−3}, 1B^{k−3}}, containing the prefix blocks B_0 and B_1 of B_{k−2}: Suppose to the contrary that bθ, bθ′ ∈ B^{k−2}, for a bit b ∈ B and strings θ, θ′ ∈ B^{k−3}, are two different states which are not in the same block of ρ_1. Let i, 0 ≤ i < n, be such that p(π_i) = ρ_0 and p(π_{i+1}) = ρ_1. Then 11bθ and 11bθ′ have been separated when refining π_i into π_{i+1}. But no action a_j witnesses such a split: (i) B_k(11bθ, a_1) = B_k(11bθ′, a_1), as both equal 0^k; (ii) the a_2-targets of 11bθ and 11bθ′ are both 10^{k−2} if b = 1, and are the states 11bθ and 11bθ′ themselves if b = 0, which lie in the same block of π_i; (iii) for j > 2, the a_j-targets of 11bθ and 11bθ′ both have prefix 11, and all states with prefix 11 lie in a single block of π_i since p(π_i) = ρ_0 is the unit partition. Since ρ_1 ≠ ρ_0, ρ_1 has at least two blocks. Hence, these must be 0B^{k−3} and 1B^{k−3}. Thus ρ_1 = {0B^{k−3}, 1B^{k−3}} as claimed.
Next we prove that every refinement of ρ_i into ρ_{i+1} of Π′, for 1 ≤ i < ℓ, is valid for B_{k−2}. We first observe that, for all σ, σ′ ∈ B^{k−2} and a_j ∈ A_{k−2}, it holds that B_{k−2}(σ, a_j) = σ′ iff B_k(11σ, a_{j+2}) = 11σ′. This is a direct consequence of the definition of the transition functions of B_{k−2} and B_k. From this we obtain

σ =_{ρ_i} σ′ iff 11σ =_{π_h} 11σ′ (5.1)

provided ρ_i = p(π_h), for 0 ≤ i ≤ ℓ and a suitable choice of h, via the definition of the projection function p. Now, consider the subsequent partitions ρ_i and ρ_{i+1} in Π′, 1 ≤ i < ℓ, and let h, 0 ≤ h < n, be such that ρ_i = p(π_h) and ρ_{i+1} = p(π_{h+1}). Clearly, ρ_{i+1} is a refinement of ρ_i. The validity of the refinement of ρ_i into ρ_{i+1} is justified by the validity of the refinement of π_h into π_{h+1}. If σ =_{ρ_i} σ′ and σ ≠_{ρ_{i+1}} σ′ for σ, σ′ ∈ B^{k−2}, then σ, σ′ ∈ 0B^{k−3} or σ, σ′ ∈ 1B^{k−3}, since ρ_i is a refinement of ρ_1. Moreover, 11σ =_{π_h} 11σ′ and 11σ ≠_{π_{h+1}} 11σ′ by (5.1). Hence, by validity, B_k(11σ, a_j) ≠_{π_h} B_k(11σ′, a_j) for some a_j ∈ A_k. Clearly j ≠ 1 and j ≠ 2, since the a_1-targets of 11σ and 11σ′ are both 0^k, and their a_2-targets are either both 10^{k−2} or the states 11σ and 11σ′ themselves, which are in the same block of π_h. So j > 2, and by (5.1) and the observation above, B_{k−2}(σ, a_{j−2}) ≠_{ρ_i} B_{k−2}(σ′, a_{j−2}), showing the refinement of ρ_i into ρ_{i+1} to be valid.

Finally, since every block in π_n is a singleton, this is also the case for ρ_ℓ. Thus, ρ_ℓ is indeed the coarsest stable partition for B_{k−2}, as required for Π′ to be a valid refinement sequence for B_{k−2}. Every refinement of ρ_i into ρ_{i+1} of Π′ is projected from a refinement of some π_h into π_{h+1} of Π as argued above. Therefore, since p(π_h) = ρ_i and p(π_{h+1}) = ρ_{i+1}, we have rc(π_h, π_{h+1}) ≥ rc(ρ_i, ρ_{i+1}), and hence rc(Π) ≥ rc(Π′). Since, by definition, rc(B_{k−2}) is the minimum over all valid refinement sequences for B_{k−2}, it holds that rc(Π′) ≥ rc(B_{k−2}). Therefore, rc(Π) ≥ rc(Π′) ≥ rc(B_{k−2}), as was to be shown.
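The projection p used in the proof above can be made concrete in a few lines; a Python sketch (ours), with bit strings as Python strings and None standing for 'undefined':

```python
def p(sigma):
    """Partial projection from B^k to B^(k-2): strip a leading '11';
    undefined (None) on strings not starting with '11'."""
    return sigma[2:] if sigma.startswith('11') else None

def project_partition(pi):
    """Project a partition of B^k to one of B^(k-2) by projecting every
    block and ignoring empty results."""
    blocks = set()
    for B in pi:
        pB = frozenset(p(s) for s in B if p(s) is not None)
        if pB:
            blocks.add(pB)
    return blocks

# only the 11-prefixed states survive the projection
print(project_partition([{'1101', '1110'}, {'0000', '0101'}]))
```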
Next we combine the above lemma with the lowerbound provided by Theorem 4.4 in order to prove the main result of this section.
Theorem 5.5. Any partition refinement algorithm with an end structure oracle to decide bisimilarity for a deterministic LTS is Ω(n log n).
Proof. Let B_k be the updated bisplitter with the initial partition π'_0 containing {0^k}, B_0 \ {0^k}, {10^{k−1}}, and B_1 \ {10^{k−1}}, as given by the oracle for end structures, rather than the partition π_0 containing B_0 and B_1. By Lemma 5.4 we have, for k > 2, that rc(B_k) ≥ rc(B_{k−2}). By Theorem 4.4 we know that rc(B_{k−2}) ∈ Ω(n log n), where n = 2^k is the number of states of B_k, since 2^{k−2} = n/4. We conclude that deciding bisimilarity for B_k with the help of an oracle for the end structures is Ω(n log n).
We modify the bisplitter B_k, which has an action alphabet of k−1 actions, to obtain a deterministic LTS with two actions only. The resulting LTS C_k has the action alphabet {a, b}, for each k > 1, and is referred to as the k-th layered bisplitter. We use C_k to obtain an Ω((n + m) log n) lowerbound for deciding bisimilarity for LTSs with only two actions, where n is the number of states and m is the number of transitions.
In order to establish the lowerbound we adapt the construction of B_k at two places. We introduce for each σ ∈ B^k a stake of 2^k states. Moreover, to each stake we add a tree gadget. These gadgets have height ⌈log((k−1)/2)⌉ to accommodate at least ⌈(k−1)/2⌉ leaves, in order to encode the action alphabet A_k of B_k with k−1 actions.

Definition 6.1. Let k > 1, B_k be the k-th bisplitter, and A = {a, b} a two-element action set. The deterministic LTS C_k = (S_{C_k}, A, →_C, π_0^C) over the action set A (a) has the set of states S_{C_k} consisting of the stake states [σ, ℓ], for σ ∈ B^k and 1 ≤ ℓ ≤ 2^k, together with the tree states ⟨σ, w⟩, for σ ∈ B^k and w ∈ A^* with |w| ≤ ⌈log((k−1)/2)⌉, and (b) has the transition function →_C as described below. The auxiliary labelling function lbl : A^{⌈log(k−1)⌉} → N, used in item (b), is defined by lbl(w) = min{bin(w)+1, k−1}. Here bin : A^* → N is the binary evaluation function defined by bin(ε) = 0, bin(wa) = 2·bin(w), and bin(wb) = 2·bin(w)+1.
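To make the encoding tangible, here is a small Python transcription (ours, under the assumption that a counts as bit 0 and b as bit 1, as in the definition of bin) of the functions bin and lbl:

```python
# Transcription of the auxiliary functions of Definition 6.1:
# bin evaluates a word over {a, b} as a binary number (a = 0, b = 1), and
# lbl(w) = min(bin(w) + 1, k - 1) selects the simulated action index.

def bin_val(w):
    """bin(eps) = 0, bin(wa) = 2*bin(w), bin(wb) = 2*bin(w) + 1."""
    v = 0
    for c in w:
        v = 2 * v + (1 if c == "b" else 0)
    return v

def lbl(w, k):
    """Action index in {1, ..., k-1} encoded by the word w."""
    return min(bin_val(w) + 1, k - 1)

# For C_6 (k = 6, actions a_1..a_5) the words of length 3 encode:
k = 6
print([lbl(w1 + w2 + w3, k)
       for w1 in "ab" for w2 in "ab" for w3 in "ab"])
# -> [1, 2, 3, 4, 5, 5, 5, 5]: the words aaa..abb give a_1..a_4,
#    while baa, bab, bba, bbb all simulate a_5, matching Figure 4.
```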
We see that with each string σ ∈ B^k we associate in C_k as many as 2^k stake states [σ, 1], ..., [σ, 2^k], one for each level ℓ, 1 ≤ ℓ ≤ 2^k. The stake states are traversed from the top [σ, 1] to the bottom [σ, 2^k], going down one level on each action from A. The tree gadget, with states ⟨σ, w⟩ for bit sequences σ and strings w over A, consists of a complete binary tree of height ⌈log((k−1)/2)⌉ that hence has at least ⌈(k−1)/2⌉ leaves. Traversal down the tree takes a left child on action a from A, a right child on action b from A. Together with the two actions of A, k−1 source-label pairs can be encoded, connecting the stake on top of the tree gadget k−1 times with other stakes. To simulate a transition σ −a_j→ σ' of B_k in C_k, from a leaf of the tree gadget of σ to the top of the stake of σ', we need to be at a leaf ⟨σ, w⟩ of the tree gadget of σ such that the combined string wα, for α ∈ A, is the binary encoding according to lbl of the index j. An α-transition thus leads from the source ⟨σ, w⟩ to the target [σ', 1] if σ −a_j→ σ' in B_k and wα corresponds to j. The initial partition distinguishes, for each level ℓ, the states at level ℓ of the stakes of strings starting with 0, collected in C_0^ℓ, the states at level ℓ of the stakes of strings starting with 1, collected in C_1^ℓ, and the states of the tree gadgets, collected in C_ε.
Figure 3 depicts the layered 3-splitter C_3. Because B_3 also has an action set of size 2, the tree gadgets only consist of the root node, of the form ⟨σ, ε⟩. Comparing with the bisplitter B_3 of Figure 2, the initial partition π_0^C of C_3 contains 17 blocks: for each level ℓ, 1 ≤ ℓ ≤ 2^3, π_0^C contains a block holding the four states of the stakes in C_0^ℓ on the left and a block with the four stake states in C_1^ℓ on the right, and lastly there is one block consisting of the eight tree states in C_ε at the bottom of the picture.
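The block count can be checked mechanically. The following Python sketch (our own reconstruction of the initial partition, with hypothetical state encodings) rebuilds π_0^C for C_3 and counts its blocks:

```python
# Sanity check: build the initial partition of the layered bisplitter C_3
# (k = 3) and count its blocks. For k = 3 the tree gadgets are single
# roots <sigma, eps>, so the tree states form one block C_eps.
from itertools import product

k = 3
B = ["".join(bits) for bits in product("01", repeat=k)]

blocks = []
for level in range(1, 2**k + 1):
    blocks.append({(s, level) for s in B if s[0] == "0"})  # C_0 block at this level
    blocks.append({(s, level) for s in B if s[0] == "1"})  # C_1 block at this level
blocks.append({(s, "eps") for s in B})                     # all tree (root) states

print(len(blocks))  # -> 17, as observed for C_3 above
```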
The 6-th bisplitter B_6 has five actions, a_1 to a_5. A tree gadget for the layered bisplitter C_6, with corresponding outgoing transitions, is drawn in Figure 4. The tree has height ⌈log((6−1)/2)⌉ = ⌈log(5/2)⌉ = 2, hence it has 2^2 = 4 leaves. Since each leaf has two outgoing transitions, one labelled a and one labelled b, the two leftmost leaves ⟨σ, aa⟩ and ⟨σ, ab⟩ are used with the two labels a and b to simulate the transitions for a_1 up to a_4, while the two rightmost leaves ⟨σ, ba⟩ and ⟨σ, bb⟩ together have four transitions, all simulating the a_5-transition of σ.
The next lemma introduces three facts for the layered bisplitter C_k that we need in the sequel. The first states that if two states at different stakes, but at the same level, are separated during partition refinement, then all corresponding states at lower levels are separated as well. The second fact helps to transfer witnessing transitions in B_k to the setting of C_k: it provides a path through the tree gadget of σ from root to leaf and then to the top state [σ', 1] of the stake of σ'. The word wα encountered going down and out of the tree gadget corresponds to the action a_j according to the lbl-function. Lastly, it is shown that no two different states within the stakes are bisimilar.
Proof. (a) For a proof by contradiction, suppose the partition π is the first partition of Π that falsifies the statement of the lemma. So π ≠ π_0^C, since for the initial partition π_0^C the statement holds. Thus, π is a refinement of a partition π' in Π. So, there are two states [σ, ℓ], [σ', ℓ] ∈ S_{C_k} in different blocks of π, while the states [σ, ℓ+1], [σ', ℓ+1] are in the same block of π and hence of π'. Since [σ, ℓ] and [σ', ℓ] only have transitions to [σ, ℓ+1] and [σ', ℓ+1], respectively, which are in the same block of π', the refinement would not have been valid. We conclude that no falsifying partition π in Π exists and that the lemma holds.
(b) We first prove that, for all w ∈ A^* with |w| ≤ ⌈log(k−1)⌉ − 1, if ⟨σ_1, w⟩ and ⟨σ_2, w⟩ are split in π, then there are v ∈ A^* and α ∈ A such that the states reached from ⟨σ_1, wv⟩ and ⟨σ_2, wv⟩ by an α-transition are in different blocks of π. We prove this for all possible lengths |w| by reverse induction. If w has maximal length, i.e. |w| = ⌈log(k−1)⌉ − 1, this is clear. If ⟨σ_1, w⟩ and ⟨σ_2, w⟩ are split, for |w| < ⌈log(k−1)⌉ − 1, then either their a-transitions or their b-transitions lead to split states. By the induction hypothesis, suitable paths exist from the targets of such transitions. Prepending the respective transition proves the induction step. Since [σ_1, 2^k] and [σ_2, 2^k] can only reach ⟨σ_1, ε⟩ and ⟨σ_2, ε⟩, respectively, the statement follows.
The next lemma states that the splitting of states [σ, ℓ] ∈ S_{C_k}, for each level ℓ, has refinement costs that are at least those of B_k.

Lemma 6.3. It holds that rc(C_k) ≥ 2^k · rc(B_k) for all k > 1.
Proof. Let Π = (π_0^C, π_1, ..., π_n) be a valid refinement sequence for C_k. We show that, for each level ℓ, the sequence Π induces a valid refinement sequence Π^ℓ for B_k.
For each ℓ ∈ N such that 1 ≤ ℓ ≤ 2^k, we define a partial projection function p_ℓ : S_{C_k} ⇀ B^k. The mapping p_ℓ maps states of shape [σ, ℓ] ∈ S_{C_k} of C_k to σ ∈ B^k and is undefined on all other states. A block B in a partition of C_k is mapped to the block p_ℓ[B] of B_k by applying p_ℓ on all its elements, resulting in p_ℓ[B] = { σ ∈ B^k | [σ, ℓ] ∈ B }. A partition p_ℓ(π) of B_k is obtained by applying p_ℓ to a partition π of C_k and ignoring the empty blocks, i.e. p_ℓ(π) = { p_ℓ[B] | B ∈ π } \ {∅}. The sequence Π^ℓ = (π_0^ℓ, ..., π_m^ℓ) is obtained from the sequence (p_ℓ(π_0^C), p_ℓ(π_1), ..., p_ℓ(π_n)) by removing possible duplicates. We verify that Π^ℓ is a valid refinement sequence for B_k.
First, we check that π_i^ℓ is a refinement of π_{i−1}^ℓ, for 1 ≤ i ≤ m. Choose the index i arbitrarily. Let the index h with 1 ≤ h ≤ n be such that p_ℓ(π_{h−1}) = π_{i−1}^ℓ and p_ℓ(π_h) = π_i^ℓ. Then we fix a block B ∈ π_i^ℓ that results from splitting a block of π_{i−1}^ℓ, separating states σ_1 and σ_2, say. By Lemma 6.2, the corresponding split in Π traces back through the tree gadgets to a leaf word wα and a stage g with [σ_1, 1] and [σ_2, 1] in different blocks of π_{g−1}. Hence, σ_1 and σ_2 are in different blocks of π_{i−1}^ℓ, while σ_1 −a_j→_B σ'_1 and σ_2 −a_j→_B σ'_2 for j = lbl(wα), which justifies splitting σ_1 and σ_2 for π_i^ℓ. We conclude that Π^ℓ is a valid refinement sequence for B_k.
We have established that if Π is a valid refinement sequence for C_k, then Π^ℓ is a valid refinement sequence for B_k. The sequence Π^ℓ is obtained from Π by sifting out the blocks of Π's partitions and removing repeated partitions. Therefore it holds that rc(Π) ≥ rc(Π^ℓ). Since the mappings p_ℓ and p_{ℓ'} concern pairwise distinct sets of stake states for ℓ ≠ ℓ', 1 ≤ ℓ, ℓ' ≤ 2^k, it follows that rc(Π) ≥ Σ_{ℓ=1}^{2^k} rc(Π^ℓ) ≥ 2^k · rc(B_k). Taking the minimum over all valid refinement sequences for C_k, we conclude that rc(C_k) ≥ 2^k · rc(B_k), as was to be shown.
With the above technical lemma in place, we are able to strengthen the Ω(n log n) lowerbound of Theorem 4.4 by now taking the number of transitions into account. The improved lowerbound is Ω((m + n) log n), where m is the number of transitions and n the number of states.
Theorem 6.4. Deciding bisimilarity for (deterministic) LTSs with a partition refinement algorithm is Ω((m + n) log n), where n is the number of states and m is the number of transitions of the LTS.
Underlying the proof of the lowerbound for deciding bisimilarity for the family of layered bisplitters C_k is the observation that each C_k can be seen as 2^k stacked instances of the ordinary bisplitter B_k, augmented with tree gadgets to handle transitions properly. The other essential ingredient for the proof of Theorem 6.4 is the complexity of deciding bisimilarity with a partition refinement algorithm on the B_k family. The same reasoning applies when considering partition refinement algorithms with an oracle for end structures from Section 5: also with an oracle, the lowerbound of Ω((m+n) log n) remains.

Theorem 6.5. Any partition refinement algorithm with an oracle for end structures that decides bisimilarity for (deterministic) LTSs is Ω((m + n) log n).
Proof. The proof is similar to that of Lemma 5.4 and Theorem 6.4. Consider, for some k > 2, the layered bisplitter C_k having initial partition π_0^C. The LTS C_k has two end structures, viz. the set S_0 ⊆ S_{C_k} containing the states of the stake and accompanying tree gadget for 0^k, and a similar end structure S_1 ⊆ S_{C_k} for 10^{k−1}. The sets S_0 and S_1 are minimally closed under the transitions of C_k. Other states, on the stake or tree gadget for a string σ, have a path to these sets, inherited from a path from σ to 0^k or 10^{k−1} in B_k. The bisimulation classes S'_0 and S'_1, say, with respect to S_{C_k} rather than π_0^C, consist of S_0 and S_1 themselves plus a part of the tree gadgets for transitions in C_k leading to S_0 and S_1, respectively.
The update of the initial partition π_0^C with oracle information, which concerns, ignoring the tree gadgets, the common refinement of the layers C_0^ℓ and C_1^ℓ, for 1 ≤ ℓ ≤ 2^k, on the one hand, and the bisimulation classes S'_0 and S'_1 on the other hand, is therefore equal to π_0^C on the stakes, and generally finer on the tree gadgets. Next, every valid refinement sequence Π = (π'_0, π_1, ..., π_n) for the updated LTS C'_k = (S, A, →, π'_0) satisfies rc(Π) ≥ rc(C_{k−2}). Following the lines of the proof of Lemma 5.4, we can show that a valid refinement sequence Π for C_k with updated initial partition π'_0 induces a valid refinement sequence Π' for C_{k−2}.
The number of states in C_{k−2} is Θ(n), with n the number of states of C_k, and the number of transitions in C_{k−2} is Θ(m), with m the number of transitions of C_k. Therefore, rc(Π) ≥ rc(Π') ≥ rc(C_{k−2}), from which we derive that any partition refinement algorithm with an oracle for end structures involves Ω((m+n) log n) moves of a state for C_k, and hence the algorithm is Ω((m+n) log n).

An Ω(n) lowerbound for parallel partitioning algorithms
In this section we pose the question of the effect of the concept of valid refinements on parallel partition refinement algorithms. We show an Ω(n) lowerbound. This result was already suggested in [Kul13, Theorem 3], without making explicit which operations are allowed to calculate the refinement. In particular, for deterministic LTSs with singleton alphabets, an O(log n) parallel refinement algorithm [JR94] exists, defying the argumentation of [Kul13]. This latter algorithm is clearly not based on valid refinements.
Parallel bisimulation algorithms are most conveniently studied in the context of PRAMs (Parallel Random Access Machines) [SV84], which have an unbounded number of processors that can all access the available memory. PRAMs are approximated by GPUs (Graphics Processing Units), which currently contain thousands of processor cores but, more interestingly, in combination with the operating system, can run millions of independent threads simultaneously.
There are a few variants of the PRAM model. The most important variation is in what happens when multiple processors try to write to the same address in memory. In the common scheme, a write to a particular address takes place only if all processors writing to this address write the same value; otherwise, the write fails and the address will contain an arbitrary value. In the arbitrary scheme, one of the processors writing to the address wins and writes its value; the writes of the other processors to the address are ignored. In the priority scheme, the processor with the lowest index writes to the address.
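The three schemes can be contrasted with a toy simulation. The following Python sketch (illustrative only, not a PRAM implementation) resolves a set of concurrent writes to a single memory cell under each scheme:

```python
# Illustrative resolution of concurrent writes to one memory cell under the
# three PRAM write schemes. Each entry of `writes` is (processor_index, value).
import random

def resolve_write(writes, scheme):
    if not writes:
        return None
    values = [v for _, v in writes]
    if scheme == "common":
        # The write succeeds only if all processors agree; otherwise the
        # cell holds an arbitrary value (modelled here as None).
        return values[0] if len(set(values)) == 1 else None
    if scheme == "arbitrary":
        # A nondeterministically chosen processor wins.
        return random.choice(values)
    if scheme == "priority":
        # The processor with the lowest index wins.
        return min(writes)[1]
    raise ValueError(scheme)

writes = [(2, "x"), (0, "y"), (1, "x")]
print(resolve_write(writes, "priority"))  # -> 'y' (processor 0 wins)
print(resolve_write(writes, "common"))    # -> None (no agreement)
```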
A number of algorithms have been proposed to calculate bisimulation on PRAMs or GPUs [LR94, RL98, Wij15, MGH + 21], and there are also parallel algorithms developed for networks of parallel computers [BO05].The algorithms in [LR94,RL98] require O(n log n) time on respectively m log n log log n and m n log n processors.The algorithm in [MGH + 21] has the best worst-case time complexity of O(n) and uses max(n, m) processors.All these parallel algorithms have in common that they can be classified as partition refinement algorithms in the sense that they all calculate a valid sequence of partitions.
Note that parallel refinement algorithms can fundamentally outperform sequential algorithms. In order to understand how parallel algorithms achieve an upperbound of O(n) versus a lowerbound of Ω((m + n) log n) for sequential algorithms, we look at the algorithm in [MGH + 21] in more detail, as it has the best time complexity. For the sake of exposition we assume here that there is only one action, although the story with multiple actions is essentially the same. In the algorithm, first an unstable block is chosen. All states reaching this block are marked, which is done in constant time by one processor per transition. Subsequently, the marked states in each block separate themselves from the other states in constant time, using one processor per state, by employing an intricate trick where each block is characterised by a unique 'leader' state. Here it is essential that the PRAM model uses the arbitrary or priority writing scheme; the algorithm does not work in the common scheme. In [MGH + 21] it is shown that at most 3n of these constant-time splitting steps need to be performed, leading to a time complexity of O(n).
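The marking-and-splitting step can be emulated sequentially. The sketch below (our simplification, not the actual leader-election algorithm of [MGH + 21]) performs one such step for a single-action LTS: mark all states with a transition into the splitter, then split every block into its marked and unmarked parts. In the PRAM setting, the marking uses one processor per transition and the moves one processor per state, so the whole step takes constant time.

```python
# Sequential emulation of one parallel splitting step: mark all predecessors
# of the splitter block, then separate marked from unmarked states per block.

def split_step(blocks, trans, splitter):
    """blocks: list of sets of states; trans: set of (src, tgt) pairs."""
    marked = {s for (s, t) in trans if t in splitter}
    new_blocks = []
    for b in blocks:
        hit, miss = b & marked, b - marked
        new_blocks.extend(part for part in (hit, miss) if part)
    return new_blocks

blocks = [{1, 2, 3, 4}]
trans = {(1, 9), (2, 9), (3, 8)}
print(split_step(blocks, trans, {9}))  # -> [{1, 2}, {3, 4}]
```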
An important observation is that parallel algorithms allow blocks to be split in constant time, whereas for sequential algorithms we defined the refinement costs as the minimal number of states that have to be moved from one block to a new block. We assume, similarly to the sequential setting, that a new refinement can only be calculated on the basis of the previous refinement. This makes it natural to define the notion of parallel refinement costs as the minimal conceivable length of a valid refinement sequence, and to take this as the minimal time required to calculate bisimulation using partitioning in a parallel setting.
For an LTS L and a sequence Π = (π_0, ..., π_n), the parallel refinement cost is the number of refinements in the sequence, prc(Π) = n. For an LTS L we define prc(L) = min{ prc(Π) | Π is a valid refinement sequence for L }.
Observe that parallel refinement costs allow for extremely fast partitioning of transition systems. Below we show an example with 2^k + k states with a parallel refinement cost of 1. The states are given by b_0, ..., b_{k−1}, a_0, ..., a_{2^k−1}. There is a transition from a_i to b_j iff the j-th bit in the binary representation of i is 1. The initial partition π_init groups all states a_i in one block and puts each state b_j in a block of its own. So, π_init contains k+1 blocks. In Figure 5 this transition system is depicted for k = 3. The shortest valid refinement sequence is (π_init, π_final), where in π_final each state is in a separate block. This refinement is valid, because in π_init there is enough information to separate each state from any other, as can easily be checked against the definition. As this refinement sequence has length 1, the parallel refinement cost of this transition system is 1, indicating that it is conceivable for a bisimulation partitioning algorithm to perform this refinement in constant time. Note that existing parallel algorithms do not achieve this performance. For instance, the algorithm in [MGH + 21] requires linear time, as it checks stability for each new block sequentially.
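The example can be replayed in a few lines. The following Python sketch (our reconstruction, using hypothetical state encodings ("a", i) and ("b", j)) computes the one-step signatures of the a-states with respect to π_init and confirms that they are pairwise distinct, so a single refinement separates all of them at once:

```python
# States a_0..a_{2^k - 1} and b_0..b_{k-1}, with a transition a_i -> b_j iff
# bit j of i is 1. In the initial partition every a_i sees a distinct set of
# target blocks {b_j}, so one refinement step suffices.

k = 3
a_states = [("a", i) for i in range(2**k)]
b_states = [("b", j) for j in range(k)]

def signature(state, partition):
    """Indices of the partition blocks reachable in one step from `state`."""
    (_, i) = state
    targets = {("b", j) for j in range(k) if state[0] == "a" and (i >> j) & 1}
    return frozenset(bi for bi, block in enumerate(partition) if targets & block)

# pi_init: all a-states together, each b_j alone (k + 1 blocks).
pi_init = [set(a_states)] + [{b} for b in b_states]
sigs = {s: signature(s, pi_init) for s in a_states}
print(len(set(sigs.values())))  # -> 8: all 2^3 a-states are already separable
```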
Although parallel partitioning can be fast, we show, using the notion of parallel refinement costs, that calculating bisimulation in parallel requires time Ω(n). For this purpose, we construct a family of LTSs D_n for which the length of any valid refinement sequence grows linearly with the number of states.
Still, the Kripke structures S_k of [BBG17] can be interpreted as non-deterministic labelled transition systems with a single action label and an initial partition based on the assignment of atomic propositions. Similarly, it is not obvious how to transform the C_k-family into a family of undirected or directed graphs such that colour refinement requires inspection of Ω((m+n) log n) edges.

Conclusion
We have shown that, even when restricted to deterministic LTSs, it is not possible to construct a sequential algorithm based on partition refinement that is more efficient than Ω((m + n) log n). The bound obtained is preserved even when the algorithm is extended with an oracle that can determine in constant time whether specific states are bisimilar or not. The oracle proof technique enabled us to show that the algorithmic ideas underlying the linear algorithm of Paige, Tarjan, and Bonic [PTB85] for the one-letter alphabet case cannot be used to come up with a fundamentally faster enhanced partition refinement algorithm for bisimulation.
Of course, this does not address a generic lower bound for deciding bisimilarity on LTSs, nor does it prove the conjecture that the Paige-Tarjan algorithm is optimal for deciding bisimilarity. It is conceivable that a more efficient algorithm for bisimilarity exists that is not based on partitioning. However, as it stands, no techniques are known to prove such a generic algorithmic lowerbound, and the techniques that do exist make assumptions on the allowed operations, such as the well-known lowerbound on sorting.
But by relaxing the notion of a valid partition sequence, and perhaps by introducing alternatives for oracles, it may well be possible to extend the lower bound to a wider range of algorithmic techniques for determining bisimulation, making it very unlikely that sequential algorithms for bisimulation with a time complexity better than O((m + n) log n) exist. Note that the current lowerbound already applies to all known efficient algorithms for bisimulation.
For the parallel setting, we showed that deciding bisimilarity by partitioning is Ω(n). In this case a similar situation occurs. For LTSs with one action label it is possible to calculate bisimulation in logarithmic time, cf. [JR94]. An interesting, but as yet open, question is whether the techniques used in [JR94] can fundamentally improve the efficiency of determining bisimulation in parallel, or whether, as we believe, the lowerbound result can be strengthened along the lines of the sequential case to show that the techniques of [JR94] are insufficient to obtain a sub-linear parallel time complexity for determining bisimulation for labelled transition systems with at least two action labels.

Figure 1: An example of a deterministic LTS with initial partition (action label suppressed).

Figure 2: The bisplitters B_1, B_2, and B_3. Initial partitions are indicated by single-circled and double-circled states.
Lemma 4.3. Let k ≥ 1 and consider the LTS with initial partition B_k = (B^k, A_k, →, π_0^k), i.e. the k-th bisplitter. Let the sequence Π = (π_0^k, ..., π_n) be a valid refinement sequence for B_k. Then it holds that: (a) every partition π_i in Π contains prefix blocks only; (b) if partition π_i is stable, then all its blocks are singletons; (c) in every refinement step, a prefix block B_σ is either kept or split into the two prefix blocks B_σ0 and B_σ1 of equal size. For the latter, suppose B_σ ∈ π_i and |σ| = ℓ < k, and let θ ∈ B^* be such that σ_1 = σ0θ and σ_2 = σ1θ; then σ_1 and σ_2 are separated by a suitable transition, which drives the further refinement.

Theorem 4.4. For any k > 1, application of partition refinement to the bisplitter B_k has refinement costs rc(B_k) ∈ Ω(n log n), where n = 2^k is the number of states of B_k.

Proof. Let Π = (π_0^k, ..., π_m) be a valid refinement sequence for B_k. By items (a) and (b) of Lemma 4.3, we have π_m = { {s} | s ∈ B^k }, since π_m is stable and thus contains singleton blocks only. Item (c) of Lemma 4.3 implies that in every refinement step (π_i, π_{i+1}) a block is either kept or refined into two prefix blocks of equal size. The cost of refining the block B_σ, for 1 ≤ |σ| ≤ k−1, into B_σ0 and B_σ1 is the number of states in B_σ0 or the number of states in B_σ1, which are the same and equal to (1/2)·2^{k−|σ|}. Therefore, we have rc(Π) ≥ Σ_{ℓ=1}^{k−1} 2^ℓ · (1/2)·2^{k−ℓ} = (k−1)·2^{k−1} ∈ Ω(n log n).

Definition 5.1. Given an LTS L = (S, A, →, π_0) with an initial partition, a non-empty subset S' ⊆ S is called an end structure of L iff S' is a minimal set of states that is closed under all transitions: L[S'] ⊆ S', and for all S'' ⊆ S it holds that L[S''] ⊆ S'' and S'' ∩ S' ≠ ∅ implies S' ⊆ S''. Moreover, es(L) = { S' ⊆ S | S' an end structure of L } and ES(L) = ⋃ es(L). For the LTS of Figure 1, the oracle for end structures refines the two initial blocks {c_1, c_3, c_4, c_6} ∪ {s_11, s_13, s_14} ∪ {s_21, s_22, s_23} ∪ {s_31} ∪ {s_42, s_43, s_44} ∪ {s_52} and {c_2, c_5} ∪ {s_12} ∪ {s_32} ∪ {s_41} ∪ {s_51, s_53} into the end structure partition π_es with five blocks, viz.
the three blocks {c_1, c_4} ∪ {s_13, s_21, s_52}, {c_2, c_5} ∪ {s_12, s_32}, and {c_3, c_6} ∪ {s_11, s_14, s_22, s_23}, consisting of the states on the cycle together with the states that are bisimilar to them, on the one hand, and the two blocks with the remaining states {s_31} ∪ {s_42, s_43, s_44} and {s_41} ∪ {s_51, s_53} on the other hand.

Lemma 5.2. Let L = (S, A, →, π_0) be a deterministic LTS. (a) If |A| = 1, then es(L) consists of all cycles in L. (b) Every s ∈ S has a path to an end structure of L.

Proof. (a) Since an end structure S' is closed under transitions, S' is a lasso. Because S' is minimal and non-empty, it follows that S' is a cycle. (b) Let U = { t ∈ S | s −w→ t, w ∈ A^* } be the set of states reachable from state s. Then the set U is closed under transitions. A minimal non-empty subset U' ⊆ U that is still closed under transitions is an end structure of L and can be reached from s.
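Lemma 5.2(a) suggests a simple procedure for deterministic LTSs with one action. The following Python sketch (a hypothetical helper, not code from the paper) finds the end structures as the cycles eventually reached when following the unique successor function:

```python
# For a deterministic one-action LTS given by a successor map, the end
# structures are exactly the cycles (Lemma 5.2(a)): follow the unique run
# from each state until a state repeats, and keep the cyclic part.

def end_structure_from(s, succ):
    """Return the cycle (as a frozenset) eventually reached from s."""
    seen = {}
    while s not in seen:
        seen[s] = len(seen)
        s = succ[s]
    cycle_start = seen[s]
    return frozenset(t for t, i in seen.items() if i >= cycle_start)

def end_structures(succ):
    return {end_structure_from(s, succ) for s in succ}

# A lasso: 0 -> 1 -> 2 -> 3 -> 1 (cycle {1, 2, 3}) and a self-loop 4 -> 4.
succ = {0: 1, 1: 2, 2: 3, 3: 1, 4: 4}
print(end_structures(succ))  # -> {frozenset({1, 2, 3}), frozenset({4})}
```

This also illustrates Lemma 5.2(b): the run from state 0 witnesses its path into the end structure {1, 2, 3}.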

Figure 3: The partial layered bisplitter C_3 with tree gadgets; the colours represent the initial partition.

Figure 4: The example of the outgoing tree for C_6 from the root ⟨011010, ε⟩ ∈ S_{C_6}.