Partial Order Infinitary Term Rewriting

We study an alternative model of infinitary term rewriting. Instead of a metric on terms, a partial order on partial terms is employed to formalise convergence of reductions. We consider both a weak and a strong notion of convergence and show that the metric model of convergence coincides with the partial order model restricted to total terms. Hence, partial order convergence constitutes a conservative extension of metric convergence, which additionally offers a fine-grained distinction between different levels of divergence. In the second part, we focus our investigation on strong convergence of orthogonal systems. The main result is that the gap between the metric model and the partial order model can be bridged by extending the term rewriting system by additional rules. These extensions are the well-known Böhm extensions. Based on this result, we are able to establish that -- contrary to the metric setting -- orthogonal systems are both infinitarily confluent and infinitarily normalising in the partial order setting. The unique infinitary normal forms that the partial order model admits are Böhm trees.


Introduction
Infinitary term rewriting [14] extends the theory of term rewriting by giving a meaning to transfinite rewriting sequences. Its formalisation [9] is chiefly based on the metric space of terms as studied by Arnold and Nivat [2]. Other models for transfinite reductions, using for example general topological spaces [22] or partial orders [7,6], were mainly considered to pursue quite specific purposes and have not seen nearly as much attention as the metric model. In this paper we introduce a novel foundation of infinitary term rewriting based on the partially ordered set of partial terms [12]. We show that this model of infinitary term rewriting is superior to the metric model. This assessment is supported by two findings: First, the partial order model of infinitary term rewriting conservatively extends the metric model. That is, anything that can be done in the metric model can be achieved in the partial order model as well by simply restricting it to the set of total terms. Secondly, unlike the metric model, the partial order model provides a fine-grained distinction between different levels of divergence and exhibits nice properties like infinitary confluence and normalisation of orthogonal systems.
The defining core of a theory of infinitary term rewriting is its notion of convergence for transfinite reductions: which transfinite reductions are "admissible" and what is their final outcome. In this paper we study both variants of convergence that are usually considered in the established theory of metric infinitary term rewriting: weak convergence [9] and strong convergence [16]. For both variants we introduce a corresponding notion of convergence based on the partially ordered set of partial terms.
The first part of this paper is concerned with comparing the metric model and the partial order model both in their respective weak and strong variants. In both cases, the partial order approach constitutes a conservative extension of the metric approach: a reduction in the metric model is converging iff it is converging in the partial order model and only contains total terms.
In the second part we focus on strong convergence in orthogonal systems. To this end we reconsider the theory of meaningless terms of Kennaway et al. [17]. In particular, we consider Böhm extensions. The Böhm extension of a term rewriting system adds rewrite rules which admit contracting meaningless terms to ⊥. The central result of the second part of this paper is that the additional rules in Böhm extensions close the gap between partial order convergence and metric convergence. More precisely, we show that reachability w.r.t. partial order convergence in a term rewriting system coincides with reachability w.r.t. metric convergence in the corresponding Böhm extension.
From this result we can easily derive a number of properties of strong partial order convergence in orthogonal systems:
• infinitary confluence,
• infinitary normalisation, and
• compression, i.e. each reduction can be compressed to length at most ω.
The first two properties exhibit another improvement over the metric model, which has neither of them. Moreover, this means that each term has a unique infinitary normal form: its Böhm tree.
The most important tool for establishing these results is provided by a notion of complete developments that we have transferred from the metric approach to infinitary rewriting [16]. We show that the final outcome of a complete development is unique and that, in contrast to the metric model, the partial order model admits complete developments for any set of redex occurrences. To this end, we use a technique similar to paths and finite jumps known from metric infinitary term rewriting [14,21].
Outline. After providing the basic preliminaries for this paper in Section 1, we will briefly recapitulate the metric model of infinitary term rewriting including meaningless terms and Böhm extensions in Section 2. In Section 3, we introduce our novel approach to infinitary term rewriting based on the partial order on terms. In Section 4, we compare both models and establish that the partial order model provides a conservative extension of the metric model. In the remaining part of this paper, we focus on the strong notion of convergence. In Section 5, we establish a theory of complete developments in the setting of partial order convergence. This is then used in Section 6 to prove the equality of reachability w.r.t. partial order convergence and reachability w.r.t. metric convergence in the Böhm extension. Finally, we evaluate our results and point to interesting open questions in Section 7.

Preliminaries
We assume the reader to be familiar with the basic theory of ordinal numbers, orders and topological spaces [13], as well as term rewriting [24]. In the following, we briefly recall the most important notions.
1.1. Transfinite Sequences. We use α, β, γ, λ, ι to denote ordinal numbers. A transfinite sequence (or simply sequence) S of length α in a set A, written (a_ι)_{ι<α}, is a function from α to A with ι ↦ a_ι for all ι ∈ α. We use |S| to denote the length α of S. If α is a limit ordinal, then S is called open; otherwise it is called closed. If α is a finite ordinal, then S is called finite; otherwise it is called infinite. For a finite sequence (a_i)_{i<n} or a sequence (a_i)_{i<ω} of length ω, we also use the notation ⟨a_0, a_1, …, a_{n−1}⟩ resp. ⟨a_0, a_1, …⟩. In particular, ⟨⟩ denotes the empty sequence.
The concatenation (a_ι)_{ι<α} · (b_ι)_{ι<β} of two sequences is the sequence (c_ι)_{ι<α+β} with c_ι = a_ι for ι < α and c_{α+ι} = b_ι for ι < β. A sequence S is a (proper) prefix of a sequence T, denoted S ≤ T (resp. S < T), if there is a (non-empty) sequence S′ with S · S′ = T. The prefix of T of length β is denoted T|_β. The prefix order ≤ forms a complete semilattice (see Section 1.3 below). Similarly, a sequence S is a (proper) suffix of a sequence T if there is a (non-empty) sequence S′ with S′ · S = T.
Let S = (a_ι)_{ι<α} be a sequence. A sequence T = (b_ι)_{ι<β} is called a subsequence of S if there is a monotone function f : β → α such that b_ι = a_{f(ι)} for all ι < β. To indicate this, we write S/f for the subsequence T. If f(ι) = f(0) + ι for all ι < β, then S/f is called a segment of S. That is, T is a segment of S iff there are two sequences T_1, T_2 such that S = T_1 · T · T_2. We write S|_{[β,γ)} for the segment S/f, where f : α′ → α is the mapping defined by f(ι) = β + ι for all ι < α′, with α′ the unique ordinal such that γ = β + α′. Note that in particular S|_{[0,α)} = S|_α for each sequence S and ordinal α ≤ |S|.
1.3. Partial Orders. A partial order ≤ on a set A is a binary relation on A that is transitive, reflexive, and antisymmetric. The pair (A, ≤) is then called a partially ordered set. We use < to denote the strict part of ≤, i.e. a < b iff a ≤ b and a ≠ b. A sequence (a_ι)_{ι<α} in (A, ≤) is called a (strict) chain if a_ι ≤ a_γ (resp. a_ι < a_γ) for all ι < γ < α. A subset D of the underlying set A is called directed if it is non-empty and each pair of elements in D has an upper bound in D. A partially ordered set (A, ≤) is called a complete semilattice if it has a least element, every directed subset D of A has a least upper bound (lub) ⊔D, and every subset of A having an upper bound also has a least upper bound. Hence, in a complete semilattice every non-empty subset B has a greatest lower bound (glb) ⨅B. In particular, this allows us to define the limit inferior of a sequence (a_ι)_{ι<α} as lim inf_{ι→α} a_ι = ⊔_{β<α} (⨅_{β≤ι<α} a_ι). Note that the limit in a metric space has the same behaviour as the limit inferior defined above. However, one has to keep in mind that, unlike the limit, the limit inferior is not invariant under taking cofinal subsequences!
With the prefix order ≤ on sequences we can generalise concatenation to arbitrary sequences of sequences: Let (S_ι)_{ι<α} be a sequence of sequences in a common set. The concatenation of (S_ι)_{ι<α}, written ∏_{ι<α} S_ι, is recursively defined as the empty sequence ⟨⟩ if α = 0, as ∏_{ι<α′} S_ι · S_{α′} if α = α′ + 1, and as ⊔_{γ<α} ∏_{ι<γ} S_ι if α is a limit ordinal. For instance, the concatenation ∏_{i<ω} ⟨i, i + 1⟩ yields the sequence ⟨0, 1, 1, 2, 2, 3, 3, …⟩ of length ω, and the concatenation ∏_{ι<α} ⟨ι⟩, for any ordinal α, yields the sequence (ι)_{ι<α}.
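For countable lengths, the generalised concatenation has a direct operational reading. The following Python sketch (the helper names `concat` and `pairs` are ours, introduced only for illustration) reproduces a finite prefix of the concatenation of the sequences ⟨i, i + 1⟩ for i < ω:

```python
from itertools import chain, islice

def concat(seqs):
    """Concatenate a (possibly infinite) iterable of sequences in order:
    the countable case of the generalised concatenation defined above."""
    return chain.from_iterable(seqs)

def pairs():
    """Yield the sequences <i, i+1> for i = 0, 1, 2, ..."""
    i = 0
    while True:
        yield (i, i + 1)
        i += 1

# A finite prefix of the concatenation of <i, i+1> for i < omega:
prefix = list(islice(concat(pairs()), 7))  # [0, 1, 1, 2, 2, 3, 3]
```

Transfinite lengths beyond ω are of course outside the reach of such a sketch; there the recursive definition with lubs of prefixes takes over.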
1.4. Terms. Unlike in the traditional, i.e. finitary, framework of term rewriting, we consider the set T∞(Σ, V) of infinitary terms (or simply terms) over some signature Σ and a countably infinite set V of variables. A signature Σ is a countable set of symbols. Each symbol f is associated with its arity ar(f) ∈ ℕ, and we write Σ^(n) for the set of symbols in Σ which have arity n. The set T∞(Σ, V) is defined as the greatest set T such that, for each element t ∈ T, we either have t ∈ V or t = f(t_0, …, t_{k−1}), where f ∈ Σ^(k) and t_0, …, t_{k−1} ∈ T. A symbol c ∈ Σ^(0) of arity 0 is also called a constant symbol, and we use the shorthand c to denote the term c(). We consider T∞(Σ, V) as a superset of the set T(Σ, V) of finite terms.
Note that while the set of terms T ∞ (Σ, V) is defined coinductively, the set of positions of a term is defined inductively. Consequently, the subterm at a position and substitution at a position are defined by recursion.
The similarity sim(s, t) of two terms s and t is the minimal depth at which s and t differ, or ∞ if s = t; that is, sim(s, t) = min ({|π| | π ∈ P(s) ∩ P(t), s(π) ≠ t(π)} ∪ {∞}). Based on this, a distance function d can be defined by d(s, t) = 2^{−sim(s,t)}, where we interpret 2^{−∞} as 0. Note that 0 ≤ d(s, t) ≤ 1. In particular, d(s, t) = 0 iff s and t coincide, and d(s, t) = 1 iff s and t differ at the root. The pair (T∞(Σ, V), d) is known to form a complete ultrametric space [2]. Partial terms, i.e. terms over the signature Σ_⊥ = Σ ⊎ {⊥} with ⊥ a fresh constant symbol, can be endowed with a binary relation ≤_⊥ by defining s ≤_⊥ t iff s can be obtained from t by replacing some subterm occurrences in t by ⊥. Interpreting the term ⊥ as denoting "undefined", ≤_⊥ can be read as "is less defined than". The pair (T∞(Σ_⊥, V), ≤_⊥) is known to form a complete semilattice [12]. For a partial term t ∈ T∞(Σ_⊥, V) we use the notation P_{≠⊥}(t) for the set {π ∈ P(t) | t(π) ≠ ⊥} of non-⊥ positions, and P_Σ(t) for the set {π ∈ P(t) | t(π) ∈ Σ} of positions of function symbols. With this, ≤_⊥ can be characterised alternatively by: s ≤_⊥ t iff s(π) = t(π) for all π ∈ P_{≠⊥}(s). To explicitly distinguish them from partial terms, we call terms in T∞(Σ, V) total.
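To make these notions concrete on finite terms, here is a minimal Python sketch in which (partial) terms are represented as nested tuples ('f', t1, …, tk); the helpers `sim`, `d` and `less_defined` are our names for the similarity, the distance and the relation ≤_⊥, introduced purely for illustration (the code assumes, as the signature guarantees, that equal symbols have equal arities):

```python
# Finite (partial) terms as nested tuples: ('f', t1, ..., tk); BOT is ⊥.
BOT = ('⊥',)

def sim(s, t):
    """Minimal depth at which s and t differ; float('inf') if s = t."""
    if s[0] != t[0]:
        return 0
    return 1 + min((sim(a, b) for a, b in zip(s[1:], t[1:])),
                   default=float('inf'))

def d(s, t):
    """The distance d(s, t) = 2^(-sim(s, t)), with 2^(-inf) read as 0."""
    m = sim(s, t)
    return 0.0 if m == float('inf') else 2.0 ** -m

def less_defined(s, t):
    """s <=_bot t: s is obtained from t by replacing some subterms by ⊥."""
    if s == BOT:
        return True
    return s[0] == t[0] and all(less_defined(a, b)
                                for a, b in zip(s[1:], t[1:]))
```

For example, g(a) and g(f(a)) differ first at depth 1, so their distance is 2^{−1} = 0.5, while f(⊥, b) ≤_⊥ f(a, b) holds but not the converse.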
1.5. Term Rewriting Systems. A term rewriting system (TRS) R is a pair (Σ, R) consisting of a signature Σ and a set R of term rewrite rules of the form l → r with l ∈ T ∞ (Σ, V)\V and r ∈ T ∞ (Σ, V) such that all variables occurring in r also occur in l. Note that this notion of a TRS deviates slightly from the standard notion of TRSs in the literature on infinitary rewriting [14] in that it allows infinite terms on the left-hand side of rewrite rules! This generalisation will be necessary to accommodate Böhm extensions, which are introduced later in Section 2.2. TRSs having only finite left-hand sides are called left-finite.
As in the finitary setting, every TRS R defines a rewrite relation →_R: s →_R t iff there are a rule ρ : l → r in R, a position π in s, and a substitution σ such that s|_π = lσ and t = s[rσ]_π. Instead of s →_R t, we sometimes write s →_{π,ρ} t in order to indicate the applied rule ρ and the position π, or simply s → t. The subterm s|_π is called a ρ-redex or simply redex, rσ its contractum, and s|_π is said to be contracted to rσ. Let ρ : l → r be a term rewrite rule. The pattern of ρ is the context lσ, where σ is the substitution {x ↦ □ | x ∈ V} that maps every variable to the hole symbol □. If t is a ρ-redex, then the pattern P of ρ is also called the redex pattern of t w.r.t. ρ. When referring to the occurrences in a pattern, occurrences of the hole symbol □ are neglected.

PATRICK BAHR
Let ρ 1 : l 1 → r 1 , ρ 2 : l 2 → r 2 be rules in a TRS R. The rules ρ 1 , ρ 2 are said to overlap if there is a non-variable position π in l 1 such that l 1 | π and l 2 are unifiable and π is not the root position in case ρ 1 , ρ 2 are renamed copies of the same rule. A TRS is called non-overlapping if none of its rules overlap. A term t is called linear if each variable occurs at most once in t. The TRS R is called left-linear if the left-hand side of every rule in R is linear. It is called orthogonal if it is left-linear and non-overlapping.

Metric Infinitary Term Rewriting
In this section we briefly recall the metric model of infinitary term rewriting [16] and some of its properties. We will use the metric model in two ways: Firstly, it will serve as a yardstick against which to compare the partial order model. But most importantly, we will use known results for metric infinitary rewriting and transfer them to the partial order model. In order to accomplish the latter, we shall develop correspondence theorems (Theorem 4.9 and Theorem 4.12) that relate convergence in the metric model and convergence in the partial order model. Specifically, these correspondence results show that the two notions of convergence coincide if we restrict ourselves to total terms. At first we have to make clear what a reduction in our setting of infinitary rewriting is:

Definition 2.1 (reduction). A reduction in a TRS R is a sequence S = (ϕ_ι : t_ι →_R t_{ι+1})_{ι<α} of rewrite steps in R.

This definition of reductions is a straightforward generalisation of finite reductions. As an example consider the TRS with the single rule a → f(a). In this system we get a reduction S : a →* f(f(f(a))) of length 3, which we write concisely as

S : a → f(a) → f(f(a)) → f(f(f(a))).

Clearly, we can extend this reduction arbitrarily often, which results in the following infinite reduction T of length ω:

T : a → f(a) → f(f(a)) → f(f(f(a))) → …

However, this is as far as we can go with this simple definition of reductions. As soon as we go beyond ω, we get reductions which do not make sense. For example, consider the following reduction of length ω + 3:

T · S : a → f(a) → f(f(a)) → … · a → f(a) → f(f(a)) → f(f(f(a)))

The reduction T of length ω can be extended by an arbitrary reduction, e.g. by the reduction S. The notion of reductions according to Definition 2.1 is only meaningful if restricted to reductions of length at most ω. The problem is that the ω-th step in the reduction, viz. the second step of the form a → f(a) in the example above, is completely independent of all previous steps since it does not have an immediate predecessor. This issue occurs at each limit ordinal. An appropriate definition of a reduction of length beyond ω requires a notion of continuity to bridge the gaps that arise at limit ordinals.
In the next section we will present the well-known metric approach to this. Later, in Section 3, we will introduce a novel approach using partial orders.
2.1. Metric Convergence. In this section we consider two notions of convergence based on the metric on terms as defined in Section 1.4. We consider both the weak [9] and the strong [16] variant known from the literature. Related to this notion of convergence is a corresponding notion of continuity. In order to distinguish both from the partial order model that we will introduce in Section 3 we will use the names weak resp. strong m-convergence and weak resp. strong m-continuity.
It is important to understand that a reduction is a sequence of reduction steps rather than just a sequence of terms. This is crucial for a proper definition of strong convergence resp. continuity, which does not only depend on the sequence of terms that are derived within the reduction but also on the positions where the contractions take place:

Definition 2.2 (m-continuity/-convergence). Let R be a TRS and S = (ϕ_ι : t_ι →_{π_ι} t_{ι+1})_{ι<α} a non-empty reduction in R. The reduction S is called
(i) weakly m-continuous in R, written S : t_0 ֒→m R …, if lim_{ι→λ} t_ι = t_λ for each limit ordinal λ < α;
(ii) strongly m-continuous in R, written S : t_0 ։m R …, if it is weakly m-continuous and, for each limit ordinal λ < α, the sequence (|π_ι|)_{ι<λ} of contraction depths tends to infinity;
(iii) weakly m-converging to t in R, written S : t_0 ֒→m R t, if it is weakly m-continuous and t = lim_{ι→α} t_ι;
(iv) strongly m-converging to t in R, written S : t_0 ։m R t, if it is strongly m-continuous, weakly m-converges to t and, in case that S is open, (|π_ι|)_{ι<α} tends to infinity.
Whenever S : t_0 ֒→m R t or S : t_0 ։m R t, we say that t is weakly resp. strongly m-reachable from t_0 in R. By abuse of notation we use ֒→m R and ։m R as binary relations to indicate weak resp. strong m-reachability. In order to indicate the length of S and the TRS R, we write S : t_0 ֒→m α R t resp. S : t_0 ։m α R t. The empty reduction ⟨⟩ is considered weakly/strongly m-continuous and m-converging for any identical start and end term, i.e. ⟨⟩ : t ։m R t for all t ∈ T∞(Σ, V).
From the above definition it is clear that strong m-convergence implies both weak m-convergence and strong m-continuity, and that both weak m-convergence and strong m-continuity imply weak m-continuity. This is indicated in Figure 1. It is important to recognise that m-convergence implies m-continuity. Hence, only meaningful, i.e. m-continuous, reductions can be m-convergent.
For a reduction to be weakly m-continuous, each open proper prefix of the underlying sequence (t_ι)_{ι<α} of terms must converge to the term following next in the sequence, or, equivalently, (t_ι)_{ι<α} must be continuous. For strong m-continuity, additionally, the depth at which contractions take place has to tend to infinity for each of the reduction's open proper prefixes. The convergence properties differ from the continuity properties only in that they require the above conditions to hold for all open prefixes, i.e. including the whole reduction itself unless it is closed. For example, considering the rule a → f(a), the reduction g(a) → g(f(a)) → g(f(f(a))) → … strongly m-converges to the infinite term g(f^ω). The first step takes place at depth 1, the next step at depth 2 and so on. With the rule g(x) → g(f(x)) instead, the reduction g(a) → g(f(a)) → g(f(f(a))) → … is trivially strongly m-continuous but not strongly m-convergent, since every step in this reduction takes place at depth 0, i.e. the sequence of reduction depths does not tend to infinity. However, the reduction still weakly m-converges to g(f^ω).
In contrast to the strong notions of continuity and convergence, the corresponding weak variants are independent of the rules that are applied during the reduction. What makes strong m-convergence (and -continuity) strong is the fact that it employs a conservative overapproximation of the differences between consecutive terms in the reduction. For weak m-convergence, the distance d(t_ι, t_{ι+1}) between consecutive terms in a reduction (t_ι →_{π_ι} t_{ι+1})_{ι<λ} has to tend to 0. For strong m-convergence, the depth |π_ι| of the reduction steps has to tend to infinity; in other words, 2^{−|π_ι|} has to tend to 0. Note that 2^{−|π_ι|} is a conservative overapproximation of d(t_ι, t_{ι+1}), i.e. 2^{−|π_ι|} ≥ d(t_ι, t_{ι+1}). So strong m-convergence is simply weak m-convergence w.r.t. this overapproximation of d [4]. If this approximation is actually precise, i.e. coincides with the actual value, both notions of m-convergence coincide.
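The overapproximation 2^{−|π|} ≥ d(t, t′) can be checked on concrete steps. A small Python sketch with terms as nested tuples (helper names are ours; the second step uses a hypothetical swap rule f(x, y) → f(y, x), for which the approximation is strict):

```python
def sim(s, t):
    """Minimal depth at which the nested-tuple terms s and t differ."""
    if s[0] != t[0]:
        return 0
    return 1 + min((sim(a, b) for a, b in zip(s[1:], t[1:])),
                   default=float('inf'))

def d(s, t):
    """Metric distance d(s, t) = 2^(-sim(s, t)), with 2^(-inf) read as 0."""
    m = sim(s, t)
    return 0.0 if m == float('inf') else 2.0 ** -m

# Step 1: g(a) -> g(f(a)) by the rule a -> f(a) at position <0>, depth 1.
t1, t1_next, depth1 = ('g', ('a',)), ('g', ('f', ('a',))), 1

# Step 2: f(a, a) -> f(a, a) by an illustrative rule f(x, y) -> f(y, x)
# applied at the root, depth 0: the two terms happen to coincide.
t2, t2_next, depth2 = ('f', ('a',), ('a',)), ('f', ('a',), ('a',)), 0

# 2^(-|pi|) conservatively overapproximates d(t, t'):
assert 2.0 ** -depth1 >= d(t1, t1_next)  # 0.5 >= 0.5: precise here
assert 2.0 ** -depth2 > d(t2, t2_next)   # 1.0 >  0.0: strictly coarser
```

When the approximation is precise at every step, as in the first case, weak and strong m-convergence make the same verdict; the second case is exactly where they can diverge.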
Remark 2.3. The notion of m-continuity can be defined solely in terms of m-convergence [4]. More precisely, for each reduction S = (t_ι → t_{ι+1})_{ι<α} we have that S is weakly m-continuous iff every (open) proper prefix S|_β of S weakly m-converges to t_β. Analogously, strong m-continuity can be characterised in terms of strong m-convergence. An easy consequence of this is that m-converging reductions are closed under concatenation, i.e. S : s ֒→m t and T : t ֒→m u imply S · T : s ֒→m u, and likewise for strong m-convergence.
For the most part, our focus in this paper is set on strong m-convergence and its partial order counterpart that we will introduce in Section 3. Weak m-convergence is well-known to be rather unruly [23]. Strong convergence is far more well-behaved [16]. Most prominently, we have the following Compression Lemma [16], which in general does not hold for weak m-convergence:

Theorem 2.4 (Compression Lemma). Let R be a left-linear, left-finite TRS. For every reduction s ։m t in R there is a reduction s ։m ≤ω t, i.e. a strongly m-converging reduction from s to t of length at most ω.

As an easy corollary we obtain that the final term of a strongly m-converging reduction can be approximated arbitrarily accurately by a finite reduction:

Corollary 2.5 (finite approximation). Let R be a left-linear, left-finite TRS and s ։m t. Then, for each depth d ∈ ℕ, there is a finite reduction s →* t′ such that t and t′ coincide up to depth d, i.e. d(t, t′) < 2^{−d}.
Proof. Assume s ։m t. By Theorem 2.4, there is a reduction S : s ։m ≤ω t. If S is of finite length, then we are done. If S : s ։m ω t, then, by strong m-convergence, there is some n < ω such that all reduction steps in S after n take place at a depth greater than d. Consider S|_n : s →* t′. It is clear that t and t′ coincide up to depth d.
As a special case of the above corollary, we obtain that s ։ m t implies s → * t whenever t is a finite term.
An important difference between m-converging reductions and finite reductions concerns the confluence of orthogonal systems. In contrast to finite reachability, m-reachability in orthogonal TRSs, even in its strong variant, does not necessarily have the diamond property, i.e. orthogonal systems are confluent but not infinitarily confluent [16]:

Example 2.6 (failure of infinitary confluence). Consider the orthogonal TRS consisting of the collapsing rules ρ_1 : f(x) → x and ρ_2 : g(x) → x and the infinite term t = g(f(g(f(…)))).
We then obtain the reductions S : t ։m g^ω and T : t ։m f^ω by successively contracting all ρ_1- resp. ρ_2-redexes. However, there is no term s such that g^ω ։m s և m f^ω (or g^ω ֒→m s ←֓m f^ω), as both g^ω and f^ω can only be rewritten to themselves.
In the following section we discuss a method for obtaining an appropriate notion of transfinite reachability based on strong m-reachability which actually has the diamond property.

2.2. Meaningless Terms and Böhm Trees. At the end of the previous section we have seen that orthogonal TRSs are in general not infinitarily confluent. However, as Kennaway et al. [16] have shown, orthogonal TRSs are infinitarily confluent modulo so-called hypercollapsing terms, in the sense that two forking strongly m-converging reductions t ։m t_1, t ։m t_2 can always be extended by two strongly m-converging reductions t_1 ։m t_3, t_2 ։m t′_3 such that the resulting terms t_3, t′_3 only differ in the hypercollapsing subterms they contain. This result was later generalised by Kennaway et al. [17] to develop an axiomatic theory of meaningless terms. Intuitively, a set of meaningless terms in this setting consists of terms that are deemed meaningless since, from a term rewriting perspective, they cannot be distinguished from one another and they do not contribute any information to any computation. Kennaway et al. capture this by a set of axioms that characterise a set of meaningless terms. For orthogonal TRSs, one such set of terms, in fact the least such set, is the set of root-active terms [17]:

Definition 2.7 (root-activeness). Let R be a TRS and t ∈ T∞(Σ, V). Then t is called root-active if for each reduction t →* t′ there is a reduction t′ →* s to a redex s. The set of all root-active terms of R is denoted RA_R, or simply RA if R is clear from the context.

Intuitively speaking, as the name already suggests, root-active terms are terms that can be contracted at the root arbitrarily often, e.g. the terms f^ω and g^ω from Example 2.6.
In this paper we are only interested in this particular set of meaningless terms. So for the sake of brevity we restrict our discussion in this section to the set RA instead of the original more general axiomatic treatment by Kennaway et al. [17].
Since, denotationally, root-active terms cannot be distinguished from each other, it is appropriate to equate them [17]. This can be achieved by introducing a new constant symbol ⊥ and making each root-active term equal to ⊥. By adding rules which enable rewriting root-active terms to ⊥, this can be encoded into an existing TRS [17]:

Definition 2.8 (Böhm extension). Let R = (Σ, R) be a TRS and U ⊆ T∞(Σ, V) a set of terms. The set U_⊥ consists of all partial terms s ∈ T∞(Σ_⊥, V) such that some term in U can be obtained from s by replacing each occurrence of ⊥ in s with some term in U. The Böhm extension of R w.r.t. U is the TRS B_{R,U} = (Σ_⊥, R ∪ B), where B = {t → ⊥ | t ∈ U_⊥ \ {⊥}}.
We write s → U ,⊥ t for a reduction step using a rule in B. If R and U are clear from the context, we simply write B and → ⊥ instead of B R,U and → U ,⊥ , respectively. A reduction that is strongly m-converging in the Böhm extension B is called Böhm-converging. A term t is called Böhm-reachable from s if there is a Böhm-converging reduction from s to t.
The definition of U ⊥ is quite subtle and deserves further attention before we move on. According to the definition, a term t is in U ⊥ iff the term obtained from t by replacing occurrences of ⊥ in t by terms from U is also in U . More illuminating, however, is the converse view, i.e. how to construct a term in U ⊥ from a term in U . First of all, any term in U is also in U ⊥ . Secondly, we may obtain a term in U ⊥ by taking a term t ∈ U and replacing any number of subterms of t that are in U by ⊥. For Böhm extensions, this means that we may contract any term t ∈ U to ⊥, even if we already contracted some proper subterms of t to ⊥ before.
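This converse construction can be sketched operationally. Since root-activeness is undecidable in general, the Python sketch below uses a toy decidable stand-in for U (all terms rooted in g), purely for illustration; `bot_instances` is our name for the set of terms obtainable from t by replacing U-subterms by ⊥:

```python
from itertools import product

# Finite terms as nested tuples: ('f', t1, ..., tk); BOT is ⊥.
BOT = ('⊥',)

def bot_instances(t, in_U):
    """All terms obtainable from t by replacing any number of subterms
    of t that lie in U by ⊥ (the construction described above)."""
    results = set()
    head, args = t[0], t[1:]
    # Combine every choice made independently in each argument...
    for combo in product(*(bot_instances(a, in_U) for a in args)):
        results.add((head, *combo))
    # ...and, if t itself is in U, allow collapsing t to ⊥ entirely.
    if in_U(t):
        results.add(BOT)
    return results

# Toy stand-in for U: all terms rooted in g. (A real U, such as the
# root-active terms, is not decidable, so it cannot be tested this way.)
in_U = lambda t: t[0] == 'g'

t = ('f', ('g', ('a',)), ('g', ('g', ('a',))))
instances = bot_instances(t, in_U)
```

For this t the left argument contributes the choices {g(a), ⊥} and the right one {g(g(a)), g(⊥), ⊥}, giving six instances in total; note that g(⊥) arises from collapsing a proper subterm first, exactly the subtlety discussed above.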
It is at this point where we, in fact, need the generality of allowing infinite terms on the left-hand side of rewrite rules: the additional rules of a Böhm extension allow possibly infinite terms t ∈ U_⊥ \ {⊥} on the left-hand side.

Remark 2.9 (closure under substitution). Note that, for orthogonal TRSs, RA is closed under substitutions and, hence, so is RA_⊥ [17]. Therefore, whenever C[t] →_{RA,⊥} C[⊥], we can assume that t ∈ RA_⊥.
With the additional rules provided by the Böhm extension, we gain infinitary confluence of orthogonal systems:

Theorem 2.10 (infinitary confluence of Böhm-converging reductions, [17]). Let R be an orthogonal, left-finite TRS. Then the Böhm extension B of R w.r.t. RA is infinitarily confluent, i.e. whenever s ։m B t_1 and s ։m B t_2, there is a term t_3 such that t_1 ։m B t_3 and t_2 ։m B t_3.
The lack of confluence for strongly m-converging reductions is resolved in Böhm extensions by allowing (sub-)terms which were previously not joinable to be contracted to ⊥. Returning to Example 2.6, we can see that g^ω and f^ω can both be rewritten to ⊥, as both terms are root-active.
In fact, w.r.t. Böhm-convergence, every term of an orthogonal TRS has a normal form: Theorem 2.11 (infinitary normalisation of Böhm-converging reductions, [17]). Let R be an orthogonal, left-finite TRS. Then the Böhm extension B of R w.r.t. RA is infinitarily normalising, i.e. for each term t there is a B-normal form Böhm-reachable from t.
This means that each term t of an orthogonal, left-finite TRS R has a unique normal form in B R,RA . This normal form is called the Böhm tree of t (w.r.t. RA) [17].
The rest of this paper is concerned with establishing an alternative to the metric notion of convergence based on the partial order on terms that is equivalent to the Böhm extension approach.

Partial Order Infinitary Rewriting
In this section we introduce an alternative model of infinitary term rewriting which uses the partial order on terms to formalise convergence of transfinite reductions. To this end we will turn to partial terms which, like in the setting of Böhm extensions, have an additional constant symbol ⊥. The result will be a more fine-grained notion of convergence in which, intuitively speaking, a reduction can be diverging in some positions but at the same time converging in other positions. The "diverging parts" are then indicated by a ⊥-occurrence in the final term of the reduction: Example 3.1. Consider the TRS consisting of the rules h(x) → h(g(x)), b → g(b) and the term t = f (h(a), b). In this system, we have the reduction which alternately contracts the redex in the left and in the right argument of f .
The reduction S weakly m-converges to the term f(h(g^ω), g^ω). But it does not strongly m-converge, as the depth at which contractions are performed does not tend to infinity. However, this happens only in the left argument of f, not in the other one. Within the partial order model we will still be able to obtain that S weakly converges to f(h(g^ω), g^ω), but we will also obtain that it strongly converges to the term f(⊥, g^ω). That is, we will be able to identify that the reduction S strongly converges except at position 0, the first argument of f.
3.1. Partial Order Convergence. In order to formalise continuity and convergence in terms of the complete semilattice (T∞(Σ_⊥, V), ≤_⊥) instead of the complete metric space (T∞(Σ, V), d), we move from the limit of the metric space to the limit inferior of the complete semilattice:

Definition 3.2 (p-continuity/-convergence). Let R = (Σ, R) be a TRS and S = (ϕ_ι : t_ι →_{π_ι} t_{ι+1})_{ι<α} a non-empty reduction in R_⊥ = (Σ_⊥, R). The reduction S is called
(i) weakly p-continuous in R, written S : t_0 ֒→p R …, if lim inf_{ι→λ} t_ι = t_λ for each limit ordinal λ < α;
(ii) strongly p-continuous in R, written S : t_0 ։p R …, if lim inf_{ι→λ} c_ι = t_λ for each limit ordinal λ < α, where c_ι = t_ι[⊥]_{π_ι}. Each c_ι is called the context of the reduction step ϕ_ι, which we indicate by writing ϕ_ι : t_ι →_{c_ι} t_{ι+1};
(iii) weakly p-converging to t in R, written S : t_0 ֒→p R t, if it is weakly p-continuous and t = lim inf_{ι→α} t_ι;
(iv) strongly p-converging to t in R, written S : t_0 ։p R t, if it is strongly p-continuous and either S is closed with t = t_α, or S is open with t = lim inf_{ι→α} c_ι.
Whenever S : t_0 ֒→p R t or S : t_0 ։p R t, we say that t is weakly resp. strongly p-reachable from t_0 in R. By abuse of notation we use ֒→p R and ։p R as binary relations to indicate weak resp. strong p-reachability. In order to indicate the length of S and the TRS R, we write S : t_0 ֒→p α R t resp. S : t_0 ։p α R t. The empty reduction ⟨⟩ is considered weakly/strongly p-continuous and p-convergent for any start and end term, i.e. ⟨⟩ : t ։p R t for all t ∈ T∞(Σ_⊥, V).
The definitions of weak p-continuity and weak p-convergence are straightforward "translations" from the metric setting to the partial order setting, replacing the limit lim_{ι→α} by the limit inferior lim inf_{ι→α}. On the other hand, the definitions of the strong counterparts seem a bit different compared to the metric model: whereas strong m-convergence simply adds a side condition regarding the depth |π_ι| of the reduction steps, strong p-convergence is defined in a different way compared to the weak variant. Instead of the terms t_ι of the reduction, it considers the contexts c_ι = t_ι[⊥]_{π_ι}. However, one can surmise some similarity due to the fact that the partial order model of strong convergence indirectly takes into account the position π_ι of each reduction step as well. Moreover, for the sake of understanding the intuition of strong p-convergence, it is better to compare the contexts c_ι with the glb t_ι ⊓ t_{ι+1} of two consecutive terms rather than with the term t_ι itself. The following proposition allows precisely that.

Proposition 3.3. Let (A, ≤) be a complete semilattice, λ a limit ordinal, and (a_ι)_{ι<λ} a sequence in A. Then lim inf_{ι→λ} a_ι = lim inf_{ι→λ} (a_ι ⊓ a_{ι+1}).

Proof. Let a = lim inf_{ι→λ} a_ι and b = lim inf_{ι→λ} (a_ι ⊓ a_{ι+1}). Since a_ι ⊓ a_{ι+1} ≤ a_ι for each ι < λ, we have b ≤ a. On the other hand, consider the sets A_α = {a_ι | α ≤ ι < λ} and B_α = {a_ι ⊓ a_{ι+1} | α ≤ ι < λ} for each α < λ. Of course, we then have ⨅A_α ≤ a_ι for all α ≤ ι < λ, and thus also ⨅A_α ≤ a_ι ⊓ a_{ι+1} for all α ≤ ι < λ. Hence, ⨅A_α is a lower bound of B_α, which implies that ⨅A_α ≤ ⨅B_α. Consequently, a ≤ b and, by the antisymmetry of ≤, we can conclude that a = b.
With this in mind we can replace lim inf_{ι→λ} t_ι in the definition of weak p-convergence resp. p-continuity with lim inf_{ι→λ} (t_ι ⊓ t_{ι+1}). From there it is easier to see the intention of moving from t_ι ⊓ t_{ι+1} to the context t_ι[⊥]_{π_ι} in order to model strong convergence: what makes the notion of strong p-convergence (and p-continuity) strong, similarly to strong m-convergence (resp. m-continuity), is the choice of taking the contexts t_ι[⊥]_{π_ι} for defining the limit behaviour of reductions instead of the whole terms t_ι. The context t_ι[⊥]_{π_ι} provides a conservative underapproximation of the shared structure t_ι ⊓ t_{ι+1} of the two consecutive terms in a reduction step φ_ι : t_ι →_{π_ι} t_{ι+1}. More specifically, we have t_ι[⊥]_{π_ι} ≤⊥ t_ι ⊓ t_{ι+1}. That is, as in the metric model of strong convergence, the difference between two consecutive terms is overapproximated by using the position of the reduction step as an indicator. Likewise, strong p-convergence is simply weak p-convergence w.r.t. this underapproximation of t_ι ⊓ t_{ι+1} [4]. If this approximation is precise, i.e. coincides with the actual value, both notions of p-convergence coincide.

Remark 3.4. As for the metric model, also in the partial order model continuity can be defined solely in terms of convergence [4]. More precisely, we have for each reduction S = (t_ι → t_{ι+1})_{ι<α} that S is weakly p-continuous iff every open proper prefix S|_β of S weakly p-converges to t_β. Analogously, strong p-continuity can be characterised in terms of strong p-convergence. An easy consequence of this is that p-converging reductions are closed under concatenation, i.e. S : s ↪p t and T : t ↪p u imply S · T : s ↪p u, and likewise for strong p-convergence.
In order to understand the difference between weak and strong p-convergence, let us look at a simple example:

Example 3.5. Consider the TRS with the single rule f(x, y) → f(y, x). This rule induces the following reduction:
S : f(a, f(g(a), g(b))) → f(a, f(g(b), g(a))) → f(a, f(g(a), g(b))) → …
S simply alternates between the terms f(a, f(g(a), g(b))) and f(a, f(g(b), g(a))) by swapping the arguments of the inner occurrence of f. The reduction is depicted in Figure 2. The picture illustrates the parts of the terms that remain unchanged and those that remain completely untouched by the corresponding reduction step, using a lighter resp. a darker shade of grey. The unchanged part corresponds to the glb of the two terms of a reduction step, viz. f(a, f(g(⊥), g(⊥))) for the first step. By symmetry, the glb of the terms of the second step is the same. It is depicted in Figure 3a. Let (t_i)_{i<ω} be the sequence of terms of the reduction S. By definition, S weakly p-converges to lim inf_{i→ω} t_i. According to Proposition 3.3, this is equal to lim inf_{i→ω} (t_i ⊓ t_{i+1}). Since t_i ⊓ t_{i+1} = f(a, f(g(⊥), g(⊥))) for each i < ω, the reduction weakly p-converges to f(a, f(g(⊥), g(⊥))).
Similarly, the part of the term that remains untouched by a reduction step corresponds to the context. For the first step, this is f(a, ⊥). It is depicted in Figure 3b. By definition, S strongly p-converges to lim inf_{i→ω} c_i, where (c_i)_{i<ω} is the sequence of contexts of S. As one can see in Figure 2, the context is constantly f(a, ⊥). Hence, S strongly p-converges to f(a, ⊥). The example reduction is a particularly simple one, as both the glbs t_i ⊓ t_{i+1} and the contexts c_i remain stable. In general, this need not be the case, of course.
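The two limits of Example 3.5 can also be computed mechanically. The following sketch is illustrative only and not part of the paper's formalism: it represents terms as nested Python tuples, uses the string '⊥' for the hole symbol, and implements the glb s ⊓ t and the context t[⊥]_π directly from their definitions.

```python
# Partial terms as nested tuples: ('f', child1, child2); '⊥' marks holes.
BOT = ('⊥',)

def glb(s, t):
    """s ⊓ t: keep the structure shared by s and t, fill the rest with ⊥."""
    if s[0] != t[0] or len(s) != len(t):
        return BOT
    return (s[0],) + tuple(glb(a, b) for a, b in zip(s[1:], t[1:]))

def context(t, pi):
    """t[⊥]_π: replace the subterm of t at position π by ⊥."""
    if not pi:
        return BOT
    i = pi[0]
    return t[:i + 1] + (context(t[i + 1], pi[1:]),) + t[i + 2:]

# Example 3.5: the reduction alternates between t0 and t1, each step at ⟨1⟩.
t0 = ('f', ('a',), ('f', ('g', ('a',)), ('g', ('b',))))
t1 = ('f', ('a',), ('f', ('g', ('b',)), ('g', ('a',))))

weak_limit = glb(t0, t1)          # t_i ⊓ t_{i+1} is the same for every i,
                                  # so it is also the limit inferior
strong_limit = context(t0, (1,))  # the contexts are constantly f(a, ⊥)
```

Since both the glbs and the contexts are stable here, the weak limit is the single glb f(a, f(g(⊥), g(⊥))) and the strong limit the single context f(a, ⊥).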
One can clearly see from the definition that, as for their metric counterparts, weak resp. strong p-convergence implies weak resp. strong p-continuity. In contrast to the metric model, however, the converse implication holds as well: since the partial order ≤⊥ on partial terms forms a complete semilattice, the limit inferior is defined for every non-empty sequence of partial terms. Hence, any weakly resp. strongly p-continuous reduction is also weakly resp. strongly p-convergent. This is a major difference from m-convergence/-continuity. Nevertheless, p-convergence constitutes a meaningful notion of convergence: the final term of a p-convergent reduction contains a ⊥ subterm at each position at which the reduction is "locally diverging", as we have seen in Example 3.1 and Example 3.5. In fact, as we will show in Section 4, whenever there are no '⊥'s involved, i.e. if there is no "local divergence", m-convergence and p-convergence coincide -- both in the weak and the strong variant.
Recall that strong m-continuity resp. m-convergence implies weak m-continuity resp. m-convergence. This is not the case in the partial order setting. The reason for this is that strong p-convergence resp. p-continuity is defined differently compared to its weak variant. It uses the contexts instead of the terms in the reduction, whereas in the metric setting the strong notion of convergence is a mere restriction of the weak counterpart as we have observed earlier.
Example 3.6. Consider the TRS consisting of the rules Then the reduction is clearly both strongly p-continuous and strongly p-convergent. On the other hand, it is neither weakly p-continuous nor weakly p-convergent, for the simple reason that S does not weakly p-converge to f(⊥) but to f(h(g^ω)).
Nevertheless, by observing that lim inf_{ι→α} c_ι ≤⊥ lim inf_{ι→α} t_ι, which holds since c_ι ≤⊥ t_ι for each ι < α, we obtain the following weaker relation between weak and strong p-convergence:

Proposition 3.7 (strong p-convergence implies weak p-reachability). Let R be a left-linear TRS and S : s ↠p_R t. Then there is a reduction T : s ↪p_R t′ with t ≤⊥ t′.

Proof. Let S = (φ_ι : t_ι →_{ρ_ι} t_{ι+1})_{ι<α} be a reduction strongly p-converging to t_α. By induction on β we construct for each prefix S|_β a reduction T_β : t_0 ↪p t′_β with t_β ≤⊥ t′_β. The proposition then follows from the case where β = α.
The case β = 0 is trivial. If β = γ + 1, then by the induction hypothesis we have a reduction T_γ : t_0 ↪p t′_γ with t_γ ≤⊥ t′_γ, which can be extended to a reduction T_β that satisfies the desired conditions. If β is a limit ordinal, we can apply the induction hypothesis to obtain for each γ < β a reduction T_γ as above, from which T_β is constructed. And indeed, returning to Example 3.6, we can see that there is a reduction that, starting from f(h(a)), weakly p-converges to g(h(g^ω)), which is strictly larger than g(⊥).
A simple example shows that left-linearity is crucial for the above proposition:

Example 3.8. Let R be a TRS consisting of the rules f(x, x) → c, a → a, and b → b. Alternately contracting the redexes a and b yields a reduction f(a, b) → f(a, b) → ⋯ of length ω that strongly p-converges to f(⊥, ⊥), which the final step f(⊥, ⊥) → c extends to a strongly p-converging reduction from f(a, b) to c. Yet, there is no reduction in R that, starting from f(a, b), weakly p-converges to c.

3.2. Strong p-Convergence. In this paper we mainly focus on the strong notion of convergence. To this end, the rest of this section is concerned exclusively with strong p-convergence. We will, however, revisit weak p-convergence in Section 4, where we compare it to weak m-convergence.
Note that in the partial order model we have to consider reductions over the extended signature Σ⊥, i.e. reductions containing partial terms. Thus, from now on, we assume reductions in a TRS over Σ to be implicitly over Σ⊥. When we want to make it explicit that a reduction S contains only total terms, we say that S is total. When we say that a strongly p-convergent reduction S : s ↠p t is total, we mean that both the reduction S and the final term t are total. In order to understand the behaviour of strong p-convergence, we need to look at what the lub and the glb of a set of terms look like. The following two lemmas provide some insight.

Lemma 3.9 (lub of terms). For each T ⊆ T∞(Σ⊥, V) and t = ⨆T, the following holds:
(i) P(t) = ⋃_{s∈T} P(s);
(ii) t(π) = f iff there is some s ∈ T with s(π) = f, for each f ∈ Σ ∪ V and each position π.
Proof. Clause (i) follows straightforwardly from clause (ii). The "if" direction of (ii) follows from the fact that if s ∈ T, then s ≤⊥ t and, therefore, s(π) = f implies t(π) = f. For the "only if" direction assume that no s ∈ T satisfies s(π) = f. Since s ≤⊥ t for each s ∈ T, we have π ∈ P⊥(s) for each s ∈ T. But then t′ = t[⊥]_π is an upper bound of T with t′ <⊥ t. This contradicts the assumption that t is the least upper bound of T.
Lemma 3.10 (glb of terms). Let T ⊆ T∞(Σ⊥, V) and P a set of positions closed under prefixes such that all terms in T coincide in all positions in P, i.e. s(π) = t(π) for all π ∈ P and s, t ∈ T. Then the glb ⨅T also coincides with all terms in T in all positions in P.
Proof. Construct a term s such that it coincides with all terms in T in all positions in P and has ⊥ at all other positions. That is, given an arbitrary term t ∈ T, we define s as the unique term with s(π) = t(π) for all π ∈ P, and s(π · ⟨i⟩) = ⊥ for all π ∈ P with π · ⟨i⟩ ∈ P(t) \ P. Then s is a lower bound of T. By construction, s coincides with all terms in T in all positions in P. Since s ≤⊥ ⨅T, this property carries over to ⨅T.
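The coincidence property of the glb can be checked on finite partial terms. The sketch below is an illustrative toy, not the paper's formalism: terms are nested tuples with the string '⊥' for holes, the glb of a finite set is folded with functools.reduce, and symbol_at plays the role of t(π).

```python
from functools import reduce

BOT = ('⊥',)

def glb(s, t):
    """Binary glb s ⊓ t on partial terms."""
    if s[0] != t[0] or len(s) != len(t):
        return BOT
    return (s[0],) + tuple(glb(a, b) for a, b in zip(s[1:], t[1:]))

def glb_set(terms):
    """Glb of a non-empty finite set of partial terms."""
    return reduce(glb, terms)

def symbol_at(t, pi):
    """t(π): the symbol at position π, or None if π is not a position of t."""
    for i in pi:
        if i + 1 >= len(t):
            return None
        t = t[i + 1]
    return t[0]

# The terms in T agree on the prefix-closed set P = {⟨⟩, ⟨0⟩} and
# disagree below ⟨1⟩, so the glb keeps P and puts ⊥ at ⟨1⟩.
T = [('f', ('a',), ('b',)), ('f', ('a',), ('c',))]
P = [(), (0,)]
meet = glb_set(T)
```

As the lemma predicts, meet carries the same symbols as every term of T on all of P, and ⊥ exactly where the terms disagree.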
Following the two lemmas above, we can observe that -intuitively speaking -the limit inferior lim inf ι→α t ι of a sequence of terms is the term that contains those parts that become eventually stable in the sequence. Remaining holes in the term structure are filled with '⊥'s.
The above lemma is central for dealing with strongly p-convergent reductions. It also reveals how the final term of a strongly p-convergent reduction is constructed. According to the equality of (a) and (c), the final term has the non-⊥ symbol f at some position π iff some term t_α in the reduction also had this symbol f at this position π and no reduction step after that term occurred at π or above. In this way, the final outcome of a strongly p-convergent reduction consists of precisely those parts of the intermediate terms that become eventually persistent during the reduction, i.e. are from some point on not subjected to contraction any more.
Now we turn to a characterisation of the parts that are not included in the final outcome of a strongly p-convergent reduction, i.e. those that do not become persistent. These parts are either omitted or filled by the placeholder ⊥. We call the corresponding positions volatile:

Definition 3.12 (volatility). Let R be a TRS and S = (t_ι →_{π_ι} t_{ι+1})_{ι<λ} an open p-converging reduction in R. A position π is said to be volatile in S if, for each ordinal β < λ, there is some β ≤ γ < λ such that π_γ = π. If π is volatile in S and no proper prefix of π is volatile in S, then π is called outermost-volatile.
In Example 3.1 the position 0 is outermost-volatile in the reduction S.
Example 3.13. R admits the following reduction S of length ω: The reduction S p-converges to f(s(0), ⊥), i.e. we have S : f(s(0), s(h(0))) ↠p^ω_R f(s(0), ⊥). Figure 4 illustrates the reduction, indicating the position of each reduction step by two circles and a reduction arrow in between. One can clearly see that both π_1 = ⟨1⟩ and π_2 = ⟨1, 0⟩ are volatile in S: again and again, reduction steps take place at π_1 and π_2. Since these are the only volatile positions and π_1 is a proper prefix of π_2, the position π_1 is outermost-volatile in S.
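For a reduction whose sequence of redex positions is eventually periodic, as in this example, the volatile positions can be read off the repeating cycle: they are exactly the positions that recur in it. The sketch below makes that simplifying periodicity assumption for illustration; Definition 3.12 itself covers arbitrary open reductions.

```python
def volatile(cycle):
    """Positions contracted cofinally often, given the repeating tail of
    the sequence of redex positions (positions as tuples of integers)."""
    return set(cycle)

def outermost_volatile(cycle):
    """Volatile positions none of whose proper prefixes are volatile."""
    vol = volatile(cycle)
    return {p for p in vol
            if not any(q != p and p[:len(q)] == q for q in vol)}

# Example 3.13: steps occur at π1 = ⟨1⟩ and π2 = ⟨1, 0⟩ again and again.
cycle = [(1,), (1, 0)]
```

Here both ⟨1⟩ and ⟨1, 0⟩ are volatile, but only ⟨1⟩ is outermost-volatile, since ⟨1⟩ is a proper prefix of ⟨1, 0⟩.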
As we shall see later in Section 6, volatility is closely related to root-active terms: if a reduction has a volatile position π, then we find a term in the reduction with a root-active subterm at π. Conversely, from each root-active term starts a reduction with a volatile position (cf. Proposition 6.9). This connection between volatility and root-activeness is the cornerstone of the correspondence between p-convergence and Böhm-convergence that we prove in Section 6.
(ii) At first consider the "only if" direction. To this end, suppose that t_α(π) = ⊥. In order to show that then (a) or (b) holds, we prove that (b) must hold whenever (a) does not. For this purpose, we assume that π is not outermost-volatile in S. Note that no proper prefix π′ of π can be volatile in S, as this would imply, according to clause (i), that π′ ∈ P⊥(t_α) and, therefore, π ∉ P(t_α), contradicting t_α(π) = ⊥. Hence, π is not volatile in S either, for otherwise it would be outermost-volatile. In sum, no prefix of π is volatile in S. Consequently, there is an upper bound β < α on the indices of reduction steps taking place at π or above. But then t_β(π) = ⊥, since otherwise Lemma 3.11 would imply that t_α(π) ≠ ⊥. This shows that (b) holds.
Clause (ii) shows that a ⊥ subterm in the final term can only have its origin in a preceding term which already contains this ⊥ and which then becomes stable, or in an outermost-volatile position. That is, it is exactly the outermost-volatile positions that generate '⊥'s.
We can apply this lemma to Example 3.13: as we have seen, the position π_1 = ⟨1⟩ is outermost-volatile in the reduction S mentioned in the example. Hence, S strongly p-converges to a term that has, according to Lemma 3.14, the symbol ⊥ at position π_1. That is, S strongly p-converges to f(s(0), ⊥).
This characterisation of the final outcome of a p-converging reduction clearly shows that the partial order model captures the intuition of strong convergence in transfinite reductions even though it allows that every continuous reduction is also convergent: The final outcome only represents the parts of the reduction that are converging. Locally diverging parts are cut off and replaced by ⊥.
In fact, the absence of such local divergence, or volatility, as we call it here, is equivalent to the absence of ⊥:

Lemma 3.15 (total reductions). Let R be a TRS, s a total term in R, and S : s ↠p_R t. Then S : s ↠p_R t is total iff no prefix of S has a volatile position.
Proof. The "only if" direction follows straightforwardly from Lemma 3.14.
We prove the "if" direction by induction on the length of S. If |S| = 0, then the totality of S follows from the assumption of s being total. If |S| is a successor ordinal, then the totality of S follows from the induction hypothesis since single reduction steps preserve totality. If |S| is a limit ordinal, then the totality of S follows from the induction hypothesis using Lemma 3.14.
Moreover, as we shall show in the next section, if local divergences are excluded, i.e. if total reductions are considered, both the metric model and the partial order model coincide.

4. Comparing m-Convergence and p-Convergence
In this section we want to compare the metric and the partial order model of convergence. In particular, we shall show that the partial order model is a conservative extension of the metric model: if we only consider total reductions, i.e. reductions over terms in T∞(Σ, V), then m-convergence and p-convergence coincide, both in their weak and strong variants.
The first and rather trivial observation to this effect is that already on the level of single reduction steps the partial order model conservatively extends the metric model. The next step is to establish that the underlying structures that are used to formalise convergence exhibit this behaviour as well: the limit inferior in the complete semilattice coincides with the limit in the complete metric space whenever the latter is defined or the former is a total term. Note that, as a corollary, this property implies that lim_{ι→α} t_ι is defined iff lim inf_{ι→α} t_ι is a total term. In Section 4.1 we shall establish this property. This result is then used in Section 4.2 in order to show the desired property that p-convergence is a conservative extension of m-convergence, in both their respective weak and strong variants.
4.1. Complete Semilattice vs. Complete Metric Space. In order to compare the complete semilattice of partial terms with the complete metric space of terms, it is convenient to have an alternative characterisation of the similarity sim(s, t) of two terms s, t, which in turn provides an alternative characterisation of the metric d on terms. To this end we use the truncation of a term at a certain depth. This notion was originally used by Arnold and Nivat [2] to show that d is a complete ultrametric on terms. More concisely, we can say that the truncation t|d of a term t at depth d replaces all subterms at depth d with ⊥. From this we can easily establish the following two properties of the truncation:

Proposition 4.3 (truncation). For each t ∈ T∞(Σ⊥, V) we have (i) t|d ≤⊥ t for all d ∈ N, and (ii) t = ⨆_{d∈N} t|d.

Proof. Straightforward.
Recall that the similarity of two terms is the minimal depth at which they differ, resp. ∞ if they are equal. However, saying that two terms differ at minimal depth d is the same as saying that their truncations at depth d coincide while their truncations at any greater depth do not. This provides an alternative characterisation of similarity. We can use this characterisation to show the first part of the compatibility of the metric and the partial order:

Lemma 4.5 (limit implies limit inferior). Let (t_ι)_{ι<α} be a converging sequence in the metric space (T∞(Σ, V), d). Then lim_{ι→α} t_ι = lim inf_{ι→α} t_ι.

Proof. If α is a successor ordinal, this is trivial. Let α be a limit ordinal, t = lim_{ι→α} t_ι, and t̂ = lim inf_{ι→α} t_ι. Then for each d ∈ N there is some β < α such that t_ι|d = t|d for all β ≤ ι < α, i.e. t|d ≤⊥ ⨅T_β for T_β = {t_ι | β ≤ ι < α}. Since t̂ = ⨆_{β<α} ⨅T_β, we also have that ⨅T_β ≤⊥ t̂. By transitivity, we obtain t|d ≤⊥ t̂ for each d ∈ N. Since t is total, we can thus conclude, according to Proposition 4.3, that t = t̂.
Before we continue, we want to introduce yet another characterisation of similarity, which bridges the gap to the partial order ≤⊥. To follow this approach, we need to define the ⊥-depth of a term t ∈ T∞(Σ⊥, V); it is the minimal depth of an occurrence of the subterm ⊥ in t:

⊥-depth(t) = min({|π| | t(π) = ⊥} ∪ {∞})

Intuitively, the glb s ⊓ t of two terms s, t represents the common structure that both terms share. The similarity sim(s, t) is a much more condensed measure: it only provides the depth up to which the terms share a common structure. Using the ⊥-depth we can directly condense the glb s ⊓ t to the similarity sim(s, t):

Lemma 4.6 (similarity via glb). For all s, t ∈ T∞(Σ, V) we have sim(s, t) = ⊥-depth(s ⊓ t).

Proof. Follows from Lemma 3.10.
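On finite terms, the characterisations of similarity can be checked against each other. The sketch below again uses the illustrative nested-tuple representation with '⊥' for holes (an assumption made here, not the paper's notation); it implements truncation, the similarity via truncations, the ⊥-depth, and the metric d(s, t) = 2^{-sim(s, t)}, and checks that sim(s, t) equals the ⊥-depth of s ⊓ t on an example pair.

```python
import math

BOT = ('⊥',)

def truncate(t, d):
    """t|d: replace every subterm at depth d by ⊥."""
    if d == 0:
        return BOT
    return (t[0],) + tuple(truncate(c, d - 1) for c in t[1:])

def glb(s, t):
    """s ⊓ t: shared structure of s and t, ⊥ elsewhere."""
    if s[0] != t[0] or len(s) != len(t):
        return BOT
    return (s[0],) + tuple(glb(a, b) for a, b in zip(s[1:], t[1:]))

def bot_depth(t, d=0):
    """Minimal depth of a ⊥ occurrence in t (∞ if there is none)."""
    if t == BOT:
        return d
    return min((bot_depth(c, d + 1) for c in t[1:]), default=math.inf)

def sim(s, t):
    """Similarity: the greatest d such that s|d = t|d (∞ if s = t)."""
    if s == t:
        return math.inf
    d = 0
    while truncate(s, d + 1) == truncate(t, d + 1):
        d += 1
    return d

def metric(s, t):
    """The Arnold–Nivat style metric d(s, t) = 2^(-sim(s, t))."""
    return 0 if s == t else 2 ** -sim(s, t)

# s and t differ exactly at depth 2 (b vs. c under g).
s = ('f', ('a',), ('g', ('b',)))
t = ('f', ('a',), ('g', ('c',)))
```

Here s ⊓ t = f(a, g(⊥)), whose ⊥-depth 2 agrees with sim(s, t), and d(s, t) = 2^{-2} = 0.25.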
We can employ this alternative characterisation of similarity to show the second part of the compatibility of the metric and the partial order: Lemma 4.7 (total limit inferior implies Cauchy). Let (t ι ) ι<α be a sequence in T ∞ (Σ, V) such that lim inf ι→α t ι is total. Then (t ι ) ι<α is Cauchy.
Since there are only finitely many positions in t of length at most d, there is some π * ∈ P(t) such that Since s β ≤ ⊥ s γ , whenever β ≤ γ, we can rewrite (3) as follows: Since π * ∈ P(t), we can employ Lemma 3.9 to obtain from (3') that t(π * ) = ⊥. This contradicts the assumption that t = lim inf ι→α t ι is total.
The following proposition combines Lemma 4.5 and Lemma 4.7 in order to obtain the desired property that the metric and the partial order are compatible:

Proposition 4.8 (partial order conservatively extends metric). For every sequence (t_ι)_{ι<α} in T∞(Σ, V) the following holds: if lim_{ι→α} t_ι is defined or lim inf_{ι→α} t_ι is total, then lim_{ι→α} t_ι = lim inf_{ι→α} t_ι.

Proof. If lim_{ι→α} t_ι is defined, the equality follows from Lemma 4.5. If lim inf_{ι→α} t_ι is total, the sequence (t_ι)_{ι<α} is Cauchy by Lemma 4.7. Then, as the metric space (T∞(Σ, V), d) is complete, (t_ι)_{ι<α} converges and we can apply Lemma 4.5 to conclude the equality.
In the previous section we have established that the metric and the partial order on (partial) terms are compatible in the sense that the corresponding notions of limit and limit inferior coincide whenever the limit is defined or the limit inferior is a total term. As weak m-convergence and weak p-convergence are solely based on the limit in the metric space resp. the limit inferior in the partially ordered set, we can directly apply this result to show that both notions of convergence coincide on total reductions:

Theorem 4.9 (total weak p-convergence = weak m-convergence). For every total reduction S in a TRS the following equivalences hold: (i) S is weakly m-continuous iff it is weakly p-continuous; (ii) S : s ↪m t iff S : s ↪p t and t is total.

Proof. Both equivalences follow directly from Proposition 4.8 and Fact 4.1, both of which are applicable as we presuppose that each term in the reduction is total.
In order to replicate Theorem 4.9 for the strong notions of convergence, we first need the following two lemmas, which link the property of increasing contraction depth to volatile positions and to the limit inferior, respectively:

Lemma 4.10 (strong m-convergence). Let S = (t_ι →_{π_ι} t_{ι+1})_{ι<λ} be an open reduction. Then (|π_ι|)_{ι<λ} tends to infinity iff, for each position π, there is an ordinal α < λ such that π_ι ≠ π for all α ≤ ι < λ.
Proof. The "only if" direction is trivial. For the converse direction, suppose that |π_ι| does not tend to infinity as ι approaches λ. That is, there is some depth d ∈ N such that there is no upper bound on the indices of reduction steps taking place at depth d. Let d* be the minimal such depth. Then there is some α < λ such that all reduction steps in S|_[α,λ) are at depth at least d*, i.e. |π_ι| ≥ d* holds for all α ≤ ι < λ. Of course, also in S|_[α,λ) the indices of steps at depth d* are not bounded from above. As all reduction steps in S|_[α,λ) take place at depth d* or below, t_ι|d* = t_ι′|d* holds for all α ≤ ι, ι′ < λ. In particular, all terms in S|_[α,λ) have the same set of positions of length d*. Let P* = {π ∈ P(t_α) | |π| = d*} be this set. Since there is no upper bound on the indices of steps in S|_[α,λ) taking place at a position in P*, yet P* is finite, there has to be some position π* ∈ P* for which there is also no such upper bound. This contradicts the assumption that for each position there is such an upper bound.

Lemma 4.11 (limit inferior of truncations). Let (t_ι)_{ι<λ} be a sequence in T∞(Σ⊥, V) and (d_ι)_{ι<λ} a sequence in N such that λ is a limit ordinal and (d_ι)_{ι<λ} tends to infinity. Then lim inf_{ι→λ} t_ι = lim inf_{ι→λ} t_ι|d_ι.
Proof. Let t̂ = lim inf_{ι→λ} t_ι|d_ι and t = lim inf_{ι→λ} t_ι. Since, according to Proposition 4.3, t_ι|d_ι ≤⊥ t_ι for each ι < λ, we have that t̂ ≤⊥ t. Thus, it remains to be shown that also t ≤⊥ t̂ holds, i.e. that t(π) = t̂(π) holds for all non-⊥ positions π of t.
We can now prove the counterpart of Theorem 4.9 for strong convergence:

Theorem 4.12 (total strong p-convergence = strong m-convergence). For every total reduction S in a TRS the following equivalences hold: (i) S is strongly m-continuous iff it is strongly p-continuous; (ii) S : s ↠m t iff S : s ↠p t and t is total.

Proof. It suffices to prove (ii), since (i) follows from (ii) according to Remark 3.4 resp. Remark 2.3. Let S = (φ_ι : t_ι →_{π_ι,c_ι} t_{ι+1})_{ι<α} be a reduction in a TRS R⊥. We proceed by induction on α. The case α = 0 is trivial. If α is a successor ordinal β + 1, the statement follows from the induction hypothesis applied to S|_β. Let α be a limit ordinal. At first consider the "only if" direction, i.e. we assume that S : t_0 ↠p t_α is total. According to Remark 3.4, we have S|_β : t_0 ↠p t_β for each β < α. Applying the induction hypothesis yields S|_β : t_0 ↠m t_β for each β < α. That is, following Remark 2.3, we have S : t_0 ↠m …. Since c_ι ≤⊥ t_ι for all ι < α, we have t_α = lim inf_{ι→α} c_ι ≤⊥ lim inf_{ι→α} t_ι. Because t_α is total and, therefore, maximal w.r.t. ≤⊥, we can conclude that t_α = lim inf_{ι→α} t_ι. According to Proposition 4.8, this also means that t_α = lim_{ι→α} t_ι. For strong m-convergence it remains to be shown that (|π_ι|)_{ι<α} tends to infinity. So let us assume that this is not the case. By Lemma 4.10, this means that there is a position π such that, for each β < α, there is some β ≤ γ < α such that the step φ_γ takes place at position π, i.e. π is volatile in S. By Lemma 3.14, this contradicts the fact that t_α is a total term. Now consider the converse direction and assume that S : t_0 ↠m t_α. Following Remark 2.3 we obtain S|_β : t_0 ↠m t_β for all β < α, to which we can apply the induction hypothesis in order to get S|_β : t_0 ↠p t_β for all β < α, so that we have S : t_0 ↠p …, according to Remark 3.4. It remains to be shown that t_α = lim inf_{ι→α} c_ι. Since S strongly m-converges to t_α, we have that (a) t_α = lim_{ι→α} t_ι, and that (b) the sequence of depths (d_ι = |π_ι|)_{ι<α} tends to infinity. Using Proposition 4.8, we can deduce from (a) that t_α = lim inf_{ι→α} t_ι.
Due to (b), we can apply Lemma 4.11 to obtain lim inf_{ι→α} t_ι = lim inf_{ι→α} t_ι|d_ι and lim inf_{ι→α} c_ι = lim inf_{ι→α} c_ι|d_ι.
Since t_ι|d_ι = c_ι|d_ι for all ι < α, we can conclude that lim inf_{ι→α} c_ι = lim inf_{ι→α} t_ι = t_α, i.e. S : t_0 ↠p t_α.

The main result of this section is that we do not lose anything when switching from the metric model to the partial order model of infinitary term rewriting. Restricted to the domain of the metric model, i.e. total terms, both models coincide in the strongest possible sense, as Theorem 4.9 and Theorem 4.12 confirm.
At the same time, however, the partial order model provides more structure. Whenever the metric model can only conclude divergence, the partial order model can qualify the degree of divergence: if a reduction p-converges to ⊥, it can be considered completely divergent; if it p-converges to a term that contains '⊥'s only as proper subterms, it is only partially divergent, with the diverging parts of the reduction indicated by the '⊥'s; and the complete absence of '⊥'s indicates complete convergence.
In the rest of this paper we will put our focus on strong convergence. Theorem 4.12 will be one of the central tools in Section 6, where we shall discover that Böhm-reachability coincides with strong p-reachability in orthogonal systems. The other crucial tool that we will leverage is the existence and uniqueness of complete developments. This is the subject of the subsequent section.

5. Strongly p-Converging Complete Developments
The purpose of this section is to establish a theory of residuals and complete developments in the setting of strongly p-convergent reductions. Intuitively speaking, the residuals of a set of redexes are the remains of this set of redexes after a reduction, and a complete development of a set of redexes is a reduction which only contracts residuals of these redexes and ends in a term with no residuals.
Complete developments are a well-known tool for proving (finitary) confluence of orthogonal systems [24]. It has also been lifted to the setting of strongly m-convergent reductions in order to establish (restricted forms of) infinitary confluence of orthogonal systems [16]. As we have seen in Example 2.6, m-convergence in general does not have this property.
After introducing residuals and complete developments in Section 5.1, we will show in Section 5.2 resp. Section 5.3 that complete developments always exist and that their final terms are uniquely determined. We then use this in Section 5.4 to establish the Infinitary Strip Lemma for strongly p-converging reductions, which is a crucial tool for proving our main result in Section 6.

5.1. Residuals. At first we need to formalise the notion of residuals. It is virtually equivalent to the definition for strongly m-convergent reductions by Kennaway et al. [16]:

Definition 5.1 (descendants, residuals). Let R be a TRS, S : t_0 ↠p^α_R t_α, and U a set of non-⊥ occurrences in t_0. The descendants of U by S, denoted U//S, is the set of positions in t_α inductively defined as follows:
(a) If α = 0, then U//S = U.
(b) If α = 1, i.e. S : t_0 →_{π,ρ} t_1 for some ρ : l → r, take any u ∈ U and define the set R_u as follows: If π ≰ u, then R_u = {u}. If u is in the pattern of the ρ-redex, i.e. u = π · π′ with π′ ∈ P_Σ(l), then R_u = ∅. Otherwise, i.e. if u = π · w · x with l|_w ∈ V, then R_u = {π · w′ · x | r|_{w′} = l|_w}. Define U//S = ⋃_{u∈U} R_u.
(c) If α = β + 1 > 1, then U//S = (U//S|_β)//S|_[β,α).
(d) If α is a limit ordinal, then u ∈ U//S iff u is a non-⊥ occurrence in t_α and ∃β < α ∀β ≤ ι < α : u ∈ U//S|_ι.
If, in particular, U is a set of redex occurrences, then U//S is also called the set of residuals of U by S. Moreover, by abuse of notation, we write u//S instead of {u}//S.

Clauses (a), (b) and (c) are as in the finitary setting. Clause (d) lifts the definition to the infinitary setting. The only difference to the definition of Kennaway et al. is that we consider partial terms here. Yet, for technical reasons, the notion of descendants has to be restricted to non-⊥ occurrences. Since ⊥ cannot be a redex, this is not a restriction for residuals, though.
Remark 5.2. One can easily see that the descendants of a set of non-⊥-occurrences are again a set of non-⊥-occurrences. The restriction to non-⊥-occurrences has to be made explicit for the case of open reductions. In fact, without this explicit restriction the definition would yield descendants which might not even be occurrences in the final term t_α of the reduction.
For example, consider the system with the single rule f(x) → x and the strongly p-convergent reduction S : f^ω → f^ω → ⋯ ↠p ⊥, in which each reduction step contracts the redex at the root of f^ω. Consider the set U = {⟨⟩, ⟨0⟩, ⟨0, 0⟩, ⟨0, 0, 0⟩, …} of all positions in f^ω. Without the abovementioned restriction, the descendants of U by S would be U itself, as the descendants of U by each proper prefix of S are also U. However, none of the positions ⟨0⟩, ⟨0, 0⟩, ⟨0, 0, 0⟩, … ∈ U is even a position in the final term ⊥. The position ⟨⟩ ∈ U does occur in ⊥, but only as a ⊥-occurrence. With the restriction to non-⊥-occurrences we indeed get the expected result U//S = ∅.
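The single-step clause (b) of Definition 5.1 can be sketched directly. In the toy code below (an illustrative reading of the definition, not the paper's formal apparatus), terms are nested tuples, lowercase strings stand for variables, and the rule f(x, y) → f(y, x) of Example 3.5 serves as the test case.

```python
def var_positions(t, v):
    """All positions of the variable v in the term t."""
    if t == v:
        return [()]
    if isinstance(t, str):      # a different variable
        return []
    return [(i,) + p for i, c in enumerate(t[1:])
            for p in var_positions(c, v)]

def step_descendants(u, pi, l, r):
    """Descendants R_u of position u across one step at π with rule l → r."""
    if u[:len(pi)] != pi:       # π ≰ u: the step does not touch u
        return {u}
    rest, node = u[len(pi):], l
    for i, step in enumerate(rest):
        if isinstance(node, str):   # u = π·w·x passes through variable `node`
            w, x = rest[:i], rest[i:]
            return {pi + wp + x for wp in var_positions(r, node)}
        node = node[1 + step]
    if isinstance(node, str):       # u ends exactly at a variable position
        return {pi + wp for wp in var_positions(r, node)}
    return set()                    # u lies in the pattern of the redex

# Rule f(x, y) → f(y, x) from Example 3.5, applied at π = ⟨1⟩.
l, r, pi = ('f', 'x', 'y'), ('f', 'y', 'x'), (1,)
```

For a step at ⟨1⟩, the subterm at ⟨1, 0⟩ (bound to x) moves to ⟨1, 1⟩, the pattern position ⟨1⟩ itself has no descendants, and positions disjoint from the redex are untouched.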
The definition of descendants for open reductions is quite subtle, which makes it fairly cumbersome to use in proofs. The lemma below establishes an alternative characterisation that will turn out to be useful in later proofs:
The following lemma confirms the expected monotonicity of descendants:

Lemma 5.4 (monotonicity of descendants). Let R be a TRS, S : s ↠p_R t, and U, V sets of non-⊥ occurrences in s. If U ⊆ V, then U//S ⊆ V//S.

Proof. Straightforward induction on the length of S.
This lemma can be generalised, showing that descendants are defined "pointwise":

Proposition 5.5 (pointwise definition of descendants). Let R be a TRS, S : s ↠p_R t, and U a set of non-⊥ occurrences in s. Then it holds that U//S = ⋃_{u∈U} u//S.

Proof. Let S = (t_ι →_{π_ι,c_ι} t_{ι+1})_{ι<α}. For α = 0 and α = 1, the statement is trivially true. If α = α′ + 1 > 1, then abbreviate S|_{α′} and S|_[α′,α) by S_1 and S_2, respectively; the statement then follows by applying the induction hypothesis to S_1 and S_2. Let α be a limit ordinal. The inclusion ⋃_{u∈U} u//S ⊆ U//S follows from Lemma 5.4. For the converse direction, assume that π ∈ U//S. By Lemma 5.3, there is some β < α such that π_ι ≰ π for all β ≤ ι < α and π ∈ U//S|_β. Applying the induction hypothesis yields that π ∈ ⋃_{u∈U} u//S|_β, i.e. there is some u* ∈ U such that π ∈ u*//S|_β. By employing Lemma 5.3 again, we can conclude that π ∈ u*//S and, therefore, that π ∈ ⋃_{u∈U} u//S.

Note that the above proposition would fail if we included ⊥-occurrences in our definition of descendants: reconsider the example in Remark 5.2 and assume we dropped the restriction to non-⊥-occurrences. Then the residuals u//S of each occurrence u ∈ U would be empty, whereas the residuals U//S of all occurrences would be the root occurrence ⟨⟩.

Proposition 5.6 (distinct ancestors). Let R be a TRS, S : s ↠p_R t, and U, V sets of non-⊥ occurrences in s with U ∩ V = ∅. Then U//S ∩ V//S = ∅.

Proof. We prove the contraposition of the statement. To this end, suppose that there is some occurrence w ∈ U//S ∩ V//S. By Proposition 5.5, there are occurrences u ∈ U and v ∈ V such that w ∈ u//S ∩ v//S. We show by induction on the length of S that then u = v and, therefore, U ∩ V ≠ ∅. If S is empty, then this is trivial. If S is of successor ordinal length or open, then u = v follows from the induction hypothesis.
Remark 5.7. The two propositions above imply that each descendant u ′ ∈ U//S of a set U of occurrences is the descendant of a uniquely determined occurrence u ∈ U , i.e. u ′ ∈ u//S for exactly one u ∈ U . This occurrence u is also called the ancestor of u ′ by S.
The following proposition confirms a property of descendants that one expects intuitively: the descendants of descendants are again descendants. That is, the concept of descendants is composable.

Proposition 5.8 (composition of descendants). Let R be a TRS, S : s ↠p_R t, T : t ↠p_R u, and U a set of non-⊥ occurrences in s. Then (U//S)//T = U//(S · T).

The following proposition confirms that the disjointness of occurrences is propagated through their descendants:

Proposition 5.9 (disjoint descendants). The descendants of a set of pairwise disjoint occurrences are pairwise disjoint as well.
Proof. Let S : s ↠p^α t and let U be a set of pairwise disjoint occurrences in s. We show that U//S is also a set of pairwise disjoint occurrences by induction on α.
For α = 0, the statement is trivial, and, for α a successor ordinal, the statement follows straightforwardly from the induction hypothesis. Let α be a limit ordinal and suppose that there are two occurrences u, v ∈ U//S which are not disjoint. By definition, there are ordinals β_1, β_2 < α such that u ∈ U//S|_ι for all β_1 ≤ ι < α, and v ∈ U//S|_ι for all β_2 ≤ ι < α. Let β = max{β_1, β_2}. Then we have that u, v ∈ U//S|_β. This, however, contradicts the induction hypothesis, which states in particular that U//S|_β is a set of pairwise disjoint occurrences.
For the definition of complete developments it is important that the descendants of redex occurrences are again redex occurrences:

Proposition 5.10 (residuals). Let R be an orthogonal TRS, S : s ↠p_R t, and U a set of redex occurrences in s. Then U//S is a set of redex occurrences in t.
So assume that α is a limit ordinal and that π ∈ U//S. We show that t|_π is a redex. From Lemma 5.3 we obtain that there is some β < α with π ∈ U//S|_β and π_ι ≰ π for all β ≤ ι < α. (1) By applying the induction hypothesis, we get that π is a redex occurrence in t_β. Hence, there is some rule l → r ∈ R such that t_β|_π is an instance of l. We continue this proof by showing the following stronger claim for all β ≤ γ ≤ α:
(2) t_γ|_π is an instance of l, and
(3) c_ι|_π is an instance of l for all β ≤ ι < γ.
For the special case γ = α, claim (2) implies that t|_π is a redex. We proceed by induction on γ. For γ = β, part (2) of the claim has already been shown and (3) is vacuously true. Let γ = γ′ + 1 > β. According to the induction hypothesis, (2) and (3) hold for γ′. Hence, it remains to be shown that both t_γ|_π and c_γ′|_π are instances of l. At first consider c_γ′|_π, and recall that c_γ′ = t_γ′[⊥]_{π_γ′}. In the case where π and π_γ′ are disjoint, we have c_γ′|_π = t_γ′|_π. Since, by induction hypothesis, t_γ′|_π is an instance of l, so is c_γ′|_π. Next, consider the case where π and π_γ′ are not disjoint. Because of (1), we then have that π < π_γ′, i.e. there is some non-empty π′ with π_γ′ = π · π′. Since R is non-overlapping, π′ cannot be a position in the pattern of the redex t_γ′|_π w.r.t. l. Therefore, also c_γ′|_π is an instance of l. So in either case c_γ′|_π is an instance of l. Since c_γ′ ≤⊥ t_γ, also t_γ|_π is an instance of l.
Let γ > β be a limit ordinal. Part (3) of the claim follows immediately from the induction hypothesis. Hence, c ι | π is an instance of l for all β ≤ ι < γ. This and (1) imply that all terms in the set T = {c ι | β ≤ ι < γ } coincide in all occurrences in the set P = {π ′ | π ′ ≤ π } ∪ {π · π ′ | π ′ ∈ P Σ (l) }. P is obviously closed under prefixes. Therefore, we can apply Lemma 3.10 in order to obtain that ⨅T coincides with all terms in T in all occurrences in P . Since ⨅T ≤ ⊥ t γ , this property carries over to t γ . Consequently, also t γ | π is an instance of l.
Next we want to establish an alternative characterisation of descendants based on labellings. This is a well-known technique [24] that keeps track of descendants by labelling the symbols at the relevant positions in the initial term. In order to formalise this idea, we need to extend a given TRS such that it can also deal with terms that contain labelled symbols: Definition 5.11 (labelled TRSs/terms). Let R = (Σ, R) be a TRS.
(i) The labelled signature Σ l is defined as Σ ∪ {f l | f ∈ Σ }. The arity of the function symbol f l is the same as that of f . The symbols f l are called labelled; the symbols f ∈ Σ are called unlabelled. Terms over Σ l are called labelled terms. Note that the symbol ⊥ ∈ Σ ⊥ has no corresponding labelled symbol ⊥ l in the labelled signature Σ l ⊥ . Likewise, there are no labelled variables. (ii) Labelled terms can be projected back to the original unlabelled ones by removing the labels via the projection function ⌊·⌋, which replaces each labelled symbol f l by f . (iii) Conversely, for a term t and a set U of positions of non-⊥, non-variable symbols in t, the labelling t (U ) of t at U is obtained from t by labelling exactly the symbols at positions in U . That is, ⌊t (U ) ⌋ = t and the labelled symbols in t (U ) are exactly those at positions in U .
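For intuition, the labelling of a term at a set of positions and the projection that removes labels can be sketched concretely. A minimal Python sketch, assuming terms are encoded as nested tuples like ("f", ("a",)), positions as tuples of child indices, and a labelled symbol as a (symbol, flag) pair; the names subterm, label_at and unlabel are illustrative, not from the paper:

```python
# Terms are nested tuples like ("f", ("a",)); positions are tuples of
# child indices; a labelled symbol is modelled as a (symbol, flag) pair.

def subterm(t, pos):
    """The subterm of t at position pos."""
    for i in pos:
        t = t[1 + i]
    return t

def label_at(t, positions, pos=()):
    """The labelling of t at the given set of positions: exactly the
    symbols at those positions get their label flag set."""
    head, args = t[0], t[1:]
    return ((head, pos in positions),) + tuple(
        label_at(a, positions, pos + (i,)) for i, a in enumerate(args))

def unlabel(t):
    """The projection back to unlabelled terms: all labels are removed."""
    (head, _), args = t[0], t[1:]
    return (head,) + tuple(unlabel(a) for a in args)
```

Removing the labels of a labelling recovers the original term, and the labelled symbols are exactly those at the chosen positions.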
The key property which is needed in order to make the labelling approach work is that any reduction in a left-linear TRS that starts in some term t can be lifted, for any labelling t ′ of t, to a unique equivalent reduction in the corresponding labelled TRS that starts in t ′ : Proposition 5.12 (lifting reductions to labelled TRSs). Let R = (Σ, R) be a left-linear TRS, S = (s ι → ρι,πι s ι+1 ) ι<α a reduction strongly p-converging to s α in R , and t 0 ∈ T ∞ (Σ l ⊥ , V) a labelled term with ⌊t 0 ⌋ = s 0 , where ⌊·⌋ denotes the removal of all labels. Then there is a unique reduction T = (t ι → ρ ′ ι ,πι t ι+1 ) ι<α strongly p-converging to t α in R l such that (a) ⌊t ι ⌋ = s ι and ⌊ρ ′ ι ⌋ = ρ ι for all ι < α, and (b) ⌊t α ⌋ = s α .
Proof. We prove this by an induction on α. For the case of α being zero, the statement is trivially true. For the case of α being a successor ordinal, the statement follows straightforwardly from the induction hypothesis (the argument is the same as for finite reductions; e.g. consult [24]).
Let α be a limit ordinal. By induction hypothesis, for each proper prefix S| γ of S there is a uniquely defined strongly p-convergent reduction T γ in R l satisfying (a) and (b). Since the sequence (S| ι ) ι<α forms a chain w.r.t. the prefix order ≤, so does the corresponding sequence (T ι ) ι<α . Hence, the least upper bound T = ⨆ ι<α T ι is well-defined. By construction, T γ ≤ T holds for each γ < α, and we can use the induction hypothesis to obtain part (a) of the proposition. In order to show s α = ⌊t α ⌋, where ⌊·⌋ denotes the removal of all labels, we prove the two inequalities ⌊t α ⌋ ≤ ⊥ s α and s α ≤ ⊥ ⌊t α ⌋: To show ⌊t α ⌋ ≤ ⊥ s α , we take some π ∈ P ⊥ (⌊t α ⌋) and show that ⌊t α ⌋(π) = s α (π). Let f = ⌊t α ⌋(π). That is, either t α (π) = f or t α (π) = f l . In either case, we can employ Lemma 3.11 to obtain some β < α such that t β (π) = f resp. t β (π) = f l and π ι ≰ π for all β ≤ ι < α. Since, by (a), s β = ⌊t β ⌋, we have in both cases that s β (π) = f . By applying Lemma 3.11 again, we get that s α (π) = f , too.
Having this, we can establish an alternative characterisation of descendants using labellings: Proposition 5.13 (alternative characterisation of descendants). Let R be a left-linear TRS, S : s 0 ։ p R s α , and U ⊆ P ⊥ (s 0 ). Following Proposition 5.12, let T : t 0 ։ p R l t α be the unique lifting of S to R l starting with the term t 0 = s 0 (U ) . Then it holds that t α = s α (U//S) . That is, for all π ∈ P ⊥ (s α ), it holds that t α (π) is labelled iff π ∈ U//S. Proof. Let S = (s ι → πι s ι+1 ) ι<α and T = (t ι → πι t ι+1 ) ι<α . We prove the statement by an induction on the length α of S. If α = 0, then the statement is trivially true. If α is a successor ordinal, then a straightforward argument shows that the statement follows from the induction hypothesis. Here the restriction to left-linear systems is vital.
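For a single finite rewrite step, the labelling-based characterisation can be illustrated directly: label the source term at U, perform the step on the labelled term (matching the left-hand side while ignoring labels, which left-linearity permits), and read off the labelled positions of the result. A hedged Python sketch on ground terms with illustrative names (match, build, rewrite_at, labelled_positions), not taken from the paper:

```python
# Variables are plain strings like "x"; a labelled term node is a pair
# (symbol, flag), so f(a) labelled at the position of a is
# (("f", False), (("a", True),)).

def is_var(x):
    return isinstance(x, str)

def match(lhs, t, env):
    """Match a linear, label-free pattern against a labelled term,
    ignoring the labels on the term's symbols."""
    if is_var(lhs):
        env[lhs] = t
        return True
    (head, _), args = t[0], t[1:]
    return (lhs[0] == head and len(lhs) - 1 == len(args)
            and all(match(p, a, env) for p, a in zip(lhs[1:], args)))

def build(rhs, env):
    """Instantiate a right-hand side; symbols from the rule come out
    unlabelled, while matched subterms keep their labels."""
    if is_var(rhs):
        return env[rhs]
    return ((rhs[0], False),) + tuple(build(a, env) for a in rhs[1:])

def rewrite_at(t, pos, lhs, rhs):
    """Contract the redex at position pos (a tuple of child indices)."""
    if pos == ():
        env = {}
        assert match(lhs, t, env)
        return build(rhs, env)
    i = pos[0]
    return t[:1 + i] + (rewrite_at(t[1 + i], pos[1:], lhs, rhs),) + t[2 + i:]

def labelled_positions(t, pos=()):
    """Read off the labelled positions, i.e. the descendants."""
    (_, flag), args = t[0], t[1:]
    out = {pos} if flag else set()
    for i, a in enumerate(args):
        out |= labelled_positions(a, pos + (i,))
    return out
```

For instance, labelling f(a) at the position of a and contracting with the hypothetical rule f(x) → g(x, x) leaves labels on both copies of a, so the descendants of that position are the two argument positions of g.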

5.2. Constructing Complete Developments. Complete developments are usually defined for (almost) orthogonal systems. This ensures that the residuals of redexes are again redexes. Since we are going to use complete developments for potentially overlapping systems as well, we need to impose restrictions on the set of redex occurrences instead: Definition 5.14 (conflicting redex occurrences). Two distinct redex occurrences u, v in a term t are called conflicting if there is a position π such that v = u · π and π is a pattern position of the redex at u, or, vice versa, u = v · π and π is a pattern position of the redex at v. If this is not the case, then u and v are called non-conflicting.
One can easily see that in an orthogonal TRS any pair of redex occurrences is non-conflicting.
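The conflict condition of Definition 5.14 is easy to decide once the pattern positions of each redex are known. A small Python sketch, assuming positions are tuples of child indices and a hypothetical map from each redex occurrence to the set of pattern positions (non-variable positions) of its rule's left-hand side:

```python
def conflicting(u, v, pattern_pos):
    """Decide whether two redex occurrences u and v conflict:
    one must be a proper prefix of the other, with the relative
    position lying in the pattern of the outer redex.
    pattern_pos maps each redex occurrence to the set of pattern
    positions of its rule's left-hand side."""
    if u == v:
        return False
    for a, b in ((u, v), (v, u)):
        if b[:len(a)] == a:               # a is a proper prefix of b
            rel = b[len(a):]
            if rel in pattern_pos[a]:     # b hits the pattern of the redex at a
                return True
    return False
```

With this decision procedure, checking that a set of redex occurrences is pairwise non-conflicting amounts to testing all pairs.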
Definition 5.15 ((complete) development). Let R be a left-linear TRS, s a partial term in R, and U a set of pairwise non-conflicting redex occurrences in s.
(i) A development of U in s is a strongly p-converging reduction S : s ։ p α t in which each reduction step ϕ ι : t ι → πι t ι+1 contracts a redex at π ι ∈ U//S| ι . (ii) A development S : s ։ p t of U in s is called complete, denoted S : s ։ p U t, if U//S = ∅.
This is a straightforward generalisation of complete developments known from the finitary setting and coincides with the corresponding formalisation for metric infinitary rewriting [16] if restricted to total terms.
The restriction to non-conflicting redex occurrences is essential in order to guarantee that the redex occurrences are independent of each other: Proposition 5.16 (non-conflicting residuals). Let R be a left-linear TRS, s a partial term in R, U a set of pairwise non-conflicting redex occurrences in s, and S : s ։ p U t a development of U in s. Then also U//S is a set of pairwise non-conflicting redex occurrences.
Proof. This can be proved by induction on the length of S. The part showing that the descendants are again redex occurrences can be copied almost verbatim from the proof of Proposition 5.10. Instead of referring to the non-overlappingness of the system, one can refer to the non-conflictingness of the preceding residuals, which can be assumed by the induction hypothesis. The part of the induction proof that shows non-conflictingness is analogous to the proof of Proposition 5.9.
It is relatively easy to show that complete developments of sets of non-conflicting redex occurrences always exist in the partial order setting. The reason for this is that strongly p-continuous reductions always strongly p-converge as well. This means that, as long as there are (residuals of) redex occurrences left after an incomplete development, one can extend this development arbitrarily by contracting some of the remaining redex occurrences. The only thing that remains to be shown is that one can devise a reduction strategy which eventually contracts (all residuals of) all redexes. The proposition below shows that a parallel-outermost reduction strategy will always yield a complete development in a left-linear system. Proposition 5.17 (complete developments). Let R be a left-linear TRS, t a partial term in R, and U a set of pairwise non-conflicting redex occurrences in t. Then U has a complete development in t.
Proof. Let t 0 = t, U 0 = U and V 0 the set of outermost occurrences in U 0 . Furthermore, let S 0 : t 0 ։ p V 0 t 1 be some complete development of V 0 in t 0 . S 0 can be constructed by contracting the redex occurrences in V 0 in a left-to-right order. This step can be continued for each i < ω by taking U i+1 = U i //S i , V i+1 the set of outermost occurrences in U i+1 , and S i+1 : t i+1 ։ p V i+1 t i+2 some complete development of V i+1 in t i+1 . Note that then, by iterating Proposition 5.8, it holds that U//(S 0 · . . . · S i−1 ) = U i for all i < ω. (1) If there is some n < ω for which U n = ∅, then S 0 · . . . · S n−1 is a complete development of U according to (1). If this is not the case, consider the reduction S = ⨆ i<ω S i , i.e. the concatenation of all 'S i 's. We claim that S is a complete development of U . Suppose that this is not the case, i.e. U//S ≠ ∅. Hence, there is some u ∈ U//S. Since all 'U i 's are non-empty, so are the 'V i 's. Consequently, all 'S i 's are non-empty reductions, which implies that S is a reduction of limit ordinal length, say λ. Therefore, we can apply Lemma 5.3 to infer from u ∈ U//S that there is some α < λ such that u ∈ U//S| α and all reduction steps beyond α do not take place at u or above. This is not possible due to the parallel-outermost reduction strategy that S adheres to.
This shows that complete developments of any set of redex occurrences always exist in any (almost) orthogonal system. This is already an improvement over strongly m-converging reductions, which only allow this if no collapsing rules are present or the considered set of redex occurrences does not contain an infinite set of nested collapsing redexes -- also known as an infinite collapsing tower.
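For finite terms the parallel-outermost construction from the proof can be sketched directly. A hedged Python sketch, assuming ground terms as nested tuples, left-linear rules as (lhs, rhs) pairs with string variables, and, for simplicity, rules whose right-hand sides create no fresh redexes (so every redex is a residual of the initial set and the loop terminates); the transfinite case needs the limit construction of the proof:

```python
def is_var(x):
    return isinstance(x, str)

def match(lhs, t, env):
    """Match a linear pattern against a ground term, binding variables."""
    if is_var(lhs):
        env[lhs] = t
        return True
    return (lhs[0] == t[0] and len(lhs) == len(t)
            and all(match(p, a, env) for p, a in zip(lhs[1:], t[1:])))

def build(rhs, env):
    if is_var(rhs):
        return env[rhs]
    return (rhs[0],) + tuple(build(a, env) for a in rhs[1:])

def redex_positions(t, rules, pos=()):
    out = [pos] if any(match(l, t, {}) for l, _ in rules) else []
    for i, a in enumerate(t[1:]):
        out += redex_positions(a, rules, pos + (i,))
    return out

def contract(t, pos, rules):
    if pos == ():
        for l, r in rules:
            env = {}
            if match(l, t, env):
                return build(r, env)
        raise ValueError("no redex at position")
    i = pos[0]
    return t[:1 + i] + (contract(t[1 + i], pos[1:], rules),) + t[2 + i:]

def parallel_outermost_development(t, rules):
    """Round by round, contract all outermost redexes left to right.
    Outermost redexes are pairwise disjoint, so the contractions of
    one round do not interfere with each other."""
    while True:
        redexes = redex_positions(t, rules)
        if not redexes:
            return t
        outermost = [u for u in redexes
                     if not any(v != u and u[:len(v)] == v for v in redexes)]
        for u in sorted(outermost):   # left-to-right order, as in the proof
            t = contract(t, u, rules)
```

For example, with the collapsing rule f(x) → x, the term g(f(f(a))) needs two rounds, since the inner redex only becomes outermost after the outer one is contracted.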
We shall discuss the issue of collapsing rules as well as infinite collapsing towers in more detail in the subsequent section, where we will show that complete developments are also unique in the sense that the final outcome is uniquely determined by the initial set of redex occurrences.

5.3. Uniqueness of Complete Developments. The goal of this section is to show that the final term of a complete development is uniquely determined by the initial set of redex occurrences U . There are several techniques for showing this in the metric model. One of these approaches, introduced by Kennaway and de Vries [14] and detailed by Ketema and Simonsen [20,19] for infinitary combinatory reduction systems, uses so-called paths. Paths are constructed such that they, starting from the root, run through the initial term t of the complete development; whenever a redex occurrence of the development is encountered, the path jumps to the root of the right-hand side of the corresponding rule, and it jumps back to the term t when it reaches a variable in the right-hand side. Figure 5a illustrates this idea. It shows a path in a term t that encounters two redex occurrences of the complete development. As soon as such a redex occurrence is encountered, the path jumps to the right-hand side of the corresponding rule, as indicated by the dashed arrows. Then the path runs through the right-hand side. When a variable is encountered, the path jumps back to the position of the term t that matches the variable. This jump is again indicated by a dashed arrow. The path that is obtained by this construction is shown in Figure 5b. From the collection of the thus obtained paths one can then construct the final term of the complete development. This technique -- slightly modified -- can also be applied in the present setting.
A path consists of nodes, which are connected by edges. We have two kinds of nodes: a node (⊤, π) represents a location in the term t and a node (r, π, u) represents a location in the right-hand side r of a rule. These nodes of the form (⊤, π) and (r, π, u) encode that the path is currently at position π in the term t resp. r. The additional component u provides the information that the path jumped to the right-hand side r from the redex t| u . Both nodes and the edges between them are labelled. Each node is labelled with the symbol at the current location of the path, unless it is a redex occurrence in t or a variable occurrence in a right-hand side. The labellings of the edges provide information on how the path moves through the terms: a labelling i represents a move along the i-th edge in the term tree from the current location whereas an empty labelling indicates a jump from or to a right-hand side of a rule.
Definition 5.18 (path). Let R be a left-linear TRS, t a partial term in R, and U a set of pairwise non-conflicting redex occurrences in t. A U, R-path (or simply path) in t is a sequence of length at most ω containing so-called nodes and edges in an alternating manner like this: n 0 , e 0 , n 1 , e 1 , n 2 , e 2 , . . . where the 'n i 's are nodes and the 'e i 's are edges. A node is either a pair of the form (⊤, π) with π ∈ P(t) or a triple of the form (r, π, u) with r the right-hand side of a rule in R, π ∈ P(r), and u ∈ U . Edges are denoted by arrows →. Nodes and edges might be labelled by elements in Σ ⊥ ∪ V and N, respectively. We write paths like the one sketched above as n 0 → n 1 → n 2 → · · · or, when indicating labels explicitly, with the label of each node and edge attached to it, where empty labels are explicitly given by the symbol ∅. If a path has a segment of the form n → n ′ , then we say there is an edge from n to n ′ or that n has an outgoing edge to n ′ .
Every path starts with the node (⊤, ε), where ε denotes the empty position, and is either infinitely long or ends with a node. For each node n having an outgoing edge to a node n ′ , the following must hold: (1) If n is of the form (⊤, π), then (a) n ′ = (⊤, π · i) and the edge is labelled by i, with π · i ∈ P(t) and π ∉ U , or (b) n ′ = (r, ε, π) and the edge is unlabelled, with t| π a ρ-redex for ρ : l → r ∈ R and π ∈ U . (2) If n is of the form (r, π, u), then (a) n ′ = (r, π · i, u) and the edge is labelled by i, with π · i ∈ P(r), or (b) n ′ = (⊤, u · π ′ ) and the edge is unlabelled, with t| u a ρ-redex for ρ : l → r ∈ R, r| π a variable, and π ′ the unique occurrence of r| π in l.
Additionally, the nodes of a path are supposed to be labelled in the following way: (3) A node of the form (⊤, π) is unlabelled if π ∈ U and is labelled by t(π) otherwise. (4) A node of the form (r, π, u) is unlabelled if r| π is a variable and labelled by r(π) otherwise.
Remark 5.19. The above definition is actually a coinductive one. This is necessary to also define paths of infinite length. Also in [14] paths are considered to be possibly infinite, although they are defined inductively and are, therefore, finite.
Remark 5.20. Our definition of paths deviates slightly from the usual definition found in the literature [16,20,21]: In our setting, term nodes are of the form (⊤, π). The symbol ⊤ is used to indicate that we are in the host term t. In the definitions found in the literature, the term t itself is used for that, i.e. term nodes are of the form (t, π). Our definition of paths makes them less dependent on the term t they are constructed in. This makes it easier to construct a path in a host term from other paths in different host terms, which will become necessary in the proof of Lemma 5.33. However, we have to keep in mind that the node labels in a path depend on the host term under consideration. Thus, the labelling of a path might be different depending on which host term it is considered to be in.
Returning to the schematic example illustrated in Figure 5, we can observe how the construction of a path is carried out: The path starts with a segment in the term t. This segment is entirely regulated by rule (1a); all its edges and nodes are labelled, according to (1a) and (3). The jump to the right-hand side r 1 following that initial segment is justified by rule (1b). This jump consists of a node (⊤, u 1 ), unlabelled according to (3), corresponding to the redex occurrence u 1 , and an unlabelled edge to the node (r 1 , ε, u 1 ), corresponding to the root of the right-hand side r 1 . The segment of the path that runs through the right-hand side r 1 is subject to rule (2a); again all its nodes and edges are labelled, now according to (2a) and (4). As soon as a variable is reached in the right-hand side term (in the schematic example it is the variable x), a jump back to the main term t is performed as required by rule (2b). This jump consists of a node (r 1 , π, u 1 ), unlabelled according to (4), where π is the current position in r 1 , i.e. the variable occurrence, and an unlabelled edge to the node (⊤, u 1 · π ′ ). The position π ′ is the occurrence of the variable x in the left-hand side. As we only consider left-linear systems, this occurrence is unique. Afterwards, the same behaviour is repeated: a segment in t is followed by a jump to a segment in the right-hand side r 2 , which is in turn followed by a jump back to a final segment in t.
Note that paths do not need to be maximal. As indicated in the schematic example, the path ends somewhere within the main term, i.e. not necessarily at a constant symbol or a variable. What the example does not show, but which is obvious from the definition, is that paths can also terminate within a right-hand side. A jump back to the main term is only required if a variable is encountered.
The purpose of the concept of paths is to simulate the contraction of all redexes of the complete development in a locally restricted manner, i.e. only along some branch of the term tree. This locality keeps the proofs more concise and makes them easier to understand once we have grasped the idea behind paths. The strategy for proving our claim of uniquely determined final terms is to show that paths can be used to define a term and that contracting a redex of the complete development preserves a property of the collection of all paths which ensures that the induced term remains invariant. Then we only have to observe that the term induced by the paths in a term with no redexes (in U ) is the term itself.
The following fact is obvious from the definition of a path.
Fact 5.21. Let R be a left-linear TRS, t a partial term in R, and U a set of redex occurrences in t.
(i) An edge in a U, R-path in t is unlabelled iff the preceding node is unlabelled.
(ii) Any prefix of a U, R-path in t that ends in a node is also a U, R-path in t.
As we have already mentioned, collapsing rules and in particular so-called infinite collapsing towers play a significant role in m-convergent reductions, as they obstruct complete developments. Also in our setting of p-convergent reductions they are important, as they are responsible for volatile positions: Definition 5.22 (collapsing rules/redexes/towers). Let R be a left-linear TRS. (i) A rule l → r ∈ R is called collapsing if its right-hand side r is a variable; the unique occurrence of r in l is called the collapsing position of the rule. (ii) A redex occurrence u in a term t is called collapsing if the redex at u is a redex of a collapsing rule ρ; the collapsing position of ρ is also called the collapsing position of the redex at u. (iii) A collapsing tower is a non-empty sequence (u i ) i<α of collapsing redex occurrences in a term t such that u i+1 = u i · π i for each i with i + 1 < α, where π i is the collapsing position of the redex at u i . It is called maximal if it is not a proper prefix of another collapsing tower.
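For a finite term, the maximal collapsing towers can be enumerated from the collapsing redex occurrences alone. A small Python sketch, assuming positions are tuples of child indices and a hypothetical map from each collapsing redex occurrence to its collapsing position:

```python
def collapsing_towers(collapse_pos):
    """collapse_pos maps each collapsing redex occurrence u (a position)
    to the collapsing position of the redex at u. Returns the maximal
    collapsing towers, each as a list of nested redex occurrences."""
    # topmost occurrences: not reachable by following a collapsing
    # position from another collapsing redex occurrence
    tops = [u for u in collapse_pos
            if not any(u == v + collapse_pos[v] for v in collapse_pos)]
    towers = []
    for u in tops:
        tower = [u]
        # extend downwards as long as the next nested occurrence is
        # itself a collapsing redex occurrence
        while tower[-1] + collapse_pos[tower[-1]] in collapse_pos:
            tower.append(tower[-1] + collapse_pos[tower[-1]])
        towers.append(tower)
    return towers
```

Since each tower is extended deterministically from its topmost occurrence, this also illustrates why maximal collapsing towers in orthogonal systems are determined by their topmost redex occurrence.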
One can easily see that, in orthogonal TRSs, maximal collapsing towers in the same term are uniquely determined by their topmost redex occurrence. That is, two maximal collapsing towers (u i ) i<α and (v i ) i<β in the same term are equal iff u 0 = v 0 . As mentioned, we shall use the U, R-paths in a term t in order to define the final term of a complete development of U in t. However, in order to do that, we only need the information that is available from the labellings. The inner structure of nodes is only used for the bookkeeping that is necessary for defining paths. The following notion of traces defines projections to the labels of paths: Definition 5.23 (trace). Let R be a left-linear TRS, t a partial term in R, and U a set of pairwise non-conflicting redex occurrences in t.
(i) Let Π be a U, R-path in t. The trace of Π, denoted tr t (Π), is the projection of Π to the labelling of its nodes and edges ignoring empty labels and the node label ⊥. (ii) P(t, U, R) is used to denote the set of all U, R-paths in t that end in a labelled node, or are infinite but have a finite trace. The set of traces of paths in P(t, U, R) is denoted by T r(t, U, R).
By Fact 5.21, the trace of a path is a sequence alternating between elements in Σ ∪ V and N, which, if non-empty, starts with an element in Σ ∪ V. Moreover, by definition, T r(t, U, R) is a set of finite traces of U, R-paths in t.
As we have mentioned in Remark 5.20, the labelling of a path depends on the host term under consideration. Hence, also the trace of a path depends on the host term. That is why we need to index the trace mapping tr t (·) with the corresponding host term t.
Example 5.24. Consider the term t = g(f (g(h(⊥)))) and the TRS R consisting of the two rules Furthermore, let U be the set of all redex occurrences in t, viz. U = {⟨0⟩, ⟨0 3 ⟩}. The following path Π is a U, R-path in t: As a matter of fact, Π is the greatest path of t. Hence, according to Fact 5.21, the set of all prefixes of Π ending in a node is the set of all U, R-paths in t. Note that since Π itself ends in a labelled node, it is contained in P(t, U, R). The trace tr t (Π) of Π is the sequence Now consider the term t ′ = g(f (g(h ω ))) and the set U ′ of all its redexes, viz. U ′ = {⟨0⟩} ∪ {⟨0 3 ⟩, ⟨0 4 ⟩, . . .}. Then the following path Π ′ is a U ′ , R-path in t ′ : Since Π ′ is infinitely long but has a finite trace, it is contained in P(t ′ , U ′ , R).
The lemma below shows that there is a one-to-one correspondence between paths in P(t, U, R) and their traces in T r(t, U, R).
Lemma 5.25 (tr t (·) is a bijection). Let R be an orthogonal TRS, t a partial term in R, and U a set of redex occurrences in t. tr t (·) is a bijection from P(t, U, R) to T r(t, U, R).
Proof. By definition, tr t (·) is surjective. Let Π 1 , Π 2 be two paths having the same trace. We will show that then Π 1 = Π 2 by an induction on the length of the common trace.
Let tr t (Π 1 ) be the empty sequence. Following Fact 5.21, there are two different cases: The first case is that Π 1 = Π · (⊤, π), with the final node labelled ⊥, where the prefix Π corresponds to a finite maximal collapsing tower (u i ) i≤α starting at the root of t, or Π is empty if such a collapsing tower does not exist. If the collapsing tower exists, then π = u α · π α , where π α is the collapsing position of the redex at u α . But then also Π 2 starts with the prefix Π · (⊤, π), due to the uniqueness of the collapsing tower and the involved rules. In both cases, Π 1 = Π 2 follows immediately. The second case is that Π 1 is infinite. Then there is an infinite collapsing tower (u i ) i<ω starting at the root of t. Hence, Π 1 = Π 2 follows from the uniqueness of the infinite collapsing tower. At first glance one might additionally find a third case, where Π 1 = Π · (⊤, π) → (r, ε, π), with (⊤, π) and the edge unlabelled and the final node labelled ⊥, and Π a prefix corresponding to a collapsing tower as in the first case. However, this is not possible, as it would require the occurrence of ⊥ in a rule.
Let tr t (Π 1 ) = f . Then there are two cases: Either Π 1 = Π · (⊤, π) with the final node labelled f , or Π 1 = Π · (⊤, π) → (r, ε, π) with the final node labelled f , where the prefix Π corresponds to a finite maximal collapsing tower (u i ) i≤α starting at the root of t, or Π is empty if such a collapsing tower does not exist. The argument is analogous to the argument employed for the first case of the induction base above.
Finally, we consider the induction step. Here, there are two cases: Either tr t (Π 1 ) = T · i or tr t (Π 1 ) = T · i · f . For both cases, the induction hypothesis can be invoked by taking two prefixes Π ′ 1 and Π ′ 2 of Π 1 and Π 2 , respectively, which both have the trace T and are, therefore, equal according to the induction hypothesis. The argument that the remaining suffixes of Π 1 and Π 2 are equal is then analogous to the argument for the two base cases.
As mentioned above, the traces of paths contain all information necessary to define a term which we will later identify to be the final term of the corresponding complete development. The following definition explains how such a term, called a matching term, is determined: Definition 5.26 (matching term). Let R be a left-linear TRS, t a partial term in R, and U a set of pairwise non-conflicting redex occurrences in t.
(i) The position of a trace T ∈ T r(t, U, R), denoted pos(T ), is the subsequence of T containing only the edge labels. The set of all positions of traces in T r(t, U, R) is denoted PT r(t, U, R). (ii) The symbol of a trace T ∈ T r(t, U, R), denoted sym t (T ), is f if T ends in a node label f , and is ⊥ otherwise, i.e. whenever T is empty or ends in an edge label. (iii) A term t ′ is said to match T r(t, U, R) if P(t ′ ) = PT r(t, U, R) and t ′ (pos(T )) = sym t (T ) for all T ∈ T r(t, U, R).
Returning to the definition of paths, one can see that the label of a node is the symbol of the "current" position in a term. Similarly, the label of an edge says which edge in the term tree was taken at that point in the construction of the path. Hence, by projecting to the edge labels, we obtain the "history" of the path, i.e. the position. In the same way we obtain the symbol of that node by taking the label of the last node of the path, provided the corresponding path ends in a non-⊥-labelled node. In the other case that the trace does not end in a node label, the corresponding path either ends in a node labelled ⊥ or is infinite. As we will see, infinite paths with finite traces correspond to infinite collapsing towers, which in turn yield volatile positions within the complete development. Eventually, these volatile positions will also give rise to ⊥ subterms.
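For finite terms the matching term can be computed directly by following the paths, without materialising them: at a redex occurrence in U the construction jumps into the rule's right-hand side, and at a variable it jumps back into the host term below the redex. A hedged Python sketch; rules are identified by their head symbol (adequate here since U contains only redex occurrences and the system is assumed orthogonal), and on finite terms infinite collapsing towers cannot arise, so the ⊥ case is omitted:

```python
def is_var(x):
    return isinstance(x, str)

def subterm(t, pos):
    for i in pos:
        t = t[1 + i]
    return t

def var_position(lhs, x):
    """The unique occurrence of variable x in a left-linear lhs."""
    if lhs == x:
        return ()
    if is_var(lhs):
        return None
    for i, a in enumerate(lhs[1:]):
        p = var_position(a, x)
        if p is not None:
            return (i,) + p
    return None

def matching_term(t, U, rules, pos=()):
    """Follow the paths through t: at a redex occurrence in U jump into
    the rule's right-hand side, otherwise copy the symbol and descend."""
    s = subterm(t, pos)
    if pos in U:
        lhs, rhs = next((l, r) for l, r in rules if l[0] == s[0])
        return follow_rhs(t, pos, lhs, rhs, U, rules)
    return (s[0],) + tuple(matching_term(t, U, rules, pos + (i,))
                           for i in range(len(s) - 1))

def follow_rhs(t, u, lhs, rhs, U, rules):
    """Walk a right-hand side; at a variable, jump back into t below the
    redex at u, at the position bound to that variable."""
    if is_var(rhs):
        return matching_term(t, U, rules, u + var_position(lhs, rhs))
    return (rhs[0],) + tuple(follow_rhs(t, u, lhs, a, U, rules)
                             for a in rhs[1:])
```

With the hypothetical rules f(x) → g(x, x) and h(x) → x, the term f(h(a)) with both redexes in U yields g(a, a), the final term of the corresponding complete development; with U empty, the term itself is returned, as Lemma 5.29 states.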
The following lemma shows that there is also a one-to-one correspondence between the traces in T r(t, U, R) and their positions in PT r(t, U, R): Lemma 5.27 (pos(·) is a bijection). Let R be an orthogonal TRS, t a partial term in R and U a set of redex occurrences in t. pos(·) is a bijection from T r(t, U, R) to PT r(t, U, R).
Proof. An argument similar to the one for Lemma 5.25 can be given in order to show that the composition pos(·) • tr t (·) is a bijection. Together with the bijectivity of tr t (·), according to Lemma 5.25, this yields the bijectivity of pos(·).
Having this lemma, the following proposition is an easy consequence of the definition of matching terms. It shows that matching terms always exist and are uniquely determined: Proposition 5.28 (unique matching term). Let R be an orthogonal TRS, t a partial term in R, and U a set of redex occurrences in t. Then there is a unique term, denoted F(t, U, R), that matches T r(t, U, R).
Proof. Define the mapping ϕ : PT r(t, U, R) → Σ ⊥ ∪ V by setting ϕ(pos(T )) = sym t (T ) for each trace T ∈ T r(t, U, R). By Lemma 5.27, ϕ is well-defined. Moreover, it is easy to see from the definition of paths that PT r(t, U, R) is closed under prefixes and that ϕ respects the arity of the symbols, i.e. π · i ∈ PT r(t, U, R) iff 0 ≤ i < ar(ϕ(π)). Hence, ϕ uniquely determines a term s with s(π) = ϕ(π) for all π ∈ PT r(t, U, R). By construction, s matches T r(t, U, R). Moreover, any other term s ′ matching T r(t, U, R) must satisfy s ′ (π) = ϕ(π) for all π ∈ PT r(t, U, R) and is therefore equal to s.
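The construction in the proof, reading a term off a prefix-closed position-to-symbol map, can be sketched in a few lines of Python; the map phi and the arity table stand in for ϕ and ar, and all names are illustrative:

```python
def term_from_map(phi, arity, pos=()):
    """Build the unique term determined by a prefix-closed map phi from
    positions (tuples of child indices) to symbols, assuming phi
    respects the arities: pos + (i,) is in phi iff i < arity[phi[pos]]."""
    head = phi[pos]
    return (head,) + tuple(term_from_map(phi, arity, pos + (i,))
                           for i in range(arity[head]))
```

Uniqueness is immediate from the construction: the symbol at every position is forced by the map, just as in the proof.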
It is also obvious that the matching term of a term t w.r.t. an empty set of redex occurrences is the term t itself.
Lemma 5.29 (matching term w.r.t. empty redex set). For any TRS R and any partial term t in R, it holds that F(t, ∅, R) = t.
Remark 5.30. Now it only remains to be shown that the matching term stays invariant during a development, i.e. that, for each development S : t ։ p t ′ of U , the matching terms F(t, U, R) and F(t ′ , U//S, R) coincide. Since the matching term F(t, U, R) only depends on the set T r(t, U, R) of traces, it is sufficient to show that T r(t, U, R) and T r(t ′ , U//S, R) coincide. The key observation is that in each step s → s ′ in a development the paths in s ′ differ from the paths in s only in that they might omit some jumps. This can be seen in Figure 5a: In a step s → s ′ of a development, (some residual of) some redex occurrence in U is contracted. In the picture this corresponds to removing the pattern, say l 1 , of the redex and replacing it by the corresponding right-hand side r 1 of the rule. One can see that, except for the jump to and from the right-hand side r 1 the path remains the same.
In order to establish the above observation formally, we need a means to simulate reduction steps in a development directly as an operation on paths. The following definition provides a tool for this. Definition 5.31 (position and prefix of a path). Let R be a left-linear TRS, t a partial term in R, U a set of pairwise non-conflicting redex occurrences in t, and Π ∈ P(t, U, R).
(i) Π is said to contain a position π ∈ P(t) if it contains the node (⊤, π).
(ii) For each u ∈ U , the prefix of Π by u, denoted Π (u) , is defined as Π whenever Π does not contain u and otherwise as the unique prefix of Π that ends in (⊤, u).
Remark 5.32. It is obvious from the definition that each prefix Π (u) of a path Π ∈ P(t, U, R) by an occurrence u is the maximal prefix of Π that does not contain positions that are proper extensions of u. Hence, if Π contains u, then Π (u) is the maximal prefix of Π that only contains prefixes of u (including u itself).
The following lemma is the key step towards proving the invariance of matching terms in developments. It formalises the observation described in Remark 5.30.
Lemma 5.33 (preservation of traces). Let R be an orthogonal TRS, t a partial term in R, U a set of redex occurrences in t, and S : t ։ p t ′ a development of U in t. There is a surjective mapping ϑ S : P(t, U, R) → P(t ′ , U//S, R) such that tr t (Π) = tr t ′ (ϑ S (Π)) for all Π ∈ P(t, U, R).
Proof. We prove the statement by an induction on the length α of S. If α = 0, then the statement is trivially true.
The above lemma effectively establishes the invariance of matching terms during a development. Together with Lemma 5.29 this implies the uniqueness of final terms of complete developments of the same redex occurrences. As a corollary from this, we obtain that descendants are also unique among all complete developments: Proposition 5.34 (final term and descendants of complete developments). Let R be an orthogonal TRS, t a partial term in R, and U a set of redex occurrences in t. Then the following holds: (i) Each complete development of U in t strongly p-converges to F(t, U, R).
(ii) For each set V ⊆ P ⊥ (t) and two complete developments S and T of U in t, respectively, it holds that V //S = V //T .
Proof. (i) Let S : t ։ p U t ′ be a complete development of U in t strongly p-converging to t ′ . By Lemma 5.33, there is a surjective mapping ϑ : P(t, U, R) → P(t ′ , U ′ , R) with tr t (Π) = tr t ′ (ϑ(Π)) for all Π ∈ P(t, U, R), where U ′ = U//S. Hence, it holds that T r(t, U, R) = T r(t ′ , U ′ , R) and, consequently, F(t, U, R) = F(t ′ , U ′ , R). Since S is a complete development of U in t, we have that U ′ = ∅, which implies, according to Lemma 5.29, that F(t ′ , U ′ , R) = t ′ . Therefore, F(t, U, R) = t ′ .
(ii) Let t ′ = t (V ) . By Proposition 5.13, both reductions S and T can be uniquely lifted to reductions S ′ and T ′ in R l , respectively, such that V //S and V //T are determined by the final term of S ′ and T ′ , respectively. It is easy to see that also R l is an orthogonal TRS and that S ′ and T ′ are complete developments of U in t ′ . Hence, we can invoke clause (i) of this proposition to conclude that the final terms of S ′ and T ′ coincide and that, therefore, also V //S and V //T coincide.
Figure 6. The Infinitary Strip Lemma.
By the above proposition, the descendants of a complete development of a particular set of redex occurrences are unique. Therefore, we adopt the notation U//V for the descendants U//S of U by some complete development S of V . According to Proposition 5.17 and Proposition 5.34, U//V is well-defined for any orthogonal TRS.
Furthermore, Proposition 5.34 yields the following corollary establishing the diamond property of complete developments: Corollary 5.35 (diamond property of complete developments). Let R be an orthogonal TRS and S : t ։ p U t 1 and T : t ։ p V t 2 be two complete developments of U respectively V in t. Then t 1 and t 2 are joinable by complete developments S ′ : t 1 ։ p V //U t ′ and T ′ : t 2 ։ p U//V t ′ . Proof. By Proposition 5.5, it holds that (U ∪ V )//S = U//S ∪ V //S = V //U . By the equation above and Proposition 5.8, we have that S · S ′ : t ։ p U t 1 ։ p V //U t ′ is a complete development of U ∪ V . Analogously, we obtain that T · T ′ : t ։ p V t 2 ։ p U//V t ′′ is a complete development of U ∪ V , too. According to Proposition 5.34, this implies that both S · S ′ and T · T ′ strongly p-converge to the same term, i.e. t ′ = t ′′ .
In the next section we shall make use of complete developments in order to obtain the Infinitary Strip Lemma for p-converging reductions and a limited form of infinitary confluence for orthogonal systems.

The Infinitary Strip Lemma.
In this section we use the results we have obtained for complete developments in the previous two sections in order to establish that a complete development of a set of pairwise disjoint redex occurrences commutes with any strongly p-convergent reduction: Proposition 5.36 (Infinitary Strip Lemma). Let R be an orthogonal TRS, S : t 0 ։ p α t α a strongly p-convergent reduction, and t 0 ։ p U s 0 a complete development of a set U of pairwise disjoint redex occurrences in t 0 . Then t α and s 0 are joinable by a reduction S/T : s 0 ։ p s α and a complete development T /S : t α ։ p U//S s α .
s β (π) = f and π ′ ≰ π for all π ′ ∈ v ι //U ι and β ≤ ι < α, which means that no reduction step in s β ։ p s α takes place at a prefix of π. Thus, we can conclude, according to Lemma 3.11, that s α (π) = f . Similarly, one can show that s α (π) = f ≠ ⊥ implies t α (π) = f . Suppose t α (π) = ⊥. Hence, according to Lemma 3.14, π is outermost-volatile in S or there is some β < α such that t β (π) = ⊥ and v ι ≰ π for all β ≤ ι < α. In the latter case, we can argue as in the case above. In the former case, π is outermost-volatile in T as well. Thus, by applying Lemma 3.14, we obtain that s α (π) = ⊥. A similar argument can be employed for the reverse direction.
The reduction S/T constructed in the proof above is called the projection of S by T . Likewise, the reduction T /S is called the projection of T by S. As a corollary we obtain the following semi-infinitary confluence result: Corollary 5.37 (semi-infinitary confluence). In every orthogonal TRS, two reductions t ։ p t 2 and t → * t 1 can be joined by two reductions t 2 ։ p t 3 and t 1 ։ p t 3 .
Proof. This can be shown by an induction on the length of the reduction t → * t 1 . If it is empty, the statement trivially holds. The induction step follows from Proposition 5.36.
In the next section we shall, based on the Infinitary Strip Lemma, show that strong p-reachability coincides with Böhm-reachability, which then yields, amongst other things, full infinitary confluence of orthogonal systems.

Comparing Strong p-Convergence and Böhm-Convergence
In this section we shall show the core result of this paper: For orthogonal, left-finite TRSs, strong p-reachability and Böhm-reachability w.r.t. the set RA of root-active terms coincide. As corollaries of that, leveraging the properties of Böhm-convergence, we obtain both infinitary normalisation and infinitary confluence of orthogonal systems in the partial order model. Moreover, we will show that strong p-convergence also satisfies the compression property.
The central step of the proof of the equivalence of both models of infinitary rewriting is an alternative characterisation of root-active terms which is captured by the following definition: Definition 6.1 (destructiveness, fragility). Let R be a TRS.
(i) A reduction S : t ։ p s is called destructive if the root position ε is a volatile position in S.
(ii) A partial term t in R is called fragile if a destructive reduction starts in t.
Looking at the definition, fragility seems to be a more general concept than root-activeness: A term is fragile iff it admits a reduction in which infinitely often a redex at the root is contracted. For orthogonal TRSs, root-active terms are characterised in almost the same way. The difference is that only total terms are considered and that the stipulated reduction contracting infinitely many root redexes has to be of length ω. However, we shall show the set of total fragile terms to be equal to the set of root-active terms by establishing a compression lemma for destructive reductions. Using Lemma 3.14 we can immediately derive the following alternative characterisations: Fact 6.2 (destructiveness, fragility). Let R be a TRS.
(i) A reduction S : s ։ p t is destructive iff S is open and t = ⊥. (ii) A partial term t in R is fragile iff there is an open strongly p-convergent reduction t ։ p ⊥.
One has to keep in mind, however, that a closed reduction to ⊥ is not destructive. Such a notion of destructiveness would include the empty reduction from ⊥ to ⊥, and reductions that end with the contraction of a collapsing redex as, for example, in the single step reduction f (⊥) → ⊥ induced by the rule f (x) → x. Such reductions do not "produce" the term ⊥. They are merely capable of "moving" an already existent subterm ⊥ by a collapsing rule. In this sense, fragile terms are, according to Lemma 3.15, the only terms which can produce the term ⊥. This is the key observation for studying the relation between strong p-convergence and Böhm-convergence.
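The distinction can be made concrete with a small sketch; terms are nested tuples, and the rules f(x) → x and c → c are illustrative examples, not taken from the paper. The collapsing step merely relocates an existing ⊥, whereas the self-rewriting constant gives an open reduction whose root position is volatile:

```python
# Illustrative sketch: partial terms as nested tuples, '⊥' for the
# undefined term.  The rules f(x) -> x and c -> c are hypothetical.

BOT = '⊥'

def contract_collapse(term):
    """Apply the collapsing rule f(x) -> x at the root, if it matches."""
    if isinstance(term, tuple) and term[0] == 'f':
        return term[1]
    return term

# A single closed step f(⊥) -> ⊥: the ⊥ was already present as a subterm
# and is merely moved to the root, so no ⊥ is "produced".
assert contract_collapse(('f', BOT)) == BOT

# In contrast, with c -> c every step of the ω-reduction c -> c -> ...
# takes place at the root: the root is volatile, so the reduction is
# open and destructive, and its p-limit is ⊥.
term, positions = 'c', []
for _ in range(10):            # a finite prefix of the ω-reduction
    term = 'c'                 # c rewrites to itself at the root
    positions.append(())       # () encodes the root position ε
assert term == 'c' and positions.count(()) == 10
```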
In order to show that strong p-reachability and Böhm-reachability w.r.t. RA coincide we will proceed as follows: At first we will show that strong p-reachability implies Böhmreachability w.r.t. the set of total fragile terms, i.e. the fragile terms in T ∞ (Σ, V). From this we will derive a compression lemma for destructive reductions. We will then use this to show that the set RA of root-active terms coincides with the set of total fragile terms. From this we conclude that strong p-reachability implies Böhm-reachability w.r.t. RA. Finally, we then show the other direction of the equality.
6.1. From Strong p-Convergence to Böhm-Convergence. For the first step we have to transform a strongly p-converging reduction into a Böhm-converging reduction w.r.t. the set of total fragile terms, i.e. a strongly m-converging reduction w.r.t. the corresponding Böhm extension B. Recall that, by Theorem 4.12, the only difference between strongly p-converging reductions and strongly m-converging reductions is the ability of the former to produce ⊥ subterms. This happens, according to Lemma 3.14, precisely at volatile positions.
We can, therefore, proceed as follows: Given a strongly p-converging reduction we construct a Böhm-converging reduction by removing reduction steps which cause the volatility of a position in some open prefix of the reduction and then replacing them by a single → ⊥ -step.
The intuition of this construction is illustrated in Figure 7. It shows a strongly p-converging reduction of length ω · 4 from s to t. In order to maintain readability, we restrict our attention to a particular branch of the term (tree) as indicated in Figure 7a. The picture shows five positions which are volatile in some open prefix of the reduction. We assume that they are the only volatile positions, at least in the considered branch. Note that the positions do not need to occur in all of the terms in the reduction. They might disappear and reappear repeatedly. Each of them, however, appears in infinitely many terms in the reduction, as, by definition of volatility, infinitely many steps take place at each of these positions. In Figure 7b, the prefixes of the reduction that contain a volatile position are indicated by a wavy rewrite arrow pointing to a ⊥. The level of an arrow indicates the position which is volatile. A prefix might have multiple volatile positions. For example, both π 2 and π 4 are volatile in the prefix of length ω. But a position might also be volatile for several prefixes. For instance, π 3 is volatile in the prefix of length ω · 2 and the prefix of length ω · 4.
By Lemma 3.14, outermost-volatile positions are responsible for the generation of ⊥ subterms. By their nature, at some point there are no reductions taking place above outermost-volatile positions. The suffix where this is the case is a nested destructive reduction. The subterm where this suffix starts is, therefore, a fragile term and we can replace this suffix with a single → ⊥ -step. The segments which are replaced in this way are highlighted by dashed boxes in Figure 7b. As indicated by the dotted lines, this then also includes reduction steps which occur below the outermost-volatile positions. Therefore, volatile positions which are not outermost are removed as well. Eventually, we obtain a Böhm-converging reduction from s to t.

Proposition 6.3 (from strong p-convergence to Böhm-convergence). Let R be a left-linear, left-finite TRS and B the Böhm extension of R w.r.t. the set U of total fragile terms of R. Then t 0 ։ p R t α implies t 0 ։ m B t α .

Proof. Assume that there is a reduction S = (t ι → πι t ι+1 ) ι<α in R that strongly p-converges to t α . We will construct a strongly m-convergent reduction T : t 0 ։ m B t α in B by removing reduction steps in S that take place at or below outermost-volatile positions of some prefix of S and replacing them by → ⊥ -steps. Let π be an outermost-volatile position of some prefix S| λ . Then there is some ordinal β < λ such that no reduction step between β and λ in S takes place strictly above π, i.e. π ι ≮ π for all β ≤ ι < λ. Such an ordinal β must exist since otherwise π would not be an outermost-volatile position in S| λ . Hence, we can construct a destructive reduction S ′ : t β | π ։ p ⊥ by taking the subsequence of the segment S| [β,λ) that contains the reduction steps at π or below. Note that t β | π might still contain the symbol ⊥. Since ⊥ is not relevant for the applicability of rules in R, each of the ⊥ symbols in t β | π can be safely replaced by arbitrary total terms, in particular by terms in U . Let r be a term that is obtained in this way. Then there is a destructive reduction S ′′ : r ։ p ⊥ that applies the same rules at the same positions as in S ′ . Hence, r ∈ U .
By construction, r is a ⊥, U -instance of t β | π which means that t β | π ∈ U ⊥ . Additionally, t β | π ≠ ⊥ since there is a non-empty reduction S ′ : t β | π ։ p ⊥ starting in t β | π . Consequently, there is a rule t β | π → ⊥ in B. Let T ′ be the reduction that is obtained from S| λ by replacing the β-th step, which we can assume w.l.o.g. to take place at π, by a step with the rule t β | π → ⊥ at the same position π, and by removing all reduction steps ϕ ι taking place at π or below for all β < ι < λ. Let t ′ be the term that the reduction T ′ strongly p-converges to. The terms t λ and t ′ can only differ at position π or below. However, by construction, we have t ′ (π) = ⊥ and, by Lemma 3.14, t λ (π) = ⊥.
This construction can be performed for all prefixes of S and their respective outermostvolatile positions. Thereby, we obtain a strongly p-converging reduction T : t 0 ։ p B t α for which no prefix has a volatile position. By Lemma 3.15, T is a total reduction. Note that B is a TRS over the extended signature Σ ′ = Σ ⊎ {⊥}, i.e. terms containing ⊥ are considered total. Hence, by Theorem 4.12, T : t 0 ։ m B t α .
6.2. From Böhm-Convergence to Strong p-Convergence. Next, we establish a compression lemma for destructive reductions, i.e. that each destructive reduction can be compressed to length ω. Before we continue with this, we need to mention the following lemma from Kennaway et al. [17]: Lemma 6.4 (splitting Böhm-converging reductions). Let B be the Böhm extension of a left-linear TRS R. If s ։ m B t, then there is a term s ′ such that s ։ m R s ′ ։ m ⊥ t. In the next proposition we show that, excluding ⊥ subterms, the final term of a strongly p-converging reduction can be approximated arbitrarily well by a finite reduction. This corresponds to Corollary 2.5 which establishes finite approximations for strongly m-convergent reductions.
Proposition 6.5 (finite approximation). Let R be a left-linear, left-finite TRS and s ։ p R t. Then, for each finite set P ⊆ P ⊥ (t), there is a reduction s → * R t ′ such that t and t ′ coincide in P .
Proof. Assume that s ։ p R t. Then, by Proposition 6.3, there is a reduction s ։ m B t, where B is the Böhm extension of R w.r.t. the set of total, fragile terms of R. By Lemma 6.4, there is a reduction s ։ m R s ′ ։ m ⊥ t. Clearly, s ′ and t coincide in P ⊥ (t). Let d = max {|π| | π ∈ P }. Since P is finite, d is well-defined. By Corollary 2.5, there is a reduction s → * R t ′ such that t ′ and s ′ coincide up to depth d and, thus, in particular they coincide in P . Consequently, since s ′ and t coincide in P ⊥ (t) ⊇ P , t and t ′ coincide in P , too.
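As a small illustration of Proposition 6.5, consider the hypothetical rule g(x) → s(g(x)) (not from the paper): the reduction starting in g(a) strongly p-converges to the infinite total term s(s(s(. . .))), and every finite set of positions of the limit is already matched by a sufficiently long finite prefix of the reduction. A Python sketch:

```python
# Illustrative sketch for finite approximation; the rule g(x) -> s(g(x))
# is a hypothetical example.  Terms are nested tuples; a position is a
# tuple of argument indices (always 1 here, the single argument).

def step(term):
    """Contract the unique g-redex: rewrite the subterm g(t) to s(g(t))."""
    if term[0] == 'g':
        return ('s', term)
    return (term[0], step(term[1]))

def symbol_at(term, pos):
    for _ in pos:
        term = term[1]
    return term[0]

t = ('g', ('a',))
for _ in range(5):             # a finite prefix of the ω-reduction
    t = step(t)                # g(a) -> s(g(a)) -> s(s(g(a))) -> ...

# The p-limit is the infinite term s(s(s(...))); after n steps the
# finite approximant agrees with the limit on all positions of depth < n.
P = [(), (1,), (1, 1)]
assert [symbol_at(t, p) for p in P] == ['s', 's', 's']
```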
In order to establish a compression lemma for destructive reductions we need that fragile terms are preserved by finite reductions. We can obtain this from the following more general lemma showing that destructive reductions are preserved by forming projections as constructed in the Infinitary Strip Lemma: Lemma 6.6 (preservation of destructive reductions by projections). Let R be an orthogonal TRS, S : t 0 ։ p t α a destructive reduction, and T : t 0 ։ p U s 0 a complete development of a set U of pairwise disjoint redex occurrences. Then the projection S/T : s 0 ։ p s α is also destructive.
As a consequence of this preservation of destructiveness by forming projections, we obtain that the set of fragile terms is closed under finite reductions: Lemma 6.7 (closure of fragile terms under finite reductions). In each orthogonal TRS, the set of fragile terms is closed under finite reductions.
Proof. Let t be a fragile term and T : t → * t ′ a finite reduction. Hence, there is a destructive reduction starting in t. A straightforward induction on the length of T , using Lemma 6.6, shows that there is a destructive reduction starting in t ′ . Thus, t ′ is fragile.

Now we can show that destructiveness does not need more than ω steps in orthogonal, left-finite TRSs. This property will be useful for proving the equivalence of root-activeness and fragility of total terms as well as the Compression Lemma for strongly p-convergent reductions.
Proposition 6.8 (Compression Lemma for destructive reductions). Let R be an orthogonal, left-finite TRS and t a partial term in R. If there is a destructive reduction starting in t, then there is a destructive reduction of length ω starting in t.
Proof. Let S : t 0 ։ p λ ⊥ be a destructive reduction starting in t 0 . Hence, there is some α < λ such that S| α : t 0 ։ p s 1 , where s 1 is a ρ-redex for some ρ : l → r ∈ R. Let P be the set of pattern positions of the ρ-redex s 1 , i.e. P = P Σ (l). Due to the left-finiteness of R, P is finite. Hence, by Proposition 6.5, there is a finite reduction t 0 → * s ′ 1 such that s 1 and s ′ 1 coincide in P . Hence, because R is left-linear, also s ′ 1 is a ρ-redex. Now consider the reduction T 0 : t 0 → * s ′ 1 → ρ,ε t 1 ending with a contraction at the root. T 0 is of finite length and, according to Lemma 6.7, t 1 is fragile.
Since t 1 is again fragile, the above argument can be iterated arbitrarily often, which yields for each i < ω a finite non-empty reduction T i : t i → * t i+1 whose last step is a contraction at the root. Then the concatenation T = T 0 · T 1 · T 2 · · · of these reductions is a destructive reduction of length ω starting in t 0 .
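The iteration in this proof can be sketched on a toy orthogonal system; the rules a → b and f(b) → f(a) are hypothetical examples. Each round first performs a finite reduction below the root to expose a root redex and then contracts it, so the concatenation of the rounds is a destructive reduction of length ω:

```python
# Sketch of the iterated rounds T_i; the rules a -> b and f(b) -> f(a)
# are hypothetical examples.

def round_(term):
    """One round T_i : t_i ->* t_{i+1}, whose last step is at the root:
    f(a) -> f(b) (step at position 1), then f(b) -> f(a) (root step)."""
    assert term == ('f', 'a')
    return ('f', 'a'), [(1,), ()]

term, positions = ('f', 'a'), []
for _ in range(4):             # a finite prefix of the ω-concatenation
    term, trace = round_(term)
    positions += trace

# Every round contributes a root step, so in the limit the root position
# is volatile and the concatenated reduction of length ω is destructive.
assert positions.count(()) == 4
```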
The above proposition bridges the gap between fragility and root-activeness. Whereas the former concept is defined in terms of transfinite reductions, the latter is defined in terms of finite reductions. By Proposition 6.8, however, a fragile term is always finitely reducible to a redex. This is the key to the observation that fragility is not only quite similar to root-activeness but is, in fact, essentially the same concept. Proposition 6.9 (root-activeness = fragility). Let R be an orthogonal, left-finite TRS and t a total term in R. Then t is root-active iff t is fragile.
Proof. The "only if" direction is easy: If t is root-active, then there is a reduction S of length ω starting in t with infinitely many steps taking place at the root. Hence, S : t ։ p ω ⊥ is a destructive reduction, which makes t a fragile term.
For the converse direction we assume that t is fragile and show that, for each reduction t → * s, there is a reduction s → * t ′ to a redex t ′ . By Lemma 6.7, also s is fragile. Hence, there is a destructive reduction S : s ։ p ⊥ starting in s. According to Proposition 6.8, we can assume that S has length ω. Therefore, there is some n < ω such that S| n : s → * t ′ for a redex t ′ .
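On the same kind of toy system (the rules a → b and f(b) → f(a) are hypothetical examples), the finite characterisation used in this proof, namely that every reduct reduces to a redex, can be checked directly:

```python
# Illustrative check of root-activeness; the rules a -> b and
# f(b) -> f(a) are hypothetical examples.

REDUCTS = {('f', 'a'): ('f', 'b'),   # contract a -> b below the root
           ('f', 'b'): ('f', 'a')}   # contract the root redex f(b) -> f(a)

def is_redex(term):
    # instances of the left-hand sides f(b) and a
    return term in (('f', 'b'), 'a')

def reduces_to_redex(term, bound=5):
    """Finitely rewrite until a redex is reached (bounded search)."""
    for _ in range(bound):
        if is_redex(term):
            return True
        term = REDUCTS[term]
    return is_redex(term)

# Every reduct of f(a) finitely reduces to a redex, so f(a) is
# root-active; this matches its fragility, witnessed by the destructive
# reduction cycling f(a) -> f(b) -> f(a) -> ... with root steps forever.
assert all(reduces_to_redex(t) for t in REDUCTS)
```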
To prove the other direction of the equality of strong p-reachability and Böhm-reachability we need the property that strongly m-convergent reductions consisting only of → ⊥ -steps, i.e. contractions of RA ⊥ -terms to ⊥, can be compressed to length at most ω as well. In order to show this, we will make use of the following lemma from Kennaway et al. [17]: Lemma 6.10 (⊥, RA-instances). Let RA be the root-active terms of an orthogonal, left-finite TRS and t ∈ T ∞ (Σ ⊥ , V). If some ⊥, RA-instance of t is in RA, then every ⊥, RA-instance of t is.
Lemma 6.11 (compression of → ⊥ -steps). Consider the Böhm extension of an orthogonal TRS w.r.t. its root-active terms and S : s ։ m ⊥ t with s ∈ T ∞ (Σ, V), t ∈ T ∞ (Σ ⊥ , V). Then there is a strongly m-converging reduction T : s ։ m ⊥ t of length at most ω that is a complete development of a set of disjoint occurrences of root-active terms in s.
Proof. The proof is essentially the same as that of Lemma 7.2.4 from Ketema [18].
Let S = (t ι → πι t ι+1 ) ι<α be the mentioned reduction strongly m-converging to t α , and let π be a position at which some reduction step in S takes place. That is, there is some β such that π β = π. We will prove by induction on β that t 0 | π ∈ RA.
Consider the term t β | π . Since a → ⊥ -rule is applied here, we have, according to Remark 2.9, that t β | π ∈ RA ⊥ . Let V = P ⊥ (t β | π ). Hence, for each v ∈ V , there is some γ < β such that π γ = π · v. Therefore, we can apply the induction hypothesis and get that t 0 | π·v ∈ RA for all v ∈ V . It is clear that we can obtain t 0 | π from t β | π by replacing each ⊥-occurrence at v ∈ V with the corresponding term t 0 | π·v . That is, t 0 | π is a ⊥, RA-instance of t β | π . Because t β | π ∈ RA ⊥ , there is some ⊥, RA-instance of t β | π in RA. Thus, by Lemma 6.10, also t 0 | π is in RA. This closes the proof of the claim.

Now let V = P ⊥ (t α ). Clearly, all positions in V are pairwise disjoint. Moreover, for each v ∈ V , there is a step in S that takes place at v. Hence, by the claim shown above, V is a set of occurrences in t 0 of terms in RA. A complete development of V in t 0 leads to t α and can be performed in at most ω steps by an outermost reduction strategy.
The important part of the above lemma is the statement that only terms in RA are contracted instead of the general case where a → ⊥ -step contracts a term in RA ⊥ ⊃ RA.
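The outermost development at the end of the proof of Lemma 6.11 is straightforward to sketch: since the occurrences in V are pairwise disjoint, the → ⊥ -steps neither duplicate nor erase one another, so any enumeration of V yields a complete development in at most ω steps. In the sketch below, the symbol r stands in for an arbitrary root-active subterm (a hypothetical placeholder):

```python
# Sketch: a complete development of a set V of pairwise disjoint
# occurrences needs at most ω steps.  'r' is a hypothetical placeholder
# for a root-active subterm.

BOT = '⊥'

def replace(term, pos, sub):
    """Replace the subterm of `term` at position `pos` by `sub`."""
    if not pos:
        return sub
    head, args = term[0], list(term[1:])
    args[pos[0]] = replace(args[pos[0]], pos[1:], sub)
    return (head,) + tuple(args)

t = ('g', ('r',), ('f', ('r',)))   # two pairwise disjoint occurrences of r
V = [(0,), (1, 0)]                 # their positions (0-indexed arguments)
for v in V:                        # one -> ⊥ step per occurrence; since the
    t = replace(t, v, BOT)         # positions are disjoint, no step affects
                                   # the redexes still to be contracted
assert t == ('g', BOT, ('f', BOT))
```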
Finally, we have gathered all tools necessary in order to prove the converse direction of the equivalence of strong p-reachability and Böhm-reachability w.r.t. root-active terms. Theorem 6.12 (strong p-reachability = Böhm-reachability w.r.t. RA). Let R be an orthogonal, left-finite TRS and B the Böhm extension of R w.r.t. its root-active terms. Then s ։ p R t iff s ։ m B t.
Proof. The "only if" direction follows immediately from Proposition 6.9 and Proposition 6.3. Now consider the converse direction: Let s ։ m B t be a strongly m-convergent reduction in B. W.l.o.g. we assume s to be total. Due to Lemma 6.4, there is a term s ′ ∈ T ∞ (Σ, V) such that there are strongly m-convergent reductions S : s ։ m R s ′ and T : s ′ ։ m ⊥ t. By Lemma 6.11, we can assume that in s ′ ։ m ⊥ t only pairwise disjoint occurrences of root-active terms are contracted. By Proposition 6.9, each root-active term r ∈ RA is fragile, i.e. we have a destructive reduction r ։ p R ⊥ starting in r. Thus, following Remark 2.9, we can construct a strongly p-converging reduction T ′ : s ′ ։ p R t by replacing each step C[r] → ⊥ C[⊥] in T with the corresponding reduction C[r] ։ p R C[⊥]. By combining T ′ with the strongly m-converging reduction S, which, according to Theorem 4.12, is also strongly p-converging, we obtain the strongly p-converging reduction S · T ′ : s ։ p R t.

Corollaries.
With the equivalence of strong p-reachability and Böhm-reachability established in the previous section, strongly p-convergent reductions inherit a number of important properties that are enjoyed by Böhm-convergent reductions: Theorem 6.13 (infinitary confluence). Every orthogonal, left-finite TRS is infinitarily confluent. That is, for each orthogonal, left-finite TRS, s 1 և p t ։ p s 2 implies s 1 ։ p t ′ և p s 2 .
Proof. Leveraging Theorem 6.12, this theorem follows from Theorem 2.10.
Returning to Example 2.6 again, we can see that, in the setting of strongly p-converging reductions, the terms g ω and f ω can now be joined by repeatedly contracting the redex at the root, which yields two destructive reductions g ω ։ p ⊥ and f ω ։ p ⊥, respectively. Theorem 6.14 (infinitary normalisation). Every orthogonal, left-finite TRS is infinitarily normalising. That is, for each orthogonal, left-finite TRS R and a partial term t in R, there is an R-normal form strongly p-reachable from t.
Proof. This follows immediately from Theorem 6.12 and Theorem 2.11.
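The joining of f ω and g ω at ⊥ can be sketched concretely. We assume, purely for illustration, that Example 2.6 uses the rules f(x) → g(x) and g(x) → f(x); since contracting the root redex leaves the subterm f ω below unchanged, it suffices to track the root symbol of the infinite term:

```python
# Hedged sketch: the rules f(x) -> g(x) and g(x) -> f(x) are an assumed
# reading of Example 2.6.  Contracting the root redex of
# f^ω = f(f(f(...))) yields g(f^ω), whose root is again a redex, and so
# on; only the root symbol changes, the subterm f^ω below is unchanged.

RULES = {'f': 'g', 'g': 'f'}

def root_step(spine):
    """Contract the root redex of spine(f^ω)."""
    return RULES[spine]

for start in ('f', 'g'):           # the reductions f^ω ։p ⊥ and g^ω ։p ⊥
    sym, root_steps = start, 0
    for _ in range(8):             # a finite prefix of the ω-reduction
        sym = root_step(sym)
        root_steps += 1            # every step takes place at the root
    # the root is volatile, so both reductions are destructive and
    # strongly p-converge to ⊥: f^ω and g^ω are joined at ⊥
    assert root_steps == 8
```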
Combining Theorem 6.13 and Theorem 6.14, we obtain that each term in an orthogonal, left-finite TRS has a unique normal form w.r.t. strong p-convergence. Due to Theorem 6.12, this unique normal form is the Böhm tree w.r.t. root-active terms.
Since strongly p-converging reductions in orthogonal TRS can always be transformed such that they consist of a prefix which is a strongly m-convergent reduction and a suffix consisting of nested destructive reductions, we can employ the Compression Lemma for strongly m-convergent reductions (Theorem 2.4) and the Compression Lemma for destructive reductions (Proposition 6.8) to obtain the Compression Lemma for strongly p-convergent reductions: Theorem 6.15 (Compression Lemma for strongly p-convergent reductions). For each orthogonal, left-finite TRS, s ։ p t implies s ։ p ≤ω t.
Proof. Let s ։ p R t. According to Theorem 6.12, we have s ։ m B t for the Böhm extension B of R w.r.t. RA and, therefore, by Lemma 6.4, we have reductions S : s ։ m R s ′ and T : s ′ ։ m ⊥ t. Due to Theorem 2.4, we can assume S to be of length at most ω and, due to Theorem 4.12, to be strongly p-convergent, i.e. S : s ։ p ≤ω R s ′ . If T is the empty reduction, then we are done. If not, then T is a complete development of pairwise disjoint occurrences of root-active terms according to Lemma 6.11. Hence, each step is of the form C[r] → ⊥ C[⊥] for some root-active term r. By Proposition 6.9, for each such term r, there is a destructive reduction r ։ p R ⊥ which we can assume, in accordance with Proposition 6.8, to be of length ω. Hence, each step C[r] → ⊥ C[⊥] can be replaced by the reduction C[r] ։ p ω R C[⊥]. Concatenating these reductions results in a reduction T ′ : s ′ ։ p R t of length at most ω · ω. If S : s ։ p ≤ω R s ′ is of finite length, we can interleave the reduction steps in T ′ such that we obtain a reduction T ′′ : s ′ ։ p ω R t of length ω. Then we have S · T ′′ : s ։ p ω R t. If S : s ։ p ≤ω R s ′ has length ω, we construct a reduction s ։ p R t as follows: As illustrated above, T ′ consists of destructive reductions taking place at pairwise disjoint positions. These steps can be interleaved into the reduction S, resulting in a reduction s ։ p R t of length ω. The argument for this is similar to that employed in the successor case of the induction proof of the Compression Lemma of Kennaway et al. [16].

Conclusions. The partial order model is arguably the more natural model for transfinite reductions, as it does not need the cumbersomely defined "shortcuts" provided by → ⊥ -steps, which depend on allowing infinite left-hand sides in rewrite rules. Vice versa, destructive reductions in the partial order model provide a justification for admitting these shortcuts.
Related Work. This study of partial order convergence is inspired by Blom [6] who investigated strong partial order convergence in lambda calculus and compared it to strong metric convergence. Similarly to our findings for orthogonal term rewriting systems, Blom has shown for lambda calculus that reachability in the metric model coincides with reachability in the partial order model modulo equating so-called 0-undefined terms.
Also Corradini [7] studied a partial order model. However, he uses it to develop a theory of parallel reductions which allows simultaneous contraction of a set of mutually independent redexes of left-linear rules. To this end, Corradini defines the semantics of redex contraction in a non-standard way by allowing a partial matching of left-hand sides. Our definition of complete developments also provides, at least for orthogonal systems, a notion of parallel reductions but does so using the standard semantics of redex contraction.
Future Work. While we have studied both weak and strong p-convergence and have compared them to their respective metric counterparts, we have put the focus on strong p-convergence. It would be interesting to find out whether the shift to the partial order model has similar benefits for weak convergence, which is known to be rather unruly in the metric model [23]. A starting point in this direction would be to find correspondences between weak and strong p-convergence. For example, in the metric setting we have that s ֒→ m R t implies that there is some t ′ with s ։ m B t ′ and t ։ m B t ′ [14, Theorem 12.9.14]. If we had the analogous correspondence for p-convergence, we would immediately obtain infinitary normalisation and confluence for weak p-convergence.
Moreover, we have focused on orthogonal systems in this paper. It should be easy to generalise our results to almost orthogonal systems. The only difficulty is to deal with the ambiguity of paths when rules are allowed to overlay. This could be resolved by considering equivalence classes of paths instead. The move to weakly orthogonal systems is much more complicated: For strong m-convergence Endrullis et al. [11] have shown that weakly orthogonal systems do not even satisfy the infinitary unique normal form property (UN ∞ ), a property that orthogonal systems do enjoy [16]. Due to Theorem 4.12, this means that also in the setting of strong p-convergence, weakly orthogonal systems do not satisfy UN ∞ and are therefore not infinitarily confluent either! Endrullis et al. [11] have shown that this can be resolved in the metric setting by prohibiting collapsing rules. However, it is not clear whether this result can be transferred to the partial order setting.
Another interesting direction to follow is the ability to finitely simulate transfinite reductions by term graph rewriting. For strong m-convergence this is possible, at least to some extent [15]. We think that a different approach to term graph rewriting, viz. the double-pushout approach [10] or the equational approach [1], is more appropriate for the present setting of p-convergence [8,3].