Modes of Convergence for Term Graph Rewriting

Patrick Bahr

Term graph rewriting provides a simple mechanism to finitely represent restricted forms of infinitary term rewriting. The correspondence between infinitary term rewriting and term graph rewriting has been studied to some extent. However, this endeavour is impaired by the lack of an appropriate counterpart of infinitary rewriting on the side of term graphs. We aim to fill this gap by devising two modes of convergence based on a partial order respectively a metric on term graphs. The thus obtained structures generalise corresponding modes of convergence that are usually studied in infinitary term rewriting. We argue that this yields a common framework in which both term rewriting and term graph rewriting can be studied. In order to substantiate our claim, we compare convergence on term graphs and on terms. In particular, we show that the modes of convergence on term graphs are conservative extensions of the corresponding modes of convergence on terms and are preserved under unravelling term graphs to terms. Moreover, we show that many of the properties known from infinitary term rewriting are preserved. This includes the intrinsic completeness of both modes of convergence and the fact that convergence via the partial order is a conservative extension of the metric convergence.


Introduction
Non-terminating computations are not necessarily undesirable. For instance, the termination of a reactive system would usually be considered a critical failure. Even computations that, given an input x, should produce an output y are not necessarily terminating in nature. For example, the various iterative approximation algorithms for π produce approximations of increasing accuracy without ever terminating with the exact value of π. While such iterative approximation computations might never reach the exact target value, they are able to come arbitrarily close to it within finite time.
It is this kind of non-terminating computation that is the subject of infinitary term rewriting [24]. It extends the theory of term rewriting by giving a meaning to transfinite reductions instead of dismissing them as undesired and meaningless artifacts. Following the paradigm of iterative approximations, the result of a transfinite reduction is simply the term that is approximated by the reduction. In general, such a result term can be infinite. For example, starting from the term rep(0), the rewrite rule rep(x) → x :: rep(x) produces a reduction rep(0) → 0 :: rep(0) → 0 :: 0 :: rep(0) → 0 :: 0 :: 0 :: rep(0) → . . . that approximates the infinite term 0 :: 0 :: 0 :: . . . . Here, we use :: as a binary symbol that we write infix and assume to associate to the right. That is, the term 0 :: 0 :: rep(0) is parenthesised as 0 :: (0 :: rep(0)). Think of the :: symbol as the list constructor cons.
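The rule rep(x) → x :: rep(x) can be mimicked directly as a lazy Haskell definition. The following is only an illustrative sketch (the names rep and prefix are ours, not part of any calculus discussed here):

```haskell
-- The rewrite rule rep(x) -> x :: rep(x), read as a lazy function definition.
-- Each demanded list cell corresponds to one step of the reduction
-- rep(0) -> 0 :: rep(0) -> 0 :: 0 :: rep(0) -> ...
rep :: a -> [a]
rep x = x : rep x

-- Only finitely many rewrite steps are forced when we inspect a finite
-- prefix of the "limit term" 0 :: 0 :: 0 :: ...
prefix :: [Int]
prefix = take 4 (rep 0)
```

Evaluating prefix forces four unfoldings of the rule and yields [0,0,0,0]; the full limit term is never materialised.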
Term graphs, on the other hand, allow us to explicitly represent and reason about sharing and recursion [3] by dropping the restriction to a tree structure, which we have for terms. Apart from that, term graphs also provide a finite representation of certain infinite terms, viz. rational terms. As Kennaway et al. [23,26] have shown, this can be leveraged in order to finitely represent restricted forms of infinitary term rewriting using term graph rewriting.
In this paper, we extend the theory of infinitary term rewriting to the setting of term graphs. To this end, we devise modes of convergence that constrain reductions of transfinite length in a meaningful way. Our approach to convergence is twofold: we generalise the metric on terms that is used to define convergence for infinitary term rewriting [14] to term graphs. In a similar way, we generalise the partial order on terms that has recently been used to define a closely related notion of convergence for infinitary term rewriting [7]. The use of two different, but on terms closely related, approaches to convergence will allow us both to assess the appropriateness of the resulting infinitary calculi and to compare them against the corresponding infinitary calculi of term rewriting.

Motivation.
1.1.1. Lazy Evaluation. Term rewriting is a useful formalism for studying declarative programs, in particular, functional programs. A functional program essentially consists of functions defined by a set of equations and an expression that is supposed to be evaluated according to these equations. The conceptual process of evaluating an expression is nothing else than term rewriting.
A particularly interesting feature of modern functional programming languages, such as Haskell [29], is the ability to use conceptually infinite computations and data structures. For example, the following definition of a function from constructs for each number n the infinite list of consecutive numbers starting from n: from(n) = n :: from(s(n)). Here, we use the binary infix symbol :: to denote the list constructor cons and s for the successor function. While we cannot use the infinite list generated by from directly (the evaluation of an expression of the form from(n) does not terminate), we can use it in a setting in which we only read a finite prefix of the infinite list conceptually defined by from. Functional languages such as Haskell allow this use of semantically infinite data structures through a non-strict evaluation strategy, which delays the evaluation of a subexpression until its result is actually required for further evaluation of the expression. This non-strict semantics is not only a conceptual neatness but in fact one of the major features that make functional programs highly modular [18].
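In Haskell, this definition can be written down almost verbatim. In the sketch below, the successor s(n) is modelled as n + 1 (an encoding choice of ours) to keep the example concrete:

```haskell
-- from(n) = n :: from(s(n)), with the successor symbol s modelled as (+ 1).
from :: Int -> [Int]
from n = n : from (n + 1)

-- Non-strict evaluation lets us read a finite prefix of the conceptually
-- infinite list; only the demanded cells are ever computed.
firstFive :: [Int]
firstFive = take 5 (from 0)
```

Here firstFive evaluates to [0,1,2,3,4] even though from 0 denotes an infinite list.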
Infinitary term rewriting [24] provides a notion of convergence that may assign a meaningful result term to such an infinite reduction, provided one exists. For example, the rule from(n) → n :: from(s(n)) yields an infinite reduction from(0) → 0 :: from(s(0)) → 0 :: s(0) :: from(s(s(0))) → . . . In this sense, this reduction converges to the infinite term 0 :: s(0) :: s(s(0)) :: . . . , which represents the infinite list of numbers 0, 1, 2, . . . . Due to this extension of term rewriting with explicit limit constructions for non-terminating reductions, infinitary term rewriting allows us to directly reason about non-terminating functions and infinite data structures.
Non-strict evaluation is rarely found unescorted, though. Usually, it is implemented as lazy evaluation [17], which complements a non-strict evaluation strategy with sharing. The latter avoids duplication of subexpressions by using pointers instead of copying. For example, the function from above duplicates its argument n, which occurs twice on the right-hand side of the defining equation. A lazy evaluator simulates this duplication by inserting two pointers pointing to the actual argument. Sharing is a natural companion for non-strict evaluation as it avoids re-evaluation of expressions that are duplicated before they are evaluated.
The underlying formalism that is typically used to obtain sharing for functional programming languages is term graph rewriting [30,31]. Term graph rewriting [11,32] uses graphs to represent terms, thus allowing multiple arcs to point to the same node. For example, term graphs allow us to change the representation of the term rewrite rule defining the function from: in the tree representation of its right-hand side x :: from(s(x)), the variable x occurs twice, whereas the graph representation shares the variable x by having two arcs pointing to a single x-node. While infinitary term rewriting is used to model the non-strictness of lazy evaluation, term graph rewriting models the sharing part of it. By endowing term graph rewriting with a notion of convergence, we aim to unify the two formalisms into one calculus, thus allowing us to model both aspects within the same calculus.
1.1.2. Rational Terms. Term graphs can do more than only share common subexpressions. Through cycles, term graphs may also provide a finite representation of certain infinite terms, so-called rational terms. For example, the infinite term 0 :: 0 :: 0 :: . . . can be represented as a finite cyclic term graph consisting of a single ::-node whose first successor is a 0-node and whose second successor is the ::-node itself. Since a single node on a cycle in a term graph represents infinitely many corresponding subterms, the contraction of a single term graph redex may correspond to a transfinite term reduction that contracts infinitely many term redexes. For example, if we apply the rewrite rule 0 → s(0) to the above term graph, we obtain a term graph that represents the term s(0) :: s(0) :: s(0) :: . . . , which can only be obtained from the term 0 :: 0 :: 0 :: . . . via a transfinite term reduction with the rule 0 → s(0). Kennaway et al. [26] investigated this correspondence between cyclic term graph rewriting and infinitary term rewriting. Among other results, they characterise a subset of transfinite term reductions, called rational reductions, that can be simulated by a corresponding finite term graph reduction. The above reduction from the term 0 :: 0 :: 0 :: . . . is an example of such a rational reduction.
With the help of a unified formalism for infinitary and term graph rewriting, it should be easier to study the correspondence between infinitary term rewriting and finitary term graph rewriting further. The move from an infinitary term rewriting system to a term graph rewriting system only amounts to a change in the degree of sharing if we use infinitary term graph rewriting as a common framework.
Reconsider the term rewrite rule rep(x) → x :: rep(x), which defines a function rep that repeats its argument infinitely often, and the infinite reduction rep(0) → 0 :: rep(0) → 0 :: 0 :: rep(0) → . . . that approximates the infinite term 0 :: 0 :: 0 :: . . . . This reduction happens not to be a rational reduction in the sense of Kennaway et al. [26].
The move from the term rule rep(x) → x :: rep(x) to a term graph rule is a simple matter of introducing sharing of common subexpressions: instead of creating a fresh copy of the redex on the right-hand side, the redex is reused by placing an edge from the right-hand side of the rule back to its left-hand side. This allows us to represent the infinite reduction approximating the infinite term 0 :: 0 :: 0 :: . . . by a single term graph reduction step induced by this term graph rule: applied to (the term graph representing) rep(0), it yields a cyclic term graph consisting of a ::-node whose first successor is a 0-node and whose second successor is the ::-node itself. Via its cyclic structure, the resulting term graph represents the infinite term 0 :: 0 :: 0 :: . . . . Since both transfinite term reductions and the corresponding finite term graph reductions can be treated within the same formalism, we hope to provide a tool for studying the ability of cyclic term graph rewriting to finitely represent transfinite term reductions.
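In a lazy language, the effect of the cyclic term graph can be sketched by "tying the knot": the definition below builds (in GHC) a single cons cell whose tail is the cell itself, an in-memory analogue of the cyclic term graph for 0 :: 0 :: 0 :: . . . . This is an analogy of ours, not the paper's term graph calculus:

```haskell
-- A cyclic structure: zeros is one cons cell pointing back to itself,
-- finitely representing the rational term 0 :: 0 :: 0 :: ...
zeros :: [Int]
zeros = 0 : zeros

-- Any finite observation unrolls the cycle on demand.
observed :: [Int]
observed = take 6 zeros
```

In contrast to rep 0, which allocates a fresh cell per unfolding, zeros occupies constant space: the "infinitely many subterms" are all represented by the same node, just as a single node on a cycle does in a term graph.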

Contributions & Related Work.
1.2.1. Contributions. The main contributions of this paper are the following: (i) We devise a partial order on term graphs based on a restricted class of graph homomorphisms. We show that this partial order forms a complete semilattice and is thus technically suitable for defining a notion of convergence (Theorem 5.15). Moreover, we illustrate alternative partial orders and show why they are not suitable for formalising convergence on term graphs. (ii) Independently, we devise a metric on term graphs and show that it forms a complete ultrametric space on term graphs (Theorem 7.4). (iii) Based on the partial order respectively the metric, we define a notion of weak convergence for infinitary term graph rewriting. We show that, similar to the term rewriting case [7], the metric calculus of infinitary term graph rewriting is the total fragment of the partial order calculus of infinitary term graph rewriting (Theorem 8.10). (iv) We confirm that the partial order and the metric on term graphs generalise the partial order respectively the metric that is used for infinitary term rewriting (Propositions 5.19 and 6.16). Moreover, we show that the corresponding notions of convergence are preserved by unravelling term graphs to terms, thus establishing the soundness of our notions of convergence on term graphs w.r.t. the convergence on terms (Theorems 9.9 and 9.11). (v) We substantiate the appropriateness of our calculi by a number of examples that illustrate how increasing the sharing gradually reduces the number of steps necessary to reach the final result, eventually from an infinite number of steps to a finite number (Sections 8 and 9).

Related Work.
Calculi with explicit sharing and/or recursion, e.g. via letrec, can also be considered as a form of term graph rewriting. Ariola and Klop [3] recognised that adding such an explicit recursion mechanism to the lambda calculus may break confluence. In order to remedy this, Ariola and Blom [2,1] developed a notion of skew confluence that allows them to define an infinite normal form in the vein of Böhm trees.
Recently, we have investigated other notions of convergence for term graph rewriting [10,9] that use simpler variants of the partial order and the metric that we use in this paper. Both of them have theoretically pleasing properties, e.g. the ideal completion and the metric completion of the set of finite term graphs both yield the set of all term graphs. However, the resulting notions of weak convergence are not fully satisfying and in fact counterintuitive for some cases. We will discuss this alternative approach and compare it to the present approach in more detail in Sections 5 and 6.
1.3. Overview. The structure of this paper is as follows: in Section 2, we provide the necessary background for metric spaces, partially ordered sets and term rewriting. In Section 3, we give an overview of infinitary term rewriting. Section 4 provides the necessary theory for graphs and term graphs. Sections 5 and 6 form the core of this paper. In these sections we study the partial order and the metric on term graphs that are the basis for the modes of convergence we propose in this paper. In Section 7, we then compare the two resulting modes of convergence. Moreover, in Section 8, we use these two modes of convergence to study two corresponding infinitary term graph rewriting calculi. In Section 9, we study the correspondences between infinitary term graph rewriting and infinitary term rewriting.
Some proofs have been omitted from the main body of the text. These proofs can be found in the appendix of this paper.

Preliminaries
We assume the reader to be familiar with the basic theory of ordinal numbers, orders and topological spaces [21], as well as term rewriting [34]. In order to make this paper self-contained, we briefly recall all necessary preliminaries.

Sequences.
We use the von Neumann definition of ordinal numbers. That is, an ordinal number (or simply ordinal) α is the set of all ordinal numbers strictly smaller than α.
In particular, each natural number n ∈ N is an ordinal number with n = {0, 1, . . . , n − 1}. The least infinite ordinal number is denoted by ω and is the set of all natural numbers. Ordinal numbers will be denoted by lower case Greek letters α, β, γ, λ, ι.
A sequence S of length α in a set A, written (a ι ) ι<α , is a function from α to A with ι ↦ a ι for all ι ∈ α. We use |S| to denote the length α of S. If α is a limit ordinal, then S is called open. Otherwise, it is called closed. If α is a finite ordinal, then S is called finite. Otherwise, it is called infinite. For a finite sequence (a i ) i<n we also use the notation ⟨a 0 , a 1 , . . . , a n−1 ⟩. In particular, ⟨⟩ denotes the empty sequence. We write A * for the set of all finite sequences in A.
The concatenation (a ι ) ι<α · (b ι ) ι<β of two sequences is the sequence (c ι ) ι<α+β with c ι = a ι for ι < α and c α+ι = b ι for ι < β.

Let (a ι ) ι<α be a sequence in a metric space (M, d). The sequence (a ι ) ι<α converges to an element a ∈ M , written lim ι→α a ι = a, if, for each ε ∈ R + , there is a β < α such that d(a, a ι ) < ε for every β < ι < α; (a ι ) ι<α is continuous if lim ι→λ a ι = a λ for each limit ordinal λ < α. Intuitively speaking, (a ι ) ι<α converges to a if the metric distance between the elements a ι of the sequence and a tends to 0 as the index ι approaches α, i.e. they approximate a arbitrarily well. Accordingly, (a ι ) ι<α is continuous if it does not leap to a distant object at limit ordinal indices.
The sequence (a ι ) ι<α is called Cauchy if, for any ε ∈ R + , there is a β < α such that, for all β < ι < ι ′ < α, we have that d(a ι , a ι ′ ) < ε. That is, the elements a ι of the sequence move closer and closer to each other as the index ι approaches α.
A metric space is called complete if each of its non-empty Cauchy sequences converges. That is, whenever the elements a ι of a sequence move closer and closer together, they in fact approximate an existing object of the metric space, viz. lim ι→α a ι .

Partial Orders.
A partial order ≤ on a set A is a binary relation on A such that x ≤ y, y ≤ z implies x ≤ z (transitivity); x ≤ x (reflexivity); and x ≤ y, y ≤ x implies x = y (antisymmetry) for all x, y, z ∈ A. The pair (A, ≤) is then called a partially ordered set. A subset D of the underlying set A is called directed if it is non-empty and each pair of elements in D has an upper bound in D. A partially ordered set (A, ≤) is called a complete partial order (cpo) if it has a least element and each directed set D has a least upper bound (lub) ⊔D. A cpo (A, ≤) is called a complete semilattice if every non-empty set B has a greatest lower bound (glb) ⊓B. In particular, this means that, in a complete semilattice, the limit inferior of any sequence (a ι ) ι<α , defined by lim inf ι→α a ι = ⊔ β<α (⊓ β≤ι<α a ι ), always exists.
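To see the limit inferior at work, here is a small worked example (ours, not from the original text): in the flat complete semilattice with least element ⊥ below two incomparable elements a and b, the alternating sequence a, b, a, b, . . . has limit inferior ⊥, since every tail contains both a and b:

```latex
% Flat semilattice: \bot \le a and \bot \le b, with a, b incomparable.
% For s_i = a (i even) and s_i = b (i odd):
\liminf_{i \to \omega} s_i
  \;=\; \bigsqcup_{j < \omega} \Bigl( \bigwedge_{j \le i < \omega} s_i \Bigr)
  \;=\; \bigsqcup_{j < \omega} \bot
  \;=\; \bot
% since \bigwedge \{a, b\} = \bot for every tail \{s_i \mid i \ge j\}.
```

No stable information ever emerges in this sequence, so its limit inferior is the least element.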

There is also an alternative characterisation of complete semilattices: a partially ordered set (A, ≤) is called bounded complete if each set B ⊆ A that has an upper bound in A also has a least upper bound in A. Two elements a, b ∈ A are called compatible if they have a common upper bound, i.e. there is some c ∈ A with a, b ≤ c.

Proposition 2.1 (bounded complete cpo = complete semilattice, [19]). Given a cpo (A, ≤), the following are equivalent: (i) (A, ≤) is a complete semilattice; (ii) (A, ≤) is bounded complete.

Given two partially ordered sets (A, ≤ A ) and (B, ≤ B ), a function φ : A → B is called monotonic if a ≤ A b implies φ(a) ≤ B φ(b).

2.4. Terms. Since we are interested in the infinitary calculus of term rewriting, we consider the set T ∞ (Σ) of infinitary terms (or simply terms) over some signature Σ. A signature Σ is a countable set of symbols such that each symbol f ∈ Σ is associated with an arity ar(f ) ∈ N, and we write Σ (n) for the set of symbols in Σ that have arity n. The set T ∞ (Σ) is defined as the greatest set T such that t ∈ T implies t = f (t 1 , . . . , t k ) for some f ∈ Σ (k) and t 1 , . . . , t k ∈ T . For each constant symbol c ∈ Σ (0) , we write c for the term c(). For a term t ∈ T ∞ (Σ) we use the notation P(t) to denote the set of positions in t. P(t) is the least subset of N * such that ⟨⟩ ∈ P(t) and i · π ∈ P(t) if t = f (t 0 , . . . , t k−1 ) with 0 ≤ i < k and π ∈ P(t i ). For terms s, t ∈ T ∞ (Σ) and a position π ∈ P(t), we write t| π for the subterm of t at π, t(π) for the function symbol in t at π, and t[s] π for the term t with the subterm at π replaced by s. As positions are sequences, we use the prefix order ≤ defined on them. A position is also called an occurrence if the focus lies on the subterm at that position rather than the position itself. The set T (Σ) of finite terms is the subset of T ∞ (Σ) that contains all terms with a finite set of positions.
On T ∞ (Σ) a similarity measure sim(·, ·) : T ∞ (Σ) × T ∞ (Σ) → ω + 1 can be defined by sim(s, t) = min ({|π| | π ∈ P(s) ∩ P(t), s(π) ≠ t(π)} ∪ {ω}). That is, sim(s, t) is the minimal depth at which s and t differ, respectively ω if s = t. Based on this similarity measure, a distance function d is defined by d(s, t) = 2 −sim(s,t) , where we interpret 2 −ω as 0. The pair (T ∞ (Σ), d) is known to form a complete ultrametric space [4]. Partial terms, i.e. terms over the signature Σ ⊥ = Σ ⊎ {⊥} with ⊥ a fresh nullary symbol, can be endowed with a binary relation ≤ ⊥ by defining s ≤ ⊥ t iff s can be obtained from t by replacing some subterm occurrences in t by ⊥. Interpreting the term ⊥ as denoting "undefined", ≤ ⊥ can be read as "is less defined than". The pair (T ∞ (Σ ⊥ ), ≤ ⊥ ) is known to form a complete semilattice [16]. To explicitly distinguish them from partial terms, we call terms in T ∞ (Σ) total.
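On finite terms, the similarity measure and the induced distance can be sketched as follows. The Term type and the use of Maybe Int (with Nothing standing in for ω) are our encoding choices; the paper's terms are in general infinite, which this finite sketch deliberately ignores:

```haskell
-- Finite first-order terms: a symbol together with its argument list.
data Term = Term String [Term] deriving (Eq, Show)

-- sim s t: the minimal depth at which s and t differ;
-- Nothing plays the role of omega for equal terms.
sim :: Term -> Term -> Maybe Int
sim (Term f ss) (Term g ts)
  | f /= g || length ss /= length ts = Just 0     -- terms differ at the root
  | otherwise =
      case [ d | Just d <- zipWith sim ss ts ] of -- depths of differing subterms
        [] -> Nothing                             -- no difference anywhere: s = t
        ds -> Just (1 + minimum ds)               -- shallowest difference

-- d(s, t) = 2^(-sim(s, t)), reading 2^(-omega) as 0.
dist :: Term -> Term -> Double
dist s t = maybe 0 (\d -> 2 ** negate (fromIntegral d)) (sim s t)
```

For instance, f(a, b) and f(a, c) agree at the root and first argument but differ at depth 1, so their distance is 2^(-1) = 0.5.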
2.5. Term Rewriting Systems. For term rewriting systems, we have to consider terms with variables. To this end, we assume a countably infinite set V of variables and extend a signature Σ to a signature Σ V = Σ ⊎ V with variables in V as nullary symbols. Instead of T ∞ (Σ V ) we also write T ∞ (Σ, V). A term rewriting system (TRS) R is a pair (Σ, R) consisting of a signature Σ and a set R of term rewrite rules of the form l → r with l ∈ T ∞ (Σ, V) \ V and r ∈ T ∞ (Σ, V) such that all variables occurring in r also occur in l. Note that both the left- and the right-hand side may be infinite. We usually use x, y, z and primed respectively indexed variants thereof to denote variables in V. A substitution σ is a mapping from V to T ∞ (Σ, V). Such a substitution σ can be uniquely lifted to a homomorphism from T ∞ (Σ, V) to T ∞ (Σ, V), which we also denote by σ, by setting σ(f (t 1 , . . . , t k )) = f (σ(t 1 ), . . . , σ(t k )) for each f ∈ Σ (k) . As in the finitary setting, every TRS R defines a rewrite relation → R : s → R t iff there are a position π ∈ P(s), a rule ρ : l → r in R, and a substitution σ such that s| π = lσ and t = s[rσ] π . Instead of s → R t, we sometimes write s → π,ρ t in order to indicate the applied rule ρ and the position π, or simply s → t. The subterm s| π is called a ρ-redex or simply redex, rσ its contractum, and s| π is said to be contracted to rσ.
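A single rewrite step s → π,ρ t can be sketched for finite terms as follows. The representation, including naive left-linear matching and positions as lists of child indices, is our own simplification:

```haskell
-- Finite terms with variables.
data Term = Var String | Fun String [Term] deriving (Eq, Show)

-- Naive matching of a left-hand side against a subterm (assumes left-linear
-- rules; repeated variables would need an extra consistency check).
match :: Term -> Term -> Maybe [(String, Term)]
match (Var x)    t = Just [(x, t)]
match (Fun f ss) (Fun g ts)
  | f == g && length ss == length ts
  = concat <$> sequence (zipWith match ss ts)
match _ _ = Nothing

-- Apply a substitution, i.e. lift sigma homomorphically to terms.
subst :: [(String, Term)] -> Term -> Term
subst s (Var x)    = maybe (Var x) id (lookup x s)
subst s (Fun f ts) = Fun f (map (subst s) ts)

-- Contract the redex at position pi with rule l -> r, producing s[r sigma]_pi.
rewriteAt :: (Term, Term) -> [Int] -> Term -> Maybe Term
rewriteAt (l, r) [] t = (\s -> subst s r) <$> match l t
rewriteAt rule (i : p) (Fun f ts)
  | 0 <= i && i < length ts
  = (\t' -> Fun f (take i ts ++ t' : drop (i + 1) ts)) <$> rewriteAt rule p (ts !! i)
rewriteAt _ _ _ = Nothing
```

With the rule rep(x) → x :: rep(x), applying rewriteAt at the empty position to rep(0) yields 0 :: rep(0), and applying it at position ⟨1⟩ of the result performs the next step of the reduction from Section 1.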

Infinitary Term Rewriting
Before pondering over the right approach to an infinitary calculus of term graph rewriting, we want to provide a brief overview of infinitary term rewriting [24,7,13]. This should give an insight into the different approaches to dealing with infinite reductions. However, in contrast to the majority of the literature on infinitary term rewriting, which is concerned with strong convergence [24,27], we will only consider weak notions of convergence in this paper; cf. [14,20,33]. This weak form of convergence, also called Cauchy convergence, is entirely based on the sequence of objects produced by rewriting without considering how the rewrite rules are applied.
A (transfinite) reduction in a term rewriting system R is a sequence S = (t ι → R t ι+1 ) ι<α of rewrite steps in R. Note that the underlying sequence of terms (t ι ) ι<ᾱ has length ᾱ, where ᾱ = α if S is open, and ᾱ = α + 1 if S is closed. The reduction S is called m-continuous in R, written S : t 0 ↪ m R . . . , if the sequence of terms (t ι ) ι<ᾱ is continuous in (T ∞ (Σ), d), i.e. lim ι→λ t ι = t λ for each limit ordinal λ < α. The reduction S is said to m-converge to a term t in R, written S : t 0 ↪ m R t, if it is m-continuous and lim ι→ᾱ t ι = t.

Example 3.1. Consider the term rewriting system R containing the rule ρ 1 : a :: x → b :: a :: x. By repeatedly applying ρ 1 , we obtain the infinite reduction S : a :: c → b :: a :: c → b :: b :: a :: c → . . .
The position at which two consecutive terms differ moves deeper and deeper during the reduction S, i.e. the d-distance between them tends to 0. Hence, S m-converges to the infinite term s = b :: b :: b :: . . . , i.e. S : a :: c ↪ m s. Now consider a TRS with the slightly different rule ρ 2 : a :: x → a :: b :: x. This TRS yields a reduction S ′ : a :: c → a :: b :: c → a :: b :: b :: c → . . . Even though the rule ρ 2 is applied at the root of the term in each step of S ′ , the d-distance between two consecutive terms tends to 0 again. The reduction S ′ m-converges to the infinite term s ′ = a :: b :: b :: . . . , i.e. S ′ : a :: c ↪ m s ′ .
In contrast to the weak m-convergence that we consider here, strong m-convergence [24,27] additionally requires that the depth of the contracted redexes tends to infinity as the reduction approaches a limit ordinal. Concerning Example 3.1 above, we have for instance that S also strongly m-converges (the rule is applied at increasingly deep redexes), whereas S ′ does not strongly m-converge (each step in S ′ results from a contraction at the root).
In the partial order model of infinitary rewriting [7], convergence is defined via the limit inferior in the complete semilattice (T ∞ (Σ ⊥ ), ≤ ⊥ ). Given a TRS R = (Σ, R), we extend it to R ⊥ = (Σ ⊥ , R) by adding the fresh constant symbol ⊥, such that reductions may contain partial terms, i.e. terms in T ∞ (Σ ⊥ ). The distinguishing feature of the partial order approach is that each continuous reduction also converges due to the semilattice structure of partial terms. Moreover, p-convergence provides a conservative extension to m-convergence that allows rewriting modulo meaningless terms [7] by essentially mapping those parts of the reduction to ⊥ that are divergent according to the metric mode of convergence.
Intuitively, the limit inferior in (T ∞ (Σ ⊥ ), ≤ ⊥ ), and thus p-convergence, describes an approximation process that accumulates each piece of information that remains stable from some point onwards. This is based on the ability of the partial order ≤ ⊥ to capture a notion of information preservation: s ≤ ⊥ t iff t contains at least the same information as s does, but potentially more. A monotonic sequence of terms t 0 ≤ ⊥ t 1 ≤ ⊥ . . . thus approximates the information contained in ⊔ i<ω t i . Given this reading of ≤ ⊥ , the glb ⊓T of a set of terms T captures the common (non-contradicting) information of the terms in T . Leveraging this observation, a sequence (s i ) i<ω that is not necessarily monotonic can be turned into a monotonic sequence by taking t j = ⊓ j≤i<ω s i , such that each t j contains exactly the information that remains stable in (s i ) i<ω from j onwards. Hence, the limit inferior lim inf i→ω s i = ⊔ j<ω (⊓ j≤i<ω s i ) is the term that contains the accumulated information that eventually remains stable in (s i ) i<ω . This is expressed as an approximation of the monotonically increasing information that remains stable from some point on.
Example 3.2. Reconsider the system from Example 3.1. The reduction S also p-converges to s. This can be seen by forming the sequence (⊓ j≤i<ω s i ) j<ω of stable information of the underlying sequence (s i ) i<ω of terms in S: its elements are ⊥ :: ⊥, b :: ⊥ :: ⊥, b :: b :: ⊥ :: ⊥, . . . , a monotonic sequence whose lub is s = b :: b :: b :: . . . . By contrast, in a reduction T whose steps keep changing every list element indefinitely, only the list structure itself remains stable: the sequence of stable information then approximates the term t = ⊥ :: ⊥ :: ⊥ :: . . . . Hence, T p-converges to t.
Note that in both the metric and the partial order setting continuity is simply the convergence of every proper prefix: a reduction S = (t ι → t ι+1 ) ι<α is m-continuous (respectively p-continuous) iff every proper prefix S| β m-converges (respectively p-converges) to t β .
In order to define p-convergence, we had to extend terms with partiality. However, apart from this extension, m- and p-convergence coincide. To describe this more precisely we use the following terms: a reduction S : s ↪ p . . . is p-continuous in T ∞ (Σ) iff each term in S is total, i.e. in T ∞ (Σ); a reduction S : s ↪ p t is called p-convergent in T ∞ (Σ) iff t and each term in S is total. We then have the following theorem ([5]): for every reduction S in a TRS, S is p-continuous in T ∞ (Σ) iff S is m-continuous, and S p-converges to a term t in T ∞ (Σ) iff S m-converges to t.

Example 3.2 illustrates the correspondence between p- and m-convergence: the reduction S p-converges in T ∞ (Σ) and m-converges, whereas the reduction T p-converges but not in T ∞ (Σ) and thus does not m-converge.

Kennaway [22] and Bahr [6] investigated abstract models of infinitary rewriting based on metric spaces respectively partially ordered sets. We shall take these abstract models as a basis to formulate a theory of infinitary term graph reductions. The key question that we have to address is what an appropriate metric space respectively partial order on term graphs looks like.

Graphs & Term Graphs
This section provides the basic notions for term graphs and, more generally, for graphs. Terms over a signature, say Σ, can be thought of as rooted trees whose nodes are labelled with symbols from Σ. Moreover, in these trees a node labelled with a k-ary symbol is restricted to have out-degree k and the outgoing edges are ordered. In this way, the i-th successor of a node labelled with a symbol f is interpreted as the root node of the subtree that represents the i-th argument of f . For example, consider the term f (a, h(a, b)). The corresponding representation as a tree is shown in Figure 1a.

Figure 1. Tree representation of a term and generalisation to (term) graphs. (Figure not reproduced; panel (d) shows the sub-term graph of g.)
In term graphs, the restriction to a tree structure is abolished. The corresponding notion of term graphs we are using is taken from Barendregt et al. [11]. We begin by defining the underlying notion of graphs.

Definition 4.1 (graphs). Let Σ be a signature. A graph over Σ is a tuple g = (N, lab, suc) consisting of a set N (of nodes), a labelling function lab : N → Σ, and a successor function suc : N → N * such that |suc(n)| = ar(lab(n)) for each node n ∈ N , i.e. a node labelled with a k-ary symbol has precisely k successors. The graph g is called finite whenever the underlying set N of nodes is finite. If suc(n) = ⟨n 0 , . . . , n k−1 ⟩, then we write suc i (n) for n i . Moreover, we use the abbreviation ar g (n) for the arity ar(lab(n)) of n.

Definition 4.2 (paths, reachability). Let g be a graph over Σ and n, m nodes in g. (i) A path in g from n to m is a finite sequence π ∈ N * such that either − π is empty and n = m, or − π = i · π ′ with 0 ≤ i < ar g (n) and the suffix π ′ is a path in g from suc i (n) to m. (ii) If there exists a path from n to m in g, we say that m is reachable from n in g.
Since paths are sequences, we may use the prefix order on sequences for paths as well. That is, we write π 1 ≤ π 2 (respectively π 1 < π 2 ) if there is a (non-empty) path π 3 with π 1 · π 3 = π 2 .

Definition 4.4 (term graphs). Given a signature Σ, a term graph g over Σ is a tuple (N, lab, suc, r) consisting of an underlying graph (N, lab, suc) over Σ whose nodes are all reachable from the root node r ∈ N . The term graph g is called finite if the underlying graph is finite, i.e. the set N of nodes is finite. The class of all term graphs over Σ is denoted G ∞ (Σ); the class of all finite term graphs over Σ is denoted G(Σ). We use the notation N g , lab g , suc g and r g to refer to the respective components N , lab, suc and r of g. In analogy to subterms, term graphs have sub-term graphs. Given a graph or a term graph h and a node n in h, we write h| n to denote the sub-term graph of h rooted in n.

Example 4.5. Let Σ = {f /2, h/2, c/0} be a signature. The term graph g over Σ, depicted in Figure 1c, is given by the quadruple (N, lab, suc, r), where N = {r, n 1 , n 2 , n 3 }, suc(r) = ⟨n 1 , n 2 ⟩, suc(n 1 ) = ⟨n 1 , n 3 ⟩, suc(n 2 ) = ⟨n 1 , n 3 ⟩, suc(n 3 ) = ⟨⟩ and lab(r) = lab(n 1 ) = f , lab(n 2 ) = h, lab(n 3 ) = c. Figure 1d depicts the sub-term graph g| n 2 of g.
Paths in a graph are not absolute but relative to a starting node. In term graphs, however, we have a distinguished root node from which each node is reachable. Paths relative to the root node are central for dealing with term graphs: Definition 4.6 (positions, depth, cyclicity, trees). Let g ∈ G ∞ (Σ) and n ∈ N g .
(i) A position of n in g is a path in the underlying graph of g from r g to n. The set of all positions in g is denoted P(g); the set of all positions of n in g is denoted P g (n). (ii) The depth of n in g, denoted depth g (n), is the minimum of the lengths of the positions of n in g, i.e. depth g (n) = min {|π| | π ∈ P g (n)}. (iii) For a position π ∈ P(g), we write node g (π) for the unique node n ∈ N g with π ∈ P g (n) and g(π) for its symbol lab g (n). (iv) A position π ∈ P(g) is called cyclic if there are paths π 1 < π 2 ≤ π with node g (π 1 ) = node g (π 2 ), i.e. π passes a node twice. The non-empty path π ′ with π 1 · π ′ = π 2 is then called a cycle of node g (π 1 ). A position that is not cyclic is called acyclic. If g has a cyclic position, g is called cyclic; otherwise g is called acyclic. (v) The term graph g is called a term tree if each node in g has exactly one position.
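The depth of a node can be computed by a breadth-first traversal from the root. The encoding below (nodes as Ints, the root fixed as node 0) is our own sketch, not the paper's formalism; the example graph mirrors the term graph g of Example 4.5 with r, n1, n2, n3 encoded as 0, 1, 2, 3:

```haskell
import qualified Data.Map as Map

-- A term graph: each node maps to its label and ordered successor list;
-- the root is node 0 by convention.
type TG = Map.Map Int (String, [Int])

-- depth_g(n) = length of a shortest position of n, computed by BFS.
depths :: TG -> Map.Map Int Int
depths g = go (Map.singleton 0 0) [0]
  where
    go seen []       = seen
    go seen (n : ns) =
      let (_, sucs) = g Map.! n
          fresh     = [ m | m <- sucs, not (Map.member m seen) ]
          d         = seen Map.! n
          seen'     = foldr (\m -> Map.insert m (d + 1)) seen fresh
      in  go seen' (ns ++ fresh)

-- The term graph of Example 4.5: r = 0, n1 = 1, n2 = 2, n3 = 3.
example :: TG
example = Map.fromList
  [ (0, ("f", [1, 2])), (1, ("f", [1, 3])), (2, ("h", [1, 3])), (3, ("c", [])) ]
```

For example, n3 has positions such as ⟨0, 1⟩ and ⟨1, 1⟩, so its depth is 2, even though longer (cyclic) positions through n1 exist as well.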
Note that the labelling function of graphs -and thus term graphs -is total. In contrast, Barendregt et al. [11] considered open (term) graphs with a partial labelling function such that unlabelled nodes denote holes or variables. This is reflected in their notion of homomorphisms in which the homomorphism condition is suspended for unlabelled nodes.

4.1.
Homomorphisms. Instead of a partial node labelling function for term graphs, we chose a syntactic approach that is closer to the representation in terms: variables, holes and "bottoms" are represented as distinguished syntactic entities. We achieve this on term graphs by making the notion of homomorphisms dependent on a set of constant symbols ∆ for which the homomorphism condition is suspended: Definition 4.7 (∆-homomorphisms). Let Σ be a signature, ∆ ⊆ Σ (0) , and g, h ∈ G ∞ (Σ).
(i) A function φ : N g → N h is called homomorphic in n ∈ N g if the following holds:

lab g (n) = lab h (φ(n)) (labelling)
φ(suc g i (n)) = suc h i (φ(n)) for all 0 ≤ i < ar g (n) (successor)

(ii) A ∆-homomorphism φ from g to h, denoted φ : g → ∆ h, is a function φ : N g → N h that is homomorphic in n for all n ∈ N g with lab g (n) ∉ ∆ and satisfies

φ(r g ) = r h (root)

Note that, for ∆ = ∅, we get the usual notion of homomorphisms on term graphs (e.g. Barendsen [12]). The ∆-nodes can be thought of as holes in the term graphs that can be filled with other term graphs. For example, if we have a distinguished set of variable symbols V ⊆ Σ (0) , we can use V-homomorphisms to formalise the matching step of term graph rewriting, which requires the instantiation of variables.
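For finite graphs, the three conditions of Definition 4.7 can be checked directly. The Python sketch below is our own illustration (same dictionary encoding as before); it checks a candidate node mapping and suspends the labelling and successor conditions exactly at the ∆-nodes, which is what makes variable instantiation possible.

```python
def is_delta_hom(g, rg, h, rh, phi, delta):
    """Check whether phi is a ∆-homomorphism from (g, rg) to (h, rh)."""
    if phi[rg] != rh:                     # root condition
        return False
    for n, (lab, succs) in g.items():
        if lab in delta:
            continue                      # condition suspended for ∆-nodes
        lab_h, succs_h = h[phi[n]]
        if lab != lab_h or len(succs) != len(succs_h):
            return False                  # labelling condition
        if any(phi[m] != succs_h[i] for i, m in enumerate(succs)):
            return False                  # successor condition
    return True

# Matching the "pattern" f(x, x') against f(c, c): the variable nodes may be
# instantiated, so we use {x}-homomorphisms.
g = {"r": ("f", ["v1", "v2"]), "v1": ("x", []), "v2": ("x", [])}
h = {"r": ("f", ["c1", "c2"]), "c1": ("c", []), "c2": ("c", [])}
phi = {"r": "r", "v1": "c1", "v2": "c2"}
```

With ∆ = {x} the mapping phi is accepted; with ∆ = ∅ it is rejected, since the labelling condition then applies to the x-nodes as well.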
Example 4.8. Figure 2 depicts two functions φ and ψ. Whereas φ is a homomorphism, the function ψ is not a homomorphism since, for example, the node labelled a in g 2 is mapped to a node labelled h in g 3 . Nevertheless, ψ is a {a, b}-homomorphism. Note that both φ and ψ introduce sharing; intuitively, ∆-homomorphisms may only introduce sharing and relabel ∆-nodes.

Proposition 4.9 (∆-homomorphism preorder). The term graphs in G ∞ (Σ) together with the ∆-homomorphisms between them form a category that is a preorder, i.e. there is at most one ∆-homomorphism between any two term graphs.

Proof. The identity ∆-homomorphism is obviously the identity mapping on the set of nodes. Moreover, an easy equational reasoning reveals that the composition of two ∆-homomorphisms is again a ∆-homomorphism. Associativity of this composition is obvious as ∆-homomorphisms are functions.
To show that the category is a preorder, assume that there are two ∆-homomorphisms φ 1 , φ 2 : g → ∆ h. We prove that φ 1 = φ 2 by showing that φ 1 (n) = φ 2 (n) for all n ∈ N g by induction on the depth of n in g.
Let depth g (n) = 0, i.e. n = r g . By the root condition for φ 1 and φ 2 , we have that φ 1 (r g ) = r h = φ 2 (r g ). Let depth g (n) = d > 0. Then n has a position π · i in g such that depth g (n ′ ) < d for n ′ = node g (π). Hence, we can employ the induction hypothesis for n ′ . Moreover, since n ′ has at least one successor node, viz. n, it cannot be labelled with a nullary symbol and a fortiori not with a symbol in ∆. Therefore, the ∆-homomorphisms φ 1 and φ 2 are homomorphic in n ′ and we can thus reason as follows:

φ 1 (n) = φ 1 (suc g i (n ′ )) = suc h i (φ 1 (n ′ )) = suc h i (φ 2 (n ′ )) = φ 2 (suc g i (n ′ )) = φ 2 (n)

As a consequence, whenever there are two ∆-homomorphisms φ : g → ∆ h and ψ : h → ∆ g, they are inverses of each other, i.e. ∆-isomorphisms. If two term graphs are ∆-isomorphic, we write g ∼ = ∆ h.
The structure of positions permits a convenient characterisation of ∆-homomorphisms:

Lemma 4.10 (characterisation of ∆-homomorphisms). Let g, h ∈ G ∞ (Σ) and φ : N g → N h . Then φ is a ∆-homomorphism φ : g → ∆ h iff, for all n ∈ N g , (a) P g (n) ⊆ P h (φ(n)), and (b) lab g (n) = lab h (φ(n)) whenever lab g (n) ∉ ∆.

Proof. For the "only if" direction, assume that φ : g → ∆ h. (b) is the labelling condition and is therefore satisfied by φ. To establish (a), we show the equivalent statement that π ∈ P g (n) implies π ∈ P h (φ(n)) for all π ∈ N * and n ∈ N g . We do so by induction on the length of π: if π = ⟨⟩, then π ∈ P g (n) implies n = r g . By the root condition, we have φ(r g ) = r h and, therefore, π = ⟨⟩ ∈ P h (φ(r g )). If π = π ′ · i , then let n ′ = node g (π ′ ). Consequently, π ′ ∈ P g (n ′ ) and, by induction hypothesis, π ′ ∈ P h (φ(n ′ )).
Since π = π ′ · i , we have suc g i (n ′ ) = n. By the successor condition we can conclude suc h i (φ(n ′ )) = φ(n). This and π ′ ∈ P h (φ(n ′ )) yields that π ′ · i ∈ P h (φ(n)). For the "if" direction, we assume (a) and (b). The labelling condition follows immediately from (b). For the root condition, observe that since ⟨⟩ ∈ P g (r g ), we also have ⟨⟩ ∈ P h (φ(r g )). Hence, φ(r g ) = r h . In order to show the successor condition, let n, n ′ ∈ N g and 0 ≤ i < ar g (n) such that suc g i (n) = n ′ . Then there is a position π ∈ P g (n) with π · i ∈ P g (n ′ ). By (a), we can conclude that π ∈ P h (φ(n)) and π · i ∈ P h (φ(n ′ )), which implies that suc h i (φ(n)) = φ(n ′ ).

By Proposition 4.9, there is at most one ∆-homomorphism between two term graphs. The lemma above uniquely defines this ∆-homomorphism: if there is a ∆-homomorphism from g to h, it is defined by φ(n) = n ′ , where n ′ is the unique node n ′ ∈ N h with P g (n) ⊆ P h (n ′ ). Moreover, while it is not true for arbitrary ∆-homomorphisms, we have that homomorphisms are surjective:

Lemma 4.11 (homomorphisms are surjective). Every homomorphism φ : g → h between term graphs g, h ∈ G ∞ (Σ) is surjective.
Proof. Follows from an easy induction on the depth of the nodes in h.
The {a, b}-homomorphism illustrated in Figure 2b shows that the above lemma does not hold for ∆-homomorphisms in general.

Isomorphisms & Isomorphism Classes.
When dealing with term graphs, in particular when studying term graph transformations, we do not want to distinguish between isomorphic term graphs. Distinct but isomorphic term graphs differ only in the naming of nodes and are thus an unwanted artifact of the definition of term graphs. In this way, equality up to isomorphism is similar to α-equivalence of λ-terms and has to be dealt with.
In this section, we characterise isomorphisms and more generally ∆-isomorphisms. From this we derive two canonical representations of isomorphism classes of term graphs. One is simply a subclass of the class of term graphs while the other one is based on the structure provided by the positions of term graphs. The relevance of the former representation is derived from the fact that we still have term graphs that can be easily manipulated whereas the latter is more technical and will be helpful for constructing term graphs up to isomorphism.
Note that a bijective ∆-homomorphism is not necessarily a ∆-isomorphism. To realise this, consider two term graphs g, h, each with one node only. Let the node in g be labelled with a and the node in h with b. Then the only possible {a}-homomorphism from g to h is clearly a bijection but not an {a}-isomorphism. On the other hand, bijective homomorphisms indeed are isomorphisms:

Lemma 4.12 (bijective homomorphisms are isomorphisms). Every bijective homomorphism φ : g → h between term graphs g, h ∈ G ∞ (Σ) is an isomorphism.
Proof. We need to show that φ −1 is a homomorphism from h to g. The root condition follows immediately from the root condition for φ. Similarly, an easy equational reasoning reveals that φ −1 satisfies the labelling and the successor conditions.

From the characterisation of ∆-homomorphisms in Lemma 4.10, we immediately obtain a characterisation of ∆-isomorphisms as follows:

Lemma 4.13 (characterisation of ∆-isomorphisms). Let g, h ∈ G ∞ (Σ) and let φ : N g → N h be bijective. Then φ is a ∆-isomorphism from g to h iff, for all n ∈ N g , (a) P h (φ(n)) = P g (n), and (b) lab g (n) = lab h (φ(n)) or lab g (n), lab h (φ(n)) ∈ ∆.

Proof. Immediate consequence of Lemma 4.10 and Proposition 4.9.
Note that whenever ∆ is a singleton set, the condition lab g (n), lab h (φ(n)) ∈ ∆ in the above lemma implies lab g (n) = lab h (φ(n)). Therefore, we obtain the following corollary:

Corollary 4.14. Let g, h ∈ G ∞ (Σ) and ∆ ⊆ Σ (0) with at most one element. Then g ∼ = ∆ h iff g ∼ = h.

Note that the above equivalence does not hold for ∆-homomorphisms with more than one symbol in ∆: consider the term graphs g = a and h = b consisting of a single node labelled a respectively b. While g and h are ∆-isomorphic for ∆ = {a, b}, they are not isomorphic.

Canonical Term Graphs.
From the Lemmas 4.12 and 4.13 we learned that isomorphisms between term graphs are bijections that preserve and reflect the positions as well as the labelling of each node. These findings motivate the following definition of canonical term graphs as candidates for representatives of isomorphism classes:

Definition 4.15 (canonical term graphs). A term graph g is called canonical if n = P g (n) holds for each n ∈ N g . That is, each node is the set of its positions in the term graph. The set of all canonical term graphs over Σ is denoted G ∞ C (Σ); the set of all finite canonical term graphs over Σ is denoted G C (Σ). Given a term graph h ∈ G ∞ (Σ), its canonical representative C(h) is the canonical term graph given by

N C(h) = {P h (n) | n ∈ N h }, lab C(h) (P h (n)) = lab h (n), suc C(h) i (P h (n)) = P h (suc h i (n)), r C(h) = P h (r h )

The above definition follows a well-known approach to obtain, for each term graph g, a canonical representative C(g) [32]. One can easily see that C(g) is a well-defined canonical term graph. With this definition we indeed capture a notion of canonical representatives of isomorphism classes:

Proposition 4.16 (canonical term graphs are isomorphism class representatives). Given g ∈ G ∞ (Σ), the term graph C(g) canonically represents the equivalence class [g]∼ = . More precisely, C(g) ∈ [g]∼ = and, for all h ∈ G ∞ (Σ), g ∼ = h iff C(g) = C(h).

Proof. Straightforward consequence of Lemma 4.13.
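For finite acyclic term graphs the renaming behind C(g) can be carried out literally. The Python sketch below is our own illustration (same encoding as before); nodes become frozensets of positions, playing the role of P g (n). For cyclic graphs the position sets are infinite, so this literal construction no longer applies.

```python
from collections import deque

def node_positions(g, root, max_len):
    """All positions of each node, up to length max_len."""
    pos = {n: set() for n in g}
    queue = deque([(root, ())])
    while queue:
        n, p = queue.popleft()
        if len(p) > max_len:
            continue
        pos[n].add(p)
        for i, m in enumerate(g[n][1]):
            queue.append((m, p + (i,)))
    return pos

def canonicalise(g, root):
    """C(g) for a finite acyclic term graph: rename every node to the
    frozenset of its positions; labels and successors are carried over."""
    pos = node_positions(g, root, max_len=len(g))  # bound suffices if acyclic
    name = {n: frozenset(pos[n]) for n in g}
    cg = {name[n]: (lab, [name[m] for m in ss]) for n, (lab, ss) in g.items()}
    return cg, name[root]

# Two isomorphic term graphs that differ only in node names:
g1 = {"a": ("f", ["b", "b"]), "b": ("c", [])}
g2 = {"x": ("f", ["y", "y"]), "y": ("c", [])}
```

Both graphs canonicalise to the same dictionary, illustrating Proposition 4.16: isomorphic term graphs have identical canonical representatives.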

Labelled Quotient Trees.
Intuitively, term graphs can be thought of as "terms with sharing", i.e. terms in which occurrences of the same subterm may be identified. The representation of isomorphic term graphs as labelled quotient trees, which we shall study in this section, makes use of and formalises this intuition. To this end, we introduce an equivalence relation on the positions of a term graph that captures the sharing in a term graph:

Definition 4.17 (aliasing positions). Given a term graph g and two positions π 1 , π 2 ∈ P(g), we say that π 1 and π 2 alias each other in g, denoted π 1 ∼ g π 2 , if node g (π 1 ) = node g (π 2 ).
One can easily see that the thus defined relation ∼ g on P(g) is an equivalence relation. Moreover, the partition on P(g) induced by ∼ g is simply the set {P g (n) | n ∈ N g } that contains the sets of positions of nodes in g. The characterisation of ∆-homomorphisms of Lemma 4.10 can be recast in terms of aliasing positions, which then yields the following characterisation of the existence of ∆-homomorphisms:

Lemma 4.19 (characterisation of the existence of ∆-homomorphisms). Let g, h ∈ G ∞ (Σ). Then there is a ∆-homomorphism φ : g → ∆ h iff, for all π, π ′ ∈ P(g), (a) π ∼ g π ′ implies π ∼ h π ′ , and (b) g(π) = h(π) whenever g(π) ∉ ∆.

Proof. For the "only if" direction, assume that φ is a ∆-homomorphism from g to h. Then we can use the properties (a) and (b) of Lemma 4.10, which we will refer to as (a') and (b') to avoid confusion. In order to show (a), assume π ∼ g π ′ . Then there is some node n ∈ N g with π, π ′ ∈ P g (n). (a') yields π, π ′ ∈ P h (φ(n)) and, therefore, π ∼ h π ′ . To show (b), we assume some π ∈ P(g) with g(π) ∉ ∆ and let n = node g (π). Then we can reason as follows:

g(π) = lab g (n) = lab h (φ(n)) = h(π)

where the second equation holds by (b') and the third since π ∈ P h (φ(n)) by (a'). For the converse direction, assume that both (a) and (b) hold. Define the function φ : N g → N h by φ(n) = m iff P g (n) ⊆ P h (m) for all n ∈ N g and m ∈ N h . To see that this is well-defined, we show at first that, for each n ∈ N g , there is at most one m ∈ N h with P g (n) ⊆ P h (m). Suppose there is another node m ′ ∈ N h with P g (n) ⊆ P h (m ′ ). Since P g (n) is non-empty, there is some π ∈ P g (n) and thus π ∈ P h (m) ∩ P h (m ′ ). Hence, m = m ′ . Secondly, we show that there is at least one such node m. Choose some π * ∈ P g (n). Since then π * ∼ g π * and, by (a), also π * ∼ h π * holds, there is some m ∈ N h with π * ∈ P h (m). For each π ∈ P g (n), we have π * ∼ g π and, therefore, π * ∼ h π by (a). Hence, π ∈ P h (m). So we know that φ is well-defined. By construction, φ satisfies (a'). Moreover, because of (b), it is also easily seen to satisfy (b'). Hence, φ is a ∆-homomorphism from g to h.
Intuitively, Clause (a) states that h has at least as much sharing of nodes as g has, whereas Clause (b) states that h has at least the same non-∆-labelling as g. In this sense, the above characterisation confirms the intuition about ∆-homomorphisms that we mentioned in Example 4.8, viz. ∆-homomorphisms may only introduce sharing and relabel ∆-nodes. This can be observed in the two ∆-homomorphisms illustrated in Figure 2.
From the above characterisations of the existence of ∆-homomorphisms, we can easily derive the following characterisation of ∆-isomorphisms using the uniqueness of ∆-homomorphisms between two term graphs:

Lemma 4.20 (characterisation of ∆-isomorphisms). Let g, h ∈ G ∞ (Σ). Then g ∼ = ∆ h iff (a) ∼ g = ∼ h , and (b) g(π) = h(π) or g(π), h(π) ∈ ∆, for all π ∈ P(g).

Proof. Immediate consequence of Lemma 4.19 and Proposition 4.9.

Remark 4.21. ∆-homomorphisms can be naturally lifted to the set G ∞ (Σ)/∼ = of isomorphism classes of term graphs: there is a ∆-homomorphism from [g]∼ = to [h]∼ = iff there is a ∆-homomorphism from g to h. These ∆-homomorphisms then form a category which can easily be shown to be isomorphic to the category of ∆-homomorphisms on G ∞ C (Σ) via the mapping [·]∼ = .
Lemma 4.20 has shown that term graphs can be characterised up to isomorphism by only giving the equivalence ∼ g and the labelling g(·) : π → g(π) of the involved term graphs. This observation gives rise to the following definition:

Definition 4.22 (labelled quotient trees). A labelled quotient tree over signature Σ is a triple (P, l, ∼) consisting of a non-empty set P ⊆ N * , a function l : P → Σ, and an equivalence relation ∼ on P that satisfies the following conditions for all π, π ′ ∈ N * and i ∈ N:

(a) π · i ∈ P implies π ∈ P and i < ar(l(π)) (reachability)
(b) π ∼ π ′ implies l(π) = l(π ′ ) and π · i ∼ π ′ · i whenever π · i ∈ P (congruence)

In other words, a labelled quotient tree (P, l, ∼) is a ranked tree domain P together with a congruence ∼ on it and a labelling function l : P/∼ → Σ that honours the rank. Also note that since P must be non-empty, the reachability condition implies that ⟨⟩ ∈ P . The following lemma confirms that labelled quotient trees uniquely characterise any term graph up to isomorphism:

Lemma 4.24 (labelled quotient trees are canonical). Each term graph g ∈ G ∞ (Σ) induces a canonical labelled quotient tree (P(g), g(·), ∼ g ) over Σ. Vice versa, for each labelled quotient tree (P, l, ∼) over Σ there is a unique canonical term graph g ∈ G ∞ C (Σ) whose canonical labelled quotient tree is (P, l, ∼), i.e. P(g) = P , g(π) = l(π) for all π ∈ P , and ∼ g = ∼.
For the second part, let (P, l, ∼) be a labelled quotient tree. Define the term graph g = (N, lab, suc, r) by

N = {[π]∼ | π ∈ P }, lab([π]∼) = l(π), suc i ([π]∼) = [π · i ]∼, r = [⟨⟩]∼

where [π]∼ denotes the ∼-equivalence class of π. The functions lab and suc are well-defined due to the congruence condition satisfied by (P, l, ∼). Since P is non-empty and closed under prefixes, it contains ⟨⟩. Hence, r is well-defined. Moreover, by the reachability condition, each node in N is reachable from the root node. An easy induction proof shows that P g (n) = n for each node n ∈ N . Thus, g is a well-defined canonical term graph. The canonical labelled quotient tree of g is obviously (P, l, ∼). Whenever there are two canonical term graphs with the same canonical labelled quotient tree (P, l, ∼), they are isomorphic due to Lemma 4.20 and, therefore, have to be identical by Proposition 4.16.
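For finite labelled quotient trees, the construction in the proof can be replayed in code. In the sketch below (our own encoding, not the paper's), the equivalence ∼ is given as a map `cls` from each position to its equivalence class, and the nodes of the resulting canonical term graph are exactly those classes.

```python
def graph_from_lqt(P, l, cls):
    """Build the canonical term graph of a finite labelled quotient tree:
    nodes are the ~-classes, lab([pi]) = l(pi), suc_i([pi]) = [pi·⟨i⟩],
    and the root is the class of the empty position.  Well-definedness
    of the choice of representative relies on the congruence condition."""
    g = {}
    for C in set(cls.values()):
        pi = next(iter(C))        # any representative; congruence makes this safe
        succs = []
        while pi + (len(succs),) in P:
            succs.append(cls[pi + (len(succs),)])
        g[C] = (l[pi], succs)
    return g, cls[()]

# f applied at the root, with the aliased positions ⟨0⟩ ~ ⟨1⟩ yielding
# a single shared c-node:
P = {(), (0,), (1,)}
l = {(): "f", (0,): "c", (1,): "c"}
shared = frozenset({(0,), (1,)})
cls = {(): frozenset({()}), (0,): shared, (1,): shared}
```

Running `graph_from_lqt(P, l, cls)` produces a two-node term graph in which both successors of the f-node are the same c-node, i.e. the aliasing in ∼ becomes sharing in the graph.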
Labelled quotient trees provide a valuable tool for constructing canonical term graphs as we shall see. Nevertheless, the original graph representation remains convenient for practical purposes as it allows a straightforward formalisation of term graph rewriting and provides a finite representation of finite cyclic term graphs, which induce an infinite labelled quotient tree.

Terms, Term Trees & Unravelling.
Before we continue, it is instructive to make the correspondence between terms and term graphs clear. First, note that, for each term tree t, the equivalence ∼ t is the identity relation I P(t) on P(t), i.e. π 1 ∼ t π 2 iff π 1 = π 2 . Consequently, we have the following one-to-one correspondence between canonical term trees and terms: each term t ∈ T ∞ (Σ) induces the canonical term tree given by the labelled quotient tree (P(t), t(·), I P(t) ). For example, the term tree depicted in Figure 1a corresponds to the term f (a, h(a, b)). We thus consider the set of terms T ∞ (Σ) as the subset of canonical term trees of G ∞ C (Σ). With this correspondence in mind, we can define the unravelling of a term graph g as the unique term t such that there is a homomorphism φ : t → g. The unravelling of cyclic term graphs yields infinite terms, e.g. in Figure 8 on page 43, the term h ω is the unravelling of the term graph g 2 . We use the notation U(g) for the unravelling of g.
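The unravelling can be computed by copying the graph as a tree, duplicating shared nodes. Since the unravelling of a cyclic term graph is infinite, the Python sketch below (our own illustration) cuts off at a given depth; the marker '⊥' is our truncation device and not part of the definition of U.

```python
def unravel(g, root, depth):
    """U(g), truncated at the given depth: shared nodes are duplicated,
    and anything below the depth bound is replaced by the marker '⊥'."""
    def go(n, d):
        if d == 0:
            return ("⊥",)
        lab, succs = g[n]
        return (lab,) + tuple(go(m, d - 1) for m in succs)
    return go(root, depth)

# A one-node cyclic term graph whose unravelling is the infinite term
# h(h(h(...))), i.e. h^ω:
g = {"r": ("h", ["r"])}

# Sharing disappears under unravelling: f with a shared a-node unravels
# to the term f(a, a).
g_shared = {"r": ("f", ["n", "n"]), "n": ("a", [])}
```

Each extra unit of depth peels off one more h from the cyclic graph, mirroring how h ω arises as the unravelling of g 2 in Figure 8.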

A Rigid Partial Order on Term Graphs
In this section, we shall establish a partial order suitable for formalising convergence of sequences of canonical term graphs similarly to p-convergence on terms.
Recall that p-convergence in term rewriting systems is based on a partial order ≤ ⊥ on the set T ∞ (Σ ⊥ ) of partial terms. The partial order ≤ ⊥ instantiates occurrences of ⊥ from left to right, i.e. s ≤ ⊥ t iff t is obtained by replacing occurrences of ⊥ in s by arbitrary terms in T ∞ (Σ ⊥ ).
Since we are considering term graph rewriting as a generalisation of term rewriting, our aim is to generalise the partial order ≤ ⊥ on terms to term graphs. That is, the partial order we are looking for should coincide with ≤ ⊥ if restricted to term trees. Moreover, we also want to maintain the characteristic properties of the partial order ≤ ⊥ when generalising to term graphs. The most important characteristic we are striving for is a complete semilattice structure in order to define p-convergence in terms of the limit inferior. Apart from that, we also want to maintain the intuition of the partial order ≤ ⊥ , viz. the intuition of information preservation, which ≤ ⊥ captures on terms as we illustrated in Section 2. We will make this last guiding principle clearer as we go along.
Analogously to partial terms, we consider the class of partial term graphs simply as term graphs over the signature Σ ⊥ = Σ ⊎ {⊥}. In order to generalise the partial order ≤ ⊥ to term graphs, we need to formalise the instantiation of occurrences of ⊥ in term graphs. ∆-homomorphisms, for ∆ = {⊥} -or ⊥-homomorphisms for short -provide the right starting point for that. A homomorphism φ : g → h maps each node in g to a node in h while preserving the local structure of each node, viz. its labelling and its successors. In the case of a ⊥-homomorphism φ : g → ⊥ h, the preservation of the labelling is suspended for nodes labelled ⊥, thus allowing φ to instantiate each ⊥-node in g with an arbitrary node in h.
Therefore, we shall use ⊥-homomorphisms as the basis for generalising ≤ ⊥ to canonical partial term graphs. This approach is based on the observation that ⊥-homomorphisms characterise the partial order ≤ ⊥ on terms. Considering terms as canonical term trees, we obtain the following equivalence:

s ≤ ⊥ t iff there is a ⊥-homomorphism φ : s → ⊥ t, for all s, t ∈ T ∞ (Σ ⊥ )

Thus, ⊥-homomorphisms constitute the ideal tool for defining a partial order on canonical partial term graphs that generalises ≤ ⊥ . In the following subsection, we shall explore different partial orders on canonical partial term graphs based on ⊥-homomorphisms.

Partial Orders on Term Graphs.
Consider the simple partial order ≤ S ⊥ defined on term graphs as follows:

g ≤ S ⊥ h iff there is a ⊥-homomorphism φ : g → ⊥ h, for all g, h ∈ G ∞ C (Σ ⊥ )

This is a straightforward generalisation of the partial order ≤ ⊥ to term graphs. In fact, this partial order forms a complete semilattice on G ∞ C (Σ ⊥ ) [10]. As we have explained in Section 2, p-convergence on terms is based on the ability of the partial order ≤ ⊥ to capture information preservation between terms: s ≤ ⊥ t means that t contains at least the same information as s does. The limit inferior, and thus p-convergence, comprises the accumulated information that eventually remains stable. Following the approach on terms, a partial order suitable as a basis for convergence for term graph rewriting has to capture an appropriate notion of information preservation as well.
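Because ∆-homomorphisms are unique when they exist (Proposition 4.9), g ≤ S ⊥ h can be decided for finite term graphs by attempting to build the ⊥-homomorphism from the roots downwards. The Python sketch below is our own illustration (encoding as before, with '⊥' as the bottom label):

```python
def bottom_hom(g, rg, h, rh):
    """Attempt to construct the (unique) ⊥-homomorphism g →⊥ h.
    Returns the node mapping, or None if none exists; g ≤S⊥ h holds
    iff the mapping exists."""
    phi = {rg: rh}                        # root condition
    stack = [rg]
    while stack:
        n = stack.pop()
        lab, succs = g[n]
        if lab == "⊥":
            continue                      # ⊥-nodes may be mapped anywhere
        lab_h, succs_h = h[phi[n]]
        if lab != lab_h or len(succs) != len(succs_h):
            return None                   # labelling condition violated
        for m, mh in zip(succs, succs_h):
            if m in phi:
                if phi[m] != mh:
                    return None           # successor condition forces a clash
            else:
                phi[m] = mh
                stack.append(m)
    return phi

# g0 = f(c, c) with two distinct c-nodes; g1 shares a single c-node:
g0 = {"r": ("f", ["c1", "c2"]), "c1": ("c", []), "c2": ("c", [])}
g1 = {"r": ("f", ["c", "c"]), "c": ("c", [])}
```

As with g 0 and g 1 in Figure 3, sharing can be introduced but not undone: the ⊥-homomorphism from g0 to g1 exists, the one from g1 to g0 does not.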
One has to keep in mind, however, that term graphs encode an additional dimension of information through sharing of nodes, i.e. the fact that nodes may have multiple positions. Since ≤ S ⊥ specialises to ≤ ⊥ on terms, it does preserve the information on the tree structure in the same way as ≤ ⊥ does. The difficult part is to determine the right approach to the role of sharing.
Indeed, ⊥-homomorphisms instantiate occurrences of ⊥ and are thereby able to introduce new information. But while ⊥-homomorphisms preserve the local structure of each node, they may change the global structure of a term graph by introducing sharing: for the term graphs g 0 and g 1 in Figure 3, we have an obvious ⊥-homomorphism, in fact a homomorphism, φ : g 0 → ⊥ g 1 and thus g 0 ≤ S ⊥ g 1 . There are at least two different ways to interpret the differences between g 0 and g 1 . The first one dismisses ≤ S ⊥ as a partial order suitable for our purposes: the term graphs g 0 and g 1 contain contradicting information. While in g 0 the two children of the f -node are distinct, they are identical in g 1 . We will indeed follow this view in this paper and introduce a rigid partial order ≤ R ⊥ that addresses this concern. There is, however, also a second view that does not see g 0 and g 1 in contradiction: both term graphs show the f -node with two successors, both of which are labelled with c. The term graph g 1 merely contains the additional piece of information that the two successor nodes of the f -node are identical. The simple partial order ≤ S ⊥ , which follows this view, is studied further in [10].

One consequence of the above behaviour of ≤ S ⊥ is that total term graphs are not necessarily maximal w.r.t. ≤ S ⊥ , e.g. g 0 is total but not maximal. The second, more severe, consequence is that there can be no metric on total term graphs such that the limit w.r.t. that metric coincides with the limit inferior on total term graphs. To see this, consider the sequence (g i ) i<ω of term graphs illustrated in Figure 3. Its limit inferior w.r.t. ≤ S ⊥ is the total term graph g ω . On the other hand, there is no metric w.r.t. which (g i ) i<ω converges since the sequence alternates between two distinct term graphs. That is, the correspondence between metric and partial order convergence that we know from term rewriting, cf. Theorem 3.3, is impossible.
In order to avoid the introduction of sharing, we need to consider ⊥-homomorphisms that preserve the structure of term graphs more rigidly, i.e. not only locally. Recall that by Lemma 4.24, the structure of a term graph is essentially given by the positions of nodes and their labelling. Labellings are already taken into consideration by ⊥-homomorphisms. Thus, we can define a partial order ≤ P ⊥ that preserves the structure of term graphs as follows:

g ≤ P ⊥ h iff there is a ⊥-homomorphism φ : g → ⊥ h with P h (φ(n)) = P g (n) for all n ∈ N g with lab g (n) ∈ Σ

While this would again yield a complete semilattice, it is unfortunately too restrictive. For example, consider the sequence of term graphs (g i ) i<ω depicted in Figure 4. Due to the cycle, we have for each term graph g i that ⊥ is the only term graph strictly smaller than g i w.r.t. ≤ P ⊥ . The reason for this is the fact that the only way to maintain the positions of the root node of the term graph g i is to keep all nodes of the cycle in g i . Hence, in order to obtain a term graph h with h ≤ P ⊥ g i , we have to either keep the whole term graph g i or collapse it completely, yielding ⊥. For example, we neither have g ′ 2 ≤ P ⊥ g 2 nor g ′ 2 ≤ P ⊥ g 3 for the term graph g ′ 2 illustrated in Figure 4. As a consequence, the limit inferior of the sequence (g i ) i<ω is ⊥ and not the expected term graph g ω .
The fact that the root nodes of g 2 and g ′ 2 have different sets of positions is solely caused by the edge to the root node of g 2 that comes from below and thus closes a cycle. Even though the edge occurs below the root node, it affects its positions. Cutting off that edge, like in g ′ 2 , changes the sharing. As a consequence, in the complete semilattice (G ∞ C (Σ ⊥ ), ≤ P ⊥ ), we do not obtain the intuitively expected convergence behaviour depicted in Figure 8c on page 43.
This observation suggests that we should only consider the upward structure of each node, ignoring the sharing that is caused by edges occurring below a node. We will see that by restricting our attention to acyclic positions, we indeed obtain the desired properties for a partial order on term graphs.
Recall that a position π in a term graph g is called cyclic iff there are positions π 1 , π 2 with π 1 < π 2 ≤ π such that node g (π 1 ) = node g (π 2 ), i.e. π passes a node twice. Otherwise it is called acyclic. We will use the notation P a (g) for the set of all acyclic positions in g, and P a g (n) for the set of all acyclic positions of a node n in g. That is, P a (g) is the set of positions in g that pass each node in g at most once. Clearly, every node has at least one acyclic position, i.e. P a g (n) is a non-empty set.

Definition 5.1 (rigid ∆-homomorphisms). Let g, h ∈ G ∞ (Σ) and φ : g → ∆ h. We say that φ is rigid in n ∈ N g if P a h (φ(n)) = P a g (n); φ is called rigid if it is rigid in each n ∈ N g with lab g (n) ∉ ∆.

Proposition 5.2. Identity ∆-homomorphisms are rigid, and the composition of two rigid ∆-homomorphisms is again rigid.

Proof. Straightforward.
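Unlike the full set of positions, the set of acyclic positions is finite even for cyclic graphs, since an acyclic position may visit every node at most once. This makes P a (g) directly computable; the Python sketch below is our own illustration (same encoding as before):

```python
def acyclic_positions(g, root):
    """P^a_g(n) for every node n: positions that pass each node at most
    once.  Tracking the set of nodes on the current path guarantees
    termination, even on cyclic term graphs."""
    out = {n: set() for n in g}

    def walk(n, pos, on_path):
        out[n].add(pos)
        for i, m in enumerate(g[n][1]):
            if m not in on_path:       # extending past m would pass m twice
                walk(m, pos + (i,), on_path | {m})

    walk(root, (), frozenset({root}))
    return out

# A cyclic term graph: the root's second successor loops back to the root,
# so positions such as ⟨1⟩ and ⟨1, 0⟩ are cyclic and are left out.
g = {"r": ("f", ["n", "r"]), "n": ("a", [])}
```

Here the root has the single acyclic position ⟨⟩ and the a-node the single acyclic position ⟨0⟩, even though both have infinitely many positions overall.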
Note that, for each node n in a term graph g, the positions in P a g (n) are minimal positions of n w.r.t. the prefix order. Rigid ⊥-homomorphisms thus preserve the upward structure of each non-⊥-node and, therefore, provide the desired structure for a partial order that captures information preservation on term graphs:

Definition 5.3 (rigid partial order). For g, h ∈ G ∞ C (Σ ⊥ ), let g ≤ R ⊥ h iff there is a rigid ⊥-homomorphism φ : g → ⊥ h.

Proposition 5.4. The relation ≤ R ⊥ is a partial order on G ∞ C (Σ ⊥ ).

Proof. Reflexivity and transitivity of ≤ R ⊥ follow immediately from Proposition 5.2. For antisymmetry, assume g ≤ R ⊥ h and h ≤ R ⊥ g. By Proposition 4.9, this implies g ∼ = ⊥ h. Corollary 4.14 then yields that g ∼ = h. Hence, according to Proposition 4.16, g = h.
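For finite term graphs, the rigid order is decidable by combining the two previous ingredients: compute the unique ⊥-homomorphism (if any) and compare acyclic position sets, which are finite even in the presence of cycles. A self-contained Python sketch (our own encoding and function names, not the paper's):

```python
def acyclic_positions(g, root):
    """P^a_g(n): positions passing each node at most once (always finite)."""
    out = {n: set() for n in g}
    def walk(n, pos, on_path):
        out[n].add(pos)
        for i, m in enumerate(g[n][1]):
            if m not in on_path:
                walk(m, pos + (i,), on_path | {m})
    walk(root, (), frozenset({root}))
    return out

def bottom_hom(g, rg, h, rh):
    """The unique ⊥-homomorphism g →⊥ h, or None if none exists."""
    phi, stack = {rg: rh}, [rg]
    while stack:
        n = stack.pop()
        lab, succs = g[n]
        if lab == "⊥":
            continue
        lab_h, succs_h = h[phi[n]]
        if lab != lab_h or len(succs) != len(succs_h):
            return None
        for m, mh in zip(succs, succs_h):
            if m in phi and phi[m] != mh:
                return None
            if m not in phi:
                phi[m] = mh
                stack.append(m)
    return phi

def rigid_leq(g, rg, h, rh):
    """g ≤R⊥ h: the ⊥-homomorphism exists and is rigid, i.e. it
    preserves the acyclic positions of every non-⊥-node."""
    phi = bottom_hom(g, rg, h, rh)
    if phi is None:
        return False
    pg, ph = acyclic_positions(g, rg), acyclic_positions(h, rh)
    return all(ph[phi[n]] == pg[n]
               for n, (lab, _) in g.items() if lab != "⊥")

# f(⊥, ⊥) is below both f(c, c) variants, but f(c, c) with distinct
# c-nodes is NOT below the variant sharing a single c-node:
gbot = {"r": ("f", ["b1", "b2"]), "b1": ("⊥", []), "b2": ("⊥", [])}
g0 = {"r": ("f", ["c1", "c2"]), "c1": ("c", []), "c2": ("c", [])}
g1 = {"r": ("f", ["c", "c"]), "c": ("c", [])}
```

The last pair illustrates the difference to ≤ S ⊥ : the ⊥-homomorphism from g0 to g1 exists, but it maps a c-node with acyclic positions {⟨0⟩} to one with {⟨0⟩, ⟨1⟩}, so it is not rigid.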
Example 5.5. Figure 8c on page 43 shows a sequence (h ι ) ι<ω of term graphs and its limit inferior h ω in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ): a cyclic list structure is repeatedly rewritten by inserting an element b in front of the a. We can see that in each step the newly inserted b (including the additional :: -node) remains unchanged afterwards. In terms of positions, however, each of the nodes changes in each step since the length of the cycle in the term graph grows with each step. Since this affects only cyclic positions, we still get the sequence (⊓ β≤ι<ω h ι ) β<ω of canonical term trees in which each element is a finite prefix of b's ending in ⊥. The least upper bound of this sequence (⊓ β≤ι<ω h ι ) β<ω , and thus the limit inferior of (h ι ) ι<ω , is the infinite canonical term tree h ω = b :: b :: b :: . . . . Since the cycle changes in each step and is thus cut through in each element of (⊓ β≤ι<ω h ι ) β<ω , the limit inferior has no cycles at all.
Note that we do not have this intuitively expected convergence behaviour for the partial order ≤ P ⊥ based on positions: since the length of the cycle grows along the sequence (h ι ) ι<ω , we have that the set of positions of the root nodes changes constantly. Hence, the limit inferior of (h ι ) ι<ω in (G ∞ C (Σ ⊥ ), ≤ P ⊥ ) is ⊥.

The partial order ≤ R ⊥ based on rigid ⊥-homomorphisms is defined in a rather non-local fashion as the definition of rigidity uses the set of all acyclic positions. This poses the question whether there is a more natural definition of a suitable partial order. One such candidate is the partial order ≤ I ⊥ , which uses injectivity in order to restrict the introduction of sharing: g ≤ I ⊥ h iff there is a ⊥-homomorphism φ : g → ⊥ h that is injective on non-⊥-nodes, i.e. φ(n) = φ(m) and lab g (n), lab g (m) ≠ ⊥ implies n = m. While this indeed yields a cpo on G ∞ C (Σ ⊥ ), we do not get a complete semilattice. To see this, consider Figure 5. The two term graphs g 3 , g 4 are two distinct maximal lower bounds of the two term graphs g 1 , g 2 w.r.t. the partial order ≤ I ⊥ . Hence, the set {g 1 , g 2 } does not have a greatest lower bound in (G ∞ C (Σ ⊥ ), ≤ I ⊥ ), which is therefore not a complete semilattice. The same phenomenon occurs if we consider a partial order derived from ⊥-homomorphisms that are injective on all nodes.

The rigid partial order ≤ R ⊥ resolves the issue of ≤ I ⊥ illustrated in Figure 5: g 3 and g 4 are not lower bounds of g 1 and g 2 w.r.t. ≤ R ⊥ . The (unique) ⊥-homomorphism from g 3 to g 1 is not rigid as it maps the node n 2 to n 1 and P a g 3 (n 2 ) = {⟨0, 0⟩} whereas P a g 1 (n 1 ) = {⟨0, 0⟩, ⟨1, 0⟩}. Hence, g 3 ̸≤ R ⊥ g 1 . Likewise, g 4 ̸≤ R ⊥ g 1 as the (unique) ⊥-homomorphism from g 4 to g 1 maps n 3 to n 1 , which again have different acyclic positions. We do find, however, a greatest lower bound of g 1 and g 2 w.r.t. ≤ R ⊥ , viz. g 5 .

The Rigid Partial Order.
In the remainder of this section, we will study the rigid partial order ≤ R ⊥ . In particular, we shall give a characterisation of rigidity in terms of labelled quotient trees analogous to Lemma 4.19, show that (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) forms a complete semilattice, illustrate the resulting mode of convergence, and give a characterisation of term graphs that are maximal w.r.t. ≤ R ⊥ . The partial order ≤ I ⊥ , derived from injective ⊥-homomorphisms, failed to form a complete semilattice, which is why we abandoned that approach. The following lemma shows that rigidity is, in fact, a stronger property than injectivity on non-∆-nodes. Hence, ≤ R ⊥ is a restriction of ≤ I ⊥ .

Lemma 5.6 (rigid ∆-homomorphisms are injective for non-∆-nodes). Let g, h ∈ G ∞ (Σ) and φ : g → ∆ h rigid. Then φ is injective for all non-∆-nodes in g. That is, for two nodes n, m ∈ N g with lab g (n), lab g (m) ∉ ∆ we have that φ(n) = φ(m) implies n = m.
Proof. Let n, m ∈ N g with lab g (n), lab g (m) ∉ ∆ and φ(n) = φ(m). Since φ is rigid, it is rigid in n and m. That is, in particular we have P a h (φ(n)) ⊆ P g (n) and P a h (φ(m)) ⊆ P g (m). Moreover, because P a h (φ(n)) = P a h (φ(m)) ≠ ∅, we can conclude that P g (n) ∩ P g (m) ≠ ∅ and, therefore, m = n.

5.2.1. Characterising Rigidity. The goal of this subsection is to give a characterisation of rigidity in terms of labelled quotient trees. We will then combine this characterisation with Lemma 4.19 to obtain a characterisation of the partial order ≤ R ⊥ . The following lemma provides a characterisation of rigid ∆-homomorphisms that reduces the proof obligations necessary to show that a ∆-homomorphism is rigid.

Lemma 5.7. Let g, h ∈ G ∞ (Σ) and φ : g → ∆ h. Then φ is rigid iff P a h (φ(n)) ⊆ P g (n) for all n ∈ N g with lab g (n) ∉ ∆.
Proof. The "only if" direction is trivial. For the "if" direction, suppose that φ satisfies P a h (φ(n)) ⊆ P g (n) for all n ∈ N g with lab g (n) ∉ ∆. In order to prove that φ is rigid, we will show that P a h (φ(n)) = P a g (n) holds for each n ∈ N g with lab g (n) ∉ ∆. We first show the inclusion P a h (φ(n)) ⊆ P a g (n). For this purpose, let π ∈ P a h (φ(n)). Due to the hypothesis, this implies that π ∈ P g (n). Now suppose that π is cyclic in g, i.e. there are two positions π 1 , π 2 of a node m ∈ N g with π 1 < π 2 ≤ π. By Lemma 4.10, we can conclude that π 1 , π 2 ∈ P h (φ(m)). This is a contradiction to the assumption that π is acyclic in h. Hence, π ∈ P a g (n). For the other inclusion, assume some π ∈ P a g (n). Using Lemma 4.10 we obtain that π ∈ P h (φ(n)). It remains to be shown that π is acyclic in h. Suppose that this is not true, i.e. there are two positions π 1 , π 2 of a node m ∈ N h with π 1 < π 2 ≤ π. Note that since π ∈ P(g), also π 1 , π 2 ∈ P(g). Let m i = node g (π i ), i = 1, 2. According to Lemma 4.10, we have that φ(m 1 ) = m = φ(m 2 ). Moreover, observe that g(π 1 ), g(π 2 ) ∉ ∆: g(π 1 ) cannot be a nullary symbol because π 1 < π ∈ P(g). The same argument applies for the case that π 2 < π. If this is not the case, then π 2 = π and g(π) ∉ ∆ follows from the assumption that lab g (n) ∉ ∆. Thus, we can apply Lemma 5.6 to conclude that m 1 = m 2 . Consequently, π is cyclic in g, which contradicts the assumption. Hence, π ∈ P a h (φ(n)).

From the above lemma we learn that ∆-isomorphisms are also rigid ∆-homomorphisms:

Corollary 5.8. Every ∆-isomorphism φ : g → ∆ h is a rigid ∆-homomorphism.
Proof. This follows from Lemma 4.13 and Lemma 5.7.
Lemma 5.9 (characterisation of rigidity). Let g, h ∈ G ∞ (Σ) and φ : g → ∆ h. Then φ is rigid iff, for all π 1 ∈ P(g) and π 2 ∈ P a (h), π 1 ∼ h π 2 and g(π 1 ) ∉ ∆ imply π 1 ∼ g π 2 .

Proof. For the "only if" direction, assume that φ is rigid and let π 1 ∈ P(g), π 2 ∈ P a (h) with π 1 ∼ h π 2 and g(π 1 ) ∉ ∆. Let n = node g (π 1 ). By Lemma 4.10, π 1 ∈ P h (φ(n)) and, since π 1 ∼ h π 2 , also π 2 ∈ P h (φ(n)). As π 2 is acyclic in h, we obtain π 2 ∈ P a h (φ(n)) = P a g (n) ⊆ P g (n) by rigidity. Together with π 1 ∈ P g (n) this yields π 1 ∼ g π 2 .

For the converse direction, let n ∈ N g with lab g (n) ∉ ∆. We need to show that φ is rigid in n. Due to Lemma 5.7, it suffices to show that P a h (φ(n)) ⊆ P g (n). Since P g (n) ≠ ∅, we can choose some π * ∈ P g (n). Then, according to Lemma 4.10, also π * ∈ P h (φ(n)). Let π ∈ P a h (φ(n)). Then π * ∼ h π holds. Since π is acyclic in h and g(π * ) ∉ ∆, we can use the hypothesis to obtain that π * ∼ g π holds, which shows that π ∈ P g (n).
Note that the above characterisation of rigidity is independent of the ∆-homomorphism at hand. This is expected since ∆-homomorphisms between a given pair of term graphs are unique.
Corollary 5.10 (characterisation of ≤ R ⊥ ). Let g, h ∈ G ∞ C (Σ ⊥ ). Then g ≤ R ⊥ h iff the following conditions are met:
(a) π 1 ∼ g π 2 implies π 1 ∼ h π 2 , for all π 1 , π 2 ∈ P(g)
(b) π 1 ∼ h π 2 , g(π 1 ) ∈ Σ and π 2 ∈ P a (h) imply π 1 ∼ g π 2 , for all π 1 ∈ P(g), π 2 ∈ P(h)
(c) g(π) = h(π) whenever g(π) ∈ Σ, for all π ∈ P(g)

Proof. This follows immediately from Lemma 4.19 and Lemma 5.9.
Note that for term trees (b) is always true and (a) follows from (c). Hence, on term trees, ≤ R ⊥ is characterised by (c) alone. This observation shows that ≤ R ⊥ is indeed a generalisation of ≤ ⊥ .

5.2.2. Convergence. In the following, we shall show that ≤ R ⊥ indeed forms a complete semilattice on G ∞ C (Σ ⊥ ). We begin by showing that it constitutes a complete partial order.
The partially ordered set (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) forms a cpo. In particular, it has the least element ⊥, and the least upper bound of a directed set G is given by the following labelled quotient tree (P, l, ∼):

P = ⋃ g∈G P(g), ∼ = ⋃ g∈G ∼ g , l(π) = f if g(π) = f ∈ Σ for some g ∈ G, and l(π) = ⊥ otherwise

Proof. The least element of ≤ R ⊥ is obviously ⊥. Hence, it remains to be shown that each directed subset G of G ∞ C (Σ ⊥ ) has a least upper bound w.r.t. ≤ R ⊥ . To this end, we show that the canonical term graph ḡ given by the labelled quotient tree (P, l, ∼) described above is indeed the lub of G. We will make extensive use of Corollary 5.10 to do so. Therefore, we write (a), (b), (c) to refer to corresponding conditions of Corollary 5.10.
This shows that (P, l, ∼) is a labelled quotient tree which, by Lemma 4.24, uniquely defines a canonical term graph. Next we show that the thus obtained term graph ḡ is an upper bound for G. To this end, let g ∈ G. We will show that g ≤ R ⊥ ḡ by establishing (a), (b) and (c). (a) and (c) are an immediate consequence of the construction. For (b), assume that π 1 ∈ P(g), g(π 1 ) ∈ Σ, π 2 ∈ P a (ḡ) and π 1 ∼ π 2 . We will show that then also π 1 ∼ g π 2 holds. Since π 1 ∼ π 2 , there is some g ′ ∈ G with π 1 ∼ g ′ π 2 . Because G is directed, there is some g * ∈ G with g, g ′ ≤ R ⊥ g * . Using (a), we then get that π 1 ∼ g * π 2 . Note that since π 2 is acyclic in ḡ, it is also acyclic in g * : suppose that this is not the case, i.e. there are positions π 3 , π 4 with π 3 < π 4 ≤ π 2 and π 3 ∼ g * π 4 . But then we also have π 3 ∼ π 4 , which contradicts the assumption that π 2 is acyclic in ḡ. With this knowledge we are able to apply (b) to π 1 ∼ g * π 2 in order to obtain π 1 ∼ g π 2 .
In the final part of this proof, we will show that ḡ is the least upper bound of G. For this purpose, let ĝ be an upper bound of G, i.e. g ≤ R ⊥ ĝ for all g ∈ G. We will show that ḡ ≤ R ⊥ ĝ by establishing (a), (b) and (c). For (a), assume that π 1 ∼ π 2 . Hence, there is some g ∈ G with π 1 ∼ g π 2 . Since, by assumption, g ≤ R ⊥ ĝ, we can conclude π 1 ∼ ĝ π 2 using (a). For (b), assume π 1 ∈ P , l(π 1 ) ∈ Σ, π 2 ∈ P a (ĝ) and π 1 ∼ ĝ π 2 . That is, there is some g ∈ G with g(π 1 ) ∈ Σ. Together with g ≤ R ⊥ ĝ this implies π 1 ∼ g π 2 by (b). π 1 ∼ π 2 follows immediately. For (c), assume π ∈ P and l(π) = f ∈ Σ. Then there is some g ∈ G with g(π) = f . Applying (c) then yields ĝ(π) = f since g ≤ R ⊥ ĝ.

Remark 5.13. Following Remark 4.21, we define an order ≤ R ⊥ on the set G ∞ (Σ ⊥ )/∼ = of isomorphism classes by [g]∼ = ≤ R ⊥ [h]∼ = iff there is a rigid ⊥-homomorphism φ : g → ⊥ h. The extension of ≤ R ⊥ to equivalence classes is easily seen to be well-defined: assume some rigid ⊥-homomorphism φ : g → ⊥ h and two isomorphisms g ′ ∼ = g and h ′ ∼ = h. Since, by Corollary 5.8, isomorphisms are also rigid (⊥-)homomorphisms, we have two rigid ⊥-homomorphisms g ′ → ⊥ g and h → ⊥ h ′ whose composition with φ yields, by Proposition 5.2, a rigid ⊥-homomorphism g ′ → ⊥ h ′ . The isomorphism illustrated above allows us to switch between the two partially ordered sets (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) and (G ∞ (Σ ⊥ )/∼ = , ≤ R ⊥ ) in order to use the structure that is more convenient for the given setting. In particular, the proof of Lemma 5.14 below is based on this isomorphism.
By Proposition 2.1, a cpo is a complete semilattice iff any two compatible elements have a least upper bound. Recall that compatible elements in a partially ordered set are elements that have a common upper bound. We make use of this proposition in order to show that (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) is a complete semilattice. However, showing that any two term graphs g, h ∈ G ∞ C (Σ ⊥ ) with a common upper bound also have a least upper bound is not as straightforward. The issue that makes the construction of the lub of compatible term graphs a bit more complicated than in the case of directed sets is illustrated in Figure 6. Note that the lub g ⊔ h of the term graphs g and h has an additional cycle. The fact that in g ⊔ h the second successor of r has to be r itself is enforced by g saying that the first successor of r 1 is r 1 itself and by h saying that the first and the second successor of r 2 must be identical. Because of the additional cycle in g ⊔ h, the set of positions in g ⊔ h is a proper superset of the union of the sets of positions in g and h. This makes the construction of g ⊔ h via a labelled quotient tree quite intricate.
Our strategy for constructing the lub is to form the disjoint union of the two term graphs in question and then identify nodes that have a common position w.r.t. the term graph they originate from. In our example, we have four nodes r 1 , n 1 , r 2 and n 2 . At first, r 1 and r 2 have to be identified as both are at the root position ε. Next, r 1 and n 2 are identified as they share the position ⟨0⟩. And eventually, also n 1 and n 2 are identified since they share the position ⟨1⟩. Hence, all four nodes have to be identified. The result is, therefore, a term graph with a single node r. The following lemma and its proof, given in Appendix A, show that, for any two compatible term graphs, this construction always yields their lub. Lemma 5.14 (compatible elements have lub). Each pair g 1 , g 2 of compatible term graphs in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) has a least upper bound.
Figure 6. Least upper bound g ⊔ h of compatible term graphs g and h.
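The node-identification process just described can be sketched as a small union-find computation. This is a hypothetical illustration in an encoding of our own choosing (positions as tuples of natural numbers); it reproduces the four-node example above.

```python
# Sketch of the lub construction: form the disjoint union of two term graphs
# and merge nodes that share a position. All names are illustrative only.

def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path compression
        x = parent[x]
    return x

def union(parent, x, y):
    rx, ry = find(parent, x), find(parent, y)
    if rx != ry:
        parent[ry] = rx

def merge_by_positions(positions):
    """positions: node -> set of positions that node occupies.
    Nodes sharing any position end up in the same equivalence class."""
    parent = {n: n for n in positions}
    owner = {}   # position -> first node seen at that position
    for n, ps in positions.items():
        for p in ps:
            if p in owner:
                union(parent, owner[p], n)
            else:
                owner[p] = n
    return {n: find(parent, n) for n in positions}

# The example from the text: nodes r1, n1 of g and r2, n2 of h.
pos = {
    "r1": {(), (0,)},    # r1 is the root; its first successor is r1 itself
    "n1": {(1,)},
    "r2": {()},
    "n2": {(0,), (1,)},  # h forces successors 0 and 1 to coincide
}
classes = merge_by_positions(pos)
# All four nodes collapse into one class, matching the single-node lub.
```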
In particular, this means that the limit inferior is defined for every sequence of term graphs.

Corollary 5.16 (limit inferior of term graph sequences). Every sequence (g ι ) ι<α of term graphs in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) has a limit inferior.

Recall that the intuition of the limit inferior on terms is that it contains the accumulated information that eventually remains stable in the sequence. This interpretation is, of course, based on the partial order ≤ ⊥ on terms, which embodies the underlying notion of "information encoded in a term". The same interpretation can be given for the limit inferior based on the rigid partial order ≤ R ⊥ on term graphs. Given a sequence (g ι ) ι<α of term graphs, its limit inferior lim inf ι→α g ι is the term graph that contains the accumulation of all pieces of information that from some point onwards remain unchanged in (g ι ) ι<α . Example 5.17. Figures 9d and 9e on page 45 each show a sequence of term graphs and its limit inferior in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ). (i) Figure 9d shows a simple example of how acyclic sharing is preserved by the limit inferior. The least upper bound of the corresponding sequence (⊓ β≤ι<ω g ι ) β<ω of greatest lower bounds, and thus the limit inferior of (g ι ) ι<ω , is the term graph g ω depicted in Figure 9d. (ii) The situation is slightly different in the sequence (g ι ) ι<ω from Figure 9e. Here we also have acyclic sharing, viz. in the c-node. However, unlike in the previous example from Figure 9d, the acyclic sharing changes in each step. Hence, a lower bound of two distinct term graphs in (g ι ) ι<ω cannot contain a c-node because a rigid ⊥-homomorphism must map such a c-node to a c-node with the same acyclic sharing, i.e. the same acyclic positions. Consequently, the least upper bound of the sequence of greatest lower bounds (⊓ β≤ι<ω g ι ) β<ω , and thus the limit inferior of (g ι ) ι<ω , is the term graph g ω depicted in Figure 9e. The ⊥ labelling is necessary because of the change in acyclic sharing throughout the sequence.
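The intuition of the limit inferior as accumulated stable information can be illustrated on terms. The following sketch uses an encoding of our own (terms as nested tuples) and is not the paper's formal construction; the glbs of the tails model the sequence (⊓ β≤ι<α g ι ) β<α whose lub is the limit inferior.

```python
from functools import reduce

BOT = ("⊥",)

def glb(s, t):
    """Greatest lower bound w.r.t. ≤⊥: keep agreement, put ⊥ at conflicts."""
    if s == t:
        return s
    if s[0] != t[0] or len(s) != len(t):
        return BOT
    return (s[0],) + tuple(glb(a, b) for a, b in zip(s[1:], t[1:]))

def tail_glbs(seq):
    """The sequence of glbs of the tails seq[b:]; its lub (for a finite
    prefix, the last element of the increasing chain) is the lim inf."""
    return [reduce(glb, seq[b:]) for b in range(len(seq))]

seq = [("f", ("a",), ("b",)),
       ("f", ("a",), ("c",)),
       ("f", ("a",), ("c",))]
tail_glbs(seq)
# -> [("f", ("a",), ("⊥",)), ("f", ("a",), ("c",)), ("f", ("a",), ("c",))]
# The information "f(a, ·)" is stable throughout; the third argument only
# stabilises from the second term onwards.
```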
While we have confirmed in Corollary 5.11 that the partial order ≤ R ⊥ generalises the partial order ≤ ⊥ on terms, we still have to show that this also carries over to the limit inferior. We can derive this property from the following simple lemma: Lemma 5.18 (term trees below term trees). Let t ∈ T ∞ (Σ ⊥ ) be a term tree and g ∈ G ∞ C (Σ ⊥ ) with g ≤ R ⊥ t. Then g is a term tree. Proof. Since t is a term tree, ∼ t is an identity relation. According to Corollary 5.10, g ≤ R ⊥ t implies that ∼ g ⊆ ∼ t . Hence, also ∼ g is an identity relation, which means that g is a term tree as well.
Theorem 5.19 (limit inferior generalises limit inferior on terms). The limit inferior of a sequence (t ι ) ι<α of terms in (T ∞ (Σ ⊥ ), ≤ ⊥ ) coincides with its limit inferior in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ). Proof. Let t and g be the limit inferior of (t ι ) ι<α in (T ∞ (Σ ⊥ ), ≤ ⊥ ) and (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), respectively. By the above argument, we know that t and g are the lub of the set S = {⊓ β≤ι<α t ι | β < α} in the respective partially ordered set. By Corollary 5.11, t is then also an upper bound of S in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ). Since g is the least such upper bound, we know that g ≤ R ⊥ t. According to Lemma 5.18, this implies that g is a term tree. Hence, by Corollary 5.11, g is an upper bound of S in (T ∞ (Σ ⊥ ), ≤ ⊥ ) and g ≤ ⊥ t. Since t is the least upper bound of S in (T ∞ (Σ ⊥ ), ≤ ⊥ ), we can conclude that t = g.

Maximal Term Graphs.
Intuitively, partial term graphs represent partial results of computations, where ⊥-nodes act as placeholders denoting the uncertainty or ignorance of the actual "value" at that position. Total term graphs, on the other hand, do contain all the information of a result of a computation: they have the maximally possible information content. In other words, they are the maximal elements w.r.t. ≤ R ⊥ . The following proposition confirms this intuition. Proposition 5.20 (total term graphs are maximal). Let Σ be a non-empty signature. Then the set of maximal elements of (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) is precisely G ∞ C (Σ). Proof. At first, we need to show that each element in G ∞ C (Σ) is maximal. For this purpose, let g ∈ G ∞ C (Σ) and h ∈ G ∞ C (Σ ⊥ ) such that g ≤ R ⊥ h. We have to show that then g = h. Since g ≤ R ⊥ h, there is a rigid ⊥-homomorphism φ : g → ⊥ h. As g does not contain any ⊥-node, φ is even a rigid homomorphism. By Lemma 5.6, φ is injective and, therefore, according to Lemma 4.12, an isomorphism. Hence, we obtain that g ∼ = h and, consequently, using Proposition 4.16, that g = h.
Secondly, we need to show that G ∞ C (Σ ⊥ ) does not contain any other maximal elements besides those in G ∞ C (Σ). Suppose there is a maximal term graph g ∈ G ∞ C (Σ ⊥ ) \ G ∞ C (Σ). Hence, there is a node n * ∈ N g with lab g (n * ) = ⊥. If Σ contains a nullary symbol c, construct a term graph h from g by relabelling the node n * from ⊥ to c. However, then g < R ⊥ h, which contradicts the assumption that g is maximal w.r.t. ≤ R ⊥ . Otherwise, if Σ (0) = ∅, let n be a fresh node (i.e. n ∉ N g ) and f some k-ary symbol in Σ. Define the term graph h as follows: h is obtained from g by relabelling n * with f , adding the ⊥-labelled node n, and setting n as the target of all k outgoing edges of n * . We assume that n was chosen such that h is canonical (i.e. n = P h (n)). Obviously, g and h are distinct. Define φ : N g → N h by n ↦ n for all n ∈ N g . Clearly, φ defines a rigid ⊥-homomorphism from g to h. Hence, g ≤ R ⊥ h. This contradicts the assumption of g being maximal. Consequently, no element in G ∞ C (Σ ⊥ ) \ G ∞ C (Σ) is maximal. Note that this property does not hold for the simple partial order ≤ S ⊥ that we considered briefly in the beginning of this section. Figure 3 shows the total term graph g 0 , which is strictly smaller than g 1 w.r.t. ≤ S ⊥ .

6. A Rigid Metric on Term Graphs
In this section, we pursue the metric approach to convergence in rewriting systems. To this end, we shall define a metric space on canonical term graphs. We base our approach to defining a metric distance on the definition of the metric distance d on terms. In particular, we shall define a truncation operation on term graphs, which cuts off certain nodes depending on their depth in the term graph. Subsequently, we study the interplay of the truncation with ∆-homomorphisms and the depth of nodes within a term graph. Finally, we use the truncation operation to derive a metric on term graphs.

6.1. Truncating Term Graphs.
Originally, Arnold and Nivat [4] used a truncation of terms to define the metric on terms. The truncation of a term t at depth d ≤ ω, denoted t|d, replaces all subterms at depth d by ⊥. Recall that the metric distance d on terms is defined by d(s, t) = 2 −sim(s,t) . The underlying notion of similarity sim(·, ·) can be characterised via truncations as follows: sim(s, t) = max {d ≤ ω | s|d = t|d} . We adopt this approach for term graphs as well. To this end, we shall define a rigid truncation on term graphs. In Section 6.3 we will then show that this truncation indeed yields a complete metric space. Definition 6.1 (rigid truncation of term graphs). Let g ∈ G ∞ (Σ ⊥ ) and d < ω.
(i) Given n, m ∈ N g , m is an acyclic predecessor of n in g if there is an acyclic position π · i ∈ P a g (n) with π ∈ P g (m). The set of acyclic predecessors of n in g is denoted Pre a g (n). (ii) The set of retained nodes of g at d, denoted N g <d , is the least subset M of N g satisfying the following two conditions for all n ∈ N g : (T1) depth g (n) < d =⇒ n ∈ M, and (T2) n ∈ M =⇒ Pre a g (n) ⊆ M. (iii) For each n ∈ N g and i ∈ N, we use n i to denote a fresh node, i.e. {n i | n ∈ N g , i ∈ N} is a set of pairwise distinct nodes not occurring in N g . The set of fringe nodes of g at d, denoted N g =d , is defined as the singleton set {r g } if d = 0, and otherwise as the set {n i | n ∈ N g <d , 0 ≤ i < ar g (n) such that suc g i (n) ∉ N g <d , or depth g (n) ≥ d − 1 and n ∉ Pre a g (suc g i (n))} . (iv) The rigid truncation of g at d, denoted g ‡d, is the term graph with node set N g <d ∪ N g =d and root r g , where each retained node n ∈ N g <d keeps its label from g and has as its i-th successor the fringe node n i if n i ∈ N g =d and suc g i (n) otherwise, and where each fringe node is labelled ⊥ and has no successors. Additionally, we define g ‡ω to be the term graph g itself.
Before discussing the intuition behind this definition of rigid truncation, let us have a look at the rôle of retained and fringe nodes: the set of retained nodes N g <d contains the nodes that are preserved by the rigid truncation. All other nodes in N g \ N g <d are cut off. The "holes" that are thus created are filled by the fringe nodes in N g =d . This is expressed in the condition suc g i (n) ∉ N g <d which, if satisfied, yields a fringe node n i . That is, a fresh fringe node is inserted for each successor of a retained node that is not a retained node itself. As fringe nodes function as a replacement for cut-off sub-term graphs, they are labelled with ⊥ and have no successors.
But there is another circumstance that can give rise to a fringe node: if depth g (n) ≥ d − 1 and n ∉ Pre a g (suc g i (n)), we also get a fringe node n i . This condition is satisfied whenever an outgoing edge from a retained node closes a cycle. The lower bound for the depth is chosen such that a successor node of n is not necessarily a retained node. An example is depicted in Figure 7a. For depth d = 2, the node n in the term graph g is just above the fringe, i.e. it satisfies depth g (n) ≥ d − 1. Moreover, it has an edge to the node r that closes a cycle. Hence, the rigid truncation g ‡2 contains the fringe node n 0 , which is now the 0-th successor of n. We chose this admittedly complicated notion of truncation in order to make it compatible with the partial order ≤ R ⊥ : first of all, the rigid truncation of a term graph is supposed to yield a smaller term graph w.r.t. the rigid partial order ≤ R ⊥ , i.e. g ‡d ≤ R ⊥ g. Hence, whenever a node is kept as a retained node, its acyclic positions also have to be kept in order to preserve its upward structure. To achieve this, with each node also its acyclic ancestors have to be retained. The closure condition (T2) is enforced exactly for this purpose.
To see what this means, consider Figure 7b. It shows a term graph g and its truncation at depth 2, once without the closure condition (T2), denoted g †2, and once including (T2), denoted g ‡2. The grey area highlights the nodes that are at depth smaller than 2, i.e. the nodes contained in N g <2 due to (T1). The nodes within the area surrounded by a dashed line are all the nodes in N g <2 . One can observe that with the simple truncation g †2 without (T2), we do not have g †2 ≤ R ⊥ g. The reason in this particular example is the bottommost h-node, whose acyclic sharing in g differs from that in the simple truncation g †2 as one of its predecessors was removed due to the truncation. This effect is avoided in our definition of rigid truncation, which always includes all acyclic predecessors of a node. Nevertheless, the simple truncation g †d has its benefits. It is much easier to work with and provides a natural counterpart for the simple partial order ≤ S ⊥ [10]. The following lemma confirms that we were indeed successful in making the truncation of term graphs compatible with the rigid partial order ≤ R ⊥ : Lemma 6.2 (rigid truncation is smaller). Given g ∈ G ∞ (Σ ⊥ ) and d ≤ ω, we have that g ‡d ≤ R ⊥ g. Proof. The cases d = ω and d = 0 are trivial. Assume 0 < d < ω and define the function φ : N g ‡d → N g by φ(n) = n for each n ∈ N g <d and φ(n i ) = suc g i (n) for each n i ∈ N g =d . We will show that φ is a rigid ⊥-homomorphism from g ‡d to g and, thereby, g ‡d ≤ R ⊥ g. Since r g ‡d = r g and r g ‡d ∈ N g <d , we have φ(r g ‡d ) = r g and, therefore, the root condition. Note that all nodes in N g =d are labelled with ⊥ in g ‡d, i.e. all non-⊥-nodes are in N g <d . Thus, the labelling condition is trivially satisfied as for all n ∈ N g <d we have lab g ‡d (n) = lab g (n) = lab g (φ(n)).
For the successor condition, let n ∈ N g <d . If n i ∈ N g =d , then suc g ‡d i (n) = n i . Hence, we have φ(suc g ‡d i (n)) = φ(n i ) = suc g i (n) = suc g i (φ(n)). If, on the other hand, n i ∉ N g =d , then suc g ‡d i (n) = suc g i (n) ∈ N g <d . Hence, we have φ(suc g ‡d i (n)) = φ(suc g i (n)) = suc g i (n) = suc g i (φ(n)). This shows that φ is a ⊥-homomorphism. In order to prove that φ is rigid, we will show that P a g (φ(n)) ⊆ P g ‡d (n) for all n ∈ N g <d , which is sufficient according to Lemma 5.7. Note that we can replace φ(n) by n since n ∈ N g <d . Therefore, we can show this statement by proving ∀π ∈ N * ∀n ∈ N g <d . (π ∈ P a g (n) =⇒ π ∈ P g ‡d (n)) by induction on the length of π. If π = ε, then n = r g and, therefore, π ∈ P g ‡d (n). If π = π ′ · i, let m = node g (π ′ ). Then we have m ∈ Pre a g (n) and, therefore, m ∈ N g <d by the closure property (T2). And since π ′ ∈ P a g (m), we can apply the induction hypothesis to obtain that π ′ ∈ P g ‡d (m). Moreover, because suc g i (m) = n ∈ N g <d , this implies that m i ∉ N g =d . Thus, suc g ‡d i (m) = n and, therefore, π ′ · i ∈ P g ‡d (n). Also note that the rigid truncation on term graphs generalises Arnold and Nivat's [4] truncation on terms. Proposition 6.3. For each t ∈ T ∞ (Σ ⊥ ) and d ≤ ω, we have that t ‡d ∼ = t|d.
Proof. For the case that d ∈ {0, ω}, the equation t ‡d = t|d holds trivially. For the other cases, we can easily see that t|d is obtained from t by replacing all subterms at depth d by ⊥. On the other hand, since in a term tree each node has at most one (acyclic) predecessor, which has a strictly smaller depth, we know that the set of retained nodes N t <d is the set of nodes of depth smaller than d and that the set of fringe nodes N t =d is the set {n i | n ∈ N t , depth t (suc t i (n)) = d}. Hence, t ‡d is obtained from t by replacing each node at depth d with a fresh node labelled ⊥. We can thus conclude that t ‡d ∼ = t|d.
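On term trees, Arnold and Nivat's truncation and the derived similarity can be sketched as follows. The encoding (terms as nested tuples) is ours, and the bound `limit` is an artefact of working with finite terms only:

```python
# Hypothetical sketch of truncation t|d and the similarity sim on terms.
BOT = ("⊥",)

def truncate(t, d):
    """Replace every subterm at depth d by ⊥ (the truncation t|d)."""
    if d == 0:
        return BOT
    return (t[0],) + tuple(truncate(s, d - 1) for s in t[1:])

def sim(s, t, limit=64):
    """Similarity: the greatest d <= limit with s|d == t|d."""
    best = 0
    for d in range(limit + 1):
        if truncate(s, d) == truncate(t, d):
            best = d
        else:
            break
    return best

s = ("f", ("a",), ("g", ("a",)))
t = ("f", ("a",), ("g", ("b",)))
# s|2 == t|2 but s|3 != t|3, so sim(s, t) = 2 and d(s, t) = 2 ** -2.
```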
Consequently, if we use the rigid truncation to define a metric on term graphs analogously to Arnold and Nivat, we obtain a metric on term graphs that generalises the metric d on terms.
6.2. The Effect of Truncation. In order to characterise the effect of a truncation on a term graph, we need to associate an appropriate notion of depth with a whole term graph: Definition 6.4 (symbol/graph depth). Let g ∈ G ∞ (Σ) and ∆ ⊆ Σ.
(i) The depth of g, denoted depth(g), is the least upper bound of the depths of the nodes in g, i.e. depth(g) = sup {depth g (n) | n ∈ N g }.
(ii) The ∆-depth of g, denoted ∆-depth(g), is the minimum depth of the nodes in g labelled in ∆, i.e. ∆-depth(g) = min {depth g (n) | n ∈ N g , lab g (n) ∈ ∆}, where we take ∆-depth(g) = ω if no node of g is labelled in ∆.
Notice the difference between depth and ∆-depth. The former is the least upper bound of the depths of the nodes in a term graph, whereas the latter is the minimum depth of the nodes labelled by a symbol in ∆. Thus, we have that depth(g) = ω iff g is infinite; and ∆-depth(g) = ω iff g does not contain a ∆-node. In the following, we will prove a number of lemmas that show how ∆-homomorphisms preserve the depth of nodes in term graphs. Understanding how ∆-homomorphisms affect the depth of nodes will become important for relating the rigid truncation to the rigid partial order ≤ R ⊥ . Lemma 6.5 (reverse depth preservation of ∆-homomorphisms). Let g, h ∈ G ∞ (Σ) and φ : g → ∆ h. Then, for each n ∈ N h with depth h (n) ≤ ∆-depth(g), there is some m ∈ N g with φ(m) = n and depth g (m) ≤ depth h (n). Proof. We prove the statement by induction on depth h (n). If depth h (n) = 0, then n = r h . With m = r g , we have φ(m) = n and depth g (m) = 0. If depth h (n) > 0, then there is some n ′ ∈ N h with suc h i (n ′ ) = n and depth h (n ′ ) < depth h (n). Hence, we can employ the induction hypothesis to obtain some m ′ ∈ N g with depth g (m ′ ) ≤ depth h (n ′ ) and φ(m ′ ) = n ′ . Since depth g (m ′ ) ≤ depth h (n ′ ) < depth h (n) ≤ ∆-depth(g), we have lab g (m ′ ) ∉ ∆. Hence, φ is homomorphic in m ′ . For m = suc g i (m ′ ), we can then reason as follows: φ(m) = φ(suc g i (m ′ )) = suc h i (φ(m ′ )) = suc h i (n ′ ) = n, and depth g (m) ≤ depth g (m ′ ) + 1 ≤ depth h (n). Lemma 6.6 (∆-depth preservation of ∆-homomorphisms). Let g, h ∈ G ∞ (Σ) and φ : g → ∆ h. Then ∆-depth(g) ≤ ∆-depth(h).
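The two notions of depth can be sketched with a breadth-first traversal. The encoding (a graph as a dictionary mapping each node to its label and successor list) is ours, not the paper's:

```python
from collections import deque

def node_depths(graph, root):
    """Depth of each node: length of a shortest path from the root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        n = queue.popleft()
        for s in graph[n][1]:
            if s not in depth:          # first visit = shortest path
                depth[s] = depth[n] + 1
                queue.append(s)
    return depth

def graph_depth(graph, root):
    """depth(g): least upper bound of the node depths."""
    return max(node_depths(graph, root).values())

def delta_depth(graph, root, delta):
    """∆-depth(g): minimum depth of a ∆-labelled node, ω if there is none."""
    ds = [d for n, d in node_depths(graph, root).items()
          if graph[n][0] in delta]
    return min(ds) if ds else float("inf")   # float("inf") stands for ω

# A small cyclic example: f-node at the root pointing back to itself.
g = {"r": ("f", ["m", "r"]), "m": ("h", ["b"]), "b": ("c", [])}
# depth: r=0, m=1, b=2; ∆-depth for ∆={"c"} is 2, for ∆={"⊥"} it is ω.
```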
Proof. Let n ∈ N h with depth h (n) < ∆-depth(g). To prove the lemma, we have to show that lab h (n) ∉ ∆. According to Lemma 6.5, we find a node m ∈ N g with depth g (m) ≤ depth h (n) < ∆-depth(g) and φ(m) = n. Since then lab g (m) ∉ ∆, the labelling condition for φ yields lab h (n) = lab g (m) ∉ ∆.
Lemma 6.7 (depth preservation of rigid ∆-homomorphisms). Let g, h ∈ G ∞ (Σ), φ : g → ∆ h a rigid ∆-homomorphism and n ∈ N g with lab g (n) ∉ ∆. Then depth g (n) = depth h (φ(n)). Proof. Since lab g (n) ∉ ∆, we have P a g (n) = P a h (φ(n)) by rigidity. Hence, depth g (n) = depth h (φ(n)) follows since a shortest position of a node must be acyclic.
The gaps that are caused by a truncation due to the removal of nodes are filled by fresh ⊥-nodes. The following lemma provides a lower bound for the depth of the introduced ⊥-nodes. Lemma 6.8 (rigid truncation and depth). Let g ∈ G ∞ (Σ) and d < ω. Then (i) ⊥-depth(g ‡d) ≥ d, and (ii) g ‡d = g if d > depth(g) + 1. Proof. (i) From the proof of Lemma 6.2, we obtain a rigid ⊥-homomorphism φ : g ‡d → ⊥ g.
Note that the only ⊥-nodes in g ‡d are those in N g =d . Each of these nodes has only a single predecessor, a node n ∈ N g <d with depth g (n) ≥ d − 1. By Lemma 6.7, we also have depth g ‡d (n) ≥ d − 1 for these nodes since φ is rigid, n is not labelled with ⊥ and φ(n) = n. Hence, we have depth g ‡d (m) ≥ d for each node m ∈ N g =d . Consequently, it holds that ⊥-depth(g ‡d) ≥ d.
(ii) Note that if d > depth(g) + 1, then N g <d = N g and N g =d = ∅. Hence, g ‡d = g. Remark 6.9. Note that the precondition for the statement of clause (ii) in the lemma above reads d > depth(g) + 1 rather than d > depth(g) as one might expect. The reason for this is that a truncation might cut off an edge that emanates from a node at depth d − 1 and closes a cycle. For an example of this phenomenon, take a look at Figure 7a. It shows a term graph g of depth 1 and its rigid truncation at depth 2. Even though there is no node at depth 2 the truncation introduces a ⊥-node.
On the other hand, even though a term graph may have depth greater than d, the truncation at depth d might still preserve the whole term graph. An example of this behaviour is the family of term graphs (g n ) n<ω depicted in Figure 7a. Each of the term graphs g n has depth n + 1. Yet, the truncation at depth 2 preserves the whole term graph g n for each n > 0. Even though there might be h-nodes at depth ≥ 2, these nodes are directly or indirectly acyclic predecessors of the a-node and are, thus, included in N gn <2 . Intuitively, the following lemma states that a rigid ⊥-homomorphism has the properties of an isomorphism up to the depth of the shallowest ⊥-node: Lemma 6.10 (≤ R ⊥ and rigid truncation). Given g, h ∈ G ∞ (Σ ⊥ ) and d < ω with g ≤ R ⊥ h and ⊥-depth(g) ≥ d, we have that g ‡d ∼ = h ‡d.
The proof of the above lemma is based on a generalisation of Lemma 6.7, which states that rigid ⊥-homomorphisms map non-⊥-nodes to nodes of the same depth. However, since the rigid truncation of a term graph does not only depend on the depth of nodes but also the acyclic sharing in the term graph, we cannot rely on this statement on the depth of nodes alone. The two key components of the proof of Lemma 6.10 are (1) the property of rigid ⊥-homomorphisms to map retained nodes of the source term graph exactly to the retained nodes of the target term graph and (2) that in the same way fringe nodes are exactly mapped to fringe nodes. Showing the isomorphism between g ‡d and h ‡d can thus be reduced to the injectivity on retained nodes in g ‡d which is obtained from the rigid ⊥-homomorphism from g to h by applying Lemma 5.6. The full proof of Lemma 6.10 is given in Appendix B.
We can use the above findings in order to obtain the following properties of truncations that one would intuitively expect from a truncation operation: Lemma 6.11 (smaller truncations). For all g, h ∈ G ∞ (Σ) and e ≤ d ≤ ω, the following holds: (i) g ‡e ∼ = (g ‡d) ‡e , and (ii) g ‡d ∼ = h ‡d =⇒ g ‡e ∼ = h ‡e.
(ii) Since g ‡d ∼ = h ‡d, we also have (g ‡d) ‡e ∼ = (h ‡d) ‡e, as the construction of the truncation only depends on the structure of the term graphs. Hence, using (i) we can conclude g ‡e ∼ = (g ‡d) ‡e ∼ = (h ‡d) ‡e ∼ = h ‡e.

6.3. Deriving a Metric on Term Graphs.
We may now define a rigid distance measure on canonical term graphs in the style of Arnold and Nivat: Definition 6.12 (rigid distance). The rigid similarity of two term graphs g, h ∈ G ∞ C (Σ), written sim ‡ (g, h), is the maximum depth at which the rigid truncations of both term graphs coincide: sim ‡ (g, h) = max {d ≤ ω | g ‡d ∼ = h ‡d} . The rigid distance between g and h is then defined as d ‡ (g, h) = 2 −sim ‡ (g,h) , where we interpret 2 −ω as 0.
Indeed, the resulting distance forms an ultrametric on the set of canonical term graphs: Proposition 6.13 (rigid ultrametric). The pair (G ∞ C (Σ), d ‡ ) forms an ultrametric space. Proof. For the identity condition, note that d ‡ (g, h) = 0 iff sim ‡ (g, h) = ω iff g ‡ω ∼ = h ‡ω iff g = h, where the last step uses Proposition 4.16. The symmetry condition d ‡ (g, h) = d ‡ (h, g) follows immediately from the symmetry of sim ‡ (·, ·). For the strong triangle condition, we have to show that d ‡ (g 1 , g 3 ) ≤ max {d ‡ (g 1 , g 2 ), d ‡ (g 2 , g 3 )}, which is equivalent to sim ‡ (g 1 , g 3 ) ≥ min {sim ‡ (g 1 , g 2 ), sim ‡ (g 2 , g 3 )}. Let d = sim ‡ (g 1 , g 2 ) and e = sim ‡ (g 2 , g 3 ). By symmetry, we can assume w.l.o.g. that d ≤ e, i.e. d = min {sim ‡ (g 1 , g 2 ), sim ‡ (g 2 , g 3 )}. By definition of rigid similarity, we have both g 1 ‡d ∼ = g 2 ‡d and g 2 ‡e ∼ = g 3 ‡e. From the latter we obtain, by Lemma 6.11, that g 2 ‡d ∼ = g 3 ‡d.
That is, g 1 ‡d ∼ = g 2 ‡d ∼ = g 3 ‡d, which means that sim ‡ (g 1 , g 3 ) ≥ d. Example 6.14. Figures 8c and 9d on pages 43 and 45, respectively, show two sequences of term graphs that are converging in the metric space (G ∞ C (Σ), d ‡ ). In the sequence (h i ) i<ω from Figure 8c, we have that the rigid truncation at 0 is trivially ⊥ for all term graphs in the sequence. From h 1 onwards, the rigid truncation at 1 is the term tree ⊥ :: ⊥; from h 2 onwards, the rigid truncation at 2 is the term tree b :: ⊥ :: ⊥; etc. Hence, for each n < ω, the metric distance d ‡ (h i , h j ) between two term graphs from h n onwards, i.e. with n ≤ i, j < ω, is at most 2 −n . That is, the sequence (h i ) i<ω is Cauchy. Even more, for the term tree h ω = b :: b :: b :: . . . depicted in Figure 8c we also have that h ω ‡0 = ⊥, h ω ‡1 = ⊥ :: ⊥, h ω ‡2 = b :: ⊥ :: ⊥, etc. Hence, for each n < ω, the metric distance d ‡ (h n , h ω ) is at most 2 −n . That is, the sequence (h i ) i<ω converges to h ω . In a similar fashion, the sequence depicted in Figure 9d converges as well. Figure 9e shows a sequence (g i ) i<ω of term graphs that does not converge. In fact, it is not even Cauchy. To see this, notice that the c-node is at depth 1 in g 0 and at depth 2 from g 1 onwards. As in each term graph g i the c-node is reachable from any node in g i without forming a cycle, we have that each node is an acyclic ancestor of the c-node. That is, whenever the c-node is retained by a rigid truncation, so is any other node. Consequently, we have that g i ‡d = g i for each i < ω and d > 2. Hence, the metric distance d ‡ (g i , g j ) between each pair of term graphs with i ≠ j is at least 2 −2 . That is, (g i ) i<ω is not Cauchy.
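The strong triangle inequality argued above can be checked mechanically. The sketch below works on term trees, where the rigid distance coincides with the term distance; the tuple encoding is ours and the example terms are hypothetical:

```python
import math

BOT = ("⊥",)

def truncate(t, d):
    if d == 0:
        return BOT
    return (t[0],) + tuple(truncate(c, d - 1) for c in t[1:])

def sim(s, t, limit=32):
    """Similarity: ω (math.inf) for equal terms, else the deepest agreement."""
    if s == t:
        return math.inf
    d = 0
    while d < limit and truncate(s, d + 1) == truncate(t, d + 1):
        d += 1
    return d

def dist(s, t):
    return 2.0 ** -sim(s, t)      # 2 ** -inf evaluates to 0.0

t1 = ("f", ("a",), ("a",))
t2 = ("f", ("a",), ("b",))
t3 = ("f", ("b",), ("b",))
# Strong triangle condition: d(x, z) <= max(d(x, y), d(y, z)).
for x, y, z in [(t1, t2, t3), (t2, t1, t3), (t1, t3, t2)]:
    assert dist(x, z) <= max(dist(x, y), dist(y, z))
```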
Since we defined the metric on term graphs in the same style as Arnold and Nivat [4] defined the metric d on terms, we can use the correspondence between the rigid truncation and the truncation on terms in order to derive that the metric d ‡ generalises the metric d on terms. Corollary 6.15 (rigid metric generalises metric on terms). For all s, t ∈ T ∞ (Σ), we have that d ‡ (s, t) = d(s, t). Proof. Follows from Proposition 6.3.
From the above observation, we obtain that convergence in the metric space (G ∞ C (Σ), d ‡ ) is a conservative extension of convergence in the metric space (T ∞ (Σ), d): Proposition 6.16 (conservative metric extension). Let S be a sequence over T ∞ (Σ) and t ∈ T ∞ (Σ). Then S converges to t in (G ∞ C (Σ), d ‡ ) iff S converges to t in (T ∞ (Σ), d). Proof. The "if" direction follows immediately from Corollary 6.15.
For the "iff" direction we assume a sequence S over T ∞ (Σ) that converges to t in (G ∞ C (Σ), d ‡ ). Consequently, S is also Cauchy in (G ∞ C (Σ), d ‡ ). Due to Corollary 6.15, S is then also Cauchy in (T ∞ (Σ), d). Since (T ∞ (Σ), d) is complete, S converges to some term t ′ in (T ∞ (Σ), d). Using the "if" direction of this proposition, we then obtain that S converges to t ′ in (G ∞ C (Σ), d ‡ ). Since limits are unique in metric spaces, we can conclude that t = t ′ .

7. Metric vs. Partial Order Convergence
In this section, we study both the partially ordered set (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) and the metric space (G ∞ C (Σ), d ‡ ). In particular, we are interested in the notion of convergence that each of the two structures provides. We shall show that on total term graphs, i.e. in G ∞ C (Σ), both structures yield the same notion of convergence. That is, we obtain the same correspondence that we already know from infinitary term rewriting as stated in Theorem 3.3. Moreover, as a side product, this finding will also show the completeness of the metric space (G ∞ C (Σ), d ‡ ). The cornerstone of this comparison of the rigid metric d ‡ and the rigid partial order ≤ R ⊥ is the following characterisation of the rigid similarity sim ‡ (·, ·) in terms of greatest lower bounds in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ): Proposition 7.1 (characterisation of rigid similarity). Let g, h ∈ G ∞ C (Σ) and g ⊓ h the greatest lower bound of g and h in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ). Then sim ‡ (g, h) = ⊥-depth(g ⊓ h). Proof. At first, assume that g = h. Hence, g ⊓ h = g and, consequently, ⊥-depth(g ⊓ h) = ω as g does not contain any node labelled ⊥. Moreover, g = h implies g ‡ω ∼ = h ‡ω and, therefore, sim ‡ (g, h) = ω. If g ≠ h, then g ≇ h by Proposition 4.16. Hence, sim ‡ (g, h) < ω. Moreover, since g ≇ h, we know that g ⊓ h has to be strictly smaller than g or h w.r.t. ≤ R ⊥ . Hence, according to Proposition 5.20, g ⊓ h has to contain some node labelled ⊥, i.e. ⊥-depth(g ⊓ h) < ω as well. We prove that ⊥-depth(g ⊓ h) = sim ‡ (g, h) by showing that both ⊥-depth(g ⊓ h) ≤ sim ‡ (g, h) and ⊥-depth(g ⊓ h) ≥ sim ‡ (g, h) hold.
In order to show the former, let d = ⊥-depth(g ⊓ h). Since g ⊓ h ≤ R ⊥ g, h, we can apply Lemma 6.10 twice in order to obtain g ‡d ∼ = (g ⊓ h) ‡d ∼ = h ‡d. Hence, sim ‡ (g, h) ≥ d.
To show the converse direction, let d = sim ‡ (g, h), i.e. g ‡d ∼ = h ‡d. According to Lemma 6.2, we have both g ‡d ≤ R ⊥ g and h ‡d ≤ R ⊥ h. Note that, for the canonical representation, we then have C(g ‡d) = C(h ‡d), C(g ‡d) ≤ R ⊥ g and C(h ‡d) ≤ R ⊥ h (cf. Proposition 4.16 respectively Remark 5.13). That is, C(g ‡d) is a lower bound of g and h. Thus, C(g ‡d) ≤ R ⊥ g ⊓ h and we can reason as follows: d ≤ ⊥-depth(g ‡d) (Lem. 6.8) = ⊥-depth(C(g ‡d)) (Lem. 6.7, Cor. 5.8) ≤ ⊥-depth(g ⊓ h) (Lem. 6.6). Remark 7.2. From now on, we are not dealing with the concrete construction of rigid truncations g ‡d anymore. Therefore, we will rather use the canonical representation C(g ‡d) of g ‡d. In order to avoid the notational overhead, we write g ‡d instead of C(g ‡d).
Proposition 7.3 (Cauchy sequences converge to the limit inferior). Every Cauchy sequence (g ι ) ι<α in the metric space (G ∞ C (Σ), d ‡ ) converges; its limit is the limit inferior of (g ι ) ι<α in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), which is in particular a total term graph in G ∞ C (Σ). This result has two obvious but important consequences: firstly, the limit of a converging sequence in the rigid metric space is equal to the limit inferior in the rigid complete semilattice. Secondly, the rigid metric space (G ∞ C (Σ), d ‡ ) is complete: Theorem 7.4 (completeness of rigid metric). The metric space (G ∞ C (Σ), d ‡ ) is complete. Proof. Immediate consequence of Proposition 7.3.
In the following proposition, we show the converse direction of the relation between the limits of the rigid metric and the limit inferiors of the rigid partial order: Proposition 7.5 (total limit inferior = limit). Let (g ι ) ι<α be a non-empty sequence in (G ∞ C (Σ), d ‡ ) whose limit inferior in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) is total. Then (g ι ) ι<α converges and lim ι→α g ι = lim inf ι→α g ι . Proof. If α is a successor ordinal, then both the limit and the limit inferior are equal to g α−1 . Let α be a limit ordinal. According to Proposition 7.3, in order to show that (g ι ) ι<α converges and that its limit coincides with its limit inferior, it suffices to prove that (g ι ) ι<α is Cauchy.
Note that Proposition 7.5 depends on the finiteness of the arity of the symbols in the signature. (This is used in the proof above when observing that a term graph has only finitely many positions of a bounded length.) This restriction also applies to terms as the following example shows: Example 7.6. Let Σ = {f /ω, a/0, b/0} and (t i ) i<ω a sequence with f (a, a, a, a, a . . . ), f (b, a, a, a, a . . . ), f (b, b, a, a, a . . . ), f (b, b, b, a, a . . . ), . . .
(t i ) i<ω has the limit inferior f (b, b, b, b, b, . . . ). On the other hand, the sequence is not even Cauchy since, for each i ≠ j, we have sim ‡ (t i , t j ) = 1 and, therefore, d ‡ (t i , t j ) = 1/2.

8. Infinitary Term Graph Rewriting
In the previous sections, we have constructed and investigated the necessary metric and partial order structures upon which the infinitary calculus of term graph rewriting that we shall introduce in this section is based. After describing the framework of term graph rewriting that we consider, we will explore two different modes of convergence on term graphs. In the same way that infinitary term rewriting is based on the abstract notions of m- and p-convergence [6], infinitary term graph rewriting is an instantiation of these abstract modes of convergence for term graphs. However, as in the overview of infinitary term rewriting in Section 2, we restrict ourselves to weak notions of convergence.
8.1. Term Graph Rewriting Systems. In this paper, we adopt the term graph rewriting framework of Barendregt et al. [11]. In order to represent placeholders in rewrite rules, this framework uses variables -in a manner much similar to term rewrite rules. To this end, we consider a signature Σ V = Σ ⊎ V that extends the signature Σ with a set V of nullary variable symbols.
Definition 8.1 (GRSs). (i) Given a signature Σ, a term graph rule ρ over Σ is a triple (g, l, r) where g is a graph over Σ V and l, r ∈ N g such that all nodes in g are reachable from l or r. We write ρ l respectively ρ r to denote the left- respectively right-hand side of ρ, i.e. the term graph g| l respectively g| r . Additionally, we require that, for each variable v ∈ V, there is at most one node n in g labelled v and that n is different from but still reachable from l. (ii) A term graph rewriting system (GRS) R is a pair (Σ, R) with Σ a signature and R a set of term graph rules over Σ.
The requirement that the root l of the left-hand side is not labelled with a variable symbol is analogous to the requirement that the left-hand side of a term rule is not a variable. Similarly, the restriction that nodes labelled with variable symbols must be reachable from the root of the left-hand side corresponds to the restriction on term rules that every variable occurring on the right-hand side of a rule must also occur on the left-hand side. Term graphs can be used to compactly represent terms, which is formalised by the unravelling operator U(·). We extend this operator to term graph rules. Figure 8a illustrates two term graph rules that both represent the term rule a :: x → b :: a :: x from Example 3.1, to which they unravel. Definition 8.2 (unravelling of term graph rules). Let ρ be a term graph rule with ρ l and ρ r its left- respectively right-hand side term graph. The unravelling of ρ, denoted U(ρ), is the term rule U(ρ l ) → U(ρ r ).
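The unravelling operator U(·) can be sketched as follows. Since a cyclic term graph unravels to an infinite term, this illustration (in our own encoding, with graphs as dictionaries and terms as nested tuples) cuts the unravelling off at a given depth, filling the holes with ⊥:

```python
# Hypothetical sketch of a depth-bounded unravelling of a term graph.
def unravel(graph, node, depth):
    """Unfold the term graph rooted at `node` into a term, replacing
    everything below the given depth by ⊥."""
    if depth == 0:
        return ("⊥",)
    label, succs = graph[node]
    return (label,) + tuple(unravel(graph, s, depth - 1) for s in succs)

# A cyclic graph representing the infinite term 0 :: 0 :: 0 :: ...
# (the list from the rep(x) -> x :: rep(x) example in the introduction).
g = {"r": ("::", ["z", "r"]), "z": ("0", [])}
unravel(g, "r", 3)
# -> ("::", ("0",), ("::", ("0",), ("::", ("⊥",), ("⊥",))))
```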
The application of a rewrite rule ρ (with root nodes l and r) to a term graph g is performed in three steps: at first a suitable sub-term graph of g rooted in some node n of g is matched against the left-hand side of ρ. This amounts to finding a V-homomorphism φ : ρ l → V g| n from the term graph rooted in l to the sub-term graph rooted in n, the redex. The V-homomorphism φ allows us to instantiate variables in the rule with sub-term graphs of the redex. In the second step, nodes and edges in ρ that are not reachable from l are copied into g, such that each edge pointing to a node m in the term graph rooted in l is redirected to φ(m). In the last step, all edges pointing to n are redirected to (the copy of) r and all nodes not reachable from the root of (the now modified version of) g are removed.
The formal definition of this construction is given below:

Definition 8.3 (application of term graph rewrite rules, [11]). Let ρ = (N ρ , lab ρ , suc ρ , l ρ , r ρ ) be a term graph rewrite rule in a GRS R = (Σ, R), g ∈ G ∞ (Σ) with N ρ ∩ N g = ∅ and n ∈ N g . ρ is called applicable to g at n if there is a V-homomorphism φ : ρ l → V g| n . φ is called the matching V-homomorphism of the rule application, and g| n is called a ρ-redex. Next, we define the result of the application of the rule ρ to g at n using the V-homomorphism φ. This is done by constructing the intermediate graphs g 1 and g 2 , and the final result g 3 .
(i) The graph g 1 is obtained from g by adding the part of ρ that is not contained in its left-hand side: N g1 = N g ∪ (N ρ \ N ρl ), where labels and successors are taken over from g and ρ, except that each edge of ρ pointing to a node m of ρ l is redirected to φ(m). The node n ′ is defined by n ′ = φ(r ρ ) if r ρ ∈ N ρl , and n ′ = r ρ otherwise.
(ii) The graph g 2 is obtained from g 1 by redirecting edges ending in n to n ′ : suc g2 i (m) = n ′ if suc g1 i (m) = n, and suc g2 i (m) = suc g1 i (m) otherwise.
(iii) The term graph g 3 is obtained by setting the root node r ′ , which is n ′ if n = r g , and r g otherwise. That is, g 3 = g 2 | r ′ . This also means that all nodes not reachable from r ′ are removed.
This induces a pre-reduction step ψ = (g, n, ρ, n ′ , g 3 ) from g to g 3 , written ψ : g → n,ρ,n ′ g 3 . In order to indicate the underlying GRS R, we also write ψ : g → R g 3 .
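The three-step construction can be sketched in Python for finite term graphs. The encoding and all names (match, apply_rule, ...) are ours, and matching is implemented by naive recursion rather than via the paper's formal V-homomorphisms, but the three steps (build g1, redirect edges to n′, garbage-collect from the new root) mirror (i)-(iii) above:

```python
# Hedged sketch of the three-step rule application on finite term graphs.
# A graph is a dict node -> (label, successor list); variable nodes are
# labelled ('var', name). All names are ours, not the paper's.

def match(g, rule, rl, n, phi=None):
    """Try to build a V-homomorphism phi from the lhs rooted at rl into g|n."""
    phi = {} if phi is None else phi
    if rl in phi:                                    # shared node: must agree
        return phi if phi[rl] == n else None
    lab, succs = rule[rl]
    if isinstance(lab, tuple) and lab[0] == 'var':   # variables match anything
        phi[rl] = n
        return phi
    glab, gsuccs = g[n]
    if glab != lab or len(gsuccs) != len(succs):
        return None
    phi[rl] = n
    for rs, gs in zip(succs, gsuccs):
        if match(g, rule, rs, gs, phi) is None:
            return None
    return phi

def reachable(nodes, root):
    seen, todo = set(), [root]
    while todo:
        m = todo.pop()
        if m not in seen:
            seen.add(m)
            todo += nodes[m][1]
    return seen

def apply_rule(g, root, rule, l, r, n):
    phi = match(g, rule, l, n)
    assert phi is not None, 'rule not applicable at n'
    lhs = reachable(rule, l)
    fresh = {m: ('new', m) for m in rule if m not in lhs}
    tgt = lambda m: phi[m] if m in lhs else fresh[m]
    # step (i): g1 = g plus the part of the rule outside its lhs,
    # with rule edges into the lhs redirected via phi
    g1 = dict(g)
    for m, copy in fresh.items():
        lab, succs = rule[m]
        g1[copy] = (lab, [tgt(s) for s in succs])
    n2 = tgt(r)                                      # the node n' of the definition
    # step (ii): redirect all edges ending in n to n2
    g2 = {m: (lab, [n2 if s == n else s for s in succs])
          for m, (lab, succs) in g1.items()}
    # step (iii): new root, then drop unreachable nodes
    root2 = n2 if root == n else root
    return {m: g2[m] for m in reachable(g2, root2)}, root2

# A rule in the style of rho1, a :: x -> b :: a :: x, with the variable
# node 2 shared by both sides; applied once to the term a :: c.
rule = {0: ('::', [1, 2]), 1: ('a', []), 2: (('var', 'x'), []),
        3: ('::', [4, 5]), 4: ('b', []), 5: ('::', [6, 2]), 6: ('a', [])}
g = {10: ('::', [11, 12]), 11: ('a', []), 12: ('c', [])}
h, rt = apply_rule(g, 10, rule, 0, 3, 10)            # result unravels to b :: a :: c
```

Note how the variable instantiation comes for free: the copied right-hand side simply keeps pointing at the redex node that the variable was matched to.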
Examples of term graph (pre-)reduction steps are shown in Figure 8. We revisit them in more detail in Example 8.9 below.
The definition of term graph rewriting in the form of pre-reduction steps is very operational in style. The result of applying a rewrite rule to a term graph is constructed in several steps by manipulating nodes and edges explicitly. While this is beneficial for implementing a rewriting system, it is problematic for reasoning about term graphs up to isomorphism, which is necessary for introducing notions of convergence. In our case, however, this does not cause any harm since the construction in Definition 8.3 is invariant under isomorphism:

Proposition 8.4 (pre-reduction steps). Let φ : g → n,ρ,m h be a pre-reduction step in some GRS R and ψ 1 : g ′ ≅ g. Then there is a pre-reduction step φ ′ : g ′ → n ′ ,ρ,m ′ h ′ with ψ 2 : h ′ ≅ h such that ψ 1 (n ′ ) = n and ψ 2 (m ′ ) = m.
Proof. Immediate from the construction in Definition 8.3.
The above finding justifies the following definition of reduction steps:

Definition 8.5 (reduction steps). Let R = (Σ, R) be a GRS, ρ ∈ R and g, h ∈ G ∞ C (Σ) with n ∈ N g and m ∈ N h . A tuple φ = (g, n, ρ, m, h) is called a reduction step, written φ : g → n,ρ,m h, if there is a pre-reduction step φ ′ : g ′ → n ′ ,ρ,m ′ h ′ with C(g ′ ) = g, C(h ′ ) = h, n = P g ′ (n ′ ), and m = P h ′ (m ′ ). Similarly to pre-reduction steps, we also write φ : g → R h or simply φ : g → h for short.
In other words, a reduction step is a canonicalised pre-reduction step. Note that term graph rules do not provide a duplication mechanism: each variable is allowed to occur at most once. Duplication must always be simulated by sharing. This means, for example, that a variable that is supposed to occur on the right-hand side of a rule must be shared between the left- and the right-hand side. This can be seen in the term graph rules in Figure 8a. The sharing can be direct, as in ρ 1 , or indirect, as in ρ 2 . For variables that are supposed to be duplicated by the right-hand side, as in the term rewrite rule h(x) → f (h(x), h(x)), we likewise have to use sharing in order to represent the multiple occurrences of the same variable. This representation can be seen in the corresponding term graph rules in Figure 9a.

Convergence of Transfinite Reductions.
We now employ the partial order ≤ R ⊥ and the metric d ‡ for the purpose of defining convergence of transfinite term graph reductions.
The notion of (transfinite) reductions carries over to GRSs straightforwardly:

Definition 8.6 (transfinite reductions). Let R = (Σ, R) be a GRS. A (transfinite) reduction in R is a sequence (g ι → R g ι+1 ) ι<α of rewriting steps in R.
Analogously to reductions in TRSs, we need a notion of convergence in order to define well-behaved reductions. The two modes of convergence that we introduced for this very purpose in Section 5 and Section 6 are only defined on canonical term graphs. It is therefore crucial to work on reduction steps as opposed to pre-reduction steps.

Definition 8.7 (convergence of reductions). Let R = (Σ, R) be a GRS.
(i) Let S = (g ι → R g ι+1 ) ι<α be a reduction in R. S is m-continuous in R, written S : g 0 ֒→ m R . . . , if the underlying sequence of term graphs (g ι ) ι<α is continuous, i.e. lim ι→λ g ι = g λ for each limit ordinal λ < α. S m-converges to g ∈ G ∞ C (Σ) in R, written S : g 0 ֒→ m R g, if it is m-continuous and lim ι→α g ι = g.
(ii) Let R ⊥ be the GRS (Σ ⊥ , R) over the extended signature Σ ⊥ and S = (g ι → R ⊥ g ι+1 ) ι<α a reduction in R ⊥ . S is p-continuous in R, written S : g 0 ֒→ p R . . . , if lim inf ι→λ g ι = g λ for each limit ordinal λ < α. S p-converges to g ∈ G ∞ C (Σ ⊥ ) in R, written S : g 0 ֒→ p R g, if it is p-continuous and lim inf ι→α g ι = g.
(iii) Let S = (g ι → R ⊥ g ι+1 ) ι<α be a reduction in R ⊥ . The reduction S is called p-continuous in G ∞ C (Σ) if it is p-continuous and g ι ∈ G ∞ C (Σ) for all ι < α. The reduction S is said to p-converge in G ∞ C (Σ) to g if it is p-continuous in G ∞ C (Σ) and p-converges to g ∈ G ∞ C (Σ).

Note that, analogously to p-convergence on terms, we extended the signature of R to Σ ⊥ for the definition of p-convergence. Like for terms, this approach serves two purposes. First, by considering the extended signature Σ ⊥ , we allow any partial term graph to appear in a reduction as opposed to only total ones. Consequently, we have the whole complete semilattice (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) at our disposal, which means that p-continuity coincides with p-convergence:

Proposition 8.8. In a GRS, every p-continuous reduction is p-convergent.
Proof. Follows immediately from Corollary 5.16.
The second reason for the extension to R ⊥ is that, by not presupposing that the system's signature Σ already contains a designated ⊥-symbol, we rule out the possibility that this ⊥-symbol occurs in one of the rules of the system. Consequently, any ⊥-symbol present in the final term graph of a reduction is either due to the initial term graph or due to the convergence behaviour. This is crucial for establishing a correspondence result between m- and p-convergence in the vein of Theorem 3.3.

Example 8.9. Consider the term graph rule ρ 1 in Figure 8a, which unravels to the term rule a :: x → b :: a :: x from Example 3.1. Starting with the term tree a :: c, depicted as g 1 in Figure 8b, we obtain the same transfinite reduction as in Example 3.1: S : a :: c → ρ 1 b :: a :: c → ρ 1 b :: b :: a :: c → ρ 1 . . .
Since the modes of convergence of both the partial order ≤ R ⊥ and the metric d ‡ coincide with the corresponding modes of convergence on terms (cf. Proposition 5.19 and Proposition 6.16, respectively), we know that, for reductions consisting only of term trees, both m- and p-convergence in GRSs coincide with the corresponding notions of convergence in TRSs. This observation applies to the reduction S above. Hence, also in this setting of term graph rewriting, S both m- and p-converges to the term tree h ω shown in Figure 8c. Similarly, we can reproduce the p-converging but not m-converging reduction T from Example 3.2.
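The metric intuition behind m-convergence can be illustrated on term trees. Assuming the usual definitions (which the sketch below hard-codes: truncate(t, d) cuts a term at depth d, and d(s, t) = 2^(-sim(s, t)) where sim is the greatest depth at which the truncations still coincide), the terms of the reduction S agree to ever greater depth:

```python
# Metric intuition on term trees encoded as nested tuples. The definitions
# of truncate and sim below are our rendering of the standard metric d on
# terms; the symbol '⊥' marks the cut-off.

def truncate(t, d):
    if d == 0:
        return '⊥'
    return (t[0],) + tuple(truncate(c, d - 1) for c in t[1:])

def sim(s, t, bound=32):
    """Greatest d <= bound such that the depth-d truncations coincide."""
    best = 0
    for d in range(bound + 1):
        if truncate(s, d) != truncate(t, d):
            break
        best = d
    return best

def t_n(k):
    """The k-th term of reduction S: b :: ... :: b :: a :: c with k b's."""
    term = ('::', ('a',), ('c',))
    for _ in range(k):
        term = ('::', ('b',), term)
    return term

# Successive terms agree to ever greater depth, so the distances
# 2**(-sim) halve at each step: the sequence is Cauchy, i.e. m-convergent.
assert [sim(t_n(k), t_n(k + 1)) for k in range(5)] == [1, 2, 3, 4, 5]
```

Since the distances between consecutive terms form the sequence 1/2, 1/4, 1/8, ..., the reduction converges in the metric to the infinite term b :: b :: b :: . . .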
Notice that h ω is a rational term tree as it can be obtained by unravelling the finite term graph g 2 depicted in Figure 8b. In fact, if we use the rule ρ 2 , which unravels to the term rule a :: x → b :: a :: x as well, we can immediately rewrite g 1 to g 2 . In ρ 2 , not only the variable x is shared but the whole left-hand side of the rule. This causes each redex of ρ 2 to be captured by the right-hand side [15]. Figure 8c indicates a transfinite reduction starting with a cyclic term graph h 0 that unravels to the rational term t = a :: a :: a :: . . . . This reduction both m-and p-converges to the rational term tree h ω as well. Again, by using ρ 2 instead of ρ 1 , we can rewrite h 0 to the cyclic term graph g 2 in one step.
For more detailed explanations of the underlying modes of partial order and metric convergence for the reductions above, revisit Example 5.17 and Example 6.14, respectively.
The following theorem shows that the total fragment of p-convergence is in fact equivalent to m-convergence:

Theorem 8.10. For every reduction S in a GRS, the following equivalences hold:
(i) S : g ֒→ p R h in G ∞ C (Σ) iff S : g ֒→ m R h.
(ii) S is p-continuous in G ∞ C (Σ) iff S is m-continuous.

Proof. We only show (i) since (ii) follows similarly.
Let S = (g ι → R ⊥ g ι+1 ) ι<α . For the "only if" direction, assume that S : g ֒→ p R h p-converges in G ∞ C (Σ). Since S p-converges in G ∞ C (Σ), it is a reduction in R. The p-convergence of S implies that lim inf ι→λ g ι = g λ for each limit ordinal λ < α. Since each g ι is total, we have, according to Proposition 7.5, that lim ι→λ g ι = lim inf ι→λ g ι = g λ for each limit ordinal λ < α. Hence, (g ι ) ι<α is continuous in the metric space. Likewise, we also have lim ι→α g ι = lim inf ι→α g ι = h. That is, S m-converges to h. For the "if" direction, assume S : g ֒→ m R h. Since (g ι ) ι<α is continuous, we have that lim ι→λ g ι = g λ for each limit ordinal λ < α. According to Proposition 7.3, we then have that lim inf ι→λ g ι = g λ for each limit ordinal λ < α. Likewise, we also have lim inf ι→α g ι = lim ι→α g ι = h. Hence, S p-converges to h. Since S m-converges, all term graphs in S are total by definition, i.e. S p-converges in G ∞ C (Σ).
Example 8.11. In order to represent term rewrite rules that are not right-linear, i.e. which have multiple occurrences of the same variable on the right-hand side, we have to use sharing to represent the occurrences of a variable by a single node. Consider the term rewrite rule h(x) → f (h(x), h(x)), which duplicates the variable x on the right-hand side. Note that, by repeatedly applying this term rewrite rule starting with the term h(c), we obtain a reduction that m-converges to the full binary tree depicted in Figure 9c. Figure 9a shows three different ways of representing the term rewrite rule h(x) → f (h(x), h(x)) as a term graph rule. The rule ρ 3 has the lowest degree of sharing since it only shares the variable node; ρ 1 has the highest degree of sharing as it shares its complete left-hand side with its right-hand side; ρ 2 lies in between the two.
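The three degrees of sharing can be made concrete in a simple dict encoding (ours, not the paper's): the left-hand side is rooted at node 0, the right-hand side at node 2, and rho1, rho2, rho3 below are our sketches of ρ 1 , ρ 2 , ρ 3 , sharing the whole left-hand side, a single h-node, and only the variable node, respectively. All three unravel to the same rule:

```python
# Three sketches of term graph rules for h(x) -> f(h(x), h(x)).
# Node 0 roots the lhs, node 2 the rhs; node 1 is the variable node.

def unravel(g, n):
    lab, succs = g[n]
    return (lab,) + tuple(unravel(g, s) for s in succs)

x = ('var', 'x')
rho1 = {0: ('h', [1]), 1: (x, []), 2: ('f', [0, 0])}                 # shares whole lhs
rho2 = {0: ('h', [1]), 1: (x, []), 2: ('f', [3, 3]), 3: ('h', [1])}  # shares one h-node
rho3 = {0: ('h', [1]), 1: (x, []), 2: ('f', [3, 4]),
        3: ('h', [1]), 4: ('h', [1])}                                # shares only x

# All three right-hand sides unravel to the same term f(h(x), h(x)):
for rho in (rho1, rho2, rho3):
    assert unravel(rho, 2) == ('f', ('h', (x,)), ('h', (x,)))
```

The difference is only in how many graph nodes the two argument occurrences share: both arguments are the same node in rho1 and rho2, but distinct h-nodes over a shared variable node in rho3.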
Figure 9. Term graph rules for duplicating term rewrite rules; (e) a p-convergent term graph reduction over ρ 3 .

We have observed in Figure 8a before that, by sharing the complete left-hand side with the right-hand side, the redex gets captured by the right-hand side upon applying the rule to a term graph. This can be seen again in Figure 9b. By applying ρ 1 to the term tree h(c) once, we immediately obtain the cyclic term graph g 1 , which unravels to the full binary tree from Figure 9c.
With the rule ρ 2 , we have to go through an m-convergent reduction of length ω, depicted in Figure 9d, in order to obtain the desired term graph normal form that then unravels to the full binary tree as well.
The same can also be achieved via the rule ρ 3 : Starting from h(c) we can construct a reduction that m-converges directly to the full binary tree in Figure 9c. However, we may also form the reduction shown in Figure 9e in which we always contract the leftmost redex. As we can see in the picture, this means that the c-node remains constantly at depth 2 while still reachable from any other node. As we explained in Example 6.14, this means that the reduction does not m-converge. On the other hand, as described in Example 5.17 the reduction p-converges to the partial term graph g ω . In fact, from this term graph g ω we can then construct a reduction that p-converges to the full binary tree.

Term Graph Rewriting vs. Term Rewriting
In order to assess the value of the modes of convergence on term graphs that we introduced in this paper, we need to compare them to the well-established counterparts on terms. We have already observed that, if restricted to term trees, both the partial order ≤ R ⊥ and the metric d ‡ on term graphs coincide with corresponding structures ≤ ⊥ and d on terms, cf. Corollary 5.11 and Corollary 6.15, respectively. The same holds for the modes of convergence derived from these structures, cf. Proposition 5.19 and Proposition 6.16.

Soundness & Completeness of Infinitary Term Graph Rewriting.
Ideally, we would like to see a strong connection between converging reductions in a GRS R and converging reductions in the TRS U(R) that is its unravelling. For example, for m-convergence we want to see that g ֒→ m R h implies U(g) ֒→ m U(R) U(h), i.e. soundness, and vice versa that U(g) ֒→ m U(R) U(h) implies g ֒→ m R h, i.e. completeness. Completeness is already an issue for finitary rewriting [11]: a single term graph redex may correspond to several term redexes due to sharing. Hence, contracting a term graph redex may correspond to several term rewriting steps. For example, given a rewrite rule a → b, we can rewrite the term f (a, a) to f (a, b), whereas the term graph in which the two arguments of f share a single a-node can only be rewritten to a term graph representing f (b, b). That is, in the term graph we cannot choose which of the two term redexes to contract as they are represented by the same term graph redex.
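The completeness gap can be replayed concretely. In a small dict encoding of term graphs (ours, not the paper's), one graph step with a → b on the shared redex corresponds to two term steps, and the intermediate term f(a, b) never appears among the unravellings:

```python
# The graph g shares one 'a' node between both arguments of f; a single
# graph step with the rule a -> b therefore replaces both term
# occurrences at once (the encoding is ours).

def unravel(nodes, n):
    lab, succs = nodes[n]
    return (lab,) + tuple(unravel(nodes, s) for s in succs)

g = {0: ('f', [1, 1]), 1: ('a', [])}   # unravels to f(a, a)
h = {0: ('f', [1, 1]), 1: ('b', [])}   # after ONE graph step with a -> b

assert unravel(g, 0) == ('f', ('a',), ('a',))
assert unravel(h, 0) == ('f', ('b',), ('b',))
# On terms we need TWO steps, f(a, a) -> f(a, b) -> f(b, b); the
# intermediate term f(a, b) is not the unravelling of any graph in the
# graph reduction.
```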
Note that there are techniques to circumvent this problem by incorporating reduction steps that copy nodes in order to reduce the sharing in a term graph [32]. In this paper, however, we are only concerned with pure term graph rewriting steps derived from rewrite rules.
In the context of weak convergence, soundness also becomes an issue. The underlying reason for this issue is similar to the phenomenon explained above: a single term graph rewrite step may represent several term rewriting steps, i.e. g → R h implies U(g) → + U(R) U(h). When we have a converging term graph reduction (g ι → g ι+1 ) ι<α , we know that the underlying sequence of term graphs (g ι ) ι<α converges. However, the corresponding term reduction does not necessarily produce the sequence (U(g ι )) ι<α but may intersperse it with additional intermediate terms, which might change the convergence behaviour.
A similar phenomenon is known in infinitary lambda calculus [25]: while one can simulate certain term rewriting systems with lambda terms, this simulation may fail for infinitary rewriting since a single term rewriting step may require several β-reduction steps. The problem that arises in this setting is that the intermediate terms that are introduced in the lambda reduction may cause the convergence to break.
The same can, in principle, also occur when simulating a term graph reduction by a term reduction. Since a single term graph rewrite step may require several term rewrite steps, we may introduce intermediate terms into the reduction that do not directly correspond to the term graphs in the graph reduction.

Preservation of Convergence under Unravelling.
Due to the abovementioned difficulties, we restrict ourselves in this paper to the soundness of the modes of convergence alone. By soundness in this setting we mean that, whenever we have a sequence (g ι ) ι<α of term graphs converging to g, the sequence (U(g ι )) ι<α converges to U(g). That is, convergence is preserved under unravelling. Since the metric d ‡ on term graphs generalises the metric d on terms, cf. Corollary 6.15, it does not matter whether we consider the convergence of (U(g ι )) ι<α in the metric space (G ∞ C (Σ), d ‡ ) or (T ∞ (Σ), d), according to Proposition 6.16. The same also holds for the limit inferior in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) and (T ∞ (Σ ⊥ ), ≤ ⊥ ), due to Corollary 5.11 and Proposition 5.19.
The cornerstone of the investigation of the unravelling of term graphs is the following simple characterisation of unravelling in terms of labelled quotient trees:

Proposition 9.1. The unravelling U(g) of a term graph g ∈ G ∞ (Σ) is given by the labelled quotient tree (P(g), g(·), I P(g) ).
Proof. Since I P(g) is a subrelation of ∼ g , we know that (P(g), g(·), I P(g) ) is a labelled quotient tree and thus uniquely determines a term tree t. By Lemma 4.19, there is a homomorphism from t to g. Hence, U(g) = t.
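Proposition 9.1 suggests a direct way to compute unravellings: enumerate the labelled positions of the graph. A hedged Python sketch (names and encoding are ours; since cyclic graphs have infinitely many positions, we bound the position length):

```python
# Sketch: the unravelling is determined by the labelled positions of the
# graph. A graph is a dict node -> (label, successor list); a position is
# a tuple of argument indices.

def pos_labels(g, root, maxlen):
    """Labelled positions {pi: label} of the term graph, for |pi| <= maxlen."""
    out, todo = {}, [((), root)]
    while todo:
        pi, n = todo.pop()
        lab, succs = g[n]
        out[pi] = lab
        if len(pi) < maxlen:
            todo += [(pi + (i,), s) for i, s in enumerate(succs)]
    return out

# The cyclic graph representing a :: a :: a :: ... has, up to length 2,
# exactly the labelled positions of its infinite unravelling:
h0 = {0: ('::', [1, 0]), 1: ('a', [])}
assert pos_labels(h0, 0, 2) == {(): '::', (0,): 'a', (1,): '::',
                                (1, 0): 'a', (1, 1): '::'}
```

Two nodes of the graph may carry many positions each; the unravelling simply forgets which positions belong to the same node, which is exactly what dropping ∼ g down to I P(g) expresses.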
Employing the above characterisation, we can easily see that the relation ≤ R ⊥ is preserved under unravelling:

Proposition 9.2 (preservation of ≤ R ⊥ under unravelling). For all g, h ∈ G ∞ C (Σ ⊥ ), g ≤ R ⊥ h implies U(g) ≤ R ⊥ U(h).

Proof. Immediate consequence of Corollary 5.10 and Proposition 9.1.
Likewise, least upper bounds w.r.t. ≤ R ⊥ are also preserved:

Proposition 9.3 (preservation of lubs under unravelling). Given a directed set G in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), the set {U(g) | g ∈ G } is directed and U(⊔G) = ⊔ g∈G U(g).

Proof. The fact that {U(g) | g ∈ G } is directed in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) follows from Proposition 9.2. The equality follows from the characterisation of the lub in Theorem 5.12 and from Proposition 9.1.
For greatest lower bounds of ≤ R ⊥ , the situation is more complicated as we have to consider arbitrary non-empty sets of term graphs instead of only directed sets.
We start with the characterisation of glbs in the partially ordered set (T ∞ (Σ ⊥ ), ≤ ⊥ ) of terms. Since this partially ordered set forms a complete semilattice, we know that it admits glbs of arbitrary non-empty sets. The following lemma characterises these glbs:

Lemma 9.4 (glb on terms). The glb ⊓T of a non-empty set T in (T ∞ (Σ ⊥ ), ≤ ⊥ ) is given by the labelled quotient tree (P, l, I P ) where

P = {π ∈ ⋂ t∈T P(t) | ∀π ′ < π ∃f ∈ Σ ∀t ∈ T : t(π ′ ) = f }
l(π) = f if t(π) = f ∈ Σ for all t ∈ T, and l(π) = ⊥ otherwise.

Proof. Special case of Proposition 5.9 in [10].
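Lemma 9.4 can be sketched on finite partial terms encoded as nested tuples, with '⊥' as the least element (the encoding and the name glb are our assumptions): a position survives in the glb iff all terms agree on all of its proper prefixes, and it keeps its label iff all terms agree there too:

```python
# Sketch of the term glb on finite partial terms (nested tuples, '⊥' least).
from functools import reduce

def glb(s, t):
    # disagreement (or ⊥ on either side) truncates the result to ⊥ here,
    # which also removes every position strictly below
    if s == '⊥' or t == '⊥' or s[0] != t[0] or len(s) != len(t):
        return '⊥'
    return (s[0],) + tuple(glb(a, b) for a, b in zip(s[1:], t[1:]))

s = ('f', ('g', ('a',)), ('a',))
t = ('f', ('g', ('b',)), ('a',))
assert glb(s, t) == ('f', ('g', '⊥'), ('a',))                  # a vs b -> ⊥
assert reduce(glb, [s, t, ('f', '⊥', ('a',))]) == ('f', '⊥', ('a',))
```

The binary glb is associative and commutative, so reduce computes the glb of any finite non-empty set, matching the lemma's pointwise characterisation.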
By combining the above characterisation with the characterisation of unravelled term graphs, we obtain the following:

Corollary 9.5. Given a non-empty set G in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), the glb ⊓ g∈G U(g) is given by the labelled quotient tree (P, l, I P ) where

P = {π ∈ ⋂ g∈G P(g) | ∀π ′ < π ∃f ∈ Σ ∀g ∈ G : g(π ′ ) = f }
l(π) = f if g(π) = f ∈ Σ for all g ∈ G, and l(π) = ⊥ otherwise.

Proof. Follows immediately from Lemma 9.4 and Proposition 9.1.
Before we deal with the preservation of glbs under unravelling, we need the following property that relates the unravelling of a glb to the original term graphs:

Lemma 9.6 (unravelling of a glb). For each non-empty set G in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), the term t = U(⊓G) satisfies the following for all g ∈ G and π ∈ P(t): (i) π ∈ P(g), and (ii) t(π) = g(π) whenever t(π) ∈ Σ.

Proof. Let g ∈ G, π ∈ P(t), and h = ⊓G. Then π ∈ P(h) and h(π) = t(π) according to Proposition 9.1. Since h ≤ R ⊥ g, we may apply Corollary 5.10 to obtain (i) that π ∈ P(g) and (ii) that t(π) = g(π) whenever t(π) ∈ Σ.
Proposition 9.7 (weak preservation of glbs under unravelling). For each non-empty set G in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), we have U(⊓ g∈G g) ≤ ⊥ ⊓ g∈G U(g).

Proof. Let s = U(⊓ g∈G g) and t = ⊓ g∈G U(g). Since both s and t are terms, we can use the characterisation of ≤ ⊥ instead of ≤ R ⊥ . That is, we will show that for each π ∈ P(s), we have that π ∈ P(t) and that t(π) = s(π) whenever s(π) ∈ Σ.
In general, glbs are not fully preserved under unravelling, as the following example shows:

Example 9.8. Consider the term graphs g and h in Figure 10. The only difference between the two term graphs is the sharing of the arguments of the root node. Due to this difference in sharing, the glb g ⊓ h of the two term graphs is a proper partial term graph, as depicted in Figure 10. On the other hand, since the unravellings of the two term graphs coincide, viz. U(g) = U(h) = h, we have that U(g) ⊓ U(h) = h. In particular, we have the strict inequality U(g ⊓ h) < R ⊥ U(g) ⊓ U(h).

Unfortunately, this also means that the limit inferior is only weakly preserved under unravelling:

Theorem 9.9 (weak preservation of lim inf under unravelling). For every sequence (g ι ) ι<α in (G ∞ C (Σ ⊥ ), ≤ R ⊥ ), we have U(lim inf ι→α g ι ) ≤ ⊥ lim inf ι→α U(g ι ).

Proof. This follows from Proposition 9.7 and Proposition 9.3.
Again, we can construct a counterexample that shows that the converse inequality does not hold in general: Example 9.10. Let (g ι ) ι<ω be the sequence alternating between g and h from Figure 10, i.e. g 2ι = g and g 2ι+1 = h for all ι < ω. Then ⊓ α≤ι<ω g ι = g ⊓ h for each α < ω and, consequently, lim inf ι→ω g ι = g ⊓ h. As we have seen in Example 9.8, g ⊓ h is the proper partial term graph depicted in Figure 10. On the other hand, since U(g) = U(h) = h, we have that lim inf ι→ω U(g ι ) = h. In particular, we have the strict inequality U(lim inf ι→ω g ι ) < R ⊥ lim inf ι→ω U(g ι ).
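The lim inf construction behind this example can be illustrated on terms (the concrete terms s and t below are ours, not the graphs of Figure 10): every tail of the alternating sequence s, t, s, t, ... contains both elements, so each tail-glb equals glb(s, t), and the lim inf, being the lub of these constant tail-glbs, is glb(s, t) itself:

```python
# Illustration of lim inf = lub of tail-glbs on terms encoded as nested
# tuples with '⊥' as least element; glb is the term glb of Lemma 9.4.

def glb(s, t):
    if s == '⊥' or t == '⊥' or s[0] != t[0] or len(s) != len(t):
        return '⊥'
    return (s[0],) + tuple(glb(a, b) for a, b in zip(s[1:], t[1:]))

s = ('f', ('g', ('a',)), ('g', ('a',)))
t = ('f', ('g', ('a',)), ('g', ('b',)))
# Every tail of s, t, s, t, ... contains both s and t, so each tail-glb,
# and hence the lim inf, is glb(s, t):
liminf = glb(s, t)
assert liminf == ('f', ('g', ('a',)), ('g', '⊥'))
```

The alternating sequence has no metric limit (it oscillates between two distinct objects), yet its lim inf exists: this is precisely the extra flexibility of p-convergence over m-convergence.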
Moreover, we cannot expect that any other partial order with properties comparable to those of ≤ R ⊥ fully preserves the limit inferior under unravelling. The example above shows that any partial order ≤ on partial term graphs whose limit inferior is preserved under unravelling must also satisfy either g ≤ h or h ≤ g for the term graphs in Figure 10. That is, such a partial order has to give up the property that total term graphs are maximal, cf. Proposition 5.20. This observation is independent of whether this partial order specialises to ≤ ⊥ on terms.
The sacrifice for full preservation under unravelling goes even further. If a partial order ≤ on partial term graphs satisfies preservation of its limit inferior under unravelling, the limit inferior lim inf ι→ω g ι of the sequence (g ι ) ι<ω from Example 9.10 has to unravel to h, a total term. That is, lim inf ι→ω g ι has to be a total term graph. On the other hand, there is no metric (or any Hausdorff topology, for that matter) for which (g ι ) ι<ω converges at all, because (g ι ) ι<ω alternates between two distinct term graphs. In other words, the correspondence between m- and p-convergence, which we have for ≤ R ⊥ as stated in Theorem 8.10, cannot be satisfied for such a partial order ≤, regardless of the metric on term graphs.
The simple partial order ≤ S ⊥ , which we briefly discussed in comparison to the rigid partial order ≤ R ⊥ in Section 5, takes the other side of the trade-off illustrated above: it satisfies g ≤ S ⊥ h and the preservation of the limit inferior under unravelling but sacrifices the correspondence between total term graphs and maximality as well as the correspondence between m-and p-convergence [10].
9.3. Finite Representations of Transfinite Term Reductions.
One of the motivations for considering modes of convergence on term graphs in the first place is the study of finite representations of transfinite term reductions as finite term graph reductions. Since both the metric d ‡ and the partial order ≤ R ⊥ specialise to the corresponding structures on terms, we can use both the metric space (G ∞ C (Σ), d ‡ ) and the partially ordered set (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) to move seamlessly from terms to term graphs and vice versa.
For instance, Example 8.11 illustrates reductions that perform essentially the same computations, however, at different levels of sharing / parallelism. This includes the complete lack of sharing, i.e. term rewriting. For each of these cases we can use the partially ordered set (G ∞ C (Σ ⊥ ), ≤ R ⊥ ) respectively the metric space (G ∞ C (Σ), d ‡ ) to formalise convergence. What is still missing, however, is a notion of rewriting that unifies term rewriting and term graph rewriting, as well as a compression property that allows us to compress a transfinite term graph reduction that ends in a finite term graph to a term graph reduction of finite length.
Unfortunately, experience from infinitary term rewriting already shows us that a general compression property -allowing any reduction to be compressed to length at most ω -is not possible for weak convergence [24]. However, the more restrictive version of the compression property that we need, viz. that reductions ending in a finite term graph may be compressed to finite length, does hold for weakly m-converging term reductions [28] and there is hope that this carries over to the term graph rewriting setting.

Conclusions & Future Work
With the goal of generalising infinitary term rewriting to term graphs, we have presented two different modes of convergence for an infinitary calculus of term graph rewriting. The success of this generalisation effort was demonstrated by a number of results. Many of the properties of the modes of convergence on terms have been maintained in this transition to term graphs. First and foremost, this includes the intrinsic completeness properties of the underlying structures, i.e. the metric space is still complete and the partially ordered set still forms a complete semilattice. Moreover, we were also able to maintain the correspondence between p-and m-convergence as well as the intuition of the partial order to capture a notion of information preservation.
An important check for the appropriateness of the modes of convergence on term graphs is their relation to the corresponding modes of convergence on terms. For both the partial order and the metric approach, we have that convergence on term graphs is a conservative extension of the convergence on terms. Conversely, convergence on term graphs carries over to convergence on terms via the unravelling mapping. Unfortunately, this preservation of convergence under unravelling is only weak in the case of the partial order setting; cf. Theorem 9.9. However, as we have explained in Section 9.2, this phenomenon is an unavoidable side effect of the generalisation to term graphs unless other important properties are sacrificed. Fortunately, this phenomenon vanishes in the metric setting and we in fact obtain full preservation of limits under unravelling; cf. Theorem 9.11.
As a result, we have obtained two modes of convergence, which allow us to combine both infinitary term rewriting and term graph rewriting within one theoretical framework. Our motivation for this effort is derived from studying lazy evaluation and the correspondence between infinitary term rewriting and finitary term graph rewriting. For both applications, we still require more understanding of the matter, though: for the former, we still lack at least a treatment of higher-order rewriting whereas we are much closer to the latter. We have discussed issues concerning the correspondence between infinitary term rewriting and finitary term graph rewriting in detail in Section 9.3: while the unified modes of convergence are already helpful for studying infinitary rewriting with a varying degree of sharing, we identified two shortcomings that have to be addressed, viz. the lack of a unifying notion of rewriting for terms and term graphs and a compression property for transfinite term graph reductions.
Apart from the abovementioned issues, future work should also be concerned with establishing a stronger correspondence between infinitary term rewriting and infinitary term graph rewriting beyond the preservation of limits under unravellings, which we showed in this paper. Despite the difficulties that we encountered in Section 9.1, we think that obtaining such results is possible. However, a more promising way of approaching this issue is to restrict the notion of convergence to strong convergence as we know it from infinitary term rewriting [27]. Such a stricter notion of convergence takes the location of a reduction step into consideration and, thus, provides a closer correspondence between term graph reductions and their term rewriting counterparts. Indeed, this technique has been applied successfully to convergence on term graphs based on the simple partial order ≤ S ⊥ , which we briefly discussed in comparison to the rigid partial order ≤ R ⊥ in Section 5, and a corresponding metric [10].
Let g j = (N j , suc j , lab j , r j ), j = 1, 2. Since we are dealing with isomorphism classes, we can assume w.l.o.g. that the nodes in g j are of the form n j for j = 1, 2. Let M = N 1 ⊎ N 2 and define the relation ∼ on M as follows: n j ∼ m k iff P g j (n j ) ∩ P g k (m k ) ≠ ∅. ∼ is clearly reflexive and symmetric. Hence, its transitive closure ∼ + is an equivalence relation on M . Now define the term graph g = (N , lab, suc, r) as follows:

N = M/∼ + ,
lab(N ′ ) = f if lab j (n j ) = f ∈ Σ for some n j ∈ N ′ , and lab(N ′ ) = ⊥ otherwise,
suc i (N ′ ) = N ′′ if suc j i (n j ) ∈ N ′′ for some n j ∈ N ′ ,
r = [r 1 ] ∼ + .

Note that, since ⟨⟩ ∈ P g 1 (r 1 ) ∩ P g 2 (r 2 ), we also have r = [r 2 ] ∼ + .
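The quotient construction above can be sketched for finite acyclic term graphs: compute each node's position set, identify nodes with overlapping position sets via a union-find, and build the quotient graph on the equivalence classes. The encoding and all names are ours, and the restriction to finite acyclic graphs is a simplifying assumption:

```python
# Sketch of the quotient construction: nodes of g1 and g2 are identified
# whenever their position sets overlap (the relation ~), and the merged
# graph lives on the ~+-classes (computed with a naive union-find).

def node_positions(g, root):
    """Position sets P_g(n) for each node of a finite acyclic term graph."""
    ps = {n: set() for n in g}
    ps[root].add(())
    changed = True
    while changed:                    # propagate positions until stable
        changed = False
        for n, (lab, succs) in g.items():
            for pi in list(ps[n]):
                for i, s in enumerate(succs):
                    if pi + (i,) not in ps[s]:
                        ps[s].add(pi + (i,))
                        changed = True
    return ps

def merge_graphs(g1, r1, g2, r2):
    p1, p2 = node_positions(g1, r1), node_positions(g2, r2)
    parent = {(1, n): (1, n) for n in g1}
    parent.update({(2, n): (2, n) for n in g2})
    def find(m):
        while parent[m] != m:
            m = parent[m]
        return m
    for n in g1:                      # n1 ~ m2 iff position sets overlap
        for m in g2:
            if p1[n] & p2[m]:
                parent[find((1, n))] = find((2, m))
    q = {}                            # quotient graph on the ~+-classes
    for j, g in ((1, g1), (2, g2)):
        for n, (lab, succs) in g.items():
            q[find((j, n))] = (lab, [find((j, s)) for s in succs])
    return q, find((1, r1))

# f(a, a) with a shared argument node vs. the same term without sharing:
g1 = {0: ('f', [1, 1]), 1: ('a', [])}
g2 = {'r': ('f', ['x', 'y']), 'x': ('a', []), 'y': ('a', [])}
q, root = merge_graphs(g1, 0, g2, 'r')
assert len(q) == 2                    # all three 'a' nodes were identified
```

Here the shared a-node of g1 carries both argument positions, so it overlaps with both a-nodes of g2 and forces all three into one class, yielding the maximally shared merge of the two graphs.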
Before we argue about the well-definedness of g, we need to establish some auxiliary claims:

(1) n j ∼ + m k implies φ j (n j ) = φ k (m k ).
(1′) If lab j (n j ), lab k (m k ) ∈ Σ and φ j (n j ) = φ k (m k ), then n j ∼ m k .

We show (1) by proving that n j ∼ p m k implies φ j (n j ) = φ k (m k ), by induction on p > 0. If p = 1, then n j ∼ m k . Hence, P g j (n j ) ∩ P g k (m k ) ≠ ∅. Additionally, from Lemma 4.10 we obtain both P g j (n j ) ⊆ P g (φ j (n j )) and P g k (m k ) ⊆ P g (φ k (m k )). Consequently, we also have that P g (φ j (n j )) ∩ P g (φ k (m k )) ≠ ∅, i.e. φ j (n j ) = φ k (m k ). If p = q + 1 > 1, then there is some o l ∈ M with n j ∼ o l and o l ∼ q m k . Applying the induction hypothesis immediately yields φ j (n j ) = φ l (o l ) = φ k (m k ). For (1′), let n j , m k ∈ M with lab j (n j ), lab k (m k ) ∈ Σ and φ j (n j ) = φ k (m k ). Since φ j and φ k are rigid ⊥-homomorphisms, we have the following equations: P a g j (n j ) = P a g (φ j (n j )) = P a g (φ k (m k )) = P a g k (m k ). Hence, P g j (n j ) ∩ P g k (m k ) ≠ ∅ and, therefore, n j ∼ m k .
Next we show that lab is well-defined. To this end, let N ∈ N and n j , m k ∈ N such that lab j (n j ) = f 1 ∈ Σ and lab k (m k ) = f 2 ∈ Σ. We need to show that f 1 = f 2 . By (1), we have that φ j (n j ) = φ k (m k ). Since f 1 , f 2 ∈ Σ, we can employ the labelling condition for φ j and φ k in order to obtain that f 1 = lab j (n j ) = lab(φ j (n j )) = lab(φ k (m k )) = lab k (m k ) = f 2 .
To argue that suc is well-defined, we first have to show for all N ∈ N that suc i (N ) is defined iff i < ar(lab(N )). Suppose that suc i (N ) is defined. Then there is some n j ∈ N such that suc j i (n j ) is defined. Hence, i < ar(lab j (n j )). Since then also lab j (n j ) ∈ Σ, we have lab(N ) = lab j (n j ). Therefore, i < ar(lab(N )). If, conversely, there is some i ∈ N with i < ar(lab(N )), then we know that lab(N ) = f ∈ Σ. Hence, there is some n j ∈ N with lab j (n j ) = f . Hence, i < ar(lab j (n j )) and, therefore, suc j i (n j ) is defined. Hence, suc i (N ) is defined.
To finish the argument showing that suc is well-defined, we have to show that, for all N, N 1 , N 2 ∈ N and n j , m k ∈ N such that suc j i (n j ) ∈ N 1 and suc k i (m k ) ∈ N 2 , we indeed have N 1 = N 2 . As n j , m k ∈ N , we have n j ∼ + m k and, therefore, φ j (n j ) = φ k (m k ) according