The Safe Lambda Calculus

Safety is a syntactic condition of higher-order grammars that constrains occurrences of variables in the production rules according to their type-theoretic order. In this paper, we introduce the safe lambda calculus, which is obtained by transposing (and generalizing) the safety condition to the setting of the simply-typed lambda calculus. In contrast to the original definition of safety, our calculus does not constrain types (to be homogeneous). We show that in the safe lambda calculus, there is no need to rename bound variables when performing substitution, as variable capture is guaranteed not to happen. We also propose an adequate notion of beta-reduction that preserves safety. In the same vein as Schwichtenberg's 1976 characterization of the simply-typed lambda calculus, we show that the numeric functions representable in the safe lambda calculus are exactly the multivariate polynomials; thus the conditional is not definable. We also give a characterization of representable word functions. We then study the complexity of deciding beta-eta equality of two safe simply-typed terms and show that this problem is PSPACE-hard. Finally we give a game-semantic analysis of safety: we show that safe terms are denoted by 'P-incrementally justified strategies'. Consequently pointers in the game semantics of safe lambda-terms are only necessary from order 4 onwards.


Introduction
Background. The safety condition was introduced by Knapik, Niwiński and Urzyczyn at FoSSaCS 2002 [19] in a seminal study of the algorithmics of infinite trees generated by higher-order grammars. The idea, however, goes back some twenty years to Damm [10] who introduced an essentially equivalent¹ syntactic restriction (for generators of word languages) in the form of derived types. A higher-order grammar (that is assumed to be homogeneously typed) is said to be safe if it obeys certain syntactic conditions that constrain the occurrences of variables in the production (or rewrite) rules according to their type-theoretic order. Though the formal definition of safety is somewhat intricate, the condition itself is manifestly important. As we survey in the following, higher-order safe grammars capture fundamental structures in computation and offer clear algorithmic advantages:
• Word languages. Damm and Goerdt [11] have shown that the word languages generated by order-n safe grammars form an infinite hierarchy as n varies over the natural numbers. The hierarchy gives an attractive classification of the semi-decidable languages: Levels 0, 1 and 2 of the hierarchy are respectively the regular, context-free, and indexed languages (in the sense of Aho [5]), although little is known about higher orders.
Remarkably, for generating word languages, order-n safe grammars are equivalent to order-n pushdown automata [11], which are in turn equivalent to order-n indexed grammars [24,25].
• Trees. Knapik et al. have shown that the Monadic Second Order (MSO) theories of trees generated by safe (deterministic) grammars of every finite order are decidable.² They have also generalized the equi-expressivity result due to Damm and Goerdt [11] to an equivalence result with respect to generating trees: A ranked tree is generated by an order-n safe grammar if and only if it is generated by an order-n pushdown automaton.
• Graphs. Caucal [9] has shown that the MSO theories of graphs generated³ by safe grammars of every finite order are decidable. Recently Hague et al. have shown that the MSO theories of graphs generated by order-n unsafe grammars are undecidable, but deciding their modal mu-calculus theories is n-EXPTIME complete [17].
Overview. In this paper, we examine the safety condition in the setting of the lambda calculus. Our first task is to transpose it to the lambda calculus and express it as an appropriate sub-system of the simply-typed theory. A first version of the safe lambda calculus has appeared in an unpublished technical report [4]. Here we propose a more general and cleaner version where terms are no longer required to be homogeneously typed (see Section 1 for a definition). The formation rules of the calculus are designed to maintain a simple invariant: Variables that occur free in a safe λ-term have orders no smaller than that of the term itself. We can now explain the sense in which the safe lambda calculus is safe by establishing its salient property: No variable capture can ever occur when substituting a safe term into another. In other words, in the safe lambda calculus, it is safe to use capture-permitting substitution when performing β-reduction.
There is no need for new names when computing β-reductions of safe λ-terms, because one can safely "reuse" variable names in the input term. Safe lambda calculus is thus cheaper to compute in this naïve sense. Intuitively one would expect the safety constraint to lower the expressivity of the simply-typed lambda calculus. Our next contribution is to give a precise measure of the expressivity deficit of the safe lambda calculus. An old result of Schwichtenberg [34] says that the numeric functions representable in the simply-typed lambda calculus are exactly the multivariate polynomials extended with the conditional function. In the same vein, we show that the numeric functions representable in the safe lambda calculus are exactly the multivariate polynomials.

² It has recently been shown [30] that trees generated by unsafe deterministic grammars (of every finite order) also have decidable MSO theories. More precisely, the MSO theory of trees generated by order-n recursion schemes is n-EXPTIME complete.
³ These are precisely the configuration graphs of higher-order pushdown systems.
Our last contribution is to give a game-semantic account of the safe lambda calculus. Using a correspondence result relating the game semantics of a λ-term M to a set of traversals [30] over a certain abstract syntax tree of the η-long form of M (called computation tree), we show that safe terms are denoted by P-incrementally justified strategies. In such a strategy, pointers emanating from the P-moves of a play are uniquely reconstructible from the underlying sequence of moves and the pointers associated to the O-moves therein: Specifically, a P-question always points to the last pending O-question (in the P-view) of a greater order. Consequently pointers in the game semantics of safe λ-terms are only necessary from order 4 onwards. Finally we prove that a β-normal λ-term is safe if and only if its strategy denotation is (innocent and) P-incrementally justified.

The safe lambda calculus
Higher-order safe grammars. We first present the safety restriction as it was originally defined [19]. We consider simple types generated by the grammar A ::= o | A → A. By convention, → associates to the right. Thus every type can be written as A_1 → ⋯ → A_n → o, which we shall abbreviate to (A_1, …, A_n, o) (in case n = 0, we identify (o) with o). We will also use the notation A^n → B, for all types A, B and every natural number n > 0, defined by induction as: A^1 → B = A → B and A^{n+1} → B = A → (A^n → B). The order of a type is given by ord o = 0 and ord(A → B) = max(ord A + 1, ord B). We assume an infinite set of typed variables. The order of a typed term or symbol is defined to be the order of its type. The set of applicative terms over a set of typed symbols is defined as its closure under the application operation (i.e., if M : A → B and N : A are in the closure then so is M N : B).
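The definition of ord can be transcribed directly. The following is a minimal sketch in Python; the names `Ground`, `Arrow`, `order` and `arrows` are illustrative choices and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ground:
    """The ground type o."""


@dataclass(frozen=True)
class Arrow:
    """The function type A -> B."""
    dom: "Type"
    cod: "Type"


Type = Union[Ground, Arrow]


def order(t):
    """ord o = 0 and ord(A -> B) = max(ord A + 1, ord B)."""
    if isinstance(t, Ground):
        return 0
    return max(order(t.dom) + 1, order(t.cod))


def arrows(args, result):
    """Build (A_1, ..., A_n, B), i.e. A_1 -> ... -> A_n -> B (right-associated)."""
    for a in reversed(args):
        result = Arrow(a, result)
    return result


o = Ground()
first_order = arrows([o, o], o)               # (o, o, o)
second_order = arrows([arrows([o], o)], o)    # ((o, o), o)
print(order(o), order(first_order), order(second_order))
```

Under these assumptions the three printed orders are 0, 1 and 2: an argument of order k always forces the order of the whole type to be at least k + 1.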
A (higher-order) grammar is a tuple ⟨Σ, N, R, S⟩, where Σ is a ranked alphabet (in the sense that each symbol f ∈ Σ is assumed to have type o^r → o where r is the arity of f) of terminals; N is a finite set of typed non-terminals; S is a distinguished ground-type symbol of N, called the start symbol; R is a finite set of production (or rewrite) rules, one for each non-terminal F : (A_1, …, A_n, o) ∈ N, of the form F z_1 … z_m → e where each z_i (called parameter) is a variable of type A_i and e is an applicative term of type o generated from the typed symbols in Σ ∪ N ∪ {z_1, …, z_m}. We say that the grammar is order-n just in case the order of the highest-order non-terminal is n.
A higher-order recursion scheme is a higher-order grammar that is deterministic (i.e., for each non-terminal F ∈ N there is exactly one production rule with F on the left-hand side). Higher-order recursion schemes are used as generators of infinite trees. The tree generated by a recursion scheme G is a possibly infinite applicative term, but viewed as a Σ-labelled tree; it is constructed from the terminals in Σ, and is obtained by unfolding the rewrite rules of G ad infinitum, replacing formal by actual parameters each time, starting from the start symbol S. See e.g. [19] for a formal definition.
Example 1.1. Let G be the following order-2 recursion scheme:
S → H a
H z^o → F (g z)
F ϕ^{(o,o)} → ϕ (ϕ (F h))
where the arities of the terminals g, h, a are 2, 1, 0 respectively. The tree generated by G is defined by the infinite term g a (g a (h (h (h ⋯)))).
A type (A_1, …, A_n, o) is said to be homogeneous if ord A_1 ≥ ord A_2 ≥ ⋯ ≥ ord A_n, and each A_1, …, A_n is homogeneous [19]. We reproduce Knapik et al.'s definition [19].
Definition 1.2 (Safe grammar). (All types are assumed to be homogeneous.) A term of order k > 0 is unsafe if it contains an occurrence of a parameter of order strictly less than k, otherwise the term is safe. An occurrence of an unsafe term t as a subexpression of a term t′ is safe if it is in the context ⋯(ts)⋯, otherwise the occurrence is unsafe. A grammar is safe if no unsafe term has an unsafe occurrence at a right-hand side of any production.
The order-2 grammar defined in Example 1.1 is unsafe.
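The homogeneity condition on types lends itself to a short recursive check. The sketch below assumes a flattened encoding chosen for illustration only: a type (A_1, …, A_n, o) is the Python tuple of its argument types, so the ground type o is the empty tuple.

```python
def order(t):
    """Order of a flattened type: ord(A_1, ..., A_n, o) = max_i (ord A_i + 1)."""
    return 0 if not t else 1 + max(order(a) for a in t)


def is_homogeneous(t):
    """Argument orders are non-increasing and every argument is itself homogeneous."""
    orders = [order(a) for a in t]
    non_increasing = all(x >= y for x, y in zip(orders, orders[1:]))
    return non_increasing and all(is_homogeneous(a) for a in t)


o = ()        # ground type
oo = (o,)     # (o, o)

print(is_homogeneous((oo, o)))   # ((o, o), o, o): argument orders 1, 0
print(is_homogeneous((o, oo)))   # (o, (o, o), o): argument orders 0, 1
```

The first type is homogeneous and the second is not, since its argument orders increase from left to right.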
Safety adapted to the lambda calculus. We assume a set Ξ of higher-order constants. We use sequents of the form Γ ⊢^Ξ_$ M : A to represent terms-in-context, where Γ is the context and A is the type of M. For convenience, we shall omit the superscript from ⊢^Ξ_s whenever the set of constants Ξ is clear from the context. The subscript in ⊢^Ξ_$ specifies which type system is used to form the judgement: we use the subscript 'st' to refer to the traditional system of rules of the Church-style simply-typed lambda calculus augmented with constants from Ξ. We will introduce a new subscript for each type system that we define. For simplicity we write (A_1, …, A_n, B) to mean A_1 → ⋯ → A_n → B, where B is not necessarily ground.
Definition 1.4. (i) The safe lambda calculus is a sub-system of the simply-typed lambda calculus. It is defined as the set of judgements of the form Γ ⊢_s M : A that are derivable from the following Church-style system of rules: where ord Γ denotes the set {ord y : y ∈ Γ} and "c ≤ S" means that c is a lower-bound of the set S. The subscripts in ⊢_s and ⊢_asa stand for "safe" and "almost safe application".
(ii) The sub-system that is defined by the same rules in (i), such that all types that occur in them are homogeneous, is called the homogeneous safe lambda calculus.
(iii) We say that a term M is safe if the judgement Γ ⊢ s M : T is derivable in the safe lambda calculus for some context Γ and type T .
The safe lambda calculus deviates from the standard definition of the simply-typed lambda calculus in a number of ways. First the rule (abs) can abstract several variables at once. (Of course this feature alone does not alter expressivity.) Crucially, the side conditions in the application rule and abstraction rule require the variables in the typing context to have orders no smaller than that of the term being formed. We do not impose any constraint on types. In particular, type-homogeneity, which was an assumption of the original definition of safe grammars [19], is not required here. Another difference is that we allow Ξ-constants to have arbitrary higher-order types.
The term M_2 is not safe because in the subterm f (λy^o.x), the free variable x has order 0 which is smaller than ord(λy^o.x) = 1. On the other hand, M_1 is safe.
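The order comparison used for the subterm f (λy^o.x) can be mechanized. The following sketch checks only that one necessary condition (every free variable of a term has order at least the order of the term) and does not implement the typing rules. The term encoding, the flattened type encoding (a type is the tuple of its argument types, o is the empty tuple) and the treatment of f as a variable of type ((o, o), o) are assumptions made for illustration.

```python
from dataclasses import dataclass


def order(t):
    """Order of a flattened type (A_1, ..., A_n, o), encoded as the tuple of the A_i."""
    return 0 if not t else 1 + max(order(a) for a in t)


@dataclass(frozen=True)
class Var:
    name: str
    type: tuple


@dataclass(frozen=True)
class Lam:
    params: tuple   # tuple of Var, abstracted at once
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


def type_of(m):
    """Type of a term that is assumed to be well-typed (no checking is done)."""
    if isinstance(m, Var):
        return m.type
    if isinstance(m, Lam):
        return tuple(p.type for p in m.params) + type_of(m.body)
    return type_of(m.fun)[1:]   # application consumes the first argument type


def free_vars(m):
    if isinstance(m, Var):
        return {m}
    if isinstance(m, Lam):
        return free_vars(m.body) - set(m.params)
    return free_vars(m.fun) | free_vars(m.arg)


def satisfies_invariant(m):
    """True if every free variable of m has order at least ord m."""
    k = order(type_of(m))
    return all(order(x.type) >= k for x in free_vars(m))


o = ()
x, y = Var("x", o), Var("y", o)
f = Var("f", ((o,),))            # assumed type ((o, o), o)
operand = Lam((y,), x)           # λy^o.x, of order 1
print(satisfies_invariant(operand))          # False: ord x = 0 < 1
print(satisfies_invariant(App(f, operand)))  # True for the ground-type application
```

The check fails on the operand λy^o.x even though it succeeds on the enclosing ground-type application, which is why the condition has to be examined at operand positions and not only at the root.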
It is easy to see that valid typing judgements of the safe lambda calculus satisfy the following simple invariant:
Lemma 1.6. If Γ ⊢_s M : A then every variable in Γ occurring free in M has order at least ord M.
Definition 1.7. A term is an almost safe application if it is safe or if it is of the form N_1 … N_m for some m ≥ 1 where N_1 is not an application and for every 1 ≤ i ≤ m, N_i is safe.
A term is almost safe if either it is an almost safe application, or if it is of the form λx_1^{A_1} … x_n^{A_n}.M for n ≥ 1 and some almost safe application M. An almost safe application is not necessarily safe but it can be used to form a safe term by applying sufficiently many safe terms to it. An almost safe term can be turned into a safe term by either applying sufficiently many safe terms (if it is an application), or by abstracting sufficiently many variables (if it is an abstraction).
We have the following immediate lemma: In particular, terms constructed with the rule (app_as) are almost safe applications.
When restricted to the homogeneously-typed sub-system, the safe lambda calculus captures the original notion of safety due to Knapik et al. in the context of higher-order grammars:
Proposition 1.9. Let G = ⟨Σ, N, R, S⟩ be a grammar and let e be an applicative term generated from the symbols in
Proof. We show by induction that (i) z_1, …, z_m ⊢_asa t : A is a valid judgement of the homogeneous safe lambda calculus containing no abstraction if and only if in the Knapik sense, all the occurrences of unsafe subterms of t are safe occurrences.
(ii) z_1, …, z_m ⊢_s t : A is a valid judgement of the homogeneous safe lambda calculus containing no abstraction if and only if in the Knapik sense, all the occurrences of unsafe subterms of t are safe occurrences, and all parameters occurring in t have order greater than ord t. The constant and variable rules are trivial. Application case: By definition, a term t_0 … t_n is Knapik-safe iff for all 0 ≤ i ≤ n, all the occurrences of unsafe subterms of t_i are safe occurrences (in the Knapik sense), and for all 1 ≤ j ≤ n, the operands occurring in t_j have order greater than ord t_j. The (app_as) rule and the induction hypothesis permit us to conclude.
Now since e is an applicative term of ground type, the previous result gives: z_1, …, z_m ⊢_s e : o is a valid judgement of the homogeneous safe lambda calculus iff all the occurrences of unsafe subterms of e are safe occurrences, which by definition of Knapik-safety is in turn equivalent to saying that the rule F z_1 … z_m → e is safe.
In what sense is the safe lambda calculus safe? It is an elementary fact that when performing β-reduction in the lambda calculus, one must use capture-avoiding substitution, which is standardly implemented by renaming bound variables afresh upon each substitution. In the safe lambda calculus, however, variable capture can never happen (as the following lemma shows). Substitution can therefore be implemented simply by capture-permitting replacement, without any need for variable renaming. In the following, we write M{N/x} to denote the capture-permitting substitution⁴ of N for x in M.

only happen if the following two conditions are met: (i) x occurs freely in M_i, (ii) some variable y_i for 1 ≤ i ≤ p occurs freely in N. By Lemma 1.6, (ii) implies ord y_i ≥ ord N = ord x and since x ∉ y, condition (i) implies that x occurs freely in the safe term λy.R; thus by Lemma 1.6 we have ord x ≥ ord λy.R ≥ 1 + ord y_i > ord y_i, which gives a contradiction.

Remark 1.11. A version of the No-variable-capture Lemma also holds in safe grammars, as is implicit in (for example Lemma 3.2 of) the original paper [19].

one should rename the bound variable x to a fresh name to prevent the capture of the free occurrence of x in the underlined term during substitution. Consequently, by the previous lemma, the term is not safe (because ord x = 0 < 1 = ord f x).
Note that λ-terms that 'satisfy' the No-variable-capture Lemma are not necessarily safe. For instance the β-redex in λy^o z^o.(λx^o.y)z can be contracted using capture-permitting substitution, even though the term is not safe.
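To make the distinction concrete, here is a minimal sketch of capture-permitting substitution on untyped term skeletons. The tuple encoding (`("var", x)`, `("lam", x, body)`, `("app", m, n)`) and the function name `subst` are assumptions for illustration. The first example shows a capture that renaming would normally prevent; the second is the redex (λx.y)z from the remark above, where no capture arises.

```python
def subst(m, x, n):
    """Capture-permitting substitution m{n/x}: bound variables are never renamed."""
    tag = m[0]
    if tag == "var":
        return n if m[1] == x else m
    if tag == "app":
        return ("app", subst(m[1], x, n), subst(m[2], x, n))
    # tag == "lam": ("lam", y, body)
    if m[1] == x:
        return m                      # x is rebound here, nothing to replace
    return ("lam", m[1], subst(m[2], x, n))


# (λy.x){y/x}: the substituted free y is captured by the binder λy.
captured = subst(("lam", "y", ("var", "x")), "x", ("var", "y"))
print(captured)

# Contracting (λx.y)z amounts to computing y{z/x}, which is just y.
harmless = subst(("var", "y"), "x", ("var", "z"))
print(harmless)
```

In the first case the result is the identity λy.y, which differs from the intended constant function; the No-variable-capture Lemma states that such a situation cannot arise when both terms are safe.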
Related work: In her thesis [12], de Miranda proposed a different notion of safe lambda calculus. This notion corresponds to (a less general version of) our notion of homogeneous safe lambda calculus. It can be shown that for pure applicative terms (i.e., with no lambda-abstraction) the two systems coincide. In particular a version of Proposition 1.9 also holds in de Miranda's setting [12]. In the presence of lambda-abstraction, however, our system is less restrictive. For instance the term λf^{(o,o,o)} x^o.f x : (o, o) is typable in the homogeneous safe lambda calculus but not in the safe lambda calculus à la de Miranda. One can show that de Miranda's system is in fact equivalent to the homogeneous long-safe lambda calculus (i.e., the restriction of the system of Def. 1.21 to homogeneous types).
Safe beta reduction. From now on we will use the standard notation M[N/x] to denote the substitution of N for x in M. It is understood that, provided that M and N are safe, this substitution is capture-permitting. This is proved by an easy induction on the structure of the safe term M. It is desirable to have an appropriate notion of reduction for our calculus. However the standard β-reduction rule is not adequate. Indeed, safety is not preserved by β-reduction as the following example shows. Suppose that w, z : o and f : (o, o, o) ∈ Σ; then the safe term (λx^o y^o.f x y)z w β-reduces to (λy^o.f z y)w, which is unsafe since the first-order subterm λy^o.f z y contains a free occurrence of the ground-type variable z. However if we perform one more reduction we obtain the safe term f z w. This suggests simultaneous contraction of "consecutive" β-redexes. In order to define this notion of reduction we first introduce the corresponding notion of redex.
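The idea of contracting consecutive redexes in one step can be sketched on untyped term skeletons as follows. This is an illustration only: the tuple encoding and the names `subst` and `contract` are hypothetical, no safety check is performed, and the treatment of the case with more abstracted variables than operands (keeping the remaining abstractions) is an assumption.

```python
def subst(m, env):
    """Simultaneous capture-permitting substitution of env[x] for each variable x."""
    tag = m[0]
    if tag == "var":
        return env.get(m[1], m)
    if tag == "app":
        return ("app", subst(m[1], env), subst(m[2], env))
    # ("lam", params, body): the parameters shadow entries of env
    inner = {x: n for x, n in env.items() if x not in m[1]}
    return ("lam", m[1], subst(m[2], inner))


def contract(params, body, args):
    """Contract (λ params . body) args in a single step."""
    k = min(len(params), len(args))
    result = subst(body, dict(zip(params[:k], args[:k])))
    if len(params) > k:                  # leftover abstractions (assumed behaviour)
        return ("lam", params[k:], result)
    for extra in args[k:]:               # leftover operands are re-applied
        result = ("app", result, extra)
    return result


def var(name):
    return ("var", name)


# (λx y. f x y) z w contracts directly to f z w,
# without passing through the intermediate term (λy. f z y) w.
body = ("app", ("app", var("f"), var("x")), var("y"))
print(contract(("x", "y"), body, [var("z"), var("w")]))
```

Contracting both redexes at once skips the unsafe intermediate term of the example above.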
In the simply-typed lambda calculus a redex is a term of the form (λx.M)N. In the safe lambda calculus, a redex is a succession of several standard redexes:
Definition 1.14. A safe redex is an almost safe application of the form (λx_1^{A_1} … x_n^{A_n}.M)N_1 … N_l for l, n ≥ 1 such that M is an almost safe application. (Consequently each N_i is safe as well as λx_1^{A_1} … x_n^{A_n}.M, and M is either safe or is an application of safe terms.) For instance, in the case n < l, a safe redex has a derivation tree of the following form:
[Derivation tree with conclusion (λx_1^{A_1} … x_n^{A_n}.M)N_1 … N_l : B]
A safe redex is by definition an almost safe term, but it is not necessarily a safe term. For instance the term (λx^o y^o.x)z is a safe redex but it is only an almost safe term. The reason why we call such redexes "safe" is that when they occur within a safe term, it is possible to contract them without breaking the safety of the whole term. Before showing this result, we first need to define how to contract safe redexes:
Definition 1.15 (Redex contraction). We use the abbreviations x = x_1 … x_n, N = N_1 … N_l. The relation β_s (when viewed as a function) is defined on the set of safe redexes as follows:

… x_l ≡ M_2 is almost safe. (ii) Suppose that M_1 is safe. W.l.o.g. we can assume that the last rule used to form M_1 is (app) (and not the weakening rule (wk)), thus the variables of the typing context Γ are precisely the free variables of M_1, and Lemma 1.6 gives us ord A ≤ ord Γ. This allows us to use the rule (abs) to form the safe term-in-context Γ ⊢_s λx

(ii) Suppose that M_1 is safe. If n = l then M_2 ≡ M[N_1 … N_n/x] is safe by the Substitution Lemma; if n < l then we obtain the judgement Γ ⊢_s M_2 : A by applying the rule (app_as) l − n − 1 times on Γ ⊢_s M[N_1 … N_n/x] : C followed by one application of (app).
We can now define a notion of reduction for safe terms.

(i) is a subset of the transitive closure of →_β (→_{β_s} ⊂ ↠_β); (ii) is strongly normalizing; (iii) has the unique normal form property; (iv) has the Church-Rosser property.
Proof. (i) Immediate from the definition: Safe β-reduction is just a multi-step β-reduction.
(ii) This is because →_{β_s} ⊂ ↠_β and →_β is strongly normalizing in the simply-typed λ-calculus. (iii) It is easy to see that a safe term has a beta-redex if and only if it has a safe beta-redex (because a beta-redex can always be "widened" into consecutive beta-redexes of the shape of those in Def. 1.15). Therefore the set of β_s-normal forms is equal to the set of β-normal forms. The uniqueness of the β-normal form then implies the uniqueness of the β_s-normal form. (iv) is a consequence of (i) and (ii).
Eta-long expansion. The η-long normal form (or simply η-long form) of a term is obtained by hereditarily η-expanding the body of every lambda abstraction as well as every subterm occurring in an operand position (i.e., occurring as the second argument of some occurrence of the binary application operator). Formally the η-long form, written ⌈M⌉, of a (type-annotated) term M of type (A_1, …, A_n, o) with n ≥ 0 is defined by cases according to the syntactic shape of M: where m ≥ 0, p ≥ 1, x is either a variable or constant, ϕ = ϕ_1 … ϕ_n and each ϕ_i : A_i is a fresh variable. The binder notation 'λϕ^A' stands for 'λϕ_1^{A_1} … ϕ_n^{A_n}' if n ≥ 1, and for 'λ' (called the dummy lambda) in the case n = 0. The base case of this inductive definition lies in the second clause for m = n = 0: ⌈x⌉ ≡ λ.x.
Remark 1.20. This transformation does not introduce new redexes; therefore the η-long normal form of a β-normal term is also β-normal.
Let us introduce a new typing system: Definition 1.21. We define the set of long-safe terms by induction over the following system of rules: The subscript in ⊢_l stands for "long-safe". This terminology is deliberately suggestive of a forthcoming lemma. Note that long-safe terms are not necessarily in η-long normal form.
Observe that the system of rules from Def. 1.21 is a sub-system of the typing system of Def. 1.4 where the application rule is restricted in the same way as the abstraction rule (i.e., it can perform multiple applications at once provided that all the variables in the context of the resulting term have order greater than the order of the term itself). Thus we clearly have:

Lemma 1.22. If a term is long-safe then it is safe.
In general, long-safety is not preserved by η-expansion.
Proof. Suppose Γ ⊢_l λϕ^τ.M ϕ : A. If M is an abstraction then by construction M is necessarily safe. If M ≡ N_0 … N_p with p ≥ 1 then again, since λϕ^τ.N_0 … N_p ϕ is safe, each of the N_i is safe for 0 ≤ i ≤ p and for every variable z occurring free in λϕ.M ϕ, ord z ≥ ord(λϕ^τ.M ϕ) = ord M. Since ϕ does not occur free in M, the terms M and λϕ^τ.M ϕ have the same set of free variables, thus we can use the application rule to form Γ′ ⊢_l N_0 … N_p : A where Γ′ consists of the typing-assignments for the free variables of M. The weakening rule permits us to conclude Γ ⊢_l M : A.
Proof. First we observe that for every variable or constant x : A we have x : A ⊢_l ⌈x⌉ : A. We show this by induction on ord x. It is verified for every ground type variable x since x = ⌈x⌉.
We now prove the lemma by induction on M. The base case is covered by the previous observation.
Step case: By the previous observation we have ϕ_i : A_i ⊢_l ⌈ϕ_i⌉ : A_i; the weakening rule then gives us Γ, ϕ : A ⊢_l ⌈ϕ_i⌉ : A_i. Since the judgement Γ ⊢_l x N_1 … N_m : A is formed using the (app_l) rule, each N_j must be long-safe for 1 ≤ j ≤ m, thus by the induction hypothesis we have Γ ⊢_l ⌈N_j⌉ : B_j and by weakening we get Γ, ϕ :

There are also closed terms in eta-normal form that are not long-safe but have an η-long normal form that is long-safe. Take for instance the closed βη-normal term

After performing η-long expansion of a term, all the occurrences of the application rule are made long-safe. Thus if a term remains not long-safe after η-long expansion, this means that some variable occurrence is not bound by the first following application of the (abs) rule in the typing tree.

… ϕ_m. By assumption this term is long-safe, therefore we have ord A ≤ ord Γ and for 1 ≤ i ≤ m, ⌈N_i⌉ is also long-safe. By the induction hypothesis this implies that the N_i are all safe. We can then form the judgement Γ ⊢_s x N_1 … N_m : A using the rules (var) and (δ) followed by m − 1 applications of the rule (app_as) and one application of (app) (this is allowed since we have ord A ≤ ord Γ).
By assumption, its η-long n.f. λx^B ϕ^C.⌈N⌉⌈ϕ_1⌉ … ⌈ϕ_m⌉ : A (for some fresh variables ϕ = ϕ_1 … ϕ_m and types C = C_1 … C_m) is long-safe. Thus we have ord A ≤ ord Γ. Furthermore the long-safe subterm ⌈N⌉⌈ϕ_1⌉ … ⌈ϕ_m⌉ is precisely the eta-long normal form of N ϕ_1 … ϕ_m : o; therefore by the induction hypothesis we have that N ϕ_1 … ϕ_m : o is safe. Since the ϕ_i are all safe (by rule (var)), we can "peel off" m applications (performed using the rules (app_as) or (app)) from the sequent Γ, x : B, ϕ : C ⊢_s N ϕ_1 … ϕ_m : o, which gives us the sequent Γ, x : B, ϕ : C ⊢_asa N : A. Since the variables ϕ are fresh for N, we can further peel off applications of the weakening rule to obtain the judgement Γ, x : B ⊢_s N : A.
Finally since we have ord A ≤ ord Γ, we can use the rule (abs) to form the sequent Γ ⊢_s λx^B.N : A.
The type inhabitation problem. It is well known that the simply-typed lambda calculus corresponds to intuitionistic implicative logic via the Curry-Howard isomorphism.
The theorems of the logic correspond to inhabited types, and every inhabitant of a type represents a proof of the corresponding formula. Similarly, we can consider the fragment of intuitionistic implicative logic that corresponds to the safe lambda calculus under the Curry-Howard isomorphism; we call it the safe fragment of intuitionistic implicative logic.
We would like to compare the reasoning power of these two logics, in other words, to determine which types are inhabited in the lambda calculus but not in the safe lambda calculus.⁵ If types are generated from a single atom o, then there is a positive answer: Every type generated from one atom that is inhabited in the lambda calculus is also inhabited in the safe lambda calculus. Indeed, one can transform any unsafe inhabitant M into a safe one of the same type as follows: Compute the eta-long beta normal form of M. Let x be an occurrence of a ground-type variable in a subterm of the form λx.C[x] where λx is the binder of x and for some context C[−] different from the identity (defined as C[R] ≡ R for all R). We replace the subterm λx.C[x] by λx.x in M. This transformation is sound because both C[x] and x are of the same ground type. We repeat this procedure until the term stabilizes. This procedure clearly terminates since the size of the term decreases strictly after each step. The final term obtained is safe and of the same type as M.
This argument cannot be generalized to types generated from multiple atoms. In fact there are order-3 types with only 2 atoms that are inhabited in the simply-typed lambda calculus but not in the safe lambda calculus. Take for instance the order-3 type (((b, a), b), ((a, b), a), a) for some distinct atoms a and b. It is only inhabited by the following family of terms which are all unsafe: where i = 1, 2, …

⁵ This problem was raised to our attention by Ugo dal Lago.

Another example is the type of function composition. For any atom a and natural number n ∈ N, we define the types n_a as follows: 0_a = a and (n + 1)_a = n_a → a. Take three distinct atoms a, b and c. For any i, j, k ∈ N, we write σ(i, j, k) to denote the type (i_a → j_b) → (j_b → k_c) → i_a → k_c. For all i, j, k, this type is inhabited in the lambda calculus by the "function composition term": λxyz.y(x z). This term is safe if and only if i ≥ j (for the subterm x z is safe iff i = ord(i_a) = ord z ≥ ord(x z) = ord(j_b) = j). In the case i < j, the type σ(i, j, k) may still be safely inhabited. For instance σ(1, 3, 4) is inhabited by the safe term The order-4 type σ(0, 2, 0), however, is only inhabited by the unsafe term λxyz.y(x z).
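The order bookkeeping in this example can be replayed mechanically. The sketch below makes two assumptions that should be read as such: types are encoded as pairs (argument types, result atom), and σ(i, j, k) is taken to be (i_a → j_b) → (j_b → k_c) → i_a → k_c, the shape implied by the typing of λxyz.y(x z) with z : i_a and x z : j_b.

```python
def nth(n, atom):
    """n_a, defined by 0_a = a and (n+1)_a = n_a -> a."""
    t = ((), atom)
    for _ in range(n):
        t = ((t,), atom)
    return t


def arrow(a, b):
    """The type a -> b, where b = (args, atom) is kept in flattened form."""
    args, atom = b
    return ((a,) + args, atom)


def order(t):
    args, _ = t
    return 0 if not args else 1 + max(order(a) for a in args)


def sigma(i, j, k):
    """Assumed shape of the composition type: (i_a -> j_b) -> (j_b -> k_c) -> i_a -> k_c."""
    ia, jb, kc = nth(i, "a"), nth(j, "b"), nth(k, "c")
    return arrow(arrow(ia, jb), arrow(arrow(jb, kc), arrow(ia, kc)))


def composition_term_is_safe(i, j):
    """The operand x z is safe iff ord z = i >= j = ord(x z)."""
    return order(nth(i, "a")) >= order(nth(j, "b"))


print(order(sigma(0, 2, 0)))            # order of the type discussed above
print(composition_term_is_safe(0, 2))   # i < j, so the composition term is unsafe
```

With this encoding ord(n_a) = n, and σ(0, 2, 0) comes out as an order-4 type, in agreement with the text.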
Statman showed [35] that the problem of deciding whether a type defined over an infinite number of ground atoms is inhabited (or equivalently of deciding validity of an intuitionistic implicative formula) is PSPACE-complete. The previous observations suggest that the validity problem for the safe fragment of implicative logic may not be PSPACE-hard.
Schwichtenberg [34] showed the following: (Schwichtenberg, 1976). The numeric functions representable by simply-typed lambda-terms of type I → … → I using the Church Numeral encoding are exactly the multivariate polynomials extended with the conditional function.

Expressivity
If we restrict ourselves to safe terms, the representable functions are exactly the multivariate polynomials: .n(mα). These terms are all safe; furthermore function composition can be safely encoded: take a function g : N^n → N represented by a safe term G of type I^n → I and functions f_1, …, f_n : N^p → N represented by safe terms F_1, …, F_n respectively; then the composed function (x_1, …, x_p) ↦ g(f_1(x_1, …, x_p), …, f_n(x_1, …, x_p)) is represented by the safe term λc_1 … c_p.G(F_1 c_1 … c_p) … (F_n c_1 … c_p). Hence any multivariate polynomial P(n_1, …, n_k) can be computed by composing the addition and multiplication terms as appropriate.
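The way polynomials arise by composing addition and multiplication on Church numerals can be tried out with ordinary Python closures. This is only an untyped illustration: Python does not check safety, the helper names are hypothetical, the multiplication term follows the shape n(mα) visible above, and the addition term is the standard Church encoding λxyαz.(xα)(yαz), which is assumed here.

```python
def church(n):
    """The Church numeral for n: s is iterated n times on z."""
    def numeral(s):
        def apply(z):
            for _ in range(n):
                z = s(z)
            return z
        return apply
    return numeral


def unchurch(c):
    """Read a Church numeral back as a Python int."""
    return c(lambda k: k + 1)(0)


# Standard encodings (assumed): addition and multiplication.
add = lambda x: lambda y: lambda a: lambda z: x(a)(y(a)(z))
mul = lambda x: lambda y: lambda a: x(y(a))

# Composition in the style λc_1 c_2. G (F_1 c_1 c_2) (F_2 c_1 c_2):
# the polynomial p(m, n) = m * n + m, with G = add, F_1 = mul, F_2 = first projection.
poly = lambda m: lambda n: add(mul(m)(n))(m)

print(unchurch(poly(church(3))(church(4))))
```

Evaluating the polynomial at (3, 4) yields the numeral for 3 · 4 + 3.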
For the converse, let U be a safe lambda-term of type I → I → I. The generalization to terms of type I^n → I for every n ∈ N is immediate (they correspond to polynomials with n variables). By Lemma 1.27, safety is preserved by η-long normal expansion; therefore we can assume that U is in η-long normal form.
Let N^τ_Σ denote the set of safe η-long β-normal terms of type τ with free variables in Σ, and A^τ_Σ the set of β-normal terms of type τ with free variables in Σ and of the form ϕ s_1 … s_m for some variable ϕ : Observe that the set A^o_Σ contains only safe terms but the sets A^τ_Σ in general may contain unsafe terms. Let Σ denote the alphabet {x, y : I, z : o, α : o → o}. By an easy reasoning (see the term grammar construction of Zaionc [37]), we can derive the following equations inducing a grammar over the set of terminals Σ ∪ {λxyαz., λz.} that generates precisely the terms of N^{(I,I,I)}_∅: The key rule is the fourth one: had we not imposed the safety constraint the right-hand side would instead be of the form Σ ∪ {w : o}. Here the safety constraint requires abstracting all the ground-type variables occurring freely; thus only one free variable of ground type can appear in the term and we can choose it to be named z up to α-conversion.
We extend the notion of representability to terms of type o, (o, o) and I with free variables in Σ as follows: A function f : We now show by induction on the grammar rules that any term generated by the grammar represents some polynomial: Base case: The terms x and y represent the projection functions (m, n) ↦ m and (m, n) ↦ n respectively. The terms α and z represent the constant functions (m, n) ↦ 1 and (m, n) ↦ 0 respectively.
Step case: The first and fourth rules are trivial: for F ∈ A^o_Σ, the terms λz.F and λxyαz.F represent the same function as F. We now consider the second and third rules. We observe that for m, p, p′ ≥ 0 we have represent the functions f and g respectively then by (i), F G represents the function f × g.
Σ represent the functions f and g then by (ii), F G represents the function f + g.
Hence U represents some polynomial: for all m, n ∈ N we have U m n =_β λαz.α^{p(m,n)} z where p(m, n) = Σ_{0≤k≤d} m^{i_k} n^{j_k} for some i_k, j_k ≥ 0, d ≥ 0.
Corollary 2.3. The conditional operator C : I → I → I → I satisfying: is not definable in the simply-typed safe lambda calculus.  [34] to define the conditional operator is unsafe since the underlined subterm, which is of order 1, occurs at an operand position and contains an occurrence of x of order 0.
(i) This corollary tells us that the conditional function is not definable when numbers are represented by the Church Numerals. It may still be possible, however, to represent the conditional function using a different encoding for natural numbers. One way to compensate for the loss of expressivity caused by the safety constraint is to introduce countably many domains of representation for natural numbers. Such a technique is used to represent the predecessor function in the simply-typed lambda calculus [14]. (iii) It is also possible to define a conditional operator behaving like the conditional operator C in the second-order lambda calculus [14]: natural numbers are represented by terms $\overline{n} \equiv \Lambda t.\lambda s^{t \to t} z^{t}.\, s^{n}(z)$ of type $J \equiv \Delta t.(t \to t) \to (t \to t)$ and the conditional is encoded by the term $\lambda F^{J} G^{J} H^{J}.\, F\,J\,(\lambda u^{J}.G)\,H$. Whether this term is safe or not cannot be answered just yet as we do not have a notion of safety for second-order typed terms.

2.2.
Word functions definable in the safe lambda calculus. Schwichtenberg's result on numeric functions definable in the lambda calculus was extended to richer structures: Zaionc studied the problem for word functions, then functions over trees and eventually the general case of functions over free algebras [20,39,38,37,40]. In this section we consider the case of word functions expressible in the safe lambda calculus.
Word functions. We consider a binary alphabet Σ = {a, b}. The result of this section naturally extends to all finite alphabets. We consider the set Σ * of all words over Σ. The empty word is denoted ε. We write |w| to denote the length of the word w ∈ Σ * . For any k ∈ N we write k to denote the word a . . . a with k occurrences of a, so that |k| = k. For any n ≥ 1 and k ≥ 0, we write c(n, k) for the n-ary function (Σ * ) n → Σ * that maps all inputs to the word k. We consider various word functions. Let x, y, z be words over Σ: • Concatenation app : (Σ * ) 2 → Σ * . The word app(x, y) is the concatenation of x and y.
Additional operations can be obtained by combining the above functions [39]: sub(x, b, a)), b, a).
• Occurrence check occ l : Σ * → Σ * of the letter l ∈ Σ (returns 1 if the word contains an occurrence of l and 0 otherwise) is defined by occ l (x) = sq(sub(x, l, ε)).
Representability. We consider equality of terms modulo α, β and η conversion, and we write M = βη N to denote this equality. For every simple type τ , we write Cl(τ ) for the set of closed terms of type τ (modulo α, β and η conversion).
called the binary word type [37]. There is a 1-1 correspondence between words over Σ and closed terms of type B. Think of the first two parameters as concatenators for 'a' and 'b' respectively, and the third parameter as the constructor for the empty word. Thus the empty word ε is represented by . For any word w ∈ Σ * we write w to denote the term representation obtained that way. We say that the word function h : (Σ * ) n → Σ * is represented by a closed term H ∈ Cl(B n → B) just if for all x 1 , . . . , x n ∈ B * , Hx 1 . . . x n = βη hx 1 . . . x n .
Example 2.6. The word functions app, sub, cut a , cut b , sq, sq, occ a , occ b defined above are respectively represented by the following lambda-terms: Zaionc [37] showed that the λ-definable word functions are generated by a finite base in the following sense: Theorem 2.7 (Zaionc [37]). The set of λ-definable word functions is the minimal set containing: (i) the constant functions; (ii) the projections; (iii) concatenation app; (iv) substitution sub; (v) prefix-cut cut a ; and closed by composition.
The terms representing these basic operations are given in Example 2.6. We observe that among them, only APP and SUB are safe; the other terms are all unsafe because they contain terms of the form N (λy.x) where x and y are of the same order. It turns out that APP and SUB constitute a base of terms generating all the functions definable in the safe lambda calculus as the following theorem states: Theorem 2.8. Let λ safe def denote the minimal set containing the following word functions and closed by composition: (i) the projections; (ii) the constant functions; (iii) concatenation app; (iv) substitution sub.
The set of word functions definable in the safe lambda calculus is precisely λ safe def.
The proof follows the same steps as Zaionc's proof. The first direction is immediate: Projections are represented by safe terms of the form λx 1 . . . x n .x i for some i ∈ {1..n}, and constant functions by λx 1 . . . x n .w for some w ∈ Σ * . The terms APP and SUB are safe and represent concatenation and substitution. For closure by composition: take a function g : (Σ * ) n → Σ * represented by safe term G ∈ Cl(B n → B) and functions f 1 , . . . , f n : (Σ * ) p → Σ * represented by safe terms F 1 , . . . F n respectively then the function is represented by the term λc 1 . . . c p .G(F 1 c 1 . . . c p ) . . . (F n c 1 . . . c p ) which is also safe.
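The first direction of the proof can be illustrated with a small Python sketch of the word encoding. The terms of Example 2.6 are not reproduced here, so `APP` and `SUB` below are the usual λ-terms for concatenation and substitution, stated as an assumption; likewise we assume that sub(x, y, z) replaces every a in x by y and every b by z. The helpers `encode`, `decode` and `compose2` are hypothetical names for this example, and the code does not check safety.

```python
def encode(word):
    """Term λu v x. ... of the binary word type representing `word`."""
    def term(u):
        def with_v(v):
            def with_x(x):
                acc = x
                for letter in reversed(word):
                    acc = u(acc) if letter == "a" else v(acc)
                return acc
            return with_x
        return with_v
    return term


def decode(term):
    """Read an encoded word back as a Python string."""
    return term(lambda s: "a" + s)(lambda s: "b" + s)("")


# Assumed term for concatenation: λc d u v x. c u v (d u v x)
APP = lambda c: lambda d: lambda u: lambda v: lambda x: c(u)(v)(d(u)(v)(x))

# Assumed term for substitution: λc e1 e2 u v x. c (λy. e1 u v y) (λy. e2 u v y) x
SUB = (lambda c: lambda e1: lambda e2: lambda u: lambda v: lambda x:
       c(lambda y: e1(u)(v)(y))(lambda y: e2(u)(v)(y))(x))


def compose2(G, F1, F2):
    """Closure by composition for p = 1: λc. G (F1 c) (F2 c)."""
    return lambda c: G(F1(c))(F2(c))


# Example: the function x -> app(x, sub(x, b, a)), which appends to x
# a copy of x with the letters a and b swapped.
swap = lambda c: SUB(c)(encode("b"))(encode("a"))
identity = lambda c: c
double_swap = compose2(APP, identity, swap)

print(decode(APP(encode("ab"))(encode("ba"))))
print(decode(double_swap(encode("aab"))))
```

Projections and constant functions are omitted because they are plain λ-abstractions returning one argument or a fixed encoded word.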
To show the other direction we need to introduce some more definitions. We will write Op(n, k) to denote the set of open terms M typable as follows: Thus we have the following equality (modulo α, β and η conversions) for n, k ≥ 1: We generalize the notion of representability to terms of type τ (n, k) as follows: By extension we will say that an open term M from Op(n, k) represents the pair (f, p) just if M [w 1 . . . w n /c 1 . . . c n ] = βη f (w 1 , . . . , w n )uvx |p(w 1 ,...,wn)| .
We will call safe pair any pair of functions of the form (w, c(n, i)) where 0 ≤ i ≤ k − 1 and w is an n-ary function from λ safe def. Proof. (Soundness). Take a pair (w, c(n, i)) where 0 ≤ i ≤ k − 1 and w is an n-ary function from λ safe def. As observed earlier, all the functions from λ safe def are representable in the safe lambda calculus: Let w be the representative of w. The pair (w, c(n, i)) is then represented by the term λc 1 . . . c n uvx k−1 . . . x 0 .wc 1 . . . c n uvx i .
(Completeness) It suffices to consider safe β-η-long normal terms from Op(n, k) only. The result then follows immediately for every safe term in Cl(τ (n, k)). The subset of Op(n, k) consisting of β-η-long normal terms is generated by the following grammar [37]: The name of each rule is indicated in parentheses. We identify a rule name with the right-hand side of the rule, thus α k i belongs to Op(n, k), β k and γ k are functions from Op(n, k) to Op(n, k), and δ k j is a function from Op(n, k + 1) × Op(n, k + 1) × Op(n, k) to Op(n, k).
We now want to characterize the subset consisting of all safe terms generated by this grammar. The term α k i is always safe; we therefore need to identify the subclass of terms generated by the non-terminal R k which are safe and which do not have any free occurrence of variables in {x 1 . . . x k−1 }. By imposing this requirement on the rules of the previous grammar we obtain the following specialized grammar characterizing the desired subclass: For every term M , Q k (M ) is safe if and only if M can be generated from the non-terminal R k . Thus the subset of Cl(τ (n, k)) consisting of safe beta-normal terms is given by the grammar: To conclude the proof it thus suffices to show that every term generated by this grammar (starting with the non-terminal S) represents a safe pair.
We proceed by induction and show that the non-terminal R k generates terms representing pairs of the form (w, c(n, 0)) while non-terminals S and R k generate terms representing pairs of the form (w, c(n, i)) for 0 ≤ i < k and w ∈λ safe def. Base case: The term α k 0 represents the safe pair (c(n, 0), c(n, 0)) while α k i represents the safe pair (c(n, 0), c(n, i)).
Hence δ k j (E, F, G) represents the pair (w, c(n, i)). The same argument shows that if E, F and G all represent safe pairs then so does δ k j (E, F, G).
Theorem 2.8 is obtained by instantiating Theorem 2.10 with terms of types τ (n, 1) = I n → I: every closed safe term of this type represents some n-ary function from λ safe def.

Representability of functions over other structures.
There is an isomorphism between binary trees and closed terms of type τ = (o → o → o) → o → o. Thus a closed term of type τ → τ → . . . → τ represents an n-ary function over trees. Zaionc gave a characterization of the set of tree functions representable in the simply-typed lambda calculus [38]: It is precisely the minimal set containing constant functions, projections and closed under composition and limited primitive recursion. Zaionc showed that the same characterization holds for the general case of functions expressed over (different) free algebras [39,40] (they are again given by the minimal set containing constant functions, projections and closed under composition and limited primitive recursion). This result subsumes Schwichtenberg's result on definable numeric functions as well as Zaionc's own results on definable word and tree functions.
We have seen that constant functions, projections and composition can be encoded by safe terms. Limited primitive recursion, however, cannot be encoded in the safe lambda calculus (It can be used to define the conditional operator and the cut a word function). We expect an appropriate restriction to limited recursion to characterize the functions over free algebras representable in the safe lambda calculus.

Complexity of the safe lambda calculus
This section is concerned with the complexity of the beta-eta equivalence problem for the safe lambda calculus: Given two safe lambda-terms, are they equivalent up to βη-conversion?

Statman's result.
Let $\exp_h(m)$ denote the tower-of-exponentials function defined by induction as $\exp_0(m) = m$ and $\exp_{h+1}(m) = 2^{\exp_h(m)}$. A program is elementary recursive if its run-time can be bounded by $\exp_K(n)$ for some constant K, where n is the length of the input.
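To make the growth rate concrete, here is a minimal Python sketch of the tower-of-exponentials function as defined above. The function name `exp_tower` is an assumption for this example.

```python
def exp_tower(h, m):
    """Tower of exponentials: exp_0(m) = m and exp_{h+1}(m) = 2 ** exp_h(m)."""
    result = m
    for _ in range(h):
        result = 2 ** result
    return result


# The values for m = 1 grow as 1, 2, 4, 16, 65536.
print([exp_tower(h, 1) for h in range(5)])
```

A bound of the form exp_K(n) fixes the height K of the tower once and for all; a non-elementary problem admits no such fixed height.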
We recall the definition of finite type theory. We define D 0 = {true, false} and D k+1 = P(D k ) (i.e., the powerset of D k ). For k ≥ 0, we write x k , y k and z k to denote variables ranging over D k . Prime formulae are x 0 , true ∈ y 1 , false ∈ y 1 , and x k ∈ y k+1 . Formulae are built up from prime formulae using the logical connectives ∧,∨,→,¬ and the quantifiers ∀ and ∃. Meyer showed that deciding the validity of such formulae requires nonelementary time [26].
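The hierarchy of domains can be sketched in Python as follows. This is only an illustration of the definition above, with hypothetical helper names `powerset` and `domain`; sets are modelled by `frozenset` so that they can themselves be elements of sets.

```python
from itertools import combinations


def powerset(elements):
    """All subsets of `elements`, each as a frozenset."""
    items = list(elements)
    return [frozenset(subset)
            for size in range(len(items) + 1)
            for subset in combinations(items, size)]


def domain(k):
    """D_0 = {true, false} and D_{k+1} = P(D_k)."""
    level = [True, False]
    for _ in range(k):
        level = powerset(level)
    return level


# Each level is exponentially larger than the previous one.
print([len(domain(k)) for k in range(3)])

# A prime formula "x^0 in y^1" is plain set membership.
y1 = frozenset({True})
print(True in y1, False in y1)
```

The sizes grow as a tower of exponentials in k, which is the source of the non-elementary lower bound for deciding validity.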
A famous result by Statman states that deciding the βη-equality of two first-order typable lambda-terms is not elementary recursive [36]. The proof proceeds by encoding the Henkin quantifier elimination of type theory in the simply-typed lambda calculus and by appealing to Meyer's result [26]. Simpler proofs have subsequently been given: one by Mairson [23] and another by Loader [22]. Both proceed by encoding the Henkin quantifier elimination procedure in the lambda calculus, as in the original proof, but their use of list iteration to implement quantifier elimination makes them much easier to understand.
It turns out that all these encodings rely on unsafe terms: Statman's encoding uses the conditional function sg which is not definable in the safe lambda calculus [8]; Mairson's encoding uses unsafe terms to encode both quantifier elimination and set membership, and Loader's encoding uses unsafe terms to build list iterators. We are thus led to conjecture that finite type theory (see definition in Sec. 3.2) is intrinsically unsafe in the sense that every encoding of it in the lambda calculus is necessarily unsafe. Of course this conjecture does not rule out the possibility that another non-elementary problem is encodable in the safe lambda calculus.

3.2.
Mairson's encoding. We refer the reader to Mairson's original paper [23] for a detailed account of his encoding. We show here why Mairson's encoding does not work in the safe lambda calculus. We then introduce a variation that eliminates some of the unsafety. Although the resulting encoding does not suffice to interpret type theory in the safe lambda calculus, it enables another interesting encoding: that of the True Quantified Boolean Formula (TQBF) problem. This implies that deciding beta-eta equality of safe terms is PSPACE-hard.
3.2.1. Sources of unsafety. In Mairson's encoding, boolean values are encoded by terms of type B = σ → σ → σ for some type σ, and variables of order k ≥ 0 are encoded by terms of type ∆ k defined as ∆ 0 ≡ B and ∆ k+1 ≡ (∆ k → τ → τ ) → τ → τ for any type τ . Using this encoding, unsafety manifests itself in three different places: (i) Set membership: The prime formula "x k ∈ y k+1 " is encoded by a term-in-context of the form for some term F and term M (x, z) containing free occurrences of x and z. This is unsafe because the free occurrence of x in M (x, z) is not abstracted together with z. (ii) Quantifier elimination is implemented using a list iterator D k+1 of type ∆ k+2 which acts like the foldr function (from functional programming) over the list of all elements of D k . Thus nested quantifiers in the formula are encoded by nested list iterations. This can be a source of unsafety: for instance the formula "∀x 0 .∃y 0 .x 0 ∨ y 0 " is encoded as ⊢ st D 0 (λx ∆ 0 .AN D(D 0 (λy ∆ 0 .OR(x ∨ y))F )) T : B for some terms AN D, OR, F and T and where the type τ is instantiated as B. This term is unsafe due to the underlined occurrence which is unsafely bound.
More generally, nested binding will be encoded safely if and only if every variable x in the formula is bound by the first quantifier ∃z or ∀z satisfying ord z ≥ ord x in the path to the root of the formula AST. So for example if set-membership were safely encodable then the interpretation of "∀x k .∃y k+1 .x k ∈ y k+1 " would be unsafe whereas that of "∀y k+1 .∃x k .x k ∈ y k+1 " would be safe. (iii) Elements of the type hierarchy. The base set D 0 of booleans is represented by a safe term D 0 of type ∆ 0 . Higher-order sets D k for k ≥ 1 are represented by unsafe terms D k : they are constructed from D 0 using a powerset construction that is unsafe.
The second source of unsafety can be easily overcome. The idea is as follows. We introduce multiple domains of representation for a given formula. An element of D k is thereby represented by countably many terms of type ∆ n k where n ∈ N indicates the level of the domain of representation. The type ∆ n k is defined in such a way that its order strictly increases as n grows. Furthermore, there exists a term that can lower the domain of representation of a given term. Thus each formula variable can have a different domain of representation, and since there are infinitely many such domains, it is always possible to find an assignment of representation domains to variables such that the resulting encoding term is safe.
There is no obvious way to eliminate unsafety in the two other cases, however. For instance in the case of set-membership, Mairson's encoding (3.1) could be made safe by appealing to a term that changes the domain of representation of an encoded higher-order value of the type-hierarchy. Unfortunately, such a transformation is intrinsically unsafe! In the following paragraphs we present in detail a variation of Mairson's encoding in which quantifier elimination is safely encoded.

Encoding basic boolean operations.
Let o be a base type and define the family of types σ 0 ≡ o, σ n+1 ≡ σ n → σ n satisfying ord σ n = n. Booleans are encoded over domains B n ≡ σ n → o → o → o for n ≥ 0, each type B n being of order n+1. We write i n+1 to denote the term λx σn .x of type σ n+1 for n ≥ 0. The truth values true and false are represented by the following terms parameterized by n ∈ N: Clearly these terms are safe. Moreover the following relations hold for all n, n ′ ≥ 0: It is then possible to change the domain of representation of a Boolean value from a higher level to another arbitrary level using the conversion term: so that if a term M of type B n , for n ≥ 1, is beta-eta convertible to T n (resp. F n ) then C n →n ′ 0 M of type B n ′ is beta-eta convertible to T n ′ (resp. F n ′ ).
Observe that although C n+1 →n ′ 0 is safe for all n, n ′ ≥ 0, if we apply it to a variable x then the resulting term-in-context is safe if and only if ord B n+1 ≥ ord B n ′ , that is to say if and only if the transformation decreases the domain of representation of x.
Boolean functions are encoded by the following closed safe terms parameterized by n:

3.2.3.
Coding elements of the type hierarchy. For every n ∈ N we define the hierarchy of type ∆ n k as follows: ∆ n 0 ≡ B n and ∆ n k+1 ≡ ∆ n k * where for a given type α, α * = (α → τ → τ ) → τ → τ for any type τ . We encode an occurrence x k of a formula variable by a term variable x k of type ∆ n k for some level of domain representation n ∈ N. Following Mairson's encoding, each set D k is represented by a list D n k consisting of all its elements: where powerset α ≡ λA * (α→α * * →α * * )→α * * →α * * .
(In the definition of D n k+1 , to see why it is possible to apply powerset ∆ n k and D n k one needs to understand that the term D n k is of type ∆ n k+1 polymorphic in τ . The application can thus be typed by taking τ ≡ ∆ n k+2 in the term D n k .) Observe that the term double is unsafe because the underlined variable occurrence x is not bound together with c ′ . Consequently for all n ≥ 0, D n 0 is safe and D n k is unsafe for all k > 0.

Quantifier elimination.
Terms of type $\Delta^n_{k+1}$ are now used as iterators over lists of elements of type $\Delta^n_k$ and we set $\tau \equiv B_n$ in the type $\Delta^n_{k+1}$ in order to iterate a level-n Boolean function. Since $\mathrm{ord}\,\Delta^n_k \ge \mathrm{ord}\,B_n$ for all n, all the instantiations of the terms $D^n_k$ will be safe (although the terms $D^n_k$ themselves are not safe for $k \ge 1$). Following [23], quantifier elimination interprets the formula $\forall x^k.\Phi(x^k)$ as the iterated conjunction $C^{n \to 0}_0\,(D^n_k\,(\lambda x^{\Delta^n_k}.\,AND^n\,(\hat{\Phi}\,x))\,T^n)$ where $\hat{\Phi}$ is the interpretation of Φ and n is the representation level chosen for the variable $x^k$. Similarly we interpret $\exists x^k.\Phi(x^k)$ by the iterated disjunction $C^{n \to 0}_0\,(D^n_k\,(\lambda x^{\Delta^n_k}.\,OR^n\,(\hat{\Phi}\,x))\,F^n)$.
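Quantifier elimination by list iteration can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it uses plain untyped Church booleans rather than the level-indexed domains B n, it omits the conversion terms, and it only treats the base domain D 0. The names `T`, `F`, `AND`, `OR`, `NOT`, `D0`, `forall`, `exists` and `to_bool` are introduced for this example and are not the terms of the paper.

```python
# Church booleans and the usual connectives (untyped simplification).
T = lambda x: lambda y: x
F = lambda x: lambda y: y
AND = lambda p: lambda q: p(q)(F)
OR = lambda p: lambda q: p(T)(q)
NOT = lambda p: p(F)(T)

# The list of all elements of D_0, represented by its own iterator:
# D0 c n = c T (c F n), i.e. a right fold over the list [T, F].
D0 = lambda c: lambda n: c(T)(c(F)(n))


def forall(phi):
    """Iterated conjunction of phi over D_0, starting from T."""
    return D0(lambda x: AND(phi(x)))(T)


def exists(phi):
    """Iterated disjunction of phi over D_0, starting from F."""
    return D0(lambda x: OR(phi(x)))(F)


def to_bool(b):
    """Read a Church boolean back as a Python bool."""
    return b(True)(False)


# "forall x. exists y. x or y" is valid; "forall x. forall y. x or y" is not.
print(to_bool(forall(lambda x: exists(lambda y: OR(x)(y)))))
print(to_bool(forall(lambda x: forall(lambda y: OR(x)(y)))))
```

Nested quantifiers become nested iterations, which is exactly where the binding of the outer variable inside the inner λ-abstraction must be checked for safety in the typed setting.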

3.2.5.
Encoding the formula. Given a formula of type theory, it is possible to encode it in the lambda calculus by inductively applying the above encodings of boolean operations and quantifiers on the formula; each variable occurrence in the formula being assigned some domain of representation. We now show that there exists an assignment of representation domains for each variable occurrence such that the resulting term is safe. Let x kp p . . . x k 1 1 for p ≥ 1 be the list of variables appearing in the formula, given in order of appearance of their binder in the formula (i.e., x kp p is bound by the leftmost binder). We fix the domain of representation of each variable as follows. The right-most variable x k 1 1 is encoded in the domain ∆ 0 k 1 ; and if for 1 ≤ i < p the domain of representation of where l ′ is the smallest natural number such that ord ∆ l ′ k i+1 is strictly greater than ord ∆ l k i . This way, since variables that are bound first have higher order, variables that are bound in nested list-iterations (corresponding to nested quantifiers in the formula) are guaranteed to be safely bound.
Example 3.1. The formula ∀x 0 .∃y 0 .x 0 ∨y 0 , which is encoded by an unsafe term in Mairson's encoding, is represented in our encoding by the safe term

Set-membership.
To complete the interpretation of prime formulae, we need to show how to encode set membership. Unfortunately, the introduction of multiple domains of representation does not permit us to completely eliminate the unsafety of Mairson's encoding of set membership. Indeed, adapting Mairson's encoding of set membership requires the ability to perform conversion of domains of representation for higher-order sets (not only for Boolean values). The conversion term C n+1 →n ′ 0 can be generalized to higher-order sets as follows: where k ≥ 0. Unfortunately this term is safe if and only if n = n ′ (the largest underlined subterm is safe just when n ≥ n ′ and the other underlined subterm is safe just when n ′ ≥ n).
Hence at higher-orders, all the non-trivial conversion terms are unsafe.
If the terms C n →n ′ k+1 , k ≥ 0, n ≠ n ′ were safely representable then the encoding would go as follows: We set τ ≡ B 0 in the types ∆ n k+1 for all n, k ≥ 0 in order to iterate a level-0 Boolean function. Firstly, the formulae "true ∈ y 1 " and "false ∈ y 1 " can be encoded by the safe terms y 1 (λx 0 .OR 0 x 0 )F 0 and y 1 (λx 0 .OR 0 (N OT 0 x 0 ))F 0 respectively. For the general case "x k ∈ y k+1 " we proceed as in Mairson's proof [23]: we introduce lambda-terms encoding set equality, set membership and subset tests, and we further parameterize these encodings by a natural number n.
x (λx ∆ n k .AN D 0 (member n k+1 x y)) T 0 : ∆ n k+1 → ∆ n k+1 → B 0 eq n 0 ≡ λx Bn .λy Bn .C n →0 0 (OR n (AN D n x y)(AN D n (N OT n x)(N OT n y))) : B n → B n → B 0 eq n k+1 ≡ λx ∆ n k+1 y ∆ n k+1 .(λop ∆ n k+1 →∆ n k+1 →B 0 .AN D 0 (op x y)(op y x)) subset n k+1 : ∆ n k+1 → ∆ n k+1 → B 0 . The variables in the definitions of eq n k+1 and subset n k+1 are safely bound. Moreover, the occurrence of x in member n+1 k+1 is now safely bound (which was not the case in Mairson's original encoding) thanks to the fact that the representation domain of z is lower than that of x. The formula x k ∈ y k+1 can then be encoded as x) (C n ′ →u k+1 y) : B 0 for some n, n ′ ≥ 2 and u = min(n, n ′ ) + 1.
Unfortunately this encoding is not completely safe because, as mentioned before, the conversion term C n →u k is unsafe for k ≥ 1, n ≠ u. We conjecture that the set-membership function is intrinsically unsafe.
3.3. PSPACE-hardness. We observe that instances of the True Quantified Boolean Formulae satisfaction problem (TQBF) are special instances of the decision problem for finite type theory. These instances correspond to formulae in which set membership is not allowed and variables are all taken from the base domain D 0 . As we have shown in the previous section, such restricted formulae can be safely encoded in the safe lambda calculus. Therefore since TQBF is PSPACE-complete we have: Deciding βη-equality of two safe lambda-terms is PSPACE-hard. Example 3.3. Using the encoding where τ is set to B 0 in the types ∆ n k for all k, n ≥ 0, the formula ∀x∃y∃z(x ∨ y ∨ z) ∧ (¬x ∨ ¬y ∨ ¬z) is represented by the safe term: Remark 3.4. The Boolean satisfaction problem (SAT) is just a particular instance of TQBF where formulae are restricted to use only existential quantifiers; thus deciding βη-equality in the safe lambda calculus is also NP-hard. Asperti gave an interpretation of SAT in the simply-typed lambda calculus but his encoding relies on unsafe terms [6].
Remark 3.5. (i) Because the safety condition restricts expressivity in a non-trivial way, one can reasonably expect the beta-eta equivalence problem to have a lower complexity in the safe case than in the normal case; this intuition is strengthened by our failed attempt to encode type theory in the safe lambda calculus. No upper bound is known at present. On the other hand our PSPACE-hardness result is probably a coarse lower bound; it would be interesting to know whether we also have EXPTIME-hardness. (ii) Statman showed [36] that when restricted to some finite set of types, the beta-eta equivalence problem is PSPACE-hard. Such a result is unlikely to hold in the safe lambda calculus. This is suggested by the fact that we had to use the entire type hierarchy to encode TQBF in the safe lambda calculus. In fact we expect the beta-eta equivalence problem for safe terms to have a complexity lower than PSPACE when restricted to any finite set of types. (iii) The normalization problem ("Given a (safe) term M , what is its β-normal form?") is non-elementary. Indeed, let $\tau_{-2} \equiv o$ and for $n \ge -1$, $\tau_n \equiv \tau_{n-1} \to \tau_{n-1}$. For k, n ∈ N, let $\overline{k}_n$ denote the k-th Church Numeral $\lambda s^{\tau_{n-1}} z^{\tau_{n-2}}.\, s(\cdots(s(s\,z))\cdots)$ (with k applications of s) of type $\tau_n$. Then for n ≥ 1, the safe term $\overline{2}_{n-1}\,\overline{2}_{n-2} \cdots \overline{2}_0$ of type $\tau_0$ has size O(n) and its normal form $\overline{\exp_n(1)}_0$ has size $O(\exp_n(1))$.
Thus in the simply-typed lambda calculus, beta-eta equivalence is essentially as hard as normalization. We do not know if this is the case in the safe lambda calculus. (iv) A related problem is that of beta-reduction: "Given a β-normal term M 1 and a term M 2 , does M 2 β-reduce to M 1 ?". It is known to be PSPACE-complete when restricted to order-3 terms [33], but no complexity result is known for higher orders. The safe case can potentially give rise to interesting complexity characterizations at higher-orders.
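The size blow-up in item (iii) can be observed with a short Python sketch. Python is untyped, so the type indices of the numerals are erased and a single term `two` plays the role of every numeral 2 at every level; the names `two`, `tower` and `to_int` are assumptions made for this example.

```python
# Church numeral 2 with its type index erased.
two = lambda s: lambda z: s(s(z))


def tower(n):
    """The application 2 2 ... 2 of n copies of the numeral 2, associated to the left."""
    term = two
    for _ in range(n - 1):
        term = term(two)
    return term


def to_int(c):
    """Read a Church numeral back as a Python integer."""
    return c(lambda k: k + 1)(0)


# A term built from n numerals evaluates to a tower of exponentials of height n.
print([to_int(tower(n)) for n in range(1, 5)])
```

The term has n components while its value, and hence the size of its normal form, is a tower of n twos, which is why normalization cannot be elementary.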

A game-semantic account of safety
Our aim is to characterize safety by game semantics. We shall assume that the reader is familiar with the basics of game semantics; for an introduction, we recommend Abramsky and McCusker's tutorial [3]. Recall that a justified sequence over an arena is an alternating sequence of O-moves and P-moves such that every move m, except the opening move, has a pointer to some earlier occurrence of the move m 0 such that m 0 enables m in the arena. A play is just a justified sequence that satisfies Visibility and Well-Bracketing. A basic result in game semantics is that λ-terms are denoted by innocent strategies, which are strategies that depend only on the P-view of a play. The main result (Theorem 4.11) of this section is that if a λ-term is safe, then its game semantics (is an innocent strategy that) is, what we call, P-incrementally justified. In such a strategy, pointers emanating from the P-moves of a play are uniquely reconstructible from the underlying sequence of moves and pointers from the O-moves therein: Specifically a P-question always points to the last pending O-question (in the P-view) of a greater order.
The proof of Theorem 4.11 depends on a Correspondence Theorem (see the Appendix) that relates the strategy denotation of a λ-term M to the set of traversals over a souped-up abstract syntax tree of the η-long form of M . In the language of game semantics, traversals are just (concrete representations of) the uncovering (in the sense of Hyland and Ong [18]) of plays in the strategy denotation.
The useful transference technique between plays and traversals was originally introduced by the second author [30] for studying the decidability of monadic second-order theories of infinite structures generated by higher-order grammars (in which the Σ-constants or terminal symbols are at most order 1, and uninterpreted). In the Appendix, we present an extension of this framework to the general case of the simply-typed lambda calculus with free variables of any order. A new traversal rule is introduced to handle nodes labelled with free variables. Also new nodes are added to the computation tree to account for the answer moves of the game semantics, thus enabling the framework to model languages with interpreted constants such as PCF (by adding traversal rules to handle constant nodes).
Incrementally-bound computation tree. In the context of higher-order grammars, the computation tree is defined as the unravelling of the finite graph representing the long transform of the grammar [30]. Similarly we define the computation tree of a λ-term as an abstract syntax tree of its η-long normal form. We write l t 1 , . . . , t n with n ≥ 0 to denote the ordered tree with a root labelled l with n child-subtrees t 1 , . . . , t n . In the following we consider arbitrary simply-typed terms.  Its η-long normal form is: Its computation tree is: Its η-long normal form is: Its computation tree is: Even-level nodes are λ-nodes (the root is on level 0). A single λ-node can represent several consecutive variable abstractions or it can just be a dummy lambda if the corresponding subterm is of ground type. Odd-level nodes are variable or application nodes.
The order of a node n, written ord n, is defined as follows: @-nodes have order 0. The order of a variable-node is the type-order of the variable labelling it. The order of the root node is the type-order of (A 1 , . . . , A p , T ) where A 1 , . . . , A p are the types of the variables in the context Γ. Finally, the order of a lambda node different from the root is the type-order of the term represented by the sub-tree rooted at that node.
We say that a variable node n labelled x is bound by a node m, and m is called the binder of n, if m is the closest node in the path from n to the root such that m is labelled λξ with x ∈ ξ.
We introduce a class of computation trees in which the binder node is uniquely determined by the nodes' orders: Definition 4.4. A computation tree is incrementally-bound if for every variable node x, either x is bound by the first λ-node in the path to the root with order > ord x, or x is a free variable and all the λ-nodes in the path to the root except the root have order ≤ ord x. In the safe lambda calculus, the variables in the context with the lowest order must be all abstracted at once when using the abstraction rule. Since the computation tree merges consecutive abstractions into a single node, any variable x occurring free in the subtree rooted at a node λξ different from the root must have order greater than or equal to ord λξ. Conversely, if a lambda node λξ binds a variable node x then ord λξ = 1 + max z∈ξ ord z > ord x.
Let x be a bound variable node. Its binder occurs in the path from x to the root, therefore, according to the previous observation, x must be bound by the first λ-node occurring in this path with order > ord x. Let x be a free variable node then x is not bound by any of the λ-nodes occurring in the path to the root. Once again, by the previous observation, all these λ-nodes except the root have order at most ord x. Hence τ is incrementally-bound.
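Definition 4.4 can be checked mechanically on a finite computation tree. The Python sketch below is an illustration under stated assumptions: the `Node` representation, the function `incrementally_bound` and the hand-built trees are hypothetical, and node orders are supplied by hand rather than computed from types. The two example trees encode the terms λϕ.ϕ(λx.ϕ(λy.y)) and λϕ.ϕ(λx.ϕ(λy.x)) with ϕ : ((o, o), o).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                  # "lam", "var" or "app"
    order: int                 # order of the node, supplied by hand
    names: tuple = ()          # variables abstracted by a "lam" node, or the label of a "var" node
    children: list = field(default_factory=list)


def incrementally_bound(root):
    """Check Definition 4.4 on the tree rooted at `root`."""
    def visit(node, lambdas):
        # `lambdas` lists the lambda-nodes on the path from the root to `node`.
        if node.kind == "var":
            label = node.names[0]
            closest_first = list(reversed(lambdas))
            binder = next((m for m in closest_first if label in m.names), None)
            if binder is not None:
                # Bound variable: the binder must be the first lambda-node
                # on the path to the root with order > ord x.
                first_higher = next(
                    (m for m in closest_first if m.order > node.order), None)
                if first_higher is not binder:
                    return False
            elif any(m.order > node.order for m in closest_first if m is not root):
                # Free variable: no lambda-node other than the root may exceed ord x.
                return False
        below = lambdas + [node] if node.kind == "lam" else lambdas
        return all(visit(child, below) for child in node.children)

    return visit(root, [])


def nested_tree(innermost_variable):
    """Tree of  λϕ.ϕ(λx.ϕ(λy.v))  where v is "x" or "y"."""
    leaf = Node("var", 0, (innermost_variable,))
    lam_y = Node("lam", 1, ("y",), [leaf])
    inner_phi = Node("var", 2, ("phi",), [lam_y])
    lam_x = Node("lam", 1, ("x",), [inner_phi])
    outer_phi = Node("var", 2, ("phi",), [lam_x])
    return Node("lam", 3, ("phi",), [outer_phi])


print(incrementally_bound(nested_tree("y")))
print(incrementally_bound(nested_tree("x")))
```

In the second tree the innermost occurrence of x is bound by λx although the closer node λy already has order greater than ord x, so the check fails.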
(ii) Let M be a closed term such that τ (M ) is incrementally-bound. W.l.o.g. we can assume that M is in η-long form. We prove that M is safe by induction on its structure. The base case M ≡ λξ.x for some variable x is trivial.
Step case: Suppose M ≡ λξ.N 1 . . . N p . Let i range over 1..p. We have N i ≡ λη i .N ′ i for some non-abstraction term N ′ i . By the induction hypothesis, λξ.N i = λξη i .N ′ i is a safe closed term, and consequently N ′ i is necessarily safe. Let z be a free variable of N ′ i not bound by λη i in N i . Since τ (M ) is incrementally-bound we have ord z ≥ ord λη i = ord N i , thus we can abstract the variables η i using (abs) which shows that N i is safe. Finally we conclude ⊢ s M = λξ.N 1 . . . N p : T using the rules (app) and (abs).
The assumption that M is closed is necessary. For instance for x, y : o, the computation trees τ (λxy.x) and τ (λy.x) are both incrementally-bound but λxy.x is safe and λy.x is not. Definition 4.6. A strategy σ is said to be P-incrementally justified if for every play s q ∈ σ where q is a P-question, q points to the last unanswered O-question in s with order strictly greater than ord q.
Note that although the pointer is determined by the P-view, the choice of the move itself can be based on the whole history of the play. Thus P-incremental justification does not imply innocence.
The definition suggests an algorithm that, given a play of a P-incrementally justified denotation, uniquely recovers the pointers from the underlying sequence of moves and from the pointers associated to the O-moves therein. Hence: Lemma 4.7. In P-incrementally justified strategies, pointers emanating from P-moves are superfluous. The Correspondence Theorem 6.10 gives us the following equivalence:   Example 4.13. If justification pointers are omitted then the denotations of the two Kierstead terms from Example 1.5 are not distinguishable. In the safe lambda calculus this ambiguity disappears since M 1 is safe whereas M 2 is not.
In fact, as the last example highlights, pointers are superfluous at order 3 for safe terms whether from P-moves or O-moves. This is because for question moves in the first two levels of an arena (initial moves being at level 0), the associated pointers are uniquely recoverable thanks to the visibility condition. At the third level, the question moves are all P-moves therefore their associated pointers are uniquely recoverable by Pincremental justification. This is not true anymore at order 4: Take the safe term-in-context ψ : (((o 4 , o 3 ), o 2 ), o 1 ) ⊢ s ψ(λϕ (o,o) .ϕa) : o 0 for some constant a : o. Its strategy denotation contains plays whose underlying sequence of moves is q 0 q 1 q 2 q 3 q 2 q 3 q 4 . Since q 4 is an Omove, it is not constrained by P-incremental justification and thus it can point to any of the two occurrences of q 3 . 7 7 More generally, a P-incrementally justified strategy can contain plays that are not "O-incrementally justified" since it must take into account any possible strategy incarnating its context, including those that are not P-incrementally justified. For instance in the given example, there is one version of the play that is Towards a fully abstract game model. The standard game models which have been shown to be fully abstract for PCF [2,18] are of course also fully abstract for the restricted language safe PCF. One may ask, however, whether there exists a fully abstract model with respect to safe context only. Such model may be obtained by considering P-incrementally justified strategies-which have been shown to compose [7]. Its is reasonable to think that O-moves also needs to be constrained by the symmetrical O-incremental justification, which corresponds to the requirement that contexts are safe. This line of work is still in progress.
Safe PCF and safe Idealised Algol. PCF is the simply-typed lambda calculus augmented with basic arithmetic operators, if-then-else branching and a family of recursion combinators Y_A : ((A, A), A), one for every type A. We define safe PCF to be PCF where the application and abstraction rules are constrained in the same way as in the safe lambda calculus. This language inherits the good properties of the safe lambda calculus: no variable capture occurs when performing substitution, and safety is preserved by the reduction rules of the small-step semantics of PCF.
Correspondence. The computation tree of a PCF term is defined as the least upper bound of the chain of computation trees of its syntactic approximants [3]. It is obtained by infinitely expanding the Y combinator, for instance τ(Y(λf x.f x)) is the tree representation of the η-long form of the infinite term (λf It is straightforward to define the traversal rules modeling the arithmetic constants of PCF. Just as in the safe lambda calculus we had to remove @-nodes in order to reveal the game-semantic correspondence, in safe PCF it is necessary to filter out the constant nodes from the traversals. The Correspondence Theorem for PCF says that the revealed game semantics is isomorphic to the set of traversals with these superfluous nodes removed. This can easily be shown for term approximants. It is then lifted to full PCF using the continuity of the function Trv(_) ↾ ⊛ from the set of computation trees (ordered by the approximation ordering) to the set of sets of justified sequences of nodes (ordered by subset inclusion). Finally, computation trees of safe PCF terms are incrementally-bound; thus we have:

Theorem 4.14. Safe PCF terms have P-incrementally justified denotations.
Similarly, we can define safe IA to be safe PCF augmented with the imperative features of Idealized Algol (IA for short) [32]. Adapting the game-semantic correspondence and safety characterization to IA seems feasible, although the presence of the base type var, whose game arena com^ℕ × exp has infinitely many initial moves, causes a mismatch between the simple tree representation of the term and its game arena. It may be possible to overcome this problem by replacing the notion of computation tree by a "computation directed acyclic graph".
The possibility of representing plays without some or all of their pointers under the safety assumption suggests potential applications in algorithmic game semantics. Ghica and McCusker [15] were the first to observe that pointers are unnecessary for representing plays in the game semantics of the second-order finitary fragment of Idealized Algol (IA_2 for short). Consequently observational equivalence for this fragment can be reduced to the problem of equivalence of regular expressions. At order 3, although pointers are necessary, deciding observational equivalence of IA_3 is EXPTIME-complete [29,28]. Restricting the problem to the safe fragment of IA_3 may lead to a lower complexity.

Further work and open problems
The safe lambda calculus is still not well understood. Many basic questions remain. What is a (categorical) model of the safe lambda calculus? Does the calculus have interesting models? What kind of reasoning principles does the safe lambda calculus support, via the Curry-Howard Isomorphism? Does the safe lambda calculus characterize a complexity class, in the same way that the simply-typed lambda calculus characterizes the polytime-computable numeric functions [21]? Is the addition of unsafe contexts to safe ones conservative with respect to observational (or contextual) equivalence?
With a view to algorithmic game semantics and its applications, it would be interesting to identify sublanguages of Idealised Algol whose game semantics enjoy the property that pointers in a play are uniquely recoverable from the underlying sequence of moves. We name this class PUR. IA_2 is the paradigmatic example of a PUR-language. Another example is Serially Re-entrant Idealized Algol [1], a version of IA where multiple uses of arguments are allowed only if they do not "overlap in time". We believe that a PUR language can be obtained by imposing the safety condition on IA_3. Murawski [27] has shown that observational equivalence for IA_4 is undecidable; is observational equivalence for safe IA_4 decidable?

Appendix - Computation tree, traversals and correspondence
The second author introduced the notions of computation tree and of traversals over a computation tree for the purpose of studying trees generated by higher-order recursion schemes [30]. Here we extend these concepts to the simply-typed lambda calculus. Our setting allows the presence of free variables of any order, and the term studied is not required to be of ground type. (This contrasts with the setting of [30], where the term is of ground type and contains only uninterpreted constants.) Note that we automatically account for the presence of uninterpreted constants since they can just be regarded as free variables. We will then state the Correspondence Theorem (Theorem 6.10) that was used in Sec. 4.
In the following we fix a simply-typed term-in-context Γ ⊢_st M : T (not necessarily safe) and we consider its computation tree τ(M) as defined in Def. 4.1.
6.1. Notations. We first fix some notation. We write ⊛ to denote the root of the computation tree τ(M). The set of nodes of this computation tree is denoted by IN. The sets IN_@, IN_λ and IN_var are respectively the subsets of @-nodes, λ-nodes and variable nodes. The type of a variable-labelled node is the type of the variable that labels it; the type of the root is (A_1, . . . , A_p, T) where x_1 : A_1, . . . , x_p : A_p are the variables in the context Γ; and the type of a node n ∈ (IN_λ ∪ IN_@) \ {⊛} is the type of the subterm of ⌈M⌉ corresponding to the subtree of τ(M) rooted at n.

6.2. Pointers and justified sequences of nodes. We define the enabling relation on the set of nodes of the computation tree as follows: m enables n, written m ⊢ n, if and only if n is bound by m (and we sometimes write m ⊢_i n to indicate that n is the i-th variable bound by m); or m is the root ⊛ and n is a free variable; or n is a λ-node and m is its parent node.
We say that a node n_0 of the computation tree is hereditarily enabled by n_p ∈ IN if there are nodes n_1, . . . , n_{p−1} ∈ IN such that n_{i+1} enables n_i for all i ∈ 0..p − 1.
For any sets of nodes S, H ⊆ N we write S^{H⊢} for {n ∈ S | ∃m ∈ H s.t. m ⊢* n}, the subset of S consisting of nodes hereditarily enabled by some node in H. We abbreviate S^{{m}⊢} to S^{m⊢}.
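The enabling relation and the restriction S^{H⊢} can be sketched in Python as follows. The `Node` class, its fields, the helper names and the small example tree (intended as the computation tree of (λz.z) y with y free) are hypothetical illustrations; hereditary enabling is taken to be the reflexive-transitive closure ⊢*, as in the definition of S^{H⊢}.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)  # nodes are compared by identity
class Node:
    kind: str                         # "lam", "var" or "app" (an @-node)
    label: str
    parent: Optional["Node"] = None
    binder: Optional["Node"] = None   # binding lambda-node; None if free

def enabler(n, root):
    """Return the unique node m with m |- n, or None if there is none."""
    if n.kind == "var":
        # A bound variable is enabled by its binder, a free one by the root.
        return n.binder if n.binder is not None else root
    if n.kind == "lam":
        return n.parent               # None for the root itself
    return None                       # @-nodes are not enabled by any node

def hereditarily_enabled(n, m, root):
    """Check m |-* n by following the chain of enablers upwards from n."""
    while n is not None:
        if n is m:
            return True
        n = enabler(n, root)
    return False

def restrict(S, H, root):
    """Nodes of S that are hereditarily enabled by some node of H."""
    return [n for n in S if any(hereditarily_enabled(n, m, root) for m in H)]

# Hypothetical computation tree of (lambda z. z) y, with y a free variable.
root = Node("lam", "root")
app = Node("app", "@", parent=root)
lam_z = Node("lam", "lambda z", parent=app)
z = Node("var", "z", parent=lam_z, binder=lam_z)
lam = Node("lam", "lambda", parent=app)
y = Node("var", "y", parent=lam)

variables = [z, y]
input_variables = restrict(variables, [root], root)
print([n.label for n in input_variables])   # ['y']
```

In this example z is hereditarily enabled by the @-node rather than by the root, so only y is hereditarily enabled by the root.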
We call input-variable nodes the elements of IN_var^{⊛⊢} (i.e., variable nodes that are hereditarily enabled by the root of τ(M)). Thus we have . A justified sequence of nodes is a sequence of nodes with pointers such that each occurrence of a variable or λ-node n different from the root has a pointer to some preceding occurrence m satisfying m ⊢ n. In particular, occurrences of @-nodes do not have pointers. We represent the pointer in the sequence as m . . . n, with an arc from n back to m labelled i, where the label i indicates that either n is labelled with the i-th variable abstracted by the λ-node m or n is the i-th child of m. Child nodes are numbered from 1 onward, except for the children of @-nodes, whose numbering starts from 0. Abstracted variables are numbered from 1 onward. The i-th child of n is denoted by n.i.
We say that a node n_0 of a justified sequence is hereditarily justified by n_p if there are occurrences n_1, . . . , n_{p−1} in the sequence such that n_i points to n_{i+1} for all i ∈ 0..p − 1. For any occurrence n in a justified sequence s, we write s ↾ n to denote the subsequence of s consisting of occurrences that are hereditarily justified by n.
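As an illustration of the restriction s ↾ n, the following Python sketch computes the positions of the occurrences hereditarily justified by a given occurrence. The encoding of a justified sequence as a list of (label, pointer) pairs is an assumption made for this example, and the occurrence n is taken to be hereditarily justified by itself (the case p = 0).

```python
# A justified sequence is modelled as a list of (label, pointer) pairs,
# where pointer is the index of the justifying occurrence, or None.
# This encoding is an illustrative assumption.

def hereditarily_justified_by(seq, k):
    """Indices of the occurrences hereditarily justified by occurrence k
    (including k itself), i.e. the positions kept in s restricted to k."""
    kept = set()
    for i, (_, ptr) in enumerate(seq):
        # Pointers always go backwards, so one left-to-right pass suffices.
        if i == k or (ptr is not None and ptr in kept):
            kept.add(i)
    return sorted(kept)

# Hypothetical justified sequence over the tree of (lambda z. z) y.
seq = [("root", None), ("@", None), ("lambda z", 1),
       ("z", 2), ("lambda", 1), ("y", 0)]

print(hereditarily_justified_by(seq, 1))  # [1, 2, 3, 4]
print(hereditarily_justified_by(seq, 0))  # [0, 5]
```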
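The notion of P-view ⌜t⌝ of a justified sequence of nodes t is defined in the same way as the P-view of a justified sequence of moves in game semantics:

\[
\begin{aligned}
\ulcorner \epsilon \urcorner &= \epsilon \\
\ulcorner s \cdot n \urcorner &= \ulcorner s \urcorner \cdot n \quad \text{for } n \notin IN_\lambda \\
\ulcorner s \cdot m \cdot \ldots \cdot \lambda\xi \urcorner &= \ulcorner s \urcorner \cdot m \cdot \lambda\xi \quad \text{where } \lambda\xi \text{ points to } m \\
\ulcorner s \cdot \circledast \urcorner &= \circledast
\end{aligned}
\]

The O-view of s, written ⌞s⌟, is defined dually. We will borrow the game-semantic terminology: a justified sequence of nodes satisfies alternation if for any two consecutive nodes one is a λ-node and the other is not, and P-visibility if every variable node points to a node occurring in the P-view at that point.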
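The P-view clauses can be read as a backward scan of the sequence. The Python sketch below follows that reading; the encoding of occurrences as (kind, pointer) pairs and the function names are assumptions made for illustration, and value-leaves are not handled.

```python
# Occurrences are (kind, pointer) pairs: kind is "lam", "var" or "app",
# and pointer is the index of the justifier (None for the root and for
# @-nodes). This encoding is an illustrative assumption.

def p_view(seq):
    """Return the indices of the occurrences forming the P-view of seq."""
    view = []
    i = len(seq) - 1
    while i >= 0:
        kind, ptr = seq[i]
        view.append(i)
        if kind == "lam":
            if ptr is None:   # occurrence of the root: the view stops here
                break
            i = ptr           # jump to the justifier m, skipping what lies between
        else:
            i -= 1            # non-lambda node: continue with the prefix
    return view[::-1]

def p_visible(seq):
    """Check that every variable node points into the P-view at that point."""
    return all(ptr in p_view(seq[:i])
               for i, (kind, ptr) in enumerate(seq) if kind == "var")

# Hypothetical sequence: root, @, lambda z, z, lambda, y.
seq = [("lam", None), ("app", None), ("lam", 1),
       ("var", 2), ("lam", 1), ("var", 0)]

print(p_view(seq))     # [0, 1, 4, 5]
print(p_visible(seq))  # True
```

Working with indices rather than copies of the nodes keeps the pointers of the P-view implicit: an occurrence in the view still points to the same position as in the original sequence.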
6.3. Computation tree with value-leaves. We now add another ingredient to the computation tree that was not originally used in the context of higher-order grammars [30]. We write D to denote the set of values of the base type o. We add value-leaves to τ(M) as follows: for each value v ∈ D and for each node n of the computation tree, we attach a new child leaf v_n to n. We write N for the set of nodes (i.e., inner nodes and leaf nodes) of the resulting tree. The set of leaf nodes is denoted L; we thus have N = IN ∪ L. For $ ranging over {@, λ, var}, we write N_$ to denote the set consisting of nodes from IN_$ together with leaf nodes with parent node in IN_$; formally N_$ = IN_$ ∪ {v_n | n ∈ IN_$, v ∈ D}.

The basic notions can be adapted to this new version of computation tree: a value-leaf has order 0. The enabling relation ⊢ is extended so that every leaf is enabled by its parent node. A link going from a value-leaf v_n to a node n is labelled by v (e.g., n . . . v_n, with the link from v_n to n labelled v). For the definition of P-view and visibility, value-leaves are treated as λ-nodes if they are at an odd level in the computation tree, and as variable nodes if they are at an even level.
We say that an occurrence of an inner node n ∈ IN is answered by an occurrence of v_n if v_n occurs in the sequence and points to n; otherwise we say that n is unanswered. The last unanswered node is called the pending node. A justified sequence of nodes is well-bracketed if each value-leaf occurring in it is justified by the pending node at that point. If t is a traversal then we write ?(t) to denote the subsequence of t consisting only of unanswered nodes.

6.4. Traversals of the computation tree. A traversal is a justified sequence of nodes of the computation tree where each node indicates a step that is taken during the evaluation of the term.  Table 1. A traversal that cannot be extended by any rule is said to be maximal.
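The notions of unanswered node, pending node and well-bracketing introduced in 6.3 can be checked mechanically. The Python sketch below is one way to do so; the encoding of occurrences as (kind, pointer) pairs with kind "leaf" for value-leaves, and the function names, are assumptions made for illustration.

```python
# Occurrences are (kind, pointer) pairs; kind is "lam", "var", "app" or
# "leaf" (a value-leaf), and pointer is the index of the justifier or None.
# This encoding is an illustrative assumption.

def unanswered(seq):
    """Indices of the inner-node occurrences with no value-leaf pointing
    to them: the positions retained in ?(t)."""
    answered = {ptr for kind, ptr in seq if kind == "leaf"}
    return [i for i, (kind, _) in enumerate(seq)
            if kind != "leaf" and i not in answered]

def pending(seq):
    """Index of the last unanswered occurrence, or None if there is none."""
    open_nodes = unanswered(seq)
    return open_nodes[-1] if open_nodes else None

def well_bracketed(seq):
    """Each value-leaf must point to the pending node of the prefix before it."""
    return all(ptr == pending(seq[:i])
               for i, (kind, ptr) in enumerate(seq) if kind == "leaf")

# Hypothetical sequence: root, @, lambda z, z, lambda, y, then two leaves
# answering y and the dummy lambda.
seq = [("lam", None), ("app", None), ("lam", 1), ("var", 2),
       ("lam", 1), ("var", 0), ("leaf", 5), ("leaf", 4)]

print(unanswered(seq))      # [0, 1, 2, 3]
print(pending(seq))         # 3
print(well_bracketed(seq))  # True
```

Recomputing the pending node for every prefix is quadratic; a stack of open occurrences would give a linear check, at the cost of a slightly longer sketch.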
• ϕ maps λ-nodes to O-questions, variable nodes to P-questions, value-leaves of λ-nodes to P-answers and value-leaves of variable nodes to O-answers.
• ϕ maps nodes of a given order to moves of the same order.
If t = t_0 t_1 . . . is a justified sequence of nodes in N_λ ∪ N_var then ϕ(t) is defined to be the sequence of moves ϕ(t_0) ϕ(t_1) . . . equipped with the pointers of t.

6.6. The Correspondence Theorem. In game semantics, strategy composition is performed using a CSP-like "composition + hiding". If some of the internal moves are not hidden then we obtain alternative denotations called revealed semantics [16] or interaction semantics [13]. We obtain different notions of revealed semantics depending on the choice of internal moves that we hide. For instance the fully revealed denotation of Γ ⊢_st M : T, written ⟨⟨Γ ⊢_st M : T⟩⟩, is obtained by uncovering all the internal moves from [[Γ ⊢_st M : T]] that are generated during composition. The inverse operation consists in filtering out the internal moves.
The syntactically-revealed denotation, written ⟨⟨Γ ⊢_st M : T⟩⟩_s, differs from the fully-revealed one in that only certain internal moves are preserved during composition: when computing the denotation of an application joined by an @-node in the computation tree, all the internal moves are preserved. When computing the denotation of y_i N_1 . . . N_p for some variable y_i, however, we only preserve the internal moves of N_1, . . . , N_p while omitting the internal moves produced by the copy-cat projection strategy denoting y_i.
The Correspondence Theorem states that in the simply-typed lambda calculus, the set Trv(M) of traversals of the computation tree is isomorphic to the syntactically-revealed denotation, and the set of traversal reductions is isomorphic to the standard strategy denotation:

Theorem 6.10 (The Correspondence Theorem). We have the following two isomorphisms: