A Logical Foundation for Environment Classifiers

Taha and Nielsen have developed a multi-stage calculus λ α with a sound type system using the notion of environment classifiers. They are special identifiers, with which code fragments and variable declarations are annotated, and their scoping mechanism is used to ensure statically that certain code fragments are closed and safely runnable. In this paper, we investigate the Curry-Howard isomorphism for environment classifiers by developing a typed λ-calculus λ ⊲ . It corresponds to multi-modal logic that allows quantification by transition variables (a counterpart of classifiers), which range over (possibly empty) sequences of labeled transitions between possible worlds. This interpretation will reduce the "run" construct (which has a special typing rule in λ α ) and embedding of closed code into other code fragments of different stages (which would be only realized by the cross-stage persistence operator in λ α ) to merely a special case of classifier application. λ ⊲ enjoys not only basic properties including subject reduction, confluence, and strong normalization but also an important property as a multi-stage calculus: time-ordered normalization of full reduction. Then, we develop a big-step evaluation semantics for an ML-like language based on λ ⊲ with its type system and prove that the evaluation of a well-typed λ ⊲ program is properly staged. We also identify a fragment of the language, where erasure evaluation is possible. Finally, we show that the proof system augmented with a classical axiom is sound and complete with respect to a Kripke semantics of the logic.


Introduction
A number of programming languages and systems that support manipulation of programs as data [1,2,3,4,5] have been developed in the last two decades. A popular language abstraction in these languages consists of the Lisp-like quasiquotation mechanism to create and compose code fragments and a function to run them like eval in Lisp. For those languages and systems, a number of type systems for so-called "multi-stage" calculi have been studied [5,6,7,8,9,10,11] to guarantee safety of generated programs even before the generating program runs.

T. TSUKADA AND A. IGARASHI
Among them, some seminal work on the principled design of type systems for multi-stage calculi is due to Davies [7] and Davies and Pfenning [12,8]. They discovered the Curry-Howard isomorphism between modal/temporal logics and multi-stage calculi by identifying (1) modal operators in modal logic with type constructors for code fragments treated as data and, in the case of temporal logic, (2) the notion of time with computation stages. For example, the calculus λ ○ [7], which can be thought of as a reformulation of Glück and Jørgensen's calculus for multi-level generating extensions [6] by using explicit quasiquote and unquote in the language, corresponds to a fragment of linear-time temporal logic (LTL) with the temporal operator "next" (written ○) [13]. Here, linearly ordered time corresponds to the level of nesting of quasiquotations, and a modal formula ○A to the type of code of type A. It, however, does not treat eval; in fact, the code type in λ ○ represents code values, whose bodies are open, that is, may have free variables, so simply adding eval to the calculus does not work: execution may fail by referencing free variables in the code. The calculus developed by Davies and Pfenning [12,8], on the other hand, corresponds to (intuitionistic) modal logic S4 (only with the necessity operator □), in which a formula □A is considered the type of closed code values of type A. It supports safe eval since every code is closed, but inability to deal with open code hampers generation of efficient code. The following work by Taha and others [5,14,15,9,16] sought various forms of combinations of the two systems above to develop expressive type systems for multi-stage calculi.
Finally, Taha and Nielsen [9] developed a multi-stage calculus λ α , which was later modified to make type inference possible [16] and implemented as a basis of MetaOCaml. The calculus λ α has a strong type system while supporting open code, run (which corresponds to eval), and a mechanism called cross-stage persistence (CSP), which allows a value to be embedded in a code fragment evaluated later. They introduced to the type system the notion of environment classifiers (or, simply, classifiers), which are special identifiers with which code fragments and variable declarations are annotated. A key idea is to reduce the closedness checking of a code fragment (which is useful to guarantee the safety of run) to the freshness checking of a classifier. Unfortunately, however, the correspondence to a logic is no longer clear for λ α , resulting in somewhat ad-hoc typing rules and complicated operational semantics, which would be difficult to adapt to different settings.
In this paper, we investigate a Curry-Howard isomorphism for environment classifiers by developing a typed λ-calculus λ ⊲ . As a computational calculus, λ ⊲ is equipped with quasiquotation (annotated with environment classifiers) and abstraction over environment classifiers just like λ α , with application of a classifier abstraction to a possibly empty sequence of environment classifiers, which makes λ ⊲ different from λ α . Intuitively, (the type system of) λ ⊲ can be considered a proof system of a multi-modal logic to reason about deterministic labeled transition systems. Here, modal operators are indexed with transition labels, and so the logic is multi-modal. One notable feature of the logic is that it has quantification that allows one to express "for any state transitions," where a state transition is a possibly empty sequence of labels. This quantifier corresponds to types for classifier abstractions, used to ensure freshness of classifiers, which correspond to transition labels (and variables ranging over their sequences).
A pleasant effect of this logical interpretation (in particular, of interpreting environment classifiers as variables ranging over sequences of transition labels) is that it will reduce the run construct, which has a peculiar typing rule in λ α , and embedding of closed code into other code fragments of different stages, which would be only realized by the CSP operator in λ α , to merely a special case of classifier application.

Our technical contributions can be summarized as follows:
• Identification of a modal logic that corresponds to (a computational calculus with) environment classifiers;
• Development of a new typed λ-calculus λ ⊲ , which naturally emerges from the correspondence, with its syntax, operational (small-step reduction and big-step evaluation) semantics, and type system;
• Proofs of basic properties, which a multi-stage calculus is expected to enjoy; and
• Proofs of soundness and completeness of the proof system (augmented with a classical axiom) with respect to a Kripke semantics of the logic.
Our calculus λ ⊲ not only enjoys the basic properties such as subject reduction, confluence, and strong normalization but also time-ordered normalization [7,10], which says (full) reduction to a normal form can always be performed according to the order of stages. We extend λ ⊲ with base types and recursion, define a big-step evaluation semantics as a basis of a multi-stage programming language such as MetaOCaml, and prove the evaluation of a well-typed program is safe and staged, i.e., if a program of a code type evaluates to a result, it is a code value whose body is a well-typed program, again. We also develop erasure semantics, where information on classifiers is (mostly) discarded, and identify a subset of the language, where the original and erasure semantics agree, by an alternative type system. It turns out that the subset is rather similar to λ i [16], whose type system is used in the current implementation of MetaOCaml.
One missing feature in λ ⊲ is CSP for all types of values but we do not think it is a big problem. First, CSP for primitive types such as integers is easy to add as a primitive; CSP for function types is also possible as long as they do not deal with open code, which, we believe, is usually the case. Second, as mentioned above, embedding closed code into code fragments of later stages is supported by a different means. It does not seem very easy to add CSP for open code to λ ⊲ , but we think it is rarely needed. For more detail, see Section 6.3.
We can obtain a natural deduction proof system of a new logic that corresponds to the calculus λ ⊲ just by removing terms from typing rules, as usual. It is also easy to see that terms and reduction in the calculus correspond to proofs and proof normalization in the logic, respectively.
Of course, we should answer an important question: "What does this logic really mean?" We will elaborate the intuitive meaning of formulae in Section 2 and proof rules can be understood according to this informal interpretation but, to answer this question more precisely, one has to give a semantics and prove the proof system is sound and complete with respect to the semantics. However, the logic is intuitionistic and it is not straightforward to give (Kripke) semantics [17]. So, instead of Kripke semantics of the logic directly corresponding to λ ⊲ , we give that of a classical version of the proof system, which has a proof rule for double negation elimination and prove that the proof system is sound and complete with it in Section 5. Even though the semantics does not really correspond to λ ⊲ , it justifies our informal interpretation.
Organization of the Paper. In Section 2, we review λ α and informally describe how the features of its type system correspond to those of a logic. In Section 3, we define the multi-stage calculus λ ⊲ and prove basic properties including subject reduction, strong normalization, confluence, and time-ordered normalization. In Section 4, we define MiniML ⊲ , an extension of λ ⊲ with base types and recursion, with its big-step semantics and prove that the big-step semantics implements staged execution. We also investigate erasure semantics of a subset of MiniML ⊲ here. In Section 5, we formally define (a classical version of) the logic and prove soundness and completeness of the proof system (augmented with a classical rule) with respect to a Kripke semantics. Lastly, we discuss related work and conclude.

Interpreting Environment Classifiers in a Modal Logic
In this section, we informally describe how environment classifiers can be interpreted in a modal logic. We start with reviewing Davies' λ ○ [7] to get an intuition of how notions in a modal logic correspond to those in a multi-stage calculus. Then, along with reviewing main ideas of environment classifiers, we describe our logic informally and how our calculus λ ⊲ is different from λ α by Taha and Nielsen [9].
2.1. λ ○ : Multi-Stage Calculus Based on LTL. Davies has developed the typed multi-stage calculus λ ○ , which corresponds to a fragment of intuitionistic LTL by the Curry-Howard isomorphism. It can be considered the λ-calculus with a Lisp-like quasiquotation mechanism. We first review linear-time temporal logic and the correspondence between the logic and the calculus.
In LTL, the truth of propositions may depend on discrete and linearly ordered time, i.e., a given time has a unique time that follows it. Some of the standard temporal operators are ○ (to mean "next"), □ (to mean "always"), and U (to mean "until"). The Kripke semantics of (classical) LTL can be given by taking the set of natural numbers as possible worlds;¹ then, for example, the semantics of ○ is given by: n ⊩ ○τ if and only if n + 1 ⊩ τ, where n ⊩ τ is the satisfaction relation, which means "τ is true in world n (or, at time n)." In addition to the usual Curry-Howard correspondence between propositions and types and between proofs and terms, Davies has pointed out additional correspondences between time and computation stages (i.e., levels of nested quotations) and between the temporal operator ○ and the type constructor meaning "the type of code of". So, for example, ○τ 1 → ○τ 2 , which means "if τ 1 holds at next time, then τ 2 holds at next time," is considered the type of functions that take a piece of code of type τ 1 and return another piece of code of type τ 2 . According to this intuition, he has developed λ ○ , corresponding to the fragment of LTL only with ○.
¹ Note that this is equivalent to another, perhaps more standard presentation as a sublogic of CTL* [13].
λ ○ has two new term constructors next M and prev M , which correspond to the introduction and elimination rules of ○, respectively. The type judgment of λ ○ is of the form Γ ⊢^n M : τ , where Γ is a context, M is a term, τ is a type (a proposition of LTL, only with ○) and n is a natural number indicating a stage. A context, which corresponds to assumptions, is a mapping from variables to pairs of a type and a natural number, since the truth of a proposition depends on time. The key typing rules are those for next and prev:
\[ \frac{\Gamma \vdash^{n+1} M : \tau}{\Gamma \vdash^{n} \mathsf{next}\; M : \bigcirc\tau} \qquad \frac{\Gamma \vdash^{n} M : \bigcirc\tau}{\Gamma \vdash^{n+1} \mathsf{prev}\; M : \tau} \]
The former means that, if M is of type τ at stage n + 1, then, at stage n, next M is code of type τ ; the latter is its converse. Computationally, next and prev can be considered quasiquote and unquote, respectively. So, in addition to the standard β-reduction, λ ○ has the reduction rule prev (next M ) −→ M , which cancels next by prev.
The code types in λ ○ are often called open code types, since the quoted code may contain free variables. For this reason, naively adding a construct to "run" quoted code does not work: it may cause unbound variable errors.
Although the logic is considered intuitionistic, Davies has only shown that the proof system augmented with double negation elimination is equivalent to a standard axiomatic formulation [13], which is known to be sound and complete with the Kripke semantics described above. Kojima and Igarashi [18,19] have studied the semantics of intuitionistic LTL and shown that the proof system obtained from λ ○ is sound and complete with the given semantics. Even though the Kripke semantics discussed here does not really correspond to the logic obtained from the calculus, it certainly helps understand the intuition behind the logic and we will continue to use Kripke semantics in what follows for an explanatory purpose.

2.2. Multi-Modal Logic for Environment Classifiers. Taha and Nielsen [9] have introduced environment classifiers to develop λ α , which has quasiquotation, run, and CSP with a strong type system. We explain how λ α can be derived from λ ○ .² Environment classifiers are a special kind of identifiers with which code types and quoting are annotated: for each classifier α, there are a type constructor ⟨τ⟩^α for code and a term constructor ⟨M⟩^α to quote M . Then, a stage is naturally expressed by a sequence of classifiers, and a type judgment is of the form Γ ⊢^A M : τ , where natural numbers in a λ ○ type judgment are replaced with sequences A of classifiers. So, the typing rules of quoting and unquoting (written ~M ) in λ α are given as follows:
\[ \frac{\Gamma \vdash^{A\alpha} M : \tau}{\Gamma \vdash^{A} \langle M \rangle^{\alpha} : \langle \tau \rangle^{\alpha}} \qquad \frac{\Gamma \vdash^{A} M : \langle \tau \rangle^{\alpha}}{\Gamma \vdash^{A\alpha} \mathord{\sim} M : \tau} \]
Obviously, this is a generalization of λ ○ : if only one classifier is allowed, then the calculus is essentially λ ○ . The corresponding logic would also be a generalization of LTL, in which there are several "dimensions" of linearly ordered time. A Kripke frame for the logic is given by a transition system in which each transition relation is a map. More formally, a frame is a triple $(S, L, \{\xrightarrow{\alpha} \mid \alpha \in L\})$ where S is the (non-empty) set of states, L is the set of labels, and $\xrightarrow{\alpha} \in S \to S$ for each α ∈ L. Then, the semantics of ⟨τ⟩^α is given by: s ⊩ ⟨τ⟩^α if and only if s′ ⊩ τ for any s′ such that $s \xrightarrow{\alpha} s'$, where s and s′ are states.
The calculus λ α has also a scoping mechanism for classifiers and it plays a central role to guarantee safety of run. The term (α)M , which binds α in M , declares that α is used locally in M and such a local classifier can be instantiated with another classifier by term M [β]. We show typing rules for them with one for run below:
\[ \frac{\Gamma \vdash^{A} M : \tau \quad \alpha \notin \mathrm{FV}(\Gamma, A)}{\Gamma \vdash^{A} (\alpha)M : (\alpha)\tau} \qquad \frac{\Gamma \vdash^{A} M : (\alpha)\tau}{\Gamma \vdash^{A} M[\beta] : \tau[\alpha := \beta]} \qquad \frac{\Gamma \vdash^{A} M : (\alpha)\langle \tau \rangle^{\alpha}}{\Gamma \vdash^{A} \mathsf{run}\; M : (\alpha)\tau} \]
² Unlike the original presentation, classifiers do not appear explicitly in contexts here. The typing rules shown are accordingly adapted.

The rule for (α)M requires that α does not occur in the context (the term M has no free variable labeled α) and gives a type of the form (α)τ , which Taha and Nielsen called α-closed type and which characterizes a relaxed notion of closedness. For example, the term ⟨λx : b.x⟩^α is a closed term, so this term is α-closed and the judgment ∅ ⊢^ε (α)⟨λx : b.x⟩^α : (α)⟨b → b⟩^α is derivable. The term ⟨x⟩^α , however, is not α-closed because this term has free variable x in the stage α, but it is β-closed (if β ≠ α) because there is no free variable in the stage containing the classifier β. The rule for run M says that an α-closed code fragment annotated with α can be run. Note that ⟨·⟩^α (but not (α)·) is removed in the type of run M . Taha and Nielsen have shown that α-closedness is sufficient to guarantee safety of run.
When this system is to be interpreted as logic, it is fairly clear that (α)τ is a kind of universal quantifier, as Taha and Nielsen have also pointed out [9]. Then, the question is "What does a classifier range over?", which has not really been answered so far. Another interesting question is "How can the typing rule for run be read logically?" One plausible answer to the first question is that "classifiers range over the set of transition labels". This interpretation matches the rule for M [β] and it seems that the typing rules without run (with a classical axiom) are sound and complete with the Kripke semantics that defines s ⊩ (α)τ by s ⊩ τ [α := β] for all β ∈ L. However, it is then difficult to explain the rule for run.
The key idea to solve this problem is to have classifiers range over the set of finite (and possibly empty) sequences of transition labels and to allow a classifier abstraction (α)M to be applied also to sequences of classifiers. Then, run will be unified to a special case of application of a classifier abstraction to the empty sequence. More concretely, we change the term M [β] to M [B], where B is a possibly empty sequence of classifiers (the left rule below). When B is empty and τ is ⟨τ 0⟩^α (assuming τ 0 does not include α), the rule (shown as the right rule below) can be thought of as the typing rule of (another version of) run, since α-closed code of τ 0 becomes simply τ 0 (without (α)· as in the original λ α ).
\[ \frac{\Gamma \vdash^{A} M : (\alpha)\tau}{\Gamma \vdash^{A} M[B] : \tau[\alpha := B]} \qquad \frac{\Gamma \vdash^{A} M : (\alpha)\langle \tau_0 \rangle^{\alpha}}{\Gamma \vdash^{A} M[\varepsilon] : \tau_0} \]
Another benefit of this change is that CSP for closed code (or embedding of persistent code [10]) can be easily expressed. For example, if x is of the type (α)⟨int⟩^α , then it can be used as code computing an integer at different stages as in, say, ⟨ · · · (~x[α]) + 3 · · · ⟨ · · · 4 + (~~x[αβ]) · · · ⟩^β · · · ⟩^α . So, once a programmer obtains closed code, she can use it at any later stage. While our calculus λ ⊲ does not have a primitive of CSP for all types, we can express CSP in many cases. In Section 6.3, we discuss this subject in more detail.
Correspondingly, the semantics is now given by v, ρ; s ⊩ τ where v is a valuation for propositional variables and ρ is a mapping from classifiers to sequences of transition labels. Then, v, ρ; s ⊩ ⟨τ⟩^α is defined by v, ρ; s′ ⊩ τ where s′ is reachable from s through the sequence ρ(α) of transitions, and v, ρ; s ⊩ (α)τ by: v, ρ[A/α]; s ⊩ τ for any sequence A of labels (ρ[A/α] updates the value of α to be A). In Section 5, we give a formal definition of the Kripke semantics and show that the proof system, based on the ideas above, with double negation elimination is sound and complete with respect to it.
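The satisfaction relation just described can be made concrete on a small finite frame. The following Python fragment is only an illustrative sketch, not the formal definition of Section 5: the tuple encoding of formulas, the function names `step` and `sat`, and the classical reading of implication are assumptions made here. In particular, the quantifier (α)τ ranges over all finite label sequences, which cannot be enumerated, so the sketch checks sequences only up to a hypothetical length `bound`.

```python
from itertools import product

# Hypothetical formula encoding (not from the paper):
#   ("var", p)       propositional variable p
#   ("imp", f, g)    implication, read classically here
#   ("code", a, f)   code type annotated with classifier a
#   ("all", a, f)    quantification (a)f over label sequences

def step(trans, s, labels):
    """Follow a sequence of labels from state s; each label denotes a map on states."""
    for label in labels:
        s = trans[label][s]
    return s

def sat(frame, v, rho, s, f, bound=2):
    """Check v, rho; s |= f, testing quantifiers only on sequences of length <= bound."""
    states, labels, trans = frame
    tag = f[0]
    if tag == "var":
        return s in v[f[1]]
    if tag == "imp":
        return (not sat(frame, v, rho, s, f[1], bound)) or sat(frame, v, rho, s, f[2], bound)
    if tag == "code":
        # Move along the sequence rho(a), then check the body there.
        return sat(frame, v, rho, step(trans, s, rho[f[1]]), f[2], bound)
    if tag == "all":
        # ASSUMPTION: a bounded approximation of "for any sequence A of labels".
        for n in range(bound + 1):
            for seq in product(labels, repeat=n):
                if not sat(frame, v, {**rho, f[1]: seq}, s, f[2], bound):
                    return False
        return True
    raise ValueError(tag)

# A two-state frame with one label "a": 0 -> 1 and 1 -> 1; p holds only at state 1.
frame = ({0, 1}, ["a"], {"a": {0: 1, 1: 1}})
valuation = {"p": {1}}
closed_p = ("all", "x", ("code", "x", ("var", "p")))
```

On this frame, `closed_p` fails at state 0 because the empty sequence is one of the instances of the quantifier and p does not hold at state 0. This is the semantic counterpart of reading run as instantiation with the empty sequence.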

The Calculus λ ⊲
In this section, we define the calculus λ ⊲ , based on the ideas described in the previous section: we first define its syntax, type system, and small-step full reduction semantics and state some basic properties; then we prove the time-ordered normalization property. Finally, we give an example of programming in λ ⊲ . We intentionally make notations for type and term constructors different from λ α because their precise meanings are different; it is also to avoid confusion when we compare the two calculi.
3.1. Syntax. Let Σ be a countably infinite set of transition variables, ranged over by α and β. A transition, denoted by A and B, is a finite sequence of transition variables; we write ε for the empty sequence and AB for the concatenation of the two transitions. We write Σ* for the set of transitions. A transition is often called a stage. We write FTV(A) for the set of transition variables in A.
Let PV be the set of base types (corresponding to propositional variables), ranged over by b. The set Φ of types, ranged over by τ and σ, is defined by the following grammar:
\[ \tau, \sigma ::= b \mid \tau \to \sigma \mid \rhd_{\alpha} \tau \mid \forall\alpha.\tau \]
A type is a base type, a function type, a code type, which corresponds to ⟨·⟩^α of λ α , or an α-closed type, which corresponds to (α)τ . The transition variable α of ∀α.τ is bound in τ . In what follows, we assume tacit renaming of bound variables in types. The type constructor ⊲ α binds tighter than → and → tighter than ∀: for example, ⊲ α τ → σ means (⊲ α τ ) → σ and ∀α.τ → σ means ∀α.(τ → σ). We write FTV(τ ) for the set of free transition variables, which is defined in a straightforward manner.
Let Υ be a countably infinite set of variables, ranged over by x and y. The set of terms, ranged over by M and N , is defined by the following grammar:
\[ M, N ::= x \mid \lambda x{:}\tau.M \mid M\,N \mid \blacktriangleright_{\alpha} M \mid \blacktriangleleft_{\alpha} M \mid \Lambda\alpha.M \mid M\,A \]
In addition to the standard λ-terms, there are four more terms, which correspond to ⟨M⟩^α , ~M , (α)M , and M [β] of λ α (respectively, in the order presented). Note that, unlike ~M in λ α , the term ◭ α M for unquote is also annotated. This annotation is needed because a single transition variable can be instantiated with a sequence, in which case a single unquote has to be duplicated accordingly. The variable x in λx : τ.M and the transition variable α in Λα.M are bound in M .
3.2. Type System. As mentioned above, a type judgment and variable declarations in a context are annotated with stages. A context Γ is a finite set {x 1 : τ 1 @A 1 , . . . , x n : τ n @A n }, where x i are distinct variables. We often omit braces {}.
We write FTV(Γ) for the set of free transition variables in Γ, defined by:
\[ \mathrm{FTV}(\{x_1 : \tau_1 @ A_1, \ldots, x_n : \tau_n @ A_n\}) = \bigcup_{i=1}^{n} \bigl(\mathrm{FTV}(\tau_i) \cup \mathrm{FTV}(A_i)\bigr) \]
A type judgment is of the form Γ ⊢^A M : τ , read "term M is given type τ under context Γ at stage A." Figure 1 presents the typing rules to derive type judgments. The notation τ [α := B], used in the rule (Ins), is capture-avoiding substitution of transition B for α in τ . When α in ⊲ α is replaced by a transition, we identify ⊲ ε τ with τ and ⊲ AB τ with ⊲ A ⊲ B τ .
The first three rules on the left are mostly standard except for stage annotations. The conditions on stage annotations are similar to those in most multi-stage calculi: the rule (Var) means that variables can appear only at the stage in which those variables are declared, the rule (Abs) requires the stage of the parameter to be the same as that of the body and, correspondingly, the rule (App) requires M and N to be typed at the same stage. The next two rules (◮) and (◭) are for quoting and unquoting and were already explained in the previous section. The last two rules (Gen) and (Ins) are for generalization and instantiation of a transition variable, respectively. They resemble the introduction and elimination rules of ∀x.A(x) in first-order predicate logic: the side condition of the rule (Gen) ensures that the choice of α is independent of the context. Computationally, this side condition expresses α-closedness of M , which means that M has no free variable with annotation α in its type or its stage. This is weaker than ordinary closedness, which means that M has no free variable at all.
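To see how the seven rules fit together, the following Python fragment sketches a type checker that follows the prose description above. It is a minimal sketch under stated assumptions rather than a transcription of Figure 1: the tagged-tuple encoding and the names `ftv_type`, `subst_type`, and `typeof` are hypothetical, types are compared structurally instead of up to renaming of bound variables, substitution assumes that bound transition variables are distinct from the ones being substituted, and the side condition of (Gen) is read as "α occurs neither in the context nor in the current stage".

```python
# Hypothetical encodings (not from the paper):
#   types: ("base", b) | ("fun", t1, t2) | ("code", a, t) | ("all", a, t)
#   terms: ("var", x) | ("lam", x, t, M) | ("app", M, N) | ("quote", a, M)
#          | ("unquote", a, M) | ("gen", a, M) | ("ins", M, B), with B a tuple
#   contexts: dict mapping x to (type, stage); stages are tuples of transition variables

def ftv_type(t):
    tag = t[0]
    if tag == "base":
        return set()
    if tag == "fun":
        return ftv_type(t[1]) | ftv_type(t[2])
    if tag == "code":
        return {t[1]} | ftv_type(t[2])
    return ftv_type(t[2]) - {t[1]}                     # "all"

def subst_type(t, a, B):
    """t[a := B]; ASSUMPTION: bound variables differ from a and from those in B."""
    tag = t[0]
    if tag == "base":
        return t
    if tag == "fun":
        return ("fun", subst_type(t[1], a, B), subst_type(t[2], a, B))
    if tag == "code":
        body = subst_type(t[2], a, B)
        if t[1] != a:
            return ("code", t[1], body)
        for b in reversed(B):                          # code over the empty sequence is the body itself
            body = ("code", b, body)
        return body
    return t if t[1] == a else ("all", t[1], subst_type(t[2], a, B))

def typeof(ctx, stage, m):
    tag = m[0]
    if tag == "var":                                   # (Var): declared stage must match
        t, at = ctx[m[1]]
        if at != stage:
            raise TypeError("variable used at a different stage")
        return t
    if tag == "lam":                                   # (Abs): parameter lives at the body's stage
        _, x, t, body = m
        return ("fun", t, typeof({**ctx, x: (t, stage)}, stage, body))
    if tag == "app":                                   # (App): both sides at the same stage
        f, arg = typeof(ctx, stage, m[1]), typeof(ctx, stage, m[2])
        if f[0] != "fun" or f[1] != arg:
            raise TypeError("ill-typed application")
        return f[2]
    if tag == "quote":                                 # quoting: body is typed one stage later
        return ("code", m[1], typeof(ctx, stage + (m[1],), m[2]))
    if tag == "unquote":                               # unquoting: stage must end with the annotation
        if not stage or stage[-1] != m[1]:
            raise TypeError("unquote at a wrong stage")
        t = typeof(ctx, stage[:-1], m[2])
        if t[0] != "code" or t[1] != m[1]:
            raise TypeError("unquote of a non-code term")
        return t[2]
    if tag == "gen":                                   # (Gen): freshness side condition
        used = set(stage)
        for t, at in ctx.values():
            used |= ftv_type(t) | set(at)
        if m[1] in used:
            raise TypeError("transition variable is not fresh")
        return ("all", m[1], typeof(ctx, stage, m[2]))
    if tag == "ins":                                   # (Ins): instantiate with a transition
        t = typeof(ctx, stage, m[1])
        if t[0] != "all":
            raise TypeError("not a transition abstraction")
        return subst_type(t[2], t[1], m[2])
    raise ValueError(tag)

B_ = ("base", "b")
closed_id = ("gen", "a", ("quote", "a", ("lam", "x", B_, ("var", "x"))))
```

With these definitions, `typeof({}, (), closed_id)` yields the encoding of ∀α.⊲ α (b → b), and instantiating `closed_id` with the empty tuple yields b → b, which mirrors the reading of run as a special case of (Ins).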

3.3. Reduction. We will introduce full reduction M −→ N , read "M reduces to N in one step," and prove basic properties including subject reduction, confluence and strong normalization.
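Before giving the definition of reduction, we define substitution. Since the calculus has binders for term variables and transition variables, we need two kinds of substitutions, one for each kind of variable. Substitution M [x := N ] for a term variable is the standard capture-avoiding one, and its definition is omitted here. Substitution M [α := A] of A for α is defined similarly to τ [α := A]. We show representative cases below:
\[ (\blacktriangleright_{\beta} M)[\alpha := A] = \blacktriangleright_{\beta[\alpha := A]} (M[\alpha := A]) \qquad (\blacktriangleleft_{\beta} M)[\alpha := A] = \blacktriangleleft_{\beta[\alpha := A]} (M[\alpha := A]) \qquad (M\,B)[\alpha := A] = (M[\alpha := A])\,(B[\alpha := A]) \]
where $\blacktriangleright_{\alpha_1 \cdots \alpha_n} M$ stands for $\blacktriangleright_{\alpha_1} \cdots \blacktriangleright_{\alpha_n} M$ and $\blacktriangleleft_{\alpha_1 \cdots \alpha_n} M$ stands for $\blacktriangleleft_{\alpha_n} \cdots \blacktriangleleft_{\alpha_1} M$. Note that, when a transition variable in ◭ is replaced, the order of transition variables is reversed, because this is the inverse operation of ◮. This is similar to the inversion operation in group theory: $(a_1 a_2 \cdots a_n)^{-1} = a_n^{-1} a_{n-1}^{-1} \cdots a_1^{-1}$.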
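The reversal of order under ◭ is easy to get wrong, so a small executable sketch may help. The Python fragment below is an illustration under assumptions, not the definition used in the paper: terms are hypothetical tagged tuples, type annotations on λ are omitted to keep the example short, and bound transition variables are assumed to be distinct from α and from the variables of A, so that no capture can occur.

```python
# Hypothetical term encoding (type annotations omitted for brevity):
#   ("var", x) | ("lam", x, M) | ("app", M, N) | ("quote", a, M)
#   | ("unquote", a, M) | ("gen", a, M) | ("ins", M, B), with B a tuple

def subst_stage(B, a, A):
    """B[a := A]: replace every occurrence of a in the sequence B by the sequence A."""
    out = []
    for b in B:
        out.extend(A if b == a else (b,))
    return tuple(out)

def subst(m, a, A):
    """M[a := A]; ASSUMPTION: no bound transition variable clashes with a or A."""
    tag = m[0]
    if tag == "var":
        return m
    if tag == "lam":
        return ("lam", m[1], subst(m[2], a, A))
    if tag == "app":
        return ("app", subst(m[1], a, A), subst(m[2], a, A))
    if tag == "quote":
        body = subst(m[2], a, A)
        if m[1] != a:
            return ("quote", m[1], body)
        for b in reversed(A):        # quotes are nested in the order a1, ..., an (a1 outermost)
            body = ("quote", b, body)
        return body
    if tag == "unquote":
        body = subst(m[2], a, A)
        if m[1] != a:
            return ("unquote", m[1], body)
        for b in A:                  # unquotes are nested in reverse order (an outermost)
            body = ("unquote", b, body)
        return body
    if tag == "gen":
        return m if m[1] == a else ("gen", m[1], subst(m[2], a, A))
    if tag == "ins":
        return ("ins", subst(m[1], a, A), subst_stage(m[2], a, A))
    raise ValueError(tag)

redex = ("unquote", "a", ("quote", "a", ("var", "x")))
```

Substituting the two-element sequence βγ for α in `redex` produces an unquote by γ around an unquote by β around the nested quotes by β and γ, so each unquote still faces its matching quote. Substituting the empty sequence erases both constructors and leaves the variable.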

The reduction relation M −→ N is the least relation closed under the following three computation rules and congruence rules, which are omitted here.
\[ (\lambda x{:}\tau.M)\,N \longrightarrow M[x := N] \qquad \blacktriangleleft_{\alpha} \blacktriangleright_{\alpha} M \longrightarrow M \qquad (\Lambda\alpha.M)\,A \longrightarrow M[\alpha := A] \]
In addition to the standard β-reduction, there are two rules: the second one, which was already explained previously, cancels quote by unquote and the last one, instantiation of a transition variable, is similar to polymorphic function application in System F. Note that the reduction is full: reduction occurs under any context in any stage. This reduction relation can be thought of as (non-deterministic) proof normalization, which should preserve types, be confluent and strongly normalizing. Then, in Section 3.5, we will define another reduction relation as a triple $M \xrightarrow{T} N$, with T standing for the stage of reduction, as done in λ ○ [7] and λ ○□ [10], to prove time-ordered normalization.
3.4. Basic Properties. We will prove three basic properties, namely, subject reduction, strong normalization and confluence.
The key lemma is, as usual, Substitution Lemma, which says substitution preserves typing. We will prove such a property for each kind of substitution. We define substitution Γ[α := A] for contexts as follows:
\[ \{x_1 : \tau_1 @ A_1, \ldots, x_n : \tau_n @ A_n\}[\alpha := A] = \{x_1 : \tau_1[\alpha := A] @ A_1[\alpha := A], \ldots, x_n : \tau_n[\alpha := A] @ A_n[\alpha := A]\} \]
Lemma 3.1 (Substitution Lemma). (1) If Γ, x : σ@B ⊢^A M : τ and Γ ⊢^B N : σ, then Γ ⊢^A M [x := N ] : τ . (2) If Γ ⊢^A M : τ , then Γ[α := B] ⊢^{A[α:=B]} M [α := B] : τ [α := B].
Proof. Easy induction on the typing rules. We only show main cases.
The proof of (1):
• Case M = x: It is the case that τ = σ and A = B. So, what we have to show is Γ ⊢^B N : τ , which is already assumed.
• Case M = M 1 M 2 : By the typing rules, we know that Γ, x : σ@B ⊢^A M 1 : τ 0 → τ and Γ, x : σ@B ⊢^A M 2 : τ 0 for some τ 0 . By the induction hypothesis, Γ ⊢^A M 1 [x := N ] : τ 0 → τ and Γ ⊢^A M 2 [x := N ] : τ 0 , and the rule (App) gives the conclusion.
The proof of (2):
Proof. By straightforward induction on the derivation of M −→ M ′ , using Substitution Lemma (Lemma 3.1). We only show three base cases and omit induction steps.
Proof. We construct a term ♮(M ) of the simply typed λ-calculus (λ → ) as follows: We can easily prove the following propositions by induction on the structure of M : . Now, assume there exists an infinite reduction sequence from a typable term M . It is clear that there are infinitely many β-reductions. By (2), there exists an infinite reduction sequence from ♮(M ), which is typable by (1). This contradicts the strong normalization property of λ → .
The last property we will show is confluence. We prove this by using parallel reduction and complete development [20]. We define the parallel reduction relation M =⇒ N as in Figure 2. Notice that the rule (P-◭ ◮) allows more than one nested pair of quoting and unquoting to be cancelled in one step: for example, ◭ β ◭ α ◮ α ◮ β x =⇒ x. It is not very standard in the sense that parallel reduction usually does not allow "hidden" redices (that is, redices that appear only after some other reduction steps) to be contracted in one step. We require this definition because a transition variable α can be replaced with a sequence A of transition variables during reduction. If A in (P-◭ ◮) were α, Lemma 3.5 (2) below would not hold any longer. The following lemma relates the reduction relation and the parallel reduction relation. Thanks to this lemma, we know that confluence of −→ is equivalent to confluence of =⇒. We prove confluence of =⇒ by showing that =⇒ enjoys the diamond property. The following properties of parallel reduction are useful.
Proof. Easy induction on the structure of the derivation M 1 =⇒ N 1 and M =⇒ N , respectively. Now, we define the notion of complete development and show its key property. The complete development M ⋆ of M is defined by induction as in Figure 3.
By the definition of parallel reduction, we have that N 0 = Λα.N 1 for some N 1 and So, by applying (P-TIns) rule, we obtain (Λα.
It is easy to show diamond property of =⇒ by using Lemma 3.6. Proof. Choose M ⋆ as N and use the previous lemma.
Proof. By Lemma 3.4, we have . Therefore what we should show is confluence of =⇒, which is an easy consequence of Lemma 3.7.

3.5. Annotated Reduction and Time-Ordered Normalization.
We introduce the notion of stages into reduction and prove the property called time-ordered normalization [7,10]. Intuitively, it says that normalization can be done in the increasing order of stages and does not need to 'go back' to earlier stages. In other words, once all redices at some stage are contracted, subsequent reductions never yield a new redex at that stage. To state time-ordered normalization formally, we first introduce the notion of paths from one stage to another and a new reduction relation, annotated with paths to represent the stage at which reduction occurs.
A path represents how the stage of a subterm is reached from the stage of a given term. For example, if Γ ⊢ α M and Γ ′ ⊢ αβ N for a subterm N of M , then we say the path from (the stage of) M to (that of) N is β. The stage of a subterm may not be able to be expressed by a transition (a sequence of transition variables), however: For example, consider the path from ◭ α M to M . We introduce formal inverses α −1 to deal with such cases: the path from the stage of ◭ α M to that of M is represented by α −1 . Similarly, the path from ◮ α ◭ β M to M will be αβ −1 .
Formally, the set of paths, ranged over by T and U , is the free group generated by the set of transition variables Σ. In other words, a path is a finite sequence ξ 1 ξ 2 . . . ξ n , where ξ i = α or α −1 , such that it includes no subsequence of the form αα −1 or α −1 α for any α.
The product T · U of two paths is their concatenation followed by cancellation of adjacent pairs of the form αα −1 or α −1 α. The empty sequence ε is the unit element for the operation T · U . We simply write T U for T · U . We define $(\xi_1 \xi_2 \cdots \xi_n)^{-1} = \xi_n^{-1} \xi_{n-1}^{-1} \cdots \xi_1^{-1}$. We say a path T is positive if T does not contain formal inverses, in other words, the canonical form of T is in Σ* . We can naturally identify the positive paths with transitions and use metavariables A and B for positive paths. We write T ≤ U when there exists a positive path A which satisfies T A = U . Clearly, ε ≤ T if and only if T is positive.
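The path operations above are ordinary free-group arithmetic and can be sketched in a few lines of Python. The representation is an assumption made here for illustration: a path is a tuple of pairs (variable, exponent) with exponent +1 or −1, and the names `normalize`, `mul`, `inv`, `is_positive`, and `leq` are hypothetical. The test T ≤ U is implemented as "T⁻¹U is positive", which is equivalent to the existence of a positive A with T A = U.

```python
def normalize(path):
    """Cancel adjacent pairs x x^-1 and x^-1 x until none remain."""
    out = []
    for g in path:
        if out and out[-1][0] == g[0] and out[-1][1] == -g[1]:
            out.pop()            # the new generator cancels the previous one
        else:
            out.append(g)
    return tuple(out)

def mul(t, u):
    """The product T . U: concatenation followed by cancellation."""
    return normalize(tuple(t) + tuple(u))

def inv(t):
    """(x1 ... xn)^-1 = xn^-1 ... x1^-1."""
    return tuple((a, -e) for a, e in reversed(t))

def is_positive(t):
    """A path is positive if its canonical form contains no formal inverse."""
    return all(e == 1 for _, e in normalize(t))

def leq(t, u):
    """T <= U iff some positive A satisfies T A = U, i.e. T^-1 U is positive."""
    return is_positive(mul(inv(t), u))

a = (("a", 1),)
b = (("b", 1),)
path_quote_unquote = mul(a, inv(b))   # the path written alpha beta^-1 in the text
```

For instance, the empty path is below `a` but not below `inv(a)`, matching the remark that ε ≤ T holds exactly for positive T.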
The annotated reduction relation is a triple of the form $M \xrightarrow{T} N$, where M and N are terms and T is a path from the stage of M to that of its redex, or more precisely, that of the constructor destructed by the reduction, since the stage of a redex and that of its constructor may be different as in ◮ α in redex ◭ α ◮ α M . The definition of the annotated reduction, presented in Figure 4, is mostly straightforward. For example, α −1 is given to the reduction ◭ α ◮ α M −→ M . As for the rule (AR-◮), the path from M to the constructor destructed by the reduction is T and the path from ◮ α M to M is α, hence the path from ◮ α M to the constructor is given by their concatenation αT . The rule (AR-◭) is similar. The rule (AR-Gen) is the most interesting. First of all, α is bound here, so we cannot propagate T in the premise to the conclusion, in order to prevent α from escaping its scope. We have found that replacing α with ε, which is the earliest possible stage, is a reasonable choice, especially for time-ordered normalization.
Annotated reduction is closely related to reduction defined in the previous section. It is easy to see that M −→ N if and only if there exists T such that $M \xrightarrow{T} N$. Furthermore, such T is unique.
The next theorem shows that any reduction occurs indeed at a positive stage. Proof. See Appendix A.
As its corollary, we know that for any reduction to a normal form from a typable term M is "rearranged" according to an increasing order between stages. Moreover, this increasing order can be any total order that respects ≤, i.e., includes ≤ as a subset.
Corollary 3.11. Let M be a typable term and ⪯ be a total order that respects ≤. Then, there is a reduction sequence $M \xrightarrow{T_1} N_1 \xrightarrow{T_2} \cdots \xrightarrow{T_n} N_n$ which satisfies $T_1 \preceq T_2 \preceq \cdots \preceq T_n$ and in which $N_n$ is a normal form.
3.6. Programming in λ ⊲ . We give an example of programming in λ ⊲ . The example is the power function, which is a classical example in multi-stage calculi and partial evaluation. We augment λ ⊲ with integers, Booleans, arithmetic and comparison operators, if-then-else, a fixed point operator fix, and let. In the next section, we will formalize such a language (without let) as MiniML ⊲ and study its evaluation in more detail. For readability, we often omit type annotations and put terms under quotation in shaded boxes.
We start with the ordinary power function without staging.
Our purpose is to get a code generator power ∀ that takes the exponent n and returns (closed, hence runnable) code of λx.
Here, we follow the construction of code generators in the previous work [15,14].
First, we construct a code manipulator power 1 : int → ⊲ α int → ⊲ α int, which takes an integer n and a piece of integer code and then outputs a piece of code which concatenates the input code by " * " n times. It can be obtained by changing type annotation and introducing quasiquotation.
Then, from power 1 , we can construct a code generator power α of type int → ⊲ α (int → int), which means it takes an integer and returns code of a function.
It indeed behaves as a code generator: for example, power α 3 would evaluate to the code value ◮ α λx : int .x * (x * (x * 1)). This construction is independent of the choice of the stage α. So, by abstracting α at appropriate places in power 1 and power α , we can obtain the desired code generator, whose return type is a closed code type ∀γ. ⊲ γ (int → int).
The output from power ∀ is usable at any stage. For example, if we want code of a cube function at a later stage, say A, then we write power ∀ 3 A. In particular, when A is the empty sequence ε, power ∀ 3 ε : int → int evaluates to a function closure which computes x * x * x * 1 from the input x. The former corresponds to CSP (of closed code) and the latter to run.
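The staging steps of this example can be imitated in Python as a rough, untyped sketch. It represents code fragments as plain strings and uses the built-in `eval` in place of run; the names `power`, `power1`, `power_gen`, and `run` are chosen here for illustration. Unlike λ ⊲, strings carry no classifiers or transition variables, so nothing in this sketch statically guarantees that generated code is closed or well typed.

```python
def power(n: int, x: int) -> int:
    """Ordinary, unstaged power function: x * (x * (... * 1))."""
    return 1 if n == 0 else x * power(n - 1, x)


def power1(n: int, code: str) -> str:
    """Code manipulator: multiply the given code fragment n times.

    The fragment is a string; this stands in for a quoted integer term.
    """
    if n == 0:
        return "1"
    rest = power1(n - 1, code)
    # Parenthesize the tail only when it is itself a product.
    return f"{code} * {rest}" if n == 1 else f"{code} * ({rest})"


def power_gen(n: int) -> str:
    """Code generator: returns the source of a function specialized to n."""
    return f"lambda x: {power1(n, 'x')}"


def run(code: str):
    """Stand-in for run: evaluate closed code at the current stage."""
    return eval(code)


cube_code = power_gen(3)   # "lambda x: x * (x * (x * 1))"
cube = run(cube_code)      # a function closure
```

Here `cube_code` plays the role of the code value produced by the generator for exponent 3, and `run(cube_code)` corresponds to instantiating with the empty sequence ε. Splicing `cube_code` into a larger code string would loosely correspond to using the closed code at a later stage.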

MiniML ⊲
We extend λ ⊲ and define an ML-like functional language MiniML ⊲ , which has, in addition to the features of λ ⊲ , integers, arithmetic and comparison operations, Booleans, conditional expressions, and the (call-by-value) fixed-point combinator fix. We define the type system and big-step evaluation semantics for MiniML ⊲ and prove type soundness. In this semantics, bindings of transition variables have to be maintained at run time. So, we investigate a fragment of MiniML ⊲ , in which programs can be executed by mostly forgetting information on transition variables. We give another type system, which identifies such a fragment, and erasure translation, which removes transitions from terms, and alternative evaluation semantics for erased terms. Then, we prove the erasure property, which says program executions before and after erasure agree.

4.1. Syntax and Type System. The syntax of types and terms of MiniML ⊲ is defined as follows, where n and bv are metavariables ranging over integers and the Boolean constants true and false. The type system is given as a straightforward extension of that of λ ⊲. We show typing rules for the additional constructs. Proof. The proof is essentially the same as that of the Substitution Lemma for λ ⊲ (Lemma 3.1).

4.2. Evaluation and Type Soundness. Now, we give a big-step semantics and prove that the execution of a well-typed program is properly divided into stages. The judgment has the form $\vdash^A M \Downarrow R$, read "evaluating term M at stage A yields result R," where R is either err, which stands for a run-time error, or a value v, defined below. Values are given via a family of sets $V^A$ indexed by transitions, that is, stages. The family $V^A$ is defined by the following grammar: The index A represents the current stage in which a value is typed. So, the index changes under quoting and unquoting. Note that a value at a higher stage (that is, under quotation) may include free variables, applications, and instantiations, since computation is suspended. For example, $x\,y \in V^{\alpha}$ and so ◮ α x y ∈ $V^{\varepsilon}$.
Figure 5 shows the evaluation rules. Notice that metavariables M or N for terms (not values) are used on the right side of ⇓, since it is not immediately clear that a result is really a value of a proper form (or err); we will prove such a property as a theorem. The evaluation is left-to-right and call-by-value. The rules in Figure 5(1) are for ordinary evaluation. The rule for ◭ α M means that quote is canceled by unquote; since the resulting term M′ belongs to the stage α (inside quotation), α is attached to the conclusion. As seen in the rule for Λα.M, Λ does not delay the evaluation of the body. The rule for instantiation of a transition abstraction is straightforward. The rules for stages later than ε, which are in Figure 5(2), are all similar: since the term to be evaluated is inside quotation, each term constructor is left as it is and only subterms of stage ε are evaluated. We also need rules for handling erroneous terms, such as: They are shown in Appendix B.
We show a few properties of the big-step semantics. The first theorem says that evaluation is deterministic.
Proof. By straightforward induction on the derivation of ⊢ A M ⇓ R.
The second theorem below says that, unless the result is err, the result must be a value, even though the rules do not say so explicitly. The last property is type soundness and its corollary that if a well-typed program of a code type yields a result, then the result is a quoted term, whose body is also typable at stage ε. Unlike a usual setting where only closed terms are considered programs, free variables at non-ε stages are considered symbols and do not cause unbound variable errors in MiniML ⊲, so we relax the notion of programs to include terms that contain such symbolic variables. We say that Γ is ε-free if it satisfies A ≠ ε for any x : τ@A ∈ Γ; then, a program is a term which is typed under an ε-free environment. In the statement of the Type Soundness Theorem, we also use the notation $\Gamma^{-A}$, defined by:
Theorem 4.4 (Type Soundness). If Γ is ε-free and $\Gamma \vdash^{\varepsilon} M : \tau$ and $\vdash^{\varepsilon} M \Downarrow R$, then R = v and $v \in V^{\varepsilon}$ for some v and $\Gamma \vdash^{\varepsilon} v : \tau$. In particular, if $\tau = \triangleright_{\alpha} \tau_0$, then v = ◮ α N and $\Gamma^{-\alpha} \vdash^{\varepsilon} N : \tau_0$.

T. TSUKADA AND A. IGARASHI
[Figure 5(1): Rules for ordinary evaluation.]
The only difference is the annotation on ◮((fix f.f )2), but ⊢ ε M 1 ⇓ 1 whereas there is no term N such that ⊢ ε M 2 ⇓ N . In other words, the evaluation of M 1 terminates but that of M 2 diverges. Therefore, we must record how transition variables are bound to transitions during evaluation.
From the implementation point of view, it is desirable that evaluation be as insensitive to the annotations as possible, to avoid overhead. In λ α [9], environment classifiers can be regarded as completely static citizens so that the evaluation does not require them, although the authors do not explicitly state it. The property that evaluation goes well even if we erase the annotations is called the erasure property. The previous example shows that the erasure property does not hold for MiniML ⊲. Since the argument B, especially its length, in an instantiation M B is significant at run time, we cannot erase transitions completely. So, we consider a slightly weaker notion of erasure, which removes transition variables only from ◮, ◭ and Λ and replaces the transition B in M B with its length. The goal of this section is to find a practically meaningful subset of MiniML ⊲ which enjoys the erasure property under the translation sketched above.
The reasons why the erasure property is broken are (1) Λ-bound transition variables are used "too far" from the binder, as is the case in M 2 and (2) the "depth" of quoting ◮ α can be changed by using instantiation with a transition, whose length is not 1. In the case of M 2 , there is an occurrence of transition variable α far from the binder and α is instantiated by ε, whose length is 0. So, to ensure the erasure property, it is enough to prevent both (1) and (2) from holding at once, in other words, to guarantee that Λ-bound transition variables occur near the binder or to restrict instantiations to only transitions of length 1.
Based on this observation, we will introduce two instantiation rules. The first rule is for instantiation of transition variables used only near the binder. We can change the depth of quoting by using this rule, but this rule can be applied only in limited situations. The second rule is for instantiation of transition variables by transitions whose lengths are 1. This rule can be applied to any ∀-type, but it cannot change the depth of quoting. We introduce a new term constructor M [α] for the second kind of instantiation to distinguish it from the first.
The first instantiation rule requires some control on the occurrences of transition variables. We enforce one additional restriction, which requires that transition variables be also staged like term variables. This restriction rejects a type with nested occurrences of ⊲ α , such as ∀α. ⊲ α ⊲ α τ , whose term would have a distant use of ◮ α . This restriction is closely related to the distinction between open and closed code types in λ i [16].
We define a new type system with staged transition variables. We need two changes to deal with the stages for transition variables. First, we introduce environments for transition variables. A transition environment is a set of the form $\{\alpha_1@A_1, \ldots, \alpha_n@A_n\}$, where the $\alpha_i$ are distinct transition variables. An intuitive meaning of α@A is that a valid occurrence of α is always of the form Aα. The second change is the annotation for the universal quantifier. The new syntax for universal quantification is ∀α@A.τ, where A is the (positive) path from the current stage to the stage in which α is usable.
Next, we define well-formed transitions, transition environments, types, and type environments to ensure every use of a transition variable is valid. We say a transition $A = \alpha_1 \cdots \alpha_n$ is well formed under a transition environment ∆ if, for any i ≤ n, $\alpha_i@\alpha_1 \cdots \alpha_{i-1} \in \Delta$. We say ∆ is well formed if, for any α@A ∈ ∆, A is well formed under ∆, i.e., all stages where transition variables are declared are well formed. This definition rules out circular declarations of transition variables, e.g., α@β, β@α. We write $\Delta \vdash_s A$ if A is well formed under ∆, and $\vdash_s \Delta$ if ∆ is a well-formed transition environment.
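The two well-formedness conditions just given can be checked mechanically. The following sketch assumes an encoding that is not from the paper: a transition is a tuple of variable names, and a transition environment ∆ is a dictionary mapping each transition variable to the stage at which it is declared (a dictionary suffices because the declared variables are distinct). The function names are hypothetical.

```python
from typing import Dict, Tuple

# Assumed encoding (not from the paper):
#   a transition A = alpha_1 ... alpha_n is a tuple of variable names;
#   a transition environment maps each variable alpha to the stage A
#   of its declaration alpha@A.
Transition = Tuple[str, ...]
TransitionEnv = Dict[str, Transition]


def transition_well_formed(a: Transition, delta: TransitionEnv) -> bool:
    """A is well formed under delta if, for every i,
    alpha_i@(alpha_1 ... alpha_{i-1}) is declared in delta."""
    return all(
        alpha in delta and delta[alpha] == a[:i]
        for i, alpha in enumerate(a)
    )


def env_well_formed(delta: TransitionEnv) -> bool:
    """delta is well formed if every declared stage is well formed under it."""
    return all(transition_well_formed(stage, delta) for stage in delta.values())


# alpha is declared at the initial stage, beta at stage alpha.
staged: TransitionEnv = {"alpha": (), "beta": ("alpha",)}

# The circular declarations alpha@beta, beta@alpha mentioned in the text.
circular: TransitionEnv = {"alpha": ("beta",), "beta": ("alpha",)}
```

Under `staged`, the transition `("alpha", "beta")` is well formed but `("beta",)` is not, since β may only occur immediately after α. The environment `circular` is rejected because neither declared stage is itself well formed.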
The judgment of the form $\Delta \vdash^A_s \tau$ means "type τ is well formed at stage A under ∆", and is defined by the rules in Figure 6. The base types int and bool are well formed at any well-formed stage. The rules for τ → σ and ⊲ α τ resemble the typing rules (Abs) and (◮), respectively. The type ∀α@B.τ, which binds a new transition variable α, is well formed at A under ∆ if τ is well formed under the transition environment extended by the new transition variable declaration α@AB. Finally, we define well-formedness of a type environment Γ under ∆, written $\Delta \vdash_s \Gamma$, by: Γ is well formed under ∆ if and only if ∆ is well formed and, for any x : τ@A ∈ Γ, τ is well formed at A under ∆ (i.e., $\Delta \vdash^A_s \tau$).
Figure 7 shows the typing rules that differ from the previous type system (except for the addition of ∆). They have additional premises about well-formedness. The rule (S-Var) requires the well-formedness of the environment Γ, x : τ@A, which in turn requires well-formedness of the type τ at A and of the transition environment ∆. The rules (S-Num) and (S-Bool) require the well-formedness of the environment Γ and the stage A, which ensures the well-formedness of the base types. The typing rule (S-Gen) records the path from the current stage to the stage in which α is usable. This information is used by the rules (S-Ins1) and (S-Ins2). As mentioned above, there are two kinds of transition instantiation rules and corresponding term constructors. The first one (S-Ins1) is computationally meaningful, in other words, it may change the depth of quoting, but it can be used only in limited situations. The second one (S-Ins2) does not change the depth of quoting, so it is computationally meaningless and can be used at any time. Here, substitution for a transition variable α in ∀α@A.τ (among other types) is defined as follows: It is easy to see that $\Gamma; \Delta \vdash^A_s M : \tau$ implies $\Gamma \vdash^A M : \tau$.
Now, we define the syntax of erased terms (that is, terms after erasure), the erasure translation from λ ⊲ terms to erased terms, and the big-step semantics for erased terms. The syntax of erased terms, ranged over by M ♭, is as follows:

Erased Terms
Figure 7: The typing rules which differ from the previous type system.
The erasing function ♭(·) from terms to erased terms is defined as follows: The erasure semantics is essentially the same as the ordinary evaluation semantics in Section 4.2, except for two differences: one is that the stage A of $\vdash^A M \Downarrow N$ is replaced by the natural number n, which is the length of A; the other is the rule for M n at stage 0. In this case, M must evaluate to Λ ◮ M′, and the ◮ at its head is replicated n times. We show only the main rules below.
Finally, we state the erasure property: the erasure semantics is equivalent to the semantics with transition variables for terms typed under the new type system.
We believe that the calculus with this new type system does not lose much expressiveness for practical use; in fact, the example of power functions in Section 3.6 can be typed with the new type system.

Kripke Semantics for λ ⊲ and Logical Completeness
In this section, we formalize the Kripke semantics discussed in Section 2 and prove completeness of a classical version of the proof system obtained from λ ⊲ to justify the informal interpretation of types in λ ⊲ as formulae of a logic. We augment the set of propositions (namely types) with falsity and the proof rules with double negation elimination. It is left for future work to study the semantics of the intuitionistic version, of which recent work on Kripke semantics for intuitionistic LTL [18,19] can be a basis.
First, we (re)define the set of propositions and the natural deduction proof system. Then, we proceed to the formal definition of the Kripke semantics and prove soundness and completeness of the proof system. Finally, we examine another rule for the double negation elimination.
5.1. Natural Deduction. The set Φ ⊥ , ranged over by φ and ψ, of propositions is given by the grammar for Φ extended with a new constant ⊥.
The natural deduction system can be obtained by forgetting variables and terms in the typing rules. We add the following new rule, which is the ordinary double negation elimination rule, adapted for this setting:

5.2. Kripke Semantics and Completeness. As mentioned in Section 2, the Kripke semantics for this logic is based on a functional transition system $\mathcal{T} = (S, L, \{\xrightarrow{a} \mid a \in L\})$, where S is the (non-empty) countable set of states, L is the countable set of labels, and $\xrightarrow{a} \in S \to S$ for each label a ∈ L. We write $s \xrightarrow{a_1 \cdots a_n} s'$ when $s'$ is reached from s by following $\xrightarrow{a_1}, \ldots, \xrightarrow{a_n}$ in this order. Actually, given $s, a_1, \ldots, a_n$, such $s'$ always exists in this setting because $\xrightarrow{a_i}$ is a total function for all 1 ≤ i ≤ n.
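To illustrate why the target state always exists when every labeled transition is a total function, here is a minimal sketch. The encoding is an assumption made for this example, not the paper's: each label maps to a dictionary from states to states, and the hypothetical helpers `is_functional` and `follow` check totality and follow a finite sequence of labels.

```python
from typing import Dict, Hashable, Sequence

State = Hashable
Label = str
# Assumed encoding: transitions[a][s] is the unique a-successor of state s.
Transitions = Dict[Label, Dict[State, State]]


def is_functional(states: set, transitions: Transitions) -> bool:
    """Every label denotes a total function from states to states."""
    return all(
        set(step.keys()) == states and set(step.values()) <= states
        for step in transitions.values()
    )


def follow(transitions: Transitions, s: State, labels: Sequence[Label]) -> State:
    """Return the state reached from s along a_1 ... a_n.

    For a functional system this never fails; the empty sequence
    returns s itself.
    """
    for a in labels:
        s = transitions[a][s]
    return s


# A two-state example: 'a' swaps the states, 'b' sends both to state 1.
states = {0, 1}
transitions: Transitions = {
    "a": {0: 1, 1: 0},
    "b": {0: 1, 1: 1},
}
```

A valuation ρ for transition variables would assign each variable such a label sequence, so following ρ(α) from the current state picks out the single state at which the body of a formula under ⊲ α is interpreted. With partial transition functions, `follow` could raise `KeyError`, which is the situation considered in the alternative semantics.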
To interpret a proposition, we need two valuations, one for propositional variables and the other for transition variables. The former is a total function v ∈ S × PV → {0, 1}; the latter is a total function ρ ∈ Σ → L*, where L* is the set of all finite sequences of labels. Then, we define the satisfaction relation T, v, ρ; s ⊨ φ, where s ∈ S is a state, as follows: The satisfaction relation is extended pointwise to contexts Γ (possibly infinite sets of pairs of a proposition and a transition⁴) by: The local consequence relation Γ ⊨ φ is defined by: Γ ⊨ φ iff T, v, ρ; s ⊨ Γ implies T, v, ρ; s ⊨ φ for any T, v, ρ, s.
Then, the natural deduction proof system is sound and complete with respect to the local consequence relation. The proof is similar to the one for first-order predicate logic: we use the standard techniques of Skolemization and Herbrand structure.
Proof. By induction on the derivation of Γ ⊢ ε φ.
Proof. We give a proof sketch; more detailed proofs are found in Appendix D.
We assume $\Gamma \nvdash^{\varepsilon} \varphi$. We construct a transition system T, two valuations v and ρ, and a state s such that T, v, ρ; s ⊨ Γ and T, v, ρ; s ⊭ φ. The construction is similar to the construction of counter models in first-order predicate logic.
First, we prove the following proposition.

5.3. Alternative Semantics. We can give an alternative deduction rule for double negation elimination.
The difference is in the stage of the premise. This rule requires that the stage of ⊥ be equal to the stage of φ → ⊥, whereas in the rule in Section 5.1 the stage of ⊥ is arbitrary. This restriction makes the proof system weaker: for example, in this setting ⊲ α is not self-dual, that is, ¬ ⊲ α ¬φ ↔ ⊲ α φ is not provable (here ¬φ is an abbreviation of φ → ⊥), while under the previous rules ⊲ α is self-dual. The difference is equivalent to the axiom ⊲ A ⊥ ↔ ⊲ B ⊥ (or a weaker form ∀α.(⊲ α ⊥ → ⊥)) in the sense that this system with this axiom is equivalent to the previous proof system. This axiom corresponds to the axiom ¬○A ↔ ○¬A in linear-time temporal logic, due to Stirling [13].
exists a finite context Γ′ ⊆ Γ such that $\Gamma' \vdash^A \varphi$. The following argument holds if we restrict Γ to be finite because the logic is compact, i.e., Γ is unsatisfiable if and only if there exists a finite subset Γ′ ⊆ Γ such that Γ′ is also unsatisfiable. For more detail, see Appendix D.
There is a corresponding semantics, with respect to which the new proof system is sound and complete. In fact, this is achieved by a minor change: we allow transition functions $\xrightarrow{a}$ to be partial. As a result, there may be no s′ such that $s \xrightarrow{\rho(\alpha)} s'$. In this setting, the semantics for ⊲ α φ has to be modified. We define s

Comparing with other multi-stage calculi
In this section, we compare λ ⊲ with closely related calculi: λ ○ [7], the Kripke-style modal λ-calculus [8], λ α [9], and λ i [16]. The first two calculi are based on the Curry-Howard correspondence between multi-stage calculi and modal logics, and our work can be considered a generalization of them. In fact, there are embeddings from these two calculi to λ ⊲. The calculi λ α and λ i are multi-stage calculi with environment classifiers. We discuss several differences among these two calculi and λ ⊲. Although it does not seem possible to give (straightforward) embeddings from them, due to the presence of CSP, we will show that an embedding from the CSP-free fragment of λ i to λ ⊲ is possible.
6.1. Comparing with λ ○. λ ○ [7] is a multi-stage calculus corresponding to linear-time temporal logic (LTL). As already mentioned in Section 2, λ ○ is obtained by using only one transition variable, say α. Then the modal operator ○, meaning "next" in LTL, corresponds to the modal operator ⊲ α, the stage n in LTL corresponds to $\alpha^n$, and so on. We define an embedding [[·]] from λ ○ into λ ⊲ in Figure 8. This embedding is essentially the same as that from λ ○ into λ α, given by Taha and Nielsen [9].
The following two theorems show the correctness of the embedding.  Proof. By induction on the type derivation. Proof. By induction on the structure of M .
Moreover, by giving a suitable definition of reduction annotated with time for λ ○, we can easily show that the embedding preserves the stage of reduction.
We can construct a reverse mapping, a type- and reduction-preserving embedding from the quantifier-free fragment of λ ⊲ to λ ○, by simply forgetting annotations of transition variables. Moreover, the quantifier-free fragment of λ ⊲ with only one transition variable is isomorphic to λ ○ in the sense that there is a bijection that preserves typability and reduction.

6.2. Comparing with Calculus based on S4. Davies and Pfenning [8,12] develop calculi that correspond to intuitionistic modal logic S4 (only with □). The type □τ represents closed code values, which thus can be run or embedded in code of any later stages, as is possible in λ ⊲. We compare λ ⊲ with one of their calculi (what they call the Kripke-style modal λ-calculus in Section 4 of [8]), in which there are box and unbox n for quoting and unquoting, respectively (see Davies and Pfenning [8] for details). An embedding [[·]] from this calculus into λ ⊲ is given in Figure 9.
The following theorems state the correctness of the embedding. Proof. An easy induction on the type derivation Γ 0 ; . . . ; Γ n ⊢ M : τ . Proof. By induction on the structure of M .
Taha and Nielsen have shown a similar embedding from the Kripke-style calculus into λ α. In their embedding, the translation of unbox, which corresponds to the elimination rule for □, is slightly more involved than ours, since they use run and the CSP operator. In our embedding, on the other hand, unbox is expressed uniformly by M B, which corresponds to the elimination rule for ∀.

6.3. Comparing with λ α and λ i. Comparing λ ⊲ with λ α [9], we can point out two differences: the meaning of run and the absence of a CSP primitive.
In λ α, run is a primitive, while, in λ ⊲, run M is defined as syntactic sugar for M ε. The following rules are typing rules for run in λ α and λ ⊲, respectively.
Aside from the presence of a binder (α), which is not essential, an important difference is how code type constructors are removed in the conclusion. In λ α, run removes only the outermost bracket annotated with α, while, in λ ⊲, M ε removes all code-type constructors ⊲ α in τ. This difference in typing rules is also reflected in reductions.
(Here we assume that v does not contain the CSP constructor to simplify the argument; see [9] for the complete definition.) So, it does not seem very easy to give an embedding in either direction. From a practical point of view, we do not think this difference is very significant. It is not clear when one wants to share one environment classifier or transition variable among different stages. If the use of classifiers or transition variables is staged, as we discussed in Section 4.3, then the difference is very small. In fact, the current implementation of MetaOCaml is based on λ i, which can be considered a subset of both λ α and λ ⊲ (if CSP is ignored), and this fact suggests that the difference is practically insignificant.
As another difference, λ α allows the CSP constructor % to be applied to any term to embed the value of the term inside a quotation. It is easy to see that $\varphi@\varepsilon \vdash^{\alpha} \varphi$ is not provable in general and so such a universal CSP operator would not be expressible in λ ⊲. However, we can support CSP for many types. CSP for values of base types such as integers and Booleans is easy. CSP for function closures is also possible if they do not contain open code in their bodies or environments. We can deal with CSP for closed code, i.e., the terms which have types of the form ∀α. ⊲ α τ with α ∉ FTV(τ), as syntactic sugar in λ ⊲. The following rules are for CSP in λ α and for CSP as syntactic sugar in λ ⊲.
Figure 10: Embedding from λ i without CSP primitive to λ ⊲. Precisely speaking, in order to recover τ and α, which appear only on the right hand side, this embedding takes type derivations of λ i rather than terms as an input.
The only problematic case is CSP for open code and, as mentioned above, functions containing open code, but we think it is rarely needed. λ i [16] is developed as a subset of λ α where type inference is possible (although there are slight differences in syntax and typing). The difference between λ ⊲ and λ i is smaller than that between λ ⊲ and λ α. In fact, we can construct an embedding that preserves typing from λ i (without the CSP operator) into λ ⊲, by observing that the executable code type τ in λ i corresponds to ∀α. ⊲ α τ (where α ∉ FTV(τ)) in λ ⊲. Figure 10 shows the complete description of the embedding. Precisely speaking, it takes a type derivation rather than terms: For example, the rule for open means Proof. Easy induction on the type derivation.

Preservation of the semantics is hard to discuss precisely. First of all, the semantics of λ i is not given in [16], in spite of the subtle syntactic differences between λ α and λ i. However, as far as we can tell, the semantics of λ i seems very close to the erasure semantics in Section 4.3, and so we expect the embedding to preserve the semantics.

Related Work
Multi-Stage Calculi Based on Modal Logics and Their Extensions. Our work can be considered a generalization of the previous work on the Curry-Howard isomorphism between multi-stage calculi and modal logics [7,8,10]. As we have seen in Section 6, there are embeddings from λ ○ and λ □ to λ ⊲.
The restriction of λ □ that all code be closed precludes the definition of a code generator like power ∀, which generates both efficient and runnable code. Nanevski and Pfenning [21] have extended λ □ with the notion of names, similar to the symbols in Lisp, and remedied the defect of λ □ by allowing newly generated names (not variables) to appear in closed code.
Taha and Sheard [5] added run and CSP to λ ○ and developed MetaML, but its type system was not strong enough: run may fail at run time. Then, Moggi, Taha, Benaissa, and Sheard [14] developed the calculus AIM ("An Idealized MetaML"), in which there are types for both open and closed code; it was then simplified to λ BN, which replaced closed code types with closedness types for closed terms that are not necessarily code. Both calculi are based on categorical models and have sound type systems. The notion of α-closedness in λ α can be considered a generalization of λ BN's closed types. In fact, the typing rule for run in λ BN is similar to the one in λ α. Although some of these calculi have sound type systems, it is hard to regard them as logics, mainly due to the presence of CSP, which delays the stage of the type judgment to any later stage, and the typing rule for run (as discussed in Section 6.3).
More recently, Yuse and Igarashi have proposed the calculus λ ○□ [10] by combining λ ○ and λ □, while maintaining the Curry-Howard isomorphism. The main idea was to consider LTL with modalities "always" (□) and "next" (○), which represent closed and open code types, respectively. It is similar to AIM in this respect. Although λ ○□ is based on logic, it cannot be embedded into λ ⊲ simply by combining the two embeddings above. In fact, in λ ○□, both directions of □○τ ↔ ○□τ are provable, whereas neither direction of (∀α. ⊲ α ⊲ β τ) ↔ ⊲ β ∀α. ⊲ α τ is provable in λ ⊲. However, in λ ○□ it seems impossible to program a code generator like power ∀, which generates specialized code used at any stage; the best possible one presented can generate specialized code used only at any later stage, so running the specialized code is not possible.
It is considered not easy to develop a sound type system for staging constructs with side effects. Calcagno, Moggi, and Sheard developed a sound type system for a multi-stage calculus with references using closed types [22]. It is interesting to study whether their closedness condition can be relaxed by using α-closedness.
Other Multi-Stage Calculi. Kim [24]. Viera and Pardo [25] have proposed a multi-stage language with intensional code analysis, that is, pattern matching on code. The language requires typechecking at run-time.
More recently, Kameyama, Kiselyov, and Shan [26] have developed an extension of (a two-level version of) λ ○ with control operators shift/reset [27], which enable an interesting pattern of code generation such as let-insertion [28] in the direct style. It will be interesting to investigate how this calculus extends to dynamic code execution (i.e., run).
Modal Logics. As we discussed above, the □-fragment of modal logic and the ○-fragment of LTL can be embedded into our logic, whereas the ○□-fragment of LTL and our logic are incomparable.
Our logic has three characteristic features: (1) it is multi-modal, (2) it has universal quantification over modalities and (3) modal operators are "relative", meaning their semantics depends on the possible world at which they are interpreted. Most other logics do not have all of these features.
Dynamic logic [29] is a multi-modal logic for reasoning about programs. Its modal operators are [α] for each program α, and [α]φ means "when α halts, φ must hold after execution of α from the current state". Dynamic logic is multi-modal and its modal operators are "relative", but it does not have quantification over programs. Therefore, there is no formula in Dynamic logic which would correspond to ∀α. ⊲ α ⊲ α φ. There is, however, a formula which is expressible in Dynamic logic but not in our logic: e.g., the Dynamic logic formula [α*]φ, which means intuitively φ ∧ [α]φ ∧ [α][α]φ ∧ . . . , cannot be expressed in our logic.
Hybrid logic [30] is a modal logic with a new kind of atomic formula called nominals, each of which must be true at exactly one state in any model (therefore, a nominal names a state). For each nominal i, @ i is a modal operator and @ i φ means "φ holds at the state denoted by i". Hybrid logic has a universal quantifier over nominals (and another binder ↓: ↓ x.φ means "let x stand for the nominal for the current world, then φ holds"). Hybrid logic differs from our logic in that modal operators @ i indicate worlds directly, and hence are not "relative". In Hybrid logic, @ i @ j φ ↔ @ j φ, but ⊲ α ⊲ β φ and ⊲ β φ are not equivalent in our logic.

Conclusion and Future Work
We have studied a logical aspect of environment classifiers by developing a simply typed multi-stage calculus λ ⊲ with environment classifiers. This calculus corresponds to a multi-modal logic with quantification over transitions by the Curry-Howard isomorphism and satisfies time-ordered normalization as well as basic properties such as subject reduction, confluence, and strong normalization. The classical proof system is sound and complete with respect to the Kripke semantics. Our calculus simplifies the previous calculus λ α of environment classifiers by reducing run and some uses of CSP to an extension of another construct. We believe our work helps clarify the semantics of environment classifiers.
We have also studied evaluation of (a slight extension of) λ ⊲ and shown staged execution of a program is possible. Also, it is shown that erasure execution is possible for a certain subset, which is close to λ i , the basis of MetaOCaml.
From a theoretical perspective, it is interesting to study the semantics of the intuitionistic version of the logic, as mentioned earlier, and also the calculus corresponding to the classical version of the logic. It is known that the naive combination of staging constructs and control operators is problematic since bound variables in quotation may escape from its scope by a control operator. We expect that a logical analysis, like the one presented here and Reed and Pfenning [31], will help analyze the problem.
From a practical perspective, one feature missing from λ ⊲ is CSP for all types. As argued in the introduction, we think typical use of CSP is rather limited and so easy to support. Type inference for full λ ⊲ would not be possible for the same reason as λ α [16]. However, it would be easy to apply type inference for λ i let [16] to a similar subset of λ ⊲.  Figure 11.
We first prove that this proof system characterizes T-normal forms as in Lemma A.2, which is obtained as a special case of the lemma below. In what follows, FTV(T) denotes the set of transition variables in the path T.
• Case M = x: ∆ ⊢⇓ T M trivially holds.
for any N ′ and U . Assume  Other cases are similar. The proof of (2) ⇒ (1) is by easy induction on the structure of the derivation ∆ ⊢⇓ T M . We show the case M = ◭ α M ′ as a representative case.
There are two cases. Proof. Immediate from Lemma A.1. Now, we prove that reduction preserves T -normality and T -neutrality, as in Lemma A.5, from which Theorem 3.10 immediately follows. Before that, we prove that (term and transition) substitution preserves T -normality and T -neutrality under a certain condition. Proof. We first show (2) by case analysis on the last rule used to derive ∆, α ⊢ ▽ T M . • The other case is easy.
Then, (1) is proved by induction on the derivation of ∆, α ⊢⇓ T M . We show only the main cases.
The proof of the remaining part is similar to other cases.
Proof. We prove this lemma by contraposition.
Proof. Let ∀ n 0ᾱ 0 ∃β 1 . . . ∀ nmᾱ m ψ@A ∈ Γ. We can get the following proposition by easy induction on the number of existential quantifiers.