On Nested Sequents for Constructive Modal Logics

We present deductive systems for various modal logics that can be obtained from the constructive variant of the normal modal logic CK by adding combinations of the axioms d, t, b, 4, and 5.


Introduction
The modal logic K is obtained from classical propositional logic by incorporating two unary operators, or modalities, □ and ♦, and adding the k-axiom, □(A ⊃ B) ⊃ (□A ⊃ □B), to dictate the interaction between the modalities and propositional connectives. The behavior of the ♦ modality is then determined by enforcing that it is the De Morgan dual of □. Along with this axiom there is the necessitation rule, saying that if A is a theorem of K then so is □A. Informally, □ is often interpreted as "necessarily" and ♦ as "possibly". Notice that interaction with other propositional connectives is determined by the adequacy of {⊃, ⊥} in classical logic.
In the intuitionistic setting, however, one must define the behavior of □ and ♦ independently, in the absence of De Morgan duality. Consequently, it is not enough to just add the standard k-axiom, which makes no mention of the ♦-modality, and so some classical consequences of k must be added to formulate an intuitionistic version of K. To this end there seems to be no canonical choice, and many different intuitionistic versions of K have been proposed, e.g., [Fit48, Pra65, Ser84, PS86, Sim94, BdP00, PD01] (for a survey see [Sim94]). However, in the current literature, two variants prevail. The first, known as intuitionistic K, adds the following five axioms, along with the necessitation rule, to intuitionistic propositional logic:
k1: □(A ⊃ B) ⊃ (□A ⊃ □B)
k2: □(A ⊃ B) ⊃ (♦A ⊃ ♦B)
k3: ♦(A ∨ B) ⊃ (♦A ∨ ♦B)
k4: (♦A ⊃ □B) ⊃ □(A ⊃ B)
k5: ♦⊥ ⊃ ⊥        (1.1)
It was originally proposed in [Ser84, PS86] and studied in detail in [Sim94]; more recent work can be found in [GS10, Str13, MS14a].
The second variant, known as constructive K, includes only k1 and k2, not k3, k4, k5. This choice of axioms dates back to [Pra65], and its proof theory was investigated, for example, in [BdP00, HP07, MS11], while its semantics, and that of some extensions, was studied in [FM97] and [Koj12].
To gain intuition about the difference between the two variants, let us have a look at their standard Kripke semantics. A model of intuitionistic modal logic is described by a 4-tuple (W, ≤, R, I) with
• a non-empty set of possible worlds W, preordered by ≤;
• an accessibility relation R ⊆ W × W satisfying:
  (i) For any w, v, v′ ∈ W, if wRv and v ≤ v′, there exists a w′ ∈ W such that w ≤ w′ and w′Rv′.
  (ii) For any w, w′, v ∈ W, if w ≤ w′ and wRv, there exists a v′ ∈ W such that w′Rv′ and v ≤ v′.
• a function I : W → 2^A, where A = {a, b, c, ...} denotes the set of propositional letters, such that for any w, w′ ∈ W, if w ≤ w′ then I(w) ⊆ I(w′).
Note that (i) and (ii) ensure a form of monotonicity of R over ≤. In contrast, a model of constructive modal logic decouples the accessibility relation R from ≤. It assumes a set of 'fallible' worlds ⊥ as a subset of W, such that ⊥ is closed under ≤ and R, i.e., whenever w ∈ ⊥ and wRw1 or w ≤ w1 we also have w1 ∈ ⊥. This is a much weaker condition on R than (i) and (ii). The definition of the forcing relation |= also shows subtle differences. For the atoms, the binary connectives and the □-modality, the intuitionistic and constructive definitions coincide:
• w |= a iff a ∈ I(w).
• w |= □A iff ∀w′, v′ ∈ W. w ≤ w′ and w′Rv′ imply v′ |= A.
In the intuitionistic case, forcing for ♦ and ⊥ is defined as follows:
• w |= ♦A iff ∃v ∈ W. wRv and v |= A.
[Figure 1: The constructive "modal cube".]
Notice that the countermodels for k3 and k4 could not exist in the presence of (i) and (ii) above, and the countermodel for k5 relies on the availability of the set ⊥ of fallible worlds.
We refer the reader to [MS11] for a more thorough semantic analysis of the differences between intuitionistic K and constructive K.
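The intuitionistic semantics described above can be made concrete on finite models. The following is a minimal sketch, not part of the original development: the tuple encoding of formulas, the class name `Model` and its method names are all hypothetical, and the clauses for ∧, ∨ and ⊃ are assumed to be the usual intuitionistic Kripke clauses, which the text mentions but does not spell out. Only the intuitionistic forcing relation is modelled; fallible worlds are left out.

```python
from itertools import product

# Formulas are nested tuples (an encoding chosen for this sketch only):
# ("atom", "a"), ("bot",), ("and", A, B), ("or", A, B),
# ("imp", A, B), ("box", A), ("dia", A).

class Model:
    def __init__(self, worlds, leq, rel, val):
        self.W = set(worlds)
        self.R = set(rel)
        self.I = {w: set(val.get(w, ())) for w in self.W}
        # Close the given pairs reflexively and transitively, so <= is a preorder.
        self.leq = set(leq) | {(w, w) for w in self.W}
        changed = True
        while changed:
            changed = False
            for (a, b), (c, d) in product(list(self.leq), repeat=2):
                if b == c and (a, d) not in self.leq:
                    self.leq.add((a, d))
                    changed = True

    def up(self, w):
        return {v for v in self.W if (w, v) in self.leq}

    def succ(self, w):
        return {v for v in self.W if (w, v) in self.R}

    def condition_i(self):
        # wRv and v <= v' imply some w' with w <= w' and w'Rv'.
        return all(any((w2, v2) in self.R for w2 in self.up(w))
                   for (w, v) in self.R for v2 in self.up(v))

    def condition_ii(self):
        # w <= w' and wRv imply some v' with w'Rv' and v <= v'.
        return all(any((v, v2) in self.leq for v2 in self.succ(w2))
                   for (w, v) in self.R for w2 in self.up(w))

    def monotone(self):
        return all(self.I[w] <= self.I[v] for (w, v) in self.leq)

    def forces(self, w, f):
        tag = f[0]
        if tag == "atom":
            return f[1] in self.I[w]
        if tag == "bot":
            return False
        if tag == "and":
            return self.forces(w, f[1]) and self.forces(w, f[2])
        if tag == "or":
            return self.forces(w, f[1]) or self.forces(w, f[2])
        if tag == "imp":  # assumed standard intuitionistic clause
            return all(not self.forces(v, f[1]) or self.forces(v, f[2])
                       for v in self.up(w))
        if tag == "box":
            return all(self.forces(v, f[1])
                       for w2 in self.up(w) for v in self.succ(w2))
        if tag == "dia":
            return any(self.forces(v, f[1]) for v in self.succ(w))
        raise ValueError(f"unknown connective: {tag}")

# A structure violating (ii): 0 <= 1 and 0 R 2, but world 1 has no R-successor.
bad = Model({0, 1, 2}, [(0, 1)], [(0, 2)], {2: {"a"}})
dia_a = ("dia", ("atom", "a"))
print(bad.condition_i(), bad.condition_ii())
print(bad.forces(0, dia_a), bad.forces(1, dia_a))
```

In the structure `bad`, the formula ♦a is forced at world 0 but not at the ≤-larger world 1, which illustrates why some condition such as (ii) is needed for monotonicity when ♦ is interpreted by the clause above.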
This work is concerned with the proof theory of constructive K, denoted CK, and its various extensions with other common modal axioms. As for the classical and intuitionistic variants, we consider the five axioms below:
d: □A ⊃ ♦A
t: (A ⊃ ♦A) ∧ (□A ⊃ A)
b: (A ⊃ □♦A) ∧ (♦□A ⊃ A)
4: (♦♦A ⊃ ♦A) ∧ (□A ⊃ □□A)
5: (♦A ⊃ □♦A) ∧ (♦□A ⊃ □A)        (1.2)
A priori, this gives us 32 different logics, but as in classical (or intuitionistic) modal logic some of them coincide, so that we obtain only 15 distinct logics. These are depicted in Figure 1, where we use the same names as those standard in the classical setting [Gar08], prefixed by 'C'. While the proof theory of the intuitionistic version of this cube has been well-studied in labeled systems [Sim94] and non-labeled systems [GS10, Str13, MS14a], there is surprisingly little work on the constructive "modal cube". In fact, to our knowledge, only the logics CK, CT, CK4, and CS4 have received proof theoretic treatment so far, e.g., in [BdP00, HP07, MS11].
In this work we attempt to give a unified cut-elimination procedure for all logics in Figure 1, using the framework of nested sequents [Kas94, GPT09, Brü09, Str13, Fit14], a generalization of Gentzen's sequent calculus which allows sequents to occur within sequents. This approach has previously been successful for the classical modal cube in [Brü09] and the intuitionistic modal cube in [Str13] but, perhaps surprisingly, the step from intuitionistic to constructive appears more involved than the one from classical to intuitionistic. This is also the reason why, in this paper, we consider only the logics in the 'cube'. We would like to compare the intuitionistic and constructive cases from the point of view of cut-elimination. Whenever possible, we aim to point out the differences to the arguments for intuitionistic systems presented in [Str13].
While the cut-elimination proofs in [Brü09] and [Str13] are markedly similar, we seem to require a different method in the constructive setting.The reasons for this are that certain formulations of some logical rules are no longer sound, and that we need an explicit contraction rule, along with other structural rules that further complicate the process of cut-elimination.
Nonetheless we manage to obtain cut-elimination for the logics CK, CK4, CK45, CD, CD4, CD45, CT, CS4, and CS5, and we conjecture that our systems admit cut for all logics in the cube.
We are not aware of a similar uniform treatment of constructive modal logics within other formalisms. However, in hindsight it is straightforward to translate our results into prefixed tableaux, using [Fit12], or into a tree-labeled sequent calculus.
We point out the interesting observation that the b-axiom entails k3 and k5. While this is likely already known to many in the community, we could not find this result stated in the literature, and so it is pertinent to raise it here. This arguably questions the "constructiveness" of logics including b, and so the inclusion of such logics in the cube itself, but such considerations are beyond the scope of this work.³
Several attempts to deal with the proof theory of constructive modal logic have appeared previously. However, the fundamental data structures of such calculi all seem to be special cases of nested sequents. For example, the 2-sequents of [Mas92, Mas93] are a form of nested sequent where no tree-branching is allowed. It is not clear how the 2-sequent approach, while successful for deontic logic, could be adapted for the various constructive logics, or even CK, as pointed out by Wansing in [Wan94]. The sequents of [MS11, MS14b] can also be seen as a special case of nested sequents, again with no tree-branching allowed, but constituting a richer data structure than 2-sequents because of the inclusion of a 'focus'.
Regarding applications of the family of the constructive modal logics, the extended Curry-Howard correspondence (which, for modal logics, is a relatively recent investigation) has been studied for CS4 [AMdPR01, MS14b]. The constructive □ operator here captures staged computation [DP96, aBMTS99], and such logics are also used for the study of contexts [MdP05, MS14b]. We also point out that there are many logics of interest that are proper extensions of CK but not of intuitionistic K, e.g., CS4 and PLL; a more detailed discussion of such logics can be found in [FM97].

Preliminaries on Nested Sequents
In order to present a nested sequent system for CK, we first need to define the notion of a nested sequent structure. For this, we recall the basic notions from [Str13], with slight modifications in notation, tailored to the current setting. Let a, b, c, ... denote propositional variables and define formulas A, B, C, ... of constructive modal logic by the following grammar:
A ::= a | ⊥ | (A ∧ A) | (A ∨ A) | (A ⊃ A) | □A | ♦A
As shorthand we write ⊤ for ⊥ ⊃ ⊥ and we omit parentheses whenever it is not ambiguous.
A (nested) sequent is a tree whose nodes are multisets of formulas tagged with a polarity. There are two polarities, input (intuitively as if on the left of the turnstile in the conventional sequent calculus), denoted by a • superscript, and output (intuitively as if on the right of the turnstile in the conventional sequent calculus), denoted by a ◦ superscript. Formally we define LHS sequents, denoted Φ, and RHS sequents, denoted Ψ, as follows, and a full sequent is a structure of the form Φ, Ψ. We assume that associativity and commutativity of the comma ',' is implicit in our systems, and that ∅ acts as its unit. This definition entails that exactly one formula in a full sequent has output polarity, and all others have input polarity. We use capital Greek letters Γ, ∆, Σ, ... to denote arbitrary sequents, LHS, RHS or full, and may decorate them with a • or ◦ superscript to indicate that they are LHS or RHS, respectively.
³ One might argue that this observation is the reason behind Prawitz' statement [Pra65] on S5 being inherently non-constructive. However, Prawitz does not explicitly mention k3 and k5.
The corresponding formula of a sequent is defined inductively as follows.
A context, denoted by Γ{ }, is a sequent with a hole { } taking the place of a subsequent (or, equivalently, a formula); Γ{∆} is the sequent obtained from Γ{ } by replacing the occurrence of { } by ∆. Note that, for this to form a full sequent, Γ{ } and ∆ must have the correct format. We distinguish two kinds of contexts: an output context is one that results in a full sequent when its hole is filled with a RHS sequent, and an input context analogously for a LHS sequent. This is clarified by the following example, taken from [Str13].
Then Γ1{∆2} and Γ2{∆1} are not well-formed full sequents, because the former would contain no output formula, and the latter would contain two. However, we can form the full sequents, whose corresponding formulas, respectively, are:
for some n ≥ 0. Filling the hole of an output context with a RHS or full sequent yields a full sequent, and filling it with a LHS sequent yields a LHS sequent. Every input context Γ{ } is of the shape (2.3), where Γ′{ } and Λ{ } are output contexts (i.e., are of the shape (2.2) above). Note that Γ′{ }, Λ{ } and Π are uniquely determined by the position of the hole { } in Γ{ }.
We can choose to fill the hole of a context Γ{ } with nothing, denoted by Γ{∅}, which means we simply remove the occurrence of { }.
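The notions of this section can be illustrated with a small data structure. The following is a minimal sketch under stated assumptions: formulas are plain strings, the names `Sequent`, `outputs`, `fill` and `hole_depth` are hypothetical, and "full sequent" is tested simply by counting output formulas, following the remark above that a full sequent contains exactly one of them. The example contexts are made up for illustration and are not the ones from [Str13].

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

IN, OUT, HOLE = "in", "out", "hole"  # input (black), output (white), context hole

@dataclass
class Sequent:
    # Formulas at this node, each tagged with a polarity (or the hole marker).
    items: List[Tuple[str, str]] = field(default_factory=list)
    # Bracketed subsequents, i.e. the children of this node in the tree.
    boxes: List["Sequent"] = field(default_factory=list)

def outputs(s: Sequent) -> int:
    """Number of output-polarity formulas anywhere in the tree."""
    return sum(p == OUT for _, p in s.items) + sum(outputs(b) for b in s.boxes)

def is_full(s: Sequent) -> bool:
    return outputs(s) == 1

def is_lhs(s: Sequent) -> bool:
    return outputs(s) == 0

def hole_depth(s: Sequent, depth: int = 0) -> Optional[int]:
    """Number of bracket pairs around the hole, or None if there is no hole."""
    if any(p == HOLE for _, p in s.items):
        return depth
    for b in s.boxes:
        d = hole_depth(b, depth + 1)
        if d is not None:
            return d
    return None

def fill(ctx: Sequent, delta: Sequent) -> Sequent:
    """Replace the hole of ctx by delta; an empty delta just removes the hole."""
    items = [it for it in ctx.items if it[1] != HOLE]
    boxes = [fill(b, delta) for b in ctx.boxes]
    if len(items) != len(ctx.items):  # the hole sits at this node
        items = items + list(delta.items)
        boxes = boxes + list(delta.boxes)
    return Sequent(items, boxes)

# An output context (no output formula yet) and an input context (one already).
gamma1 = Sequent([("a", IN)], [Sequent([("b", IN), ("{}", HOLE)])])
gamma2 = Sequent([("a", IN)], [Sequent([("{}", HOLE)], [Sequent([("c", OUT)])])])
delta_rhs = Sequent([("d", OUT)])
delta_lhs = Sequent([("d", IN)])

print(is_full(fill(gamma1, delta_rhs)), is_full(fill(gamma2, delta_lhs)))
print(outputs(fill(gamma1, delta_lhs)), outputs(fill(gamma2, delta_rhs)))
```

Filling `gamma1` with a LHS sequent leaves no output formula, and filling `gamma2` with a RHS sequent produces two, so neither result is a full sequent; this mirrors the ill-formed combinations discussed in the example above.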

Nested Sequent Systems for CK and its Variants
We use the standard notions of inference rule and derivation (or proof) from usual sequent calculi; all that changes is the notion of sequent, as introduced in the previous section. We insist that every sequent in a derivation is a full sequent. A proof of a formula A is then a derivation whose conclusion is the (full) sequent A◦. We also use the standard notions of admissibility and derivability of inference rules (see, e.g., [Bus98] or [TS00]).
Let us now consider the set of inference rules shown in Figure 2, which we call the system NCK for CK. These rules are similar to the corresponding rules for intuitionistic modal logic in [Str13] and classical modal logic in [Brü09], although there are some subtle yet crucial differences:
• In [Str13] and [Brü09] additive versions of ⊃• and □• were given rather than incorporating an explicit contraction rule in the system. While these were essentially design choices in the previous works, here it is necessary to make contraction explicit since our treatment of the b-axiom does not allow us to show the admissibility of contraction; this is explained further below. Consequently, our cut-elimination proof differs significantly from the ones in [Str13] and [Brü09].
• The ⊥•-rule and the ∨•-rule have a restriction on where the output formula occurs in the context: it must be in the same subtree of the sequent as the principal formula of the rule. The reason for this is the lack of k3 (for the ∨•-rule) and k5 (for the ⊥•-rule).
• In the ⊃•-rule (and also in the cut-rule described below), the 'output pruning' is defined differently from [Str13]. There only the unique output formula is removed, whereas here the whole subtree containing the output formula is removed. The reason for this is the lack of the k4-axiom.
• In [Str13] the structural rule is heavily used. However, in the constructive setting, this rule is not available as it is no longer sound: it corresponds to the k4-axiom when the output formula occurs in ∆1 or ∆2.
Note that the id-rule applies only to atomic formulas but, as usual with sequent-style systems, the general form is derivable and this can be shown by a straightforward induction:
In the course of this paper we make use of the following structural rules: called necessitation, weakening, and cut, respectively. These rules are not part of the system, but we will later see that they are all admissible. Note that in the weakening rule ∆ must be a LHS sequent, as is the case for the contraction rule c, as one might expect in an intuitionistic setting. The cut rule makes use of the output pruning in the same way as the ⊃•-rule.
We now turn to the rules for the axioms in (1.2). For d, t and 4, the corresponding rules are shown in Figure 3, and they coincide with those in [Str13].
For the b and 5 axioms, the rules given in [Str13] (themselves adapted from the classical setting [Brü09]) are not sound in the constructive setting, again due to the lack of k4. For b, one could restrict the rules of [Str13] in the following way, in order to regain soundness. However, such a system is not yet complete as, for example, the formula ♦(□A ∨ ⊥) ⊃ A is no longer provable in the cut-free system.
To address this problem, we introduce the structural rules in Figure 4 which were used during the cut-elimination proofs of [Brü09] and [Str13]. These rules are identical to the ones in [Brü09] and [Str13] for d, t, and b. For 4, our rule is slightly weaker than the one in [Str13], again due to the lack of k4. Finally, for 5, the situation is more subtle: again, the general versions of the logical rules 5• and 5◦ from [Str13] are no longer sound due to the lack of k4. These 5• and 5◦ rules can each be decomposed into three rules performing 'simpler' inference steps, but unfortunately all three of these are unsound. The first can be made sound by incorporating weakening, as shown for b• and b◦ in (3.2) and (3.3) above, but, as expected, the resulting system is again incomplete. Perhaps surprisingly, the structural rule 5[ ] used in [Brü09] and [Str13] is also no longer sound in the constructive setting due to the lack of k4. However, that rule (shown on the left below) can also be decomposed into three rules (shown on the right below), of which the first (shown in Figure 4) is sound in the constructive setting, i.e., with respect to HCK + 5 in the next section. This 'decomposition' is similar to the cases of the rules 5• and 5◦ discussed in [Str13] and [MS14a].
In the remainder of this paper we show soundness and completeness of our systems. For this let us introduce the following notation. We use X and Y for sets of axioms, i.e., X, Y ⊆ {d, t, b, 4, 5}, and we write X[ ] (or Y[ ]) to denote the set of corresponding structural rules shown in Figure 4. If X ⊆ {d, t, 4}, we write X for the set of corresponding □•- and ♦◦-rules shown in Figure 3. Then, we may write NCK + X + Y[ ] to denote NCK augmented with the rules X and Y[ ]; in such cases no assumptions on X or Y beyond those stated are made. In particular, their intersection does not need to be empty, nor does one need to be a subset of the other.

Soundness
To our knowledge there is no standard Kripke semantics covering all the various constructive modal logics, and consideration of this issue is beyond the scope of this work. Therefore we show soundness of our rules with respect to the Hilbert system.
For this we define HCK to be some complete set of axioms for intuitionistic propositional logic extended by the axioms k1 and k2, shown in (1.1), together with the rules mp for modus ponens and nec for necessitation:
For a set X ⊆ {d, t, b, 4, 5} we then write HCK + X for the system obtained from HCK by adding the axioms in X. If X is a singleton {x}, we just write HCK + x. Soundness can now be stated in the following theorem:
Theorem 4.1 (Soundness). Let X ⊆ {d, t, 4}, let Y ⊆ {d, t, b, 4, 5}, and let
Clearly, (ii) follows immediately from (i) using an induction on the size of the derivation. To prove (i), we start with the axioms:
Lemma 4.2. Let X ⊆ {d, t, b, 4, 5}, let Γ{ } be an output context, and Π◦ be a RHS sequent. Then fm(Γ{a•, a◦}) and fm(Γ{⊥•, Π◦}) are provable in HCK + X.
Proof.By induction on the structure of Γ{ }.
For showing soundness of the inference rules with one premise, we first have to verify that the deep inference reasoning remains valid in the constructive setting.This is shown in the following three lemmas.
We can now prove the soundness of rules with one premise.
Lemma 4.6. Let X ⊆ {d, t, b, 4, 5}, and let
Proof. For the rules ∨◦, □◦, ♦◦, ⊃◦ this follows immediately from Lemma 4.4, where for ♦◦ we need the k2-axiom. For the other rules we apply Lemma 4.5. Note that for the □•-rule we need a case distinction: if the output formula occurs inside ∆, then we use k1 and Lemma 4.4; if the output formula occurs inside the context Γ{ }, then we use k2 and Lemma 4.5.
Let us now turn to showing the soundness of the branching rules ∧◦, ∨•, ⊃•, and cut. For this, we develop appropriate versions of Lemmas 4.3 and 4.4 that deal with branching behavior. Note that, contrary to the intuitionistic case in [Str13], we do not have such a version of Lemma 4.5 in the constructive setting. This is due to the lack of axiom k3.
Lemma 4.7. Let X ⊆ {d, t, b, 4, 5}, and let A, B, C, and D be formulas. (i)
Proof. (i) and (ii) follow by completeness of HCK over intuitionistic logic. (iii) and (iv) follow by necessitation, distributivity of □ over ∧, and k1 or k2, respectively.
Lemma 4.9. Let X ⊆ {d, t, b, 4, 5}, and let
Proof. For the ∧◦- and ∨•-rules, this follows immediately from Lemma 4.8 and the corresponding provable formulas. For ⊃•, note that by Observation 2.2 and Definition 2.3, the rule is of shape where Γ′{ }, Λ{ }, and Π{ } are output contexts. In particular, let Now let P = fm(Π◦) and Li = fm(Λi) for i = 0, ..., n, and let To be able to apply Lemma 4.8, we need to show that which can be shown provable in HCK + X using an induction on n together with Lemma 4.7 (ii) and (iv). For the cut-rule we additionally observe that A ⊃ A is always provable.
Remark 4.10. From the lemmas presented so far, we now have that NCK + w + cut is sound with respect to HCK, i.e., we have already proved Theorem 4.1 in the case X = Y = ∅. This means that if we have a proof of a formula A in NCK + w + cut in which we allow X ⊆ {d, t, b, 4, 5} to occur as proper axioms, then we have that X ⊃ A is provable in HCK, by purely propositional logic, and therefore A is provable in HCK + X.
We use the observation in the above remark to prove the following lemma.
Lemma 4.11. Let S and D be arbitrary formulas. Then we have the following:
Proof. In the following we show that the formulas in (i)-(vi) can be proved in NCK + cut extended by b or 5, as appropriate, as a proper axiom. Our lemma then follows from Remark 4.10.
where D1 is a subderivation of (i), D2 is the same as D1, except that we use 5 instead of b, and D3 is a variant of a subderivation of (ii), using 5 instead of b.
Now we can show soundness of the rules in Figures 3 and 4, which we need to complete the proof of Theorem 4.1.
Lemma 4.12. Let X ⊆ {d, t, 4}, let Y ⊆ {d, t, b, 4, 5}, let x ∈ X, let y ∈ Y, and let
and 4[ ] this follows immediately from Lemmas 4.4 and 4.5 and the corresponding axioms, shown in (1.2). For 4• and 4◦, observe that these two rules can be derived using the rules ♦• and ♦•, respectively, and respectively. The soundness of the two rules in (4.2) follows immediately from Lemmas 4.4 and 4.5 and the 4-axiom.
Having established the soundness of our system, we can use it to make some interesting observations. Surprisingly, the b-axiom entails the axioms k3 and k5 (shown in (1.1)). While k5 can easily be proved directly in the Hilbert system, the proof of k3 in HCK + b is not so simple. From our cut-elimination result in Section 6 it will follow that the 5 axiom alone is not enough to derive k3 or k5. But since b is derivable in CS5, both k3 and k5 are derivable in CS5.

Completeness
Completeness is also shown with respect to the Hilbert system. This is in fact very similar to the completeness proof for intuitionistic modal logic given in [Str13]. To simplify our cut-elimination argument in Section 6 we will put a restriction on the ♦•-rule: we define the system NCK′ to be NCK with the ♦•-rule replaced by
Theorem 5.1 (Completeness). Let X ⊆ {d, t, 4} and Y ⊆ {d, t, b, 4, 5}. Then every formula that is provable in
Proof. Clearly, all axioms of propositional intuitionistic logic are provable in NCK′. The axioms k1 and k2 are provable in NCK′, by the same derivations as in [Str13], so we do not repeat them here. Note that the derivations for k3, k4, and k5 of [Str13] are not valid in our setting because of the restrictions to the ∨•-, ⊃•-, and ⊥•-rules, respectively. Figure 5 shows that each axiom
Finally, the rules mp and nec, shown in (4.1), can be simulated by the rules cut and nec[ ], shown in (3.1), as follows:
From here we appeal to the admissibility of the nec[ ]-rule, which follows by a straightforward induction on the size of a derivation.
Theorems 4.1 and 5.1 are enough to give sound and complete nested sequent systems with cut for any logic in the cube shown in Figure 1, by simply adding the corresponding structural rules from Figure 4. If one of the axioms is d, t, or 4, then we can use the logical rules from Figure 3 instead of the structural rule. For example, for CS4, we can use or any union of these sets.
In the next section we show cut-elimination for NCK′ + X + Y[ ], yielding completeness for the cut-free system. However, this is not achieved for every subset of X ∪ Y[ ] with X ⊆ {d, t, 4} and Y ⊆ {d, t, b, 4, 5}. In fact, it can be shown that, for example, NCK′ + 4[ ] is not complete for CK4. On the other hand, we have:
Thus, if we want a cut-free system for CS4, we have to add the rules {t•, t◦, 4•, 4◦} to NCK′. The proof of Theorem 5.2 relies on the cut-elimination argument presented in the next section, and can thus be presented only at the end of Section 6.
Looking back at the cube in Figure 1, we can see that Theorem 5.2 gives us cut-free systems for the logics CK, CK4, CK45, CD, CD4, CD45, CT, CS4, and CS5.The logics for which our cut-elimination proof does not apply are CKB, CK5, CKB5, CD5, CDB, and CTB.

Cut-Elimination
By inspection of the statement of Theorem 5.2, we have that t and 4 must be present as logical rules, and b and 5 as structural rules, whereas d can be present in either variation. This is due to the following result, whose proof is straightforward.
We can now state our cut-elimination result in a concise way:
Theorem 6.3 (Cut-Elimination). Let X, Y be a safe pair of axioms, and let D be a proof in
The rest of this section is dedicated to the proof of Theorem 6.3. Since our cut-elimination strategy might seem unorthodox, we first explain some of the problems we encountered. Consider the following derivation:
We cannot permute the instance of ∨• under the cut because in general it is not applicable in Γ{Θ{C ∨ B•}, Π◦}. On the other hand, we cannot reduce the rank of the cut along the main connective of the cut formula ♦A, since there is no invertible rule for ♦A◦, and different things might happen in the left two branches. Furthermore, we cannot just impose the same restriction that we impose on the ∨•-rule also on the cut rule, because then we would not be able to reduce the cut rank in the ordinary ♦◦-♦• cases. The situation in (6.1) is the reason we work with the restricted variant of the ♦•-rule instead of ♦• itself. Note that imposing the same restriction on all logical rules would make other permutation cases difficult.
In what follows we will use the shorthand Γ^n to denote Γ with n pairs of brackets around it. Also, we define the depth of a context Γ{ } to be the number of bracket pairs in whose scope the hole of Γ{ } appears, i.e., the depth of ∆
We consider super rule variants of the rules 4 and 5[ ], shown in Figure 6, obtained from unboundedly many applications of the corresponding normal rules in a certain way. For a safe pair X, Y of axioms, we define
We need these variants in order to obtain height-preserving admissibility of certain rules. We have the following proposition:
Proposition 6.4. A sequent is provable in
Proof. One direction follows immediately from the observation that 4• and 4◦ are special cases of s4• and s4◦, respectively, and that b[ ] is a special case of sb[ ] and of s5b[ ], and that 5[ ] is a special case of s5[ ] and of sb5[ ]. Conversely, s4• and s4◦ are just sequences of 4• and 4◦, respectively, and s4• and s4 s .
Proof. We show here how s4• permutes over sb[ ]. There are two nontrivial interactions:
The other cases are similar.
When the rules 4• and 4◦ are present, our cut rule, shown in (3.1), is not strong enough for our induction to work. Therefore we additionally use the following two rules, which are just combinations of cut with towers of 4◦ and 4•, respectively. More precisely:
Fact 6.6. The rule ♦cut is derivable in {cut, s4◦} and in {cut, 4◦}, and the rule □cut is derivable in {cut, s4•} and in {cut, 4•}.
By Cut, we refer to the set {cut, ♦cut, □cut} or {cut}, depending on whether 4• and 4◦ are present or not, and we write ∗cut for any variant in Cut. Throughout this section we fix the convention that, for any ∗cut step, the output cut formula occurs in the left premise, while the input cut formula occurs in the right premise.
Definition 6.7. For a formula A we define depth(A) inductively as follows:
Given a cut step, as shown in (3.1), its cut formula is A, and its rank is depth(A).
Since the rules ♦cut and □cut can be seen as derivations consisting of one instance of cut and some instances of 4◦ and 4•, respectively, the definition of rank also applies to ♦cut and □cut given in (6.2). We use this convention throughout this section: whenever we define a notion for an instance of cut, this definition also applies for ♦cut and □cut because there is a unique instance of cut contained in them.

and s4• are called black destructing. The principal formula of a black destructing rule instance is the input formula singled out in its conclusion in Figures 2, 3 and 6.
In other words, a rule instance is black destructing if, considered bottom-up, it decomposes an input formula along its main connective, and that formula is its principal formula. In particular, note that 4◦ and s4◦ are not black destructing.
Definition 6.9. An instance of cut is anchored if the rule immediately above it on the right is a black destructing rule whose principal formula is the cut formula. We define the value of a cut-instance to be the pair ⟨r, s⟩, where r is its rank, and s = 0 if it is anchored and s = 1 if it is not anchored. The value of an instance of ♦cut or □cut is the value of the underlying cut-instance (if we read the ♦cut/□cut as a composition of cut and s4◦/s4•). Finally, the cut-value of a derivation D, denoted by v(D), is the multiset of the values of its cut-instances.
We order cut values lexicographically. Multisets of cut values are then ordered via a common multiset ordering: given an ordered set ⟨V, <⟩, let M(V) be the set of multisets of elements of V, and let M1, M2 ∈ M(V) be two such multisets. We define M1 ≪ M2 iff there is a multiset surjection f :
Fact 6.10. If < is a strict total order, then so is ≪. Furthermore, if < is well-founded, then so is ≪ [DM79].
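The ordering on cut-values can be made concrete as follows. This is a minimal sketch, assuming the standard Dershowitz-Manna characterisation of the multiset extension of a strict order (cf. [DM79]) rather than the surjection-based formulation used above; the function names `cut_value` and `multiset_less` are hypothetical. Cut values are encoded as Python tuples, whose built-in comparison is lexicographic.

```python
from collections import Counter

def cut_value(rank, anchored):
    """Value <r, s> of a cut: s = 0 if the cut is anchored, s = 1 otherwise."""
    return (rank, 0 if anchored else 1)

def multiset_less(m1, m2):
    """Dershowitz-Manna ordering: True iff m1 is strictly below m2.

    m1 is below m2 iff the two multisets differ and every element that
    occurs more often in m1 is dominated by a strictly larger element
    that occurs more often in m2.
    """
    c1, c2 = Counter(m1), Counter(m2)
    if c1 == c2:
        return False
    for x in c1:
        if c1[x] > c2[x]:
            if not any(y > x and c2[y] > c1[y] for y in c2):
                return False
    return True

# One unanchored cut of rank 3 ...
before = [cut_value(3, anchored=False)]
# ... replaced by an anchored cut of rank 3 and two unanchored cuts of rank 2.
after = [cut_value(3, anchored=True),
         cut_value(2, anchored=False),
         cut_value(2, anchored=False)]

print(multiset_less(after, before), multiset_less(before, after))
```

The example shows why a multiset measure is convenient here: a reduction step may increase the number of cuts in a derivation and still decrease the cut-value, as long as every new cut has a smaller value than the one it replaces.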
This gives us a well-order ≪ on the cut-values of a derivation, and our cut-reduction proceeds by an induction on this well-order. For simplicity, we always consider a topmost cut. There are two main lemmas, one for reducing anchored cuts (Lemma 6.20), and one for reducing cuts that are not anchored (Lemma 6.19). For both of these lemmas we need, as is often the case, height preserving admissibility and invertibility of certain inference rules.
Definition 6.12. The height of a derivation D, denoted by h(D), is defined to be the length of a maximal branch in the derivation tree. We say that a rule r with one premise is height preserving admissible in a system S, if for each derivation D in S \ {r} of r's premise there is a derivation D′ of r's conclusion in S \ {r} such that h(D′) ≤ h(D). Similarly, a rule r is height preserving invertible in a system S, if for every derivation of the conclusion of r there are derivations for each of r's premises with at most the same height.
Proposition 6.13. Let X, Y be a safe pair of axioms. Then all rules in X[ ], as well as the rules w and nec, are height preserving admissible for s .
Proof. For w and nec this is a straightforward induction on the height of the derivation. For t[ ] and 4[ ] we permute steps upwards through the proof to show admissibility, preserving height of the other rules in each reduction. Notice that, for either step, any nontrivial overlap with a rule above must have a bracket in its conclusion. For t[ ] we have the following nontrivial cases:
} with depth(∆2) = m and depth(∆1) = n, and the permutation is as follows: and we can apply the induction hypothesis twice.
(2) s4 where r is t• if the hole of ∆{ } has depth 0 and s4 One overlap case is similar to case 1, and the other is given below.
One overlap case is similar to case 1, and the other is similar to 7 above.
And for 4[ ] we have the following nontrivial cases: The only overlap possible is in the ∆ part of a sb[ ]-step, so let ∆{ } = ∆1{[∆2{ }]} with depth(∆2) = m and depth(∆1) = n, and the permutation is as follows: and we can apply the induction hypothesis twice.
(10) s4 One overlap case is similar to 9 and the other is given below.
One overlap case is similar to 9, and the other is similar to 15 above.
Note also that permutations over contraction preserve height, since we can apply the induction hypothesis twice.
Note that the variants X_s of X and Y[ ]_s of Y[ ] are needed to make Proposition 6.13 work. Without the "super-rules" we would not be able to preserve the height, and consequently would not be able to proceed by the induction hypothesis when eliminating t[ ] and 4[ ] in the cases 1 and 9 above.
Proposition 6.14. The rules ∧
Before we can state our main lemmas, we need to define a restricted version of Buss' logical flow-graphs [Bus91].
Definition 6.15. We define the (formula) flow-graph of a derivation D, denoted by G(D), to be the directed graph whose vertices are all input formula occurrences in D, and whose edges are just between two formula occurrences which are the same unaltered occurrence in the premise and conclusion of an instance of an inference rule. This concerns all formula occurrences in Γ{ }, ∆, Π, and Σ in the rules in Figures 2, 3, 4, 6 and ∗cut, as well as the occurrences of □A• in the 4• and s4• rules. The edges are always directed from premise to conclusion. The length of a path in G(D) is its number of edges. A path p in G(D) is maximal if for every path p′ in G(D) with p ⊆ p′ we have p = p′.
Let us emphasize that there are no edges between a formula occurrence and any of its subformulae that may occur in G(D). For example, the principal A ∨ B• in the conclusion of an ∨•-rule is connected to neither the A• nor the B• in the premises. But every formula occurrence in Γ{{ }, Π◦} in the conclusion is connected via an edge to the same occurrence in each of the two premises. Thus, the flow-graph is essentially a set of trees, where branching occurs in the branching rules ∨•, ⊃•, ∧◦, ∗cut, and in a contraction because every formula occurrence in ∆• in the conclusion is connected to each of its copies in the premise.
Recall from Definition 6.8 the notion of a black-destructing rule, from Definition 6.9 the notion of an anchored cut, and our convention that an output cut formula is written on the left-hand side of a * cut step and an input cut formula on the right.
Definition 6.16. A cut-path in G(D) is a maximal path that ends at the cut formula A • in the right-hand side premise of a * cut-instance. A cut-path is relevant if it starts at the principal formula of a black-destructing rule. Otherwise it is called irrelevant. A cut-path is left-free if it never passes through a left-hand side premise of an instance of * cut. A derivation D is left-free if all relevant cut-paths in G(D) are left-free. An origin of G(D) is the topmost vertex of a relevant cut-path in G(D). An origin is anchored if its cut-path has length 0. A derivation is anchored if all its cuts are anchored. An instance of * cut in D is called relevant if it has at least one relevant cut-path. Otherwise it is called irrelevant. The relevant cut-value of a derivation D, denoted by v r (D), is the multiset of the values of its relevant cuts.
To be clear, irrelevant cut-paths are exactly those that begin in the context of an axiom, i.e. in the Γ{ } part of a ⊥ • or id step.
Notice that we are using the term 'anchored' to describe both cuts, as in Definition 6.9, and origins as in the definition above (as well as derivations). In particular we point out that, if a cut is anchored, then it can have only one origin, which is also anchored. Conversely, if all origins are anchored (which are only defined for relevant cut-paths), there may be some cuts that are not anchored in the derivation, namely those with only irrelevant cut-paths. An anchored derivation, thus, is one all of whose cuts and origins are anchored, which is not the same as simply having all origins anchored. This subtlety is important in the proof of Lemma 6.19 below. But first, let us give an example.
Example 6.17. Consider the derivation: Here the cut-paths for A • and E • are irrelevant, while the cut-path for d ∧ b • is relevant.
There are three cut-paths for A • and, except for the rightmost one, they do not satisfy left-freeness, since they pass through the left premise of a cut instance. The only cut-path for d ∧ b • has two vertices and length 1. Therefore, this cut is not anchored. But if we permute that cut over the ∨ • -rule instance, we obtain the derivation in which the cut-path for d ∧ b • has length 0, and so this cut is anchored.
Lemma 6.18. Let X, Y be a safe pair of axioms. Given a derivation D in NCK ′ + X s + Y [ ] s + Cut, there is a derivation D ′ in NCK ′ + X s + Y [ ] s + Cut of the same conclusion, such that D ′ has no irrelevant cuts, and such that v r (D ′ ) ≤ v r (D).
Proof. We proceed by induction on the number of irrelevant cuts in D. Consider the topmost one. We can replace where D ′ 2 is obtained from D 2 by removing the A • occurrence everywhere; this results in a correct derivation since, by irrelevance, A • must occur in the context of an axiom. For ♦cut and □cut we proceed similarly.
In the following, we use the notation n * r where n is a natural number and r a name of an inference rule. Then n * r simply stands for n consecutive applications of r.
Lemma 6.19. Let X, Y be a safe pair of axioms, and let D be a left-free derivation in NCK ′ + X s + Y [ ] s + Cut. Then there is an anchored derivation D ′ in NCK ′ + X s + Y [ ] s + Cut of the same conclusion, such that for each * cut in D ′ , there is a * cut in D of the same rank.
Proof. We proceed by induction on the number of origins in G(D) that are not anchored. If all origins are anchored, then we remove all irrelevant cuts using Lemma 6.18 and we are done. Otherwise, we pick a topmost origin that is not anchored and proceed by an inner induction on the length of its cut-path to show that there is a derivation in which the number of non-anchored origins has decreased. Note that, if the length of this cut-path is 0, then the cut is already anchored and there is nothing to do. Now consider the * cut-instance connected to our origin and make a case analysis on the rule instance r on the right above it.
(1) If r is one of ∧ • , ♦ • , ⊃ • , • , we can reduce as follows: where the Inv r is eliminated by Proposition 6.14.
(2) If r is one of ∨ • , ∧ • , we can reduce as follows: where the Inv r steps are eliminated by Proposition 6.14. Note that it can happen in these two cases that the Inv r -step is vacuous in the above because, depending on the position of the output formula in Γ{ }, it is possible that Γ
From here, w steps are removed by Proposition 6.13. Left-freeness is preserved since the derivations initially on the right of the cut, D 2 and D 3 , remain on the right of all cuts after the transformation. Finally, both of the new cuts have the same rank as the initial cut, satisfying the requirement in the statement of the lemma. (To see that the application of the c-rule is correct we refer to Observation 2.2 and Definition 2.3.) Note that this case shows that we need an explicit contraction rule. Making contraction implicit in the ⊃ • -rule (as done in [Str13]) would not be enough, since we also need to duplicate the context Θ{ }.
In the case shown above, the output formula in the conclusion can be in Γ{ } or Θ{ }. There is another such case for ⊃ • on the right branch, where the output formula in the conclusion is in ∆{ }. That case is simpler, and no extra contraction is needed: Here we use invertibility of ⊃ • on the right (Proposition 6.14).
(4) If r is one of • , ♦ • , t • , t • , c, or one of the s4-rules, working entirely in the context of the cut formula A • , then there are contexts Γ ′ , Γ ′ 1 such that we can reduce as follows: where, read top-down, the two w-steps weaken as much as is necessary to unify the contexts in order to perform a cut-step. The r-step then acts on the appropriate redex (as determined by the r-step on the left) before contraction is performed to eliminate any formulae duplicated as a result of the permutation. The w-steps are then removed by Proposition 6.13.
(5) If r is a s4 • step moving the cut formula (which is of shape A • ), then we can inductively apply, by decomposing the instance of s4 • into several 4 • steps.
(6) If r is a c step duplicating the cut formula, we can reduce as follows: Note that the number of origins is not increased because the derivation is left-free. Furthermore, we can choose the order of the two new cuts such that the origin we are working on belongs to the topmost cut. Thus, we can proceed by the induction hypothesis.
(7) If r is a sb [ ] or s5b [ ] step, such that the cut-formula A • is inside Σ, then there are two subcases.
(9) If r is a s5 [ ] or sb5 [ ] , such that the cut-formula A • is inside Σ, then the situation is similar to case 7b above: where we use Proposition 6.13 to remove the w- and 4 [ ] -steps. This case is the reason why we need the presence of 4 when we have 5 in our logic.
(10) If r is a s5 [ ] or sb5 [ ] , such that the cut-formula A • is inside ∆{∅}, then the situation is similar to case 8 above.
(11 such that the cut-formula A • is inside Γ{ }, then we proceed as in case 4 above.
(12) Finally, if r is another cut, we can reduce as follows: where D ′ 2 exists because our original derivation is left-free, and the w-step is removed by Proposition 6.13. Note that it can happen that (Σ{∅}) ⇓ = (Σ{A • }) ⇓ and/or (∆{∅}) ⇓ = (∆{B • }) ⇓ , depending on where the output formula occurs in Γ{Σ{∅}, ∆{∅}}.
Above, we have only shown the cases for cut. The ones for ♦cut and □cut are similar, except the ones when r is one of When such a rule is on the right above a cut, we decompose that cut into a s4 • and a cut (using Fact 6.6) and then apply Lemma 6.5 to permute the s4 • over r, so that we can proceed as described above. When the cut is permuted over r, we can compose it again with the s4 • -instance, so that we can proceed by the induction hypothesis. Observe that we make crucial use of the left-free property. Without it, the number of origins in G(D) would not be stable. Furthermore, note that the cut-cut permutation does not affect cuts that are above our current origin. Thus, all origins above remain anchored. This is the reason for starting with the topmost one.
Lemma 6.20. Let X, Y be a safe pair of axioms. If there is a proof where D 1 and D 2 are both in NCK ′ + X s + Y [ ] s and where * cut is anchored, then there is a proof in NCK ′ + X s + Y [ ] s + Cut in which all cuts have a smaller rank.
Proof. We make a case analysis on the cut-formula A.
(1) If A = B ∧ C, we reduce the cut rank as follows: where D ′ 1 and D ′′ 1 exist since the ∧ • -rule is height-preserving invertible (Proposition 6.14), and D ′ 2 exists since our cut is anchored. Finally, we remove the w-step using Proposition 6.13.
(2) If A = B ⊃ C, we reduce the cut rank as follows: where again, D ′ 1 exists by invertibility of the ⊃ • -rule (Proposition 6.14), and D ′ 2 and D ′′ 2 exist since our cut is anchored.
(3) If A = □B, and the rule on the right above the cut is a □ • , then we reduce the cut rank as follows: where D ′ 1 and D ′ 2 exist for the same reason as above, and we finally apply Proposition 6.13 to remove the w- and 4 [ ] -steps, where n is the depth of Θ ⇓ { }.
(4) If A = □B and the rule on the right above the cut is a s4 • , then we can reduce to the previous case as follows,
(5) If A = □B, and the rule on the right above the cut is a t • , then we reduce the cut rank as follows: where n is the depth of Θ ⇓ { }, and at the top we either have one t [ ] step (if n = 0) or n − 1 steps of 4 [ ] (if n ≥ 1), which can be removed via Proposition 6.13. Note that 4 ∈ X if n ≥ 1.
(6) If A = a, then the cut is removed as follows: we proceed by induction on the height of D 1 and make a case analysis on its bottommost rule instance r.
(a) If r is a ∨ • , then it must decompose the cut formula, bottom-up, and we can reduce the cut rank as follows: where the Inv-step is removed by Proposition 6.14, and we can proceed by the induction hypothesis.
(c) All other cases are standard commutative cases and are shown below. They are in fact symmetric to their corresponding cases in Lemma 6.19. This, in particular, concerns the case where r is ∨ • . Since our cut is anchored, the output branch of the sequent is next to the A • occurrence in the right premise of the cut. Therefore the ∨ • above the left premise of the cut can now be permuted under the cut: The Inv-steps are removed by Proposition 6.14. The other invertible rules are handled similarly: Finally, if the rule on the left above the cut is an axiom ⊥ • (note that it cannot be id because A • is not an atom), then we reduce as follows: where n is the depth of the context Θ • { }, and D ′ 2 exists because the instance of ♦cut is anchored. We use Proposition 6.13 to remove the w- and 4 [ ] -steps. We proceed similarly for s4
(c) If r is t • , the situation is similar and we can reduce the cut rank as follows:
We can now put things together to complete the proof of cut-elimination.
Proof of Theorem 6.3. A proof in NCK ′ + X + Y [ ] + cut is trivially also a proof in NCK ′ + X s + Y [ ] s + Cut. We proceed by induction on the cut-value v(D) using the well-ordering ≪ (defined after Definition 6.9). In the base case v(D) is empty, and we are done. Otherwise, we pick a topmost cut in D. If this * cut-instance is anchored, then by Lemma 6.20 we can replace this cut by cuts of smaller rank, and thus reduce the overall cut-value of the derivation. If our * cut-instance is not anchored, we observe that the subderivation rooted at that * cut-instance is left-free (because we chose a topmost cut), and therefore we can apply Lemma 6.19 to replace that subderivation with one in which all cuts are anchored and have the same rank. Thus, the overall cut-value of the derivation has been reduced as well, and we can proceed by the induction hypothesis. Finally, we apply Proposition 6.4 to eliminate super steps and obtain a proof of the same conclusion in NCK ′ + X + Y [ ] .
From here it is simple to see why Theorem 5.2 holds.

Conclusions
To the best of our knowledge, our paper is the first attempt to provide some unified proof-theoretic framework for the constructive modal cube. Although this work does not show cut-elimination for every logic in the cube, we conjecture that the systems presented do, in fact, admit cut. More precisely: This would give us a cut-free system for every logic in the cube. The reason why we think Conjecture 7.1 is true is the observation that the only place where 4 steps appear in the presence of b [ ] or 5 [ ] is the permutation of sb [ ] steps or s5 [ ] steps under a cut, as in (6.6). Instances of 4 [ ] are then introduced, and in the admissibility proof for 4 [ ] instances of 4 • or 4 • are only introduced when 4 [ ] steps are permuted over instances of • or ♦ • . However, looking back at (6.6) and (6.8), one can see that it seems possible to permute these instances of • and ♦ • under the whole derivation block, including the cut. We have not yet managed to incorporate this observation into the formal cut-elimination argument, and leave this issue for further research.
An alternative approach would be to make use of the observation that b implies k 3 and k 5 , by generalizing the ⊥ • - and ∨ • -rules to their intuitionistic versions, as used in [Str13, MS14a]. This would simplify the cut-elimination argument for logics containing b since we could reuse a lot of the material already appearing in [Str13].
Another path of further research is to give modal logics a similar uniform treatment as the substructural logics in [CGT08, CST09]. For this, it is necessary to first look at concrete examples of structural rules corresponding to axioms, as we have shown in Figure 4. Since these rules almost coincide in the classical, the intuitionistic, and the constructive setting, we hope to eventually discover a general pattern, yielding uniform cut-elimination arguments for a variety of modal logics.
Finally, we have observed an apparent dichotomy between the b axiom and the 'constructiveness' of constructive modal logic, since the former implies k 3 and k 5 , for which we do not know of any approach providing some sort of Curry-Howard correspondence. We therefore believe it would be pertinent to develop further outlooks on such logics. Perhaps it would be possible to find weaker formulations of b which are equivalent classically, but not constructively, and which do not entail k 3 and k 5 . Such an endeavor might yield new insights for extending the scope of the Curry-Howard correspondence to modal logics.

Figure 5. Proofs of the axioms d, t, b, 4, and 5 in our system
Proposition 6.1. (i) The rules d • and d • are derivable in { • , d [ ] } and {♦ • , d [ ] }, respectively. (ii) The rule d [ ] is admissible for any subsystem of NCK ′ + X + Y [ ] , provided d ∈ X.
However, our cut-elimination argument becomes slightly simpler if we work with d [ ] instead of d • and d • . To summarize, the following definition fixes the axiom sets our cut-elimination proof deals with.
Definition 6.2. Let X, Y ⊆ {d, t, b, 4, 5}. We call the pair X, Y safe if X ⊆ {t, 4} and Y ⊆ {d, b, 5}, such that if t ∈ X and 5 ∈ Y then b ∈ Y, and if b ∈ Y or 5 ∈ Y then 4 ∈ X.
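The conditions of Definition 6.2 are purely combinatorial, so they can be checked mechanically. The following Python sketch is an illustration only: the function names `is_safe`, `subsets`, and `safe_pairs` are hypothetical, and axioms are represented by the strings "d", "t", "b", "4", "5".

```python
from itertools import combinations


def is_safe(x, y):
    """Check whether the pair (X, Y) is safe in the sense of Definition 6.2."""
    x, y = set(x), set(y)
    # X may only contain t and 4; Y may only contain d, b and 5.
    if not x <= {"t", "4"} or not y <= {"d", "b", "5"}:
        return False
    # If t is in X and 5 is in Y, then b must be in Y.
    if "t" in x and "5" in y and "b" not in y:
        return False
    # If b or 5 is in Y, then 4 must be in X.
    if ("b" in y or "5" in y) and "4" not in x:
        return False
    return True


def subsets(items):
    """All subsets of a finite collection, as a list of sets."""
    items = sorted(items)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def safe_pairs():
    """Enumerate every safe pair (X, Y) by brute force."""
    return [
        (x, y)
        for x in subsets({"t", "4"})
        for y in subsets({"d", "b", "5"})
        if is_safe(x, y)
    ]
```

For instance, `is_safe({"4"}, {"5"})` holds, whereas `is_safe({"t", "4"}, {"5"})` fails because t and 5 together require b, and `is_safe(set(), {"b"})` fails because b requires 4.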

Figure 6. Super rules for 4, b, and 5, where n is the depth of ∆{ } and 1 ≤ k ≤ n.

•♦
are obtained by composing with • and ♦ • , respectively. Then sb [ ] and s5 [ ] are just sequences of b [ ] and 5 [ ] , respectively, whereas s5b [ ] and sb5 [ ] use both b [ ] and 5 [ ] .
Lemma 6.5. Let X, Y be a safe pair of axioms. If 4 ∈ X, then the rules s4 • and s4 • permute over any r ∈ Y [ ] and ⊃ • on the right premise, are height-preserving invertible for NCK ′ ∪ X s ∪ Y [ ] s .
Proof. Straightforward induction on the height of the derivation.