Herbrand-Confluence

We consider cut-elimination in the sequent calculus for classical first-order logic. It is well known that this system, in its most general form, is neither confluent nor strongly normalizing. In this work we take a coarser (and mathematically more realistic) look at cut-free proofs. We analyze which witnesses they choose for which quantifiers, or in other words: we only consider the Herbrand-disjunction of a cut-free proof. Our main theorem is a confluence result for a natural class of proofs: all (possibly infinitely many) normal forms of the non-erasing reduction lead to the same Herbrand-disjunction.


Introduction
The constructive content of proofs has always been a central topic of proof theory and it is also one of the most important influences that logic has on computer science. Classical logic is widely used and presents interesting challenges when it comes to understanding the constructive content of its proofs. These challenges have therefore attracted considerable attention, see, for example, [Par92, DJS97, CH00], [BB96], [Urb00, UB00], [BBS02], [Koh08], or [BL00] for different investigations in this direction.
A well-known, but not yet well-understood, phenomenon is that a single classical proof usually allows several different constructive readings. From the point of view of applications this means that we have a choice among different programs that can be extracted. In [RT12] the authors show that two different extraction methods applied to the same proof produce two programs, one of polynomial and one of exponential average-case complexity. This phenomenon is further exemplified by case studies in [Urb00, BHL+05, BHL+08] as well as the asymptotic results [BH11, Het12b]. The reason for this behavior is that classical "proofs often leave algorithmic detail underspecified" [Avi10].
On the level of cut-elimination in the sequent calculus this phenomenon is reflected by the fact that the standard proof reduction without imposing any strategy is not confluent.
The explicitly mentioned formula in the conclusion of an inference rule, like A ∨ B for the ∨-rule, is called the main formula. Analogously, the explicitly mentioned formulas in the premises of an inference rule, like A and B for the ∨-rule, are called auxiliary formulas. In the context of a concrete derivation we speak about main and auxiliary formula occurrences of inferences.
Definition 2.2. A proof is called regular if different ∀-inferences have different eigenvariables.
We use the following convention: lowercase Greek letters α, β, γ, δ, ... denote eigenvariables in proofs, and π, ψ, ... denote proofs. For a proof π, we write |π| for the number of occurrences of inferences in π. Furthermore, we write EV(π) for the set of eigenvariables of ∀-inferences of π.
In a sequent calculus proof, each formula occurrence can be traced downwards via its descendants to either a cut formula or the end-sequent. We write EV_c(π) for the set of those eigenvariables in π that are introduced by a ∀-inference whose main formula occurrence can be traced downwards to a cut formula, i.e., is not part of the end-sequent of π. The elements of EV_c(π) will also be called cut-eigenvariables.
Definition 2.3. A weak sequent is a sequent that does not contain any ∀-quantifier.
Fact 2.4. If the end-sequent of a proof π is a weak sequent then EV(π) = EV_c(π).
Remark 2.5. Our results do not depend on technical differences in the definition of the calculus (which in classical logic are inessential), such as the choice between multiplicative and additive rules and the differences in the cut-reduction induced by these choices. However, for the sake of precision, we will formally define the cut-reduction used in this paper.
Definition 2.6. Cut-reduction is defined on regular proofs and consists of the proof rewrite steps shown in Figure 1 (as well as all corresponding symmetric variants), where in the contraction reduction step ρ_1 = [α\α_1]_{α ∈ EV(ψ_2)} and ρ_2 = [α\α_2]_{α ∈ EV(ψ_2)} are substitutions replacing each eigenvariable α in ψ_2 by fresh copies, i.e., α_1 and α_2 are fresh for the whole proof. We write ⇒ for the compatible (w.r.t. the inference rules), reflexive and transitive closure of the single-step reduction ↝.
The above system for cut-reduction consists of purely local, minimal steps and therefore allows the simulation of many other reduction relations. We chose to work in this system in order to obtain invariance results of maximal strength. Among the systems that can be simulated literally are, for example, all color annotations of [DJS97] in the multiplicative version of LK defined there. The real strength of the results in this paper lies however in the general applicability of the proof techniques used: the extraction of a grammar from a proof (described in the next sections) is possible in all versions of sequent calculus for classical logic and in principle also in other systems like natural deduction. In particular, our results also apply to inversion-based cut-elimination procedures such as, for example, that of [Sch77].

Figure 1: Cut-reduction steps (axiom reduction, quantifier reduction, propositional reduction, weakening reduction, binary inference permutation).

Regular and Rigid Tree Grammars
Formal language theory constitutes one of the main areas of theoretical computer science.
Traditionally, a formal language is defined to be a set of strings, but this notion generalizes in a straightforward way by considering a language to be a set of first-order terms. Such tree languages possess a rich theory and many applications, see e.g. [GS97], [CDG+07].
In this section we introduce notions and results from the theory of tree languages that we will use for our proof-theoretic purposes.
A ranked alphabet Σ is a finite set of symbols which have an associated arity (their rank). For f ∈ Σ, we sometimes use the notation f/n for saying that n is the arity of f. We write T_Σ to denote the set of all finite trees (or terms) over Σ, and we write T_Σ(X) to denote the set of all trees over Σ and a set X of variables (seen as symbols of arity 0). We also use the notion of position in a tree, which is a list of natural numbers. We write ε for the empty list (the root position), and we write p.q for the concatenation of lists p and q. We write p ≤ q if p is a prefix of q and p < q if p is a proper prefix of q. Clearly, ≤ is a partial order and < is its strict part. We write Pos(t) to denote the set of all positions in a tree t ∈ T_Σ(X). Furthermore, for a given tree or term t and position p, we write t|_p to denote the subterm of t that occurs at position p.

Definition 3.1. A regular tree grammar is a tuple G = ⟨N, Σ, θ, P⟩, where N is a finite set of non-terminal symbols, where Σ is a ranked alphabet such that N ∩ Σ = ∅, where θ ∈ N is the start symbol, and where P is a finite set of production rules of the form β → t with β ∈ N and t ∈ T_Σ(N).
The derivation relation →_G of a regular tree grammar G = ⟨N, Σ, θ, P⟩ is defined as follows. We have s →_G r if there is a production rule β → t in P and a position p ∈ Pos(s) such that s|_p = β and r is obtained from s by replacing β at p by t. The language of G is then defined as L(G) = {t ∈ T_Σ | θ →*_G t}, where →*_G is the reflexive and transitive closure of →_G. A derivation D of a term t ∈ L(G) is a sequence t_0 →_G t_1 →_G ... →_G t_n with t_0 = θ and t_n = t. Note that a term t might have different derivations in G.
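To make the derivation relation concrete, here is a small Python sketch (our illustration, not from the paper): terms are nested tuples whose first component is the function symbol, non-terminals are plain strings, and the language is enumerated by exhaustively applying productions up to a step bound.

```python
def rewrite_once(term, productions):
    """Yield all terms obtained by applying one production at one position."""
    if isinstance(term, str) and term in productions:   # a non-terminal
        for rhs in productions[term]:
            yield rhs
        return
    if isinstance(term, tuple):                          # descend into subterms
        head, *args = term
        for i, arg in enumerate(args):
            for new_arg in rewrite_once(arg, productions):
                yield (head, *args[:i], new_arg, *args[i + 1:])

def language(start, productions, max_steps=6):
    """Enumerate all ground terms derivable from `start` in <= max_steps steps."""
    frontier, result = {start}, set()
    for _ in range(max_steps):
        next_frontier = set()
        for t in frontier:
            rewritten = False
            for r in rewrite_once(t, productions):
                rewritten = True
                next_frontier.add(r)
            if not rewritten:
                result.add(t)       # no non-terminal left: t is in the language
        frontier = next_frontier
    return result

# Hypothetical regular grammar with productions θ → f(β, β), β → 0, β → s(β)
P = {"θ": [("f", "β", "β")], "β": [("0",), ("s", "β")]}
L = language("θ", P)
assert ("f", ("0",), ("0",)) in L
# A regular grammar cannot force the two copies of β to derive the same term:
assert ("f", ("0",), ("s", ("0",))) in L
```

The last assertion illustrates why regular tree grammars are too weak for our purposes; the rigidity condition introduced below is exactly what rules such terms out.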
In [JKV09] the class of rigid tree languages has been introduced, with applications in verification (e.g. of cryptographic protocols as in [JKV11]) as primary motivation. It will turn out that this class is appropriate for describing cut-elimination in classical first-order logic. In contrast to [JKV09] we do not use automata but grammars; their equivalence is shown in [Het12a].

Definition 3.2. A rigid tree grammar is a tuple ⟨N, N_R, Σ, θ, P⟩, where ⟨N, Σ, θ, P⟩ is a regular tree grammar and N_R ⊆ N is the set of rigid non-terminals. We speak of a totally rigid tree grammar if N_R = N. In this case we will just write ⟨N_R, Σ, θ, P⟩.
A derivation θ = t_0 →_G t_1 →_G ... →_G t_n = t of a rigid tree grammar G = ⟨N, N_R, Σ, θ, P⟩ is a derivation in the underlying regular tree grammar satisfying the additional rigidity condition: if there are i, j < n, a non-terminal β ∈ N_R, and positions p and q such that t_i|_p = β and t_j|_q = β, then t|_p = t|_q. The language L(G) of the rigid tree grammar G is the set of all terms t ∈ T_Σ which can be derived under the rigidity condition. For a given derivation D : θ = t_0 →_G t_1 →_G ... →_G t_n = t and a non-terminal β we say that p ∈ Pos(t) is a β-position in D if there is an i ≤ n with t_i|_p = β, i.e., either a production rule β → s has been applied at p in D, or β occurs at position p in t. In the context of a given grammar G, we sometimes write D : α →*_G t to specify that D is a derivation starting with α and ending with the term t.
Example 3.3. Let Σ = {0/0, s/1}. A simple pumping argument shows that the language L = {f(t, t) | t ∈ T_Σ} is not regular. On the other hand, L is generated by the rigid tree grammar G = ⟨{θ, α, β}, {α}, {0/0, s/1, f/2}, θ, P⟩ where P consists of the productions θ → f(α, α), α → β, β → 0, and β → s(β): the non-rigid β generates an arbitrary term of T_Σ, while the rigidity of α forces the two immediate subterms of f to be equal.

Lemma 3.4. Let G = ⟨N, N_R, Σ, θ, P⟩ be a rigid tree grammar and let t ∈ L(G). Then there is a derivation θ →_G ... →_G t which uses at most one β-production for each β ∈ N_R.
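The rigidity condition can be checked mechanically on a concrete derivation. The following Python sketch (our illustration; the term encoding and the two sample derivations are assumptions, and the productions are our reading of Example 3.3) records all positions at which a rigid non-terminal ever occurred and compares the subterms of the final term at those positions.

```python
def subterm(t, pos):
    for i in pos:
        t = t[i + 1]          # index 0 holds the symbol, children start at 1
    return t

def positions_of(term, symbol, pos=()):
    """All positions at which `symbol` occurs in `term`."""
    if term == symbol:
        yield pos
    elif isinstance(term, tuple):
        for i, arg in enumerate(term[1:]):
            yield from positions_of(arg, symbol, pos + (i,))

def rigidity_ok(derivation, rigid):
    """For each rigid non-terminal, the final term must have equal subterms
    at every position where that non-terminal ever occurred."""
    final = derivation[-1]
    for beta in rigid:
        occ = {p for t in derivation[:-1] for p in positions_of(t, beta)}
        if len({subterm(final, p) for p in occ}) > 1:
            return False
    return True

# Derivations in the grammar of Example 3.3 (θ → f(α,α), α → β, β → 0 | s(β)):
good = ["θ", ("f", "α", "α"), ("f", "β", "α"), ("f", "β", "β"),
        ("f", ("0",), "β"), ("f", ("0",), ("0",))]
bad = ["θ", ("f", "α", "α"), ("f", "β", "α"), ("f", ("s", "β"), "α"),
       ("f", ("s", ("0",)), "α"), ("f", ("s", ("0",)), "β"),
       ("f", ("s", ("0",)), ("0",))]
assert rigidity_ok(good, {"α"})      # f(0, 0): both α-positions agree
assert not rigidity_ok(bad, {"α"})   # f(s(0), 0): the α-positions disagree
```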
Proof. Given any derivation of t, suppose both β → s_1 and β → s_2 are used, at positions p_1 and p_2 respectively. Then by the rigidity condition t|_{p_1} = t|_{p_2} and we can replace the derivation at p_2 by that at p_1 (or the other way round). This transformation does not violate the rigidity condition because it only copies existing parts of the derivation.
Lemma 3.5. Let G be a totally rigid tree grammar, let β → t be the only production of G with left-hand side β, and let G' be the grammar obtained from G by deleting β → t and replacing every remaining production δ → r by δ → r[β\t]. Then L(G) = L(G').

Proof. If a G-derivation of a term s uses β, it must replace β by t, hence s is derivable using the productions of G' as well. The rigidity condition is preserved as the equality constraints of the G'-derivation are a subset of those of the G-derivation. Conversely, given a G'-derivation of a term s, we obtain a derivation of s from the productions of G by replacing applications of δ → r[β\t] by δ → r followed by a copy of β → t for each occurrence of β in r. Let γ_1, ..., γ_n be the non-terminals that appear in t. By the rigidity condition, for i ∈ {1, ..., n} there is a unique term at all γ_i-positions in the derivation. Hence β fulfills the rigidity condition as well, and we have obtained a G-derivation of s.
Notation 3.6. For a given non-terminal β and a term t, we will write β ∈ t or t ∋ β to denote that β occurs in t.
Definition 3.7. Let G be a tree grammar. A path of G is a list P of productions α_1 → t_1, ..., α_n → t_n with n ≥ 1 and α_{i+1} ∈ t_i for all i ∈ {1, ..., n−1}. The length of a path is |P| = n. We will also write P : α_1 → t_1 ∋ α_2 → t_2 ∋ ... ∋ α_n → t_n. For a given path P : α_1 → t_1 ∋ α_2 → ... ∋ α_n → t_n we say that α_1, ..., α_n are on the path P and write α_i ∈ P for that. We also write P : α_1 ↠ t_n and P : α_1 ↠ α_n if we do not want to explicitly mention the intermediate steps. For a fixed grammar G, we write α ↠ β to denote that there is a path P in G with P : α ↠ β.
For a set P of production rules, we write α <_P β (or simply α < β, when P is clear from context) if there is a production α → t in P with β ∈ t. We write <+ for the transitive closure of <, and <* for its reflexive, transitive closure. Note that α ↠ β implies α <+ β, but not the other way around, since β could be a non-terminal with no production β → s in P.

Definition 3.8. A tree grammar ⟨N, Σ, θ, P⟩ is called cyclic if α <+_P α for some α ∈ N, and acyclic otherwise.

Lemma 3.9. If G is totally rigid and acyclic, then, up to renaming of the non-terminals, G = ⟨{α_1, ..., α_n}, Σ, α_1, P⟩ with L(G) = {α_1[α_1\t_1] ⋯ [α_n\t_n] | α_i → t_i ∈ P}.
Proof. Acyclicity permits a renaming of non-terminals such that α_i <_P α_j implies i < j. Then L(G) ⊇ {α_1[α_1\t_1] ⋯ [α_n\t_n] | α_i → t_i ∈ P} is obvious. For the left-to-right inclusion, let D : α_1 = s_1 →_G ... →_G s_n = s ∈ T_Σ be a derivation in G. By Lemma 3.4 we can assume that for each j at most one production whose left-hand side is α_j is applied, say α_j → t_j. By acyclicity we can rearrange the derivation so that α_j → t_j is only applied after α_i → t_i for all i < j. For those α_j which do not appear in the derivation we can insert an arbitrary production without changing the final term, so we obtain s = α_1[α_1\t_1] ⋯ [α_n\t_n].
This lemma entails that |L(G)| ≤ ∏_{i=1}^{n} |{t | α_i → t ∈ P}|; in particular, we are dealing with a finite language. The central questions in this context are (in contrast to the standard setting in formal language theory) not concerned with representability but with the size of a representation.
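Lemma 3.9 gives a direct recipe for computing the (finite) language of a totally rigid acyclic grammar: choose one production per non-terminal and compose the substitutions in a <_P-compatible order. A Python sketch (ours; the grammar and symbol names are illustrative, not from the paper):

```python
from itertools import product

def substitute(term, nt, repl):
    """Replace every occurrence of non-terminal `nt` in `term` by `repl`."""
    if term == nt:
        return repl
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(a, nt, repl) for a in term[1:])
    return term

def totally_rigid_language(order, productions):
    """`order` lists the non-terminals α_1, ..., α_n compatibly with <_P.
    Per Lemma 3.9, L(G) = { α_1[α_1 := t_1] ... [α_n := t_n] | α_i → t_i }."""
    lang = set()
    for choice in product(*(productions[a] for a in order)):
        term = order[0]                      # the start symbol α_1
        for nt, rhs in zip(order, choice):
            term = substitute(term, nt, rhs)
        lang.add(term)
    return lang

# Hypothetical totally rigid grammar: θ → g(α, α), α → c | h(c)
P = {"θ": [("g", "α", "α")], "α": [("c",), ("h", ("c",))]}
L = totally_rigid_language(["θ", "α"], P)
assert L == {("g", ("c",), ("c",)), ("g", ("h", ("c",)), ("h", ("c",)))}
assert len(L) <= len(P["θ"]) * len(P["α"])   # the bound |L(G)| ≤ ∏_i |P_i|
```

Note that choosing one production per non-terminal makes the rigidity condition hold automatically, which is the computational content of Lemma 3.4.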

Proofs and Grammars
In this section we will relate sequent calculus proofs to rigid tree grammars. A central tool for establishing this relation is Herbrand's theorem [Her30, Bus95]. In its simplest form it states that ∃x A, for A quantifier-free, is valid iff there are terms t_1, ..., t_n such that ⋁_{i=1}^{n} A[x\t_i] is a tautology. Such tautological disjunctions of instances are hence called Herbrand-disjunctions. Such a disjunction, or equivalently the set of terms, can be considered a compact representation of a cut-free proof. The relation to tree grammars is based on the observation that a (finite) set of terms is just a (finite) tree language. While the Herbrand-disjunction of a cut-free proof will be considered a tree language, a proof with cut will give rise to a grammar, and its cut-elimination will be described by the computation of the language of its grammar.
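As a concrete illustration (our example, not from the paper), consider the valid formula ∃x (¬P(x) ∨ P(f(x))). No single instance is a tautology, but the two instances with witness terms c and f(c) already form a Herbrand-disjunction, as a brute-force truth-table check confirms:

```python
from itertools import product

# Atoms occurring in the instances A[x := c] and A[x := f(c)]
atoms = ["P(c)", "P(f(c))", "P(f(f(c)))"]

def instance(p_t, p_ft):
    """The instance ¬P(t) ∨ P(f(t)) under a given truth assignment."""
    return (not p_t) or p_ft

def disjunction(v):
    """The candidate Herbrand-disjunction, a disjunction of two instances."""
    return instance(v["P(c)"], v["P(f(c))"]) or \
           instance(v["P(f(c))"], v["P(f(f(c)))"])

assignments = [dict(zip(atoms, vals))
               for vals in product([False, True], repeat=len(atoms))]
assert all(disjunction(v) for v in assignments)     # a tautology: Herbrand-disjunction
# a single instance does not suffice:
assert not all(instance(v["P(c)"], v["P(f(c))"]) for v in assignments)
```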
There are different options for extending Herbrand's theorem to non-prenex formulas, e.g. the Herbrand proofs of [Bus95] or the expansion trees of [Mil87].For our purposes it will be most useful to follow the approach of [BL94].
Definition 4.1. Let π be a proof and let O be a formula occurrence in π. Then we define the Herbrand-set H(O) of O inductively as follows:
• If O is the occurrence of a formula A in an axiom, then H(O) = {A}.
• If O is in the conclusion sequent of an inference rule without being its main occurrence, then O has exactly one ancestor O' in one of the premises, and we let H(O) = H(O').
• If O is the main occurrence in the conclusion of a ∘-rule with ∘ ∈ {∧, ∨} and with auxiliary occurrences O_1 and O_2, then H(O) = {A ∘ B | A ∈ H(O_1), B ∈ H(O_2)}.
• If O is the main occurrence in the conclusion of a ∀- or ∃-rule with auxiliary occurrence O' in the premise, then H(O) = H(O').
• If O is the main occurrence in the conclusion of a w-rule, then H(O) = {⊥}.
• If O is the main occurrence in the conclusion of a c-rule with auxiliary occurrences O_1 and O_2 in the premise, then H(O) = H(O_1) ∪ H(O_2).
Finally, we define H(π) = ⋃_P H(P), where Γ is the end-sequent of π and P ranges over all formula occurrences in Γ.
In addition to the Herbrand-set of a formula occurrence, we also need the set of terms associated with an occurrence of an ∃-formula.

Definition 4.2. Let Q be an occurrence of a formula ∃x A in a proof. We define the set tm(Q) of terms associated with Q as follows: if Q is introduced as the main formula of a weakening, then tm(Q) = ∅. If Q is introduced by an ∃-rule concluding Γ, ∃x A from the premise Γ, A[x\t], then tm(Q) = {t}. If Q is the main formula in the conclusion of a contraction, and Q_1 and Q_2 are the two auxiliary occurrences of the same formula in the premise, then tm(Q) = tm(Q_1) ∪ tm(Q_2). In all other cases, an inference with the occurrence Q in the conclusion has a corresponding occurrence Q' of the same formula in one of its premises, and we let tm(Q) = tm(Q').
In the following, we will restrict our attention to a certain class of proofs, which we call simple proofs.

Definition 4.3. A proof π is called simple if
• it is regular (i.e., different ∀-inferences have different eigenvariables), and
• every cut in π either has a quantifier-free cut formula B, or has a cut formula ∀x B (resp. ∃x B) with B quantifier-free, where the ∀-inference introducing the cut formula is applied directly above the cut.
Let us make some remarks on this definition. First, we require regularity, which is a necessary assumption in the context of cut-elimination. But since every proof can be trivially transformed into a regular one, this is no real restriction. Second, the requirement that the ∀-rule be applied directly above the cut is natural, as the rule is invertible. Moreover, any proof which does not fulfill this requirement can be pruned to obtain one that does, by simply permuting ∀-inferences down and identifying their eigenvariables when needed. Thus, the only significant restriction is that of disallowing quantifier alternations in the cut formulas. This corresponds to allowing only Σ_1 (or Π_1) formulas in cuts.
We conjecture that our central result can be extended to Σ_n-cuts. However, this will first require the development of an adequate class of grammars (see also Section 8).
Observation 4.4. Simple proofs have the technically convenient property of exhibiting a one-to-one relationship between cut-eigenvariables and cuts. For an eigenvariable α ∈ EV_c(π) we will therefore write cut_α for the corresponding cut and ∀_α for the inference introducing α (when read from bottom to top).

Definition 4.5. Let π be a simple proof, let α ∈ EV_c(π), and let Q be the occurrence of the existentially quantified cut-formula in the premise of cut_α. Then we write B(α) for the set of substitutions {[α\t] | t ∈ tm(Q)} and we define B(π) = ⋃_{α ∈ EV_c(π)} B(α).
Structures similar to the above B(π) have also been investigated in [Hei10] and [McK13], where they form the basis of proof-net-like formalisms using local reductions for quantifiers in classical first-order logic. Our aim in this work is however quite different: we use these structures for a global analysis of the sequent calculus.
Definition 4.6. Let π be a simple proof. Then the grammar of π is the totally rigid grammar G(π) = ⟨N_R, Σ, θ, P⟩ with N_R = EV_c(π) ∪ {θ}, Σ = Σ(π) ∪ {∧/2, ∨/2, ⊤/0, ⊥/0}, and P = {θ → F | F ∈ H(π)} ∪ {α → t | [α\t] ∈ B(π)}, where Σ(π) is the signature of π,¹ the rank of ∧ and ∨ is 2, the rank of ⊤ and ⊥ is 0, and θ does not occur in π.
Lemma 4.7. If π is a simple proof, then G(π) is acyclic.

Proof. By induction on the number of cuts in π. The grammar of a cut-free proof is trivially acyclic. For the induction step, let r be the lowest binary inference with subproofs π_1 and π_2 such that either (i) r is a cut or (ii) r is not a cut but both π_1 and π_2 contain at least one cut. Let P, P_1, and P_2 be the sets of productions induced by the cuts in π, π_1, π_2, respectively. In case (ii), <_P = <_{P_1} ∪ <_{P_2}, which is acyclic by induction hypothesis (since EV_c(π_1) ∩ EV_c(π_2) = ∅). In case (i), let P_r be the productions induced by the cut r; then <_P = <_{P_1} ∪ <_{P_2} ∪ <_{P_r}. By induction hypothesis, <_{P_1} and <_{P_2} are acyclic, and as the cut-formula in r contains at most one quantifier, <_{P_r} is acyclic as well. Therefore, a cycle in <_P would have to be of the form α_1 <_{P_1} β_1 <_{P_r} α_2 <_{P_2} β_2 <_{P_r} α_1 where α_1, β_1 ∈ EV_c(π_1) and α_2, β_2 ∈ EV_c(π_2). However, r contains only one quantifier, and depending on its polarity all productions in P_r lead from π_1 to π_2 or from π_2 to π_1, but not both, so <_P is acyclic.

Grammars and Cut-Elimination
In this section we will show that the language of the grammar of a proof defined in the previous section is an invariant under cut-elimination. Before formulating this invariance result precisely we have to consider the following three aspects of the situation. First, note that all the reductions shown in Figure 1 preserve simplicity, except the following:

¹ We consider the eigenvariables in EV(π)\EV_c(π) to be part of Σ(π).
where cut_α is permuted down under cut_β (using the bottommost reduction in Fig. 1) and the cut formula of cut_β has its ancestor on the right side of cut_α. So in the following, when we speak about a reduction sequence of simple proofs, we require that the above reduction is immediately followed by permuting ∀_α down as well, in order to arrive at a simple proof again.

Secondly, observe that there is no mechanism for deletion in the grammar, but there is one in cut-elimination: the reduction of weakening, which erases a sub-proof (see Fig. 1). It is hence natural, and will turn out to be useful, to also consider the reduction relation without this step.
Definition 5.1. We define the non-erasing cut-reduction ⇒_ne as ⇒ without the reduction rule for weakening.
Note that a ⇒_ne-normal form π is an analytic proof too, as H(π) is also a Herbrand-disjunction, i.e. a tautological collection of instances. In contrast to a ⇒-normal form (which might contain implicit redundancy), a ⇒_ne-normal form might also contain explicit redundancy in the form of cuts whose cut-formulas are introduced by weakening on one or on both sides. Non-erasing reduction is also of interest in the context of the λ-calculus, where it is often considered in the form of the λI-calculus and gives rise to the conservation theorem (see Theorem 13.4.12 in [Bar84]). Our situation here is however quite different: neither ⇒ nor ⇒_ne is confluent, and neither of them is strongly normalizing.

Thirdly, in contrast to the case treated in [HS12], in our more general setting it may happen that the reduction of a weakening deletes sub-formulas of formula instances from the proof. In order to treat this situation adequately, we need to define a generalization of the ⊆-relation between sets of formulas. For this reason, we use the symbol ⊥ for representing subformulas introduced by weakening, a technique also employed in [BHW12, Wel11] for the purpose of a tighter complexity-analysis.
Definition 5.2. The relation ≤ is defined inductively on quantifier-free formulas as follows:
• for all formulas A we have ⊥ ≤ A and A ≤ A, and
• whenever A' ≤ A and B' ≤ B, then also A' ∧ B' ≤ A ∧ B and A' ∨ B' ≤ A ∨ B.
Let A and B be sets of quantifier-free formulas. Then we define A ≤ B iff for all A ∈ A there is a B ∈ B with A ≤ B.

Fact 5.3. The relation ≤ is transitive on formula sets.
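Definition 5.2 is easy to implement. The following Python sketch (ours, not from the paper) encodes quantifier-free formulas as nested tuples, with ⊥ represented by the string "bot":

```python
def leq(a, b):
    """The ≤ of Definition 5.2: ⊥ ≤ A, A ≤ A, and closure under ∧ and ∨.
    Formulas: ("and", l, r), ("or", l, r), "bot", or atom strings."""
    if a == "bot" or a == b:
        return True
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and a[0] in ("and", "or")):
        return leq(a[1], b[1]) and leq(a[2], b[2])
    return False

def set_leq(A, B):
    """A ≤ B iff every formula in A is ≤ some formula in B."""
    return all(any(leq(a, b) for b in B) for a in A)

assert leq(("and", "bot", "P"), ("and", "Q", "P"))   # ⊥ ∧ P ≤ Q ∧ P
assert not leq(("or", "P", "Q"), ("or", "Q", "P"))   # ≤ does not permute disjuncts
assert set_leq([("and", "bot", "P")], [("and", "Q", "P"), "R"])
```

Transitivity of ≤ on formula sets (Fact 5.3) follows by composing the per-formula witnesses.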
We are now in a position to precisely state our main invariance lemma which connects grammars with cut-elimination for weak sequents.
Lemma 5.4. If π ⇒ π' is a reduction sequence of simple proofs of a weak sequent, then L(G(π)) ≥ L(G(π')). If π ⇒_ne π' is a reduction sequence of simple proofs of a weak sequent, then L(G(π)) = L(G(π')).
The rest of this section is devoted to proving this result. The proof strategy is to carry out an induction on the length of the reduction sequence π ⇒ π' (or π ⇒_ne π', respectively) and to make a case distinction on the type of reduction step. The most difficult step will turn out to be the reduction of contraction, which duplicates a sub-proof.
Lemma 5.5. Let π be a simple proof, and let π' be obtained from π by a single application of an axiom reduction, a propositional reduction, or a unary or binary inference permutation (see Figure 1). Then L(G(π')) = L(G(π)).

Proof. None of these reductions changes the grammar of the proof, i.e., G(π') = G(π) and therefore also L(G(π')) = L(G(π)).
Lemma 5.6. Let π be a simple proof, and let π' be obtained from π by a single application of a quantifier reduction (see Figure 1). Then L(G(π')) = L(G(π)).

Proof. Let α be the eigenvariable of the ∀-inference and t the term of the ∃-rule directly above the cut that is reduced. Then G(π') can be obtained from G(π) by removing the production rule α → t and by applying the substitution [α\t] to the right-hand side of all remaining production rules. Thus, L(G(π')) = L(G(π)) follows immediately from Lemma 3.5.
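The grammar transformation used in this proof (delete the production α → t and substitute t for α in all remaining right-hand sides, as in Lemma 3.5) can be written down directly. A Python sketch (ours, with an illustrative grammar; term encoding as in the earlier sketches):

```python
def substitute(term, nt, repl):
    """Replace every occurrence of non-terminal `nt` in `term` by `repl`."""
    if term == nt:
        return repl
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(a, nt, repl) for a in term[1:])
    return term

def eliminate(productions, alpha):
    """Remove the unique production α → t and apply [α := t] to all
    remaining right-hand sides (the quantifier-reduction step on grammars)."""
    (t,) = productions[alpha]        # α must have exactly one production
    return {beta: [substitute(r, alpha, t) for r in rhss]
            for beta, rhss in productions.items() if beta != alpha}

# Hypothetical grammar: θ → f(α, α), α → s(β), β → 0
P = {"θ": [("f", "α", "α")], "α": [("s", "β")], "β": [("0",)]}
P2 = eliminate(P, "α")
assert P2 == {"θ": [("f", ("s", "β"), ("s", "β"))], "β": [("0",)]}
```

By Lemma 3.5 the language is unchanged by this transformation, which is exactly what the proof of Lemma 5.6 exploits.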
Lemma 5.7. Let π be a simple proof, and let π' be obtained from π by a single application of a weakening reduction (see Figure 1). Then L(G(π')) ≤ L(G(π)).

Proof. The grammar G(π') is obtained from G(π) via two modifications. First, all productions coming from cuts or ∃-inferences in ψ_2 are deleted, and second, the formulas in ∆ which are ancestors of the end-sequent are replaced by ⊥ in H(π'). Now let A ∈ L(G(π')). Then the derivation of A in G(π') is also a derivation in G(π), with the difference that some ⊥-subformulas are replaced by other formulas, yielding a formula B ∈ L(G(π)) with B ≥ A. Hence L(G(π')) ≤ L(G(π)).
It remains to analyze the case of contraction. Surprisingly, also in this case the language of the grammar of a proof remains unchanged. However, the proof of this result is quite technical and requires additional auxiliary results about the relationship between proofs and grammars. Furthermore, this is the case which needs the additional condition that the end-sequent of our proof is weak, i.e., does not contain ∀-quantifiers.
For simplifying the presentation, we assume in the following (without loss of generality) that the ∀-side is on the right of a cut and the ∃-side on the left. Then a production β → t in G(π) corresponds to three inferences in π: a cut, an instance of the ∀-rule, and an instance of the ∃-rule, which we denote by cut_β, ∀_β, and ∃_t, respectively, and which are, in general, arranged in π as shown below.
The additional condition that ∀_β is directly above cut_β, as indicated in (4.1), is needed because in the following we make extensive use of Observation 4.4: there is a one-to-one correspondence between the cuts and the eigenvariables in EV_c(π), and thus the notation cut_β makes sense.
Definition 5.8. We say that the instances cut_β, ∀_β, and ∃_t are on a path P in G(π) if the production β → t is in P.
Definition 5.9. Consider rule instances r_1, r_2, and r_3 in a proof, where r_3 is a branching rule, r_1 occurs in the left subproof above r_3, and r_2 occurs in the right subproof above r_3 (r_1 and r_2 might or might not be branching). Then we say that r_1 is on the left above r_3, denoted by r_1 è r_3, that r_2 is on the right above r_3, denoted by r_3 é r_2, and that r_1 and r_2 are in parallel, denoted by r_1 èé r_2.
Lemma 5.10. Let π be a simple proof and let P : α_1 → t_1 ∋ α_2 → ... ∋ α_n → t_n be a path in G(π). Then there is a k ∈ {1, ..., n} such that cut_{α_k} is lowermost among all inferences on P. Furthermore, ∀_{α_1} is on the right above cut_{α_k} and ∃_{t_n} is on the left above cut_{α_k}.
Proof. We proceed by induction on n. If n = 1, then k = 1. For the induction step, consider a path of length n + 1 and let l be the index given by the induction hypothesis for its initial segment of length n: either cut_{α_{n+1}} lies below cut_{α_l} or it does not. In the first case we let k = n + 1 and in the second we let k = l. In both cases cut_{α_k} has the desired properties.
Lemma 5.12. Let G(π) = ⟨N_R, Σ, φ, P⟩ be the grammar of a simple proof π such that there are two paths P_1 : β → t ∋ γ_0 ↠ γ_n = α and P_2 : β → t ∋ δ_0 ↠ δ_m = α, where γ_0 and δ_0 occur at two different positions in t. Then we have one of the following two cases: (1) we have γ_i = δ_j for some 0 ≤ i < n and 0 ≤ j < m, or (2) for all 0 ≤ i < n and 0 ≤ j < m we have cut_α é cut_{γ_i} and cut_α é cut_{δ_j}.
Proof. Note that because of the acyclicity of G(π) we have β ≠ γ_i for all i ≤ n and β ≠ δ_j for all j ≤ m, in particular β ≠ α. Assume, for the moment, that m, n > 0; the case of one of them being zero will be treated at the very end of the proof. Then γ_0 ≠ α and δ_0 ≠ α. If γ_0 = δ_0, we have case 1. So assume also γ_0 ≠ δ_0. As β → t is a production in G(π), the proof π contains a formula which contains both γ_0 and δ_0, hence ∀_{γ_0} and ∀_{δ_0} are not parallel. Since we have cut_{γ_0} é ∀_{γ_0} and cut_{δ_0} é ∀_{δ_0}, we also have that cut_{γ_0} and cut_{δ_0} are not parallel. Without loss of generality, assume that cut_{δ_0} is below cut_{γ_0}. Then cut_{δ_0} é cut_{γ_0} (since cut_{γ_0} è cut_{δ_0} would entail ∀_{γ_0} èé ∀_{δ_0}). Since we have δ_0 ↠ α, we can apply Lemma 5.11, giving us three possibilities:
• If cut_α è cut_{δ_0}, then by Lemma 5.10 applied to the path γ_0 ↠ s_n, cut_{δ_0} must coincide with cut_{γ_i} for some 0 ≤ i < n (since π is a tree), so δ_0 = γ_i (by Observation 4.4), and we are in case 1.
• If cut_α é cut_{δ_0}, then by Lemma 5.10 applied to the paths γ_0 ↠ s_n and δ_0 ↠ r_m we know that cut_α = cut_{γ_k} = cut_{δ_l} for some 0 ≤ k ≤ n and 0 ≤ l ≤ m, hence γ_k = α = δ_l. Furthermore, k = n and l = m by the acyclicity of G(π) and the assumption γ_n = α = δ_m. Now consider any γ_i with 0 ≤ i < n. Since γ_i ↠ α, we can apply Lemma 5.11 and get either cut_α è cut_{γ_i} or cut_α é cut_{γ_i} or cut_α èé cut_{γ_i}. Since by Lemma 5.10 cut_{γ_i} must be above cut_α, we conclude cut_α é cut_{γ_i}. With the same reasoning we can conclude that cut_α é cut_{δ_j} for all 0 ≤ j < m. We are therefore in case 2.
• If cut_α èé cut_{δ_0}, then by Lemma 5.10 applied to the paths γ_0 → ... → s_n and δ_0 → ... → r_m, the corresponding branching rule r coincides with cut_{γ_i} and cut_{δ_j} for some 0 < i < n and 0 < j < m, therefore γ_i = δ_j (by Observation 4.4), and we are in case 1.
It remains to treat the case n = 0 or m = 0.
If m " n " 0 then we are trivially in case 2 (there is no 0 ď i ă n or 0 ď j ă m).If n " 0 and m ą 0, we can apply Lemma 5.10 to the path δ 0 Ñ . . .Ñ r m and obtain an l P t0, . . ., mu such that we are in the situation . . .But by the same argument as at the beginning of the proof, we also have that @ α and @ δ 0 cannot be in parallel (α and δ 0 both appear in t), and therefore either cut δ 0 é cut α or cut α é cut δ 0 .Since δ 0 α, the only possibility is cut α é cut δ 0 , by Lemma 5.11.Thus cut α " cut δ l , and therefore l " m and we are in case 2. The case m " 0 and n ą 0 is similar.
We have now collected all the necessary tools for describing the reduction step for contraction.
Lemma 5.13. Let π be a simple proof of a weak sequent such that π contains a subproof ψ of the form shown on the left-hand side of the contraction reduction in Figure 1, and let π' be the proof obtained from π by replacing ψ by the proof ψ' on the right-hand side of that reduction, where ρ_1 = [α\α_1]_{α ∈ EV(ψ_2)} and ρ_2 = [α\α_2]_{α ∈ EV(ψ_2)} are substitutions that replace all eigenvariables in ψ_2 by fresh copies. Then L(G(π')) = L(G(π)).
Proof. Let us first show L(G(π)) ⊆ L(G(π')). Write P for the productions of G(π) and P' for those of G(π'). Let F ∈ L(G(π)) and let D be its derivation. If the duplicated cut is quantifier-free, then P' = Pρ_1 ∪ Pρ_2, since the substitutions ρ_1 and ρ_2 do not affect the eigenvariables outside ψ_2. Hence Dρ_1 (as well as Dρ_2) is a derivation of F in G(π'). If the duplicated cut contains a quantifier, let α be its eigenvariable, let t_1, ..., t_k be its terms coming from the left copy of A and t_{k+1}, ..., t_n those from the right copy of A, and let Q = {α → t_1, ..., α → t_n} ⊆ P. If D does not contain α, then Dρ_1 (as well as Dρ_2) is a derivation of F in G(π'). If D does contain α, then by Lemma 3.4 we can assume that it uses only one α-production, say α → t_i. If 1 ≤ i ≤ k, then Dρ_1 is a derivation of F in G(π'), and if k < i ≤ n, then Dρ_2 is a derivation of F in G(π').
Let us now show L(G(π')) ⊆ L(G(π)). Let F be a formula in L(G(π')), and let D' be a derivation of F in G(π'). We construct D = D'(ρ_1)⁻¹(ρ_2)⁻¹ by "undoing" the renaming of the variables in ψ_2. Then D is a derivation for F, using the production rules of G(π), but possibly violating the rigidity condition.
First, recall that EV_c(π) = EV(π) (by Fact 2.4, as the end-sequent is weak) and observe that only non-terminals α ∈ EV(ψ_2) can violate the rigidity condition in D: if β ∉ EV(ψ_2) violated the rigidity condition, then there would be β-positions p_1, p_2 in D with F|_{p_1} ≠ F|_{p_2}, and as βρ_1ρ_2 = β the positions p_1, p_2 are also β-positions in D', so they would violate the rigidity condition in D', contradicting the fact that D' is a G(π')-derivation. Now define for each α ∈ EV(ψ_2) the value n(D, α) to be the number of pairs (p_1, p_2) ∈ Pos(F) × Pos(F) where p_1 and p_2 are α-positions in D with p_1 ≠ p_2 and F|_{p_1} ≠ F|_{p_2}, and define n(D) = Σ_{α ∈ EV(ψ_2)} n(D, α). We proceed by induction on n(D) to show that D can be transformed into a derivation which no longer violates rigidity. If n(D) = 0, then D obeys the rigidity condition and we are done. Otherwise there is at least one α ∈ EV(ψ_2) with n(D, α) > 0. We now pick one such α which is minimal with respect to <* (which exists since G(π) is acyclic). Let p_1 and p_2 be α-positions in D with p_1 ≠ p_2 and F|_{p_1} ≠ F|_{p_2}, let p be the maximal common prefix of p_1 and p_2, and let q be the maximal prefix of p where a production rule has been applied in D. Due to the tree structure of F, the position q is uniquely defined, q is a β-position for some non-terminal β, some production rule β → t has been applied at position q in D, and we have two paths P_1 : β → t ∋ γ_0 ↠ α and P_2 : β → t ∋ δ_0 ↠ α, where γ_0 and δ_0 occur at two different positions in t. Thus, we can apply Lemma 5.12, giving us the following two cases:
• We have γ_i = δ_j for some 0 ≤ i < n and 0 ≤ j < m. Say η = γ_i = δ_j, and let p_γ and p_δ be the positions of γ_i and δ_j (respectively) in D.
Since $\eta <^+ \alpha$, we know that $\eta$ does not violate the rigidity condition (we chose $\alpha$ to be minimal), and therefore $F|_{p_\gamma} = F|_{p_\delta} = F'$. Let $D_\gamma : \gamma_i \to_{G(\pi)} F'$ and $D_\delta : \delta_j \to_{G(\pi)} F'$ be the two subderivations of $D$ starting at positions $p_\gamma$ and $p_\delta$, respectively. Without loss of generality, we can assume that $n(D_\gamma) \le n(D_\delta)$.
Then let $\tilde D$ be the derivation obtained from $D$ by replacing $D_\delta$ by $D_\gamma$. Then $\tilde D$ is still a derivation for $F$, but $n(\tilde D) < n(D)$.

‚ For all $0 \le i < n$ and $0 \le j < m$ we have $\mathrm{cut}_\alpha \sim \mathrm{cut}_{\gamma_i}$ and $\mathrm{cut}_\alpha \sim \mathrm{cut}_{\delta_j}$. So all inferences of the path $\gamma_0 \to \dots \to \gamma_{n-1}$, as well as all inferences of $\delta_0 \to \dots \to \delta_{m-1}$, are in $\psi_2$. Therefore all variables of these paths are in $EV(\psi_2)$. As $\alpha$ violates rigidity in $D$, one of $p_1, p_2$ must be an $\alpha_1$-position and the other an $\alpha_2$-position in $D'$, because $D'$ does satisfy the rigidity condition. Without loss of generality, we can assume that $p_1$ is the $\alpha_1$-position and $p_2$ the $\alpha_2$-position. As the paths are contained completely in $\psi_2$, we have $\gamma_0 \in EV(\psi_2)\rho_1$ and $\delta_0 \in EV(\psi_2)\rho_2$, which is a contradiction, as no term can contain both a variable from $EV(\psi_2)\rho_1$ and one from $EV(\psi_2)\rho_2$.
Proof of Lemma 5.4. By induction on the length of the reduction $\pi \leadsto \pi'$ or $\pi \leadsto_{ne} \pi'$, respectively, using one of Lemmas 5.5, 5.6, 5.7, or 5.13, depending on the current reduction step.

Skolemization and Deskolemization
In this section we describe results that allow us to extend the above invariance lemma to proofs of arbitrary end-sequents (including $\forall$-quantifiers). Carrying out the above argument directly for arbitrary end-sequents would require dealing with variable names on the level of the grammar, in order to describe the changes of the eigenvariables of the $\forall$-quantifiers in the end-sequent. This can be avoided completely by skolemizing proofs, thus reducing the general case to that of weak sequents, and then translating the results back by deskolemization. Skolemization and deskolemization are simple operations on the level of Herbrand-disjunctions or expansion trees [Mil87], and their use in this context suffices for our purposes. In contrast, they have surprising complexity effects on the level of proofs, see e.g. [BHW12]. The reason why this transfer is possible is that the form of the end-sequent, and in particular the question whether it contains universal quantifiers, does not affect the dynamics of cut-elimination. This observation has been well known for a long time: it is apparent already in Gentzen's consistency proof for Peano Arithmetic [Gen38], which is carried out on a (hypothetical) proof of the empty sequent, as well as in the proof of the second $\varepsilon$-theorem from the first $\varepsilon$-theorem by deskolemization [HB39].
Let us now first define the notion of Herbrand-disjunction precisely. We assume w.l.o.g. that in a formula every variable is bound by at most one quantifier.

Definition 6.1. For a given formula $F$, we write $\bar F$ for the formula obtained from $F$ by removing all quantifiers. Now let $x_1, \dots, x_n$ be the existentially bound variables in $F$, and let $y_1, \dots, y_m$ be the universally bound variables in $F$. Then any formula of the shape $F'[x_1\backslash t_1, \dots, x_n\backslash t_n, y_1\backslash\alpha_1, \dots, y_m\backslash\alpha_m]$, where $F'$ is an arbitrary formula with $F' \le \bar F$, where $t_1, \dots, t_n$ are arbitrary terms, and where $\alpha_1, \dots, \alpha_m$ are fresh variables, is called an instance of $F$. If $\Gamma$ is a sequent, we say that a set $\mathcal{I}$ of formulas is a set of instances of $\Gamma$ if for every $I \in \mathcal{I}$ there is an $F \in \Gamma$ s.t. $I$ is an instance of $F$.
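The two syntactic operations of Definition 6.1, removing quantifiers to obtain $\bar F$ and substituting terms for the bound variables, can be sketched on a simple term representation. The tuple encoding below (and the atom and symbol names in the example) is our own illustration and not the paper's notation; it assumes, as the text does, that every variable is bound by at most one quantifier.

```python
# A minimal sketch of F-bar and instance formation (Definition 6.1).
# Formulas are nested tuples: ('exists', x, G), ('forall', y, G),
# ('or', A, B), ('and', A, B), or an atom such as ('P', t1, ..., tn).
# Terms are strings (variables/constants) or tuples (f, t1, ..., tn).

def strip_quantifiers(f):
    """F-bar: remove all quantifiers, keeping the quantifier-free matrix."""
    if isinstance(f, tuple) and f[0] in ('exists', 'forall'):
        return strip_quantifiers(f[2])
    if isinstance(f, tuple):
        return (f[0],) + tuple(strip_quantifiers(g) for g in f[1:])
    return f

def substitute(f, sigma):
    """Apply the substitution sigma (a dict mapping variables to terms)."""
    if isinstance(f, tuple):
        return (f[0],) + tuple(substitute(g, sigma) for g in f[1:])
    return sigma.get(f, f)

# An instance of  Exists x Forall y (P(x) or Q(y)):  substitute the term
# f(c) for the existential x and a fresh variable alpha for the universal y.
F = ('exists', 'x', ('forall', 'y', ('or', ('P', 'x'), ('Q', 'y'))))
instance = substitute(strip_quantifiers(F), {'x': ('f', 'c'), 'y': 'alpha'})
```

Here `instance` is the tuple encoding of $P(f(c)) \vee Q(\alpha)$, i.e. an instance of $F$ in the sense of the definition (with $F' = \bar F$).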
Often we will work in the context of a proof $\pi$ of a sequent $\Gamma$ and consider the instances of the formulas in $\Gamma$ that are induced by $\pi$. Then the above fresh variables $\alpha_1, \dots, \alpha_m$ will be eigenvariables of the proof, and their occurrences in terms will be restricted by an acyclicity condition, see below.
Let Γ " F 1 , . . ., F n be a sequent, let I be a set of instances of Γ, let m i be the number of quantifiers in F i , and let l i be the number of instances of F i in I .If we impose an arbitrary linear ordering on the instances of F i in I , then a tuple xi, j, ky for 1 ď i ď n and 1 ď j ď m i and 1 ď k ď l i uniquely identifies the term which is substituted for the quantifier Qx j in the k-th instance of the formula F i .We will write t i,j,k for this term (which could just be an eigenvariable if Qx j happens to be an @-quantifier).The k-th instance of F i can hence be written as F i,k rx 1 zt i,1,k , . . ., x m i zt i,m i ,k s, where x 1 , . . ., x m i are the bound variables in F i , and F i,k is some formula with F i,k ď Fi .Such a tuple xi, j, ky is called existential position if x j is bound existentially in F i , and universal position if x j is bound universally in F i .
A position $\langle i_1, j_1, k_1\rangle$ is said to dominate another position $\langle i_2, j_2, k_2\rangle$ if $i_1 = i_2$, $k_1 = k_2$, and the quantifier $Qx_{j_2}$ is in the scope of the quantifier $Qx_{j_1}$ in $F_i$. A set $\mathcal{I}$ of instances induces a relation $<$ on its existential positions as follows: $\langle i_1, j_1, k_1\rangle < \langle i_2, j_2, k_2\rangle$ if there is a universal position $\langle i_3, j_3, k_3\rangle$ such that the term $t_{i_2,j_2,k_2}$ contains a variable $\alpha$ with $\alpha = t_{i_3,j_3,k_3}$ and $\langle i_1, j_1, k_1\rangle$ dominates $\langle i_3, j_3, k_3\rangle$. Furthermore, we define the dependency relation $\ll$ on the existential positions of $\mathcal{I}$ as the transitive closure of $<$.

Remark 6.2. A proof $\pi$ with the property that $H(\pi) = \mathcal{I}$ is sometimes called a sequentialization of $\mathcal{I}$. If $\mathcal{I}$ has positions $\langle i_1, j_1, k_1\rangle$ and $\langle i_2, j_2, k_2\rangle$ with $\langle i_1, j_1, k_1\rangle < \langle i_2, j_2, k_2\rangle$, then in each sequentialization of $\mathcal{I}$ the inference corresponding to $\langle i_1, j_1, k_1\rangle$ is below that of $\langle i_2, j_2, k_2\rangle$. In the literature on proof nets, relations like $<$ are known as jumps.
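Since a set of instances has only finitely many existential positions, the dependency relation $\ll$ is just the transitive closure of the finite relation $<$. As a small self-contained sketch (positions encoded as arbitrary hashable values; the fixed-point iteration is a standard technique, not specific to this paper):

```python
def transitive_closure(pairs):
    """Transitive closure of a finite binary relation, as used to obtain
    the dependency relation << from <. Iterates composition with itself
    until no new pair appears (a simple fixed-point computation)."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new
```

For example, positions could be triples `(i, j, k)`; on the toy relation `{(1, 2), (2, 3)}` the closure adds exactly the pair `(1, 3)`. Checking that the closure contains no pair `(p, p)` is then the acyclicity test on $\ll$.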
Note that for a weak sequent $\Gamma$, the induced dependency ordering $\ll$ is empty and hence trivially acyclic. The Herbrand-disjunctions of weak sequents are therefore exactly the tautologies of instances.

Definition 6.6. Let $F[\forall y\, G]$ be a formula containing a universal quantifier and let $\exists x_1, \dots, \exists x_n$ be the existential quantifiers in whose scope $\forall y$ is. Then define the Skolemization of this universal quantifier as $sk_1(F[\forall y\, G]) = F[G[y\backslash g(x_1, \dots, x_n)]]$, where $g$ is a fresh $n$-ary function symbol, called a Skolem function symbol. The term $g(x_1, \dots, x_n)$ is called a Skolem-term. For a formula $F$, define its Skolemization $sk(F)$ as the iteration of $sk_1$ until no universal quantifier is left, such that no Skolem function symbol is used for two different universal quantifiers in $F$. For a sequent $\Gamma = F_1, \dots, F_n$, define its Skolemization $sk(\Gamma) = sk(F_1), \dots, sk(F_n)$, where no Skolem function symbol is used for two different universal quantifiers in $\Gamma$.

Remark 6.7. Sometimes the above operation on formulas is also called Herbrandization. We prefer the name Skolemization due to the simple duality between the satisfiability-preserving replacement of existential quantifiers and the validity-preserving replacement of universal quantifiers by new function symbols. There is no danger of confusion as, in the proof-theoretic context of this work, we are clearly dealing with validity only. This use of terminology is due to [HB39], see in particular Section 3.5.a.
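The iteration of $sk_1$ in Definition 6.6 can be sketched as a single outside-in traversal that records the existential variables currently in scope and replaces each universal quantifier it meets. The tuple encoding and the fresh-symbol naming scheme `g0, g1, ...` below are our own illustrative assumptions; the sketch also relies on the standing convention that every variable is bound by at most one quantifier, so the naive substitution cannot capture.

```python
# A sketch of the validity-preserving Skolemization sk of Definition 6.6.
# Formulas are nested tuples: ('exists', x, G), ('forall', y, G),
# connectives like ('or', A, B), or atoms ('P', t1, ..., tn).

from itertools import count

def skolemize(formula):
    fresh = count()  # one fresh Skolem symbol per universal quantifier

    def subst(f, var, term):
        """Replace every occurrence of the variable `var` in `f` by `term`."""
        if f == var:
            return term
        if isinstance(f, tuple):
            return tuple(subst(g, var, term) for g in f)
        return f

    def sk(f, ex_vars):
        if not isinstance(f, tuple):
            return f
        tag = f[0]
        if tag == 'exists':          # keep the quantifier, remember its variable
            _, x, g = f
            return ('exists', x, sk(g, ex_vars + [x]))
        if tag == 'forall':          # replace y by the Skolem-term g_i(x1,...,xn)
            _, y, g = f
            sk_term = ('g%d' % next(fresh),) + tuple(ex_vars)
            return sk(subst(g, y, sk_term), ex_vars)
        return tuple([tag] + [sk(g, ex_vars) for g in f[1:]])

    return sk(formula, [])
```

For instance, $\exists x \forall y\, P(x, y)$ becomes $\exists x\, P(x, g_0(x))$: the existential quantifier is kept, and $y$ is replaced by a Skolem-term over the existential variables in whose scope $\forall y$ occurred.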
The above side condition on the choice of Skolem function symbols results in a 1-1 mapping between the universal quantifiers in the sequent we skolemize and the Skolem function symbols. It could be made formally more precise by equipping the sk-operation with such a bijection as a second argument. However, for the sake of notational simplicity, we refrain from doing so here.
The Skolemization of formulas and sequents can be extended to a Skolemization of proofs. When skolemizing a proof, all universal quantifiers in the end-sequent are removed and their variables are replaced by Skolem-terms. In contrast, the cut-formulas remain unchanged. More precisely:

Definition 6.8. Let $\pi$ be a proof of a sequent $\Gamma$, and let $y_1, \dots, y_n$ be the variables that are bound by a $\forall$-quantifier in $\Gamma$. Furthermore, for each $y_i$ let $\alpha_{i,1}, \dots, \alpha_{i,h_i}$ be the eigenvariables introduced in $\pi$ by a $\forall$-rule whose main formula is of the shape $\forall y_i\, A$. Then the Skolemization of the proof $\pi$, denoted by $sk(\pi)$, is the proof with end-sequent $sk(\Gamma)$ that is obtained from $\pi$ by (1) removing all $\forall$-quantifiers binding one of $y_1, \dots, y_n$ everywhere, (2) replacing each occurrence of $y_i$ (for $i \in \{1, \dots, n\}$) and $\alpha_{i,j}$ (for $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, h_i\}$) by the corresponding Skolem-term (this term is in each case uniquely determined if we proceed from the end-sequent of $\pi$ upwards to the axioms and demand that each rule application remains valid or, in the case of the $\forall$-rule, becomes void, i.e., premise and conclusion coincide), and (3) removing the void rule instances. Note that $sk(\pi)$ can still contain $\forall$-quantifiers, namely those coming from a cut.
The Skolemization of a proof $\pi$ also affects the quantifier-free formulas in $\pi$, through the replacement of eigenvariables by Skolem-terms. In the context of proof Skolemization, we hence extend the notation $sk(\cdot)$ to formulas $F$ from which some (or all) $\forall$-quantifiers have been removed; then $sk(F)$ denotes the formula obtained by skolemizing the remaining $\forall$-quantifiers and carrying out the replacement of eigenvariables by Skolem-terms. Skolemization of proofs has the following useful commutation properties.

Lemma 6.9. If $\pi \leadsto \pi'$ then $sk(\pi) \leadsto sk(\pi')$. If $\pi \leadsto_{ne} \pi'$ then $sk(\pi) \leadsto_{ne} sk(\pi')$.
Proof. By induction on the number of reductions in $\pi \leadsto \pi'$ or $\pi \leadsto_{ne} \pi'$, respectively, making a case distinction on the reduction step. The most interesting case is that of the permutation of a $\forall$-inference over a cut where the main formula of the $\forall$-inference is an ancestor of the end-sequent. This reduction step is translated to an identity step, as Skolemization maps both of the above proofs to the same proof, ending in a cut with a premise containing $sk(A), sk(\Delta)$ and with conclusion $sk(\Gamma), sk(\forall x\, B), sk(\Delta)$. Each of the other reduction steps translates directly into exactly one reduction step in the skolemized sequence.
Proof. First note that $EV_c(\pi) = EV_c(sk(\pi))$, hence $G(\pi)$ and $G(sk(\pi))$ have the same non-terminals. Furthermore, to each $\alpha \in EV(\pi) \setminus EV_c(\pi)$ corresponds a unique Skolem-term in $sk(\pi)$; hence to each $F \in H(\pi)$ and $\sigma \in B(\pi)$ corresponds a unique $F' \in H(sk(\pi))$ and $\sigma' \in B(sk(\pi))$, and therefore to each production $\alpha \to t$ in $G(\pi)$ corresponds a unique production $\alpha \to t'$ in $G(sk(\pi))$, obtained by replacing eigenvariables by their respective Skolem-terms.

If $I \in sk(L(G(\pi)))$, then by Lemma 3.9 we have $I = sk(F[\alpha_1\backslash s_1] \cdots [\alpha_n\backslash s_n])$ for $\theta \to F, \alpha_1 \to s_1, \dots, \alpha_n \to s_n$ being productions in $G(\pi)$. Letting $\theta \to F', \alpha_1 \to s'_1, \dots, \alpha_n \to s'_n$ be the corresponding productions in $G(sk(\pi))$, we obtain $F'[\alpha_1\backslash s'_1] \cdots [\alpha_n\backslash s'_n] = sk(F[\alpha_1\backslash s_1] \cdots [\alpha_n\backslash s_n])$. Thus $sk(L(G(\pi))) \subseteq L(G(sk(\pi)))$.

For the other direction, note that every Skolem-term has at least one corresponding $\alpha \in EV(\pi) \setminus EV_c(\pi)$, and, as before, this relation translates to productions. So if $J \in L(G(sk(\pi)))$, then by Lemma 3.9 we have $J = G[\alpha_1\backslash t_1] \cdots [\alpha_n\backslash t_n]$ for $\theta \to G, \alpha_1 \to t_1, \dots, \alpha_n \to t_n$ being productions in $G(sk(\pi))$. By choosing one corresponding set of productions $\theta \to G', \alpha_1 \to t'_1, \dots, \alpha_n \to t'_n$, where Skolem-terms are replaced by the eigenvariables from which they originate, we obtain $sk(G'[\alpha_1\backslash t'_1] \cdots [\alpha_n\backslash t'_n]) = G[\alpha_1\backslash t_1] \cdots [\alpha_n\backslash t_n]$.
As we have seen in the above proof, Skolemization can identify instances that differ only in their variable names. The reason for this lies in the use of variable names, which can be chosen in a redundant way. Such superfluous instances can also be removed by an appropriate variable renaming, as shown in the following example.
Skolemizing would produce the following set of instances
$$sk(\mathcal{I}) = \{P(c, f(c)) \vee Q(c, f(c)),\ P(c, f(c)) \wedge Q(c, f(c))\}$$
by implicitly identifying the two formulas that become equal. A similar effect (but without using Skolemization) can be achieved by directly identifying $\alpha$ and $\beta$, as in
$$\mathcal{I}[\beta\backslash\alpha] = \{P(c, \alpha) \vee Q(c, \alpha),\ P(c, \alpha) \wedge Q(c, \alpha)\}.$$
We now generalize the observations made in the above example. For every Herbrand-disjunction $\mathcal{I}$ there is a substitution $\rho$ such that $\mathcal{I}\rho$ is a Herbrand-disjunction with the following property: if two universal positions $\langle i, j, k_1\rangle$ and $\langle i, j, k_2\rangle$ have different variables, then there is a $j'$ such that the quantifier $\exists x_{j'}$ dominates $\forall x_j$ in $F_i$ and $t_{i,j',k_1} \neq t_{i,j',k_2}$. This follows, for example, from the formulation of expansion trees in [CHM12a, CHM12b], which uses sets of terms for the $\exists$-quantifier and a single variable for the $\forall$-quantifier. A Herbrand-disjunction with this property is $\alpha$-equivalent to one with canonical variable names in the following sense.

Definition 6.12. Let $\mathcal{I}$ be a set of instances. The canonical name of the eigenvariable of the universal position $\langle i, j, k\rangle$ is $\alpha_{i,j,t_1,\dots,t_m}$, where $t_1, \dots, t_m$ are the terms of the existential positions that dominate $\langle i, j, k\rangle$. The canonical variable renaming $\rho_c$ of $\mathcal{I}$ is the substitution which replaces all variable names by their canonical names.

Remark 6.13. Note that this relationship is significantly more complex than $\alpha$-equivalence, as differently named variables are identified according to certain criteria external to the variable names. In particular, for some fixed $\mathcal{I}$, there are $\mathcal{I}_n$ of unbounded size such that $\mathcal{I}_n\rho_c = \mathcal{I}$. This can be seen, for example, by continuing Example 6.5: take $\mathcal{I}_n = \{P(c) \vee P(\alpha_i),\ P(\alpha_i) \vee P(\beta_i) \mid 1 \le i \le n\}$.
We now turn to deskolemization, the inverse operation of Skolemization. In our setting, we only consider deskolemization of sequents and their instances, but not of proofs. Furthermore, we always assume that the original sequent with $\forall$-quantifiers is known. Hence the deskolemization of a sequent trivially replaces it by the original sequent. More interesting is the deskolemization of instances, which consists of replacing Skolem-terms by (canonically named) variables.

Definition 6.14. Let $\Gamma = F_1, \dots, F_n$ be a sequent with Skolem function symbol $f_{i,j}$ for the universal quantifier $\forall x_j$ in $F_i$. Let $\mathcal{I}$ be a set of instances of $\Gamma$ and define its deskolemization $sk^{-1}(\mathcal{I})$ by repeating the replacement $f_{i,j}(t_1, \dots, t_m) \mapsto \alpha_{i,j,t_1,\dots,t_m}$ on maximal Skolem-terms (w.r.t. the subterm ordering).
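On the level of terms, the replacement of Definition 6.14 can be sketched as follows. Since the rewriting is repeated exhaustively, a bottom-up traversal produces the same final result as rewriting maximal Skolem-terms first; we use the bottom-up order because it is easiest to code. The term encoding and the table `skolem_index` (mapping each Skolem symbol to its pair $(i, j)$) are assumed inputs of our own devising, and canonical variables $\alpha_{i,j,t_1,\dots,t_m}$ are encoded as tuples carrying their index and terms.

```python
def deskolemize(term, skolem_index):
    """Replace every Skolem-term f_{i,j}(t1, ..., tm) by the canonical
    variable alpha_{i,j,t1,...,tm}, proceeding bottom-up through the term.
    Terms are strings (variables/constants) or tuples (head, t1, ..., tm);
    skolem_index maps Skolem symbols to their index pair (i, j)."""
    if not isinstance(term, tuple):
        return term
    head, *args = term
    args = [deskolemize(a, skolem_index) for a in args]   # recurse first
    if head in skolem_index:                  # head is a Skolem symbol f_{i,j}
        i, j = skolem_index[head]
        return ('alpha', i, j) + tuple(args)  # canonical name carries the terms
    return (head,) + tuple(args)
```

Note how a nested Skolem-term such as $f_{1,1}(f_{1,2}(c))$ yields a canonical variable whose name itself contains a canonical variable, reflecting the repetition of the replacement in the definition.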
In the deskolemization of a Herbrand-disjunction, the acyclicity of the dependency relation is obtained from the acyclicity of the subterm ordering on the Skolem-terms. Conversely, during Skolemization, the Skolem-terms are well-defined due to the acyclicity of the dependency relation (see e.g. [Mil87, Wel11, BHW12] for more details). We hence obtain the following properties:

Lemma 6.15. Let $\Gamma$ be a sequent and $\Gamma'$ be a weak sequent with $\Gamma' = sk(\Gamma)$.
(1) If $\mathcal{I}$ is a Herbrand-disjunction of $\Gamma$, then $sk(\mathcal{I})$ is a Herbrand-disjunction of $sk(\Gamma)$.
(2) If $\mathcal{I}$ is a Herbrand-disjunction of $\Gamma'$, then $sk^{-1}(\mathcal{I})$ is a Herbrand-disjunction of $\Gamma$.
Note that for a cut-free proof $\pi$ we have $[\![\pi]\!] = H(\pi)\rho_c$, i.e. the Herbrand-content is nothing other than the Herbrand-disjunction of the proof after variable normalization. Also note that for a proof $\pi$ of a weak sequent we have $[\![\pi]\!] = L(G(\pi))$, and hence, for a cut-free proof of a weak sequent, we have $[\![\pi]\!] = H(\pi)$. We can now lift the main invariance lemma, Lemma 5.4, to proofs of arbitrary end-sequents and formulate this result in terms of the Herbrand-content.
Proof. This is a direct consequence of Theorem 7.2.

This corollary shows that $[\![\pi]\!]$ is an upper bound on the Herbrand-disjunctions obtainable by cut-elimination from $\pi$. Let us now compare this result with another upper bound that has previously been obtained in [Het10]. To that aim, let $G_0(\pi)$ denote the regular tree grammar underlying $G(\pi)$, which can be obtained by setting all non-terminals to non-rigid. In this notation, a central result of [Het10], adapted to this paper's setting, is

Theorem 7.4. Let $\pi$ be a proof of a formula of the shape $\exists x_1 \ldots \exists x_n\, A$ with $A$ quantifier-free, and let $\pi \leadsto \pi'$ with $\pi'$ cut-free. Then $H(\pi') \subseteq L(G_0(\pi))$.
While Theorem 7.4 also applies to non-simple proofs, Corollary 7.3 is stronger in several respects. First, the size of the Herbrand-content is exponentially smaller than the size of the bound given by Theorem 7.4. Indeed, it is a straightforward consequence of Lemma 3.9 that the language of a totally rigid acyclic tree grammar with $n$ production rules is bounded by $n^n$, but on the other hand:

Proposition 7.5. There is an acyclic regular tree grammar $G$ with $2n$ productions and $|L(G)| = n^{n^n}$.
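The gap between totally rigid and ordinary (non-rigid) regular tree grammars can be made concrete. In a totally rigid derivation, each non-terminal may use at most one production, so for an acyclic grammar the language is obtained by fixing one production per non-terminal globally; without rigidity, the choices at different occurrences of a non-terminal are independent. The toy grammar below is our own example, not the construction behind Proposition 7.5; it already shows a gap of 3 vs. 9 trees, and iterating the duplication is what drives the exponential separations discussed in the text.

```python
from itertools import product

def subst_all(t, choice):
    """Rewrite every non-terminal occurrence in t by the fixed choice."""
    if t in choice:
        return subst_all(choice[t], choice)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst_all(s, choice) for s in t[1:])
    return t

def rigid_language(start, prods):
    """Totally rigid semantics: one production per non-terminal, fixed for
    the whole derivation (sufficient for acyclic grammars)."""
    nts, alts = zip(*prods.items())
    return {subst_all(start, dict(zip(nts, c))) for c in product(*alts)}

def regular_language(start, prods):
    """Ordinary (non-rigid) semantics: each occurrence of a non-terminal
    is rewritten independently. Terminates for acyclic grammars."""
    if start in prods:
        return {t for alt in prods[start] for t in regular_language(alt, prods)}
    if isinstance(start, tuple):
        parts = [regular_language(s, prods) for s in start[1:]]
        return {(start[0],) + c for c in product(*parts)}
    return {start}

# A -> f(B, B),  B -> c1 | c2 | c3   (trees encoded as nested tuples)
prods = {'A': [('f', 'B', 'B')], 'B': ['c1', 'c2', 'c3']}
```

Rigidly, both occurrences of `B` must derive the same constant, giving 3 trees; non-rigidly they are independent, giving 9.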
Secondly, the class of totally rigid acyclic tree grammars can be shown to be in exact correspondence with the class of simple proofs in the following sense. Not only can we use a totally rigid acyclic tree grammar to simulate the process of cut-elimination; we can also, in the other direction, use cut-elimination to simulate the process of computing the language of a grammar. It is shown in [Het12a] how to transform an arbitrary acyclic totally rigid tree grammar $G$ into a simple proof that has a $\leadsto$-normal form whose Herbrand-disjunction is essentially the language of $G$.
The third and, for the purposes of this paper, most important difference is that the bound of Corollary 7.3 is tight in the sense that it can actually be reached by a cut-elimination strategy, namely $\leadsto_{ne}$. In fact, an even stronger statement is true: not only is there a normal form of $\leadsto_{ne}$ that reaches the bound, but all of them do. This property leads naturally to the following confluence result for classical logic.

Corollary 7.7. Let $\pi$ be a simple proof and let $\pi_1$ and $\pi_2$ be cut-free normal forms of $\pi$ under $\leadsto_{ne}$. Then $[\![\pi_1]\!] = [\![\pi_2]\!]$.

Proof. This is a direct consequence of Theorem 7.2.
How does this result fit together with $\leadsto_{ne}$ being neither confluent nor strongly normalizing? Note that it is possible to construct a simple proof which permits an infinite $\leadsto_{ne}$ reduction sequence from which one can obtain normal forms of arbitrary size by bailing out from time to time. This can be done by building on the propositional double-contraction example found e.g. in [DJS97, Gal93, Urb00] and, in a similar form, in [Zuc74]. While these infinitely many normal forms do have pairwise different Herbrand-disjunctions when regarded as multisets, Corollary 7.7 shows that as sets they are all the same. This set-character of Herbrand-disjunctions is ensured by using canonical variable names (or, equivalently, Skolemization) and thus identifying repeated instances. This observation shows that the lack of strong normalization is taken care of by using sets instead of multisets as the underlying data structure. But what about the lack of confluence? Results like [BH11] and [Het12b] show that the number of $\leadsto$ normal forms with different Herbrand-disjunctions can be enormous. On the other hand, we have just seen that $\leadsto_{ne}$ induces only a single Herbrand-disjunction: $[\![\pi]\!]$. The relation between $[\![\pi]\!]$ and the many Herbrand-disjunctions induced by $\leadsto$ is explained by Corollary 7.3: $[\![\pi]\!]$ contains them all.

Conclusion
We have shown that non-erasing cut-elimination for the class of simple proofs is Herbrand-confluent. While there are different, and possibly infinitely many, normal forms, they all induce the same Herbrand-disjunction. This result motivates the definition of this unique Herbrand-disjunction as the Herbrand-content of the proof with cut.
As future work, the authors plan to extend this result to arbitrary first-order proofs. The treatment of blocks of quantifiers is straightforward: the rigidity condition must be changed to apply to vectors of non-terminals. Treating quantifier alternations is more difficult: the current results suggest using a stack of totally rigid tree grammars, each layer of which corresponds to one layer of quantifiers (and is hence acyclic). Concerning further generalizations, note that the method of describing a cut-free proof by a tree language is applicable to any proof system with quantifiers that has a Herbrand-like theorem, e.g. even full higher-order logic as in [Mil87]. The difficulty consists in finding an appropriate type of grammar.
Given the wealth of different methods for the extraction of constructive content from classical proofs, what we learn from our work about the class of simple proofs is this: the first-order structure possesses (in contrast to the propositional structure) a unique and canonical unfolding. The various extraction methods hence do not differ in the choice of how to unfold the first-order structure, but only in choosing which part of it to unfold. We therefore see that the effect of the underspecification of algorithmic detail in classical logic is redundancy.