ON THE RELATIVE PROOF COMPLEXITY OF DEEP INFERENCE VIA ATOMIC FLOWS

We consider the proof complexity of the minimal complete fragment, KS, of standard deep inference systems for propositional logic. To examine the size of proofs we employ atomic flows, diagrams that trace structural changes through a proof but ignore logical information. As results we obtain a polynomial simulation of versions of Resolution, along with some extensions. We also show that these systems, as well as bounded-depth Frege systems, cannot polynomially simulate KS, by giving polynomial-size proofs of certain variants of the propositional pigeonhole principle in KS.


Introduction
Deep inference is a relatively recent proof methodology whose systems differ from other formalisms by allowing derivations themselves to be composed by logical connectives. One of its main features is locality, i.e. inference steps can be checked in constant time, a property that is impossible to achieve in Gentzen systems [Brü03]. In recent years there has been an increasing interest in the proof complexity of deep inference [BG09] [Das14], in particular the weaker systems initially introduced by Brünnler and Tiu [BT01]. Perhaps the most notable result is that a certain system, denoted KS ∪ {c↑}, quasipolynomially simulates Frege systems [Jeř09] [BGGP10]. It is conjectured that this can be improved to a polynomial simulation, and so proving nontrivial lower bounds for KS ∪ {c↑} is likely equivalent to proving them for Frege systems, a task which has escaped proof complexity theorists for years.
However, this quasipolynomial simulation relies crucially on the presence of "dag-like behaviour", manifested in deep inference by a particular rule, cocontraction: A c↑ −− A ∧ A. Without it we have a minimal complete system closed under deep inference, KS. This system is free of compression mechanisms, in the sense that a proof of a conjunction can be 'partitioned' into proofs of each conjunct, unlike proofs in systems that are dag-like or contain cut. This is explained further in [Das12a].
It is conjectured that KS is unable to polynomially simulate KS∪{c↑} [BG09] [BGGP10] [Das11] [Str12], raising the question of where exactly it fits in the hierarchy of proof systems.
Atomic flows are diagrams that track structural changes in a proof (duplication, creation and destruction of atoms) but ignore logical information. In this work they serve as a useful abstraction because of certain rewriting procedures on them which can be used to manipulate derivations soundly without any mention of logical syntax. Atomic flows are introduced formally in [GG08] and a comprehensive account can be found in [Gun09].
In this paper we focus on upper bounds and simulations to demonstrate the relative strength of KS. The starting points in our arguments are proofs in a system obtained by extending KS ∪ {c↑} by the coweakening rule: A w↑ −− ⊤; the resulting system KS ∪ {c↑, w↑} is denoted KS + in this paper. We then appeal to sound rewriting rules on the atomic flows of these proofs to show that cocontraction and coweakening steps can, in certain cases, be eliminated from a proof in polynomial time.
It is worth mentioning here that the addition of coweakening makes little difference to proof complexity; indeed it is not difficult to see from the rewriting rules in Fig. 1 that coweakening steps can be eliminated in time linear in the size of the proof. Rather, the real generators of complexity in our proofs are the interactions between contraction and cocontraction nodes in the atomic flows, as we show in Prop. 3.10 and Lemma 3.13.
In Sect. 4 we give a simple example of how atomic flows can be used to normalise a naïve encoding of truth table proofs in KS + to produce a polynomial simulation in KS. As a corollary we obtain a superpolynomial separation of KS from tree-like cut-free Gentzen systems, since they are unable to simulate truth tables [D'A92], a new proof of a result appearing in [BG09].
In Sect. 5 we consider stronger systems; we improve a result of Jeřábek's that KS + has polynomial-size proofs of the functional and onto versions of the propositional pigeonhole principle by showing that they can be polynomially transformed into KS proofs of the same conclusions. This immediately entails that cut-free Gentzen systems, Resolution and even bounded-depth Frege systems are exponentially separated from KS by the results of [PBI93] and [KPW95].
In Sect. 6 we consider simulations in KS of other proof systems. There is already a naïve simulation of tree-like cut-free Gentzen sequent calculi, appearing in [BG09], and what amounts to a polynomial simulation of 'Resolution with multisets' is outlined in [Gug03].
Here we formalise the latter result and also give a polynomial simulation of tree-like Resolution systems, even when sets are the basic data structure. We show that both simulations extend to so-called Resolution(f) systems, introduced by Krajíček in [Kra01] and known to be strictly stronger than usual versions of Resolution [SBI02] [EGM04].
This paper is a full version of [Das12b], and differs from that work as follows:
(1) Full proofs are given where they were brief or omitted previously.
(2) We expand on some of the preliminary work on the complexity of normalisation induced by flow rewriting in Sect. 2. In particular we provide full proofs of termination and confluence for the rewriting system norm and give explicit reduction strategies that achieve the complexity bounds given in the previous work.
(3) In Sect. 2 we use the definitions and notations found in [GG08] and [Gun09] for atomic flows, to maintain consistency with the existing literature, whereas in [Das12b] self-contained definitions were preferred for brevity.
(4) There were some errors in the statements of results in the previous work, which have been corrected here. In particular the previously stated simulations of dag-like cut-free Gentzen systems and dag-like Resolution systems are incorrect as presented and we have not been able to amend them. Here we only obtain such simulations for certain cases, namely the versions of Resolution in Sect. 6.

Deep inference, atomic flows and normalisation
In this section we introduce deep inference systems for propositional logic and atomic flows, diagrams that trace the structural changes in a proof. We consider a graph rewriting system norm on flows, corresponding to sound manipulations on proofs, and analyse the complexity of termination in this system.
2.1. Deep inference. We consider propositional logic with formulae constructed from literals (propositional variables and their duals), also called atoms, over the basis {⊤, ⊥, ∧, ∨}, and use the infix symbol ≡ to denote equivalence of expressions. The variables a, b, c, d range over literals, with ā, b̄, ... denoting their duals, and A, B, C, D range over formulae; both sets of variables may include subscripts or superscripts as necessary.
For clarity we use square brackets [, ] for disjunctions and round ones (, ) for conjunctions. We generally omit external brackets of an expression, and also internal ones under associativity. This does not cause any confusion when it comes to proofs, since any valid bracketing can be reduced to any other by the = rule in Dfn. 2.1.
Note that we do not have a symbol for negation in our language; formulae are always in negation normal form. We may however write Ā to denote the De Morgan dual of a formula A, obtained by the following rules: the dual of ⊤ is ⊥ and vice versa, the dual of a literal a is ā and vice versa, the dual of [A ∨ B] is (Ā ∧ B̄), and the dual of (A ∧ B) is [Ā ∨ B̄].
Definition 2.1 (Rules and systems). An inference rule is a sound binary relation on formulae decidable in polynomial time, and a system is a set of rules. We define the rules we use below, and the systems KS = {ai↓, aw↓, ac↓, s, m}, KS + = KS ∪ {aw↑, ac↑} and SKS = KS + ∪ {ai↑}.

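Atomic structural rules:
identity: ⊤ ai↓ −− [a ∨ ā]
weakening: ⊥ aw↓ −− a
contraction: [a ∨ a] ac↓ −− a
cut: (a ∧ ā) ai↑ −− ⊥
coweakening: a aw↑ −− ⊤
cocontraction: a ac↑ −− (a ∧ a)
Linear logical rules:
switch: A ∧ [B ∨ C] s −− (A ∧ B) ∨ C
medial: (A ∧ B) ∨ (C ∧ D) m −− [A ∨ C] ∧ [B ∨ D]
Note in particular our distinction between variables for literals and formulae in the above rules, and between 'structural' and 'logical' rules.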
We also have the logical rule = which is obtained by closing the equations below under reflexivity, symmetry, transitivity and by applying context closure.We implicitly assume that it is contained in every system.

[Display of the equations generating the = rule, beginning with commutativity.]
A proof that = is decidable in polynomial time can be found in [BG09]. Essentially both formulae are just reduced to some canonical form and then compared. Consequently we often omit occurrences of the = rule in proofs and derivations.
Definition 2.2 (Proofs and derivations). We define derivations and premiss and conclusion functions (pr and cn respectively) inductively.
(1) Each formula A is a derivation with premiss and conclusion A.
(2) If Φ and Ψ are derivations and ⋆ ∈ {∧, ∨} then Φ ⋆ Ψ is a derivation with premiss pr(Φ) ⋆ pr(Ψ) and conclusion cn(Φ) ⋆ cn(Ψ).
(3) If Φ and Ψ are derivations and the step from cn(Φ) to pr(Ψ) is an instance of some inference rule ρ then Φ ρ −− Ψ is a derivation with premiss pr(Φ) and conclusion cn(Ψ).
If pr(Φ) ≡ ⊤ then we call Φ a proof. If Φ is a derivation with premiss A and conclusion B in which all inference steps are instances of rules in a system S, we write Φ vertically between A above and B below, labelled by S. Furthermore, if A ≡ ⊤, i.e. Φ is a proof in the system S, we omit the premiss from this notation.
While our structural rules only have atoms in their premisses and conclusions, the notion of derivation above allows us to extend these to arbitrary formulae, as stated in the proposition below. We often use these 'generic rules' rather than their full derivations for convenience.
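Proposition 2.3 (Generic rules). Each rule below is derivable from s, m, and its respective atomic structural rule in polynomial time.
Proof. See [BT01] for full proofs. Here we just give an example of the case for contraction, since that is the only structural rule of the sequent calculus that cannot be reduced to atomic form [Brü03]. The proof is by induction on the depth of the conclusion of a c↓ step. When the conclusion is an atom the step is an instance of ac↓. When it is a disjunction [A ∨ B], the premiss [A ∨ B] ∨ [A ∨ B] is rearranged by = into [A ∨ A] ∨ [B ∨ B] and the inductive hypothesis is applied to each disjunct. When it is a conjunction (A ∧ B), a medial step m takes the premiss (A ∧ B) ∨ (A ∧ B) to [A ∨ A] ∧ [B ∨ B], and the inductive hypothesis is again applied to each component.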
Note that the case for cocontraction is dual to this: one can just flip the derivations upside down and replace every formula with its De Morgan dual.
Definition 2.4 (Complexity). We define the size |Φ| of a derivation Φ to be the number of atom occurrences in Φ. A system S polynomially simulates a system T if each proof in T can be polynomially transformed into a proof in S of the same conclusion.
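As an informal illustration of the syntax of Sect. 2.1, the following Python sketch represents formulae in negation normal form, computes the De Morgan dual and counts atom occurrences, the measure underlying Dfn. 2.4. The class and function names (Lit, Const, And, Or, dual, size) are hypothetical and chosen only for this sketch, and treating the units ⊤ and ⊥ as contributing nothing to the size is an assumption.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Lit:
    name: str
    positive: bool = True  # False stands for the dual literal


@dataclass(frozen=True)
class Const:
    value: bool  # True stands for the unit "top", False for "bottom"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


Formula = Union[Lit, Const, And, Or]


def dual(f: Formula) -> Formula:
    """De Morgan dual: swap units, literals and connectives."""
    if isinstance(f, Lit):
        return Lit(f.name, not f.positive)
    if isinstance(f, Const):
        return Const(not f.value)
    if isinstance(f, And):
        return Or(dual(f.left), dual(f.right))
    return And(dual(f.left), dual(f.right))


def size(f: Formula) -> int:
    """Number of atom occurrences (assumption: units count as 0)."""
    if isinstance(f, Lit):
        return 1
    if isinstance(f, Const):
        return 0
    return size(f.left) + size(f.right)


example = And(Lit("a"), Or(Lit("b", False), Const(True)))
```

In this representation the dual of a dual is the original formula, and dualising never changes the size.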
2.2. Atomic flows. We give only an informal definition of atomic flows here, but refer the reader to [GG08], [Gun09] for a formal account.
Definition 2.5 (Atomic flows). For an SKS derivation Φ we define its atomic flow, fl(Φ), to be the diagram obtained by tracing the path of each atom through the derivation, designating the creation, duplication and destruction of atoms by the following corresponding nodes:
We do not have nodes for s, m or = since they do not create, destroy or duplicate any atom occurrences.
More generally an atomic flow, not necessarily of a derivation, is a (vertically) directed graph embedded in the plane generated from the six types of node above.
Atomic flows are considered equivalent up to continuous deformation preserving the (vertical) ordering of connected edges.Note that edges may be pending at either end.
We define the size of a flow φ, denoted |φ|, to be its number of edges.
In previous works atomic flows have been equipped with a labeling of the edges, or a polarity assignment, for example to avoid the following impossible situation: Since we are only concerned with the complexity of flows and their transformations we do not include this extra structure; this does not affect the soundness or termination of our rewriting systems, and in fact is crucial in order to obtain confluence, for which labellings of edges can cause problems.
We do, however, insist that edges are vertically directed and so we are often able to talk about one node being 'above' another node.Notice that this order is not generally preserved under deformation, e.g. if two nodes are in disconnected components.Whenever we use this notion in arguments it should be clear that it is being used correctly.
Definition 2.6. A flow rewriting rule is an ordered pair of flows, written φ → ψ. A flow rewriting system (FRS) is a set of flow rewriting rules. A one-step reduction of a flow φ in an FRS r is a flow ψ that is obtainable from φ by replacing some induced subgraph that is the left hand side of some rule in r with its right hand side.
Definition 2.7.We define a graph rewriting system norm on flows in Fig. 1.
The system norm is essentially the system c ∪ w in [GG08], without the rules for i↑. The proof of termination that follows is similar to that for cycle-free flows in [GG08], and the proof of confluence is similar to that in [GGS10].
Notation 2.8. Let r be a flow rewriting system. We use the following notation:
• We write φ →_r ψ if there is a one-step reduction from a flow φ to a flow ψ using a rule in r.
• We denote by →*_r the reflexive transitive closure of →_r.
• If a flow φ has a unique normal form under r we denote it by φ↓_r (footnote 2).
In all cases we might omit the subscript r if it is clear from context.
Example 2.9. We give an example of a flow associated with a derivation in Fig. 2, as well as a reduction under norm, as defined in Dfn. 2.7, applying w↑ rules first.
The first equality follows by the definition of a flow, the second by deformation and the final by definition again. The intermediate steps are as follows:
(1) Apply c↓-w↑ on the left and c↓-c↑ on the right.
(2) Apply w↓-w↑ on the left, i↓-w↑ in the middle and i↓-c↑ on the right.
The use of colours in the initial and final flow identifies which edges correspond to which atoms; in the intermediate flows the colours should aid the reader in reconstructing the corresponding transformations on the derivation.
We now proceed to prove that reducing under norm is terminating and confluent.For this we use the usual critical pair lemma, that one only needs to check that the one-step contracta of every overlapping pair of redexes are joinable in order to conclude local confluence.It is simple to see that this is still valid in the flow rewriting setting, since the only other case to check for flows is when the redexes are disjoint, which is trivial.We point out that Newman's lemma, that any locally confluent terminating system is confluent, is true more generally for any binary relation on any set, and so indeed holds in this setting.
2 Note that we are using downward arrows in both the names of deep inference rules and to denote normal forms under rewriting systems.Unfortunately both notations are standard in their respective literature, however there should be no ambiguity in their usage so hopefully this will cause little confusion.
Figure 2: A proof, its flow and a reduction under norm.
Proof. For a node ν in a flow φ, let d(ν, φ) be the distance of ν from the top of φ, i.e. the minimum length of a (vertically) directed path from ν to an ai↓ node, an aw↓ node or an edge with upper end pending. For an atomic structural rule ρ, let D(ρ, φ) be the sequence of natural numbers that counts how many ρ nodes in φ have each d-value, i.e. the sequence (n_i) such that, for each i, n_i is the number of ρ nodes ν in φ with d(ν, φ) = i, and consider the lexicographical ordering < on such sequences.
Clearly the rules c↓-c↑, i↓-c↑ and w↓-c↑ strictly reduce the D(ac↑, •) value of a flow, while the other rules of norm (as well as w↓-c↑) strictly reduce a flow's size, while not increasing the D(ac↑, •) value. Therefore each application of a norm rule strictly reduces the lexicographical product D(ac↑, •) × | • |, and so norm is terminating.
Since norm is terminating, it suffices to check local confluence for the critical pairs, which are found by inspection [diagrams of the critical pairs, numbered (1) to (7), not recovered]. Note that every other overlapping pair can be deformed so that each rule application trivially commutes. We consider each case below.
The cases where we appeal to duality follow by simply flipping the indicated reductions upside down and relabeling nodes and reduction steps appropriately.
The significance of the rewriting system norm is evident from the following results, where we show that reduction steps correspond to sound manipulations of SKS derivations.
Definition 2.11. If R is a relation on flows we say that R lifts polynomially to SKS if, whenever (fl(Φ), ψ) ∈ R, we can construct a derivation Ψ in time polynomial in |Φ| + |ψ| with the same premiss and conclusion as Φ and flow ψ. If f is a function on flows then we say that f lifts polynomially to SKS just if the relation f(•) = • lifts polynomially to SKS.
Sometimes we simply say that an individual flow rewrite rule is sound rather than saying that it lifts polynomially to SKS.
Remark 2.12. In fact a derivation can always be manipulated (preserving premiss, conclusion and flow) so that it has size at most polynomial in the size of its flow, as (essentially) shown in [Das13], so the dependence on size of derivation in Dfn. 2.11 is somewhat redundant. However this is beyond the scope of this work, and it does no harm for us to include this dependence.
Proof. See [GG08]. Essentially the proof shows that each local rewrite step on a flow of a derivation induces a sound manipulation of polynomial size on that derivation. We give as an example the case for c↓-c↑, since that rule is of particular interest in this work.
Let ξ{ } denote an arbitrary context, i.e. a formula with a single hole occurring in place of a subformula, and let ξ{A} denote the result of substituting a formula A for the hole in ξ{ }.The manipulation is as follows: where Φ{a/(a ∧ a)} is obtained by replacing a by (a ∧ a) everywhere in Φ.
Corollary 2.14. Given an SKS derivation Φ and a reduction fl …
Proof. By induction on the length n of a norm derivation.
2.3. Reduction strategies. We analyse the complexity of normalising a flow under norm, presenting a class of reduction strategies that optimise the size of a →_norm derivation from a flow to its normal form, up to a polynomial. Our aim is to prove the following theorem:
The result follows from Cor. 2.14 if we can find appropriate reductions with size only polynomially dependent on the initial derivation and normal form of its flow.
Example 2.16. Consider the following flow, where there are n ac↑ nodes. Notice that the complexity of a derivation from this flow to its normal form can vary significantly. If we apply c↓-c↑ steps first then the length of a derivation to normal form is Θ(2^n), whereas applying {w↓-c↓, w↓-c↑} steps first results in length Θ(n). These bounds follow from later results in this section, but are not difficult to prove directly.
In fact, this sort of unnecessary exponential blowup can always be avoided by applying 'weakening' rules first.
Proof. By inspecting the rules of cont, we observe that if there are no wk-redexes in a flow ψ and ψ →_cont θ then there are no wk-redexes in θ. By induction on the length of a →_cont derivation we then have that there are no wk-redexes in (φ↓_wk)↓_cont. But since there are also no cont-redexes, by definition of normal form, and wk ∪ cont = norm we can conclude that (φ↓_wk)↓_cont is already in normal form for norm.
We can now give a proof of the main theorem of this section.
Proof of Thm. 2.15. Let Φ be an SKS derivation with flow φ whose normal form under norm is ψ, and fix a derivation

Complexity of normal forms
In this section we specialise previous results to KS + flows, reducing the complexity of norm reduction to counting the number of certain paths in the initial flow.We then give two simple ways of estimating this number, which we use in later sections to obtain complexity results.
Proposition 3.1. If φ is the flow of a KS + proof, with normal form ψ under norm, then ψ is free of aw↑ and ac↑ nodes, i.e. contains just KS nodes.
Proof. We argue by contradiction. Notice that by c↓-c↑ we can assume that all ac↑ nodes are above all ac↓ nodes in ψ, by deformation, and so must have upper end directly connected to an aw↓ or ai↓ node, since φ is associated with a proof. However then ψ can be reduced by either i↓-c↑ or w↓-c↑, contradicting normality. An aw↑ node, similarly, must be above all ac↓ and ac↑ nodes by c↓-w↑ and c↑-w↑, and so must have upper end directly connected to an aw↓ or ai↓ node, and again ψ can be reduced by w↓-w↑ or i↓-w↑, contradicting normality.
Notice that the above proposition, along with previous results in this section, allows us to transform KS + proofs to KS proofs of the same conclusion. We state this formally in Thm. 3.8 once we have determined more about the complexity of this transformation, which we address in the following results.
Definition 3.2. An ai-path is a (simple) path that changes (vertical) direction only at ai↓ and ai↑ nodes. We say that an ai-path is maximal if it cannot be extended, and that it is open if it begins and ends at a pending end of an edge.
The inversion of a path is just the same path in the reverse direction.
Example 3.3. The paths on the right, and their inversions, are exactly all the maximal ai-paths of the flow below. All of these are open except 0, since one of its ends is an aw↓ node. More generally all open paths are maximal but not vice-versa.
Proof. Since the flow of a proof can have no edge with upper end pending, every edge must be path-connected to an aw↓ or ai↓ node. Since no open path goes through an aw↓ node, and there are no ai↑ nodes, every open ai-path goes through a unique ai↓ node, and every ai↓ node accommodates at least one such path since there are no aw↑ nodes. Now, the only other node a path can go through is an ac↓ node. Consequently every node in an open ai-path has in-degree 1 before passing its unique ai↓ node, and out-degree 1 after. Hence each ai↓ node accommodates exactly one open ai-path.
[…] we must be able to decompose χ into two disjoint components: χ_1, consisting of just ai↓, ac↓ and ac↑ nodes, and χ_2, consisting of just aw↓ and aw↑ nodes. In fact, since φ was the flow of a proof, χ_2 consists of just aw↓ nodes. Now notice that any →_cont derivation from χ to ψ acts only on χ_1, and so ψ can be decomposed into two disjoint components: ψ_1, which is the normal form of χ_1 under norm, consisting of just ai↓ and ac↓ nodes by Prop. 3.1, and ψ_2 = χ_2, consisting of just aw↓ nodes.
The following result provides a simple estimate of the number of paths in a flow, and also the complexity of flow normalisation under norm, and is just an adaptation of well-known techniques to estimate the number of paths in any directed acyclic graph.
Definition 3.9 (Dimensions of a flow). The length of a flow is the maximum number of times the type of node changes in an ai-path. The width of a flow is the maximum size of a connected subflow containing just one type of node. The breadth of a flow is the number of connected components it has.
The above definition is perhaps most easily understood by allowing ac↓ and ac↑ nodes to have unbounded in-degree and out-degree respectively, for example by collapsing any configuration of n − 1 connected ac↓ nodes to a single node • • • n with n upper edges, and similarly for configurations of ac↑ nodes. Then we can consider the length of the flow to be essentially just the maximum length, in the usual sense, of an ai-path, and the width is the maximum out-degree or in-degree.
It is worth mentioning here that replacing ac↓ and ac↑ nodes with these 'super' nodes can be lifted polynomially to proofs, in the sense that ac↓ and ac↑ steps in a proof can be soundly replaced by similar 'super' steps with only linear change in size.
Proof. For simplicity we write • • • n for some configuration of n − 1 connected c↓ nodes, and similarly for a c↑ configuration. Notice that it suffices to consider the case when b = 1, since paths in different connected components are disjoint.
In the worst case scenario we just have a sequence of

Contraction loops in atomic flows. Sometimes the estimate of the number of open ai-paths given via the length of a flow in the previous section is not sufficiently accurate. An example of this is given in Sect. 6.2 where the length of flows in the translation R is polynomial in their size, and so the estimate is exponential, yet the actual number of open ai-paths is bounded above by a polynomial.
Unsurprisingly, it is only certain interactions between ac↓ and ac↑ nodes that generate significant complexity in a KS + flow; in this section we focus on one type of interaction we call a 'contraction loop'. In the absence of these the number of open ai-paths in a flow is still polynomial in its size, regardless of its length.
Definition 3.12. A contraction loop in a flow is a pair of (ac↑, ac↓) nodes (ν_1, ν_2) such that there are at least two disjoint (directed) paths between ν_1 and ν_2.
For example we give the following flow and all its contraction loops, (u, y), (v, z), whereas every other pair has only one path between them. If the edge ⋆ were broken and there was no path from w to x then there would be no contraction loops at all.
Lemma 3.13. If there are no contraction loops in a KS + flow φ then the number of open ai-paths in φ is at most |φ|^3.
Proof. For an edge ε in φ consider the following two notions:
• The weight of ε, denoted w(ε), is the number of directed paths from ε to the bottom of φ, i.e. to an aw↑ node or an edge with lower end pending.
• For an atomic structural rule ρ let N(ρ, ε) denote the number of ρ nodes below ε that are connected to ε by a directed path.
We show that w(ε) ≤ N(aw↑, ε) + N(ac↑, ε) + 1 by induction on the distance of ε from the bottom of φ. The inequality is clear for the base cases, when ε is directly connected to an aw↑ node and when ε has lower end pending. We have two inductive steps:
(1) ε is an upper edge of an ac↓ node with lower edge δ. In this case we clearly have that w(ε) = w(δ) and so the inequality follows by the inductive hypothesis.
Clearly the number of open ai-paths going through an edge with upper end pending is bounded above by its weight, and so by |φ| by the bound above, while the number of open ai-paths going through any ai↓ node is bounded above by the product of the weights of each of its edges, and so by |φ|^2. In particular we have that the number of open ai-paths going through any edge at the top of a flow is bounded above by |φ|^2, and there are at most |φ| many such edges, whence the bound follows.
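Dfn. 3.12 can be checked mechanically on small examples. The sketch below is only an illustration under stated assumptions: a flow is given as a list of downward edges between named nodes together with a table of node types, 'disjoint' is read as edge-disjoint, and two such paths are sought by two rounds of augmenting-path search with unit capacities. The names has_two_disjoint_paths and contraction_loops, the type tags "ac_up" and "ac_down", and both example flows are hypothetical.

```python
def has_two_disjoint_paths(edges, source, target):
    """True if at least two edge-disjoint directed paths lead from source to target."""
    adjacency = {}
    capacity = []
    for index, (upper, lower) in enumerate(edges):
        # Residual edge 2*index runs downwards, 2*index + 1 is its reverse.
        adjacency.setdefault(upper, []).append((lower, 2 * index))
        adjacency.setdefault(lower, []).append((upper, 2 * index + 1))
        capacity += [1, 0]

    def augment():
        stack, seen = [(source, [])], {source}
        while stack:
            node, path = stack.pop()
            if node == target:
                for edge in path:
                    capacity[edge] -= 1
                    capacity[edge ^ 1] += 1
                return True
            for neighbour, edge in adjacency.get(node, []):
                if capacity[edge] > 0 and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append((neighbour, path + [edge]))
        return False

    return augment() and augment()


def contraction_loops(edges, kinds):
    """All pairs (cocontraction node, contraction node) joined by two disjoint paths."""
    return [(u, v)
            for u, ku in kinds.items() if ku == "ac_up"
            for v, kv in kinds.items() if kv == "ac_down"
            if has_two_disjoint_paths(edges, u, v)]


kinds = {"u": "ac_up", "y": "ac_down"}
# Hypothetical flow: both lower edges of u enter y, forming a loop.
loop_edges = [("top", "u"), ("u", "y"), ("u", "y"), ("y", "bottom")]
# Hypothetical flow: u and y share only one edge, so there is no loop.
no_loop_edges = [("u", "y"), ("u", "out"), ("in", "y"), ("y", "bottom")]
```

Since every node of a flow meets at most three edges, two edge-disjoint paths between an ac↑ node and an ac↓ node cannot share an internal node, so this reading should agree with internally vertex-disjoint paths.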
Example 3.14. In fact the bound given above is optimal, up to multiplication by a constant. Consider a flow φ with n ai↓ nodes, of size linear in n, in which the weights (as defined in the above proof) of the topmost edges, starting from the left, are 1, 2, ..., n, n, ..., 1 respectively. Consequently the number of open ai-paths going through the ai↓ nodes, starting from the outside, is 1^2, 2^2, ..., n^2 respectively, and so taking the sum we obtain that the number of open ai-paths in φ is Ω(n^3). Notice also that φ has length linear in n and width 2, and so the upper bound on the number of open ai-paths given by Prop. 3.10 is exponential, considerably worse than the bound given in Lemma 3.13.
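The weights used in the proof of Lemma 3.13 and in Example 3.14 are easy to compute by dynamic programming. The following sketch assumes that a flow is given by a hypothetical table `below`, listing for each node the nodes reached by its lower edges; a node absent from the table stands for an aw↑ node or a pending lower end. It also checks the sum of squares from Example 3.14 against its closed form. The small flow used here is our own and is not the flow of the example.

```python
from functools import lru_cache


def make_weight(below):
    """Return w, where w(v) counts the directed paths from node v to the bottom."""
    @lru_cache(maxsize=None)
    def w(v):
        lower = below.get(v, ())
        # A node without lower edges is a bottom end and carries one path.
        return 1 if not lower else sum(w(u) for u in lower)
    return w


# Hypothetical flow: an identity node whose left edge enters a cocontraction
# node with two pending lower ends, and whose right edge is pending.
below = {"id": ("coc", "right"), "coc": ("out1", "out2")}
w = make_weight(below)

# The weight of an edge is the weight of the node at its lower end; the product
# of the two edge weights bounds the open ai-paths through the identity node.
paths_through_id = w("coc") * w("right")


def sum_of_squares(n):
    """1^2 + 2^2 + ... + n^2, the count arising in Example 3.14."""
    return sum(i * i for i in range(1, n + 1))
```

The closed form n(n + 1)(2n + 1)/6 for the sum of squares is at least n^3/3, which gives the cubic lower bound of the example.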

Truth tables and tree-like Gentzen systems
KS polynomially simulates tree-like cut-free Gentzen systems since its rules are generalisations of Gentzen rules [BT01] [BG09]. Bruscoli and Guglielmi have proved in [BG09] that the converse does not hold, by way of the so-called 'Statman tautologies'. We offer a new proof here, via truth tables, as an exercise prior to the main results.
Let LK − denote the cut-free sequent calculus, and tree-LK − denote the system of tree-like proofs in this calculus (footnote 3).
Proposition 4.1 (D'Agostino). Tree-LK − cannot polynomially simulate truth tables.

Proof. See [D'A92].
To expand slightly on the above proposition, truth tables are efficient when there are exponentially many occurrences of each atom, and some such tautologies are hard for tree-LK −. One such example, used by D'Agostino, is simply the disjunction of every assignment on k propositional variables, what we call ⋁_𝒜 γ_𝒜 below. Such a formula has size k · 2^k, although any tree-like cut-free sequent proof must contain at least k! branches, while a truth table contains 2^k rows and k · 2^k columns.
3 It does not matter too much which version of the sequent calculus we choose as complexity properties are typically preserved across common variations.
Lemma 4.2. KS + polynomially simulates truth tables.
Proof. Let τ be a tautology. For each partial assignment 𝒜, defined on just those variables appearing in τ, and each formula A satisfied by 𝒜, construct a derivation Φ_𝒜(A) by structural induction on A as follows:
where, in the last case, when A is a disjunction, the disjunct B was chosen such that B is satisfied by 𝒜. It is clear that each Φ_𝒜(τ) has conclusion τ and premiss a conjunction of literals; moreover this conjunction of literals is satisfied by 𝒜. Let γ_𝒜 be the conjunction of all literals satisfied by 𝒜, so that each variable appears exactly once. Then we can easily construct derivations from γ_𝒜 to τ.
Now construct a proof Ψ of ⋁_𝒜 γ_𝒜 in {ai↓, ac↑, s, m} by induction on the number of distinct variables, as shown below:
Finally we put these together and apply contractions to obtain a KS + proof of τ:
It is clear that the derivations inside the large parentheses have size polynomial in |τ|, which is the number of columns in a truth table, and the number of these derivations appearing in disjunction is just the number of assignments, which is the number of rows in a truth table.
Proof. Notice that, in the above simulation, all ac↑ steps are above all ac↓ steps, and so the associated flow will have bounded length. The result follows by Cor. 2.14, Lemma 3.7 and Prop. 3.10.
Remark 4.5. It should be noted that D'Agostino's separation is only quasipolynomial, and so our separation is also only quasipolynomial, while the proof using the Statman tautologies in [BG09] yields an exponential separation. Nonetheless, an exponential separation follows from the results in the next section.
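To make the counting in this section concrete, the following Python sketch builds the disjunction of all assignments on k variables, checks by brute force that it is a tautology, and tabulates the quantities discussed after Prop. 4.1. The function names and the encoding of a literal as a pair (variable index, polarity) are assumptions of the sketch, not notation from [D'A92].

```python
from itertools import product
from math import factorial


def all_assignment_terms(k):
    """One conjunction of k literals for each total assignment on k variables."""
    return [tuple((i, bit) for i, bit in enumerate(bits))
            for bits in product((False, True), repeat=k)]


def is_tautology(terms, k):
    """Brute-force check that every assignment satisfies some conjunction."""
    return all(
        any(all(assignment[i] == polarity for i, polarity in term) for term in terms)
        for assignment in product((False, True), repeat=k))


def statistics(k):
    terms = all_assignment_terms(k)
    return {
        "size": sum(len(term) for term in terms),  # atom occurrences, k * 2**k
        "rows": len(terms),                        # truth table rows, 2**k
        "branches_lower_bound": factorial(k),      # bound quoted in the text
    }
```

For instance, statistics(4) reports size 64 and 16 rows against a lower bound of 24 branches; the factorial overtakes the formula size from k = 6 onwards.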

Separations via variants of the pigeonhole principle
Jeřábek has shown that KS + has polynomial-size proofs of the functional and onto variants of the pigeonhole principle [Jeř09].We show that these proofs reduce under norm to KS proofs with only polynomial increase in size, and so cut-free sequent calculi, Resolution and bounded-depth Frege systems are unable to polynomially simulate KS.
The pigeonhole principle states that, if n pigeons sit in n − 1 holes, some hole must contain more than one pigeon.In its unrestricted formulation, the mapping from pigeons to holes can be many-many, while in the functional variant it must be many-one, i.e. a function, and the onto variant insists that each hole must be occupied.It is clear that the two variants are weaker than the unrestricted version, and the variant containing both criteria, the onto functional pigeonhole principle, is weaker still.We will see this more clearly in the following definition.
The propositional encodings of pigeonhole principle variants below are most easily understood by interpreting the atoms a_ij as "pigeon i sits in hole j". Recall that it does not matter how large disjunctions and conjunctions are bracketed, since any valid bracketing can be easily reduced to any other by the = rule.
Definition 5.1 (Pigeonhole principles). We define the following formulae, and denote by FPHP_n, OPHP_n and OFPHP_n the formulae obtained by putting in disjunction the associated formulae, i.e.
We can see in the above definition that any variant can be obtained from a stronger variant, i.e. one with a subset of disjuncts, by a simple application of generic weakening w↓.Consequently upper bounds on the size of proofs of one variant yield upper bounds for all weaker variants, and lower bounds vice-versa.
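The relationship between the variants can be illustrated by brute force for small n. The sketch below assumes the usual presentation of the pigeonhole tautologies as disjunctions of conjunctions of literals (some pigeon sits nowhere, or some hole holds two pigeons, with further disjuncts for the functional and onto variants); this encoding and the names php_terms and is_tautology are assumptions and need not coincide symbol for symbol with Dfn. 5.1.

```python
from itertools import combinations, product


def php_terms(n, functional=False, onto=False):
    """Disjuncts of a pigeonhole tautology for n pigeons and n - 1 holes.

    A literal is ((i, j), polarity): 'pigeon i sits in hole j' or its dual.
    """
    pigeons, holes = range(n), range(n - 1)
    # Some pigeon sits in no hole.
    terms = [frozenset(((i, j), False) for j in holes) for i in pigeons]
    # Some hole contains two pigeons.
    terms += [frozenset({((i, j), True), ((k, j), True)})
              for j in holes for i, k in combinations(pigeons, 2)]
    if functional:
        # Some pigeon sits in two holes.
        terms += [frozenset({((i, j), True), ((i, k), True)})
                  for i in pigeons for j, k in combinations(holes, 2)]
    if onto:
        # Some hole is empty.
        terms += [frozenset(((i, j), False) for i in pigeons) for j in holes]
    return terms


def is_tautology(terms, n):
    """Check every assignment of the n * (n - 1) atoms by brute force."""
    atoms = [(i, j) for i in range(n) for j in range(n - 1)]
    for bits in product((False, True), repeat=len(atoms)):
        value = dict(zip(atoms, bits))
        if not any(all(value[a] == pol for a, pol in term) for term in terms):
            return False
    return True
```

Each weaker variant only adds disjuncts to the unrestricted formula, which is why a single weakening step suffices to pass from a stronger variant to a weaker one.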
The following result was proved by Beame, Impagliazzo and Pitassi, and independently by Krajíček, Pudlák and Woods.
Proof. All the systems are just special cases of bounded-depth Frege, and a proof of any variant can be extended to one of OFPHP_n by an application of generic weakening w↓.
On the other hand we have the following:
Theorem 5.4 (Buss). There are polynomial-size Frege proofs of PHP_n, and so of all variants of the pigeonhole principle.
From here, polynomial-size SKS proofs of PHP_n follow from the following observation:
Proposition 5.5 (Bruscoli and Guglielmi). SKS is polynomially equivalent to Frege systems.
Notice that one direction of the above proposition, indeed the direction that we require, that SKS polynomially simulates Frege systems, can be obtained by recognising that the rules of SKS are just generalisations of the rules of Gentzen systems with cut, which are well-known to be polynomially equivalent to Frege systems.
The following trick, now standard in the deep inference literature, is very useful for proving certain tautologies in KS, as we will see. The idea is that if we know there is an SKS proof of a tautology, then we can transform it into a KS proof of that tautology in disjunction with trivial contradictions. If we can find derivations from each of these contradictions to the tautology we want to prove, then the composition mechanisms for derivations in deep inference guarantee that we can build a proof of the tautology.
Lemma 5.6. Let A be a formula over the atoms $a_1, \dots, a_n$. Every SKS proof Φ of A can be polynomially transformed to a KS proof of $A \vee \bigvee_{i} (a_i \wedge \bar a_i)$.
Lemma 5.7 (Jeřábek). There are polynomial-size proofs of $\mathrm{FPHP}_n$ and $\mathrm{OPHP}_n$ in KS⁺.

Proof. By Thm. 5.4, Prop. 5.5 and Lemma 5.6 there are polynomial-size KS proofs with conclusion $\mathrm{PHP}_n \vee \bigvee_{i,j} (a_{ij} \wedge \bar a_{ij})$. For each atom $a_{st}$ we construct a derivation $\Phi^{a_{st}}_n$ in KS⁺ \ {ac↓} with premiss $a_{st} \wedge \bar a_{st}$ and conclusion $\mathrm{FPHP}_n$ as follows:

We then put these together and apply contractions to obtain proofs of $\mathrm{FPHP}_n$:

We can construct similar derivations from $a_{st} \wedge \bar a_{st}$ to $\mathrm{OPHP}_n$, given below, and put them together in a similar way to obtain proofs of $\mathrm{OPHP}_n$ in KS⁺.
Theorem 5.8. There are polynomial-size KS proofs of $\mathrm{FPHP}_n$, $\mathrm{OPHP}_n$ and so also $\mathrm{OFPHP}_n$.
Proof. In the proofs of $\mathrm{FPHP}_n$ constructed above (Lemma 5.7), notice that the only ac↑ steps occur in the derivations $\Phi^{a_{st}}_n$, where there are also no ac↓ steps, and similarly for $\mathrm{OPHP}_n$. It follows that there are only two alternations between ac↓ and ac↑ steps in the path of any atom from an ai↓ step, and so the atomic flows of these proofs will have bounded length. The result follows by Thm. 3.8 and Prop. 3.10.

Corollary 5.9. Cut-free sequent calculi, Resolution and bounded-depth Frege systems are exponentially separated from KS.

Polynomial simulations of versions of Resolution
In this section we present a polynomial simulation in KS of tree-like and multiset Resolution systems. We point out that, in previous work [Das12b], a more general result was claimed, namely a simulation of unrestricted Resolution operating with sets; however, that proof contained errors that we could not amend for this version of the article. The status of those results is unresolved.
We define the Resolution system below in both its set and multiset formulations.
Definition 6.1 (Resolution). We use symbols Γ, ∆ etc. to range over (multi)sets of literals and write '∪' to denote (multi)set union. We define the system (multiset-)Resolution by the following rules:

A derivation from (multi)sets $\Gamma_1, \dots, \Gamma_s$ is a list $\pi = (\Delta_1, \dots, \Delta_n)$ where each $\Delta_i$ is some $\Gamma_j$ or follows by one of the rules above whose premisses have occurred previously in the list.
If we further have that $\Delta_n = \emptyset$ then we call π a refutation of $\Gamma_1, \dots, \Gamma_s$. We call a derivation $(\Delta_1, \dots, \Delta_n)$ tree-like if each $\Delta_i$ is used at most once as the premiss of any inference step concluding some $\Delta_j$.
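The definitions of derivation, refutation and tree-likeness can be made concrete with a small checker. The sketch below is only illustrative: it assumes the usual set-based forms of the two rules, namely res deriving Γ ∪ ∆ from Γ ∪ {a} and ∆ ∪ {ā}, and wk deriving Γ ∪ ∆ from Γ, and it assumes that every line of a derivation is annotated with its rule and the indices of its premisses. The names `Step`, `is_derivation`, `is_refutation` and `is_tree_like` are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

Literal = Tuple[str, bool]                   # ("a", True) is a, ("a", False) is its dual
Clause = FrozenSet[Literal]
Step = Tuple[Clause, str, Tuple[int, ...]]   # (conclusion, rule name, premiss indices)


def neg(lit: Literal) -> Literal:
    """Return the dual of a literal."""
    return (lit[0], not lit[1])


def clause(*lits: Literal) -> Clause:
    return frozenset(lits)


def is_derivation(pi: Sequence[Step], axioms: Sequence[Clause]) -> bool:
    """Check that every line is an axiom or follows from earlier lines by wk or res."""
    for k, (concl, rule, prem) in enumerate(pi):
        if any(not 0 <= i < k for i in prem):
            return False                     # premisses must occur earlier in the list
        if rule == "ax":
            ok = not prem and concl in axioms
        elif rule == "wk" and len(prem) == 1:
            ok = pi[prem[0]][0] <= concl     # weakening only adds literals
        elif rule == "res" and len(prem) == 2:
            left, right = pi[prem[0]][0], pi[prem[1]][0]
            ok = any(
                neg(lit) in right and (left - {lit}) | (right - {neg(lit)}) == concl
                for lit in left
            )
        else:
            ok = False
        if not ok:
            return False
    return True


def is_refutation(pi: Sequence[Step], axioms: Sequence[Clause]) -> bool:
    """A refutation is a derivation whose last line is the empty clause."""
    return bool(pi) and is_derivation(pi, axioms) and pi[-1][0] == frozenset()


def is_tree_like(pi: Sequence[Step]) -> bool:
    """Each line may serve as a premiss of at most one inference step."""
    uses = [0] * len(pi)
    for _, _, prem in pi:
        for i in prem:
            uses[i] += 1
    return all(u <= 1 for u in uses)


a, na, b, nb = ("a", True), ("a", False), ("b", True), ("b", False)
axioms = [clause(a), clause(na, b), clause(na, nb)]

dag_like = [
    (clause(a), "ax", ()),
    (clause(na, b), "ax", ()),
    (clause(b), "res", (0, 1)),
    (clause(na, nb), "ax", ()),
    (clause(nb), "res", (0, 3)),             # line 0 is used a second time
    (clause(), "res", (2, 4)),
]

tree_like = [
    (clause(a), "ax", ()),
    (clause(na, b), "ax", ()),
    (clause(b), "res", (0, 1)),
    (clause(a), "ax", ()),                   # the axiom is repeated instead of reused
    (clause(na, nb), "ax", ()),
    (clause(nb), "res", (3, 4)),
    (clause(), "res", (2, 5)),
]
```

Both lists refute the same three clauses; the second is longer because a tree-like derivation must re-derive (here, re-list) any line it needs twice.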
To simplify the treatment of (multiset-)Resolution derivations we address certain rather pathological situations below. These are also the reason why we opt to include weakening in our formulation.

Remark 6.2 (Assumptions on derivation format). In a (multiset-)Resolution derivation, we can assume that neither premiss of a res step contains both a resolved atom and its dual. Any such step would have the following format, which can be simulated by an application of wk to the left premiss, if necessary at all.
We consequently have that no res step has identical premisses, and so there is no need to consider this case in the translations that follow.
Finally, in the case of sets, we further assume that neither the resolved atom nor its dual appears in the conclusion of a res step. I.e., in the definition of the res rule above, we assume that Γ and ∆ contain neither a nor ā. Aside from the above situation this phenomenon might also occur if, say, a ∈ Γ and so Γ = Γ ∪ {a}, whence the step may be simulated by an application of wk in a similar way.

Before presenting our simulations in KS of versions of Resolution, we set some notational conventions below in order to easily switch between the two settings.

Notation 6.3 (Set symbols in deep inference). To reduce the amount of syntax in our deep inference derivations, we will simply write Γ instead of $\bigvee \Gamma$ for the disjunction of the members of Γ, as an abuse of notation.
We similarly use other (multi)set-theoretic notation. In particular, in light of the above remark, we will always have that Γ ∪ {a} = Γ ∨ a in both the set and multiset settings whenever it occurs. In the multiset setting we further have that Γ ∪ ∆ = Γ ∨ ∆ for all Γ, ∆.

Remark 6.4. Throughout this section, when we say that a proof system polynomially simulates a refutation system, we mean that every refutation of A in the latter can be polynomially transformed to a proof of Ā in the former.
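The difference between the two settings in Not. 6.3 can be seen by counting literal occurrences. The following sketch is an illustration only: it models sets by `frozenset` and multisets by `collections.Counter` with additive union, and the two function names are hypothetical. It counts how many literal occurrences of Γ ∨ ∆ would have to be contracted to reach Γ ∪ ∆.

```python
from collections import Counter


def contractions_for_sets(gamma: frozenset, delta: frozenset) -> int:
    """Occurrences of the disjunction of gamma and delta to contract for set union."""
    return len(gamma) + len(delta) - len(gamma | delta)


def contractions_for_multisets(gamma: Counter, delta: Counter) -> int:
    """Same count for multisets, where union adds multiplicities."""
    union = gamma + delta
    return sum(gamma.values()) + sum(delta.values()) - sum(union.values())


gamma_set, delta_set = frozenset({"a", "b"}), frozenset({"b", "c"})
gamma_multi, delta_multi = Counter(gamma_set), Counter(delta_set)

# The shared literal b forces a contraction for sets, but not for multisets.
print(contractions_for_sets(gamma_set, delta_set))
print(contractions_for_multisets(gamma_multi, delta_multi))
```

For sets the count is the size of the intersection of Γ and ∆, while for multisets it is always zero, matching the identity Γ ∪ ∆ = Γ ∨ ∆.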

Definition 6.5 (Dual systems). The dual of a deep inference rule with premiss A and conclusion B is the rule with premiss $\bar B$ and conclusion $\bar A$. In particular, a rule x↓ is dual to x↑, while s and m are self-dual. The set of duals of a system S is denoted $\bar S$. Notice that a rule is sound if and only if its dual is, by the law of contraposition.

The dual of a derivation Φ from A to B is the derivation $\bar\Phi$ from $\bar B$ to $\bar A$ obtained by flipping Φ upside down and replacing each inference step with its dual.

Similarly we define the dual of a flow-rewriting rule to be the rule flipped upside-down, replacing each node with its dual, and the dual of a rewriting system is just the set of its duals. A rule/system is sound, terminating and/or confluent if and only if its dual is.
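The claim that a rule is sound exactly when its dual is can be checked mechanically on small instances. The sketch below is a minimal illustration under stated assumptions: formulae are in negation normal form and encoded as nested tuples, only top-level rule instances are considered (not arbitrary contexts), and all function names are hypothetical. It computes the De Morgan dual of a formula, forms the dual of a rule instance, and tests soundness by truth tables.

```python
from itertools import product

# Formulae in negation normal form, as nested tuples:
#   ("lit", name, polarity), ("and", A, B), ("or", A, B), ("top",), ("bot",)


def dual(f):
    """De Morgan dual: swap and/or, top/bot, and flip literal polarity."""
    tag = f[0]
    if tag == "lit":
        return ("lit", f[1], not f[2])
    if tag == "and":
        return ("or", dual(f[1]), dual(f[2]))
    if tag == "or":
        return ("and", dual(f[1]), dual(f[2]))
    return ("bot",) if tag == "top" else ("top",)


def atoms(f):
    """Set of atom names occurring in a formula."""
    if f[0] == "lit":
        return {f[1]}
    if f[0] in ("and", "or"):
        return atoms(f[1]) | atoms(f[2])
    return set()


def evaluate(f, v):
    """Truth value of a formula under the valuation v (a dict from names to bools)."""
    tag = f[0]
    if tag == "lit":
        return v[f[1]] == f[2]
    if tag == "and":
        return evaluate(f[1], v) and evaluate(f[2], v)
    if tag == "or":
        return evaluate(f[1], v) or evaluate(f[2], v)
    return tag == "top"


def sound(premiss, conclusion):
    """A rule instance is sound if the premiss implies the conclusion."""
    names = sorted(atoms(premiss) | atoms(conclusion))
    for values in product([False, True], repeat=len(names)):
        v = dict(zip(names, values))
        if evaluate(premiss, v) and not evaluate(conclusion, v):
            return False
    return True


def dual_rule(premiss, conclusion):
    """Flip the instance upside down and dualise both formulae."""
    return dual(conclusion), dual(premiss)


a, na = ("lit", "a", True), ("lit", "a", False)
b, c = ("lit", "b", True), ("lit", "c", True)

switch = (("and", a, ("or", b, c)), ("or", ("and", a, b), c))
identity = (("top",), ("or", a, na))         # an instance of ai-down
unsound = (a, ("and", a, b))                 # not a rule of any system considered here
```

Dualising `identity` yields the cut instance with premiss ā ∧ a and conclusion ⊥, and dualising `switch` yields another instance of switch, in line with s being self-dual.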
We are now ready to define our basic translation from Resolution derivations to deep inference, on which our simulation results in later sections will be based.

Definition 6.6 (Translation of Resolution derivations). We give the following translation R of a Resolution step to a $\overline{\mathsf{KS}^+}$ derivation, where the parenthesised c↓ steps are present only when Γ and ∆ are sets (not multisets) that have nonempty intersection.

We extend the definition of R to any (multiset-)Resolution derivation $\pi = (\Delta_1, \dots, \Delta_n)$ from $\Gamma_1, \dots, \Gamma_s$, such that Rπ is a derivation from $\bigwedge_{r=1}^{s} \Gamma_r$ to $\bigwedge_{m=1}^{n} \Delta_m$. The definition is by induction on the length n of the Resolution derivation π.
If n = 0, i.e. π is an empty list, then $\bigwedge_m \Delta_m = \top$ and we define:

If π is extended by a (multi)set $\Gamma_i$ then we define, where the derivation marked ID is already defined by induction.

If π is extended by a wk step then we define, where $\Delta_{n+1} = \Delta_i \cup \Sigma$ and the derivation marked ID is already defined by induction.

If π is extended by a res step then we define, and the derivation marked ID is already defined by induction.

6.1. Multiset Resolution. We consider the multiset setting, for arbitrary derivations (i.e. not necessarily tree-like). Here we achieve a polynomial simulation in KS rather easily by using the identities pointed out in Not. 6.3.

Observation 6.7. The image of a multiset-Resolution derivation under R has no ac↓ steps.
Proof. The only ac↓ steps in the definition of R occur in the translation of individual res steps, where the format of the premiss is Γ ∨ ∆ and the conclusion Γ ∪ ∆. However, as already pointed out in Not. 6.3, these are equivalent in the multiset setting, and so no ac↓ steps are necessary at all.

Theorem 6.8. KS polynomially simulates multiset-Resolution.
Proof. For a multiset-Resolution refutation π of $\Gamma_1, \dots, \Gamma_s$ we have that Rπ is a $\overline{\mathsf{KS}^+}$ derivation from $\bigwedge_{r=1}^{s} \Gamma_r$ to ⊥.

6.2. Tree-like Resolution. We now consider Resolution derivations over sets and show a polynomial simulation of tree-like refutations in KS. Here the argument is only slightly more involved: there are both ac↓ and ac↑ steps occurring, but they do not interact in any complex way. We point out that the proof of this result could perhaps be carried out more simply by writing tree-like derivations as trees and simulating steps locally, discarding premisses once they are used. However, the current presentation allows us to use the same translation R for this and the multiset setting, as well as the extensions introduced in the next section. This uniform treatment is possible due to Lemma 3.13 on the number of open ai-paths in the absence of contraction loops.

Proposition 6.9. If π is a tree-like Resolution derivation then every ac↑ node in fl(Rπ) has one edge pending.
Proof. By induction on the length of π. On inspection of the atomic flows, notice that each translation step introduces only c↑ nodes whose lower left edge is pending (with respect to the horizontal order of atoms given in the definition of R in Dfn. 6.6), so it suffices to show that new nodes are not attached to the lower edge of existing c↑ nodes.
If any such situation existed then, by construction, each c↑ node must be associated with the same set ∆ i that is the premiss of distinct steps in π, contradicting the fact that π is tree-like.
Recall the notion of contraction loop from Dfn. 3.12.

Corollary 6.10. If π is a tree-like Resolution derivation then fl(Rπ) has no contraction loops.
Theorem 6.11. KS polynomially simulates tree-like Resolution.

Proof. For a tree-like Resolution refutation π of $\Gamma_1, \dots, \Gamma_s$, as in the proof of Thm. 6.8, we have that the dual $\overline{R\pi}$ is a KS⁺ proof of $\bigvee_{r=1}^{s} \bar\Gamma_r$. We also have that $fl(\overline{R\pi})$ contains no contraction loops, since otherwise there would also be a contraction loop in fl(Rπ), violating Cor. 6.10. The result now follows from Lemma 3.13 and Thm. 3.8.
6.3. Extensions of Resolution. Finally, we notice that these simulations carry over to some extensions of Resolution, operating on (multi)sets of conjunctions of literals, introduced by Krajíček in [Kra01]. These systems are known to be strictly stronger than the counterparts we have dealt with so far [SBI02] [EGM04].
Definition 6.12. Let s, t, etc. denote terms, i.e. conjunctions of literals, and Γ, ∆, etc. now denote sets of terms. We define the following rules,

For a function f : N → N, a (tree-like) (multiset-)Resolution(f) derivation/refutation is defined analogously to Dfn. 6.1, using the rules above, with the additional proviso that no term has size bigger than f(N), where N is the number of inference steps in the refutation.

Theorem 6.13. For any f : N → N we have the following:
(1) KS polynomially simulates multiset-Resolution(f).
(2) KS polynomially simulates tree-like Resolution(f).
Proof sketch. The proofs are analogous to those appearing earlier for the multiset and tree-like variants of Resolution respectively, interpreting Γ, ∆ etc. as (multi)sets of terms and replacing variables a, ā etc. by term variables and their duals $t, \bar t$ etc. We extend the definition of R on individual steps to ∧ steps, whose conclusion has the form Γ ∪ ∆ ∪ {s ∧ t}, as follows:

R is extended to Resolution(f)-derivations similarly to Dfn. 6.6, dealing with ∧ steps in the same way as res steps.
For (1) the argument is identical to that for Thm. 6.8 since, again, c↓ steps are not introduced for the same reason, cf. Not. 6.3.
For (2) the argument is identical to that for Thm. 6.11 since the argument of Prop. 6.9 remains valid and so there are still no contraction loops in the flow of a derivation in the image of R.

Conclusions
We have presented a series of upper bound results for the deep inference system KS.Polynomial simulations were given for truth tables and versions of Resolution, while polynomial-size proofs of functional and onto variants of the propositional pigeonhole principle were presented, yielding exponential separations from all these systems, as well as bounded-depth Frege systems.
We have seen that atomic flows can act as a powerful tool to analyse and manipulate derivations, and that often we can avoid the possibly exponential blowup arising from the c↓-c↑ rule. A relevant pursuit would be to investigate whether we can always avoid this blowup via, perhaps, a polynomial-time local or global flow reduction; this would imply that KS polynomially simulates KS⁺, and thus quasipolynomially simulates SKS and Frege systems. Despite much work in this direction a result, positive or negative, seems difficult to obtain; this remains arguably the most important question in the proof complexity of deep inference.
Question 7.1. Does KS (quasi)polynomially simulate KS⁺?

This question has already been asked in previous works, e.g. [BG09], [Jeř09], [Str12] and [Das11], with both positive and negative answers conjectured. We point out that our contribution might be helpful to any work towards a positive answer.
We also point out that it could be that atomic flows do not themselves include sufficient information to carry out an efficient normalisation procedure from KS⁺ to KS, if one exists at all. To this end there is work ongoing by various researchers to augment atomic flows with certain logical information, in the hope of accessing further normalisation procedures and the ability to 'rebuild' deep inference proofs efficiently from flow objects.

One might argue that our proofs in Sects. 4 and 5 eventually reduced to flows of bounded length, and so the complexity of normalisation was trivially polynomial. A more sophisticated situation might involve flows of bounded width and logarithmic length, again resulting in a polynomial blowup, or quasipolynomial width and polylogarithmic length, giving a quasipolynomial blowup. We refer the reader to the recent article [Das14] for an example of this, where quasipolynomial-size proofs of the unrestricted pigeonhole principle are given in KS, utilising the techniques of this paper.
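The two regimes just mentioned can be read off from the estimate of Prop. 3.10. The following worked calculation is a sketch under stated assumptions: we write $\|\phi\|$ for the number of open ai-paths of a flow φ, N for its size $|\phi|$, and we use the trivial bound $b \le N$ on the breadth as in Remark 3.11.

```latex
\begin{align*}
\text{General estimate:}\quad
  & \|\phi\| = b \cdot w^{\,l/2 + O(1)} \\[1ex]
\text{If } w = O(1) \text{ and } l = O(\log N):\quad
  & \|\phi\| \le N \cdot 2^{\,\log w \cdot O(\log N)} = N^{O(1)} \\[1ex]
\text{If } w = 2^{\log^{O(1)} N} \text{ and } l = \log^{O(1)} N:\quad
  & \|\phi\| \le N \cdot 2^{\,\log^{O(1)} N \cdot \log^{O(1)} N} = 2^{\log^{O(1)} N}
\end{align*}
```

In the first case Thm. 3.8 then gives a polynomial-time transformation to a KS proof, and in the second a quasipolynomial-time one.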
All the results that appear in this work, and indeed all other works on proof complexity of deep inference, e.g. [BG09] [Jeř09] [Das14], present only simulations and upper bounds. We currently know of no technique for proving lower bounds for deep inference systems and, in light of the question above and results in [Jeř09] and [Das14], it may be that such an endeavour, even for KS, could be as difficult as that of proving lower bounds for Frege systems.

Figure 1: Local rewriting rules for the system norm.
we can construct an SKS derivation with the same premiss and conclusion as Φ and with flow $fl(\Phi){\downarrow}_{\mathsf{norm}}$ in time polynomial in $|\Phi| + \sum_{i=1}^{n} |\phi_i|$.
Theorem 2.15. The function $\cdot{\downarrow}_{\mathsf{norm}}$ mapping a flow to its unique normal form under $\longrightarrow_{\mathsf{norm}}$ lifts polynomially to SKS.
allow us to estimate the size of the normal form of a flow, under norm, without actually constructing it.

Observation 3.4. $\longrightarrow_{\mathsf{norm}}$ preserves the number of open ai-paths in a flow.

Notation 3.5. We write ‖φ‖ to denote the number of open ai-paths in a flow φ, modulo inversions, and #(ρ, φ) to denote the number of ρ nodes in φ for an atomic structural rule ρ.

Lemma 3.6. If φ is the flow of a KS proof then ‖φ‖ = #(ai↓, φ).
Proposition 3.10. If φ is a flow consisting of just KS⁺ nodes with width w, length l and breadth b, then the number of open ai-paths satisfies $\|\phi\| = b \cdot w^{l/2 + O(1)}$.

w configurations in series vertically, and each configuration multiplies the number of paths by w. It does not make much difference if there is a w configuration at the top of the flow because of the way we have defined an ai-path; the exponent differs by just the addition of a constant.

Remark 3.11. We generally use the trivial upper bound of size of flow for width and breadth, yielding the estimate $\|\phi\| = |\phi|^{l/2 + O(1)}$ for a KS⁺ flow φ.

$$R \;:\; \pi \;\mapsto\; \frac{\bigwedge_{r=1}^{s} \Gamma_r}{\top}\;\mathsf{w}{\uparrow}$$

Throughout this translation we shall omit certain = rules required to handle units, e.g. we associate the empty disjunction with ⊥ and the empty conjunction with ⊤. By now this is a routine consideration.

$\bigwedge_{r=1}^{s} \Gamma_r$ to ⊥, by definition, not containing ac↓ steps, by the observation above. Consequently we have that $\overline{R\pi}$ has the following format, with a flow of bounded length (due to the absence of ac↑ nodes), and so can be transformed to a KS proof of the same conclusion in polynomial time by Prop. 3.10 and Thm. 3.8.
which must exist by Lemma 2.19. By Cor. 2.14 we can construct an SKS derivation Ψ with the same premiss and conclusion as Φ and flow ψ in time polynomial in $|\Phi| + \sum_{i=1}^{m} |\phi_i| + \sum_{j=1}^{n} |\psi_j|$. However, by Prop. 2.18 we have that each wk step decreases the size of a flow, so m is bounded above by |φ| (and so also |Φ|), and each cont step increases the size of a flow, so n is bounded above by |ψ|, whence the result follows.
Length of atomic flows. It is not difficult to see that the main contributor to an increase of flow size when reducing under norm is the rule c↓-c↑. It can sometimes cause an exponential blowup, as would be evident in Ex. 2.16 if there were no aw↓ node at the top.
Theorem 3.8. If Φ is a KS⁺ proof with flow φ then we can transform it to a KS proof of the same conclusion in time polynomial in |Φ| + ‖φ‖.

Proof. Let ψ be the normal form of φ under norm. By Prop. 3.1 we have that ψ contains just KS nodes, and so by Thm. 2.15 we have that a KS proof with the same conclusion as Φ can be constructed in time polynomial in |Φ| + |ψ|. The statement follows by the bound |ψ| = O(|φ| + ‖φ‖) = O(|Φ| + ‖φ‖) given by Lemma 3.7.