CONSISTENCY AND COMPLETENESS OF REWRITING IN THE CALCULUS OF CONSTRUCTIONS

D. WALUKIEWICZ-CHRZĄSZCZ AND J. CHRZĄSZCZ

Adding rewriting to a proof assistant based on the Curry-Howard isomorphism, such as Coq, may greatly improve usability of the tool. Unfortunately, adding an arbitrary set of rewrite rules may render the underlying formal system undecidable and inconsistent. While ways to ensure termination and confluence, and hence decidability of type-checking, have already been studied to some extent, logical consistency has received little attention so far. In this paper we show that consistency is a consequence of canonicity, which in turn follows from the assumption that all functions defined by rewrite rules are complete. We provide a sound and terminating, but necessarily incomplete, algorithm to verify this property. The algorithm accepts all definitions that follow the dependent pattern matching schemes presented by Coquand and studied by McBride in his PhD thesis. It also accepts many definitions by rewriting, including rules which depart from standard pattern matching.


Introduction
Equality is ubiquitous in mathematics. Yet it turns out that proof assistants based on the Curry-Howard isomorphism, such as Coq [11], are not very good at handling equality. While proving an equality is not a problem in itself, using already established equalities is quite problematic. Apart from equalities resulting from internal reductions (namely, beta and iota reductions), which can be used via the conversion rule of the calculus of inductive constructions without being recorded in the proof term, any other use of an equality requires giving all details about the context explicitly in the proof. As a result, proof terms may become extremely large, taking up memory and making type-checking time-consuming: working with equations in Coq is not very convenient.
A straightforward idea for reducing the size of proof terms is to allow other equalities in the conversion, making their use transparent. This can be done by using user-defined rewrite rules. However, adding arbitrary rules may easily lead to logical inconsistency, making the proof environment useless. It is of course possible to put the responsibility on the user, but this is contrary to the current Coq policy of guaranteeing consistency of developments without axioms. Therefore it is desirable to retain this guarantee when rewriting is added to Coq. Since consistency is undecidable in the presence of rewriting in general, one has to find some decidable criteria satisfied only by rewriting systems which do not violate consistency.
The syntactical proof of consistency of the calculus of constructions, which is the basis of the formalism implemented in Coq, requires every term to have a normal form [2]. The same proof is also valid for the calculus of inductive constructions [24], which is even closer to the formalism implemented in Coq.
There exist several techniques to prove (strong) normalization of the calculus of constructions with rewriting [1,7,6,21,22], following numerous works about rewriting in the simply-typed lambda calculus. Practical criteria for ensuring other fundamental properties, like confluence, subject reduction and decidability of type-checking, are addressed e.g. in [6].
Logical consistency is also studied in [6]. It is shown under the assumption that for every symbol f defined by rewriting, f(t_1, ..., t_n) is reducible if t_1, ..., t_n are terms in normal form in the environment consisting of one type variable. Apart from a proof sketch that this is the case for the two rules defining the induction predicate for natural numbers and a remark that this property resembles the completeness of definitions, practical ways to satisfy the assumption of the consistency lemma are not discussed.
Techniques for checking completeness of definitions have been known for almost 30 years for the first-order algebraic setting [14,20,15]. More recently, their adaptations to type theory appeared in [12,16] and [18]. In this paper we show how the latter algorithm can be tailored to the calculus of constructions extended with rewriting. We study a system where the set of available function symbols and rewrite rules is not known from the beginning but may grow as the proof development advances, as is the case with concrete implementations of modern proof assistants.
We show that logical consistency is an easy consequence of canonicity, which in turn can be proved from completeness of definitions by rewriting, provided that termination and confluence are proved first. Our completeness checking algorithm completes the list of procedures needed to guarantee logical consistency of developments in a proof assistant based on the calculus of constructions with rewriting.
In fact, in this paper we work in a framework which is slightly more general than the calculus of constructions, namely that of pure type systems, of which the calculus of constructions is an instance. However, since termination and confluence are used both in our algorithm and in the proof of its correctness, our results are useful only if termination and confluence criteria exist for a given pure type system extended with rewriting. Some work in this direction has been done, e.g., in [4].

Rewriting in the Calculus of Constructions
Let us briefly discuss how we imagine introducing rewriting in Coq and what problems we encounter on the way to a usable system.
From the user's perspective definitions by rewriting could be entered just as all other definitions:
[Listing: inductive definition of nat, the rewrite rules defining +, and the declaration of the variable n : nat.]
The above fragment can be interpreted as an environment consisting of the inductive definition of natural numbers, symmetric definition by rewriting of addition and the declaration of a variable n of type nat. In this environment all rules for + contribute to conversion. For instance both ∀x : nat. x + 0 = x and ∀x : nat. 0 + x = x can be proved by λx : nat. refl nat x, where refl is the only constructor of the Leibniz equality inductive predicate. Note that the definition of + is terminating and confluent. The latter can be checked by an (automatic) examination of its critical pairs.
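To illustrate how a symmetric definition of addition lets both x + 0 and 0 + x reduce to x, here is a minimal Python sketch of rewriting to normal form. The four rules used are an assumed rule set chosen for illustration and need not coincide with the rules of the listing above; the tuple encoding of terms and the names step and normalize are likewise hypothetical.

```python
# Terms: the constant "O", a variable name such as "n",
# ("S", t) for successor and ("+", t1, t2) for addition.

def step(t):
    """One rewrite step at the head of t, or None if no assumed rule applies."""
    if isinstance(t, tuple) and t[0] == "+":
        _, a, b = t
        if a == "O":
            return b                              # O + y     --> y
        if b == "O":
            return a                              # x + O     --> x
        if isinstance(a, tuple) and a[0] == "S":
            return ("S", ("+", a[1], b))          # (S x) + y --> S (x + y)
        if isinstance(b, tuple) and b[0] == "S":
            return ("S", ("+", a, b[1]))          # x + (S y) --> S (x + y)
    return None

def normalize(t):
    """Innermost strategy: normalize the arguments, then rewrite at the head."""
    if isinstance(t, tuple):
        t = (t[0],) + tuple(normalize(s) for s in t[1:])
        reduct = step(t)
        if reduct is not None:
            return normalize(reduct)
    return t

# Both orientations of the unit law hold by computation on an open term.
left_unit = normalize(("+", "O", "n"))
right_unit = normalize(("+", "n", "O"))
```

With only the first and third rule, as in a definition by pattern matching on the first argument, the term n + O would be stuck on the variable n; the symmetric rules are what make both equalities hold by conversion.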
Rewrite rules can also be used to define higher-order and polymorphic functions, like the map function on polymorphic lists. In this example, the first two rules correspond to the usual definition of map by pattern matching and structural recursion and the third rule can be used to quickly get rid of the map function in case one knows that f is the identity function.

Symbol map : forall (
Even though we consider higher-order rewriting, we choose the simple matching modulo α-conversion. Higher-order matching is useful for example to encode logical languages by higher-order abstract syntax, but it is seldom used in Coq where modeling relies rather on inductive types. Instead of higher-order matching, one needs a possibility not to specify certain arguments in left-hand sides, and hence to work with rewrite rules built from terms that may be not typable. Consider, for example, the type tree of trees with size, holding some Boolean values in the nodes, and the function rotr performing a right rotation in the root of the tree.
Inductive tree : nat → Set :=
  Leaf : tree O
| Node : forall n1:nat, tree n1 → bool → forall n2:nat, tree n2 → tree (S(n1+n2)).
[Listing: declaration of rotr and its three rewrite rules.]
The first argument of rotr is the size of the tree and the second is the tree itself. The first two rules cover the trees which cannot be rotated and the third one performs the rotation. The ? marks above should be treated as different variables. The information they hide is redundant for typable terms: if we take the third rule for example, the values of ?3, ?4 and ?5 must correspond to the sizes of the trees A, C and E respectively, ?2 must be equal to S(?3+?4) and ?1 to S(?2+?5). Note that by not writing these subterms we make the rule left-linear (and therefore easier to match) and avoid critical pairs with +, thereby helping the confluence proof.
This way of writing left-hand sides of rules was already used by Werner in [24] to define elimination rules for inductive types, making them orthogonal (the left-hand sides are of the form I_elim P f w (c x), where P, f, w, x are distinct variables and c is a constructor of I). In [6], Blanqui gives a precise account of these omissions, using them to make more rewriting rules left-linear. Later, the authors of [8] show that these redundant subterms can be completely removed from terms (in a calculus without rewriting, however). In [3], a new optimized convertibility test algorithm is presented for Coq, which skips the equality test on these redundant arguments.
In our paper we do not explicitly specify which arguments should/could be replaced by ? and do not restrict left-hand sides to be left-linear. Instead, we rely on an acceptance condition to suitably restrict the form of acceptable definitions by rewriting to guarantee the needed metatheoretical properties listed in the next section.
It is also interesting to note that when the first argument of rotr is ?1 then we may understand it as S(?2+?5) matched to terms modulo the convertibility relation and not just syntactically (i.e., modulo α-conversion).

Pure Type Systems with Generative Definitions
Even though most papers motivated by the development of Coq concentrate on the calculus of constructions, we present here a slightly more general formalization of a pure type system with inductive definitions and definitions by rewriting. The presentation, taken from [9,10], is quite close to the way these elements could be implemented in Coq. The formalism is built upon a set of PTS sorts S, a binary relation A and a ternary relation R over S governing the typing rules (Term/Ax) and (Term/Prod) respectively (Figure 1). The syntactic class of pseudoterms is defined as follows: a pseudoterm can be a variable v ∈ Var, a sort s ∈ S, an application, an abstraction or a product. We write |t| to denote the size of the pseudoterm t, with |v| = |s| = 1. We use Greek letters γ, δ to denote substitutions, which are finite partial maps from variables to pseudoterms. The postfix notation is used for the application of substitutions to pseudoterms.
Inductive definitions and definitions by rewriting are generative, i.e. they are stored in the environment and are used in terms only through names they "generate". An environment is a sequence of declarations, each of which is a variable declaration v : t, an inductive definition Ind(Γ_I := Γ_C), where Γ_I and Γ_C are environments providing names and types of (possibly mutually defined) inductive types and their constructors, or a definition by rewriting Rew(Γ, R), where Γ is an environment providing names and types of (possibly mutually defined) function symbols and R is a set of rewrite rules defining them. Types of inductive types, constructors and function symbols determine their arity: given v : t in an inductive definition or a definition by rewriting, if t is of the form (x_1 : t_1) ... (x_n : t_n) t′ where t′ is not a product, then n is the arity of v.
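The pseudoterm grammar, the size |t| and the arity of a declared type can be pictured with the following Python sketch. The class names and the helper arrow are hypothetical, and the size of compound pseudoterms (one plus the sizes of the components) is an assumption, since the text only fixes |v| = |s| = 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:            # variable v; constants are syntactically variables too
    name: str

@dataclass(frozen=True)
class Sort:           # sort s
    name: str

@dataclass(frozen=True)
class App:            # application
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:            # abstraction binding var of type ty in body
    var: str
    ty: object
    body: object

@dataclass(frozen=True)
class Prod:           # product (var : ty) body
    var: str
    ty: object
    body: object

def size(t):
    """|t|, with |v| = |s| = 1; the compound cases are an assumed convention."""
    if isinstance(t, (Var, Sort)):
        return 1
    if isinstance(t, App):
        return 1 + size(t.fun) + size(t.arg)
    return 1 + size(t.ty) + size(t.body)          # Lam or Prod

def arity(ty):
    """Number of leading products of a declared type."""
    n = 0
    while isinstance(ty, Prod):
        n += 1
        ty = ty.body
    return n

def arrow(dom, cod):
    """Non-dependent product, written dom -> cod."""
    return Prod("_", dom, cod)

# Type of the constructor Node from Section 2:
# forall n1:nat, tree n1 -> bool -> forall n2:nat, tree n2 -> tree (S(n1+n2))
nat, boolean = Var("nat"), Var("bool")
n1, n2 = Var("n1"), Var("n2")
tree = lambda index: App(Var("tree"), index)
result = tree(App(Var("S"), App(App(Var("+"), n1), n2)))
node_type = Prod("n1", nat, arrow(tree(n1), arrow(boolean,
            Prod("n2", nat, arrow(tree(n2), result)))))
```

Under this encoding Node has arity 5, so the notation Node(e_1, ..., e_5) denotes its application to exactly five arguments.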
A rewrite rule is a triple denoted by ∆ ⊢ l −→ r, where l and r are pseudoterms and ∆ is an environment, providing names and types of variables occurring in the left- and right-hand sides l and r.
Given an environment E, inductive types, constructors and function symbols declared in E are called constants (even though syntactically they are variables). We often write h(e_1, ..., e_n) to denote the application of a constant h to pseudoterms e_1, ..., e_n, when n is the arity of h. General environments are denoted by E and environments containing only variable declarations are denoted by Γ, ∆, G, D. We assume that names of all declarations in environments are pairwise distinct. A pair consisting of an environment E and a term e is called a sequent and denoted by E ⊢ e. A sequent is well-typed if E ⊢ e : t for some t.
Definition 3.1. A pure type system with generative definitions is defined by the typing rules in Figure 1, where:
• POS is a positivity condition for inductive definitions (see assumptions below).
• ACC is an acceptance condition for definitions by rewriting (idem).
• The relation ≈ used in the rule (Term/Conv) is the smallest congruence on well-typed terms, generated by −→, which is the sum of beta and rewrite reductions, denoted by −→_β and −→_R respectively (for the exact definition see [10], Section 2.8).
• The notation δ : Γ → E means that δ is a well-typed substitution, i.e. E ⊢ vδ : tδ for all v : t ∈ Γ.
As in [22,6], recursors and their reduction rules have no special status and they are supposed to be expressed by rewriting.
Assumptions. We assume that we are given a positivity condition POS for inductive definitions and an acceptance condition ACC for definitions by rewriting. Together with the right choice of the PTS they must imply the following properties: [...] E ⊢ e′′ −→* ê for some ê.
These properties are usually true in all well-behaved type theories. They are for example all proved for the calculus of algebraic constructions [6], an extension of the calculus of constructions with inductive types and rewriting, where POS is the strict positivity condition as defined in [17], and ACC is the General Schema.
From now on, we use the notation t↓ for the unique normal form of t.

Consistency and Completeness
Consistency of the calculus of constructions (resp. calculus of inductive constructions) can be shown by rejecting all cases of a hypothetical normalized proof e of (x : *)x in a closed environment, i.e. empty environment (resp. an environment containing only inductive definitions and no axioms). Our goal is to extend the definition of closed environments to the [...] Let us try to identify that class. If we reanalyze e in the new setting, the only new possible normal form of e is an application f(e⃗) of a function symbol f, coming from a rewrite definition Rew(Γ, R), to some arguments in normal form. There is no obvious argument why such terms cannot be proofs of (x : *)x. On the other hand, if we knew that such terms were always reducible, we could complete the consistency proof. Let us call COMP(Γ, R) the condition on rewrite definitions we are looking for (i.e. f(e⃗) is always reducible), which can also be read as: the function symbols from Γ are completely defined by the set of rules R.
Note that the completeness of f has to be checked much earlier than it is used: we use it in a given closed environment E = E_1; Rew(Γ, R); E_2 but it has to be checked when f is added to the environment, i.e. in the environment E_1. It implies that completeness checking has to account for environment extension and can be performed only with respect to arguments of types whose set of normal forms cannot change in the future. This is the case for arguments of inductive types.
The requirement that functions defined by rewriting are completely defined could very well be included in the condition ACC. On the other hand, the separation between ACC and COMP is motivated by the idea of working with abstract function symbols, equipped with some rewrite rules not defining them completely. For example, if + from Section 2 were declared using only the third rewrite rule, one could develop a theory of an associative function over natural numbers.
The intuition behind the definitions given below is the following. A rewrite definition Rew(Γ, R) is complete (satisfies COMP(Γ, R)) if for all f in Γ, the goal f(x_1, ..., x_n) is covered by R. A goal is covered if all its instances are immediately covered, i.e. head-reducible by R. Following the discussion above we limit ourselves to normalized canonical instances, i.e. built from constructors wherever possible.
Definition 4.1 (Canonical form and canonical substitution). Given a judgment E ⊢ e : t we say that the term e is in canonical form if and only if:
• if t↓ is an inductive type then e = c(e_1, ..., e_n) for some constructor c and terms e_1, ..., e_n in canonical form,
• otherwise e is arbitrary.
Let ∆ be a variable environment and E a correct environment. We call δ : ∆ → E canonical if for every variable x ∈ ∆, the term xδ is canonical.
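Definition 4.1 can be read as a recursive check. The Python sketch below covers only a simplified, non-dependent setting in which every constructor has a fixed list of argument types; the table CONSTRUCTORS, the tuple encoding of terms and the function is_canonical are hypothetical names introduced for illustration.

```python
# Assumed signature: inductive type -> constructor -> argument types.
CONSTRUCTORS = {
    "nat": {"O": [], "S": ["nat"]},
    "bool": {"true": [], "false": []},
}

def is_canonical(e, ty):
    """Terms are variable names (str) or tuples (head, arg_1, ..., arg_n)."""
    if ty not in CONSTRUCTORS:
        return True                    # not an inductive type: e is arbitrary
    if isinstance(e, str) or e[0] not in CONSTRUCTORS[ty]:
        return False                   # a variable or a non-constructor head
    arg_types = CONSTRUCTORS[ty][e[0]]
    return (len(e) - 1 == len(arg_types)
            and all(is_canonical(a, t) for a, t in zip(e[1:], arg_types)))
```

For instance, S O is canonical at type nat, whereas S n with n a variable and plus O O are not, which is why completeness only has to be checked on instances built from constructors.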
From now on, let E be a global environment and let Rew(Γ, R) be a rewrite definition such that E ⊢ Rew(Γ, R) : correct. Let f : (x_1 : t_1) ... (x_n : t_n) t ∈ Γ be a function symbol of arity n.
A normalized canonical instance of the goal E; Γ; ∆ ⊢ f(e_1, ..., e_n) is a well-typed sequent E; Rew(Γ, R); E′ ⊢ f(e_1δ↓, ..., e_nδ↓) for any canonical substitution δ : ∆ → [...]
A term e is immediately covered by R if there is a rule G ⊢ l −→ r in R and a substitution γ such that lγ = e. By obvious extension we can also write that a goal or a normalized canonical instance is immediately covered by R.
A goal is covered by R if all its normalized canonical instances are immediately covered by R.
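Immediate coverage amounts to syntactic matching of a left-hand side against the goal. The following Python sketch shows this for first-order terms without binders, so matching modulo α-conversion degenerates to plain matching. The encoding (strings for variables, tuples for applications) and the rules for or, one per combination of true and false, are assumptions made for the example.

```python
def match(pattern, term, subst=None):
    """Return gamma such that pattern instantiated by gamma equals term, else None."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                   # pattern variable
        if pattern in subst and subst[pattern] != term:
            return None                            # non-linear variable, two values
        subst[pattern] = term
        return subst
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None                                # goal variable or different head
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def immediately_covered(term, left_hand_sides):
    return any(match(l, term) is not None for l in left_hand_sides)

TRUE, FALSE = ("true",), ("false",)
OR_LHS = [("or", TRUE, TRUE), ("or", TRUE, FALSE),
          ("or", FALSE, TRUE), ("or", FALSE, FALSE)]

goal = ("or", "b", TRUE)        # b is a goal variable
instance = ("or", TRUE, TRUE)   # a normalized canonical instance of the goal
```

Here the goal is not immediately covered, because no left-hand side matches the variable b, while each of its canonical instances is; the goal is therefore covered without being immediately covered.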
Note that, formally, a normalized canonical instance is not a goal.The difference is that the conversion corresponding to the environment of an instance contains reductions defined by R, while the one of a goal does not.

Definition 4.3 (Complete definition).
A rewrite definition Rew(Γ, R) is complete in the environment E, which is denoted by COMP_E(Γ, R), if and only if for all function symbols f : (x_1 : t_1) ... [...]
[...] Both (with their respective environments) are goals for rotr, and t_2 (with a slightly different environment) is also a normalized canonical instance of t_1. The goal t_1 is not immediately covered, but its instance t_2 is, as it is head-reducible by the second rule defining rotr. Since other instances of t_1 are also immediately covered, the goal is covered (see Example 5.20). [...]
Proof. By induction on the size of e. If t↓ is not an inductive type then any e is canonical. Otherwise, let us analyze the structure of e. It cannot be a product, an abstraction or a sort because t↓ is an inductive type. Since E is closed, it is not a variable either. Hence e is of the form e′ e_1 ... e_m (with m possibly equal to 0), where e′ is not an application. The term e′ can be neither a product, nor a sort (they cannot be applied), nor a variable (E is closed). It is not an abstraction, since e is in normal form. The only possibility left is that e′ is a constant h of arity n ≤ m, and we get e = h(e_1, ..., e_n) e_{n+1} ... e_m.
Since t↓ is an inductive type, h cannot be an inductive type. If it is a constructor then n = m and by induction hypothesis e_1, ..., e_n are in canonical form and so is h(e_1, ..., e_n). If h is a function symbol then E = E_1; Rew(Γ, R); E_2 for some E_1, E_2 and h : (x_1 : t_1) ... [...]
Hence, we have E′ ⊢ e False : False. By Lemma 4.6, the normal form of e False is canonical. Since False has no constructors, this is impossible.

Checking Completeness
The objective of this section is to provide an algorithm for checking completeness of definitions by rewriting. The algorithm presented in Subsection 5.2 checks that a goal is covered using successive splitting (Definition 5.3), i.e., replacement of variables of inductive types by constructor patterns. In order to know which constructor terms can replace a given variable, one has to compare types and hence an algorithm for unification modulo conversion is needed (Definition 5.2). Consider for example the first rule of the definition of rotr. It is clear that only Leaf can replace t in rotr O t because other trees have types that do not unify with tree O.
Correctness of the completeness checking algorithm is proved in Lemma 5.19.It is done using an additional assumption on rewrite systems called preservation of reducibility which is discussed in Subsection 5.1.

Definition 5.1 (Unification problem).
A quadruple E, ∆ ⊢ t ≐ s, where E is an environment, ∆ a variable environment and s, t are terms, is a unification equation in E. A unification problem in E is a finite set of unification equations. Without loss of generality we may assume that the variable environments ∆ in all equations are the same.
A unifier or a solution of the unification problem U is a substitution γ : ∆ → E; E′ such that E; E′ ⊢ tγ ≈ sγ for every E, ∆ ⊢ t ≐ s in U. We say that E′ is the co-domain of γ, which is denoted by Ran(γ).

Definition 5.2 (Correct unification algorithm).
A unification algorithm is a procedure which for every unification problem U = {E, ∆ ⊢ t_i ≐ s_i} returns a substitution γ, a bottom ⊥, or a question mark ?. The algorithm is correct if and only if: if it answers γ, it is the most general unifier γ : ∆ → E; ∆′ such that ∆′ ⊆ ∆ and for all x ∈ ∆′, γ(x) = x; if it answers ⊥, U has no unifier.
Since unification modulo conversion is undecidable, every correct unification algorithm must return ? in some cases, which may be seen as too difficult for the algorithm. An example of such a partial unification algorithm is constructor unification, that is, first-order unification with constructors and type constructors as rigid symbols, answering ? whenever one compares a non-trivial pair of terms involving non-rigid symbols.
From now on we assume the existence of a correct (partial) unification algorithm Alg. [...] If Alg(U_c) = ? for some c, the splitting fails.
Example 5.4. If one splits the goal rotr n t along n, one gets two goals: rotr O t and rotr (S m) t. The first one is immediately covered by the first rule for rotr and if we split the second one along t, the Leaf case is impossible, because tree O does not unify with tree (S m), and the Node case gives rotr (S (nA + nC)) (Node nA A b nC C).
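The three possible answers of a partial unification algorithm, and the type comparisons of Example 5.4, can be illustrated by the following Python sketch of constructor unification. It is a simplified first-order version: the set RIGID, the sentinels BOTTOM and UNKNOWN, and the tuple encoding of terms are assumptions of the sketch, and it gives up with ? at the first pair it cannot decide.

```python
BOTTOM, UNKNOWN = "no unifier", "too difficult"   # stand for the answers ⊥ and ?
RIGID = {"O", "S", "tree", "Leaf", "Node"}        # constructors and type constructors

def walk(t, s):
    """Follow variable bindings of the substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(x, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == x
    return any(occurs(x, a, s) for a in t[1:])

def is_rigid(t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return True
    return t[0] in RIGID and all(is_rigid(a, s) for a in t[1:])

def unify(t1, t2, s=None):
    """Variables are strings; applications are tuples (head, arg_1, ..., arg_n)."""
    s = dict(s or {})
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str) or isinstance(t2, str):
        x, t = (t1, t2) if isinstance(t1, str) else (t2, t1)
        if occurs(x, t, s):
            return BOTTOM if is_rigid(t, s) else UNKNOWN
        s[x] = t
        return s
    if t1[0] not in RIGID or t2[0] not in RIGID:
        return UNKNOWN                 # non-trivial pair with a defined symbol
    if t1[0] != t2[0] or len(t1) != len(t2):
        return BOTTOM                  # clash of rigid symbols
    for a, b in zip(t1[1:], t2[1:]):
        s = unify(a, b, s)
        if not isinstance(s, dict):
            return s
    return s

O = ("O",)
size_of_node = ("S", ("+", "nA", "nC"))           # index in the type of Node

leaf_case = unify(("tree", O), ("tree", ("S", "m")))
node_case = unify(("tree", size_of_node), ("tree", ("S", "m")))
hard_case = unify(("tree", ("+", "nA", "nC")), ("tree", ("S", O)))
```

The Leaf case is rejected with ⊥, the Node case binds m to nA + nC as in the example, and a comparison whose index is headed by the defined symbol + is answered with ?, in which case splitting fails.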

Preservation of Reducibility.
Although one would expect that an immediately covered goal is also covered, it is not always true, even for confluent systems. It turns out that we need a property of critical pairs that is stronger than just joinability. Let us suppose that or : bool → bool → bool is defined by four rules by cases over true and false and that if : bool → bool → bool → bool is defined by two rules by cases on the first argument. [...] In the example presented above all expressions used in types and rules are in normal form, all critical pairs are joinable, the system is terminating, and splitting of f b i along i results in the only reducible goal f (or b b) (C b). In spite of that, f is not completely defined, as f true (C true) is a normalized canonical instance of f (or b b) (C b) and it is not reducible. In order to know that an immediately covered goal is always covered we need one more condition on rewrite rules, called preservation of reducibility.
Definition 5.6. Definition by rewriting Rew(Γ, R) preserves reducibility in an environment E if for every critical pair ⟨f(u⃗), rδ⟩ of a rule coming from R or from some other rewrite definition in E, the term f(u⃗↓) is head-reducible by R.
Note that by using ? variables in rewrite rules one can get rid of (some) critical pairs and hence make a definition by rewriting satisfy this property. In the example above one could write f ? (C b) as the left-hand side. This would also make the system non-terminating, and show that f is not really well-defined.
Of course all orthogonal rewrite systems, in particular inductive elimination schemes, as defined in [24], preserve reducibility. [...]
Otherwise, let G_1 ⊢ f(l⃗) −→ r be a rule from R and γ a substitution such that f(e⃗) = f(l⃗)γ and let us make one reduction step e_i −→ e′_i, using the rule G_2 ⊢ g −→ d. There are two possibilities: the reduction in e_i happens either in the substitution γ, i.e. in the term γ(x), where x is a free variable of f(l⃗), or it happens at a position p that belongs to f(l⃗). In the former case, let us do the identical reduction in all other instances of x. Obviously, we get a term f(e′_1, ..., e′_n) that is smaller than e in −→ and is still an instance of f(l⃗). Hence by induction hypothesis we get the desired conclusion.
Otherwise, f(l⃗) and g superpose at some nonvariable position and we have f(l⃗)|_p γ = gξ for some position p and substitution ξ. Since we may suppose that free variables of f(l⃗) and g are different, we get f(l⃗)|_p (γ ∪ ξ) = g(γ ∪ ξ). Let δ be the most general unifier of f(l⃗)|_p and g and let ⟨f(u⃗), rδ⟩ be the corresponding critical pair. Since δ is the most general unifier, there exists σ such that [...] e_n). By preservation of reducibility f(u⃗↓) is head-reducible by R. Hence f(u⃗↓)σ is also head-reducible by R. As above, we can apply the induction hypothesis and deduce that f(e⃗↓) is head-reducible by R.
Lemma 5.8. Let Rew(Γ, R) preserve reducibility in an environment E, let f ∈ Γ and let E; Γ; ∆ ⊢ f(e⃗) be a goal. If it is immediately covered then it is covered.
Proof. Let E; Γ; ∆ ⊢ f(e⃗) be a goal immediately covered by R and δ : ∆ → E; Rew(Γ, R); E′ be a canonical substitution. Obviously, E; Rew(Γ, R); E′ ⊢ f(e⃗δ) is immediately covered by R. Hence, by Lemma 5.7, E; Rew(Γ, R); E′ ⊢ f(e⃗δ↓) is also immediately covered by R, i.e. E; Γ; ∆ ⊢ f(e⃗) is covered.

Coverage Checking Algorithm.
In this section we present an algorithm checking whether a set of goals is covered by the given set of rewrite rules. The algorithm is correct only for definitions that preserve reducibility. The algorithm, in a loop, picks a goal, checks whether it is immediately covered, and if not, splits the goal, replacing it by the subgoals resulting from splitting. In order to ensure termination, splitting is limited to safe splitting variables. Intuitively, a splitting variable is safe if it lies within the contour of the left-hand side of some rule when we superpose the tree representation of the left-hand side with the tree representation of the goal. The number of nodes that have to be added to the goal in order to fill the tree of the left-hand side is called a distance, and a sum of distances over all rules is called a measure. Since the measures of goals resulting from splitting are smaller than the measure of the original goal, the coverage checking algorithm terminates.
This subsection is organized as follows. We start by defining the splitting matching algorithm which is used to define safe splitting variables. Next, we provide definitions and lemmas needed to prove termination of the coverage checking algorithm and then we give the algorithm itself and the proof of its correctness. We conclude this subsection with some positive and negative examples leading to an extension of the algorithm allowing us to accept definitions by case analysis even if the unification algorithm is not strong enough.
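The loop described above can be sketched in Python for a much simpler setting than the one of this paper: first-order, non-dependent types (so splitting never needs unification and never fails) and left-linear rules. All names (CONSTRUCTORS, matches, splitting_vars, check_coverage) are hypothetical, and the sketch does not check preservation of reducibility, which is still required to conclude that a definition is complete.

```python
import itertools

# Assumed signature: inductive type -> list of (constructor, argument types).
CONSTRUCTORS = {"bool": [("true", []), ("false", [])],
                "nat": [("O", []), ("S", ["nat"])]}

# Goal variables are ("?", name, type); rule variables are plain strings;
# applications are tuples (head, arg_1, ..., arg_n).

def matches(pattern, term):
    """Does the (left-linear) pattern match the goal term?"""
    if isinstance(pattern, str):
        return True
    if term[0] == "?" or pattern[0] != term[0]:
        return False
    return all(matches(p, t) for p, t in zip(pattern[1:], term[1:]))

def splitting_vars(pattern, term):
    """Goal variables facing a constructor of the pattern; None on a clash."""
    if isinstance(pattern, str):
        return []
    if term[0] == "?":
        return [term]
    if pattern[0] != term[0]:
        return None
    found = []
    for p, t in zip(pattern[1:], term[1:]):
        sub = splitting_vars(p, t)
        if sub is None:
            return None
        found += sub
    return found

def substitute(t, var, replacement):
    if t == var:
        return replacement
    if t[0] == "?":
        return t
    return (t[0],) + tuple(substitute(a, var, replacement) for a in t[1:])

def check_coverage(goal, left_hand_sides):
    """Return the list of potential counterexamples (empty means covered)."""
    fresh = itertools.count()
    todo, counterexamples = [goal], []
    while todo:
        g = todo.pop()
        if any(matches(l, g) for l in left_hand_sides):
            continue                               # immediately covered
        candidates = [v for l in left_hand_sides
                      for v in (splitting_vars(l, g) or [])]
        if not candidates:
            counterexamples.append(g)              # no safe variable to split
            continue
        var = candidates[0]
        for c, arg_types in CONSTRUCTORS[var[2]]:
            args = tuple(("?", f"{var[1]}{next(fresh)}", ty) for ty in arg_types)
            todo.append(substitute(g, var, (c,) + args))
    return counterexamples

TRUE, FALSE = ("true",), ("false",)
OR_RULES = [("or", TRUE, TRUE), ("or", TRUE, FALSE),
            ("or", FALSE, TRUE), ("or", FALSE, FALSE)]
or_goal = ("or", ("?", "a", "bool"), ("?", "b", "bool"))
plus_goal = ("plus", ("?", "n", "nat"), ("?", "m", "nat"))
PLUS_RULES = [("plus", ("O",), "y"), ("plus", ("S", "x"), "y")]
```

Calling check_coverage(or_goal, OR_RULES) returns an empty list, while dropping the last rule leaves the goal or false false as a counterexample. Splitting only variables that face a constructor of some left-hand side is what keeps the loop finite, mirroring the role of safe splitting variables and the decreasing measure.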
Let us start with the splitting matching algorithm which finds variables in t_1 that lie within the contour of t_2.
Definition 5.9 (Splitting matching). The splitting matching algorithm is defined in Figure 2. Given two sequents ∆_1 ⊢ t_1 and ∆_2 ⊢ t_2, it returns the unique set S, such that [...]
Definition 5.10 (Safe splitting variable). Let ∆_1 ⊢ t_1 and ∆_2 ⊢ t_2 be sequents such that t_2 is a left-hand side of a rule from R and let S be a set such that t_1 <_Λ t_2 ⇒ S and ⊥ ∉ S. A variable x ∈ ∆_1 is a safe splitting variable for ∆_1 ⊢ t_1 along ∆_2 ⊢ t_2 if it is a splitting variable and there exists p ∈ S such that t_1|_p = x and either t_2|_p is a variable declared in ∆_2 or t_2|_p = c(e⃗) for some constructor c and some terms e⃗.
The set of safe splitting variables for the sequent [...] or SV(t_1, t_2) for short. SV(t, R) is the set of safe splitting variables for t along left-hand sides of rules from R.
Definition 5.12 (Distance). Let ∆_1 ⊢ t_1 and ∆_2 ⊢ t_2 be sequents and S be a set such that t [...]
The following two lemmas state that the distance of a term does not increase when we apply a substitution, and that it decreases strictly if the substitution results from splitting.
Lemma 5.13 (Distance of a substituted sequent). Let ∆_1 ⊢ t_1 and ∆_2 ⊢ t_2 be sequents and let S be a set such that t_1 <_Λ t_2 ⇒ S. Then for every substitution γ : [...]
[...] Otherwise, let us take p ∈ S and the set Q_p = {q ∈ S_γ | p ⪯ q}, where ⪯ is the prefix ordering. Since all positions from Q_p are independent (as t_1γ|_q ∈ Var for every q ∈ S_γ) we have Σ_{q∈Q_p} |t_2|_q| ≤ |t_2|_p| and the equality holds only if Q_p = {p}. Let us show that ∀q ∈ S_γ ∃p ∈ S p ⪯ q. Indeed, assuming that ⊥ ∉ S_γ, q ∈ S_γ either because q ∈ S and (t_1|_q)γ ∈ ∆′ or because there is a position p ∈ S such that q = p · q′ for some q′ and (t_1|_p γ)|_q′ ∈ ∆′. Of course, since positions in S are independent, the sets Q_p are disjoint for different p.
Hence S_γ = ⋃_{p∈S} Q_p and [...]
Lemma 5.14 (Distance after splitting strictly decreases). Let E; Γ; ∆ ⊢ f(e_1, ..., e_n) be a goal, t = f(e_1, ..., e_n), let G ⊢ l −→ r be one of the rewrite rules for f in R and let S be a set such that t <_Λ l ⇒ S and ⊥ ∉ S. If x : I u ∈ SV(t, l) is a safe splitting variable and splitting t along [...]
Proof. Let σ_c ∈ Sp(x) and let S_c be a set such that tσ_c <_Λ l ⇒ S_c. By Lemma 5.13 we have dist(tσ_c, l) ≤ dist(t, l). Let us analyze the proof of that lemma and show that in case of a substitution resulting from splitting there is a strict inequality between dist(tσ_c, l) and dist(t, l). In the proof it was noticed that for every p ∈ S, [...]
Proof. Let us consider a successful run of the algorithm, performing the body of the Repeat loop a finite number of times and resulting in CE = ∅. By induction on n, the number of Repeat steps until the end of the algorithm, we prove that the goals appearing in W are covered.
The base case, for n = 0, is trivial since W_0 is empty. Now suppose that n steps before the end of the algorithm all goals in W_n are covered and let us check that this was true n + 1 steps before the end, i.e. one step of the algorithm earlier.
In case 2, W_{n+1} contains all goals from W_n and one goal φ which is immediately covered by a rule in R. By preservation of reducibility (Lemma 5.8) every normalized canonical instance of φ is also immediately covered and consequently all goals of W_{n+1} are covered.
Case 3(a) is impossible since it makes the set CE non-empty.
In case 3(b)i, W_{n+1} contains some of the goals from W_n and one goal φ whose subgoals resulting from successful splitting are already in W_n. By Lemma 5.5 the set of normalized canonical instances of these subgoals contains the set of normalized canonical instances of φ. Hence W_{n+1} is covered.
In case 3(b)ii the sets of goals in W_{n+1} and W_n are equal. Hence the initial goal in W is also covered.
[...] immediately covered by the second and the third rule respectively. Since we started with the initial goal rotr n t and since the definition of rotr preserves reducibility, it is complete.
When the coverage checking algorithm stops with CE ≠ ∅, we cannot deduce that R is complete. The set CE contains potential counterexamples. They can be true counterexamples, false counterexamples, or goals for which splitting failed along all safe variables, due to incompleteness of the unification algorithm. In some cases further splitting of a false counterexample may result in reducible goals or in the elimination of the goal as uninhabited, but it may also loop. Some solutions preventing looping (finitary splitting) can be found in [18].
Unfortunately splitting failure due to incompleteness of the unification may happen while checking coverage of a definition by case analysis over complex dependent inductive types (for example trees of size 2), even if rules for all constructors are given.Therefore, it is advisable to add a second phase to our algorithm, which would treat undefined output of unification as success.Using this second phase of the algorithm, one can accept all definitions by case analysis that can be written in Coq.
One splitting of JMelim A a P h b c over c results in JMelim A a P h a (JMrefl A a) which is equal to the left-hand side of the rule.Hence this rule completely defines JMelim.6.2.Uniqueness of Identity Proofs and Streicher's axiom K. Consider the type eq and the definition of function UIP, proving that identity proofs are unique: Inductive eq (A:Set)(a:A): A → Set := refl: eq A a a.
Symbol UIP : forall (A:Set)(a b:A)(p q: eq A a b), (eq (eq A a b) p q) Rules UIP A a a (refl A a) (refl A a) −→ refl (eq A a a) (refl A a).
The function UIP is completely defined since two subsequent splittings of UIP A a b p q, along p and along q, result in UIP A a a (refl A a) (refl A a) which is exactly the left-hand side of the only rule for UIP.
The rule for Streicher's axiom K can also easily be proved complete:

  Symbol K : forall (A:Set) (a:A) (P:eq A a a → Set), P (refl A a) → forall p: eq A a a, P p
  Rules K A a P h (refl A a) −→ h

Note that both rules for UIP and K can also be written in a left-linear form:

  UIP A a ?1 (refl ?2 ?3) (refl ?4 ?5) −→ refl (eq A a a) (refl A a)
  K A a P h (refl ?1 ?2) −→ h

6.3. Non pattern matching rules. These are two examples of complete definitions which do not follow the pattern matching schemes as defined in [12] and [16].

In this paper we study consistency of the calculus of constructions with rewriting. More precisely, we propose a formal system extending an arbitrary PTS with inductive definitions and definitions by rewriting. Assuming that suitable positivity and acceptance conditions guarantee termination and confluence, we formalize the notion of a complete definition by rewriting. We show that in every environment consisting only of inductive definitions and complete definitions by rewriting there is no proof of (x : *)x. Moreover, we present a sound and terminating algorithm for checking completeness of definitions. It is necessarily incomplete, since in the presence of dependent types emptiness of types trivially reduces to completeness and the former is undecidable.
Our coverage checking algorithm resembles the one proposed by Coquand in [12] for Martin-Löf type theory and used by McBride for his OLEG calculus [16]. In these works the procedure consisting of successive case-splittings is used to interactively build pattern matching equations, or to check that a given set of equations can be built this way. Unlike in our paper, Coquand and McBride do not have to worry whether all instances of a reducible subgoal are reducible. Indeed, in [12] pattern matching equations are meant to be applied to terms modulo conversion, and in [16] equations (or rather the order of splittings in the successful run of the coverage checking procedure) serve as a guideline to construct an OLEG term verifying the equations. Equations themselves are never used for reduction and the constructed term reduces according to existing rules.
In our paper rewrite rules are matched against terms modulo α-conversion. Rewriting has to be confluent, strongly normalizing and has to preserve reducibility. Under these assumptions we can prove completeness for all examples from [12] and for the class of pattern matching equations considered in [16]. In particular we can deal with elimination rules for inductive types and with Streicher's axiom K. Moreover, we can accept definitions which depart from standard pattern matching, like rotr and +.
The formal presentation of our algorithm is directly inspired by the work of Pfenning and Schürmann [18]. A motivation for that paper was to verify that a logic program in the Twelf prover covers all possible cases. In LF, the base calculus of Twelf, there is no polymorphism, no rewriting and conversion is modulo βη-conversion. The authors use higher-order matching modulo βη-conversion, which is decidable for patterns à la Miller and strict patterns. Moreover, since all types and function symbols are known in advance, the coverage is checked with respect to all available function symbols. In our paper, conversion contains rewriting and it cannot be used for matching; instead we use matching modulo α. This simplifies the algorithm searching for safe splitting variables, but on the other hand it does not fit well with instantiation and normalization. To overcome this problem we introduce the notions of normalized canonical instance and preservation of reducibility, which were not present in the previously mentioned papers. Finally, since the sets of function symbols and rewrite rules grow as the environment extends, coverage is checked with respect to constructors only.
Even though the worst-case complexity of the coverage checking is clearly exponential, for practical examples the algorithm should be quite efficient. It is very similar in spirit to the algorithms checking exhaustiveness of definitions by pattern matching in functional programming languages and these are known to work effectively in practice.
An important issue which is not addressed in this paper is to know how much we extend conversion. Of course it depends on the choice of conditions ACC and POS and on the unification algorithm used for coverage checking. In particular, some of the definitions by pattern matching can be encoded by recursors [13], so if ACC is strict, we may have no extension at all. In general there seem to be at least two kinds of extensions. The first are non-standard elimination rules for inductive types, but the work of McBride shows that the axiom K is sufficient to encode all other definitions by pattern matching considered by Coquand. The second are additional rules which extend a definition by pattern matching (like associativity for +). It is known that for first-order rewriting, these rules are inductive consequences of the pattern matching ones, i.e. all their canonical instances are satisfied as equations (see e.g. Theorem 7.6.5 in [19]). Unfortunately, this is no longer true for higher-order rules over inductive types with functional arguments. Nevertheless it seems that such rules are inductive consequences of the pattern matching rules if the corresponding equality is extensional.
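The notion of inductive consequence can be made concrete on associativity of +. The sketch below is only a bounded test, not a proof: it assumes that + is defined by the two pattern matching rules O + y −→ y and (S x) + y −→ S (x + y) (the exact rules used for + in this paper may differ), and it checks that both sides of (x + y) + z = x + (y + z) have the same normal form for all canonical instances up to a chosen bound. The names `normalize`, `numeral` and `assoc_holds_up_to` are hypothetical.

```python
from itertools import product

O = ("O",)

def S(n):
    return ("S", n)

def plus(a, b):
    return ("+", a, b)

def normalize(t):
    """Innermost normalization with the two assumed pattern matching rules:
    O + y -> y   and   (S x) + y -> S (x + y)."""
    if t[0] == "S":
        return S(normalize(t[1]))
    if t[0] == "+":
        a, b = normalize(t[1]), normalize(t[2])
        if a[0] == "O":
            return b
        if a[0] == "S":
            return S(normalize(plus(a[1], b)))
        return plus(a, b)          # stuck term; not reached on closed inputs
    return t

def numeral(n):
    """Canonical term S (S ... O) representing n."""
    t = O
    for _ in range(n):
        t = S(t)
    return t

def assoc_holds_up_to(bound):
    """Check (x + y) + z = x + (y + z) on all canonical instances <= bound."""
    nums = [numeral(n) for n in range(bound + 1)]
    return all(
        normalize(plus(plus(x, y), z)) == normalize(plus(x, plus(y, z)))
        for x, y, z in product(nums, repeat=3)
    )
```

Calling `assoc_holds_up_to(4)` compares the normal forms of 125 canonical instances. Adding associativity as a rewrite rule therefore does not identify any new canonical terms in this first-order setting; it only makes the equation available on open terms.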
Finally, our completeness condition COMP verifies closure properties defined in [9,10].Hence, it is adequate for a smooth integration of rewriting with the module system present in Coq since its version 7.4.

Figure 1: Definition correctness, environment correctness and lookup, PTS rules

Example 4.4. The terms (S O), λx:nat.x and (Node O Leaf true O Leaf) are canonical, while (O + O) and (Node nA A b O Leaf) are not. Given the definition of rotr from Section 2 consider the following terms:
  Inductive I : bool → Set := C : forall b:bool, I (or b b).
  Symbol f: forall b:bool, I b → bool
  Rules f (or b b) (C b) −→ if b (f true (C true)) (f false (C false))

Lemma 5.7. Let E ⊢ e : t and $e = f(e_1, \ldots, e_n)$, where f of arity n comes from Rew(Γ, R) which preserves reducibility. If e is head-reducible by R then $f(e_1{\downarrow}, \ldots, e_n{\downarrow})$ is also head-reducible by R.

Proof. By induction on −→. If $e_1, \ldots, e_n$ are in normal form then the conclusion is obvious.

Example 5.11. In the goal rotr (S (S nC)) (Node O Leaf b (S nC) C) there are two safe splitting variables, b and C, along the left-hand sides of the rules defining rotr.

Lemma 5.19.

... $\sum_{q \in Q_p} |l|_q| \le |l|_p|$, where $Q_p = \{q \in S_c \mid p \le q\}$, and that $\sum_{q \in Q_p} |l|_q| = |l|_p|$ only if $Q_p = \{p\}$. Consequently, if we show that there exists a position p such that $p \notin Q_p$, we immediately get $dist(t\sigma_c, l) < dist(t, l)$. Since $x : I\,\vec{u}$ is a safe splitting variable for t along l, there exists a position $p \in S$ such that $t|_p = x$ and $l|_p \in G$ or $l|_p = c'(\vec{a})$ for some constructor $c'$. Since $\sigma_c$ results from successful splitting, $x\sigma_c = c(\vec{b})$ for some $\vec{b}$. Now, there are three cases. If $l|_p \in G$ then ...

... Lemma 5.16 the measures of goals from $\{\varphi_1, \ldots, \varphi_n\}$ are strictly smaller than the measure of φ and consequently M(W) strictly decreases.

If Rew(Γ, R) preserves reducibility and the algorithm stops with CE = ∅ then the initial goal is covered.

Example 5.20. The beginning of a possible run of the algorithm for the function rotr is already presented in Example 5.4. Both splitting operations are performed on safe variables, as required. We are left with the goal rotr (S (nA + nC)) (Node nA A b nC C). Splitting along A results in:

  rotr (S (O + nC)) (Node O Leaf b nC C)
  rotr (S ((S (nX + nZ)) + nC)) (Node (S (nX + nZ)) (Node nX X y nZ Z) b nC C)

  Symbol or' : bool → bool → bool
  Rules
    or' x x −→ x
    or' true y −→ true
    or' x true −→ true

  Symbol lt, diff : nat → nat → bool
  Rules
    lt O y −→ diff O y
    lt x O −→ false
    lt (S x) (S y) −→ lt x y
    diff x x −→ false
    diff O (S y) −→ true
    diff (S x) O −→ true
    diff (S x) (S y) −→ diff x y

7. Conclusions and Related Work

An environment E is closed if and only if it contains only inductive definitions and complete definitions by rewriting, i.e. for each partition of E into $E_1$; Rew(Γ, R); $E_2$ the condition $COMP_{E_1}(Γ, R)$ is satisfied.
Lemma 4.6 (Canonicity). Let E be a closed environment. If E ⊢ e : t and e is in normal form then e is canonical.
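The statement of canonicity can be observed experimentally on the first-order rules for lt and diff given in this paper. The following Python sketch is a simplified illustration, not the calculus studied here: terms are untyped tuples, pattern variables are strings, matching is purely syntactic (including the non-linear rule diff x x), and "canonical" is taken to mean "built from constructors only". The helper names (`normalize`, `instantiate`, `is_canonical`) are assumptions of the sketch.

```python
# Terms: ("O",), ("S", t), ("true",), ("false",), ("lt", a, b), ("diff", a, b).
# Pattern variables are plain strings.
RULES = [
    (("lt", ("O",), "y"), ("diff", ("O",), "y")),
    (("lt", "x", ("O",)), ("false",)),
    (("lt", ("S", "x"), ("S", "y")), ("lt", "x", "y")),
    (("diff", "x", "x"), ("false",)),
    (("diff", ("O",), ("S", "y")), ("true",)),
    (("diff", ("S", "x"), ("O",)), ("true",)),
    (("diff", ("S", "x"), ("S", "y")), ("diff", "x", "y")),
]
CONSTRUCTOR_NAMES = {"O", "S", "true", "false"}

def match(pattern, term, subst):
    """Syntactic matching; a repeated pattern variable must bind equal terms."""
    if isinstance(pattern, str):
        return subst.setdefault(pattern, term) == term
    return (pattern[0] == term[0] and len(pattern) == len(term)
            and all(match(p, t, subst) for p, t in zip(pattern[1:], term[1:])))

def instantiate(pattern, subst):
    if isinstance(pattern, str):
        return subst[pattern]
    return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])

def normalize(term):
    """Innermost rewriting to normal form with the rules above."""
    term = (term[0],) + tuple(normalize(a) for a in term[1:])
    for lhs, rhs in RULES:
        subst = {}                      # fresh substitution for every rule
        if match(lhs, term, subst):
            return normalize(instantiate(rhs, subst))
    return term

def numeral(n):
    t = ("O",)
    for _ in range(n):
        t = ("S", t)
    return t

def is_canonical(term):
    """Simplified notion: the term is built from constructors only."""
    return term[0] in CONSTRUCTOR_NAMES and all(is_canonical(a) for a in term[1:])
```

For every pair of numerals m, n the normal form of lt m n computed by `normalize` is one of the constructors true or false, so no closed application of lt or diff is stuck. This is what completeness of the two definitions is meant to guarantee.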
and only if there is a position p such that subterms occurring at p in $t_1$ and $t_2$ either have different head symbols, or $t_2|_p$ (resp. $t_1|_p$) is a bound variable in $t_2$ (resp. $t_1$) and $t_1|_p \neq t_2|_p$. Of course, if we compare $t_1\gamma|_p$ and $t_2|_p$ then either they still have different head symbols or $t_2|_p$ (resp. $t_1|_p$) is a bound variable and $t_1\gamma|_p \neq t_2|_p$. Hence $d_\gamma = 0$. If $\bot \notin S$ then $d = \sum_{p \in S} |t_2|_p| \ge 0$. If $\bot \in S_\gamma$ then obviously $0 = d_\gamma \le d$.