Canonized Rewriting and Ground AC Completion Modulo Shostak Theories : Design and Implementation

AC-completion efficiently handles equality modulo associative and commutative function symbols. When the input is ground, the procedure terminates and provides a decision algorithm for the word problem. In this paper, we present a modular extension of ground AC-completion for deciding formulas in the combination of the theory of equality with user-defined AC symbols, uninterpreted symbols and an arbitrary signature-disjoint Shostak theory X. Our algorithm, called AC(X), is obtained by augmenting ground AC-completion, in a modular way, with the canonizer and solver available for the theory X. This integration rests on canonized rewriting, a new relation reminiscent of normalized rewriting, which integrates canonizers in rewriting steps. AC(X) is proved sound, complete and terminating, and is implemented to extend the core of the Alt-Ergo theorem prover.


Introduction
Many mathematical operators occurring in automated reasoning, such as union and intersection of sets, or boolean and arithmetic operators, satisfy the following associativity and commutativity (AC) axioms:
∀x.∀y.∀z. u(x, u(y, z)) = u(u(x, y), z)   (A)
∀x.∀y. u(x, y) = u(y, x)   (C)
Automated AC reasoning is known to be difficult. Indeed, the mere addition of these two axioms to a prover will usually glut it with plenty of useless equalities, which strongly degrades its performance. In order to avoid this drawback, built-in procedures have been designed to efficiently handle AC symbols. For instance, SMT solvers incorporate dedicated decision procedures for some specific AC symbols such as arithmetic or boolean operators. In contrast, algorithms found in resolution-based provers, such as AC-completion, allow a powerful generic treatment of user-defined AC symbols.
Given a finite word problem ⋀_{i∈I} s_i = t_i ⊢ s = t where the function symbols are either uninterpreted or AC, AC-completion attempts to transform the conjunction ⋀_{i∈I} s_i = t_i into a finitely terminating, confluent term rewriting system R whose reductions preserve identity. The rewriting system R serves as a decision procedure for validating s = t modulo AC: the equation holds if and only if the normal forms of s and t w.r.t. R are equal modulo AC. Furthermore, when its input contains only ground equations, AC-completion terminates and outputs a convergent rewriting system [16].
Unfortunately, AC reasoning is only a part of the automated deduction problem, and what we really need is to decide formulas combining AC symbols and other theories. For instance, in practice, we are interested in deciding finite ground word problems which contain a mixture of uninterpreted, interpreted and AC function symbols, as in the following assertion where u is an AC symbol, +, −, * and the numerals are from the theory of linear arithmetic, f is an uninterpreted function symbol and the other symbols are uninterpreted constants. A combination of AC reasoning with linear arithmetic and the free theory E of equality is necessary to prove this formula. Linear arithmetic is used to show that c_2 − c_1 = c_1 + 1, so that (i) u(a, c_1 + 1) = a follows by congruence. Independently, e_2 = b and d = c_1 + 1 imply (ii) u(c_1 + 1, c_1 + 1) = 0 by congruence, linear arithmetic and commutativity of u. AC reasoning can finally be used to conclude that (i) and (ii) imply that u(a, c_1 + 1, c_1 + 1) is equal to both a and u(a, 0).
There are two main methods for combining decision procedures for disjoint theories. First, the Nelson-Oppen approach [18] is based on a variable abstraction mechanism and the exchange of equalities between shared variables. Second, Shostak's algorithm [21] extends a congruence closure procedure with theories equipped with canonizers and solvers, i.e. procedures that compute canonical forms of terms and solve equations, respectively. While ground AC-completion can be easily combined with other decision procedures by the Nelson-Oppen method, it cannot be directly integrated into Shostak's framework since it does not provide a solver for the AC theory.
In this paper, we investigate the integration of Shostak theories in ground AC-completion. We first introduce a new notion of rewriting called canonized rewriting which adapts normalized rewriting to cope with canonization. Then, we present a modular extension of ground AC-completion for deciding formulas in the combination of the theory of equality with user-defined AC symbols, uninterpreted symbols and an arbitrary signature-disjoint Shostak theory X. The main ideas of our integration are to substitute standard rewriting by canonized rewriting, using a global canonizer for AC and X, and to replace the equation orientation mechanism found in ground AC-completion with the solver for X.
AC-completion has been studied for a long time in the rewriting community [15,20]. A generic framework for combining completion with a generic built-in equational theory E has been proposed in [10]. Normalized completion [17] is designed to use a modified rewriting relation when the theory E is equivalent to the union of the AC theory and a convergent rewriting system S. In this setting, rewriting steps are only performed on S-normalized terms. AC(X) can be seen as an adaptation of ground normalized completion to efficiently handle the theory E when it is equivalent to the union of the AC theory and a Shostak theory X. In particular, S-normalization is replaced by the application of the canonizer of X. This modular integration of X allows us to reuse proof techniques of ground AC-completion [16] to show the correctness of AC(X).
Kapur [11] used ground completion to demystify Shostak's congruence closure algorithm, and Bachmair et al. [3] compared its strategy with other ones within an abstract congruence closure framework. While the latter approach can also handle AC symbols, none of these works formalized the integration of Shostak theories into (AC) ground completion.
Outline. Section 2 recalls standard ground AC completion. Section 3 is devoted to Shostak theories and global canonization. Section 4 presents the AC(X) algorithm and illustrates its use through an example. The correctness of AC(X) is sketched in Section 5 and experimental results are presented in Section 6. Conclusion and future works are presented in Section 7.

Ground AC-Completion
In this section, we first briefly recall the usual notations and definitions of [1,7] for term rewriting modulo AC. Then, we give the usual set of inference rules for the ground AC-completion procedure and we illustrate its use through an example.
Terms are built from a signature Σ = Σ_AC ⊎ Σ_E of AC and uninterpreted symbols, and a set of variables 𝒳, yielding the term algebra T_Σ(𝒳). The range of letters a...f denotes uninterpreted symbols, u denotes an AC function symbol, s, t, l, r denote terms, and x, y, z denote variables. Viewing terms as trees, subterms within a term s are identified by their positions. Given a position p, s|_p denotes the subterm of s at position p, and s[r]_p the term obtained by replacing s|_p by the term r. We will also use the notation s(p) to denote the symbol at position p in the tree, and the root position is denoted by Λ. Given a subset Σ′ of Σ, a subterm t|_p of t is a Σ′-alien of t if t(p) ∉ Σ′ and p is minimal w.r.t. the prefix word ordering. We write A_Σ′(t) for the multiset of Σ′-aliens of t.
A substitution is a partial mapping from variables to terms. Substitutions are extended to a total mapping from terms to terms in the usual way. We write tσ for the application of a substitution σ to a term t. A well-founded quasi-ordering ≼ [6] on terms is a reduction quasi-ordering if s ≼ t implies sσ ≼ tσ and l[s]_p ≼ l[t]_p, for any substitution σ, term l and position p. A quasi-ordering ≼ defines an equivalence relation ≃ as ≼ ∩ ≽ and a partial ordering ≺ as ≼ ∩ ⋡.
An equation is an unordered pair of terms, written s ≈ t. The variables contained in an equation, if any, are understood as being universally quantified. Given a set of equations E, the equational theory of E, written =_E, is the set of equations that can be obtained by reflexivity, symmetry, transitivity, congruence and instances of equations in E. The word problem for E consists in determining if, given two ground terms s and t, the equation s ≈ t is in =_E, denoted by s =_E t. The word problem for E is ground when E contains only ground equations. An equational theory =_E is said to be inconsistent when s =_E t, for any s and t.
A rewriting rule is an oriented equation, usually denoted by l → r. A term s rewrites to a term t at position p by the rule l → r, denoted by s →^p_{l→r} t, iff there exists a substitution σ such that s|_p = lσ and t = s[rσ]_p. A rewriting system R is a set of rules. We write s →_R t whenever there exists a rule l → r of R such that s rewrites to t by l → r at some position. A normal form of a term s w.r.t. R is a term t such that s →*_R t and t cannot be rewritten by R. The system R is said to be convergent whenever any term s has a unique normal form, denoted s↓_R, and does not admit any infinite reduction. Completion [12] aims at converting a set E of equations into a convergent rewriting system R such that the sets =_E and {s ≈ t | s↓_R = t↓_R} coincide. Given a suitable reduction ordering on terms, it has been proved that completion terminates when E is ground [14].
Rewriting modulo AC. Let =_AC be the equational theory obtained from the set of axioms (A) and (C) taken for every AC symbol:
{u(x, y) ≈ u(y, x), u(x, u(y, z)) ≈ u(u(x, y), z) | u ∈ Σ_AC}
In general, given a set E of equations, it has been shown that no suitable reduction ordering allows completion to produce a convergent rewriting system for E ∪ AC. When E is ground, an alternative consists in in-lining AC reasoning both in the notion of rewriting step and in the completion procedure.
Rewriting modulo AC is directly related to the notion of matching modulo AC, as shown by the following example. Given a rule u(a, u(b, c)) → t, we would like the following reductions to be possible: Associativity and commutativity of u are needed in (1) for the subterm u(c, u(b, a)) to match the term u(a, u(b, c)), and in (2) for the term u(a, u(c, u(d, b))) to be seen as u(u(a, u(b, c)), d), so that the rule can be applied. More formally, this leads to the following definition.
Definition 1 (Ground rewriting modulo AC). A term s rewrites to a term t modulo AC at position p by the rule l → r, denoted by s →^p_{AC\l→r} t, iff (1) s|_p =_AC l and t = s[r]_p, or (2) l(Λ) = u and there exists a term s′ such that s|_p =_AC u(l, s′) and t = s[u(r, s′)]_p.
In order to produce a convergent rewriting system, ground AC-completion requires a well-founded reduction quasi-ordering ≼ total on ground terms with an underlying equivalence relation which coincides with =_AC. Such an ordering will be called a total ground AC-reduction ordering.
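To make the two cases of Definition 1 concrete, here is a minimal Python sketch of one step of ground rewriting modulo AC. It is an illustration, not the implementation described in this paper. It assumes that case (2) reads s|_p =_AC u(l, s′) with result s[u(r, s′)]_p, that terms are nested tuples such as ("u", "a", "b") with constants as strings, and that u is the only AC symbol. The helper names (leaves, norm, build, rewrite_root, rewrite) and the enclosing context f(·, d) used in the first check are hypothetical.

```python
from collections import Counter

AC = {"u"}  # assumption: a single binary AC symbol named "u"

def leaves(sym, t):
    """Arguments of nested applications of `sym` (simulates associativity)."""
    if isinstance(t, tuple) and t[0] == sym:
        return leaves(sym, t[1]) + leaves(sym, t[2])
    return [t]

def build(sym, items):
    """Rebuild a right-nested binary term from a non-empty list of arguments."""
    out = items[-1]
    for x in reversed(items[:-1]):
        out = (sym, x, out)
    return out

def norm(t):
    """AC-normal form: arguments of AC symbols are flattened and sorted."""
    if not isinstance(t, tuple):
        return t
    if t[0] in AC:
        return build(t[0], sorted((norm(x) for x in leaves(t[0], t)), key=repr))
    return (t[0],) + tuple(norm(x) for x in t[1:])

def rewrite_root(s, l, r):
    """One step at the root by l -> r modulo AC, or None if the rule does not apply."""
    if norm(s) == norm(l):                      # case (1): s =_AC l
        return r
    if isinstance(l, tuple) and l[0] in AC and isinstance(s, tuple) and s[0] == l[0]:
        u = l[0]
        have = Counter(norm(x) for x in leaves(u, s))
        need = Counter(norm(x) for x in leaves(u, l))
        rest = have - need                      # arguments forming s'
        if not (need - have) and rest:          # case (2): s =_AC u(l, s')
            return (u, r, build(u, sorted(rest.elements(), key=repr)))
    return None

def rewrite(s, l, r):
    """One step at the first position (outermost first) where the rule applies."""
    t = rewrite_root(s, l, r)
    if t is not None:
        return t
    if isinstance(s, tuple):
        for i in range(1, len(s)):
            sub = rewrite(s[i], l, r)
            if sub is not None:
                return s[:i] + (sub,) + s[i + 1:]
    return None

rule_lhs = ("u", "a", ("u", "b", "c"))
# the subterm u(c, u(b, a)) matches the left-hand side modulo AC
print(rewrite(("f", ("u", "c", ("u", "b", "a")), "d"), rule_lhs, "t"))
# u(a, u(c, u(d, b))) is seen as u(u(a, u(b, c)), d)
print(rewrite(("u", "a", ("u", "c", ("u", "d", "b"))), rule_lhs, "t"))
```

Comparing multisets of flattened arguments stands in for matching modulo AC, which is sufficient here because all terms are ground.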

Fig. 1. Inference rules for ground AC-completion
The inference rules for ground AC-completion are given in Figure 1. The rules describe the evolution of the state of a procedure, represented as a configuration E | R, where E is a set of ground equations and R a ground set of rewriting rules. The initial state is E_0 | ∅ where E_0 is a given set of ground equations. Trivial removes an equation u ≈ v from E when u and v are equal modulo AC. Orient turns an equation into a rewriting rule according to a given total ground AC-reduction ordering ≼. R is used to rewrite either side of an equation (Simplify), and to reduce the right-hand side of rewriting rules (Compose). Given a rule l → r, Collapse either reduces l at an inner position, or replaces l by a term smaller than r. In both cases, the reduction of l to l′ may influence the orientation of the rule l′ → r, which is added to E as an equation in order to be re-oriented. Finally, Deduce adds equational consequences of rewriting rules to E. For instance, if R contains two rules of the form u(a, b) → s and u(a, c) → t, then the term u(a, u(b, c)) can either be reduced to u(s, c) or to the term u(t, b). The equation u(s, c) ≈ u(t, b), called a critical pair, is thus necessary for ensuring convergence of R. Critical pairs of a set of rules are computed by the following function (a^µ stands for the maximal term w.r.t. size enjoying the assertion):
Example. To get a flavor of ground AC-completion, consider a modified version of the assertion given in the introduction, where the arithmetic part has been removed (and uninterpreted constant symbols renamed for the sake of simplicity):
u(a_1, a_4) ≈ a_1, u(a_3, a_6) ≈ u(a_5, a_5), a_5 ≈ a_4, a_6 ≈ a_2 ⊢ a_1 ≈ u(a_1, u(a_6, a_3))
The precedence a_1 ≺_p ··· ≺_p a_6 ≺_p u defines an AC-RPO ordering on terms [19] which is suitable for ground AC-completion.
The table in Figure 2 shows the application steps of the rules given in Figure 1 from an initial configuration {u(a_1, a_4) ≈ a_1, u(a_3, a_6) ≈ u(a_5, a_5), a_5 ≈ a_4, a_6 ≈ a_2} | ∅ to a final configuration ∅ | R_f, where R_f is the set of rewriting rules {1, 3, 5, 7, 10}. It can be checked that a_1↓_{R_f} and u(a_1, u(a_6, a_3))↓_{R_f} are identical.
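The critical pair used to motivate Deduce (the rules u(a, b) → s and u(a, c) → t giving u(s, c) ≈ u(t, b)) can be reproduced with the following Python sketch. It is not the function of this paper: it assumes ground rules whose left-hand sides are headed by the same AC symbol, represents terms as nested tuples, and takes the multiset intersection of the flattened arguments as the maximal common part. The names args, comb and head_critical_pair are hypothetical.

```python
from collections import Counter

def args(t, sym="u"):
    """Flattened arguments of nested applications of `sym`."""
    if isinstance(t, tuple) and t[0] == sym:
        return args(t[1], sym) + args(t[2], sym)
    return [t]

def comb(items, sym="u"):
    """Sorted right-nested binary term over `items` (a representative modulo AC)."""
    items = sorted(items, key=repr)
    out = items[-1]
    for x in reversed(items[:-1]):
        out = (sym, x, out)
    return out

def head_critical_pair(rule1, rule2, sym="u"):
    """Critical pair of two ground rules overlapping at the root, or None."""
    (l1, r1), (l2, r2) = rule1, rule2
    if not (isinstance(l1, tuple) and isinstance(l2, tuple) and l1[0] == l2[0] == sym):
        return None
    m1, m2 = Counter(args(l1, sym)), Counter(args(l2, sym))
    common = m1 & m2                  # maximal shared part of both left-hand sides
    if not common:
        return None                   # no overlap, hence no critical pair
    rest1, rest2 = m1 - common, m2 - common
    # the overlap term u(common, rest1, rest2) reduces in two different ways
    left = comb([r1] + list(rest2.elements()), sym)
    right = comb([r2] + list(rest1.elements()), sym)
    return left, right

pair = head_critical_pair((("u", "a", "b"), "s"), (("u", "a", "c"), "t"))
print(pair)  # u(c, s) and u(b, t), i.e. u(s, c) and u(t, b) modulo AC
```

The right-hand sides are inserted as single arguments without further flattening, so the result is a valid term but not necessarily in AC-normal form.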

Shostak Theories and Global Canonization
In this section, we recall the notions of canonizers and solvers underlying Shostak theories and show how to obtain a global canonizer for the combination of the theories E and AC with an arbitrary signature-disjoint Shostak theory X.
From now on, we assume given a theory X with a signature Σ_X. A canonizer for X is a function can_X that computes a unique normal form for every term such that s =_X t iff can_X(s) = can_X(t). A solver for X is a function solve_X that solves equations between Σ_X-terms. Given an equation s ≈ t, solve_X(s ≈ t) either returns a special value ⊥ when {s ≈ t} ∪ X is inconsistent, or an equivalent substitution. A Shostak theory X is a theory with a canonizer and a solver which fulfill some standard properties given for instance in [13].
Our combination technique is based on the integration of a Shostak theory X in ground AC-completion. From now on, we assume that terms are built from a signature Σ defined as the union of the disjoint signatures Σ_AC, Σ_E and Σ_X. We also assume a total ground AC-reduction ordering ≼ defined on T_Σ(𝒳), used later on for completion. The combination mechanism requires defining both a global canonizer for the union of E, AC and X, and a wrapper of solve_X to handle heterogeneous equations. These definitions make use of a global one-to-one mapping α : T_Σ → 𝒳 (and its inverse mapping ρ) and are based on a variable abstraction mechanism which computes the pure Σ_X-part [[t]] of a heterogeneous term t as follows:
The canonizer for AC defined in [9] is based on flattening and sorting techniques which simulate associativity and commutativity, respectively. For instance, the term u(u(u′(c, b), b), c) is first flattened to u(u′(c, b), b, c) and then sorted to get the term u(b, c, u′(c, b)). It has been formally proved that this canonizer solves the word problem for AC [5]. However, this definition implies a modification of the signature Σ_AC where the arity of AC symbols becomes variadic. Using such a canonizer would impact the definition of AC-rewriting given in Section 2. In order to avoid such a modification, we shall define an equivalent canonizer that builds degenerate trees instead of flattened terms. For instance, we would expect the normal form of u(u(u′(c, b), b), c) to be u(b, u(c, u′(c, b))). Given a signature Σ which contains Σ_AC and any total ordering on terms, we define can_AC by:
We can easily show that can_AC enjoys the standard properties required for a canonizer. The proof that can_AC solves the word problem for AC follows directly from the one given in [5].
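The flatten-then-sort idea, combined with rebuilding a degenerate (right-nested) binary tree, can be sketched in Python as follows. This is one possible reading of the description above rather than the definition of can_AC itself. It assumes terms are nested tuples with constants as strings, that u is the only AC symbol (u′ is treated as an ordinary binary symbol so that the example of the text is reproduced as printed), and that the total ordering places constants before compound terms. The names flatten, sort_key and can_ac are hypothetical.

```python
AC_SYMBOLS = {"u"}  # assumption: "u" is the only AC symbol; "u'" is left uninterpreted

def flatten(sym, t):
    """Collect the arguments of nested applications of `sym` (associativity)."""
    if isinstance(t, tuple) and t[0] == sym:
        return flatten(sym, t[1]) + flatten(sym, t[2])
    return [t]

def sort_key(t):
    # assumed total ordering: constants first (alphabetically), then compound terms
    return (isinstance(t, tuple), repr(t))

def can_ac(t):
    """Canonical form modulo AC, built as a right-nested binary tree."""
    if not isinstance(t, tuple):
        return t
    head = t[0]
    canonical_args = [can_ac(a) for a in t[1:]]
    if head not in AC_SYMBOLS:
        return (head, *canonical_args)
    items = []
    for a in canonical_args:
        items.extend(flatten(head, a))      # simulate associativity
    items.sort(key=sort_key)                # simulate commutativity
    result = items[-1]
    for item in reversed(items[:-1]):       # degenerate tree instead of a flat term
        result = (head, item, result)
    return result

term = ("u", ("u", ("u'", "c", "b"), "b"), "c")
print(can_ac(term))  # u(b, u(c, u'(c, b)))
```

Keeping AC symbols binary means the rewriting definitions of Section 2 can be used unchanged on canonized terms.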
Using the technique described in [13], we define our global canonizer can which combines can_X with can_AC as follows:
Again, the proofs that can solves the word problem for the union of E, AC and X and enjoys the standard properties required for a canonizer are similar to those given in [13]. The only difference is that can_AC directly works on the signature Σ, which avoids the use of a variable abstraction step when canonizing a mixed term of the form u(t_1, t_2) such that u ∈ Σ_AC. Using the same mappings α, ρ and the abstraction function, the wrapper solve can be easily defined by:
In order to ensure termination of AC(X), the global canonizer and the wrapper must be compatible with the ordering ≼ used by AC-completion, that is:
We can prove that the above properties hold when the theory X enjoys the following local compatibility properties:
To fulfil this axiom, the AC-reduction ordering can be chosen as an AC-RPO ordering [19] based on a precedence relation ≺_p such that Σ_X ≺_p Σ_E ∪ Σ_AC. From now on, we assume that X is locally compatible with ≼.
Example. To solve the equation u(a, b) + a ≈ 0, we use the abstraction α = {u(a, b) → x, a → y} and call solve_X on x + y ≈ 0. Since a ≺ u(a, b), the only solution which fulfills the axiom above is {x ≈ −y}. We apply ρ and get the set {u(a, b) → −a} of rewriting rules.
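The example above can be mimicked by a small Python sketch of a solver wrapper for linear arithmetic. It is only an illustration under several assumptions: a linear term is given as a dictionary from aliens to coefficients (the key None holds the constant), aliens are kept as opaque dictionary keys, which plays the role of the abstraction α and its inverse ρ, and the ordering that picks the alien to eliminate is "larger size first", a stand-in for the AC-RPO ordering. The names BOTTOM, size and solve are hypothetical.

```python
from fractions import Fraction

BOTTOM = "bottom"  # stands for the special value returned on inconsistent equations

def size(t):
    """Number of symbols of a term given as a nested tuple or a string."""
    return 1 + sum(size(x) for x in t[1:]) if isinstance(t, tuple) else 1

def solve(lhs, rhs):
    """Solve lhs ≈ rhs; return BOTTOM, [] (trivial) or [(alien, linear term)]."""
    diff = {}
    for poly, sign in ((lhs, 1), (rhs, -1)):
        for key, coef in poly.items():
            diff[key] = diff.get(key, Fraction(0)) + sign * Fraction(coef)
    const = diff.pop(None, Fraction(0))
    aliens = {k: c for k, c in diff.items() if c != 0}
    if not aliens:
        return [] if const == 0 else BOTTOM
    # eliminate the largest alien so that the produced rule is decreasing
    pivot = max(aliens, key=lambda t: (size(t), repr(t)))
    c = aliens.pop(pivot)
    solution = {k: -v / c for k, v in aliens.items()}
    if const != 0:
        solution[None] = -const / c
    return [(pivot, solution)]

# u(a, b) + a ≈ 0 is solved as the rule u(a, b) → −a
print(solve({("u", "a", "b"): 1, "a": 1}, {None: 0}))
```

Choosing the pivot by the ordering, rather than arbitrarily, is what makes the solved form usable as a rewriting rule oriented from the larger term to the smaller one.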

Ground AC-Completion Modulo X
In this section, we present the AC(X) algorithm which extends the ground AC-completion procedure given in Section 2. For that purpose, we first adapt the notion of ground AC-rewriting to cope with canonizers. Then, we show how to refine the inference rules given in Figure 1 to reason modulo the equational theory induced by a set E of ground equations and the theories E, AC and X.

Canonized Rewriting
From a rewriting point of view, a canonizer behaves like a convergent rewriting system: it gives an effective way of computing normal forms. Thus, a natural way of integrating can in ground AC-completion is to extend normalized rewriting [17].
Definition 2. Let can be a canonizer. A term s can-rewrites to a term t at position p by the rule l → r, denoted by s ⇝^p_{l→r} t, iff s →^p_{AC\l→r} t′ and can(t′) = t.
Example. Using the usual canonizer can_A for linear arithmetic and the rule γ : u(a, b) → a, the term f(a + 2 * u(b, a)) can_A-rewrites to f(3 * a) by γ as follows: f(a + 2 * u(b, a)) →_{AC\γ} f(a + 2 * a) and can_A(f(a + 2 * a)) = f(3 * a).
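The example of Definition 2 can be replayed with the following Python sketch, which chains one rewriting step with a call to a canonizer. It is a simplified illustration, not the canonizer of Alt-Ergo: it assumes terms are nested tuples with integer literals, handles only sums and products by an integer literal, sorts the two arguments of u to account for commutativity, and only covers case (1) of ground rewriting modulo AC. The names to_poly, from_poly, can, rewrite_once and can_rewrite are hypothetical.

```python
def to_poly(t):
    """Linear polynomial of t as ({alien: coefficient}, constant)."""
    if isinstance(t, int):
        return {}, t
    if isinstance(t, tuple) and t[0] == "+":
        p, c = to_poly(t[1])
        q, d = to_poly(t[2])
        for k, v in q.items():
            p[k] = p.get(k, 0) + v
        return p, c + d
    if isinstance(t, tuple) and t[0] == "*":   # assumes t[1] is an integer literal
        p, c = to_poly(t[2])
        return {k: t[1] * v for k, v in p.items()}, t[1] * c
    return {can(t): 1}, 0                      # non-arithmetic subterm: an alien

def from_poly(p, c):
    """Canonical term for a polynomial: sorted monomials, then the constant."""
    monos = [x if k == 1 else ("*", k, x)
             for x, k in sorted(p.items(), key=repr) if k != 0]
    if c != 0 or not monos:
        monos.append(c)
    out = monos[-1]
    for m in reversed(monos[:-1]):
        out = ("+", m, out)
    return out

def can(t):
    """Toy global canonizer: linear arithmetic plus commutativity of binary u."""
    if not isinstance(t, tuple):
        return t
    if t[0] in ("+", "*"):
        return from_poly(*to_poly(t))
    args = [can(x) for x in t[1:]]
    if t[0] == "u":
        args.sort(key=repr)
    return (t[0], *args)

def rewrite_once(s, l, r):
    """Replace the first subterm equal to l modulo the canonizer by r."""
    if can(s) == can(l):
        return r
    if isinstance(s, tuple):
        for i in range(1, len(s)):
            t = rewrite_once(s[i], l, r)
            if t is not None:
                return s[:i] + (t,) + s[i + 1:]
    return None

def can_rewrite(s, l, r):
    """One canonized rewriting step: rewrite, then canonize the result."""
    t_prime = rewrite_once(s, l, r)
    return None if t_prime is None else can(t_prime)

s = ("f", ("+", "a", ("*", 2, ("u", "b", "a"))))
print(can_rewrite(s, ("u", "a", "b"), "a"))  # f(3 * a)
```

The intermediate term f(a + 2 * a) never appears in the result: canonization is applied immediately after the rewriting step, which is what keeps terms in canonical form.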

The AC(X) Algorithm
The first step of our combination technique consists in replacing the rewriting relation found in completion by canonized rewriting. This leads to the rules of AC(X) given in Figure 3. The state of the procedure is a pair E | R of equations and rewriting rules. The initial configuration is E_0 | ∅ where E_0 is supposed to be a set of equations between canonized terms. Since AC(X)'s rules only involve canonized rewriting, the algorithm maintains the invariant that terms occurring in E and R are in canonical form. Trivial thus removes an equation u ≈ v from E when u and v are syntactically equal. A new rule Bottom is used to detect inconsistent equations. Similarly to normalized completion, integrating the global canonizer can in rewriting is not enough to fully extend ground AC-completion with the theory X: in both cases the orientation mechanism has to be adapted. Therefore, the second step consists in integrating the wrapper solve in the Orient rule. The other rules are very similar to those of ground AC-completion except that they use the relation ⇝_R instead of →_{AC\R}.
Example. We illustrate AC(X) on the example given in the introduction. The table given in Figure 4 shows the application of the rules of AC(X) on the example when X is instantiated by linear arithmetic. We use an AC-RPO ordering based on the precedence 1 ≺_p 2 ≺_p a ≺_p b ≺_p c_1 ≺_p c_2 ≺_p d ≺_p e_1 ≺_p e_2 ≺_p f ≺_p u. The procedure terminates and produces a convergent rewriting system R_f = {3, 5, 9, 10, 11, 13, 16}. Using R_f, we can check that a and u(a, 0) can-rewrite to the same normal form.

      u(a, c_1 + 1) ≈ a             Col 1 and 11
  13  u(a, c_1 + 1) → a             Ori u(a, c_1 + 1) ≈ a
  14  u(0, a) ≈ u(a, c_1 + 1)       Ded from 9 and 13
  15  u(0, a) ≈ a                   Sim 14 by 13
  16  u(0, a) → a                   Ori 15
Fig. 4. AC(X) on the running example

Correctness
As usual, in order to enforce correctness, we cannot use an arbitrary (possibly unfair) strategy. We say that a strategy is strongly fair when no possible application of an inference rule is infinitely delayed and Orient is only applied over fully reduced terms.
Theorem 1. Given a set E of ground equations, the application of the rules of AC(X) under a strongly fair strategy terminates and either produces ⊥ when E ∪ AC ∪ X is inconsistent, or yields a final configuration ∅ | R such that:
The proof is based on three intermediate theorems, stating respectively soundness, completeness and termination. In the following, we shall consider a fixed run of the completion procedure, starting from the initial configuration E_0 | ∅. We denote by R_∞ (resp. E_∞) the set of all encountered rules ⋃_n R_n (resp. equations ⋃_n E_n) and by R_ω (resp. E_ω) the set of persistent rules ⋃_n ⋂_{i≥n} R_i (resp. equations ⋃_n ⋂_{i≥n} E_i).

Soundness
Soundness is ensured by the following invariant:
Proof. The invariant obviously holds for the initial configuration and is preserved by all the inference rules. The rules Simplify, Compose, Collapse and Deduce preserve the invariant since for any rule l → r, if l =_{AC,X,E_0} r, for any term s rewritten by ⇝_{l→r} into t, then s =_{AC,X,E_0} t. If Orient is used to turn an equation s ≈ t into a set of rules [...] t_i can be instantiated by ρ, yielding an equational proof p_i =_{X,s≈t} v_i. Since by induction s =_{AC,X,E_0} t holds, we get p_i =_{AC,X,E_0} v_i.
In the rest of this section, we assume that the strategy is strongly fair. This implies in particular that headCP(R_ω) ⊆ E_∞, E_ω = ∅ and R_ω is inter-reduced, that is, none of its rules can be collapsed or composed by another one. We also assume that ⊥ is not encountered; otherwise, termination is obvious.

Completeness
Completeness is established by using a variant of the technique introduced by Bachmair et al. in [2] for proving completeness of completion. It transforms a proof between two terms which is not under a suitable form into a smaller one, and the smallest proofs are the desired ones. The proofs we are considering are made of elementary steps, either equational steps, with AC, X and E_∞, or rewriting steps, with R_∞ and the additional (possibly infinite) rules R_can = {t → can(t) | can(t) ≠ t}. Rewriting steps with R_∞ can be either ⇝_{R_∞} or →_{R_∞}. The measure of a proof is the multiset of the elementary measures of its elementary steps. The measure of an elementary step takes into account the number of terms which are in a canonical form in an elementary proof: the canonical weight of a term t, w_can(t), is equal to 0 if can(t) =_AC t and to 1 otherwise. Notice that if w_can(t) = 1, then can(t) ≺ t, and if w_can(t) = 0, then can(t) ≃ t. The measure of an elementary step between t_1 and t_2 performed thanks to:
- an equation is equal to ({t_1, t_2}, _, _, _, _);
- a rule l → r ∈ R_∞ is equal to ({t_1}, 1, w_can(t_1) + w_can(t_2), l, r) if t_1 ⇝_{l→r} t_2 or t_1 →_{l→r} t_2;
- a rule of R_can is equal to ({t_1}, 0, w_can(t_1) + w_can(t_2), t_1, t_2) if t_1 →_{R_can} t_2.
As usual, the measure of a step s ← t is the measure of t → s. Elementary steps are compared lexicographically using the multiset extension of ≼ for the first component, the usual ordering over natural numbers for components 2 and 3, and ≼ for the last ones. Since ≼ is an AC-reduction ordering, the ordering defined over proofs is well-founded.
The general methodology is to show that a proof which contains some unwanted elementary steps can be replaced by a proof with a strictly smaller measure. Since the ordering over measures is well-founded, there exists a minimal proof, and such a minimal proof is of the desired form.

Termination
We shall prove that, under a strongly fair strategy, R_ω is finite and obtained in a finite time (by cases on the head function symbol of the rule's left-hand side), and then we show that R_ω will clean up the next configurations and the completion process eventually halts on ∅ | R_ω. In order to make our case analysis on rules, and to prove the needed invariants, we define several sets of terms (assuming without loss of generality that E_0 = can(E_0)):
T_0 = {t | ∃t_0, e_1, e_2 ∈ T_Σ(𝒳), e_1 ≈ e_2 ∈ E_0 and t_0 = e_i|_p and t_0 ⇝*_{R_∞} t}
T_0X = T_0 ∪ {f_X(t_1, ..., t_n) | f_X ∈ Σ_X and ∀i, t_i ∈ T_0X}
T_1 = {t | t ∈ T_0 and ∀p, t|_p ∈ T_0X}
T_2 = {u(t_1, ..., t_n) | u ∈ Σ_AC and ∀i, t_i ∈ T_1}
T_0 is the set of all terms and subterms in the original problem as well as their reducts by R_∞. The set T_0X moreover contains terms with X-aliens in T_0. T_1 is the set of terms that can be introduced by X from terms of T_0 (by solving or canonizing). T_2 is a superset of the terms built by critical pairs.
We first establish by structural induction over terms that:
Then, by induction over n, we show that any configuration E_n | R_n accessible from E_0 | ∅ after n steps is such that E_n ∪ R_n ⊆ T_1^2 ∪ T_2^2. The fact that R_∞ is finitely branching is a corollary of the following lemma.
Lemma 4. If l → r_n is created at step n in R_n and l → r_m at step m in R_m, with n < m, then r_m is a reduct of r_n by R_∞.
The proof of this lemma is by induction over the length of the derivation, and by a case analysis over the applied inference rule.
Theorem 4. Under a strongly fair strategy, AC(X) terminates.
By the above properties, R_ω can be divided into R_ω ∩ T_1^2 and R_ω ∩ T_2^2. R_ω ∩ T_1^2 is finite, since all its left-hand sides are reducts of a finite number of terms by R_∞, which is well-founded and finitely branching. R_ω ∩ T_2^2 is finite by using the same argument as in the ground AC-completion proof, based on Higman's lemma. Hence R_ω is finite and obtained in a finite number of steps, that is, there exists n such that R_ω ⊆ R_n. Then R_ω will clean the rest of E_n, and the newly generated critical pairs will be discarded as trivial ones.

Experimental Results
AC(X) has been implemented in the Alt-Ergo [8] theorem prover. In this section, we show that this extension has a strong impact on both performance and memory allocation w.r.t. an axiomatic approach. For that purpose, we benchmarked our implementation and compared its performance with state-of-the-art SMT solvers (Z3 v2.8, CVC3 v2.2, Simplify v1.5.4). All measures were obtained on a laptop running Linux equipped with a 2.58GHz dual-core Intel processor and with 4 GB of main memory. Provers were given a time limit of five minutes for each test and memory limitation was managed by the system. The results are given in seconds; we write to for timeout and om for out of memory.
Our test suite is made of formulas which are valid in the combination of the theory of linear arithmetic A, the free theory of equality E and a small part of the theory of sets defined by the symbols ∪, ⊆, the singleton constructor {·}, and the following set of axioms:
In order to get the most accurate information from our benchmarks, we classify formulas in three categories according to the subset of axioms needed to prove their validity. We use the standard mathematical notation ⋃_{i=1}^{d} a_i for terms of the form a_1 ∪ (a_2 ∪ (··· ∪ a_d) ···) and we write ⋃_{i=1}^{d} a_i ; b for terms of the form a_1 ∪ (a_2 ∪ (··· ∪ (a_d ∪ b)) ···). Formulas in the first category are of the form:
and proving their validity only requires the theory E and the AC properties of ∪.
The second category contains formulas additionally involving linear arithmetic:
In order to prove their validity, we additionally need some axioms of S. The results of the benchmarks are shown in Fig. 5, Fig. 6 and Fig. 7. The first column contains the results for Alt-Ergo when we explicitly declare ∪ as an AC symbol and remove the AC axioms from the problem. In the second column, we do not take advantage of AC(X) and keep the AC axioms in the context.
The main reason is that its instantiation mechanism is not spoiled by the huge number of intermediate terms the other provers generate when they instantiate the AC axioms.

Conclusion and Future Works
We have presented a new algorithm AC(X) which efficiently combines, in the ground case, the AC theory with a Shostak theory X and the free theory of equality. Our combination consists in a tight embedding of the canonizer and the solver for X in ground AC-completion. The integration of the canonizer relies on a new rewriting relation, reminiscent of normalized rewriting, which interleaves canonization and rewriting rules. We proved correctness of AC(X) by reusing standard proof techniques. Completeness is established thanks to a proof reduction argument, and termination follows the lines of the proof of ground AC-completion where the finitely branching result is adapted to account for the theory X.
AC(X) has been implemented in the Alt-Ergo theorem prover. The first experiments are very promising and show that a built-in treatment of AC, in the combination of the free theory of equality and a Shostak theory, is more efficient than an axiomatic approach. Although effective, the integration of AC(X) in Alt-Ergo fails to prove the formula (∀x, y, z. P((x ∪ y) ∪ z)) ∧ b ≈ c ∪ d → P(a ∪ b) since the trigger for the internal quantified formula (the term (x ∪ y) ∪ z) does not match the term a ∪ b, even when exploiting the ground equation b = c ∪ d, which only allows matching against the term a ∪ (c ∪ d). Introducing explicitly the AC axioms for ∪ would allow the matcher to generate the ground term (a ∪ c) ∪ d that could be matched. However, as shown by our benchmarks, too many terms are generated with these axioms in general. In order to fix this problem, we intend to extend the pattern-matching algorithm of Alt-Ergo to exploit both ground equalities and properties of AC symbols.
In the near future, we also plan to extend AC(X) to handle the AC theory with unit or idempotence. This will be a first step towards a decision procedure for a substantial part of the finite sets theory.