Deciding Quantifier-Free Presburger Formulas Using Parameterized Solution Bounds

Given a formula in quantifier-free Presburger arithmetic, if it has a satisfying solution, there is one whose size, measured in bits, is polynomially bounded in the size of the formula. In this paper, we consider a special class of quantifier-free Presburger formulas in which most linear constraints are difference (separation) constraints, and the non-difference constraints are sparse. This class has been observed to commonly occur in software verification. We derive a new solution bound in terms of parameters characterizing the sparseness of linear constraints and the number of non-difference constraints, in addition to traditional measures of formula size. In particular, we show that the number of bits needed per integer variable is linear in the number of non-difference constraints and logarithmic in the number and size of non-zero coefficients in them, but is otherwise independent of the total number of linear constraints in the formula. The derived bound can be used in a decision procedure based on instantiating integer variables over a finite domain and translating the input quantifier-free Presburger formula to an equi-satisfiable Boolean formula, which is then checked using a Boolean satisfiability solver. In addition to our main theoretical result, we discuss several optimizations for deriving tighter bounds in practice. Empirical evidence indicates that our decision procedure can greatly outperform other decision procedures.


Introduction
Presburger arithmetic [Pre29] is the first-order theory of the structure N, 0, 1, , + , where N denotes the set of natural numbers.The satisfiability problem for Presburger arithmetic is decidable, but of super-exponential worst-case complexity [FR74] For each software verification project, the maximum fraction of non-difference constraints is shown, as well as the maximum width of a non-difference constraint, where the maximum is taken over all formulas in the set.The Blast formulas were generated from device drivers written in C, the Magic formulas from an implementation of openssl written in C, the MIT formulas from Java programs, and the WiSA formulas were generated in the checking of format string vulnerabilities.
(1) Mainly Difference Constraints: Of the m constraints, m − k are difference constraints, where k ≪ m.Difference constraints, also called separation or difference-bound constraints, are of the form x i − x j ⊲⊳ b t or x i ⊲⊳ b t , where b t is an integer constant, and ⊲⊳ stands for a relational symbol in the set {>, ≥, =, <, ≤}.
(2) Sparse Structure: The k non-difference constraints are sparse, with at most w variables per constraint, where w is "small".We will refer to w as the width of the constraint.Pratt [Pra77] observed that most inequalities generated in program verification are difference constraints.More recently, the authors of the theorem prover Simplify observed in the context of the Extended Static Checker for Java (ESC/Java) project that "the inequalities that occur in program checking rarely involve more than two or three terms" [DNS03].We have performed a study of formulas generated in various recent software verification projects: the Blast project at Berkeley [HJMS02], the Magic project at CMU [CCG + 03], the Wisconsin Safety Analyzer (WiSA) project [Wis], and the software upgrade checking project at MIT [ME03].The results of this study, indicated in Table 1, support the aforementioned observations regarding the "sparse, mostly difference" nature of constraints in QFP formulas.To our knowledge, no previous decision procedure for QFP has attempted to exploit this problem structure.
We make the following novel contributions in this paper: • We derive bounds on solutions for QFP formulas, not only in terms of the traditional parameters m, n, a max , and b max , but also in terms of k and w.In particular, we show that the worst-case number of bits required per integer variable is linear in k, but only logarithmic in w.Unlike previously derived bounds, ours is not in terms of the total number of constraints m. • We use the derived bounds in a sound and complete decision procedure for QFP based on finite instantiation, and present empirical evidence that our method can greatly outperform other decision procedures.Related Work.There has been much work on deciding quantifier-free Presburger arithmetic; we present a brief discussion here and refer the reader to a recent survey [GBD02] for more details.Recent techniques fall into four categories: • The first class comprises procedures targeted towards solving conjunctions of constraints, with disjunctions handled by enumerating terms in a disjunctive normal form (DNF).
Examples include the Omega test [Pug91] (which is an extension of Fourier-Motzkin elimination for integers) and solvers based on other integer linear programming techniques.The drawback of these methods is the need to enumerate the potentially exponentially many terms in the DNF representation.Our work is targeted towards solving formulas with a complicated Boolean structure, which often arise in verification applications.• The second set of methods attempt to remedy this problem by instead relying on modern SAT solving strategies.The approach works as follows.A Boolean abstraction of the QFP formula Φ is generated by replacing each linear constraint with a corresponding Boolean variable.If the abstraction is unsatisfiable, then so is Φ.If not, the satisfying assignment (model) is checked for consistency with the theory of quantifier-free Presburger arithmetic, using a ground decision procedure for conjunctions of linear constraints (i.e., a procedure for checking feasibility of integer linear programs).Assignments that are inconsistent are excluded from later consideration by adding a "lemma" to the Boolean abstraction.The process continues until either a consistent assignment is found, or all (exponentially many) assignments have been explored.Examples of decision procedures in this class that have some support for QFP include CVC [BDS02,BGD03] and ICS [dMRS02]. 3The ground decision procedures used by provers in this class employ a combination framework such as the Nelson-Oppen architecture for cooperating decision procedures [NO79] or a Shostaklike combination method [Sho84,SR02].These methods are only defined for combining disjoint theories.In order to exploit the mostly-difference structure of a formula, one approach could be to combine a decision procedure for a theory of difference constraints with one for a theory of non-difference constraints, but this needs an extension of the combination methods that applies to these non-disjoint theories.• Strichman [Str02] presents SAT-based decision procedures for linear arithmetic (over the rationals) and QFP.For QFP, the basic idea is to create a Boolean encoding of all the possible variable projection steps performed by the Omega test.Since Fourier-Motzkin elimination (and therefore, the Omega test) has worst-case double-exponential complexity in both time and space [Cha93], this approach leads to a SAT problem that, in the worstcase, is doubly-exponential in the size of the original formula and takes doubly-exponential time to generate.In contrast, in our approach the SAT-encoding is polynomial in the size of the original formula, and is generated in polynomial time.• The final class of methods are based on automata theory (e.g., [WB95,GBD02]).The basic idea in these methods is to construct a finite automaton corresponding to the input QFP formula Φ such that the language accepted by the automaton consists of the binary encodings of satisfying solutions of Φ.According to a recent experimental evaluation with other methods [GBD02], these techniques are better than others at solving formulas with very large coefficients, but do not scale well with the number of variables and constraints. 4The approach we present in this paper is distinct from the categories mentioned above.In particular, the following unique features differentiate it from previous methods: • It is the first finite instantiation method and the first tractable procedure for translating a QFP formula to SAT in a single step.The clear separation between the translation and the SAT solving allows us to leverage future advances in SAT solving far more easily than other SAT-based procedures.• It is the first technique, to the best of our knowledge, that formally exploits the structure of formulas commonly encountered in software verification.In addition to the above, the bounds we derive in this paper are also of independent theoretical interest.For instance, they indicate that the solution bound does not depend on the number of difference constraints.
Outline of the paper.The rest of this paper is organized as follows.In Section 2, we discuss background material on bounds on satisfying solutions of integer linear programs.An integer linear program (ILP) is a conjunction of linear constraints, and hence is a special kind of QFP formula.The bounds for QFP follow directly from those for ILPs.Our main theoretical results are presented in Section 3. Section 3.1 gives bounds for ILPs for the case of k = 0, when all constraints are difference constraints.In Section 3.2, we compute a bound for ILPs for arbitrary k.In Section 3.3, we show how our results extend to arbitrary QFP formulas.Techniques for improving the bound in practice are discussed in Section 4. We report on experimental results in Section 5, and conclude in Section 6.

Background
In this section, we define the integer linear programming problem formally and state the previous results on bounding satisfying solutions of ILPs.A more detailed discussion on the steps outlined in Section 2.1 can be found in reference books on ILP (e.g.[Sch86,PS82]).
2.1.Preliminaries.Consider a system of m linear constraints in n integer-valued variables: Ax ≥ b (2.1)Here A is an m × n matrix with integral entries, b is a m × 1 vector of integral entries, and x is a n × 1 vector of integer-valued variables.A satisfying solution to system (2.1) is an evaluation of x that satisfies (2.1).
In system (2.1), the entries in x can be negative.We can constrain the variables to be non-negative by adding a dummy variable x 0 that refers to the "zero value," replacing each original variable x i by x ′ i − x 0 , and then adjusting the coefficients in the matrix A to get a new constraint matrix A ′ and the following system:5 Here the system has n ′ = n + 1 variables, and A ′ has the structure that a ′ i,j = a i,j for j = 1, 2, . . ., n and a ′ i,n+1 = − n j=1 a i,j .Note that the last column of A ′ is a linear combination of the previous n columns.It is easy to show that system (2.1) has a solution if and only if system (2.2) has one.
Finally, adding surplus variables to the system, we can rewrite system (2.2) as follows: where A ′′ = [A| − I m ] is an m × (n ′ + m) integer matrix formed by concatenating A with the negation of the m × m identity matrix I m .
For convenience we will drop the primes, referring to A ′′ and x ′′ simply as A and x.Rewriting system (2.3) thus, we get Hereafter we will mostly use the definition in (2.4).
We next define two useful terms: solution bound and enumeration bound.Proof.Given a solution x ′ * to system (2.2), we construct a solution x * to system (2.1) by setting x d] for all j.Similarly, if d is an enumeration bound for system (2.1), then 2d is a solution bound for system (2.2).
Finally, we introduce symbols a max and b max with the following associated meanings: a max = max (i,j) |a i,j | and b max = max t |b t |.In words, a max and b max are tight upper bounds on the absolute values of entries of A and b respectively.

Previous Results
. The results of this paper build on results obtained by Borosh, Treybig, and Flahive [BT76,BFT86] on bounding the solutions of systems of the form (2.4).We state their result in the following theorem: Theorem 1.Consider the augmented matrix [A|b] of dimension m × (n ′ + m + 1).Let ∆ be the maximum of the absolute values of all minors of this augmented matrix.Then, the system (2.4) has a satisfying solution if and only if it has one with all entries bounded by (n + 2)∆.
However, note that the determinant of a matrix can be more than exponential in the dimension of the matrix [BC72].In the case of the Borosh-Flahive-Treybig result, it means that ∆ can be as large as µ m (m+1) (m+1)/2 2 m , where µ = max(a max , b max ).Papadimitriou [Pap81,PS82] also gives a bound of similar size, stated in the following theorem: Theorem 2. If the ILP of (2.4) has a satisfying solution, then it has a satisfying solution where all entries in the solution vector are bounded by (n Papadimitriou's bound implies that we need O(log m + log b max + m[log m + log a max ]) bits to encode each variable (assuming n ′ = O(m)).The Borosh-Flahive-Treybig bound implies needing O(m[log m + log µ]) bits per variable, which is of the same order.

3.1.
Bounds for a System of Difference Constraints.Let us first consider computing solution bounds for an ILP for the case where k = 0, i.e., system (2.4) comprises only of difference constraints.
In this case, the left-hand side of each equation comprises exactly three variables: two variables x i and x j where 0 ≤ i, j ≤ n and one surplus variable x l where n + 1 ≤ l ≤ n + m.The t th equation in the system is of the form As we noted in Section 2.1, the matrix A can be written as [A o |−I m ] where A o comprises the first n ′ = n + 1 columns, and I m is the m × m identity matrix.
The important property of A o is that each row has exactly one +1 entry and exactly one −1 entry, with all other entries 0. Thus, A T o can be interpreted as the node-arc incidence matrix of a directed graph.Therefore, A T o is totally unimodular (TUM), i.e., every square submatrix of A T o has determinant in {0, −1, +1} [PS82].Therefore, A o is TUM, and so is Now, let us consider using the Borosh-Flahive-Treybig bound stated in Theorem 1.This bound is stated in terms of the minors of the matrix [A|b].For the special case of this section, we have the following bound on the size of any minor: If the minor is obtained by deleting the last column (corresponding to b), then it is a minor of A, and its value is in {0, −1, +1} since A is TUM.Thus, the bound of s b max is attained for any non-trivial minor with s ≥ 1 and b max ≥ 1.
Suppose the b column is not deleted.First, note that the matrix A is of the form [A o | − I m ] where the rank of A o is at most s ′ = min(n, m).This is because A o has dimensions m × n + 1, and the last column of A o , corresponding to the variable x 0 , is a linear combination of the previous n columns.(Refer to the construction of system (2.2) from system (2.1).) Next, suppose the sub-matrix corresponding to M comprises p columns from the −I m part, r − p − 1 columns from the A o part, and the one column corresponding to b.Since permuting the rows and columns of M does not change its absolute value, we can permute the rows of M and the columns corresponding to the −I m part to get the corresponding sub-matrix in the following form: Expanding M along the last column, we get where each M i is a minor corresponding to a submatrix of A. However, notice that M i = 0 for all 1 ≤ i ≤ p, since each of those minors have an entire column (from the −I m part) equal to 0. Therefore, we can reduce the right-hand side to the sum of r − p terms: Notice that, so far, we have not made use of the special structure of A. Now, observing that A is TUM, Further, since each non-zero M i can be of order at most s ′ , r − p ≤ s = min(s ′ + 1, m).6 Therefore, we get

|M | ≤ s b max
Using the terminology of Theorem 1, we have ∆ ≤ s b max .Thus, the bound in this case is (n + 2) s b max .
Thus, S, the bound on the number of bits per variable, is Formulas generated from verification problems tend to be overconstrained, so we assume n < m.Thus, s = n + 1, and the bound reduces to O(log n + log b max ) bits per variable.
Remark: The only property of the A matrix that the proof of Theorem 3 relies on is the totally unimodular (TUM) property.Thus, Theorem 3 would also apply to any system of linear constraints whose coefficient matrix is TUM.Examples of such matrices include interval matrices, or more generally network matrices.Note that the TUM property can be tested for in polynomial time [Sch86].
3.2.Bounds for a Sparse System of Mainly Difference Constraints.We now consider the general case for ILPs, where we have k non-difference constraints, each referring to at most w variables.
Without loss of generality, we can reorder the rows of matrix A so that the k nondifference constraints are the top k rows, and the difference constraints are the bottom m − k rows.Reordering the rows of A can only change the sign of any minor of [A|b], not the absolute value.Thus, the matrix [A|b] can be put into the following form: Here, A 1 is a k × n + 1 dimensional matrix corresponding to the non-difference constraints, A 2 is a m − k × n + 1 dimensional matrix with the difference constraints, I m is the m × m identity corresponding to the surplus variables, and the last column is the vector b.
For ease of presentation, we will assume in the rest of Sections 3.2 and 3.3 that k ≤ n+1.We will revisit this assumption at the end of Section 3.
The matrix composed of A 1 and A 2 will be referred to, as before, as A o .Note that each row of A 1 has at most w non-zero entries, and each row of A 2 has exactly one +1 and one −1 with the remaining entries 0. Thus, A 2 is TUM.
We prove the following theorem: where M j is a minor of A o .If M does not include b, then it is a minor of A. Without loss of generality, we can assume that M does not include a column from the −I m part of A, since such columns only contribute to the sign of the determinant.
So, let us consider bounding a minor M j of A o of order r (or r − 1, if M includes the b column).
Since A o = A 1 A 2 , consider expanding M j , using the standard determinant expansion by minors along the top k rows corresponding to non-difference constraints.Each term in the expansion is (up to a sign) the product of at most k entries from the A 1 portion, one from each row, and a minor from A 2 .Since A 2 is TUM, each product term is bounded in absolute value by a k max .Furthermore, there can be at most w k non-zero terms in the expansion, since each non-zero product term is obtained by choosing one non-zero element from each of the rows of the A 1 portion of M j , and this can be done in at most w k ways.Therefore, |M j | is bounded by (a max w) k .Combining this with the inequality (3.1), and since r − p ≤ s, we get |M | ≤ s b max (a max w) k which is what we set out to prove.Thus, we conclude that ∆ ≤ s b max (a max w) k , where s = min(n + 1, m).From Theorems 1 and 4, and Remark 1, we obtain the following theorem: Theorem 5. A solution bound for the system (2.2) is Thus, the solution size S is We make the following observations about the bound derived above, assuming as before, that n < m, and so s = n + 1: • Dependence on Parameters: We observe that the bound is linear in k, logarithmic in a max , w, n, and b max .In particular, the bound is not in terms of the total number of linear constraints, m. • Typical-case Asymptotic Growth: As observed in our study of formulas from software verification, w is typically a small constant, so the number of bits needed per variable is O(log n + log b max + k log a max + k).In many cases, a max and k are also bounded by a small constant.Thus, S is typically O(log n + log b max ).This reduces the search space by an exponential factor over using the bound expressed in terms of m. • Representing Non-difference Constraints: There are many ways to represent non-difference constraints and these have an impact on the bound we derive.In particular, it is possible to transform a system of non-difference constraints to one with at most three variables per constraint.For example, the linear constraint x 1 + x 2 + x 3 + x 4 = x 5 can be rewritten as: For the original representation, k = 1 and w = 5, while for the new representation k = 3 and w = 3.Since our bound is linear in k and logarithmic in w, the original representation would yield a tighter bound.Similarly, one can eliminate variables with coefficients greater than 1 in absolute value by introducing new variables; e.g., 2x is represented as x+x ′ with an additional difference constraint x = x ′ .This can be used to adjust w, a max , and n so that the overall bound is reduced.
The derived bound only yields benefits in the case when the system has few nondifference constraints which themselves are sparse.In this case, we can instantiate variables over a finite domain that is much smaller than that obtained without making any assumptions on the structure of the system.Finally, from Proposition 1 and Theorem 5, we obtain an enumeration bound for system (2.1):Theorem 6.An enumeration bound for system (2.1) is Note that the values of a max and w in the statement of Theorem 6 are those for system (2.2).
3.3.Bounds for Arbitrary Quantifier-Free Presburger Formulas.We now return to the original goal of this paper, that of finding a solution bound for an arbitrary QFP formula Φ.
Suppose that Φ has m linear constraints φ 1 , φ 2 , . . ., φ m , of which m − k are difference constraints, and n variables x 1 , x 2 , . . ., x n .As before, we assume that each non-difference constraint has at most w variables, a max is the maximum over the absolute values of coefficients a i,j of variables, and b max is the maximum over the absolute values of constants b i appearing in the constraints.Furthermore, let us assume that the zero variable (used in transforming system 2.1 to system 2.2) have already been introduced into the constraints.
We prove the following theorem.
Theorem 7. If Φ is satisfiable, there is a solution to Φ that is bounded by (n + 2)∆ where ∆ = s (b max + 1) (a max w) k and s = min(n + 1, m).
Proof.Let σ be a (concrete) model of Φ.Let m ′ constraints, φ i 1 , φ i 2 , . . ., φ i m ′ , evaluate to true under σ, the rest evaluating to false.Let A ′ = [a i,j ] be a m ′ × n matrix in which each row comprises the coefficients of variables x 1 , x 2 , . . ., x n in a constraint Clearly, σ is a satisfying solution to the ILP given by Also, if the system (3.2) has a satisfying solution then Φ is satisfied by that solution.Thus, Φ and the system (3.2) are equi-satisfiable, for every possible system (3.2) we construct in the manner described above.
By Theorems 1 and 4, we can conclude that if system (3.2) has a satisfying solution, it has one bounded by (n + 2)∆ where ∆ = s (b max + 1) (a max w) k and s = min(n + 1, m).Moreover, this bound works for every possible system (3.2).
Therefore, if Φ has a satisfying solution, it has one bounded by (n + 2)∆.
Thus, to generate the Boolean encoding of the starting QFP formula, we must encode each integer variable as a symbolic bit-vector of length S given by If the zero variable is not introduced into the formula Φ, we can search for solutions in n i=1 [−d, d], where d = (n + 2)∆.As noted earlier, values of a max and w used in computing ∆ are those obtained after introducing the zero variable.
Remark 3. In Section 3.2, we assumed, for ease of presentation, that k ≤ n + 1.If this does not hold, we can simply replace k in the results of Sections 3.2 and 3.3 by min(k, n+1).This is because the dimension of the minor M j of A o (mentioned in the proof of Theorem 4) is limited by n + 1.
We conclude this section by summarizing the symbols used to represent formula parameters and the quantities derived therefrom.For easy reference, they are listed in

Improvements
The bounds we derived in the preceding section are conservative.For a particular problem instance, the size of minors can be far smaller than the bound we computed.However, this cannot be directly exploited by enumerating minors, since the number of minors grows exponentially with the dimensions of the constraint matrix.Also, there is a special case under which one can improve the (n+2)∆ bound.If all the constraints are originally equalities and the system of constraints has full rank, a bound of ∆ suffices [BFRT89].However, in our experience, even if the linear constraints are all equalities, they still tend to be linearly dependent.Thus, we have not been able to make use of this special case result.
Fortunately, there are other techniques for improving the solution bound that we have found to be fairly useful in practice.These include theoretical improvements as well as heuristics that are useful in practice.We describe these methods in this section.
4.1.Variable Classes.So far, we have used a single bit-vector length for all integer variables appearing in the formula Φ.This is overly conservative.In general, we can partition the set of variables into classes such that two variables are placed in the same class if there is a constraint in which they both appear with non-zero coefficients.Note, moreover, that this partitioning optimization can be performed before adding the "zero" variable x 0 .A different zero variable is then used for each variable class.For each class, we separately compute parameters n, k, b max , a max , and w, resulting in a separately computed bit-vector length for each class.
For example, consider the formula In this case, variables x 1 , x 2 , and x 3 fall into one class, while x 4 and x 5 will be put into a different class.The correctness of this partitioning optimization follows from a reduction to ILP as performed in the proof of Theorem 7, along with the following two observations: • By construction, different variable classes share neither variables nor constraints.
• A different zero variable can be introduced for each class because that transformation preserves solutions in the same way as the transformation from system (2.1) to system (2.2) does.• A satisfying solution to a system of ILPs, no two of which share a variable, can be obtained by solving them independently and concatenating the solutions.
4.2.Tighter Bounds for Special Constraint Classes.Consider specializing the solution bound of Section 3.3 to the special cases of equality logic and difference logic.(An equality logic formula only has constraints of the form x i = x j .)For equality logic, k = 0, and b max = 0. Thus, our bound specializes to (n + 2) • s, which, assuming n < m, is O(n 2 ).For separation logic too, k = 0.This yields a bound of (n + 2) • s • (b max + 1).
However, both of these bounds are too conservative.
For an equality logic formula with n variables, it is well-known that a solution bound of n suffices to decide the satisfiability of the formula.
Similarly, if the formula is in difference logic, a solution bound of min(n, m) • (b max + 1) suffices.We sketch the proof of this result here, omitting details.The proof is based on a graph-theoretic view of difference-bound constraints, with each variable corresponding to a vertex, and a constraint x i ≥ x j + b t corresponding to an edge from x i to x j of weight b t .(The graph is constructed after first putting the formula into negation normal form; see the paper by Strichman et al. [SSB02] for details on graph construction.)A satisfying assignment is an assignment of integers to vertices such that the graph has no positive cycles.Now note that, in this graph, the longest path is of length min(n, m) • (b max + 1), since there are n + 1 vertices in the graph (including that for the zero variable) and the weight of any edge is at most b max + 1.Thus, if there is a satisfying assignment, there is one in which the separation between the minimum and maximum integer value does not exceed min(n, m) • (b max + 1).This concludes the proof sketch.
Clearly, if the formula is purely in equality logic or purely in difference logic, we can use the tighter bounds for the appropriate logic.However, the optimization of computing variable classes (presented in Section 4.1) allows us to exploit the tighter bounds even if the overall formula is not in equality logic or difference logic: The tighter bounds can be used for encoding variables in variable classes that comprise purely equality or purely difference constraints.The correctness of this optimization follows for the same reasons as that of the original variable class partitioning optimization.4.3.Dealing with Large Coefficients and Widths.In the expression for S, the term involving a max (and w) is multiplied by a factor of k.Thus, any increase in log a max gets amplified by a factor of k.It is therefore useful, in practice, to more carefully model the dependence of S on coefficients.We present two techniques to alleviate the problem of dealing with large coefficients.These techniques also apply to dealing with large constraint widths.

4.3.1.
An n k -fold reduction.The coefficient of the zero variable x 0 has, so far, been used in computing a max .We will now show that we can ignore this coefficient, and also ignore any contribution of x 0 to the width w.This optimization can result in a reduction of up to a factor of n k in the solution bound d.
The largest reduction occurs when, in the original formula, we have a constraint of the form j a i x j ≥ b i , where a i is the largest coefficient in absolute value.After adding the zero variable, this constraint is transformed to ( j a i x j ) − (n • a i )x 0 ≥ b i .Thus, a max now equals n • a i , a factor of n times greater than in the original formula.
Let us revisit the transformation performed in Section 2.1 to convert system (2.1) to system (2.2).A different and commonly-used transformation to non-negative variables is to write each x j as x + j − x − j , where x + j , x − j ≥ 0 for all j.Let the resulting system be referred to as system (2.2').Let us assume that this different transformation is used in place of the original one that generates system (2.2), leaving all successive transformations the same.Now, consider the form of the matrix [A|b], as used in Section 3.2, reproduced below: With the new transformation method, A 1 is a k × 2n dimensional matrix corresponding to the non-difference constraints, A 2 is a (m − k) × 2n dimensional matrix with the difference constraints, I m is the m × m identity corresponding to the surplus variables, and the last column is the vector b.Importantly, note that A 2 is still totally unimodular and the ranks of A 1 and A 2 are the same as they were with the use of the single zero variable x 0 .This is because any non-singular sub-matrix of A o must include exactly one of the columns corresponding to x + i and x − i , since they are negations of each other.Therefore, the values of w and a max used in the proof of Theorem 4 are those for the system (2.1).
Thus, if we use the transformation method of replacing x i with x + j − x − j , the values of w and a max used in the statement of Theorem 4 are those for the system (2.1).
Note, however, that by replacing x i with x + j − x − j , the number of variables in the problem doubles, and in particular, the number of input variables in the SAT-encoding is doubled.This is rather undesirable.
Fortunately, there are two solutions that avoid the doubling of variables at the minor cost of only 1 extra bit per variable.
(1) The first solution is based on the following proposition that mirrors Proposition 1.
Proof.Given a solution x ′ * within the solution bound d to system (2.2'), we construct a solution x * to system (2.1) by setting Thus, we can restrict our search to the hypercube n i=1 [−d, d], where the solution bound d is computed using the values of w and a max for the system (2.1).
(2) The second solution uses the following proposition showing that we can use the technique of adding a zero variable x 0 and the values of w and a max for the system (2.1), while paying only a minor penalty of 1 extra bit per variable.
Proof.(if part): Suppose system (2.2') has a solution in [0, d]; i.e., x + j , x − j ∈ [0, d] for all j.Then, we construct a satisfying assignment to system (2.2) as follows: • x 0 is assigned the value max j x − j .• x j , for j > 0, is assigned the value x + j + (x 0 − x − j ).Since 0 ≤ (x 0 − x − j ) ≤ d, we can conclude that 0 ≤ x j ≤ 2d for all j.It is easy to see that the resulting assignment satisfies system (2.2).
(only if part): Suppose system (2.2) has a solution in [0, 2d].This means that the original system (2.1) is feasible.It follows that system (2.2') has a solution in [0, d].
In both solutions, we must search 2d + 1 values for each variable x j , 1 ≤ j ≤ n.However, the former avoids the need to add x 0 , and hence will have fewer input variables in the SAT-encoding.Hence, the former solution is preferable.
The reader must note, though, that this optimization is only relevant when the introduction of the zero variable (significantly) affects the value of a max .(The impact on w is minor.)If the value of a max is unaffected by the introduction of the zero variable x 0 , using x 0 can result in a more compact SAT-encoding than using an enumeration domain of [−d, d] for each variable.If one uses the x 0 variable, one introduces log d input Boolean variables for x 0 in the SAT-encoding.On the other hand, without the x 0 variable, one introduces n additional Boolean variables to encode sign bits.The relative size of the SAT-encoding, and hence the decision to introduce x 0 , would depend on whether n significantly exceeds log d.

Product of k largest coefficients and widths.
There is a simpler optimization which we have found to be useful in practice.
In the proof of Theorem 4, in deriving the (a max • w) k term, we have assumed the worstcase scenario of each term in the determinant expansion equaling a k max and there being w terms to choose from in each row.
In fact, we can replace a k max with k i=1 a maxi , where a maxi denotes the largest coefficient in row i, in absolute value.Similarly, w k can be replaced with i w i , where w i is the width of constraint i.

4.4.
Dealing with Large Constant Terms.For some formulas, the value of b max is very large due to the presence of a single large constant (or very few of them).In such cases, a less conservative analysis or other problem transformations are useful.We present two such techniques here.
Like the optimization of Section 4.3.2,this has also proved fairly useful in practice.4.4.2.Shift of origin.Another transformation that can be useful for dealing with large constant terms is to replace a variable x j by x j − α j ; this corresponds to shifting the origin in R n by α j along the x j -axis.
The i th constraint is then transformed into j a i,j (x j − α j ) ≥ b i .Rewriting this, we obtain the form j a i,j x j ≥ b ′ i , where b ′ i = b i + ( j a i,j α j ).The new value of b max , after the transformation, is max i |b ′ i |.Therefore, we wish to find values of α j s so as to minimize the value of max i |b ′ i |.
This problem can be phrased as the following integer linear program: This ILP has n + 1 variables and 2m + 1 constraints (including the non-negativity constraint on z).
In fact, one can write one such ILP for each variable class, since they do not share any variables or constraints.Then, the optimum value for each class will indicate the new value of b max to use for that class.

Implementation and Experimental Results
5.1.Implementation.We used the bound derived in the previous section to implement a decision procedure based on finite instantiation.
The procedure starts by analyzing the formula to obtain parameters, and computes the solution bound.We found that the optimizations of Section 4.1, 4.2, and 4.3.1 are always useful, especially since formulas tend to contain many variables classes comprising of only difference constraints.Hence, our base-line implementation always includes these optimizations.The impact of other optimizations is reported in Section 5.2.2.
Given the solution bound, integer variables in the QFP formula are encoded as symbolic bit-vectors large enough to express any integer value within the bound.Arithmetic operators are implemented as arbitrary-precision bit-vector arithmetic operations.Equalities and inequalities over integer expressions are translated to corresponding relations over bit-vector expressions.The resulting Boolean formula is passed as input to a SAT solver.
We implemented our procedure as part of the UCLID verifier [UCL], which is written in Moscow ML [Mos].In our implementation we used the zChaff SAT solver [zCh] version 2004.5.13.In the sequel, we will refer to our decision procedure as the "UCLID" procedure.

Experimental Results.
We report here on a series of experiments we performed to evaluate our decision procedure against other theorem provers, as well as to assess the impact of the various optimizations discussed in Section 4. 7All experiments were performed on a Pentium-IV 2 GHz machine with 1 GB of RAM running Linux.A timeout of 3600 seconds (1 hour) was imposed on each run.5.2.2.Impact of optimizations.In this section, we discuss the impact of optimizations discussed in Sections 4.3 and 4.4.
Table 4 compares the following 4 different encoding options based on different ways of computing the solution bound: Base: The base-line method of computing the solution bound.Coeff: Using the optimization of Section 4.3.2alone.Const: Using the optimization of Section 4.4.1 alone.
All: Using optimization methods of both Sections 4.3.2 and 4.4.1.The comparison is made with respect to the largest number of bits needed for any variable class, and the run-times for both generating the SAT-encoding and for SAT solving.First, we note that Coeff and Const both generate more compact encodings than Base; on the WiSA benchmarks, they use about 5-10 fewer bits per variable in the largest variable class.The reduction in the total number of bits, summed over all variables in all variable classes, is similar, since most variables fall into a single class.
The encoding times decrease with reduction in number of bits; this is just as one would predict.
However, the comparison of SAT solving times is more mixed; on a few benchmarks Coeff and Const outperform Base, and on others, they do worse.The latter behavior is observed especially on satisfiable formulas.The reason for this appears to be a relative ease in finding larger solutions for those formulas than finding smaller solutions.
When Coeff and Const are both used (indicated as "All"), we find that not only are encoding times smaller than the Base technique, but SAT solving times are also smaller in all cases.This seems to indicate that a reduction in SAT-encoding size beyond a certain limit overcomes any negative effects of restricting the search to smaller solutions.
We also performed an experiment to explore the use of the shift-of-origin optimization described in Section 4.4.2.UCLID automatically formulated the ILP and solved it using the CPLEX optimization tool [CPL] (version 8.1).Since none of the benchmarks listed in Table 3 have especially large constants, we used a different, unsatisfiable formula from the Blast suite which has only difference constraints, but with large constants.
Table 5 summarizes the key characteristics of this formula as well as the results obtained by comparing versions of the base-line (Base) implementation with and without the optimization enabled.We list the values of parameters, with and without the shift-of-origin optimization enabled, for the variable classes that yield the two largest values of S when the optimization is disabled.With the optimization turned on, the largest constant in the entire formula falls from 261133242 to 432539, a 600-fold reduction.However, if we restrict our attention to the largest variable class, comprising 230 variables, the reduction in b max is more modest, about a factor of 4. This yields a saving of 2 bits per variable for that variable class.The saving in the total number of bits, summed over all variable classes, is 677.This is, however, not large enough to reduce either the encoding time or the SAT time.In fact, the encoding time increases by about a second; this is the time required to run CPLEX and for the processing overhead of creating the ILP.
Even though the shift-of-origin optimization has not resulted in faster run-times in our experiments, it clearly has the potential to greatly reduce the number of bits, and might prove useful on other benchmarks.

5.2.3.
Comparison with other theorem provers.We compared UCLID's performance with that of the SAT-based provers ICS [ICS] (version 2.0) and CVC-Lite [CVC] (the new implementation of CVC, version 2.0.0),9 as well as the automata-based procedure LASH [LAS] (version 0.9).While CVC-Lite and LASH are sound and complete for QFP, ICS 2.0 is incomplete; i.e., it can report a formula to be satisfiable when it is not.The ground decision procedure ICS uses is the Simplex linear programming algorithm with some additional heuristics to deal with integer variables.However, in our experiments, both UCLID and ICS returned the same answer whenever ICS terminated within the timeout.The ground decision procedure for CVC-Lite is a proof-producing variant of the Omega test [BGD03].
LASH was unable to complete on any benchmark within the timeout since it was unable to construct the corresponding automaton; we attribute this to the relatively large number of variables and constraints in our formulas, and note that Ganesh et al. obtained similar results in their study [GBD02] The UCLID version is the one with all optimizations turned on ("All").For ICS, we give the total time, the number of inconsistent Boolean assignments analyzed by the ground decision procedure ("#(Inc.assn.)"), as well as the overall time taken by the ground decision procedure ("Ground").For CVC-Lite, we indicate the total run-time.A " * " indicates that the decision procedure timed out after 3600 sec.LASH did not complete within the timeout on any formula.
A comparison of UCLID versus ICS and CVC-Lite is displayed in Table 6.From Table 6, we observe that UCLID outperforms ICS on all the WiSA benchmarks, terminating within a few seconds on each one.However, ICS performs best on the Blast formulas, finishing within a fraction of a second on all.CVC-Lite runs much faster than ICS on the satisfiable WiSA formulas, but does not finish on either of the unsatisfiable WiSA formulas, and does not outperform UCLID on any of the WiSA benchmarks.However, it outperforms UCLID on one of the Blast formulas.Due to the unavailability of statistics on where CVC-Lite spends its time, we can only present a detailed comparison between UCLID and ICS here.We believe that CVC-Lite's superior performance to ICS on satisfiable formulas is mainly due to improved Boolean simplification heuristics and, to a lesser extent, due to a faster ground decision procedure. 10The better performance compared to UCLID on one of the Blast formulas is because that formula is propositionally unsatisfiable, as we will discuss in more detail below.
Let us consider the WiSA benchmarks first.These formulas have a non-trivial Boolean structure that requires ICS to enumerate many inconsistent Boolean assignments before being able to decide the formula.The ICS run-time is dominated by the time taken by the ground decision procedure.We observe that the number of inconsistent Boolean assignments alone is not a precise indicator of total run-time, which also depends on the time taken by the ground decision procedure in ruling out a single Boolean assignment.Further optimization of ICS's ground decision procedure might improve its overall run-time, at least on the satisfiable formulas.
The reason for UCLID's superior performance is the formula structure, where k, w, and a max remain fixed at a low value while m, n, and b max increase.Thus, the maximum number of bits per variable stays about the same even as m increases substantially, and the resulting SAT problem is within the capacity of zChaff.The times for both encoding and SAT solving phases are small.In particular, the small SAT solving time on the unsatisfiable instances indicates that the proof of unsatisfiability is also small.
Next, consider the results on the Blast formulas.The reason for ICS's superior performance on these can be gauged by the number of inconsistent Boolean assignments it has to enumerate.On the formula named "blast-tl3", purely Boolean reasoning suffices to decide unsatisfiability.For the other two formulas, the reason for unsatisfiability is a mutually-inconsistent subset amongst all the linear constraints that are conjoined together, and a single call to ICS's ground decision procedure suffices to infer the inconsistency.In all three cases, the "proof of unsatisfiability" that ICS must find is small.
On the other hand, UCLID's run-time is dominated by the encoding time.Once the encoding is generated, the SAT solver decides unsatisfiability easily.
To summarize, it appears that decision procedures like ICS and CVC-Lite, which are based on a lazy translation to SAT, are effective when the formula structure is such that only a few calls to the ground decision procedure are required (i.e., satisfiable solutions are easy to find, or the proof of unsatisfiability is shallow), and the ground decision procedure is itself efficient.UCLID performs better on formulas with complicated Boolean structure and comprising linear constraints with the sparse structure formalized in this paper.

Conclusions and Future Work
In this paper, we have presented a formal approach to exploiting the "sparse, mainly difference constraint" nature of quantifier-free Presburger formulas encountered in software verification.Our approach is based on formalizing this sparse structure using new parameters, and deriving a new parameterized bound on satisfying solutions to QFP formulas.We have also proposed several ways in which the bound can be reduced in practice.Experimental results show the benefits of using the derived bound in a SAT-based decision procedure based on finite instantiation.
The work described in this paper can be extended in a few new directions.Some of these are discussed below.6.1.Computing the Solution Bound Lazily.In our implementation, we compute a conservative bound and translate a QFP formula to a Boolean formula in a single step.An alternative approach is to perform this transformation lazily, increasing the solution bound "on demand".
One such lazy encoding approach works, in brief, as follows.(Details can be found in the paper by Kroening et al. [KOSS04].) We start with an encoding size for each integer variable that is smaller than that prescribed by the conservative bound (say, 1 bit per variable).
If the resulting Boolean formula is satisfiable, so is the original QFP formula.If not, the proof of unsatisfiability generated by the SAT solver is used to generate a sound abstraction of the original formula, which can be checked with a sound and complete decision procedure for QFP (such as the one proposed in this paper).If this decision procedure concludes that the abstraction is unsatisfiable, so is the original formula, but if not, it provides a counterexample which indicates the necessary increase in the encoding size.A new SATencoding is generated, and the procedure repeats.
The bound S on solution size that we derive in this paper implies an upper bound nS on the number of iterations of this lazy encoding procedure; thus the lazy encoding procedure needs only polynomially many iterations before it terminates with the correct answer.
The potential advantage of this lazy approach is two-fold: (1) It avoids using the conservative bounds we have derived in this paper, and (2) if the generated abstractions are small, the sound and complete decision procedure used by this approach will run much faster than if it were fed the original formula.
For the WiSA benchmarks discussed in Section 5, we found that a solution bound of 2 8 −1, i.e., 8 bits per variable, is sufficient to decide satisfiability.However, the time required to derive this bound using the method of [KOSS04] is much greater than the run-times we report in Section 5. Still, the lazy approach might prove especially useful in cases in which S is so large that the SAT problem is outside the reach of current SAT solvers.6.2.Special Classes of Constraints.In Section 4.2, we saw that if all linear constraints are difference constraints, a tighter solution bound can be used.Recently, we have derived a tighter bound for a special class of constraints that is a superset of difference constraints.Constraints in this class refer to at most two variables (w = 2), and all variable coefficients are in {0, −1, +1} (i.e., a max ≤ 1).These constraints are referred to in literature as either generalized 2SAT constraints or unit two-variable per inequality constraints.For this special case, we have derived a solution bound of 2 • min(n, m) • (b max + 1) [SSB04], exactly twice the bound for difference logic.The proof techniques for deriving this bound are quite different from those used in this paper.
It would be interesting to find other special constraint classes for which the bounds presented in this paper can be further tightened.6.3.Other Directions and Open Problems.As we have observed in Section 5, the impact of reduction of number of bits on the SAT solving time is not always predictable.We are currently trying to better understand the reasons for this.
Encoding to SAT is not the only way in which the bounds presented in this paper can be used.It would be interesting to explore non-SAT-based decision procedures based on the bounds we derive.
The theoretical results of this paper rest heavily on the bound (n + 2) • ∆ given by Borosh, Treybig, and Flahive, stated in Theorem 1.In their 1992 paper [BT92], Borosh and Treybig conjectured that this bound can be improved to just ∆.To our knowledge, this conjecture is still open.
Finally, it would also be interesting to apply our work to areas outside of software verification that share the special structure of linear constraints exploited in this paper.

Theorem 3 .
The absolute value of any minor of [A|b] is bounded above by s b max , where s = min(n + 1, m).Proof.Consider any minor M of [A|b].Let r be the order of M .

Theorem 4 .
The absolute value of any minor of [A|b] is bounded above by s b max (a max w) k , where s = min(n + 1, m).Proof.Consider any minor M of [A|b], and let r be its order.As in Theorem 3, if M includes p columns from the −I m part of A, then we can infer that r − p ≤ s. (Our proof of this property in Theorem 3 made no assumptions on the form of A o .)If M includes the last column b, then as in the proof of Theorem 3, we can conclude that |M | ≤ (r − p) b max [ • Worst-case Asymptotic Growth: In the worst case, k = m, w = n + 1, and n = O(m), and we get the O(log m + log b max + m[log m + log a max ]) bound of Papadimitriou.

4.4. 1 .
Product of s largest constants.It is easy to see that, in the proof of Theorem 4, the s b max term can be replaced by s j=1 |b i j |, where b i 1 , b i 2 , . . ., b is are the s largest elements of b in absolute value.Similarly, the expression for ∆ derived in Theorem 7 gets modified to ∆ = s j=1

Table 1 :
Linear Arithmetic Constraints in Software Verification are Mostly Difference Constraints.
Definition 1.Given a QFP formula Φ, a solution bound is an integer d such that Φ has an integer solution if and only if it has an integer solution in the n-dimensional hypercube

Table 2 :
Parameters and Derived Quantities.

Table 4 :
An experimental evaluation of encoding optimizations.We compare the 4 different UCLID encoding options with respect to the maximum number of bits needed for any integer variable ("Max.#bits/var."),the time taken to generate the Boolean encoding, and the time taken by the SAT solver.

Table 5 :
Evaluating the shift-of-origin optimization.We list the values of parameters corresponding to variable classes with the two largest values of S, as computed without the shift-of-origin optimization."Total #bits" indicates the number of bits needed to encode all integer variables.Encoding time is indicated as "Enc."and SAT solving time as "SAT".

Table 6 :
. Experimental comparison with other theorem provers.