Generalized Craig Interpolation for Stochastic Boolean Satisfiability Problems with Applications to Probabilistic State Reachability and Region Stability

The stochastic Boolean satisfiability (SSAT) problem has been introduced by Papadimitriou in 1985 when adding a probabilistic model of uncertainty to propositional satisfiability through randomized quantification. SSAT has many applications, among them probabilistic bounded model checking (PBMC) of symbolically represented Markov decision processes. This article identifies a notion of Craig interpolant for the SSAT framework and develops an algorithm for computing such interpolants based on a resolution calculus for SSAT. As a potential application area of this novel concept of Craig interpolation, we address the symbolic analysis of probabilistic systems. We first investigate the use of interpolation in probabilistic state reachability analysis, turning the falsification procedure employing PBMC into a verification technique for probabilistic safety properties. We furthermore propose an interpolation-based approach to probabilistic region stability, being able to verify that the probability of stabilizing within some region is sufficiently large.


Introduction
Papadimitriou [Pap85] has proposed the idea of modeling uncertainty within propositional satisfiability (SAT) by adding randomized quantification to the problem description.The resultant stochastic Boolean satisfiability (SSAT) problems consist of a quantifier prefix followed by a propositional formula.The quantifier prefix is an alternating sequence of existentially quantified variables and variables bound by randomized quantifiers.The meaning of a randomized variable x is that x takes value true with a certain probability p and value false with the complementary probability 1 − p. Due to the presence of such probabilistic assignments, the semantics of an SSAT formula Φ no longer is qualitative in the sense that Φ is satisfiable or unsatisfiable, but rather quantitative in the sense that we are interested in the maximum probability of satisfaction of Φ. Intuitively, a solution of Φ is a strategy for assigning the existential variables, i.e. a tree of assignments to the existential variables depending on the probabilistically determined values of preceding randomized variables, such that the assignments maximize the probability of satisfying the propositional formula.
In recent years, the SSAT framework has attracted interest within the Artificial Intelligence community, as many problems from that area involving uncertainty have concise descriptions as SSAT problems, in particular probabilistic planning problems [LMP01, ML98,ML03].Inspired by that work, other communities have started to exploit SSAT and closely related formalisms within their domains.The Constraint Programming community is working on stochastic constraint satisfaction problems [Wal02,BS06] to address, among others, multi-objective decision making under uncertainty [BS07].Recently, a technique for the symbolic analysis of probabilistic hybrid systems based on stochastic satisfiability has been suggested by the authors [FHT08,TF09,FTE10,TEF11].To this end, SSAT has been extended by embedded theory reasoning over arithmetic theories, as known from satisfiability modulo theories (SMT) [BSST09], which yields the notion of stochastic satisfiability modulo theories (SSMT).By the expressive power of SSMT, bounded probabilistic reachability problems of uncertain hybrid systems can be phrased symbolically as SSMT formulae yielding the same probability of satisfaction [FHT08,TF09,FTE10,TEF11].As this bounded model checking approach yields valid lower bounds lb of the probability of reaching undesirable system states along unbounded runs, it is able to falsify probabilistic safety requirements of shape "a system error occurs with probability at most 0.1 ", namely if a lower bound lb > 0.1 is computed.
Though the general SSAT problem and even its restriction to 2CNF, i.e. to formulae in conjunctive normal form containing clauses with two literals only, are PSPACEcomplete [TF10], the plethora of real-world applications calls for practically efficient algorithms.The first SSAT algorithm, suggested by Littman [Lit99], extends the Davis-Putnam-Logemann-Loveland (DPLL) procedure [DP60,DLL62] for SAT with appropriate quantifier handling and algorithmic optimizations like thresholding.Majercik further improved the DPLL-based SSAT algorithm by non-chronological backtracking [Maj04].The SSMT algorithm from [FHT08, TF08, TF09, TEF11] being implemented in the SSMT tool SiSAT builds on the DPLL-based SSAT procedures plus conflict-driven clause learning, but also integrates an underlying theory solver addressing non-linear arithmetics, and was successfully applied to realistic case studies featuring hybrid discrete-continuous state spaces [TF09,FTE10,TEF11].Unlike these explicit tree-traversal approaches and motivated by work on resolution for propositional and first-order formulae [Rob65] and for quantified Boolean formulae (QBF) [BKF95], the authors have recently developed an alternative SSAT procedure based on resolution [TF10].
In this article, we investigate the concept of Craig interpolation for SSAT.Given two formulae A and B for which A ⇒ B is true, a Craig interpolant [Cra57] I is a formula over variables common to A and B that "lies in between" A and B in the sense that A ⇒ I and I ⇒ B. In the automatic hardware and software verification communities, Craig interpolation has found widespread use in model checking algorithms, both as a means of extracting reasons for non-concretizability of a counterexample obtained on an abstraction as well as for obtaining a symbolic description of reachable state sets.In McMillan's approach [McM03,McM05], interpolants are used to symbolically describe an overapproximation of the step-bounded reachable state set.If the sequence of interpolants thus obtained stabilizes eventually, i.e. no additional state is found to be reachable, then the corresponding state-set predicate R has all reachable system states as its models.The safety property that states satisfying B, where B is a predicate, are never reachable is then verified by checking R ∧ B for unsatisfiability.Given McMillan's verification approach to reachability analysis of non-probabilistic systems based on Craig interpolation for SAT, it is natural to ask whether a corresponding probabilistic counterpart can be developed, i.e. a verification approach to probabilistic reachability analysis of probabilistic systems based on Craig interpolation for stochastic SAT.Such an approach would complement the aforementioned falsification procedure for probabilistic systems based on SSAT/SSMT.In this article, we suggest a solution to the issue above.
In addition to probabilistic state reachability, we address the problem of probabilistic region stability.The latter problem is motivated by the notion of region stability for nonprobabilistic hybrid systems [PW07a,PW07b], where a system is called stable with respect to some region R iff all system runs eventually reach R and finally stay in R forever.In this article, we suggest an adaptation of region stability to the probabilistic case along with a symbolic, interpolation-based procedure for the verification of probabilistic stability properties like "the probability that the system stabilizes within region R is at least 99.9%".
Structure of the article.After a formal introduction to SSAT in Section 1, Section 2 is devoted to a generalization of the notion of Craig interpolants suitable for SSAT.Thereafter, Section 3 elaborates on an algorithm for computing such generalized Craig interpolants, which relies on a resolution calculus for SSAT.The application of generalized Craig interpolation to the symbolic analysis of probabilistic systems, namely to probabilistic state reachability as well as to probabilistic region stability, is then addressed in Section 4, where applicability of these novel techniques is illustrated on small examples.Section 5 finally concludes the article.

Stochastic Boolean satisfiability
A stochastic Boolean satisfiability (SSAT) formula is of the form Φ = Q : ϕ with a prefix Q = Q 1 x 1 . . .Q n x n of quantified propositional variables x i , where Q i is either an existential quantifier ∃ or a randomized quantifier R p i with a rational constant 0 < p i < 1, and a propositional formula ϕ such that Var (ϕ) ⊆ {x 1 , . . ., x n }, where Var (ϕ) denotes the set of all (necessarily free) variables occurring in ϕ.Note that SSAT formula Φ thus has no free variables.Without loss of generality, we assume that ϕ is in conjunctive normal form (CNF), i.e. a conjunction of disjunctions of propositional literals.A literal ℓ is a propositional variable, i.e. ℓ = x i , or its negation, i.e. ℓ = ¬x i .A clause is a disjunction of literals.Throughout the article and without loss of generality, we require that a clause does not contain the same literal more than once as ℓ ∨ ℓ ≡ ℓ.Consequently, we may also identify a clause with its set of literals.The semantics of Φ, as illustrated in Figure 1, is defined by the maximum probability of satisfaction P r(Φ) as follows.

P r(ε
Note that the semantics is well-defined as Φ has no free variables such that all variables have been substituted by the constants true and false when reaching the quantifier-free base case.

Generalized Craig interpolants
Craig interpolation [Cra57] is a well-studied notion in formal logics which has several applications in Computer Science, among them model checking [McM03,McM05].Given two formulae ϕ and ψ such that ϕ ⇒ ψ is valid, a Craig interpolant for (ϕ, ψ) is a formula I which refers only to common variables of ϕ and ψ, and I is "intermediate" in the sense that ϕ ⇒ I and I ⇒ ψ.Such interpolants do trivially exist in all logics permitting quantifier elimination, for instance, in propositional logic.The observation that ϕ ⇒ ψ holds iff ϕ ∧ ¬ψ is unsatisfiable gives rise to an equivalent definition which we refer to in the rest of the article: 1 given an unsatisfiable formula ϕ ∧ ¬ψ, a formula I is a Craig interpolant for (ϕ, ψ) iff both ϕ ∧ ¬I and I ∧ ¬ψ are unsatisfiable and I mentions only common variables.
In this section, we investigate the issue of Craig interpolation for stochastic SAT.We propose a generalization of Craig interpolants suitable for SSAT and show the general existence of such interpolants.In Section 3, we then devote our attention to an automatic method for computing generalized Craig interpolants based on a resolution calculus for SSAT.
When approaching a reasonable definition of interpolants for SSAT, the semantics of the non-classical quantifier prefix poses problems: Let Φ = Q : (A∧B) be an SSAT formula.Each variable in A ∧ B is bound by Q, which provides the probabilistic interpretation of the variables that is lacking without the quantifier prefix.This issue can be addressed by considering the quantifier prefix Q as the global setting that serves to interpret the quantifier-free part, and consequently interpreting the interpolant also within the scope of Q, thus reasoning about Q : (A∧¬I) and Q : (I ∧B).A more fundamental problem is that a classical Craig interpolant for Φ only exists if P r(Φ) = 0, since A ∧ B has to be unsatisfiable by definition of a Craig interpolant which applies iff P r(Q : (A ∧ B)) = 0.The precondition that P r(Q : (A ∧ B)) = 0 would be far too restrictive for application of interpolation, as the notion of unsatisfiability of A ∧ B is naturally generalized to satisfiability with insufficient probability, i.e.P r(Q : (A ∧ B)) being "sufficiently small", in the stochastic setting.Such 1 This is of technical nature as SSAT formulae are interpreted by the maximum probability of satisfaction.
As the maximum probability that an implication ϕ ⇒ ψ holds is inappropriate for our purpose, we reason about the maximum satisfaction probability p of the negated implication, i.e. of ϕ ∧ ¬ψ, instead.The latter relates to the minimum probability 1 − p that ϕ ⇒ ψ holds, which is the desired notion.
relaxed requirements actually appear in practice, for instance, in probabilistic verification where safety properties like "a fatal system error is never reachable" are frequently replaced by probabilistic ones like "a fatal system error is reachable only with (sufficiently small) probability of at most 0.1 ".Motivated by above facts, interpolants for SSAT should also exist when A ∧ B is satisfiable with reasonably low probability.
The resulting notion of interpolation, which is to be made precise in Definition 2.1, matches the following intuition.In classical Craig interpolation, when performed in logics permitting quantifier elimination, the Craig interpolants of (A, ¬B) form a lattice with implication as its ordering, A ∃ = ∃a 1 , . . .a α : A as its bottom element and B ∀ = ¬∃b 1 , . . .b β : B as its top element, where the a i and b i are the local variables of A and of B, respectively.
In the generalized setting required for SSAT2 , A ⇒ ¬B and thus A ∃ ⇒ B ∀ may no longer hold such that the above lattice can collapse to the empty set.To preserve the overall structure, it is however natural to use the lattice of propositional formulae "in between" A ∃ ∧ B ∀ as bottom element and A ∃ ∨ B ∀ as top element instead.This lattice is non-empty and coincides with the classical one whenever A ∧ B is unsatisfiable.
Definition 2.1 (Generalized Craig interpolant).Let A, B be propositional formulae and and Given any two propositional formulae A and B, the four quantifier-free propositional formulae equivalent to A ∃ ∧ B ∀ , to A ∃ , to B ∀ , and to A ∃ ∨ B ∀ , are generalized Craig interpolants for (A, B).These generalized interpolants always exist since propositional logic has quantifier elimination.While Definition 2.1 motivates the generalized notion of Craig interpolant from a modeltheoretic perspective, we state an equivalent definition of generalized Craig interpolants in Lemma 2.2 that substantiates the intuition of generalized interpolants and allows for an illustration of their geometric shape.Given two formulae A and B, the idea of generalized Craig interpolant is depicted in Figure 2. The set of solutions of A is defined by the rectangle on the V A , V A,B -plane with a cylindrical extension in V B -direction as A does not contain variables in V B .Similarly, the solution set of B is given by the triangle on the V B , V A,Bplane and its cylinder in V A -direction.The solution set of A ∧ B is then determined by the intersection of both cylinders.Since A ∧ B ∧ ¬(A ∧ B) is unsatisfiable, the sets A ∧ ¬(A ∧ B) and B ∧ ¬(A ∧ B) are disjoint.This gives us the possibility to talk about interpolants wrt.these sets.However, a formula I over only common variables in V A,B may not exist when demanding A ∧ ¬(A ∧ B) ∧ ¬I and I ∧ B ∧ ¬(A ∧ B) to be unsatisfiable.This is indicated by Lemma 2.2 (Generalized Craig interpolant for SSAT).Let Φ = Q : (A ∧ B) be some SSAT formula, V A , V B , V A,B be defined as in Definition 2.1, and S A,B be a propositional formula with Var (S A,B ) ⊆ V A,B such that S A,B ≡ ∃a 1 , . . ., a α , b 1 , . . ., b β : (A ∧ B).Then, a propositional formula I is a generalized Craig interpolant for (A, B) iff the following properties are satisfied. (1) Proof.As Var (I) ⊆ V A,B holds for generalized Craig interpolants I, it remains to show that (A ∃ ∧ B ∀ ) ⇒ I and I ⇒ (A ∃ ∨ B ∀ ) iff P r(Q : (A ∧ ¬S A,B ∧ ¬I)) = 0 and P r(Q : We remark that the concept of generalized Craig interpolants is a generalization of Craig interpolants in the sense that whenever A∧B is unsatisfiable, i.e. when P r(Q : (A∧B)) = 0, then each generalized Craig interpolant I for (A, B) actually is a Craig interpolant for A and B since S A,B ≡ false.

Computation of generalized Craig interpolants
In this section, we proceed to the efficient computation of generalized Craig interpolants.The remark following Definition 2.1 shows that generalized interpolants can in principle be computed by explicit quantifier elimination methods, like Shannon's expansion or binary decision diagrams (BDDs).We aim at a more efficient method based on SSAT resolution [TF10] akin to resolution-based Craig interpolation for propositional SAT by Pudlák [Pud97].The latter approach has been integrated into DPLL-based SAT solvers featuring conflict analysis and successfully applied to symbolic model checking [McM03,McM05].To this end, we first recall the sound and complete resolution calculus for SSAT from [TF10] in Section 3.1.Thereafter, SSAT resolution is enhanced in order to compute generalized Craig interpolants in Section 3.2.
3.1.Resolution for SSAT.As basis of the SSAT interpolation procedure introduced in Section 3.2, we recall the sound and complete resolution calculus for SSAT from [TF10], subsequently called S-resolution.In contrast to SSAT algorithms implementing a DPLLbased backtracking procedure, thereby explicitly traversing the tree given by the quantifier prefix and recursively computing the individual satisfaction probabilities for each subtree by the scheme illustrated in Figure 1, S-resolution follows the idea of resolution for propositional and first-order formulae [Rob65] and for QBF formulae [BKF95] by deriving new clauses c p annotated with probabilities 0 ≤ p ≤ 1. S-resolution differs from non-stochastic resolution, as such derived clauses c p need not be implications of the given formula, but are just entailed with some probability.Informally speaking, the derivation of a clause c p means that under SSAT formula Q : ϕ, the clause c is violated with a maximum probability at most p, i.e. the satisfaction probability of Q : (ϕ ∧ ¬c) is at most p.More intuitively, the minimum probability that clause c is implied by ϕ is at least 1 − p.3 Once an annotated empty clause ∅ p is derived, it follows that the probability of the given SSAT formula is at most p, i.e.P r(Q In what follows, let Q : ϕ be an SSAT formula with ϕ in CNF.Without loss of generality, ϕ contains only non-tautological clauses4 , i.e. ∀c ∈ ϕ : |= c.Let Q = Q 1 x 1 . . .Q n x n be the quantifier prefix and ϕ be some propositional formula with Var (ϕ) ⊆ {x 1 , . . ., x n }.The quantifier prefix Q(ϕ) is defined to be shortest prefix of Q that contains all variables from ϕ, i.e.Q(ϕ) = Q 1 x 1 . . .Q i x i where x i ∈ Var (ϕ) and for each j > i : x j / ∈ Var (ϕ).Let further be Var (ϕ) ↓ k := {x 1 , . . ., x k } for each integer 0 ≤ k ≤ n.For a non-tautological clause c, i.e. if |= c, we define the unique assignment ff c that falsifies c as the mapping Starting with clauses in ϕ, S-resolution is given by the consecutive application of rules R.1 to R.3 to derive new clauses c p with 0 ≤ p ≤ 1. Rule R.1 derives a clause c 0 from an original clause c in ϕ.Referring to the definition of P r(Q : ϕ) in Section 1, R.1 corresponds to the quantifier-free base case where ϕ is equivalent to false under any assignment that falsifies c. c ∈ ϕ c 0 (R.1) Similarly, R.2 reflects the quantifier-free base case in which ϕ is equivalent to true under any assignment τ ′ that is conform to the partial assignment The constructed clause c 1 then encodes the opposite of this satisfying (partial) assignment τ .We remark that finding such a τ in the premise of R.2 is NP-hard (equivalent to finding a solution of a propositional formula in CNF).This strong condition on τ is not essential for soundness and completeness and could be removed5 but, as mentioned above, facilitates a less technical presentation of generalized interpolation in Section 3.2.Another argument justifying the strong premise of R.2 is a potential integration of S-resolution into DPLLbased SSAT solvers since whenever a satisfying (partial) assignment τ of ϕ is found by an SSAT solver then τ meets the requirements of R.2.
Rule R.3 finally constitutes the actual resolution rule as known from the non-stochastic case.Depending on whether an existential or a randomized variable is resolved upon, the probability value of the resolvent clause is computed according to the semantics P r(Q : ϕ) defined in Section 1.
The derivation of a clause c p by R.1 from c, by R.2, and by R. and by (c p 1 1 , c p 2 2 ) ⊢ R.3 c p , respectively.Given rules R.1 to R.3, S-resolution is sound and complete in the following sense.
Lemma 3.1.Let clause c p be derivable by S-resolution and let Proof.We show the lemma by induction over the application of rules R.1, R.2, and R.3.The base case is given by rules R.1 and R.2.
is unsatisfiable for R.1 and tautological for R.2 which immediately establishes the result for the base case.Now assume that the assumption holds for all clauses in the premises of R.3, i.e.
P r where x j = x with j ≥ i+1.By definition of P r, for each τ with τ The result is obvious for j = i + 1.For j > i + 1, note that variables x i+1 , . . ., x j−1 do not occur in the derived clause (c 1 ∨ c 2 ).Hence, for k = j − 1 down to i + 1 we successively conclude that From case k = i + 1 the lemma follows.
Corollary 3.2 (Soundness of S-resolution).If the empty clause ∅ p is derivable by Sresolution from a given SSAT formula Q : Corollary 3.2 follows directly from Lemma 3.1, namely for the special case c p = ∅ p .Theorem 3.3 shows completeness of S-resolution.
In the induction step, we show that ∅ p is derivable for P r(Qx The above presentation of S-resolution differs slightly from [TF10] in order to avoid overhead in interpolant generation incurred when employing the original definition, like the necessity of enforcing particular resolution sequences.For readers familiar with [TF10], the particular modifications are: 1) derived clauses c p may also carry value p = 1, 2) former rules R.2 and R.5 are joined into the new rule R.2, and 3) former rules R.3 and R.4 are collapsed into rule R.3.These modifications do not affect soundness and completeness of S-resolution, confer Corollary 3.2 and Theorem 3.3.The advantage of the modification is that derivable clauses c p are forced to have a tight bound p in the sense that under each assignment which falsifies c, the satisfaction probability of the remaining subproblem exactly is p, confer Lemma 3.1.This fact confirms the conjecture from [TF10, page 14] about the existence of such clauses (c ∨ ℓ) p and allows for a generalized clause learning scheme to be integrated into DPLL-SSAT solvers: the idea is that under a partial assignment falsifying c, one may directly propagate literal ℓ as the satisfaction probability of the other branch, for which the negation of ℓ holds, is known to be p already.

Example of S-resolution. Consider the SSAT formula
3.2.Interpolating resolution for SSAT.We now devote our attention to the computation of generalized Craig interpolants for SSAT by means of an enhanced version of S-resolution, which is akin to resolution-based Craig interpolation for propositional SAT by Pudlák [Pud97].We remark that on SSAT formulae Q : (A ∧ B), Pudlák's algorithm, which has unsatisfiability of A ∧ B as precondition, will not work in general.When instead considering the unsatisfiable formula A ∧ B ∧ ¬S A,B with ¬S A,B in CNF then Pudlák's method would be applicable and would actually produce a generalized Craig interpolant.The main drawback of this approach however is the explicit construction of ¬S A,B , calling for explicit quantifier elimination.
In the following, we propose an algorithm based on S-resolution for computing generalized Craig interpolants which operates directly on A ∧ B without adding ¬S A,B , and thus does not comprise any preprocessing involving quantifier elimination.For this purpose, the rules of S-resolution are enhanced to deal with pairs (c p , I) of annotated clauses c p and propositional formulae I.Such formulae I are in a certain sense intermediate generalized interpolants, i.e. generalized interpolants for subformulae arising from instantiating some variables by partial assignments that falsify c, confer Lemma 3.4.Once a pair (∅ p , I) comprising the empty clause is derived, I thus is a generalized Craig interpolant for the given SSAT formula.This augmented S-resolution, which we call interpolating S-resolution, is defined by rules RI.1, RI.2, and RI.3.The construction of intermediate interpolants I in RI.1 and RI.3 coincides with the classical rules by Pudlák [Pud97], while RI.2 misses a corresponding counterpart.The rationale is that RI.2 (or rather R.2) refers to satisfying valuations τ of A ∧ B, which do not exist in classical interpolation.As A ∧ B becomes a tautology after substituting the partial assignment τ from R.2 into it, its quantified variant S A,B = ∃a 1 , . . ., b 1 , . . .: A ∧ B also becomes tautological under the same substitution This implies that the actual intermediate interpolant in RI.2 can be chosen arbitrarily over variables in V A,B .This freedom will allow us to control the geometric extent of generalized interpolants within the "don't care"-region provided by the models of S A,B , confer Corollary 3.6.
The following lemma establishes the theoretical foundation of computing generalized Craig interpolants by interpreting the derived pairs (c p , I).
x n be some SSAT formula, and the pair (c p , I) be derivable from Φ by interpolating S-resolution, where Then, for each τ : Proof.We prove the lemma by induction over application of the interpolating S-resolution rules RI.1, RI.2, and RI.3.In the base case, we can just apply RI.1 and RI.
and by construction of τ , For rule RI.2, we have Consequently, for any propositional formula This proves items 2 and 3 for the base case.
In the induction step, we now assume that the lemma holds for all clauses in the premises of rule RI.3.Then, by construction of I, item 1 clearly holds for I, i.e.Var (I) ⊆ V A,B .Induction hypothesis assumes that holds for ((c 1 ∨ ¬x j ) p 1 , I 1 ) and for each τ 1 : We therefore distinguish the three cases x j ∈ V A , x j ∈ V B , and First, let be x j ∈ V A .Then, I = I 1 ∨ I 2 .By induction hypothesis and by construction of I, Due to construction of τ , it holds in particular that 0 = P r A,x . Analogously, and thus 0 = P r A,¬x .
which implies P r B,x = P r B,¬x .We conclude from induction hypothesis that due to construction of τ .Note that if P r(Q : ϕ 1 ) = 0 and P r(Q : ϕ 2 ) = 0 then P r(Q : (ϕ 1 ∨ ϕ 2 )) = 0 since P r(Q : ϕ) = 0 if and only if ϕ is unsatisfiable. 6As a consequence, Second, let be x j ∈ V B .Then, I = I 1 ∧ I 2 .As x j / ∈ Var (A) ∪ Var (¬S A,B ) ∪ Var (¬I), with the same argument as above, Again following the reasoning above, we have and thus 0 = P r B,x as well as and thus 0 = P r B,¬x .Third, let be x j ∈ V A,B .Then, I = (¬x j ∨ I 1 ) ∧ (x j ∨ I 2 ), and we deduce and, in particular, 0 = P r A,x . Analogously, and, in particular, 0 = P r A,¬x . Furthermore, and, in particular, 0 = P r B,x . Finally, and, in particular, 0 = P r B,¬x .
Having shown that P r A,x = P r A,¬x = P r B,x = P r B,¬x = 0, we can now prove the intermediate result above, i.e.P r A = P r B = 0.If Q j = ∃ then P r A = max(P r A,x , P r A,¬x ) = 0 and P r B = max(P r B,x , P r B,¬x ) = 0, and if To finish the proof, we finally need to show that items 2 and 3, i.e.

P r(
If j = i+1 then the result is obvious.Otherwise, i.e. if j > i+1, the variables x i+1 , . . ., x j−1 do not occur in the derived clause (c By definition of assignment τ , for k = j − 1 down to i + 1 we may therefore successively conclude that From case k = i + 1 the result immediately follows. Completeness of S-resolution, as stated in Theorem 3.3, together with above Lemma 3.4, applied to the derived pair (∅ p , I), yields Corollary 3.5 (Generalized Craig interpolants computation).If Q : (A ∧ B) is an SSAT formula then a generalized Craig interpolant for (A, B) can be computed by interpolating S-resolution.
Note that computation of generalized interpolants does not depend on the actual truth state of A ∧ B. The next observation facilitates to effectively control the geometric extent of generalized Craig interpolants within the "don't care"-region S A,B .This result will be useful within applications of generalized Craig interpolation to the symbolic analysis of probabilistic systems being investigated in Section 4. Proof.The proof works analogously to the one of Lemma 3.4.For the base case, it is clear that the desired property for RI.
Then, we can modify the induction hypothesis: for case "I = true in RI.2", we assume that P r(Q The induction step then follows the same reasoning as in the remaining proof of Lemma 3.4.

Applications of generalized Craig interpolation to analysis of probabilistic systems
In this section, we investigate the application of generalized Craig interpolation to the symbolic analysis of probabilistic systems.We direct our attention to two analysis goals, namely to probabilistic state reachability in Section 4.1 as well as to probabilistic region stability in Section 4.2.As a system model, we consider finite-state Markov decision processes (MDPs) [Bel57].An MDP M = (ı, S, Act , ps(•, •, •)) is a finite-state system in which state changes are subject to non-deterministic selection among available actions followed by a probabilistic choice among potential successor states, while the probability distribution of the latter choice depends on the selected action.More precisely, S is a finite set of states, ı ∈ S is the initial state, Act is a finite set of actions, and ps(s, a, s ′ ) gives the probability that M performs a transition step from s ∈ S to s ′ ∈ S under action a ∈ Act.
For an example, consider the simple MDP M from Figure 4 where ı = i, S = {i, f, e, s}, and Act = {a, b}.A transition ps(z, act , z ′ ) = p > 0 is indicated by an arrow from z to z ′ accompanied by action act and by the corresponding transition probability p.If two states are not connected by an arrow then the corresponding transition probability is 0, and if no action is specified then that transition is feasible for all actions.A probability measure of an MDP is well-defined only if considering a particular scheduler σ resolving the non-determinism.That is, σ schedules the action for the current state.Different such schedulers σ have been investigated in the literature, confer, for instance, [BHKH05]: σ may select the next action either in a deterministic or randomized fashion.In both cases, σ may have access to and thus base its selection on either the current state only or the full system history.In our scenarios, we do not manipulate schedulers explicitly, but define the probability measures obtained by worst-case deterministic schedulers achieving maximum or minimum, depending on how the worst case is understood, probability of reaching target states directly as the limit of a recursive function over N.For each k ∈ N, the recursive function determines the maximum or minimum probability of reaching target states within k steps, as achieved by a worst-case history-dependent scheduler.As a worst-case historydependent scheduler will always maximize or minimize the probability of reaching the target within the remaining number of steps, its performance coincides with the probabilities computed by a backward induction resolving non-deterministic choices by taking the maximum or minimum, respectively, of the probability values obtained from the next-lower recursion depth.
All experiments mentioned in this section were performed on a 1.83 GHz Intel Core 2 Duo machine with 1 GByte physical memory running Linux.
4.1.Interpolation-based probabilistic state reachability.Let be given an MDP M and a set of target states Target ⊆ S in M. With regard to probabilistic state reachability, the goal is to compute the probability of reaching the target states Target from the initial state ı under some explicitly or implicitly (e.g., by an optimality condition) given scheduler σ.In most applications, the target states are considered to be bad, for instance, to be fatal system errors, such that one is faced with computing the worst-case probability of reaching the bad states, i.e. maximizing the reachability probability under each possible scheduler.This maximum probability MaxReach(M, Target ) can be defined directly as the limit of the maximum step-bounded probability of reaching the target states as similarly shown by [FHH + 11, Lemma 1], i.e.

MaxReach(M, Target
gives the maximum probability of reaching the target states from state s ∈ S within k steps (k ∈ N) under each possible scheduler.For some threshold value θ ∈ [0, 1], the safety verification problem is to decide whether the worst-case probability of reaching the bad states is at most θ, i.e. to decide whether MaxReach(M, Target ) ≤ θ (4.1) holds.
In previous work [FHT08, FTE10, TEF11], we have established a symbolic falsification procedure for above problem 4.1.Though this approach is based on SSMT, i.e. an arithmetic extension of SSAT, and works for the more general class of discrete-time probabilistic hybrid systems, which roughly are MDPs with arithmetic-logical transition guards and actions, the same procedure restricted to SSAT is applicable for finite-state MDPs.The key idea here is to adapt bounded model checking (BMC) [BCCZ99] to the probabilistic case by encoding step-bounded reachability as an SSAT problem: like in classical BMC, the initial states, the transition relation, and the target states of an MDP M are symbolically encoded by propositional formulae in CNF, namely by Init(s), Trans(s, nt, pt, s ′ ), and Target (s), respectively, where the propositional variable vector s represents the system state before and s ′ after a transition step.To keep track of the non-deterministic and probabilistic selections of transitions in Trans(s, nt, pt, s ′ ), we further introduce propositional variables nt and pt to encode non-deterministic selection among available actions and to describe probabilistic choice of the successor state, respectively.Assignments to these variables determine which of possibly multiple available transitions departing from s is taken.In contrast to traditional BMC, all variables are quantified: all state variables s and s ′ are existentially quantified in the prefixes Q s and Q s ′ .The transition-selection variables nt encoding non-deterministic choice are existentially quantified by Q nt , while the probabilistic selector variables pt are bound by randomized quantifiers in Q pt . 7For the sake of clarity, let be t := nt ∪ pt and According to [FHT08, Proposition 1], the maximum probability of reaching the target states in M from the initial state ı within k transition steps, i.e.MaxReach k M,Target (ı), is equal to the satisfaction probability Observe that each value lb k = MaxReach k M,Target (ı) can be computed by an SSAT solver and constitutes a lower bound of the maximum reachability probability MaxReach(M, Target ) due to monotonicity of the chain MaxReach k M,Target (ı) k∈N .This symbolic approach, called probabilistic bounded model checking (PBMC), is able to falsify safety properties of shape 4.1 once a value lb k > θ is computed for some k.
However, the development of a corresponding counterpart based on SSAT that is able to compute upper bounds ub k of the maximum reachability probability MaxReach(M, Target ) was left as an open challenge.Such an approach would permit to verify safety properties of shape 4.1 once a value ub k ≤ θ is computed for some k.
In the remainder of this section, we propose such a symbolic verification procedure for above problem 4.1 by means of generalized Craig interpolation.This verification method proceeds in two phases.Phase 1 computes a symbolic representation of an overapproximation of the backward reachable state set, where a state is backward reachable if it is the origin of a transition sequence leading into Target.Phase 1 can be integrated into PBMC, as used to falsify the probabilistic safety property.Whenever such falsification fails for a given step depth k, we apply generalized Craig interpolation to the (just failed) PBMC proof to compute a symbolic overapproximation of the backward reachable state set at depth k and then proceed to PBMC at some higher depth k ′ > k.As an alternative to the integration into PBMC, interpolants describing the backward reachable state sets can be successively extended by "stepping" them by prepending another transition, as explained below.In either case, phase 1 ends when the backward reachable state set becomes stable, in which case we 7 Non-deterministic branching of n alternatives can be represented by a binary tree of depth ⌈log 2 n⌉ and probabilistic branching by a sequence of at most n − 1 binary branches, yielding ⌈log 2 n⌉ existential and n − 1 randomized quantifiers, respectively.
have computed a symbolic overapproximation of the whole backward reachable state set.In phase 2, we construct an SSAT formula with parameter k that forces the system to stay within the backward reachable state set for k steps.The maximum satisfaction probability of that SSAT formula then gives an upper bound on the maximum probability of reaching the target states.The rationale is that system runs leaving the backward reachable state set will never reach the target states.
Phase 1.Given an SSAT encoding of an MDP M as above, the state-set predicate B k (s) for k ∈ N over state variables s is inductively defined as • B 0 (s) := Target (s), and Observe that each generalized Craig interpolant I k+1 (s) can be computed by interpolating S-resolution if we rewrite B k (s) into CNF, the latter being always possible in linear time by adding auxiliary V A -variables.During computation of each I k+1 (s), we take I = true in every application of rule RI.2 such that B k (s) overapproximates all system states backward reachable from target states within k steps due to Corollary 3.6.Whenever B k (s) has stabilized, i.e.B k+1 (s) ⇒ B k (s) , we can be sure that B(s) := B k (s) overapproximates all backward reachable states.It is obvious that B k (s) finally stabilizes in the finite-state case.
Note that parameter j ≥ 1 can be chosen arbitrarily, i.e. the system may execute any number of transitions until state s j−1 is reached since this does not destroy the "backwardoverapproximating" property of B k+1 (s).The rationale of having parameter j is the additional freedom in constructing generalized interpolants since j may influence the shape of I k+1 (s), as we will see in the example below.
We remark that phase 1 is a clean generalization of McMillan's approach [McM03,McM05], the latter having unsatisfiability of A ∧ B as precondition in each iteration k. 8 Phase 2. Having symbolically described all backward reachable states by the predicate B(s), upper bounds ub k of the maximum probability MaxReach(M, Target ) of reaching the target states Target can now be computed by SSAT solving applied to First observe that the formula above excludes all system runs that leave the set of backward reachable states.This is sound since leaving B(s) means to never reach the Target (s) states.Second, the system behavior becomes more and more constrained for increasing k, i.e. the ub k 's are monotonically decreasing.With regard to solving problem 4.1, the safety property MaxReach(M, Target ) ≤ θ is verified by the procedure above once an upper bound ub k ≤ θ is computed for some k.
Example.To illustrate the symbolic approach to probabilistic safety verification based on generalized Craig interpolation, consider the simple MDP M from Figure 4 with s being the only target state.
With regard to the symbolic encoding of M, we introduce four Boolean variables i, f, e, s to describe the state space.The literal i means that M is in state i while literal ¬i expresses that M is not in i.The same holds analogously for the other states.Note that, in order to encode valid system states, we have to ensure that exactly one of the variables i, f, e, s is true in each time instant.The encoding of this constraint will be explained later on.The non-deterministic choice between actions a and b is encoded by a Boolean variable act while action a is represented by the positive literal act and action b by the negative literal ¬act.For the three probabilistic choices in M, we introduce three Boolean variables pi for the choice from i, pea for the choice from e under action a, and peb for the choice from e under action b.Recall that all state variables as well as variables encoding nondeterministic selection are existentially quantified while variables describing probabilistic choices are bound by randomized quantifiers.We thus obtain the corresponding quantifier prefixes The formulae in CNF representing the initial state and the target states are specified by Init(s) = (i) ∧ (¬f ) ∧ (¬e) ∧ (¬s) and Target (s) = (s) , respectively.To obtain the transition relation predicate, we encode each single transition step.For instance, a step from state e to f under action a can be encoded by the implication (e ∧ act ∧ ¬pea) ⇒ f ′ , the latter being equivalent to the clause (¬e ∨ ¬act ∨ pea ∨ f ′ ).The conjunction of all these clauses then encodes the full system behavior symbolically.Since we represent each system state by an own Boolean variable, as mentioned above, we need to enforce that exactly one of the primed state variables, constituting the system state after the transition step, carries value true.This is simply achieved by the formula in CNF exactly one(i  4 for different values of parameter j.In addition to the formal presentation of the predicates, the concrete state sets are given explicitly.
We are now interested in the maximum probability of reaching the target state s from the initial state i.Applying the PBMC scheme 4.2, we are only able to compute lower bounds lb k of the maximum reachability probability, for instance, lb 0 = lb 1 = 0, lb 2 = lb 3 = 0.54, lb 4 = lb 5 = 0.693, . .., lb 20 = 0.817971, . .., lb 100 = 0.81818181818181803208.The latter results were achieved by employing the SSMT solver SiSAT 9 [TEF11] that provides a convenient input language for specifying probabilistic transition systems like MDPs.Unwinding of the system's transition relation for increasing step bounds k, i.e. the construction of the SSAT formulae specified by scheme 4.2 in our context, is done fully automatically.Furthermore, several algorithmic optimizations are exploited to improve performance of the tool.Concerning runtime, all 100 SSAT formulae were solved within 37.05 seconds, while computation of the first 20 lower bounds lb 0 to lb 20 just needed 370 milliseconds.The highest computation time for a single SSAT problem was obtained for lb 100 , namely 1.14 seconds.The evolution of the lb k 's up to k = 20 is presented graphically on the right of Figure 5.Given these results, one can suppose that the lower bounds converge to and never exceed value 9 /11 = 0.81.However, there is no mathematical guarantee for the latter guess.
To overcome this limitation, we first apply the generalized interpolation scheme 4.3 to compute an overapproximation of the backward reachable state set.The latter then facilitates to compute upper bounds ub k of the maximum reachability probability by means of scheme 4.4.In order to compute the generalized Craig interpolants I k+1 (s j−1 ) automatically during solving the SSAT formulae 4.3, we have implemented a simple DPLL-based SSAT solver that integrates interpolating S-resolution.As mentioned earlier, scheme 4.3 allows freedom in choosing parameter j ≥ 1.This parameter permits to specify the number j − 1 of transition steps until system state s j−1 is reached, which is the common state of formula parts A and B. The experimental results of applying the generalized interpolation scheme 4.3 on the MDP M for different values of j are shown in Table 1. 9 The SiSAT tool is available on http://sisat.gforge.avacs.org/.From the results of Table 1, we observe that the value of j actually has an impact on the shape of the resulting interpolants.Let us consider the first interpolants I 1 which overapproximate all states backward reachable in one step.Clearly, the exact set of states backward reachable in one step is {e, s}.For j = 1, the overapproximated set {f, e, s} computed by the procedure is too coarse and actually contains a state which is not backward reachable at all, namely f .Though the set {i, e, s} for j = 2 actually consists of backward reachable states only, it is not tight enough as the initial state i is backward reachable after two steps only.For j = 3, we achieved the precise set {e, s}.Continuing the scheme for j = 1, I 2 and then I 3 become true meaning that the overapproximated set of the backward reachable states B covers the whole state space.Using this inconclusive result in scheme 4.4 yields only trivial upper bounds ub k = 1 for all k.With regard to j = 2, the interpolation process has stabilized after computation of I 2 .The resulting state set {i, e, s} encoded by B actually is the precise set of all backward reachable states.Though I 1 was too coarse, this could be compensated in the computation of I 2 .For j = 3, we observe that all generalized interpolants I 1 , I 2 , and I 3 describe the corresponding backward reachable states accurately, thus leading to the precise set of all backward reachable states.The computed state sets for j = 3 are illustrated on the left of Figure 5.After having examined the results above, it seems that the greater the value of j, i.e. the more transition steps are performed, the more accurate the resulting overapproximation of the backward reachable state set.
Concerning runtime, each generalized Craig interpolant was computed by the interpolating DPLL-based SSAT solver within fractions of a second, where the highest runtime of 36 milliseconds was observed when computing I 3 for j = 3.
Having computed a symbolic representation B(s) of an overapproximation of all backward reachable states, we are now able to compute upper bounds ub k of the maximum reachability probability by means of scheme 4.4, where we use B(s) = ¬f ∨ s as obtained for j = 3 as well as for j = 2. Again employing the SSMT tool SiSAT, some of the results are ub 0 = 1, ub 1 = ub 2 = 0.9, ub 3 = ub 4 = 0.855, ub 5 = ub 6 = 0.83475, . .., ub 20 = 0.818243, . .., ub 100 = 0.81818181818181821948.Concerning runtime, all 100 SSAT formulae were solved within 54.76 seconds, while computation of the first 20 upper bounds ub 0 to ub 20 just needed 400 milliseconds.The highest computation time for a single SSAT problem was obtained for ub 100 , namely 1.77 seconds.The evolution of the ub k 's up to k = 20 is presented graphically on the right of Figure 5.
In addition to estimating the maximum reachability probability from below using the PBMC scheme 4.2, we are now able to estimate the probability also from above.In our example, we can safely conclude that 0.81818181818181803208 = lb 100 ≤ MaxReach(M, {s}) ≤ ub 100 = 0.81818181818181821948 holds where the difference ub 100 − lb 100 is below 10 −15 .The total computational effort for obtaining this precise result is about 92 seconds.If reduced accuracy suffices then runtime obviously improves.For instance, the fact 0.817971 = lb 20 ≤ MaxReach(M, {s}) ≤ ub 20 = 0.818243 with ub 20 − lb 20 < 10 −3 was deduced within one second.With regard to the safety verification problem 4.1, system safety for each threshold value θ with θ < 0.817971 or θ ≥ 0.818243 is falsified or verified, respectively, within a second.
With respect to competitive and more established methods based on value or policy iteration, we observed that the runtime of our prototypic tool chain does not compare favorably on the simple probabilistic reachability problem above.For instance, the version 4.0.1 of the PRISM model checker 10 [KNP11] solved the problem in about 600 milliseconds with a precision of 10 −15 (returning the result 0.8181818181818175).
In spite of the above fact, we have identified two promising directions for future research where probabilistic reachability analysis based on generalized Craig interpolation may pay off: (1) Embedding the same interpolation process into SSMT [FHT08], i.e. an arithmetic extension of SSAT, renders the generalized Craig interpolation scheme 4.3 directly applicable to probabilistic hybrid discrete-continuous systems, yielding a symbolic overapproximation of the backward reachable state set.As for the finite state case, scheme 4.4 then facilitates computing upper bounds of the reachability probability for hybrid systems by means of SSMT solving, just as already pursued when computing lower bounds according to the PBMC scheme 4.2 [FHT08, TF08, FTE10, TEF11].
It is important to remark that classical value or policy iteration procedures are not directly applicable in the hybrid state case but even after finite-state abstraction, confer, for instance, [ZSR + 10, FHH + 11].
(2) Due to its symbolic nature, the analysis procedures based on SSAT and SSMT support compact representations of concurrent probabilistic (finite-state and hybrid) systems without an explicit construction of the product automaton [TEF11], the latter being of size exponential in the number of parallel components.This fact constitutes a strong argument that these symbolic procedures are able to alleviate the state explosion problem, which arises necessarily when applying explicit-state algorithms or methods based on finite-state abstraction refinement.
4.2.Interpolation-based probabilistic region stability.In addition to probabilistic state reachability being investigated in the previous section, we now address the problem of probabilistic region stability.For that purpose, we take into account the notion of region stability as introduced for non-probabilistic hybrid systems by Podelski and Wagner in [PW07a,PW07b].According to their definition, given some set R of states called region, a (non-probabilistic) system is called stable with respect to region R iff for every infinite run s 0 , s 1 , . . ., s i , . . . of the system, i.e. for every infinite sequence of states that follows the transition relation, there is some point of time i ≥ 0 such that from i on the system remains in R forever, i.e. ∃i ≥ 0 ∀j ≥ i : s j ∈ R.
Concerning the probabilistic case, several adaptations of region stability seem feasible, some of which pose measurability problems.Our main concern in this article being to identify potential application areas for generalized Craig interpolation rather than to discuss semantic issues of probabilistic stabilization, we do study a simple notion of probabilistic region stability in the sequel which circumvents measure-theoretic issues.As for probabilistic state reachability, we aim at defining a reasonable probability measure as the limit of the value of a recursive function defining the corresponding step-bounded measures.Intuitively, we consider finite run prefixes s 0 , s 1 , . . ., s i such that from time point i on the probabilistic system remains in the given region forever under each possible future behavior, i.e. independent of the non-deterministic and probabilistic choices the system will take.The latter fact is guaranteed whenever the system has reached an invariance kernel of the given region that can never be left.The probability measure is then defined by the minimum probability of reaching the maximal invariance kernel.
Formally, let be given an MDP M and a set of states Region ⊆ S called the stabilization region or the region for short.An invariance kernel K ⊆ Region with respect to M is a set of states from Region such that there is no transition from a state in K to a state outside K, i.e. there does not exist a tuple (z, act , z ′ ) ∈ K × Act × (S \ K) : ps(z, act , z ′ ) > 0. An invariance kernel K is called maximal if adding any new states to K does not lead to an invariance kernel, i.e. each K ∪ Z with Z ⊆ Region \ K and Z = ∅ is not an invariance kernel.Note that the maximal invariance kernel is unique.The latter fact can be simply shown using the observation that the set of all invariance kernels K ⊆ Region with respect to M is closed under union.Let K * ⊆ Region be the (unique) maximal invariance kernel with respect to M.Then, the minimum probability MinStable(M, Region) that M is stable with respect to Region is defined as the limit of the minimum step-bounded probability of reaching the maximal invariance kernel K * , i.e.

MinStable(M, Region
gives the minimum probability of reaching K * from state s ∈ S within k steps (k ∈ N) under each possible scheduler.When considering stabilization within Region as the desired property then the value of MinStable(M, Region) establishes the probability of stabilizing in worst case, i.e. under an optimal adversarial scheduler.For some threshold value θ ∈ [0, 1], the stability verification problem then is to decide whether this worst-case probability is at least θ, i.e. to decide whether MinStable(M, Region) ≥ θ (4.5) holds.
In what follows, we propose a symbolic verification procedure for above problem 4.5.In a first phase, we compute a symbolic representation of an invariance kernel by means of generalized Craig interpolation.The main idea here is to iteratively eliminate states z not belonging to an invariance kernel from Region until a fixed point is reached.Due to the use of interpolation, the set of such states z is overapproximated in each iteration, meaning that potentially too many states are removed.This implies that the resulting invariance kernel is not necessarily maximal.However, each invariance kernel can be used for computing valid lower bounds of MinStable(M, Region).The latter computation then is performed in a second phase by means of SSAT-based bounded reachability checking.Once a lower bound lb ≥ θ is computed, property 4.5 is verified.
Phase 1.Let be given an SSAT encoding of an MDP M as explained in Section 4.1 as well as some propositional formula Region(s) encoding the stabilization region Region.Then, the state-set predicate R k (s) for k ∈ N over state variables s is inductively defined as • R 0 (s) := Region(s), and Trans(s j−1 , t j , s j ) ∧ ¬R k (s j ) . (4.6) Observe that each I k+1 (s) can be computed by interpolating S-resolution if we rewrite ¬R k (s) into CNF, the latter being always possible in linear time by adding auxiliary V Avariables.During computation of each I k+1 (s), we take I = true in every application of rule RI.2 such that I k+1 (s) overapproximates all system states directly leading to the state set ¬R k (s) due to Corollary 3.6.As a consequence, from each state in R k+1 (s) = R k (s) ∧ ¬I k+1 (s) it is infeasible to leave the set R k (s) in one step.Whenever the chain R k (s) has stabilized, i.e.R k (s) ⇒ R k+1 (s) , it follows that K(s) := R k (s) is an invariance kernel of Region(s) with respect to M, i.e. once entered, the system cannot leave the set K(s).Obviously, the chain R k (s) eventually stabilizes in the finite-state case.
Similar to scheme 4.3, parameter j ≥ 1 can be chosen arbitrarily, i.e. the system may execute any number of transitions until state s j−1 is reached since this does not destroy the overapproximation property of I k+1 (s).The presence of parameter j gives us additional freedom in constructing generalized interpolants as j may influence the shape of I k+1 (s), as we will see in the example below.Phase 2. Having computed a symbolic representation K(s) of a (not necessarily maximal) invariance kernel K with respect to M, we now compute lower bounds of the minimum probability MinStable(M, Region) of stabilizing within Region by means of SSAT solving.To this end, first observe that MinReach k M,K * (ı) is monotonic in k which implies that MinReach k M,K * (ı) ≤ MinStable(M, Region) for each k ∈ N. Let K * be the unique maximal invariance kernel with respect to M.Then, K ⊆ K * since K is an invariance kernel and the maximal invariance kernel K * is unique.As a consequence, ı) establishes a lower bound of MinStable(M, Region).In principle, MinReach k M,K (ı) can be reduced to an SSAT formula similar to PBMC scheme 4.2.The difference, however, is that we need to minimize the satisfaction probability.The latter can be achieved by a very similar SSAT encoding scheme that exploits universal quantifiers to resolve non-deterministic transition choices.Universal quantifiers then aim at minimizing the satisfaction probability.Though the SSMT solver SiSAT actually supports universal quantification, confer [TF09, TEF11], we instead stay within the scope of the logic exposed in this article and rephrase minimum probabilistic state reachability as a maximum probabilistic state avoidance problem as follows: ı) which can be proven by straightforward induction over step bound k.In the base cases, i.e. if k = 0 and s ∈ K or s / ∈ K, the statement is clear.Within the induction step, we exploit the property that min i j p i,j • P i,j = 1 − max i j p i,j • (1 − P i,j ) is true for 0 ≤ P i,j ≤ 1 and j p i,j = 1.
The problem of computing the value of MaxAvoid k M,K (ı) can be reduced to computing the maximum probability of satisfaction of the SSAT formula According to the definition of MaxAvoid k M,K (ı), the propositional formula of Φ k M,K describes all system runs avoiding the invariance kernel K for at least k transition steps.That is, all assignments encoding such latter runs yield satisfaction probability 1, while assignments encoding runs that visit K within the first k steps do not satisfy the propositional formula, thus leading to satisfaction probability 0. As a consequence, MaxAvoid k M,K (ı) = P r Φ k M,K .4 for different values of parameter j.In addition to the symbolic representations computed by interpolation, the concrete state sets represented by these predicates are stated explicitly.
Using above facts, we deduce the following relation This finally enables us to compute lower bounds lb k of MinStable(M, Region) using the scheme lb the latter being addressed by SSAT solving.Note that the system behavior encoded by Φ k M,K becomes more and more constrained for increasing k such that the satisfaction probabilities P r Φ k M,K are monotonically decreasing.This in turn means that the lb k 's are monotonically increasing.With regard to solving the stability verification problem 4.5, the desired property MinStable(M, Region) ≥ θ is verified by the procedure above once a lower bound lb k ≥ θ is computed for some k.
Example.To illustrate the symbolic approach to probabilistic region stability based on generalized Craig interpolation, again consider the simple MDP M from Figure 4 where the symbolic representation of the region is given by Region(s) = ¬f .That is, the region in which M should stabilize consists of the states i, e, and s.The symbolic SSAT encoding of M being introduced in the example of Section 4.1 is reused in the following.
We are first interested in computing an invariance kernel K ⊆ Region(s) with respect to M by means of the generalized Craig interpolation scheme 4.6.To cope with the latter scheme automatically, we employ the simple interpolating DPLL-based SSAT solver mentioned in Section 4.1.The results of these experiments for different values of j are shown in Table 2.It is not hard to see that the unique maximal invariance kernel K * consists of the state s only.Recall that each interpolant I k+1 overapproximates all system states directly leading to the state set ¬R k .When setting parameter j to value 1 or 2, we observe that interpolant I 1 = true is too coarse since it includes the whole state space.This causes the trivial invariance kernel K = false representing the empty set.For choices j = 3 and j = 4, however, I 1 = ¬s describes the exact set of states which lead to ¬R 0 = ¬Region.Finally, the non-trivial invariance kernel K = ¬f ∧ s consisting of state s only is computed.Note that K actually is the maximal invariance kernel.The computed state sets for j ∈ {3, 4} are illustrated on the left of Figure 6.These results confirm the observation made from the experiments of Section 4.1, namely that the greater the value of j, i.e. the more transition steps are performed, the more accurate the resulting overapproximations.Concerning runtime, each generalized Craig interpolant was computed by the interpolating DPLL-based SSAT solver within fractions of a second, where the highest runtime of 88 milliseconds was observed when computing I 2 for j = 4.
Having computed an invariance kernel K(s) ⊆ Region(s) with respect to M, we are now able to compute lower bounds lb k of the minimum probability that M is stable with respect to Region by means of scheme 4.7, where we use K(s) = ¬f ∧ s as obtained for j ∈ {3, 4}.Employing the SSMT tool SiSAT, some of the results are lb 0 = lb 1 = 0, lb 2 = lb 3 = 0.45, lb 4 = lb 5 = 0.54, . .., lb 100 = 0.54.Concerning runtime, all 100 SSAT formulae were solved within 88.16 seconds, while computation of the first 20 lower bounds lb 0 to lb 20 just needed 600 milliseconds.The highest computation time for a single SSAT problem was obtained for lb 100 , namely 2.91 seconds.The evolution of the lb k 's up to k = 10 is presented graphically on the right of Figure 6.With regard to the stability verification problem 4.5, the desired property MinStable(M, Region) ≥ θ is verified for each threshold value θ ≤ 0.54 within a second.
Concerning competitive approaches, we remark that the probabilistic model checking tool PRISM 4.0.1 [KNP11] is also able to deal with probabilistic region stability of MDPs by means of path operators. 11To determine the value of MinStable(M, Region) for the example above, we used the specification Pmin=?[F P>=1 [G (!f)]] meaning that we are interested in the minimum probability (Pmin=?) that finally (F) the system satisfies almost surely (P>=1) the property that globally (G) state f is never visited (!f).PRISM solved the problem in 644 milliseconds returning the result 0.54.
As discussed for the case of probabilistic state reachability at the end of Section 4.1, we are also confident that the presented approach to probabilistic region stability based on generalized Craig interpolation becomes beneficial when adapted to probabilistic hybrid systems, where the classical procedures are not directly applicable.Furthermore, a particular pay-off is expected when dealing with concurrent probabilistic systems owing to the symbolic nature of the interpolation-based technique.

Conclusion and future work
In this article, we elaborated on the idea of Craig interpolation for stochastic Boolean satisfiability.In consideration of the difficulties that arise in this stochastic extension of the propositional satisfiability problem, we first proposed a suitable definition of a generalized Craig interpolant and then presented an algorithm for automatically computing such interpolants.For the latter purpose, we enhanced the SSAT resolution calculus by corresponding rules for the construction of generalized Craig interpolants.We furthermore demonstrated two applications of generalized Craig interpolation as a means of automated analysis of probabilistic finite-state systems.
We first considered probabilistic state reachability.The resulting procedure is able to verify probabilistic safety requirements of the form "the worst-case probability of reaching undesirable system states is at most some given safety threshold".This complements the existing SSAT-based probabilistic bounded model checking approach, which mechanizes falsification of such safety properties.As a second application, we gave attention to probabilistic region stability and presented a symbolic technique for verifying stability properties like "the worst-case probability that the system stabilizes within some given region is at least some given safety threshold".
For future work, we are particularly interested in the adaptation of generalized Craig interpolation to SSMT, i.e. the extension of SSAT with arithmetic theories.One of the most challenging issues here will be the enhancement of the SSAT resolution calculus as well as the corresponding rules for the construction of generalized interpolants in order to deal with SSMT problems.The ability of computing generalized Craig interpolants for SSMT would lift the interpolation schemes 4.3 and 4.6 to SSMT problems, thus establishing symbolic verification approaches to probabilistic state reachability and to probabilistic region stability for discrete-time probabilistic hybrid systems.We are confident that such symbolic procedures will prove beneficial within the analysis of probabilistic hybrid systems, in particular when systems with a high degree of concurrency are considered.

Figure 1 :
Figure 1: Semantics of an SSAT formula depicted as a tree.

Figure 2 :
Figure 2: Geometric interpretation of a generalized Craig interpolant I. V A -, V B -, and V A,B -axes denote assignments of variables occurring only in A, only in B, and in both A and B, respectively.
ff c : Var (c) → B such that ∀x ∈ Var (c) : ff c (x) = true ; ¬x ∈ c, false ; x ∈ c.Consequently, c evaluates to false under assignment ff c .
2. Item 1 clearly holds for both rules since I contains only variables in V A,B .Let us consider RI.1 first.If c ∈ A then I = false.By construction of τ , i.e. c evaluates to false under τ , it follows

Figure 3 :Figure 4 :
Figure 3: Example of interpolating S-resolution and illustration of the resulting generalized Craig interpolants by means of Karnaugh-Veitch diagrams.Arrows denote applications of the specified interpolating S-resolution rules, while DC stands for any formula over V A,B as in rule RI.2.

I 2 ,Figure 5 :
Figure 5: Illustration of the computed state sets for MDP M by the generalized interpolation scheme 4.3 with j = 3 (left), and lower bounds lb k and upper bounds ub k of the maximum probability of reaching target state s over number k of transition steps computed by schemes 4.2 and 4.4, respectively (right).

Figure 6 :
Figure 6: Illustration of the computed state sets for MDP M by the generalized interpolation scheme 4.6 with j ∈ {3, 4} (left), and lower bounds lb k of the minimum probability of reaching the invariance kernel K = {s} over number k of transition steps computed by scheme 4.7 (right).

Table 1 :
Experimental results of applying the generalized interpolation scheme 4.3 on M from Figure

Table 2 :
Experimental results of applying the generalized interpolation scheme 4.6 on M from Figure