Sound approximate and asymptotic probabilistic bisimulations for PCTL

We tackle the problem of establishing the soundness of approximate bisimilarity with respect to PCTL and its relaxed semantics. To this purpose, we consider a notion of bisimilarity inspired by the one introduced by Desharnais, Laviolette, and Tracol, and parametric with respect to an approximation error $\delta$ and to the depth $n$ of the observation along traces. Essentially, our soundness theorem establishes that, when a state $q$ satisfies a given formula up-to error $\delta$ and steps $n$, and $q$ is bisimilar to $q'$ up-to error $\delta'$ and enough steps, then $q'$ also satisfies the formula up-to a suitable error $\delta''$ and steps $n$. The new error $\delta''$ is computed from $\delta$, $\delta'$ and the formula, and only depends linearly on $n$. We provide a detailed overview of our soundness proof. We extend our bisimilarity notion to families of states, thus obtaining an asymptotic equivalence on such families. We then consider an asymptotic satisfaction relation for PCTL formulae, and prove that asymptotically equivalent families of states asymptotically satisfy the same formulae.


Introduction
The behaviour of many real-world systems can be formally modelled as probabilistic processes, e.g. as discrete-time Markov chains. Specifying and verifying properties on these systems requires probabilistic versions of temporal logics, such as PCTL [HJ94]. PCTL allows one to express probability bounds using the formula $\Pr_{\geq \pi}[\psi]$, which is satisfied by those states starting from which the path formula ψ holds with probability ≥ π. A well-known issue is that real-world systems can have tiny deviations from their mathematical models, while logical properties, such as those written in PCTL, impose sharp constraints on the behaviour. To address this issue, one can use a relaxed semantics for PCTL, as in [DAK12]. There, the semantics of formulae is parameterised over the error δ ≥ 0 one is willing to tolerate. While in the standard semantics of $\Pr_{\geq \pi}[\psi]$ the bound ≥ π is exact, in relaxed PCTL this bound is weakened to ≥ π − δ. So, the relaxed semantics generalises the standard PCTL semantics of [HJ94], which can be obtained by choosing δ = 0. Instead, choosing an error δ > 0 effectively provides a way to measure "how much" a state satisfies a given formula: some states might require only a very small error, while others a much larger one.
Figure 1: A Markov chain modelling repeated tosses of a fair coin.
When dealing with temporal logics such as PCTL, one often wants to study some notion of state equivalence which preserves the semantics of formulae: that is, when two states are equivalent, they satisfy the same formulae. For instance, probabilistic bisimilarities like those in [DGJP10, DEP02, LS91] preserve the semantics of formulae for PCTL and other temporal logics. Although strict probabilistic bisimilarity preserves the semantics of relaxed PCTL, it is not robust against small deviations in the probability of transitions in Markov chains [GJS90]. A possible approach to deal with this issue is to also relax the notion of probabilistic bisimilarity, by making it parametric with respect to an error δ [DAK12]. Relaxing bisimilarity in this way poses a choice regarding which properties of the strict probabilistic bisimilarity are to be kept. In particular, transitivity is enjoyed by the strict probabilistic bisimilarity, but it is not desirable for the relaxed notion. Indeed, we could have three states q, q′ and q″ where the behaviour of q and q′ is similar enough (within the error δ), the behaviour of q′ and q″ is also similar enough (within δ), but the distance between q and q″ is larger than the allowed error δ. At best, we can have a sort of "triangular inequality", where q and q″ can still be related but only with a larger error 2·δ.
Bisimilarity is usually defined by coinduction, essentially requiring that the relation is preserved along an arbitrarily long sequence of moves. Still, in some settings, observing the behaviour over a very long run is undesirable. For instance, consider the PCTL formula $\phi = \Pr_{\geq 0.5}[\mathit{true}\ \mathsf{U}^{\leq n}\ a]$, which is satisfied by those states from which, with probability ≥ 0.5, a is satisfied within n steps. In this case, a behavioural equivalence relation that preserves the semantics of φ can neglect the long-run behaviour after n steps. More generally, if all the until operators are bounded, as in $\phi_1\ \mathsf{U}^{\leq k}\ \phi_2$, then each formula has an upper bound of steps n after which a behavioural equivalence relation can ignore what happens next. Observing the behaviour after this upper bound is unnecessarily strict, and indeed in some settings it is customary to neglect what happens in the very long run. For instance, a real-world player repeatedly tossing a coin is usually considered equivalent to a Markov chain with two states and four transitions with probability 1/2 (see Figure 1), even if in the long run the real-world system will diverge from the ideal one (e.g., when the player dies).
Another setting where observing the long-term behaviour is notoriously undesirable is that of cryptography. When studying the security of systems modelling cryptographic protocols, two states are commonly considered equivalent when their behaviour is similar (up to a small error δ) in the short run, even when in the very long run they diverge. For instance, a state q could represent an ideal system where no attacks can be performed by construction, while another state q′ could represent a real system where an adversary can try to disrupt the cryptographic protocol. In such a scenario, if the protocol is secure, we would like to have q and q′ equivalent, since the behaviour of the real system is close to the one of the ideal system. Note that in the real system an adversary can repeatedly try to guess the secret cryptographic keys, and break security in the very long run, with very high probability. Accordingly, standard security definitions require that the behaviour of the ideal and real system are within a small error, but only for a bounded number of steps, after which their behaviour could diverge.
Contributions. To overcome the above-mentioned issues, in this work we introduce a bounded, approximate notion of bisimilarity $\sim^n_\delta$, that only observes the first n steps, and allows for an error δ. Unlike standard bisimilarity, our relation is naturally defined by induction on n. We call this looser variant of bisimilarity an up-to-n, δ bisimilarity. We showcase up-to-n, δ bisimilarity on a running example (Examples 3.6, 4.4, 5.2, and 6.6), comparing an ideal combination padlock against a real one which can be opened by an adversary guessing its combination. We show that the two systems are bisimilar up-to-n, δ, while they are not bisimilar according to the standard coinductive notion. We then discuss how the two systems satisfy a basic security property expressed in PCTL, with suitable errors. To make our theory amenable to reason about infinite-state systems, such as those usually found when modelling cryptographic protocols, all our results apply to Markov chains with countably many states. In this respect, our work departs from most literature on probabilistic bisimulations [DAK12, SZGN13] and bisimilarity distances [vB17, TvB17, TvB18, TvB16, Fu12, CvBW12, vBSW08], which usually assume finite-state Markov chains, as they focus on computing the distances. In Example 4.5 we exploit infinite-state Markov chains to compare a biased random bit generator with an ideal one.
Our first main contribution is a soundness theorem establishing that, when a state q satisfies a PCTL formula φ (up to a given error), any bisimilar state q′ ∼ q must also satisfy φ, at the cost of a slight increase of the error. More precisely, if φ only involves until operators bounded by ≤ n, state q satisfies φ up to some error, and bisimilarity holds for enough steps and error δ, then q′ satisfies φ with an additional asymptotic error O(n·δ).
This asymptotic behaviour is compatible with the usual assumptions of computational security in cryptography. There, models of security protocols include a security parameter η, which affects the length of the cryptographic keys and the running time of the protocol: more precisely, a protocol is assumed to run for n(η) steps, which is polynomially bounded w.r.t. η. As already mentioned above, cryptographic notions of security do not observe the behaviour of the systems after this bound n(η), since in the long run an adversary can surely guess the secret keys by brute force. Coherently, a protocol is considered to be secure if (roughly) its actual behaviour is approximately equivalent to the ideal one for n(η) steps and up to an error δ(η), which has to be a negligible function, asymptotically approaching zero faster than any rational function. Under these bounds on n and δ, the asymptotic error O(n·δ) in our soundness theorem is negligible in η. Consequently, if two states q and q′ represent the ideal and actual behaviour, respectively, and they are bisimilar up to a negligible error, they will satisfy the same PCTL formulae with a negligible error.
We formalise this reasoning by providing a notion of asymptotic equivalence. We start by considering families of states Ξ(η), intuitively representing the behaviour of a system depending on a security parameter η. Our asymptotic equivalence $\Xi_1 \equiv \Xi_2$ holds whenever the behaviour of the two families is n, δ-bisimilar within a negligible error whenever we only perform a polynomial number of steps. We further introduce an asymptotic satisfaction relation $\Xi \models \phi$ which holds whenever the state Ξ(η) satisfies φ under similar assumptions on the number of steps and the allowed error. Our second main result is the soundness of the asymptotic equivalence with respect to asymptotic satisfaction: asymptotically equivalent families asymptotically satisfy the same PCTL formulae.
We provide a detailed overview of the proof of our soundness theorem for n, δ-bisimilarity in section 5, deferring the gory technicalities to Appendix A. The proof of asymptotic soundness, which exploits the soundness theorem for n, δ-bisimilarity, is given in section 6.

Related work
There is a well-established line of research on establishing soundness and completeness of probabilistic bisimulations against various kinds of probabilistic logics [DGJP10, FMM20, HPS+11, LS91, MS17, Mio18].
The work closest to ours is that of D'Innocenzo, Abate and Katoen [DAK12], which addresses the model checking problem on a relaxed PCTL differing from ours in a few aspects. First, their syntax allows for an individual bound on the number of steps k for each until operator $\mathsf{U}^{\leq k}$, while we assume all such bounds are equal and we make the semantics of PCTL parametrized w.r.t. the number of steps to be considered in until. This approach allows us to simplify the statement of the soundness theorem and the definition of asymptotic satisfaction relation, since the bound is not fixed by the formula, but it is a parameter of the semantics. Dealing with the case where each until in a formula could have its own bound seems possible, at the cost of increasing the level of technicalities. Second, their main result shows that bisimilar states up-to a given error ε satisfy the same formulae ψ, provided that ψ ranges over the so-called ε-robust formulae. Instead, our soundness result applies to all PCTL formulae, and ensures that when moving from a state satisfying φ to a bisimilar one, φ is still satisfied, but at the cost of slightly increasing the error. Third, their relaxed semantics differs from ours. In ours, we relax all the probability bounds by the same amount δ. Instead, the relaxation in [DAK12] affects the bounds by a different amount which depends on the error ε, the until bound k, and the underlying DTMC.
Desharnais, Laviolette and Tracol [DLT08] use a coinductive approximate probabilistic bisimilarity, up-to an error δ. Using such coinductive bisimilarity, [DLT08] establishes the soundness and completeness with respect to a Larsen-Skou logic [LS91] (instead of PCTL). In [DLT08], a bounded, up-to n, δ version of bisimilarity is only briefly used to derive a decision algorithm for coinductive bisimilarity under the assumption that the state space is finite. In our work, instead, the bounded up-to n, δ bisimilarity is the main focus of study. In particular, our soundness result only assumes n, δ bisimilarity, which is strictly weaker than coinductive bisimilarity. Another minor difference is that [DLT08] considers a labelled Markov process, i.e. the probabilistic variant of a labelled transition system, while we instead focus on DTMCs having labels on states.
Bian and Abate [BA17] study bisimulation and trace equivalence up-to an error ε, and show that ε-bisimilar states are also ε′-trace equivalent for a suitable ε′ which depends on ε. Furthermore, they show that ε′-trace equivalent states satisfy the same formulae in a bounded LTL, up-to a certain error. In our work, we focus instead on the branching logic PCTL.
A related research line is that on bisimulation metrics [vB17, vBHMW05, vBW05]. Some of these metrics, like our up-to bisimilarity, take approximations into account [DGJP99, CGT16]. Similarly to our bisimilarity, bisimulation metrics allow one to establish that two states are equivalent up-to a certain error (but usually do not take into account the bound on the number of steps). Interestingly, Castiglioni, Gebler and Tini [CGT16] introduce a notion of distance between Larsen-Skou formulae, and prove that the bisimulation distance between two processes corresponds to the distance between their mimicking formulae. De Alfaro, Majumdar, Raman and Stoelinga [dAMRS08] elegantly characterise bisimulation metrics with a quantitative µ-calculus. Such a logic allows one to specify interesting properties such as maximal reachability and safety probability, and the maximal probability of satisfying a general ω-regular specification, but not full PCTL. Mio [Mio14] characterises a bisimulation metric based on total variability with a more general quantitative µ-calculus, dubbed Łukasiewicz µ-calculus, able to encode PCTL. Neither [dAMRS08] nor [Mio14] takes the number of steps into account, therefore their applicability to the analysis of security protocols is yet to be investigated.
Metrics with discount [DGJP04, dAHM03, BBL+21, DCPP06, vBSW08] are sometimes used to relate the behaviour of probabilistic processes, weighing less those events that happen in the far future compared to those happening in the first steps. Often, in these metrics each step causes the probability of the next events to be multiplied by a constant factor c < 1, in order to diminish their importance. Note that this discount makes it so that after η steps, this diminishing factor becomes $c^\eta$, which is a negligible function of η. As discussed before, in cryptographic security one needs to consider as important those events happening within polynomially many steps, while neglecting the ones after such a polynomial threshold. Using an exponential discount factor $c^\eta$ after only η steps goes against this principle, since it would cause a secure system to be at a negligible distance from an insecure one which can be violated after just η steps. For this reason, instead of using a metric with discount, in this paper we resort to a bisimilarity that is parametrized over the number of steps n and error δ, allowing us to obtain a notion which distinguishes between the mentioned secure and insecure systems.
Several works develop algorithms to decide probabilistic bisimilarity, and to compute metrics [vBW14, CvBW12, Fu12, TvB16, TvB17, TvB18]. To this purpose, they restrict to finite-state systems, e.g. probabilistic automata. Our results, instead, apply also to infinite-state systems.
In [ZD05] a calculus with cryptographic primitives is introduced, together with a semantics where attackers have a probability π(η) of guessing encryption keys. It is shown that, assuming that π(η) is negligible and that attackers run in polynomial time, some security properties (e.g. secrecy, authentication) are equivalent to the analogous properties with standard Dolev-Yao assumptions (that is, attackers never guess keys but are not restricted to polynomial time). This result can be seen as a special case of our asymptotic soundness theorem.
The interesting work [LG22] proposes a behavioural notion of indistinguishability between session typed probabilistic π-calculus processes, with the aim of providing a formal system for proving security of real cryptographic protocols by comparison with ideal ones.The type system, which is based on bounded linear logic [GSS92,LG16], guarantees that processes terminate in polynomial time.This differs from our approach, where polynomiality appears directly in the equivalence definition (Definition 6.2).Moreover, the calculus of [LG22] is quite restrictive: for instance, it is not possible to specify adversaries that access an oracle a polynomial number of times.By contrast, our abstract model is general enough to represent such adversaries.
Comparison with [BMZ22]. This paper extends the work [BMZ22] in two directions. First, the current paper includes the proofs of all statements, which were not present in [BMZ22]. Second, in [BMZ22] we hinted at the possible application of soundness to the asymptotic behaviour of systems which depend on a parameter η. Here, we properly develop and formalise that intuition in section 6, providing a new asymptotic soundness result.

The probabilistic temporal logic PCTL
Assume a set L of labels, ranged over by l, and let δ, π range over non-negative reals. A discrete-time Markov chain (DTMC) is a standard model of probabilistic systems. Throughout this paper, we consider a DTMC having a countable, possibly infinite, set of states q, each carrying a subset of labels ℓ(q) ⊆ L.
Definition 3.1 (Discrete-Time Markov Chain). A (labelled) DTMC is a triple (Q, Pr, ℓ) where: Q is a countable set of states; Pr : Q × Q → [0, 1] is the transition probability function; ℓ : Q → P(L) is the labelling function. Given q ∈ Q and Q′ ⊆ Q, we write Pr(q, Q′) for $\sum_{q' \in Q'} \Pr(q, q')$, and we require that Pr(q, Q) = 1 for all q ∈ Q.
A trace is an infinite sequence of states $t = q_0 q_1 \cdots$, where we write t(i) for $q_i$, i.e. the i-th element of t. A trace fragment is a finite, non-empty sequence of states $\tilde t = q_0 \cdots q_{n-1}$, where $|\tilde t| = n \geq 1$ is its length. Given a trace fragment $\tilde t$ and a state q, we write $\tilde t q^\omega$ for the trace $\tilde t q q q \cdots$.
It is well-known that, given an initial state $q_0$, the DTMC induces a σ-algebra of measurable sets of traces T starting from $q_0$, i.e. the σ-algebra generated by cylinder sets [BK08]. More in detail, given a trace fragment $\tilde t = q_0 \cdots q_{n-1}$, its cylinder set
$$Cyl(\tilde t) = \{\, t \mid \tilde t \text{ is a prefix of } t \,\}$$
is given probability
$$\Pr(Cyl(\tilde t)) = \prod_{i=0}^{n-2} \Pr(q_i, q_{i+1})$$
As usual, if n = 1 the product is empty and evaluates to 1. Closing the family of cylinder sets under countable unions and complement we obtain the family of measurable sets. The probability measure on cylinder sets then uniquely extends to all the measurable sets.
Given a set of trace fragments $\tilde T$, all starting from the same state $q_0$ and having the same length, we let
$$\Pr(\tilde T) = \Pr\Big(\bigcup_{\tilde t \in \tilde T} Cyl(\tilde t)\Big) = \sum_{\tilde t \in \tilde T} \Pr(Cyl(\tilde t))$$
Note that using same-length trace fragments ensures that their cylinder sets are disjoint, hence the second equality holds.
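As an illustration, the probability of a cylinder set can be computed by multiplying transition probabilities along the fragment. The following is only a minimal sketch, assuming the relevant part of the chain is stored as a nested dictionary; the names `cylinder_prob`, `fragments_prob` and the fair-coin chain `coin` (modelled after Figure 1) are hypothetical and not part of the paper.

```python
from fractions import Fraction

# Hypothetical encoding of the fair-coin DTMC of Figure 1:
# two states, every transition has probability 1/2.
coin = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "b": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
}

def cylinder_prob(prob, fragment):
    """Probability of the cylinder set of a non-empty trace fragment."""
    result = Fraction(1)  # empty product when the fragment has length 1
    for src, dst in zip(fragment, fragment[1:]):
        result *= prob[src].get(dst, Fraction(0))
    return result

def fragments_prob(prob, fragments):
    """Probability of a set of same-length fragments sharing the initial state."""
    fragments = set(map(tuple, fragments))
    # Same length and same start make the cylinder sets pairwise disjoint.
    assert len({len(f) for f in fragments}) <= 1, "fragments must have equal length"
    assert len({f[0] for f in fragments}) <= 1, "fragments must share the initial state"
    return sum((cylinder_prob(prob, f) for f in fragments), Fraction(0))
```

For instance, `cylinder_prob(coin, ("a", "b", "a"))` multiplies two transitions of probability 1/2, while a fragment of length 1 yields the empty product.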
Below, we define PCTL formulae. Our syntax is mostly standard, except for the until operator. There, for the sake of simplicity, we do not bound the number of steps in the syntax $\phi_1\ \mathsf{U}\ \phi_2$, but we do so in the semantics. Concretely, this amounts to imposing the same bound to all the occurrences of U in the formula. Such bound is then provided as a parameter to the semantics.
Definition 3.2 (PCTL Syntax). The syntax of PCTL is given by the following grammar, defining state formulae φ and path formulae ψ:
$$\phi ::= l \mid \mathit{true} \mid \neg\phi \mid \phi \land \phi \mid \Pr_{\rhd \pi}[\psi] \qquad\qquad \psi ::= \mathsf{X}\,\phi \mid \phi\ \mathsf{U}\ \phi$$
where ⊳ ∈ {>, ≥}. As syntactic sugar, we write $\Pr_{<\pi}[\psi]$ for $\neg\Pr_{\geq\pi}[\psi]$, and $\Pr_{\leq\pi}[\psi]$ for $\neg\Pr_{>\pi}[\psi]$.
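Given a PCTL formula φ, we define its maximum X-nesting $X_{max}(\phi)$ and its maximum U-nesting $U_{max}(\phi)$ inductively as follows:
Definition 3.3 (Maximum Nesting). For • ∈ {X, U}, we define:
$$\bullet_{max}(l) = 0 \qquad \bullet_{max}(\mathit{true}) = 0 \qquad \bullet_{max}(\neg\phi) = \bullet_{max}(\phi)$$
$$\bullet_{max}(\phi_1 \land \phi_2) = \max(\bullet_{max}(\phi_1), \bullet_{max}(\phi_2)) \qquad \bullet_{max}(\Pr_{\rhd\pi}[\psi]) = \bullet_{max}(\psi)$$
$$X_{max}(\mathsf{X}\,\phi) = X_{max}(\phi) + 1 \qquad X_{max}(\phi_1\,\mathsf{U}\,\phi_2) = \max(X_{max}(\phi_1), X_{max}(\phi_2))$$
$$U_{max}(\mathsf{X}\,\phi) = U_{max}(\phi) \qquad U_{max}(\phi_1\,\mathsf{U}\,\phi_2) = \max(U_{max}(\phi_1), U_{max}(\phi_2)) + 1$$
We now define a semantics for PCTL where the probability bounds π in $\Pr_{\rhd\pi}[\psi]$ can be relaxed or strengthened by an error δ. Our semantics is parameterized over the until bound n, the error δ ∈ R≥0, and a direction r ∈ {+1, −1}. Given the parameters, the semantics associates each PCTL state formula with the set of states satisfying it. Intuitively, when r = +1 we relax the semantics of the formula, so that increasing δ causes more states to satisfy it. More precisely, the probability bounds π in positive occurrences of $\Pr_{\rhd\pi}[\psi]$ are decreased by δ, while those in negative occurrences are increased by δ. Dually, when r = −1 we strengthen the semantics, modifying π in the opposite direction. Our semantics is inspired by the relaxed / strengthened PCTL semantics of [DAK12].
Definition 3.4 (PCTL Semantics). The semantics of PCTL formulae is given below. Let n ∈ N, δ ∈ R≥0 and r ∈ {+1, −1}.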
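The clauses can be sketched as follows. This is a reconstruction from the surrounding prose and from the way the semantics is used in Example 3.6, not a verbatim statement: in particular, the flip of the direction r under negation, the range 0..n of the until bound, and the notation $[\![\cdot]\!]^n_{\delta,r}$ for both state and path formulae are assumptions.

```latex
\begin{align*}
[\![ l ]\!]^n_{\delta,r} &= \{\, q \in Q \mid l \in \ell(q) \,\} \\
[\![ \mathit{true} ]\!]^n_{\delta,r} &= Q \\
[\![ \neg\phi ]\!]^n_{\delta,r} &= Q \setminus [\![ \phi ]\!]^n_{\delta,-r} \\
[\![ \phi_1 \land \phi_2 ]\!]^n_{\delta,r} &= [\![ \phi_1 ]\!]^n_{\delta,r} \cap [\![ \phi_2 ]\!]^n_{\delta,r} \\
[\![ \Pr\nolimits_{\rhd\pi}[\psi] ]\!]^n_{\delta,r} &=
  \{\, q \in Q \mid \Pr(Cyl(q) \cap [\![ \psi ]\!]^n_{\delta,r}) + r \cdot \delta \rhd \pi \,\} \\
[\![ \mathsf{X}\,\phi ]\!]^n_{\delta,r} &= \{\, t \mid t(1) \in [\![ \phi ]\!]^n_{\delta,r} \,\} \\
[\![ \phi_1\,\mathsf{U}\,\phi_2 ]\!]^n_{\delta,r} &=
  \{\, t \mid \exists i \in 0..n.\ t(i) \in [\![ \phi_2 ]\!]^n_{\delta,r}
      \land \forall j \in 0..i-1.\ t(j) \in [\![ \phi_1 ]\!]^n_{\delta,r} \,\}
\end{align*}
```

Under these clauses, a formula $\Pr_{\leq \pi}[\psi] = \neg\Pr_{>\pi}[\psi]$ evaluated with r = +1 reduces to the check $\Pr(Cyl(q) \cap [\![\psi]\!]^n_{\delta,-1}) \leq \pi + \delta$, which matches the intuition that negative occurrences have their bound increased by δ.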
The semantics is mostly standard, except for $\Pr_{\rhd\pi}[\psi]$ and $\phi_1\,\mathsf{U}\,\phi_2$. The semantics of $\Pr_{\rhd\pi}[\psi]$ adds r·δ to the probability of satisfying ψ, which relaxes or strengthens (depending on r) the probability bound as needed. The semantics of $\phi_1\,\mathsf{U}\,\phi_2$ uses the parameter n to bound the number of steps within which $\phi_2$ must hold.
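To make the two non-standard clauses concrete, the sketch below computes the probability of a bounded until by backward recursion on a finite DTMC and then applies the shift by r·δ. It is only an illustration under assumptions: the chain is finite and stored as a nested dictionary, the sets `sat1` and `sat2` stand for the (already computed) semantics of the two subformulae, and the names `bounded_until`, `holds` and `coin` are hypothetical.

```python
from fractions import Fraction

# Hypothetical encoding of the fair-coin DTMC of Figure 1.
coin = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "b": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
}

def bounded_until(prob, sat1, sat2, n):
    """For each state q, the probability that a trace from q reaches a state
    in sat2 within n steps, visiting only states in sat1 before that."""
    value = {q: Fraction(int(q in sat2)) for q in prob}  # until bound 0
    for _ in range(n):
        nxt = {}
        for q in prob:
            if q in sat2:
                nxt[q] = Fraction(1)
            elif q in sat1:
                nxt[q] = sum((w * value[s] for s, w in prob[q].items()), Fraction(0))
            else:
                nxt[q] = Fraction(0)
        value = nxt
    return value

def holds(probability, pi, delta, r, strict=False):
    """Check `probability + r*delta > pi` (strict) or `>= pi` (non-strict):
    r = +1 relaxes the bound, r = -1 strengthens it."""
    shifted = probability + r * delta
    return shifted > pi if strict else shifted >= pi

# Probability of reaching an a-labelled state within 2 steps.
reach_a = bounded_until(coin, {"a", "b"}, {"a"}, 2)
```

With this encoding, `reach_a["b"]` is 3/4, so a bound ≥ 0.8 fails with zero error but holds once relaxed with δ = 0.1 and r = +1.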
Our semantics enjoys monotonicity. The semantics of state and path formulae is increasing w.r.t. δ if r = +1, and decreasing otherwise. The semantics also increases when moving from r = −1 to r = +1.
Figure 2: A Markov chain modelling an ideal (left) and a real (right) padlock.
Example 3.6. We compare an ideal combination padlock to a real one from the point of view of an adversary. The ideal padlock has a single state $q_{ok}$, representing a closed padlock that cannot be opened. Instead, the real padlock is under attack from the adversary who tries to open the padlock by repeatedly guessing its 5-digit PIN. At each step the adversary generates a (uniformly) random PIN, different from all the ones which have been attempted so far, and tries to open the padlock with it. The states of the real padlock are $q_0, \ldots, q_{N-1}$ (with $N = 10^5$), where $q_i$ represents the situation where i unsuccessful attempts have been made, and an additional state $q_{err}$ that represents that the padlock was opened. Since after i attempts the adversary needs to guess the correct PIN among the N − i remaining combinations, the real padlock in state $q_i$ moves to $q_{err}$ with probability 1/(N − i), and to $q_{i+1}$ with the complementary probability.
Summing up, we simultaneously model both the ideal and real padlock as a single DTMC with the following transition probability function (see Figure 2):
$$\Pr(q_{ok}, q_{ok}) = 1 \qquad \Pr(q_{err}, q_{err}) = 1 \qquad \Pr(q_i, q_{err}) = \frac{1}{N-i} \qquad \Pr(q_i, q_{i+1}) = 1 - \frac{1}{N-i}$$
with all the other transitions having probability 0. We label the states with labels L = {err} by letting $\ell(q_{err}) = \{err\}$ and $\ell(q) = \emptyset$ for all $q \neq q_{err}$. The PCTL formula $\phi = \Pr_{\leq 0}[\mathit{true}\ \mathsf{U}\ err]$ models the expected behaviour of an unbreakable padlock, requiring that the set of traces where the padlock is eventually opened has zero probability. Formally, φ is satisfied by state q when
$$\Pr(Cyl(q) \cap [\![\mathit{true}\ \mathsf{U}\ err]\!]^n_{\delta,-1}) \leq \delta$$
When $q = q_{ok}$ we have that $Cyl(q_{ok}) \cap [\![\mathit{true}\ \mathsf{U}\ err]\!]^n_{\delta,-1} = \emptyset$, hence the above probability is zero, which is surely ≤ δ. Consequently, φ is satisfied by the ideal padlock $q_{ok}$, for all n ≥ 0 and δ ≥ 0.
By contrast, φ is not always satisfied by the real padlock $q = q_0$, since we have
$$\Pr(Cyl(q_0) \cap [\![\mathit{true}\ \mathsf{U}\ err]\!]^n_{\delta,-1}) \leq \delta \qquad (3.1)$$
only for some values of n and δ. To show why, we start by considering some trivial cases.
Choosing δ = 1 makes equation (3.1) trivially true for all n. Furthermore, if we choose n = 1, then $Cyl(q_0) \cap [\![\mathit{true}\ \mathsf{U}\ err]\!]^n_{\delta,-1} = \{q_0 q_{err}^\omega\}$ is a set of traces with probability 1/N. Therefore, equation (3.1) holds only when δ ≥ 1/N. More in general, when n ≥ 1, we have
$$Cyl(q_0) \cap [\![\mathit{true}\ \mathsf{U}\ err]\!]^n_{\delta,-1} = \{\, q_0 q_{err}^\omega,\ q_0 q_1 q_{err}^\omega,\ q_0 q_1 q_2 q_{err}^\omega,\ \ldots,\ q_0 \cdots q_{n-1} q_{err}^\omega \,\}$$
The probability of the above set is the probability of guessing the PIN within n steps. The complementary event, i.e. not guessing the PIN for n times, has probability
$$\prod_{i=0}^{n-1} \Big(1 - \frac{1}{N-i}\Big) = \prod_{i=0}^{n-1} \frac{N-i-1}{N-i} = \frac{N-n}{N}$$
Consequently, (3.1) simplifies to n/N ≤ δ, suggesting the least value of δ (depending on n) for which $q_0$ satisfies φ. For instance, when $n = 10^3$, this amounts to claiming that the real padlock is secure, up to an error of $\delta = n/N = 10^{-2}$.
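The telescoping product in this example can be checked with exact rational arithmetic. The snippet below is a small sketch, assuming n ≤ N; the function names `survive_prob` and `least_error` are hypothetical and only mirror the quantities discussed in the example.

```python
from fractions import Fraction

def survive_prob(N, n):
    """Probability that none of the first n guesses opens the padlock,
    i.e. the product of the 'miss' probabilities from q_0, ..., q_{n-1}.
    Assumes 0 <= n <= N."""
    result = Fraction(1)
    for i in range(n):
        result *= 1 - Fraction(1, N - i)
    return result

def least_error(N, n):
    """Probability of guessing the PIN within n steps, which is the least
    delta making q_0 satisfy the formula with until bound n."""
    return 1 - survive_prob(N, n)

N = 10**5
delta_min = least_error(N, 10**3)
```

Using `Fraction` avoids floating-point rounding, so the result can be compared exactly with the closed form n/N.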

Up-to-n, δ Bisimilarity
We now define a relation on states $q \sim^n_\delta q'$ that intuitively holds whenever q and q′ exhibit similar behaviour for a bounded number of steps. The parameter n controls the number of steps, while δ controls the error allowed in each step. Note that since we only observe the first n steps, our notion is inductive, similarly to [CGT16], unlike unbounded bisimilarity, which is coinductive. Our notion is also inspired by [DLT08].
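Definition 4.1 (Up-to-n, δ Bisimilarity). We define the relation $q \sim^n_\delta q'$ as follows by induction on n:
(1) $q \sim^0_\delta q'$ always holds
(2) $q \sim^{n+1}_\delta q'$ holds if and only if, for all Q′ ⊆ Q:
(a) $\ell(q) = \ell(q')$
(b) $\Pr(q, Q') \leq \Pr(q', \sim^n_\delta(Q')) + \delta$
(c) $\Pr(q', Q') \leq \Pr(q, \sim^n_\delta(Q')) + \delta$
where $\sim^n_\delta(Q') = \{\, q'' \mid \exists q''' \in Q'.\ q''' \sim^n_\delta q'' \,\}$ is the image of the set Q′ according to the bisimilarity relation.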
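The inductive definition suggests a direct, if naive, way to compute the relation on a small finite DTMC: start from the full relation and refine it n times. The sketch below does so on a toy padlock with N = 4 combinations. It is an illustration only, under the assumption that item (2) requires equal labels plus the two inequalities with error δ for every set of states; all names (`padlock`, `bisim_relation`, ...) are hypothetical. Only subsets of the support of the source state are enumerated, which suffices because shrinking a set to the support keeps its probability and can only shrink its image.

```python
from fractions import Fraction
from itertools import combinations

def padlock(N):
    """Toy padlock DTMC: ideal state 'ok', broken state 'err', real states q0..q{N-1}."""
    states = ["ok", "err"] + [f"q{i}" for i in range(N)]
    prob = {s: {} for s in states}
    prob["ok"]["ok"] = Fraction(1)
    prob["err"]["err"] = Fraction(1)
    for i in range(N):
        prob[f"q{i}"]["err"] = Fraction(1, N - i)
        if i + 1 < N:
            prob[f"q{i}"][f"q{i + 1}"] = 1 - Fraction(1, N - i)
    label = {s: frozenset({"err"}) if s == "err" else frozenset() for s in states}
    return states, prob, label

def subsets(items):
    """All subsets of a finite collection."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield set(combo)

def mass(prob, q, target):
    """Pr(q, target): probability of moving from q into the set target."""
    return sum((w for s, w in prob[q].items() if s in target), Fraction(0))

def refine(states, prob, label, rel, delta):
    """One inductive step: from the relation at depth n to depth n + 1."""
    def image(subset):
        return {q for (p, q) in rel if p in subset}

    def one_way(p, q):
        return all(mass(prob, p, s) <= mass(prob, q, image(s)) + delta
                   for s in subsets(prob[p]))

    return {(p, q) for p in states for q in states
            if label[p] == label[q] and one_way(p, q) and one_way(q, p)}

def bisim_relation(states, prob, label, n, delta):
    """Set of pairs related at depth n with error delta."""
    rel = {(p, q) for p in states for q in states}  # depth 0: everything is related
    for _ in range(n):
        rel = refine(states, prob, label, rel, delta)
    return rel

states, prob, label = padlock(4)
rel_2 = bisim_relation(states, prob, label, 2, Fraction(1, 4))
```

The enumeration is exponential in the size of the support, so this is meant for experimenting with small examples rather than as a decision procedure.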
We now establish two basic properties of the bisimilarity. Our notion is reflexive and symmetric, and enjoys a triangular property. Furthermore, it is monotonic on both n and δ.
Example 4.4. We use up-to-n, δ bisimilarity to compare the behaviour of the ideal padlock $q_{ok}$ and the real one, in any of its states, when observed for n steps. When n = 0 bisimilarity trivially holds, so below we only consider n > 0.
We start from the simplest case: bisimilarity does not hold between $q_{ok}$ and $q_{err}$. Indeed, $q_{ok}$ and $q_{err}$ have distinct labels ($\ell(q_{ok}) = \emptyset \neq \{err\} = \ell(q_{err})$), hence we do not have $q_{ok} \sim^n_\delta q_{err}$, no matter what n > 0 and δ are. We now compare $q_{ok}$ with any $q_i$. When n = 1, both states have an empty label set, i.e. $\ell(q_{ok}) = \ell(q_i) = \emptyset$, hence they are bisimilar for any error δ. We therefore can write $q_{ok} \sim^1_\delta q_i$ for any δ ≥ 0.
When n = 2, we need a larger error δ to make $q_{ok}$ and $q_i$ bisimilar. Indeed, if we perform a move from $q_i$, the padlock can be broken with probability 1/(N − i), in which case we reach $q_{err}$, thus violating bisimilarity. Accounting for such probability, we only obtain $q_{ok} \sim^2_\delta q_i$ for any δ ≥ 1/(N − i).
When n = 3, we need an even larger error δ to make $q_{ok}$ and $q_i$ bisimilar. Indeed, while the first PIN guessing attempt has probability 1/(N − i), in the second move the guessing probability increases to 1/(N − i − 1). Choosing δ equal to the largest probability is enough to account for both moves, hence we obtain $q_{ok} \sim^3_\delta q_i$ for any δ ≥ 1/(N − i − 1). Technically, note that the denominator N − i − 1 might be zero, since when i = N − 1 the first move always guesses the PIN, and the second guess never actually happens. In such case, we instead take δ = 1.
More in detail, we verify item (2b) of Definition 4.1 for $q_{ok} \sim^3_\delta q_i$, assuming δ ≥ 1/(N − i − 1) with N − i − 1 > 0 (the case δ = 1 is trivial). We must prove that, for all Q′ ⊆ Q:
$$\Pr(q_{ok}, Q') \leq \Pr(q_i, \sim^2_\delta(Q')) + \delta$$
When $q_{ok} \notin Q'$ we have $\Pr(q_{ok}, Q') = 0$, hence the inequality holds trivially. Otherwise, if $q_{ok} \in Q'$ we first observe that $\Pr(q_{ok}, Q') = 1$. From the case n = 2, we have $q_{ok} \sim^2_\delta q_{i+1}$, since δ ≥ 1/(N − (i + 1)). Hence, $q_{i+1} \in \sim^2_\delta(Q')$ and so:
$$\Pr(q_i, \sim^2_\delta(Q')) + \delta \;\geq\; \Pr(q_i, q_{i+1}) + \delta \;=\; 1 - \frac{1}{N-i} + \delta \;\geq\; 1 - \frac{1}{N-i} + \frac{1}{N-i-1} \;\geq\; 1 \;=\; \Pr(q_{ok}, Q')$$
This proves item (2b); the proof for item (2c) is similar.
More in general, for an arbitrary n ≥ 2, we obtain through a similar argument that $q_{ok} \sim^n_\delta q_i$ for any δ ≥ 1/(N − i − n + 2). Intuitively, δ = 1/(N − i − n + 2) is the probability of guessing the PIN in the last attempt (the n-th), which is the attempt having the highest success probability. Again, when the denominator N − i − n + 2 becomes zero (or negative), we instead take δ = 1.
Note that the DTMC of the ideal and real padlocks (Example 3.6) has finitely many states. Our bisimilarity notion and results, however, can also deal with DTMCs with a countably infinite set of states, as we show in the next example.
Example 4.5. We consider an ideal system which randomly generates bit streams in a fair way. We model such a system as having two states $\{q_a, q_b\}$, with transition probabilities Pr(x, y) = 1/2 for any $x, y \in \{q_a, q_b\}$, as in Figure 1. We label state $q_a$ with label a denoting bit 0, and state $q_b$ with label b denoting bit 1.
We compare this ideal system with a real system which generates bit streams in an unfair way. At each step, the real system draws a ball from an urn, initially having $g_0$ a-labelled balls and $g_0$ b-labelled balls. After each drawing, the ball is placed back in the urn. However, every time an a-labelled ball is drawn, an additional a-labelled ball is put in the urn, making the next drawings more biased towards a.
We model the real system using the infinite¹ set of states N × {a, b}, whose first component counts the number of a-labelled balls in the urn, and the second component is the label of the last-drawn ball. The transition probabilities are as follows, where $g_0 \in \mathbb{N}^+$ (see Figure 3):
$$\Pr((g, x), (g + 1, a)) = \frac{g}{g + g_0} \qquad \Pr((g, x), (g, b)) = \frac{g_0}{g + g_0} \qquad \Pr((g, x), (g', x')) = 0 \text{ otherwise}$$
We label each such state with its second component. We now compare the ideal system to the real one. Intuitively, the ideal system, when started from state $q_a$, produces a sequence of states whose labels are uniform independent random values in {a, b}. Instead, the real system slowly becomes more and more biased towards label a. More precisely, when started from state $(g_0, a)$, in the first drawing the next label is uniformly distributed between a and b, as in the ideal system. When the sampled state has label a, this causes the component g to be incremented, increasing the probability $g/(g + g_0)$ of sampling another a in the next steps. Indeed, the value g is always equal to $g_0$ plus the number of sampled a-labelled states so far.
Therefore, unlike the ideal system, in the long run the real system will visit a-labelled states with very high probability, since the g component slowly but steadily increases. While this fact makes the two systems not bisimilar according to the standard probabilistic bisimilarity [LS89], if we restrict the number of steps to $n \ll g_0$ and tolerate a small error δ, we can obtain $q_a \sim^n_\delta (g_0, a)$. For instance, if we let $g_0 = 1000$, n = 100 and δ = 0.05 we have $q_a \sim^n_\delta (g_0, a)$. This is because, in n steps, the first component g of a real system (g, x) will at most reach 1100, so that the probability of moving to (g + 1, a) in the next step is at most 1100/2100 ≈ 0.524. This differs from the ideal probability 0.5 by less than δ, hence bisimilarity holds.
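The arithmetic behind this choice of parameters can be checked as follows. The sketch only verifies the per-step deviation between the real and the ideal transition probabilities, as in the argument above; it does not by itself compute the bisimilarity relation. The function name `max_step_deviation` is hypothetical.

```python
from fractions import Fraction

def max_step_deviation(g0, n):
    """Largest gap between the real probability g/(g + g0) of drawing an
    a-labelled ball and the ideal probability 1/2, over the first n steps
    when starting from (g0, a)."""
    g_max = g0 + n  # at most one a-labelled ball is added per step
    return Fraction(g_max, g_max + g0) - Fraction(1, 2)

deviation = max_step_deviation(1000, 100)
within_error = deviation < Fraction(5, 100)
```

With $g_0 = 1000$ and n = 100 the deviation is exactly 1/42, comfortably below δ = 0.05.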
¹ Modelling this behaviour inherently requires an infinite set of states, since each number of a-labelled balls in the urn leads to a unique transition probability function.

Soundness
Our soundness theorem shows that, if we consider any state q satisfying φ (with steps n and error δ′), and any state q′ which is bisimilar to q (with enough steps and error δ), then q′ must satisfy φ, with the same number n of steps, at the cost of suitably increasing the error. For a fixed φ, the "large enough" number of steps and the increase in the error depend linearly on n.
Theorem 5.1 (Soundness). Let $k_X = X_{max}(\phi)$ be the maximum X-nesting of a formula φ, and let $k_U = U_{max}(\phi)$ be the maximum U-nesting of φ. Then, for all n, δ, δ′ we have:
$$\sim^{\bar n}_{\delta}\big([\![\phi]\!]^n_{\delta',+1}\big) \subseteq [\![\phi]\!]^n_{\bar n \cdot \delta + \delta',+1} \qquad \text{where } \bar n = n \cdot k_U + k_X + 1$$
Example 5.2. We apply Theorem 5.1 to our padlock system in the running example. We take the same formula $\phi = \Pr_{\leq 0}[\mathit{true}\ \mathsf{U}\ err]$ of Example 3.6 and choose $n = 10^3$ and δ′ = 0. Since φ has only one until operator and no next operators, the value $\bar n$ in the theorem statement is $\bar n = 10^3 \cdot 1 + 0 + 1 = 1001$. Therefore, from Theorem 5.1 we obtain, for all δ:
$$\sim^{1001}_{\delta}\big([\![\phi]\!]^{1000}_{0,+1}\big) \subseteq [\![\phi]\!]^{1000}_{1001 \cdot \delta,+1}$$
In Example 3.6 we discussed how the ideal padlock $q_{ok}$ satisfies the formula φ for any number of steps and any error value. In particular, choosing 1000 steps and zero error, we get $q_{ok} \in [\![\phi]\!]^{1000}_{0,+1}$. Moreover, in Example 4.4 we observed that states $q_{ok}$ and $q_0$ are bisimilar with $\bar n = 1001$ and $\delta = 1/(N - 0 - \bar n + 2) = 1/99001$, i.e. $q_{ok} \sim^{\bar n}_\delta q_0$. In such case, the theorem ensures that $q_0 \in [\![\phi]\!]^{1000}_{1001/99001,+1}$, hence the real padlock can be considered unbreakable if we limit our attention to the first n = 1000 steps, up to an error of 1001/99001 ≈ 0.010111. Finally, we note that such error is remarkably close to the least value that would still make $q_0$ satisfy φ, which we computed in Example 3.6 as $n/N = 10^3/10^5 = 0.01$.
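The numbers in this example can be reproduced with a few lines of exact arithmetic. The sketch assumes that the number of bisimulation steps required by the theorem is n·k_U + k_X + 1 (as in the computation 10³·1 + 0 + 1 above) and that the resulting error is that number of steps times the bisimilarity error, plus the error with which the first state satisfies the formula; the function name `soundness_bound` and its parameter names are hypothetical.

```python
from fractions import Fraction

def soundness_bound(n, k_until, k_next, delta, delta_prime):
    """Return (steps, error): the number of bisimulation steps needed and the
    error with which the bisimilar state satisfies the formula.
    Assumed shape: steps = n*k_until + k_next + 1, error = steps*delta + delta_prime."""
    steps = n * k_until + k_next + 1
    return steps, steps * delta + delta_prime

N = 10**5
n = 10**3
# One until operator, no next operator, zero error for the ideal padlock.
steps, _ = soundness_bound(n, 1, 0, Fraction(0), Fraction(0))
delta = Fraction(1, N - 0 - steps + 2)   # bisimilarity error from Example 4.4
_, error = soundness_bound(n, 1, 0, delta, Fraction(0))
gap = error - Fraction(n, N)             # distance from the least error n/N
```

The value `gap` quantifies how close the error given by the theorem is to the least error computed in Example 3.6.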
In the rest of this section, we describe the general structure of the proof in a top-down fashion, leaving the detailed proof for Appendix A.
We prove the soundness theorem by induction on the state formula $\phi$, hence we also need to deal with path formulae $\psi$. Note that the statement of the theorem considers the image of the semantics of the state formula $\phi$ w.r.t. bisimilarity (i.e., $\sim^{\bar{n}}_{\delta}([\![\phi]\!]^{n}_{\delta',+1})$). To deal with path formulae, we need an analogous notion on sets of traces. To this purpose, we consider the set of traces in the definition of the semantics: Then, given a state $q$ bisimilar to $p$, we define the set of pointwise bisimilar traces starting from $q$, which we denote with $R^n_{\delta,q}(T)$. Technically, since $\psi$ can only observe a finite portion of a trace, it is enough to define $\tilde{R}^n_{\delta,q}(\tilde{T})$ on sets of trace fragments $\tilde{T}$.
Definition 5.3. Write $F^n_{q_0}$ for the set of all trace fragments of length $n$ starting from $q_0$. Assuming $p \sim^n_\delta q$, we define $\tilde{R}^n_{\delta,q} : \mathcal{P}(F^n_p) \to \mathcal{P}(F^n_q)$ as follows:
In particular, notice that $F^1_q = \{q\}$ (the trace fragment of length 1), and so: $\tilde{R}^1_{\delta,q}(\emptyset) = \emptyset$ and $\tilde{R}^1_{\delta,q}(\{p\}) = \{q\}$.
The key inequality we exploit in the proof (Lemma 5.4) compares the probability of a set of trace fragments $\tilde{T}$ starting from $p$ to the one of the related set of trace fragments $\tilde{R}^m_{\delta,q}(\tilde{T})$ starting from a $q$ bisimilar to $p$. We remark that the component $\bar{n}\delta$ in the error that appears in Theorem 5.1 results from the component $m\delta$ appearing in the following lemma.
Lemma 5.4. If $p \sim^n_\delta q$ and $\tilde{T}$ is a set of trace fragments of length $m$, with $m \leq n$, starting from $p$, then:
$$\Pr(\tilde{T}) \leq \Pr(\tilde{R}^m_{\delta,q}(\tilde{T})) + m\delta$$
Lemma 5.4 allows $\tilde{T}$ to be an infinite set (because the set of states $Q$ can be infinite). We reduce this case to that in which $\tilde{T}$ is finite. We first recall a basic calculus property: any inequality $a \leq b$ can be proved by establishing instead $a \leq b + \epsilon$ for all $\epsilon > 0$. Then, since the probability distribution of trace fragments of length $m$ is discrete, for any $\epsilon > 0$ we can always take a finite subset of the infinite set $\tilde{T}$ whose probability differs from that of $\tilde{T}$ by less than $\epsilon$. It is then enough to consider the case in which $\tilde{T}$ is finite, as done in the following lemma.
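The reduction to finite sets rests on the fact that a countable family of non-negative probabilities with a finite sum can be approximated by a finite prefix. A minimal sketch, using a hypothetical geometric distribution in place of the distribution over trace fragments:

```python
from fractions import Fraction
from itertools import count

def finite_approximation(prob, total, eps):
    """Return (k, mass) such that the first k outcomes carry mass > total - eps.

    prob(i) is the probability of the i-th element of a countable set whose
    overall probability is `total`; both are hypothetical inputs chosen for
    the illustration.
    """
    mass = Fraction(0)
    for k in count():
        if total - mass < eps:
            return k, mass
        mass += prob(k)

def geometric(i):
    # Probability 1/2, 1/4, 1/8, ... for outcomes 0, 1, 2, ...
    return Fraction(1, 2 ** (i + 1))

k, mass = finite_approximation(geometric, Fraction(1), Fraction(1, 1000))
print(k, mass)
```

The loop terminates for every positive `eps` because the partial sums converge to `total`; this is the only property of discreteness the argument needs.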
Lemma 5.5. If $p \sim^n_\delta q$ and $\tilde{T}$ is a finite set of trace fragments of length $n > 0$ starting from $p$, then:
$$\Pr(\tilde{T}) \leq \Pr(\tilde{R}^n_{\delta,q}(\tilde{T})) + n\delta$$
We prove Lemma 5.5 by induction on $n$. In the inductive step, we partition the traces according to their first move, i.e., on their next state after $p$ (for the trace fragments in $\tilde{T}$) or $q$ (for the bisimilar counterparts). A main challenge here is caused by the probabilities of such moves being weakly connected. Indeed, when $p$ moves to $p'$, we might have several states $q'$, bisimilar to $p'$, such that $q$ moves to $q'$. Worse, when $p$ moves to another state $p''$, we might find that some of the states $q'$ we met before are also bisimilar to $p''$. Such overlaps make it hard to connect the probability of $p$ moves to that of $q$ moves.
To overcome these issues, we exploit the technical lemma below. Let set $A$ represent the $p$ moves, and set $B$ represent the $q$ moves. Then, associate to each set element $a \in A$, $b \in B$ a value ($f_A(a)$, $f_B(b)$ in the lemma) representing the move probability. The lemma ensures that each $f_A(a)$ can be expressed as a weighted sum of $f_B(b)$ for the elements $b$ bisimilar to $a$. Here, the weights $h(a, b)$ make it possible to relate a $p$ move to a "weighted set" of $q$ moves. Furthermore, the lemma ensures that no $b \in B$ has been cumulatively used for more than a unit weight ($\sum_{a \in A} h(a, b) \leq 1$).
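Lemma 5.6. Let $A$ be a finite set and $B$ be a countable set, equipped with functions $f_A : A \to \mathbb{R}_{\geq 0}$, $f_B : B \to \mathbb{R}_{\geq 0}$ and $g : A \to 2^B$. If, for all $A' \subseteq A$:
$$\sum_{a \in A'} f_A(a) \leq \sum_{b \in g(A')} f_B(b) \tag{5.1}$$
then there exists $h : A \times B \to [0, 1]$ such that:
$$\forall b \in B : \sum_{a \in A} h(a, b) \leq 1 \tag{5.2}$$
$$\forall a \in A : f_A(a) = \sum_{b \in g(a)} h(a, b) \, f_B(b) \tag{5.3}$$
Figure 4: Graphical representation of Lemma 5.6 (left) and its proof (right).
We visualize Lemma 5.6 in Figure 4 through an example. The leftmost graph shows a finite set $A = \{a_1, a_2, a_3\}$ where each $a_i$ is equipped with its associated value $f_A(a_i)$ and, similarly, a finite set $B = \{b_1, \ldots, b_4\}$ where each $b_i$ has its own value $f_B(b_i)$. The function $g$ is rendered as the edges of the graph, connecting each $a_i$ with all $b_j \in g(a_i)$.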
The graph satisfies the hypotheses, as one can easily verify. For instance, when $A' = \{a_1, a_2\}$ inequality (5.1) simplifies to $0.3 + 0.5 \leq 0.5 + 0.6$. The thesis ensures the existence of a weight function $h(-, -)$ whose values are shown in the graph on the left over each edge.
The rightmost graph in Figure 4 instead sketches how our proof devises the desired weight function $h$, by constructing a network flow problem, and exploiting the well-known min-cut/max-flow theorem [DF55], following the approach of [Bai98]. We start by adding a source node to the right (white bullet in the figure), connected to nodes in $B$, and a sink node to the left, connected to nodes in $A$. We write the capacity over each edge: we use $f_B(b_i)$ for the edges connected to the source, $f_A(a_i)$ for the edges connected to the sink, and $+\infty$ for the other edges in the middle.
Then, we argue that the leftmost cut $C$ shown in the figure is a min-cut. Intuitively, if we take another cut $C'$ not including some edge in $C$, then $C'$ has to include other edges making $C'$ not any better than $C$. Indeed, $C'$ can surely not include any edge in the middle, since they have $+\infty$ capacity. Therefore, if $C'$ does not include an edge from some $a_i$ to the sink, it has to include all the edges from the source to each $b_j \in g(a_i)$. In this case, hypothesis (5.1) ensures that doing so does not lead to a better cut. Hence, $C$ is indeed a min-cut.
From the max-flow corresponding to the min-cut, we derive the values for $h(-, -)$. Thesis (5.2) follows from the flow conservation law on each $b_j$, and the fact that the incoming flow of each $b_j$ from the source is bounded by the capacity of the related edge. Thesis (5.3) follows from the flow conservation law on each $a_i$, and the fact that the outgoing flow of each $a_i$ to the sink is exactly the capacity of the related edge, since the edge is on a min-cut.
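The flow construction described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the proof's own algorithm: it uses a plain Edmonds-Karp search on finite sets, replaces the $+\infty$ capacities with a finite value larger than any possible flow, and sets $h(a, b)$ to the flow from $b$ to $a$ divided by $f_B(b)$. The function name and the example values of $f_B$ and $g$ are hypothetical (only $0.3$, $0.5$, $0.5$ and $0.6$ appear in the text).

```python
from collections import deque
from fractions import Fraction

def weights_via_max_flow(f_a, f_b, g):
    """Derive h(a, b) from a max flow on source -> B -> A -> sink.

    f_a, f_b map elements to non-negative Fractions; g maps each a to the
    set of elements b it is connected to.
    """
    src, snk = ("source",), ("sink",)
    inf = sum(f_a.values()) + 1          # exceeds any feasible flow
    cap = {}                             # residual capacities

    def add_edge(u, v, c):
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {}).setdefault(u, Fraction(0))

    for b, w in f_b.items():
        add_edge(src, ("b", b), w)
    for a, w in f_a.items():
        add_edge(("a", a), snk, w)
        for b in g[a]:
            add_edge(("b", b), ("a", a), inf)

    while True:                          # Edmonds-Karp: shortest augmenting paths
        parent = {src: None}
        queue = deque([src])
        while queue and snk not in parent:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if snk not in parent:
            break
        path, v = [], snk
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push

    # The flow on edge b -> a equals the residual capacity of the reverse edge.
    return {(a, b): cap[("a", a)][("b", b)] / f_b[b] if f_b[b] > 0 else Fraction(0)
            for a in f_a for b in g[a]}

f_a = {"a1": Fraction(3, 10), "a2": Fraction(5, 10), "a3": Fraction(2, 10)}
f_b = {"b1": Fraction(5, 10), "b2": Fraction(6, 10),
       "b3": Fraction(1, 10), "b4": Fraction(3, 10)}
g = {"a1": {"b1"}, "a2": {"b1", "b2"}, "a3": {"b3", "b4"}}
h = weights_via_max_flow(f_a, f_b, g)
```

When the hypothesis on subsets of $A$ holds, as it does for these values, every edge into the sink is saturated, so each $f_A(a)$ is recovered as the weighted sum of the $f_B(b)$ it draws from, and no $b$ is used for more than a unit weight.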

Asymptotic equivalence
In this section we transport the notion of bisimilarity and the semantics of PCTL to families of states, thus reasoning on their asymptotic behaviours. More precisely, given a state-labelled DTMC $Q$, we define a family of states to be an infinite sequence $\Xi : \mathbb{N} \to Q$. Intuitively, $\Xi(\eta)$ can describe the behaviour of a probabilistic system depending on a security parameter $\eta \in \mathbb{N}$.
When using bisimilarity (Definition 4.1) to relate two given states $Q_1$ and $Q_2$, we have to provide a number of steps $n$ and a probability error $\delta$. By contrast, when relating two families $\Xi_1$ and $\Xi_2$ we want to focus on their asymptotic behaviour, and obtain an equivalence that does not depend on specific values of $n$ and $\delta$. To do so, we start by recalling the standard definition of negligible function:
Definition 6.1 (Negligible Function). A function $f : \mathbb{N} \to \mathbb{R}$ is said to be negligible whenever for every $c \in \mathbb{N}$ there exists $\bar{\eta} \in \mathbb{N}$ such that $|f(\eta)| \leq \eta^{-c}$ for all $\eta \geq \bar{\eta}$.
We say that $\Xi_1$ and $\Xi_2$ are asymptotically equivalent ($\Xi_1 \equiv \Xi_2$) when the families are asymptotically pointwise bisimilar with a negligible error $\delta(\eta)$ whenever $n(\eta)$ is a polynomial.
Definition 6.2 (Asymptotic Equivalence). Given $\Xi_1, \Xi_2 : \mathbb{N} \to Q$, we write $\Xi_1 \equiv \Xi_2$ if and only if for each polynomial $n(-)$ there exist a negligible function $\delta(-)$ and $\bar{\eta} \in \mathbb{N}$ such that for all $\eta \geq \bar{\eta}$ we have $\Xi_1(\eta) \sim^{n(\eta)}_{\delta(\eta)} \Xi_2(\eta)$.
Lemma 6.3. $\equiv$ is an equivalence relation.
We now provide an asymptotic semantics for PCTL, by defining its satisfaction relation $\Xi \models \phi$. As done above, this notion does not depend on specific values for $n$ and $\delta$ (unlike the semantics in Definition 3.4), but instead considers the asymptotic behaviour of the family.
Definition 6.4 (Asymptotic Satisfaction Relation). We write $\Xi \models \phi$ when there exists a polynomial $\bar{n}(-)$ such that for each polynomial $n(-) \geq \bar{n}(-)$ there exist a negligible function $\delta(-)$ and $\bar{\eta} \in \mathbb{N}$ such that for all $\eta \geq \bar{\eta}$ we have $\Xi(\eta) \in [\![\phi]\!]^{n(\eta)}_{\delta(\eta),+1}$.
In the above definition, we only consider polynomials greater than a threshold $\bar{n}(-)$. This is because a family $\Xi$ representing, say, a protocol could require a given (polynomial) number of steps to complete its execution. It is reasonable, for instance, that $\Xi(\eta)$ needs to exchange $\eta^2$ messages over a network to perform its task. In such cases, we still want to make $\Xi$ satisfy a formula $\phi$ stating that the task is eventually completed with high probability. If we quantified over all polynomials $n(-)$, we would also allow choosing small polynomials like $n(\eta) = \eta$ or even $n(\eta) = 1$, which would not provide $\Xi$ enough time to complete. Using a (polynomial) threshold $\bar{n}(-)$, instead, we always provide enough time.
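We now establish the main result of this section, asymptotic soundness, stating that equivalent families of states asymptotically satisfy the same PCTL formulae. The proof relies on our previous soundness Theorem 5.1.
Theorem 6.5 (Asymptotic Soundness). Let $\Xi_1, \Xi_2$ be families of states such that $\Xi_1 \equiv \Xi_2$. For every PCTL formula $\phi$:
$$\Xi_1 \models \phi \implies \Xi_2 \models \phi$$
Proof. Let $k_X = X_{max}(\phi)$ be the maximum X-nesting of $\phi$, and let $k_U = U_{max}(\phi)$ be the maximum U-nesting of $\phi$. Let $\bar{n}_1(-)$ be as in the definition of the hypothesis $\Xi_1 \models \phi$. To prove the thesis $\Xi_2 \models \phi$, we choose $\bar{n}_2(-) = \bar{n}_1(-)$, and we consider an arbitrary $n_2(-) \geq \bar{n}_2(-) = \bar{n}_1(-)$. We can then choose $n_1(-) = n_2(-)$ in the same hypothesis, and obtain a negligible $\delta_1(-)$ and $\bar{\eta}_1$, where for any $\eta \geq \bar{\eta}_1$ we have
$$\Xi_1(\eta) \in [\![\phi]\!]^{n_2(\eta)}_{\delta_1(\eta),+1} \tag{6.1}$$
We now exploit the other hypothesis $\Xi_1 \equiv \Xi_2$, choosing the polynomial
$$\bar{n}(\eta) = n_2(\eta) \cdot k_U + k_X + 1 \tag{6.2}$$
and obtaining a negligible $\delta(-)$ and $\bar{\eta}$ where for any $\eta \geq \bar{\eta}$ we have
$$\Xi_1(\eta) \sim^{\bar{n}(\eta)}_{\delta(\eta)} \Xi_2(\eta) \tag{6.3}$$
To prove the thesis, we finally choose the function $\delta_2(\eta) = \bar{n}(\eta) \cdot \delta(\eta) + \delta_1(\eta)$, which is negligible since the product of a polynomial and a negligible function is negligible, and $\bar{\eta}_2 = \max(\bar{\eta}_1, \bar{\eta})$. By Theorem 5.1 we have that for any $\eta \geq \bar{\eta}_2$:
$$\sim^{\bar{n}(\eta)}_{\delta(\eta)}\big([\![\phi]\!]^{n_2(\eta)}_{\delta_1(\eta),+1}\big) \subseteq [\![\phi]\!]^{n_2(\eta)}_{\delta_2(\eta),+1}$$
where $\bar{n}(\eta)$ is as in (6.2). Applying this to (6.1) and (6.3) we then have that, for any $\eta \geq \bar{\eta}_2$:
$$\Xi_2(\eta) \in [\![\phi]\!]^{n_2(\eta)}_{\delta_2(\eta),+1}$$
Example 6.6. We now return to the padlock examples 3.6 and 4.4. We again consider an ideal padlock modelled by a state $q_{ok}$, but also a sequence of padlocks having an increasing number of digits $\eta$, hence an increasing number $N = 10^\eta$ of combinations. We assume that state $q_{i,10^\eta}$ models the state of a padlock having $\eta$ digits where the adversary has already made $i$ brute force attempts, following the same strategy as in the previous examples. The transition probabilities are also similarly defined.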
In this scenario, we can define two state families. Family $\Xi_1(\eta) = q_{ok}$ represents a (constant) sequence of ideal padlocks, while family $\Xi_2(\eta) = q_{0,10^\eta}$ represents a sequence of realistic padlocks with no previous brute force attempt ($i = 0$), in increasing order of robustness. Indeed, as $\eta$ increases, the padlock becomes harder to break by brute force since the number of combinations $N = 10^\eta$ grows.
In Example 4.4, we have seen that $q_{ok} \sim^{n(\eta)}_{\delta(\eta)} q_{0,10^\eta}$ where $\delta(\eta) = 1/(10^\eta - n(\eta) + 2)$, and we can observe that the above $\delta(\eta)$ is indeed negligible when $n(\eta)$ is a polynomial. This means that $\Xi_1 \equiv \Xi_2$ holds, hence we can apply Theorem 6.5 and conclude that the families $\Xi_1$ and $\Xi_2$ asymptotically satisfy the same PCTL formulae. This is intuitive since, when the adversary can only attempt a polynomial number of brute force attacks, and when the number of combinations increases exponentially, the robustness of the realistic padlocks effectively approaches that of the ideal one.
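A quick numerical look at the padlock family makes the negligibility claim concrete. The sketch assumes the error of Example 4.4 in the form recalled in Example 5.2, namely $1/(N - 0 - n + 2)$ with $N = 10^\eta$; the polynomial $n(\eta) = \eta^2 + 1$ and the function names are hypothetical choices for illustration.

```python
from fractions import Fraction

def delta(eta, n):
    """Bisimilarity error for the eta-digit padlock observed for n(eta) steps.

    Assumes the form 1 / (N - n + 2) with N = 10**eta and no previous
    brute force attempts (i = 0).
    """
    N = 10 ** eta
    denominator = N - n(eta) + 2
    if denominator <= 0:
        raise ValueError("more steps than the assumed formula supports")
    return Fraction(1, denominator)

def n(eta):
    # Hypothetical polynomial number of observed steps.
    return eta ** 2 + 1

for eta in (2, 4, 8, 16):
    d = delta(eta, n)
    # For a negligible function, d * eta**c eventually drops below 1 for any c.
    print(eta, float(d), float(d * eta ** 3))
```

The exponential growth of $10^\eta$ dominates any polynomial number of steps, which is why the scaled error keeps shrinking as $\eta$ grows.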
We now discuss how Theorem 6.5 could be applied to a broad class of systems. Consider the execution of an ideal cryptographic protocol, modelled as a DTMC starting from the initial state $q_i$. This model could represent, for instance, the semantics of a formal, symbolic system such as those that can be expressed using process algebras. In this scenario, the underlying cryptographic primitives can be perfect, in the sense that ciphertexts reveal no information to those who do not know the decryption key, signatures can never be forged, hash preimages can never be found, and so on, regardless of the amount of computational resources available to the adversary.
Given such a model, it is then possible to refine it making the cryptographic primitives more realistic, allowing an adversary to attempt attacks such as decryptions and signature forgeries, which however succeed only with negligible probability w.r.t. a security parameter $\eta$. This more realistic system can be modelled using a distinct DTMC state $q^\eta_r$ whose behaviour is similar to that of $q_i$: the state transitions are essentially the same, except for the cases in which the adversary is successful in attacking the cryptographic primitives. Therefore, the transition probabilities are almost the same, differing only by a negligible quantity.
Hence, we can let $\Xi_1(\eta) = q_i$ and $\Xi_2(\eta) = q^\eta_r$, and observe that they are indeed asymptotically equivalent. Note that this holds in general by construction, no matter what the behaviour of the ideal system $q_i$ we started from is.
Finally, by Theorem 6.5 we can claim that both families $\Xi_1, \Xi_2$ asymptotically satisfy the same PCTL formulae. This makes it possible, in general, to prove properties on the simpler $q_i$ system, possibly even using some automatic verification tools, and transfer these results to the more realistic setting $q^\eta_r$. A special case of this fact was originally studied in [ZD05], which however only considered reachability properties. By comparison, Theorem 6.5 is much more general, allowing one to transfer any property that can be expressed using a PCTL formula.
The construction above allows one to refine an ideal system $q_i$ into a more realistic one $q^\eta_r$ by taking certain adversaries into account. However, if our goal were to study the security of the system against all reasonable adversaries, then the above approach would not be applicable. Indeed, it is easy to find an ideal system and a corresponding realistic refinement, comprising a reasonable adversary, where the asymptotic equivalence does not hold. For instance, consider an ideal protocol where Alice and Bob exchange ten messages, after which Alice randomly chooses and exchanges a single bit. To assess the security of a realistic implementation, we might want to consider the case where Alice is an adversary. In such case, a malicious Alice could exchange the first two messages, then flip a coin $b \leftarrow \{0, 1\}$ in secret, exchange the other eight messages, and finally send the value $b$. The behaviour of such realistic system differs from the ideal one, since the ideal one has a probabilistic choice point only at the end, while the realistic system anticipates it after the first two moves. It is easy to check (and well known) that moving choices to an earlier point makes standard bisimilarity fail, and this is the case also for asymptotic equivalence. The failure of asymptotic equivalence prevents us from applying the asymptotic soundness theorem. In particular, assume that we have proved that the ideal system enjoys certain specifications expressed as PCTL formulae. We cannot exploit the theorem to show that the realistic system with the adversary also enjoys the same specifications.

Conclusions
In this paper we studied how the (relaxed) semantics of PCTL formulae interacts with (approximate) probabilistic bisimulation. In the regular, non-relaxed case, it is well-known that when a state $q$ satisfies a PCTL formula $\phi$, then all the states that are probabilistic-bisimilar to $q$ also satisfy $\phi$ ([DGJP10]). Theorem 5.1 extends this to the relaxed semantics, establishing that when a state $q$ satisfies a PCTL formula $\phi$ up-to $n$ steps and error $\delta'$, then all the states that are approximately probabilistic bisimilar to $q$ with error $\delta$ (and enough steps) also satisfy $\phi$ up-to $n$ steps and suitably increased error. We provide a way to compute the new error in terms of $n$, $\delta$, $\delta'$. Theorem 6.5 extends such soundness result to the asymptotic behaviour where the error becomes negligible when the number of steps is polynomially bounded.
Our results are a first step towards a novel approach to the security analysis of cryptographic protocols using probabilistic bisimulations. When one is able to prove that a real-world specification of a cryptographic protocol is asymptotically equivalent to an ideal one, then one can invoke Theorem 6.5 and claim that the two models satisfy the same PCTL formulae, essentially reducing the security proof of the cryptographic protocol to verifying the ideal model. A relevant line for future work is to study the applicability of our theory in this setting. As discussed in Section 6, our approach is not applicable to all protocols and all adversaries. A relevant line of research could be the study of larger asymptotic equivalences, which allow one to transfer properties from ideal to realistic systems. This could be achieved, e.g., by considering weaker logics than PCTL, or moving to linear temporal logics.
Another possible line of research would be investigating proof techniques for establishing approximate bisimilarity and refinement [JL91], as well as devising algorithms for approximate bisimilarity, along the lines of [vBW14, CvBW12, Fu12, TvB16, TvB17, TvB18]. This direction, however, would require restricting our theory to finite-state systems, which contrasts with our general motivation coming from cryptographic security. Indeed, in the analysis of cryptographic protocols, security is usually to be proven against an arbitrary adversary, hence also against infinite-state ones. Hence, model-checking of finite-state systems would not directly be applicable in this setting.

Figure 3: A Markov chain modelling an unfair random generator of bit streams.