Analysis of Timed and Long-Run Objectives for Markov Automata

Markov automata (MAs) extend labelled transition systems with random delays and probabilistic branching. Action-labelled transitions are instantaneous and yield a distribution over states, whereas timed transitions impose a random delay governed by an exponential distribution. MAs are thus a nondeterministic variation of continuous-time Markov chains. MAs are compositional and are used to provide a semantics for engineering frameworks such as (dynamic) fault trees, (generalised) stochastic Petri nets, and the Architecture Analysis & Design Language (AADL). This paper considers the quantitative analysis of MAs. We consider three objectives: expected time, long-run average, and timed (interval) reachability. Expected time objectives focus on determining the minimal (or maximal) expected time to reach a set of states. Long-run objectives determine the fraction of time spent in a set of states when considering an infinite time horizon. Timed reachability objectives are about computing the probability of reaching a set of states within a given time interval. This paper presents the foundations and details of the algorithms and their correctness proofs. We report on several case studies conducted using a prototypical tool implementation of the algorithms, driven by the MAPA modelling language for efficiently generating MAs.


Introduction
Markov automata (MAs, for short) have been introduced in [16] as a continuous-time version of Segala's probabilistic automata [31]. Closed under operators such as parallel composition and hiding, they provide a compositional formalism for concurrent soft real-time systems. A transition in an MA is either labelled with a positive real number representing the rate of a negative exponential distribution, or with an action. An action transition leads to a discrete probability distribution over states. MAs can thus model action transitions as in labelled transition systems, probabilistic branching as found in (discrete-time) Markov chains and Markov decision processes, as well as delays that are governed by exponential distributions as in continuous-time Markov chains.
The semantics of MAs has been recently investigated in quite some detail. Weak and strong (bi)simulation semantics have been presented in [16,15], whereas it is shown in [13] that weak bisimulation provides a sound and complete proof methodology for reduction barbed congruence. A process algebra with data for the efficient modelling of MAs, accompanied by some reduction techniques using static analysis, has been presented in [35], and model checking of MAs against Continuous Stochastic Logic (CSL) is discussed in [21]. Although the MA model raises several challenging theoretical issues, both from a semantical and from an analytical point of view, our main interest is in their practical applicability. As MAs extend Hermanns' interactive Markov chains (IMCs) [23], they inherit IMC application domains, ranging from GALS hardware designs [9] and dynamic fault trees [6] to the standardised modelling language AADL [7,22]. The additional feature of probabilistic branching yields additional expressivity and thereby further enriches the spectrum of application contexts. This expressivity also makes them a natural semantic model for other formalisms. Among others, MAs are expressive enough to provide a natural operational model for generalised stochastic Petri nets (GSPNs) [2] and stochastic activity networks (SANs) [27], both popular modelling formalisms for performance and dependability analysis. Let us briefly motivate this by considering GSPNs. Whereas in SPNs all transitions are subject to an exponentially distributed delay, GSPNs also incorporate immediate transitions, i.e., transitions that happen instantaneously. The traditional GSPN semantics yields a continuous-time Markov chain (CTMC), i.e., an MA without action transitions. However, that semantics is restricted to a subclass of GSPNs, namely those that are confusion free. Confusion [1] is related to the presence of nondeterminism.
Confused GSPNs are traditionally considered semantically ambiguous and thus precluded from any kind of analysis. This gap is particularly disturbing because several published semantics for higher-level modelling formalisms (e.g., UML, AADL, WSDL) map onto GSPNs without ensuring the mapping to be free of confusion, therefore possibly inducing confused models.
It has recently been detailed in [24,14] that MAs are a natural semantic model for every GSPN. To give some intuitive insight into this achievement, consider the GSPN in Fig. 1(a). This net is confused: in Petri net jargon, the transitions t1 and t2 are not in conflict, but firing transition t1 leads to a conflict between t2 and t3, which does not occur if t2 fires before t1. Though decisive, the firing order between t1 and t2 is not determined. Transitions t2 and t3 are weighted, so that in a marking {p2, p3} in which both transitions are enabled, t2 fires with probability w2/(w2+w3) and t3 with the complement probability. The weight of transition t1 is not relevant; we assume t1 is not equipped with a weight. Classical GSPN semantics and analysis algorithms cannot cope with this net due to the presence of confusion (i.e., nondeterminism). Figure 1(b) depicts the MA semantics of this net. Here, states correspond to sets of net places that contain a token. In the initial state, there is a nondeterministic choice between the transitions t1 and t2. Note that the presence of weights is naturally represented by discrete probabilistic branching, as reflected in the outgoing transition of state {p2, p3}. One can show that the MA semantics conservatively extends the classical semantics, in the sense that the two are weakly bisimilar [14] on confusion-free GSPNs. Thus, if transition t1 in our example is assigned some weight w1, the GSPN has no confusion. This would be reflected in the MA semantics by replacing the nondeterministic branching in state {p1, p2} by a single probabilistic transition, yielding state {p2, p3} with probability w1/(w1+w2) and state {p1, p5} with the complement probability. This paper focuses on the quantitative analysis of MAs, and thus implicitly of (possibly confused) GSPNs, of AADL specifications containing error models, and so on.
We present analysis algorithms for three objectives: expected time, long-run average, and timed (interval) reachability. As the model exhibits nondeterminism, we focus on maximal and minimal values for all three objectives. First, we show that expected-time and long-run average objectives can be efficiently reduced to well-known problems on MDPs, such as stochastic shortest path, maximal end-component decomposition, and long-run ratio objectives. This generalises (and slightly improves) the results reported in [18] for IMCs to MAs. Second, we present a discretisation algorithm for timed interval reachability objectives which extends [38]. Finally, we present the MaMa tool chain, an easily accessible, publicly available tool chain¹ for the specification, mechanised simplification (such as confluence reduction [36], a form of on-the-fly partial-order reduction), and quantitative evaluation of MAs. We describe the overall architectural design as well as the tool components, and report on empirical results obtained with MaMa on a selection of case studies taken from different domains. The experiments give insight into the effectiveness of the reduction techniques in MaMa and demonstrate that MAs provide the basis of a very expressive stochastic timed modelling approach without sacrificing time- and memory-efficient numerical evaluation.
Organisation of the paper. We introduce Markov automata in Section 2. Section 3 considers the evaluation of expected-time properties. Section 4 discusses the analysis of long-run properties, and Section 5 focuses on timed reachability properties with time-interval bounds.
Implementation details of our tool, a compositional modelling formalism, as well as experimental results are discussed in detail in Section 6. Section 7 concludes the paper. We provide the proofs for our main results in the appendix.

¹ Stand-alone download as well as web-based interface available from http://fmt.cs.utwente.nl/~timmer/mama.

Preliminaries
2.1. Markov automata. An MA is a transition system with two types of transitions: probabilistic transitions (as in PAs) and Markovian transitions (as in CTMCs). Let Act be a countable universe of actions containing the internal action τ, and let Distr(S) denote the set of discrete probability distribution functions over the countable set S. Let α, β range over Act and µ, ν over Distr(S). Actions such as α can be used for interaction with other MAs [16]. This does not apply to the internal action τ, which is executed autonomously.
An MA is a tuple M = (S, A, −→, =⇒, s0), where S is a nonempty, finite set of states with initial state s0 ∈ S, A ⊆ Act is a finite set of actions with τ ∈ A, −→ ⊆ S × A × Distr(S) is the probabilistic transition relation, and =⇒ ⊆ S × R>0 × S is the Markovian transition relation. We abbreviate (s, α, µ) ∈ −→ by s −α→ µ and (s, λ, s′) ∈ =⇒ by s =λ⇒ s′. An MA can evolve via its probabilistic and Markovian transitions. If s −α→ µ, it can leave state s by executing the action α, after which the probability of moving to some state s′ ∈ S is given by µ(s′). If s =λ⇒ s′ is the only transition emanating from s, a state transition from s to s′ can occur after an exponentially distributed delay with rate λ; that is, the expected delay from s to s′ is 1/λ. If s =λ⇒ s′ and s −τ→ µ for some µ, however, the τ-transition is always taken and never the Markovian one. This is the maximal progress assumption [16]. The rationale behind this assumption is that internal (i.e., τ-labelled) transitions are not subject to interaction and thus can happen immediately, whereas the probability of a Markovian transition firing immediately is zero. Thus, s =λ⇒ s′ almost never fires instantaneously. Note that the maximal progress assumption does not apply in case s =λ⇒ s′ and s −α→ µ with α ≠ τ, as α-transitions, unlike τ-transitions, can be used for synchronisation and may thus be subject to a delay. In this case, the transition s =λ⇒ s′ may happen with positive probability. The semantics of several Markovian transitions in a state is as follows. For a state with one or more Markovian transitions, let R(s, s′) = Σ{λ | s =λ⇒ s′} be the total rate of moving from state s to state s′, and let E(s) = Σ_{s′∈S} R(s, s′) be the total outgoing rate of s. If s has more than one outgoing Markovian transition, a competition (race) between its Markovian transitions exists.
Then the probability of moving from s to state s′ within d time units is P(s, s′) · (1 − e^(−E(s)·d)): after a delay of at most d time units (the second factor) in state s, the MA moves to a direct successor state s′ with discrete branching probability P(s, s′) = R(s, s′)/E(s) (the first factor). Note that also in this case the maximal progress assumption applies: if s −τ→ µ and s has several Markovian transitions, only the τ-transition can occur and no delay occurs in s. The behaviour of an MA in states with only Markovian transitions is thus the same as in CTMCs [3]. As an example, consider the queueing system (taken from [35]) consisting of a server and two stations. Each state is represented as a tuple (s1, s2, j), with si the number of jobs in station i, and j the number of jobs in the server. The two stations have incoming requests with rates λ1, λ2, which are stored until fetched by the server. If both stations contain a job, the server chooses nondeterministically (in state (1,1,0)). Jobs are processed with rate µ, and when polling a station, with probability 1/10 the job is erroneously kept in the station after being fetched. For simplicity, we assume that each component can hold at most one job.
If multiple τ-transitions emanate from a state, a nondeterministic choice between these transitions exists.
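To illustrate the race semantics, the following Python sketch computes the accumulated rates R(s, s′), the exit rate E(s), the branching probabilities P(s, s′), and the expected sojourn time 1/E(s). The state names and rates are purely illustrative and not taken from the running example:

```python
from collections import defaultdict

# Hypothetical representation: Markovian transitions as (source, rate, target)
# triples, one triple per transition s =λ=> s'.
markovian = [
    ("s0", 2.0, "s1"),
    ("s0", 1.0, "s2"),
    ("s0", 1.0, "s1"),   # a second transition to s1: its rate accumulates
]

def total_rates(transitions):
    """R(s, s') = sum of the rates of all Markovian transitions from s to s'."""
    R = defaultdict(float)
    for s, lam, t in transitions:
        R[(s, t)] += lam
    return R

def exit_rate(R, s):
    """E(s) = sum over all s' of R(s, s')."""
    return sum(r for (src, _), r in R.items() if src == s)

R = total_rates(markovian)
E = exit_rate(R, "s0")                              # total outgoing rate of s0
P = {t: R[("s0", t)] / E for t in ("s1", "s2")}     # branching probabilities
sojourn = 1.0 / E                                   # expected sojourn time in s0
```

Here s0 has exit rate 4.0, branches to s1 with probability 3/4 and to s2 with probability 1/4, and is left after an expected sojourn of 0.25 time units.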

2.2. Actions. Actions different from τ can be used to compose MAs from smaller MAs via parallel composition. For instance, M1 ||_H M2 denotes the parallel composition of MAs M1 and M2 in which actions in the set H ⊆ Act with τ ∉ H need to be executed by both MAs simultaneously, whereas actions not in H are performed autonomously by Mi. In this paper, we do not cover the details of this composition operation (see [16]); it suffices to understand that the distinction between τ and α ≠ τ is relevant when composing MAs from component MAs. We assume in the sequel that the MAs to be analysed are single, monolithic MAs. These MAs are not subject to any interaction with other MAs. Hence, we assume that all transitions are labelled by τ-actions. (This amounts to the assumption that, prior to the analysis, all actions needed to compose several MAs are explicitly turned into internal actions by hiding.) Due to the maximal progress assumption, the outgoing transitions of each state are either all probabilistic or all Markovian. We can therefore partition the states into a set of probabilistic states, denoted PS ⊆ S, and a set of Markovian states, denoted MS ⊆ S. We denote the set of enabled actions in s by Act(s), where Act(s) = {α ∈ A | ∃µ ∈ Distr(S) . s −α→ µ} if s ∈ PS, and Act(s) = {⊥} otherwise.

2.3. Paths. A path in an MA is an infinite sequence π = s0 −σ0,µ0,t0→ s1 −σ1,µ1,t1→ s2 ... of states si, labels σi ∈ Act ∪ {⊥}, distributions µi, and sojourn times ti. In case σi ∈ Act, si −σi,µi,ti→ si+1 denotes that after residing ti = 0 time units in si, the MA moved via action σi to si+1 with probability µi(si+1). In case σi = ⊥, si −⊥,µi,ti→ si+1 denotes that after residing ti time units in si, a Markovian transition led to si+1 with probability µi(si+1) = P(si, si+1). For t ∈ R≥0, let π@t denote the sequence of states that π occupies at time t. Due to instantaneous probabilistic transitions, π@t is a sequence of states, as an MA may occupy several states at the same time instant. Let Paths denote the set of infinite paths and Paths* the set of finite prefixes thereof (called finite paths). The time elapsed along the infinite path π is given by Σ_{i=0}^{∞} ti. Path π is Zeno whenever this sum converges. As the probability of a Zeno path in an MA that contains only Markovian transitions is zero [3, Prop. 1], an MA is non-Zeno if and only if no SCC consisting solely of probabilistic states is reachable (with positive probability). As such an SCC contains no Markovian transitions, it can be traversed infinitely often without any passage of time. In the rest of this paper, we assume MAs to be non-Zeno.

2.4. Policies. Nondeterminism occurs when there is more than one probabilistic transition emanating from a state. To define a probability space over sets of infinite paths, we adopt the approach for MDPs [30] and resolve the nondeterminism by a policy. A policy is a function that yields, for each finite path ending in state s, a probability distribution over the set of transitions enabled in s.
Formally, a policy is a function D : Paths * → Distr((Act ∪ {⊥}) × Distr(S)). Of course, policies should only choose from available transitions, so we require for each path π ending in a state s n that D(π)(α, µ) > 0 implies s n α − − → µ and D(π)(⊥, µ) > 0 implies that s n is Markovian and µ = P(s n , ·). Let GM (generic measurable) denote the most general class of such policies that are still measurable; see [28] for details on measurability. In general, a policy randomly picks an enabled action and probability distribution in the final state of a given path. This is also known as a history-dependent randomised policy. If a policy always selects an action and probability distribution according to a Dirac distribution, it is called a deterministic policy. Policies are also classified based on the level of information they use for the resolution of nondeterminism. In the most general setting, a policy may use all information in a finite path, e.g., the states along the path, their ordering in the path, the amount of time spent in each state, and so forth. A stationary policy only bases its decision on the current state, and not on anything else. That is, D is stationary whenever D(π 1 ) = D(π 2 ) for any finite paths π 1 and π 2 that have the same last state. A stationary deterministic policy can be viewed as a function D : PS → Act × Distr(S) that maps each probabilistic state s to an action α ∈ Act and probability distribution µ ∈ Distr(S) such that s α − − → µ; such policies always take the same decision every time they are in the same state. A time-abstract policy resolves nondeterminism based on the alternating sequence of states and transitions visited so far, but not on the state residence times. Let TA denote the set of time-abstract policies. For more details on different classes of policies (and their relationship) on models such as MAs, we refer to [28]. 
Like for MDPs [30], a stationary or time-abstract policy on an MA induces a countable stochastic process that is equivalent to a (continuous-time) Markov chain. Using a standard cylinder-set construction on infinite paths in such Markov chains [3] we obtain a σ-algebra of subsets of Paths; given a policy D and an initial state s, a measurable set of paths is equipped with probability measure Pr s,D .
To ease the development of the theory, and without loss of generality, we assume that each internal action induces a unique probability distribution. Note that this is no restriction: if multiple τ-transitions emerge from a state s ∈ PS, we may replace their labels by distinct internal actions τ1 to τn, where n is the out-degree of s.

2.5. Stochastic shortest path (SSP) problems. As some objectives on MAs can be reduced to SSP problems, we briefly introduce them. An MDP is a tuple (S, A, P, s0) where S is a finite set of states, A ⊆ Act is a set of actions, P : S × A × S → [0, 1] is a transition probability function such that Σ_{s′∈S} P(s, α, s′) ∈ {0, 1} for each state s and action α, and s0 ∈ S is the initial state. It is assumed that in each state at least one action is enabled, i.e., for each s there is some α with P(s, α, s′) > 0 for some s′. A non-negative SSP problem is a tuple (S, A, P, s0, G, c, g) where the first four elements constitute its underlying MDP, accompanied by a set G ⊆ S of goal states, a cost function c : (S \ G) × A → R≥0, and a terminal cost function g : G → R≥0. A path through an MDP is an alternating sequence π = s0 −α0→ s1 −α1→ s2 ... of states and actions. The accumulated cost along a path π through the MDP before reaching G, denoted C_G(π), is Σ_{j=0}^{k−1} c(sj, αj) + g(sk), where k is the index of the first state in G along π. If π does not reach G, then C_G(π) equals ∞. As is standard for MDPs [30], nondeterminism between different actions in a state is resolved using policies; analogously to the notion for MAs, a stationary deterministic policy is a function D : S → Act. Let cR^min(s, ♦G) denote the minimum expected cost of reaching G in the SSP (over all policies) when starting from s. It is a well-known result that stationary policies suffice to achieve cR^min(s, ♦G). This expected cost can be obtained by solving an LP (linear programming) problem [5].
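To illustrate, the following Python sketch solves a toy non-negative SSP by value iteration on the standard Bellman equation v(s) = min_α [c(s, α) + Σ_{s′} P(s, α, s′) · v(s′)], with v(s) = g(s) on goal states. The model, action names, and costs are purely illustrative:

```python
# A toy non-negative SSP solved by value iteration. State "a" can reach the
# goal "g" either via a deterministic but expensive action ("fast", cost 4)
# or via a cheap action that succeeds with probability 0.5 ("slow", cost 1).
GOAL = "g"
# P[s][a] = list of (successor, probability); c[s][a] = cost of a in s
P = {
    "a": {"fast": [("g", 1.0)],
          "slow": [("a", 0.5), ("g", 0.5)]},
}
c = {"a": {"fast": 4.0, "slow": 1.0}}
terminal = {"g": 0.0}   # terminal cost function g

def min_expected_cost(P, c, terminal, eps=1e-10):
    """Iterate v(s) := min_a [c(s,a) + sum_s' P(s,a,s') v(s')] to a fixpoint."""
    v = {s: 0.0 for s in P}
    v.update(terminal)
    while True:
        delta = 0.0
        for s in P:
            best = min(c[s][a] + sum(p * v[t] for t, p in succ)
                       for a, succ in P[s].items())
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < eps:
            return v

v = min_expected_cost(P, c, terminal)
# "slow" yields expected cost 1/0.5 = 2 < 4, so the minimum in "a" is 2
```

For this model the fixpoint can be checked by hand: v(a) = 1 + 0.5·v(a) under "slow", hence v(a) = 2, which beats the cost 4 of "fast".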

Expected time objectives
Let M be an MA with state space S and G ⊆ S a set of goal states. Define the (extended) random variable V_G : Paths → R^∞_{≥0} as the elapsed time before first visiting some state in G. That is, for an infinite path π, let V_G(π) = min{ t ∈ R≥0 | G ∩ π@t ≠ ∅ }, where min ∅ = ∞. (With slight abuse of notation, we use π@t as the set of states occurring in the sequence π@t.) The minimal expected time to reach G from s ∈ S is defined by eT^min(s, ♦G) = inf_D E_{s,D}(V_G), where D ranges over the generic measurable policies on M. (In the sequel, we treat eT^min as a function indexed by G.) Note that by definition of V_G, only the amount of time before entering the first G-state is relevant. Hence, we may make all G-states absorbing without affecting the expected time reachability. This is done by replacing all of their emanating transitions by a single Markovian self-loop (a Markovian transition from the state to itself) with an arbitrary rate. In the remainder of this section, we assume all goal states to be absorbing. Let µ_s^α be the distribution such that s −α→ µ_s^α. As we assume that all action labels of the transitions emanating from a state are unique (by numbering them), this distribution is unique.

Theorem 3.1. The function eT^min is a fixpoint of the Bellman operator L, given by

    [L(v)](s) = 0                                               if s ∈ G,
    [L(v)](s) = 1/E(s) + Σ_{s′∈S} P(s, s′) · v(s′)              if s ∈ MS \ G,
    [L(v)](s) = min_{α ∈ Act(s)} Σ_{s′∈S} µ_s^α(s′) · v(s′)     if s ∈ PS \ G,

where Act(s) = {τi | s −τi→ µ} and µ_s^α ∈ Distr(S) is as defined above. We will later see that eT^min is in fact the unique fixpoint of the Bellman operator. Let us explain the above result. For a goal state, the expected time is obviously zero. For a Markovian state s ∉ G, the minimal expected time to reach some state in G is the expected sojourn time in s (which equals 1/E(s)) plus the expected time to reach some state in G via one of its successor states. For a probabilistic state, an action is selected that minimises the expected time according to the distribution µ_s^α corresponding to α in state s. The characterisation of eT^min(s, ♦G) in Thm. 3.1 allows us to reduce the problem of computing the minimum expected time reachability in an MA to a non-negative SSP problem [5,12]. This goes as follows.
The SSP has the same state space and goal states as M. Terminal costs are zero, i.e., g(s) = 0 for all s ∈ G. Transition probabilities are defined in the standard way: P(s, ⊥, s′) = R(s, s′)/E(s) for a Markovian state s, and P(s, α, s′) = µ_s^α(s′) for a probabilistic state s. The cost of a Markovian state s ∉ G is its expected sojourn time, c(s, ⊥) = 1/E(s), whereas the cost of a probabilistic state is zero.
Thus there is a stationary deterministic policy on M that attains eT^min(s, ♦G). Moreover, the uniqueness of the minimum expected cost of an SSP [5,12] now yields that eT^min(s, ♦G) is the unique fixpoint of L (see Thm. 3.1). This follows from the fact that the Bellman operator defined in Thm. 3.1 equals the Bellman operator for cR^min(s, ♦G). The uniqueness result enables the use of standard solution techniques, such as value iteration and linear programming, to compute eT^min(s, ♦G). For maximum expected time objectives, a similar fixpoint theorem is obtained, and it can be proven that those objectives correspond to the maximal expected cost in the SSP problem defined above. Thus far, we have assumed MAs to be non-Zeno, i.e., they do not contain a reachable cycle consisting solely of probabilistic transitions. However, the above notions can all be extended to deal with such Zeno cycles by, e.g., setting the minimal expected time of states in Zeno BSCCs that do not contain G-states to infinity (as such states cannot reach G). Similarly, the maximal expected time of states in Zeno end components (that do not contain G-states) can be defined as infinity, as in the worst case these states will never reach G.
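To illustrate the reduction, the following Python sketch runs value iteration with the Bellman operator of Thm. 3.1 on a small hypothetical MA: one Markovian state m with exit rate 2, one probabilistic state p with two τ-actions, and an absorbing goal state g. The model is illustrative and not taken from the paper:

```python
# Value iteration for eT_min on a tiny MA. Markovian states contribute their
# expected sojourn time 1/E(s); probabilistic states minimise over actions.
GOAL = {"g"}
markovian = {"m": (2.0, {"p": 1.0})}            # E(m) = 2, P(m, p) = 1
probabilistic = {"p": {"tau1": {"m": 1.0},       # tau1 loops back to m
                       "tau2": {"g": 0.5, "m": 0.5}}}

def expected_time_min(markovian, probabilistic, goal, iters=200):
    """Iterate the Bellman operator of Thm. 3.1 (Gauss-Seidel style)."""
    v = {s: 0.0 for s in list(markovian) + list(probabilistic)}
    v.update({s: 0.0 for s in goal})             # eT_min is 0 on goal states
    for _ in range(iters):
        for s, (E, succ) in markovian.items():
            if s not in goal:
                v[s] = 1.0 / E + sum(p * v[t] for t, p in succ.items())
        for s, actions in probabilistic.items():
            if s not in goal:
                v[s] = min(sum(p * v[t] for t, p in mu.items())
                           for mu in actions.values())
    return v

v = expected_time_min(markovian, probabilistic, GOAL)
# By hand: eT(m) = 1/2 + eT(p) and eT(p) = min(eT(m), 0.5*eT(m)) = 0.5*eT(m),
# hence eT(m) = 1 and eT(p) = 0.5.
```

The iteration converges geometrically here, matching the hand-computed fixpoint eT(m) = 1 and eT(p) = 0.5.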

Long-run objectives
Let M be an MA with state space S and G ⊆ S a set of goal states. Let 1_G be the characteristic function of G on finite sequences, i.e., 1_G(π) = 1 if and only if s ∈ G for some state s in π. Following the ideas of [11,26], the fraction of time spent in G on an infinite path π in M up to time bound t ∈ R≥0 is given by the random variable A_{G,t}(π) = (1/t) · ∫_0^t 1_G(π@u) du. Taking the limit t → ∞, we obtain the random variable A_G(π) = lim_{t→∞} A_{G,t}(π). The expectation of A_G for policy D and initial state s yields the corresponding long-run average time spent in G: LRA^D(s, G) = E_{s,D}(A_G).

The minimum long-run average time spent in G starting from state s is then LRA^min(s, G) = inf_D E_{s,D}(A_G), where D ranges over the generic measurable policies on M. Note that 1_G(π@u) = 1 if and only if π@u is a sequence containing at least one state in G. For the long-run average analysis, we assume w.l.o.g. that G ⊆ MS, as the long-run average time spent in any probabilistic state is always 0. This claim follows directly from the fact that probabilistic states are instantaneous, i.e., their sojourn time is 0 by definition. Note that, in contrast to the expected time analysis, G-states cannot be made absorbing in the long-run average analysis.
First we need the notion of maximal end components. An end component of an MA is a sub-MA that is strongly connected and closed under Markovian transitions and under the probabilistic branching of the selected actions; a maximal end component (MEC) is an end component that is contained in no larger end component.
In the remainder of this section, we discuss in detail how to compute the minimum long-run average fraction of time spent in G in an MA M with initial state s0. The general idea is the following three-step procedure:
(1) determine the maximal end components {M1, ..., Mk} of M;
(2) determine LRA^min(G) within each maximal end component Mj;
(3) reduce the computation of LRA^min(s0, G) in M to a stochastic shortest path problem.
The first phase can be performed by a graph-based algorithm [10,8], whereas the last two phases boil down to solving (distinct) LP problems.

4.1. Unichain MA. We first show that for unichain MAs, computing LRA^min(s, G) can be reduced to determining long-run ratio objectives in MDPs. The notion of unichain is standard in MDPs [30] and is adopted to MAs in a straightforward manner. An MA is unichain if for any stationary deterministic policy the induced stochastic process consists of a single ergodic class plus a possibly non-empty set of transient states². Let us first explain long-run ratio objectives. Let M = (S, A, P, s0) be an MDP. Assume w.l.o.g. that for each s ∈ S there exists α ∈ A such that P(s, α, s′) > 0 for some s′ ∈ S. Let c1, c2 : S × A → R≥0 be cost functions. The operational interpretation is that a cost c1(s, α) is incurred when selecting action α in state s, and similarly for c2. Our interest is the ratio between c1 and c2 along a path. The long-run ratio R between the accumulated costs c1 and c2 along the infinite path π = s0 −α0→ s1 −α1→ ... is defined by

    R(π) = lim_{n→∞} (Σ_{j=0}^{n−1} c1(sj, αj)) / (Σ_{j=0}^{n−1} c2(sj, αj)).

The minimum long-run ratio objective for state s of MDP M is defined by R^min(s) = inf_D E_{s,D}(R).

² State s is transient if and only if the probability of the set of paths that start from s but never return to it is positive; otherwise it is recurrent. An MA is ergodic if for all stationary deterministic policies the induced stochastic process consists of a single recurrent class.
Here, Paths is the set of paths in the MDP, D is a stationary deterministic MDP-policy, and Pr is the probability measure on MDP-paths. From [10, Th. 6.14], it follows that R^min(s) can be obtained by solving the following LP problem with real variable k and non-negative variables x_s for each s ∈ S: maximise k subject to

    x_s ≤ c1(s, α) − k · c2(s, α) + Σ_{s′∈S} P(s, α, s′) · x_{s′}    for each s ∈ S and α ∈ A.

We now transform an MA into an MDP with two cost functions as follows: the cost functions are c1(s, σ) = 1/E(s) if s ∈ MS ∩ G and σ = ⊥, and 0 otherwise, and c2(s, σ) = 1/E(s) if s ∈ MS and σ = ⊥, and 0 otherwise.
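To make the long-run ratio objective concrete, the following Python sketch evaluates R^min on a tiny hypothetical unichain MDP by brute force: it enumerates the stationary deterministic policies, computes the stationary distribution of each induced two-state Markov chain in closed form, and takes the minimum ratio of expected c1-cost to expected c2-cost per step. Brute-force enumeration is only viable for very small models (the LP above is the scalable route), and all names and numbers are illustrative:

```python
from itertools import product

states = ["s0", "s1"]
# P[s][a][t] = transition probability; s1 has a single action "back"
P = {"s0": {"a": {"s1": 1.0},
            "b": {"s0": 0.5, "s1": 0.5}},
     "s1": {"back": {"s0": 1.0}}}
# c1 charges only the "goal" state s1; c2 charges every state (sojourn times)
c1 = {("s0", "a"): 0.0, ("s0", "b"): 0.0, ("s1", "back"): 1.0}
c2 = {("s0", "a"): 0.5, ("s0", "b"): 0.5, ("s1", "back"): 1.0}

def stationary_two_state(p00, p10):
    """Stationary distribution of a 2-state chain with P(s0,s0)=p00, P(s1,s0)=p10."""
    pi0 = p10 / (1.0 - p00 + p10)
    return pi0, 1.0 - pi0

def long_run_ratio(policy):
    p00 = P["s0"][policy["s0"]].get("s0", 0.0)
    p10 = P["s1"][policy["s1"]].get("s0", 0.0)
    pi0, pi1 = stationary_two_state(p00, p10)
    num = pi0 * c1[("s0", policy["s0"])] + pi1 * c1[("s1", policy["s1"])]
    den = pi0 * c2[("s0", policy["s0"])] + pi1 * c2[("s1", policy["s1"])]
    return num / den

best = min(long_run_ratio(dict(zip(states, choice)))
           for choice in product(P["s0"], P["s1"]))
# choosing "b" in s0 yields ratio 0.5, better than 2/3 under "a"
```

Intuitively, the policy that lingers longer in s0 lowers the fraction of cost accumulated in s1, which is why "b" minimises the ratio here.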
Observe that cost function c2 keeps track of the average residence time in state s, whereas c1 only does so for states in G. Furthermore, R is well-defined in this setting, since the cost functions c1 and c2 are obtained from a non-Zeno MA. In other words, the probability of the set of paths with an ill-defined long-run ratio is zero.
To summarise, computing the minimum long-run average fraction of time spent in some goal state of G ⊆ S in a unichain MA M equals the minimum long-run ratio objective in an MDP with two cost functions. The latter can be obtained by solving an LP problem. Observe that for any two states s, s′ in a unichain MA, LRA^min(s, G) and LRA^min(s′, G) coincide. We therefore omit the state and simply write LRA^min(G) when considering unichain MAs.

4.2. Arbitrary MA. Each maximal end component Mj induces a unichain MA for the optimal long-run ratio. Using this decomposition of M into maximal end components {M1, ..., Mk} with state spaces S1, ..., Sk ⊆ S, we obtain the following result for an MA M with initial state s0 and set of goal states G ⊆ S:

    LRA^min(s0, G) = inf_D Σ_{j=1}^{k} LRA_j^min(G) · Pr_{s0,D}(♦□Sj),

where Pr_{s0,D}(♦□Sj) is the probability to eventually reach and continuously stay in the states of Sj from s0 under policy D, and LRA_j^min(G) is the minimal long-run average fraction of time spent in G ∩ Sj within the unichain MA induced by Mj; cf. Figure 3(a). Computing the minimal LRA for arbitrary MAs is now reducible to a non-negative SSP problem. This proceeds as follows. In MA M, we replace each maximal end component Mj by two fresh states qj and uj. Intuitively, qj represents Mj, whereas uj can be seen as the gate to and from Mj. Thus, state uj has a Dirac transition to qj as well as all probabilistic transitions leaving Sj. Let U denote the set of uj-states and Q the set of qj-states. For simplicity of the definition, we assume w.l.o.g. that each probabilistic state induces τ-transitions indexed by the state: the τ-transitions of each state s_k ∈ PS are numbered from 1 to n_{s_k} ∈ N, where n_{s_k} is the number of probability distributions induced in s_k. Thus, we denote an action in state s_k by τ_l^{s_k} with l ∈ {1, ..., n_{s_k}}.
Here, P(s, α, S′) is shorthand for Σ_{s′∈S′} P(s, α, s′), and A_i denotes the action set of maximal end component M_i. The terminal costs of the new q_i-states are set to LRA_i^min(G). To summarise, computing the minimum long-run average fraction of time spent in the goal states G ⊆ S of an arbitrary MA M starting in state s0 equals the minimum expected cost of an SSP.

Timed reachability objectives
This section presents an algorithm that approximates time-bounded reachability probabilities in MAs. We start with a fixpoint characterisation, and then explain how these probabilities can be approximated using a discretisation technique.

5.1. Fixpoint characterisation.
Our goal is to establish a fixpoint characterisation of the maximum (or minimum) probability to reach a set of goal states within a time interval. Let I and Q denote the sets of all nonempty nonnegative real intervals with real and rational bounds, respectively. For an interval I ∈ I and t ∈ R≥0, let I ⊖ t = { x − t | x ∈ I, x ≥ t }. Given MA M, I ∈ I, and a set G ⊆ S of goal states, the set of all paths that reach some goal state within interval I is denoted ♦^I G; the maximum probability of this event from state s, denoted p_max^M(s, ♦^I G), is Lipschitz continuous (as a function of the interval bounds) and thus measurable. The characterisation is a simple generalisation of that for IMCs [38], reflecting the fact that taking an action from a probabilistic state leads to a distribution over states (rather than to a single state). The above characterisation yields a system of Volterra integral equations which is in general not directly tractable [3]. To tackle this problem, we approximate the fixpoint characterisation using discretisation, extending ideas developed in [38].

5.2. Discretisation. We split the time interval into equally-sized discretisation steps, each of length δ. The discretisation step is assumed to be small enough such that with high probability it carries at most one Markovian transition. This allows us to construct a discretised MA (dMA), a variant of a semi-MDP, obtained by summarising the behaviour of the MA at equidistant time points. Paths in a dMA can be seen as time-abstract paths in the corresponding MA, implicitly counting discretisation steps, and thus discrete time.
Using the above fixpoint characterisation, it is now possible to relate reachability probabilities in the MA M to reachability probabilities in its dMA M δ .
This theorem can be extended to intervals with non-zero lower bounds; for brevity, the details are omitted here. The remaining problem is to compute p_max^{Mδ}(s, ♦^{[0,kb]} G), the maximum probability to reach some goal state in dMA Mδ within step bound kb from initial state s. Let ♦^{[0,kb]} G be the set of infinite (time-abstract) paths of Mδ that reach some state in G within kb steps; the objective is then formalised by p_max^{Mδ}(s, ♦^{[0,kb]} G) = sup_{D∈TA} Pr_{s,D}(♦^{[0,kb]} G), where we recall that TA denotes the set of time-abstract policies. Our algorithm is an adaptation (to dMAs) of the well-known value iteration scheme for MDPs.
The algorithm proceeds by backward unfolding of the dMA in an iterative manner, starting from the goal states. Each iteration intertwines the analysis of Markovian states and of probabilistic states. The key idea is that a path from probabilistic states to G is split into two parts: reaching Markovian states from probabilistic states in zero time and reaching goal states from Markovian states in interval [0, j], where j is the step count of the iteration. The former computation can be reduced to an unbounded reachability problem in the MDP induced by probabilistic states with rewards on Markovian states. For the latter, the algorithm operates on the previously computed reachability probabilities from all Markovian states up to step count j. We can generalise this recipe from step-bounded reachability to step interval-bounded reachability; details are described in [21].
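To illustrate the discretisation, the following Python sketch runs step-bounded value iteration on a purely Markovian toy model s =λ⇒ u =λ⇒ g with absorbing goal g: each step of length δ carries at most one Markovian jump, which occurs with probability 1 − e^(−λδ). For this toy model the time to reach g is Erlang(2, λ)-distributed, so the discretised value can be compared against the closed form 1 − e^(−λt)(1 + λt). The model and parameters are illustrative:

```python
import math

# Discretised value iteration for time-bounded reachability: split [0, t]
# into k_b steps of length delta and assume at most one jump per step.
lam, t = 1.0, 2.0
delta = 0.001
k_b = int(round(t / delta))

stay = math.exp(-lam * delta)   # probability of no Markovian jump in one step
q = {"s": 0.0, "u": 0.0, "g": 1.0}   # q[x] = prob. of reaching g within j steps
for _ in range(k_b):
    q = {
        "s": (1 - stay) * q["u"] + stay * q["s"],   # jump to u, or stay in s
        "u": (1 - stay) * q["g"] + stay * q["u"],   # jump to g, or stay in u
        "g": 1.0,                                    # goal is absorbing
    }

# Closed form for an Erlang(2, lam) delay: P(T <= t) = 1 - e^{-lam t}(1 + lam t)
exact = 1 - math.exp(-lam * t) * (1 + lam * t)
```

With δ = 0.001 the discretised value lies well within the discretisation error bound of the closed-form probability; shrinking δ tightens the approximation at the cost of more iterations.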

Tool chain and case studies
This section describes the implementation of the algorithms discussed, together with the modelling features, resulting in our MaMa tool chain. We also present two case studies that provide empirical evidence of the strengths and weaknesses of the MaMa tool chain.

6.1. Modelling. As argued in the introduction, MAs can be used as a semantic model for various modelling formalisms. We use the process-algebraic specification language MAPA (Markov Automata Process Algebra) [35,34]. This language contains the usual process algebra operators, treats data as a first-class citizen, and supports several reduction techniques for MA specifications. In fact, it turns out to be beneficial to map a language (like GSPNs) to MAPA so as to profit from these reductions.
The MAPA language supports algebraic processes featuring data, nondeterministic choice, action prefix with probabilistic choice, rate prefix, conditional behaviour and process instantiation (allowing recursion). Using MAPA processes as basic building blocks, the language also supports the modular construction of large systems via top-level parallelism, encapsulation, hiding and renaming. The operational semantics of a MAPA specification yields an MA; for a detailed exposition of the syntax and semantics we refer to [35,34].
To enable state space reduction and generation, our tool chain uses a linearised normal form of MAPA referred to as MLPE (Markovian Linear Probabilistic process Equation). In this format, there is precisely one process which consists of a nondeterministic choice between a set of symbolic transitions, making MLPEs easy to translate to MAs. Every MAPA specification can be translated efficiently into an MLPE while preserving strong bisimulation [35].
Reduction techniques. On MLPEs, several reduction techniques have been defined. Some of them simplify the MLPE to improve readability and speed up state space generation, while others modify it in such a way that the underlying MA becomes smaller. Being defined on the specification, these reductions eliminate the need to ever generate the original, unreduced state space. We briefly discuss six such techniques.
• Maximal progress reduction removes Markovian transitions from states that also have τ-transitions (motivated by the maximal progress assumption).
• Constant elimination [25] replaces parameters that remain constant forever by their initial (and hence permanent) value.
• Expression simplification [25] evaluates functions for which all parameters are constants and applies basic laws from logic.
• Summation elimination [25] removes trivial nondeterministic choices, which often arise from synchronisations.
• Dead-variable reduction [37] detects parts of the specification in which the value of some variable is irrelevant: it will be overwritten before being used in all possible futures. When reaching such a part, the variable is reset to its initial value.
• Confluence reduction [36] detects spurious nondeterminism resulting from parallel composition. It denotes a subset of the probabilistic transitions of a MAPA specification as confluent, meaning that they can safely be given priority when enabled together with other transitions.
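To make one of these reductions concrete, the following is a minimal sketch of constant elimination on a toy symbolic representation. All names (`params`, `updates`, `exprs`) and the string-based encoding are illustrative assumptions; SCOOP operates on full MLPEs, not on this simplified form.

```python
def constant_elimination(params, updates, exprs):
    """Toy constant elimination: a parameter that no symbolic
    transition ever changes keeps its initial value forever, so it
    can be substituted away.

    params:  parameter name -> initial value
    updates: parameter name -> set of expressions assigned to it
             (only the parameter itself means 'left unchanged')
    exprs:   expression strings in which constants are substituted
    """
    # a parameter is constant if every update just copies its own value
    constants = {p: v for p, v in params.items()
                 if updates.get(p, set()) <= {p}}
    # naive textual substitution, sufficient for this illustration
    rewritten = []
    for e in exprs:
        for p, v in constants.items():
            e = e.replace(p, str(v))
        rewritten.append(e)
    return constants, rewritten
```

For example, a parameter `n` initialised to 3 and never reassigned turns the guard `x < n` into `x < 3`, after which expression simplification can evaluate it further.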
6.2. MaMa tool chain. Our tool chain consists of several tool components: SCOOP [33,35], IMCA [18], and GEMMA [4]; see Figure 4. The tool chain comprises about 8,000 LOC (without comments). SCOOP (written in Haskell) supports the generation of MAs from MAPA specifications via a translation into the MLPE format. It implements all the reduction techniques described above. The capabilities of the IMCA tool component (written in C++) have been lifted to expected time and long-run objectives for MAs, and extended with timed reachability objectives. It also supports (untimed) reachability objectives, which are not treated further here. A prototypical translator from GSPNs to MAs, in fact to MAPA specifications, has been realised (the GEMMA component, written in Haskell). We connected the three components into a single tool chain by making SCOOP export the (reduced) state space of an MLPE in the IMCA input language. Additionally, SCOOP has been extended to translate properties, based on the actions and parameters of a MAPA specification, to a set of goal states in the underlying MA. That way, in one easy process, systems and their properties can be modelled in MAPA, translated to an optimised MLPE by SCOOP, exported to the IMCA tool, and then analysed.

Processor grid. First, we consider a model of a 2 × 2 concurrent processor architecture.
Using GEMMA [4], we automatically derived the MA model from the GSPN model in [1, Fig. 11.7]. Previous analyses of this model required weights for all immediate transitions, which necessitates complete knowledge of the mutual behaviour of all these transitions. We allow a weight assignment to just a (possibly empty) subset of the immediate transitions, reflecting the practical scenario of only knowing the mutual behaviour for a selection of the transitions. For this case study we indeed kept weights for only a few of the transitions, obtaining probabilistic behaviour for them and nondeterministic behaviour for the others. Table 1 reports on the time-bounded and time-interval-bounded probabilities of reaching a state in which the first processor has an empty task queue. We vary the degree of multitasking K, the error bound ε, and the interval I. For each setting, we report the number of states |S| and goal states |G|, and the generation time with SCOOP (both with and without the reductions from Section 6.1).
The runtime demands grow with both the upper and the lower time bound, as well as with the required accuracy. The model size also affects the per-iteration cost and thus the overall complexity of the reachability computation. Note that the reductions speed up the analysis times by a factor between 1.8 and 2.5: even more than the reduction in state space size. This is due to the fact that these techniques significantly reduce the degree of nondeterminism. Table 2 displays the results for the expected time until an empty task queue, as well as the long-run average that a processor is active. In contrast to [1], which fixes all nondeterminism and obtains, for instance, an LRA of 0.903 for K = 2, we are able to retain nondeterminism and provide the more informative interval [0.8810, 0.9953]. Again, SCOOP's reduction techniques significantly improve the runtimes.

Table 3: Interval reachability probabilities for the polling system. (Time in seconds.)

Polling system. Second, we consider a polling system with two stations and one server, similar to the one depicted in Figure 2 and inspired by [32]. There are incoming requests of N possible types, each with a (possibly different) service rate. Additionally, the stations each store these requests in a local queue of size Q. We vary the values of Q and N, analysing a total of six different settings. Since, as for the previous case, analysis scales proportionally with the error bound, we keep it constant here. Table 3 reports results for time-bounded and time-interval-bounded properties, and Table 4 displays probabilities and runtime results for expected times and long-run averages. For all analyses, the goal set consists of all states in which both station queues are full.

Conclusion
This paper presented new algorithms for the quantitative analysis of Markov automata (MAs) and proved their correctness. Three objectives have been considered: expected time, long-run average, and timed reachability. The MaMa tool chain supports the modelling and reduction of MAs, and can analyse these three objectives. It is also equipped with a prototypical tool to map GSPNs onto MAs. The MaMa tool is accessible via its easy-to-use web interface, which can be found at http://fmt.cs.utwente.nl/~timmer/mama. Experimental results on a processor grid and a polling system give insight into the accuracy and scalability of the presented algorithms. Future work will focus on efficiency improvements and reward extensions [20].

Appendix A. Proof of Theorem 3.1

Recall that the minimal expected time to reach $G$ from $s \in S$ is defined by
$$eT^{\min}(s, \Diamond G) \;=\; \inf_{D} \mathbb{E}_{s,D}[V_G],$$
where $D$ ranges over the generic measurable policies on $\mathcal{M}$; $eT^{\min}$ is a function indexed by $G$. Further, $V_G : \mathit{Paths} \to \mathbb{R}^\infty_{\geq 0}$ is the elapsed time before visiting some state in $G$ for the first time, i.e., $V_G(\pi) = \min\{t \in \mathbb{R}_{\geq 0} \mid G \cap \pi@t \neq \emptyset\}$, where $\min(\emptyset) = \infty$. Let $\Delta(\pi, k) = \sum_{i=0}^{k-1} t_i$ be the elapsed time after $k$ steps on the infinite path $\pi = s_0 \xrightarrow{\sigma_0, t_0} s_1 \xrightarrow{\sigma_1, t_1} \cdots$.

Theorem 3.1. The function $eT^{\min}$ is a fixpoint of the Bellman operator
$$[L(v)](s) \;=\; \begin{cases} \dfrac{1}{E(s)} + \displaystyle\sum_{s' \in S} \mathbf{P}(s, \bot, s')\, v(s') & \text{if } s \in \mathit{MS} \setminus G,\\[1ex] \displaystyle\min_{\alpha \in \mathit{Act}(s)} \sum_{s' \in S} \mu^s_\alpha(s')\, v(s') & \text{if } s \in \mathit{PS} \setminus G,\\[1ex] 0 & \text{if } s \in G, \end{cases}$$
where $\mathit{Act}(s) = \{\tau_i \mid s \xrightarrow{\tau_i} \mu\}$ and $\mu^s_\alpha \in \mathit{Distr}(S)$ is as formerly defined.

Proof. We show that $L(eT^{\min}(s, \Diamond G)) = eT^{\min}(s, \Diamond G)$ for all $s \in S$, distinguishing three cases: $s \in \mathit{MS} \setminus G$, $s \in \mathit{PS} \setminus G$, and $s \in G$. Note that $D \in \mathit{GM}$.
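Iterating the Bellman operator for minimal expected time can be sketched as follows. This is an illustrative Python sketch under an assumed toy encoding (`ms`, `ps`, `E`, `P`, `mu` are hypothetical names, not the IMCA data structures); it assumes $G$ is reached with probability 1, so the iteration converges to the least fixpoint.

```python
def expected_time_iteration(ms, ps, goal, E, P, mu, n_iter=100):
    """Iterate the expected-time Bellman operator (sketch).

    ms/ps: sets of Markovian / probabilistic states
    E[s]:  exit rate of Markovian state s
    P[s]:  list of (probability, successor) for Markovian s
    mu[s]: dict action -> list of (probability, successor) for probabilistic s
    """
    states = ms | ps
    v = {s: 0.0 for s in states}
    for _ in range(n_iter):
        nv = {}
        for s in states:
            if s in goal:
                nv[s] = 0.0                       # goal reached: no further time
            elif s in ms:
                # mean sojourn time 1/E(s) plus expected remaining time
                nv[s] = 1.0 / E[s] + sum(p * v[t] for p, t in P[s])
            else:
                # probabilistic states take zero time: minimise over actions
                nv[s] = min(sum(p * v[t] for p, t in d)
                            for d in mu[s].values())
        v = nv
    return v
```

For a single Markovian state with exit rate 2 leading directly to the goal, the iteration yields the expected delay $1/2$, matching the mean of the exponential distribution.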
(ii) If $s \in \mathit{PS} \setminus G$, then for each action $\alpha \in \mathit{Act}(s)$ and successor state $s'$ with $\mathbf{P}(s, \alpha, s') > 0$ it follows that $\mathbf{P}(s, \alpha, s') = \mu^s_\alpha(s')$. Further, $c(s, \alpha) = 0$ for all $\alpha \in \mathit{Act}$, and the fixpoint equation follows in this case as well.

First we recall the definition of weak bisimulation for MAs [16], for which we introduce some additional notation. A sub-distribution $\mu$ over a set $S$ is a function $\mu : S \to [0,1]$ with $\sum_{s \in S} \mu(s) \leq 1$. We define $\mathit{supp}(\mu) = \{s \in S \mid \mu(s) > 0\}$ as the support of $\mu$, and the probability of $S' \subseteq S$ with respect to $\mu$ as $\mu(S') = \sum_{s \in S'} \mu(s)$. Let $|\mu| := \mu(S)$ denote the size of the sub-distribution $\mu$. If $|\mu| = 1$ then $\mu$ is a full distribution. Let $\mathit{Distr}(S)$ and $\mathit{Subdistr}(S)$ denote the sets of distributions and sub-distributions over $S$, respectively. We write $\mathbb{1}_s$ for the Dirac distribution for $s$, determined by $\mathbb{1}_s(s) = 1$. Let $\mu$ and $\mu'$ be two sub-distributions; then $\mu'' := \mu \oplus \mu'$ is defined by $\mu''(s) = \mu(s) + \mu'(s)$, provided $|\mu''| \leq 1$. Conversely, $\mu''$ can be split back into $\mu$ and $\mu'$, and $(\mu, \mu')$ is called a splitting of $\mu''$.

Next we introduce a tree notation for weak transitions. A partial function $T : \mathbb{N}^* \to L$ whose domain $\mathit{dom}(T)$ is prefix-closed is called an (infinite) $L$-labelled tree. The root of the tree $T$ is $\epsilon$, and every $\sigma \in \mathit{dom}(T)$ is a node of $T$. A node $\sigma$ is called a leaf of $T$ if there is no $\sigma' \in \mathit{dom}(T)$ with $\sigma < \sigma'$. We denote the set of all leaves of $T$ by $\mathit{Leaf}_T$ and the set of all inner nodes of $T$ by $\mathit{Inner}_T$. Let $L = S \times \mathbb{R}_{\geq 0}$; a node in an $L$-labelled tree $T$ is then labelled by a state and the probability of reaching this node from the root of the tree. For a node $\sigma$ we write $\mathit{Sta}_T(\sigma)$ for the first component of $T(\sigma)$ and $\mathit{Prob}_T(\sigma)$ for the second component of $T(\sigma)$.
Definition C.1 (Weak transition tree). Let $\mathcal{M} = (S, \mathit{Act}, \rightarrow, \Rightarrow, s_0)$ be an MA. A weak transition tree $T$ is an $S \times \mathbb{R}_{\geq 0}$-labelled tree that satisfies the condition $\mathit{Prob}_T(\epsilon) = 1$ and $\sum_{\sigma \in \mathit{Leaf}_T} \mathit{Prob}_T(\sigma) = 1$.

A weak transition tree $T$ corresponds to a probabilistic execution fragment. It starts from $\mathit{Sta}_T(\epsilon)$ and resolves the nondeterministic choices at every inner node of the tree, which represents the state in the MA it is labelled with. $\mathit{Prob}_T(\sigma)$ is the probability of reaching state $\mathit{Sta}_T(\sigma)$ via immediate transitions in the MA, starting from state $\mathit{Sta}_T(\epsilon)$. The distribution associated with $T$, denoted $\mu_T$, is defined by $\mu_T(s) = \sum_{\sigma \in \mathit{Leaf}_T,\, \mathit{Sta}_T(\sigma) = s} \mathit{Prob}_T(\sigma)$.

Now we can define a weak transition: for $s \in S$ and $\mu \in \mathit{Distr}(S)$, let $s \Longrightarrow \mu$ if $\mu$ is induced by some internal weak transition tree $T$ with $\mathit{Sta}_T(\epsilon) = s$. Let $\mu \in \mathit{Distr}(S)$. If for every state $s_i \in \mathit{supp}(\mu)$ we have $s_i \Longrightarrow \mu'_i$ for some $\mu'_i$, then we write $\mu \Longrightarrow \sum_{s_i \in \mathit{supp}(\mu)} \mu(s_i)\,\mu'_i$.
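The distribution $\mu_T$ induced by a weak transition tree can be computed directly from the leaves. The following is an illustrative Python sketch, assuming a toy encoding in which a tree is a dict mapping a node (a tuple of child indices, with `()` as the root $\epsilon$) to its `(state, probability)` label; all names are hypothetical.

```python
def tree_distribution(tree):
    """Collect mu_T: sum the probabilities of the leaves of a weak
    transition tree, grouped by the state labelling each leaf."""
    def is_leaf(node):
        # a node is a leaf if no strictly longer node extends it
        return not any(n != node and n[:len(node)] == node for n in tree)

    mu = {}
    for node, (state, prob) in tree.items():
        if is_leaf(node):
            mu[state] = mu.get(state, 0.0) + prob
    return mu
```

For a root labelled `('s', 1.0)` with two leaf children `('u', 0.5)` and `('v', 0.5)`, this yields the distribution assigning probability 0.5 to each of `u` and `v`, i.e., the target of the weak transition $s \Longrightarrow \mu$.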

Now a convex combination of weak transitions can be defined. Let $\mu \Longrightarrow_C \gamma$ if there exist a finite index set $I$, weak transitions $\mu \Longrightarrow \gamma_i$, and factors $c_i \in (0,1]$ for every $i \in I$, with $\sum_{i \in I} c_i = 1$ and $\gamma = \sum_{i \in I} c_i \gamma_i$. Let the set of splittings of immediate successor sub-distributions be defined as $\mathit{split}(\mu) = \{(\mu_1, \mu_2) \mid \exists \mu' : \mu \Longrightarrow_C \mu' \text{ and } (\mu_1, \mu_2) \text{ is a splitting of } \mu'\}$.

Definition C.2 (Weak bisimulation). A symmetric relation $\mathcal{R}$ on sub-distributions over $S$ is called a weak bisimulation if and only if whenever $\mu_1 \,\mathcal{R}\, \mu_2$ then $|\mu_1| = |\mu_2|$ and for all $s \in \mathit{supp}(\mu_1)$ there exists a splitting $(\mu^{\rightarrow}_2, \mu^{\Delta}_2) \in \mathit{split}(\mu_2)$ matching the transitions of $s$ for every action $\alpha$.

Proof. Let $\mathcal{M}_D$ be the stochastic process induced by a unichain MA $\mathcal{M}$ and a stationary deterministic policy $D$. As $\mathcal{M}$ is unichain, it directly follows that $\mathcal{M}_D$ is strongly connected. The proof that $\mathcal{M}_D$ is weakly bisimilar to a CTMC $\mathcal{C}$ goes along the same lines as in [14], where it has been shown that the MA semantics of well-defined GSPNs is weakly bisimilar to their CTMC semantics. As the stochastic process $\mathcal{M}_D$ can be considered a 1-safe GSPN that is well-defined by $D$, the result follows.

Proof. Let $\mathcal{M}$ be a unichain MA with state space $S$ and $G \subseteq S$ a set of goal states. We consider a stationary deterministic policy $D$ on $\mathcal{M}$. It follows that there exists an ergodic CTMC $\mathcal{C}$ such that $\mathcal{M}_D \approx \mathcal{C}$. Note that $G \subseteq \mathit{MS}$; thus $G$ can be represented as the union of zero or more equivalence classes under $\approx$.
The long-run average for state $s \in S$ and $G \subseteq S$ is given by
$$\mathrm{Lra}(s, G) \;=\; \lim_{t \to \infty} \frac{1}{t}\, \mathbb{E}_s\!\left[\int_0^t \mathbf{1}_G(X_u)\, du\right],$$
where $X_u$ is the random variable denoting $\pi@u$. With the ergodic theorem from [29] we obtain that, almost surely,
$$\lim_{t \to \infty} \frac{1}{t} \int_0^t \mathbf{1}_{\{s_i\}}(X_u)\, du \;=\; \frac{1}{m_i \cdot E(s_i)},$$
where $m_i$ is the expected return time to state $s_i$. Thus, in our induced ergodic CTMC, almost surely the fraction of time spent in $s_i$ in the long run is $\frac{1}{m_i \cdot E(s_i)}$. Let $\mu$ be the stationary distribution of the embedded discrete-time chain, i.e., $\mu \cdot \mathbf{P} = \mu$, where $\mu$ is the vector containing $\mu_i$ for all states $s_i \in S$. Given the probability $\mu_i$ of staying in state $s_i$, the expected return time is given by $m_i = \frac{1}{\mu_i} \sum_{s_j \in S} \frac{\mu_j}{E(s_j)}$. Gathering these results yields:
$$\mathrm{Lra}(s, G) \;=\; \frac{\sum_{s_i \in G} \mu_i / E(s_i)}{\sum_{s_j \in S} \mu_j / E(s_j)}.$$

Proof (sketch). By the limit in the long-run ratio definition of $R$, it follows that for every $i \geq 0$ the prefix of $\pi$ up to position $i$ does not matter; thus $R(\pi) = R(\pi_i)$, where $\pi_i$ denotes the path $\pi$ from the $i$-th position onwards. Therefore, given a policy $D$ inducing a multichain on a maximal end component $\mathcal{M}$, we can construct a unichain policy $D'$ as follows: $D'$ fixes the recurrent class $S'$ of $\mathcal{M}$ with the minimal value induced by $D$ (or, in the case of the maximal long-run ratio, the maximal value). For states outside of $S'$, $D'$ is a policy that reaches $S'$ with probability 1.
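The characterisation of the long-run average via the embedded chain's stationary distribution can be sketched numerically. The following illustrative Python sketch (all names hypothetical) computes $\mu$ with $\mu \cdot \mathbf{P} = \mu$ by damped power iteration instead of a linear solve, assuming a small ergodic CTMC given by its embedded transition matrix and exit rates.

```python
def lra_unichain(P, E, G, iters=5000):
    """Long-run average fraction of time in G for an ergodic CTMC.

    P: row-stochastic matrix of the embedded jump chain (list of lists)
    E: exit rate E(s_i) per state index i
    G: set of goal state indices
    """
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
        # damping keeps the iteration convergent for periodic chains
        mu = [0.5 * mu[j] + 0.5 * step[j] for j in range(n)]
    # weight each state's visit probability by its mean sojourn time 1/E
    total = sum(mu[j] / E[j] for j in range(n))
    return sum(mu[i] / E[i] for i in G) / total
```

For a two-state chain alternating between $s_0$ (exit rate 1) and $s_1$ (exit rate 3), the long-run fraction of time in $s_0$ is $(1/1)/(1/1 + 1/3) = 0.75$, which the sketch reproduces.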
Appendix E. Proof of Theorem 4.4

Proof (sketch). Every infinite path $\pi$ that eventually stays in a maximal end component can be decomposed as $\pi = \pi_{s_0 s} \cdot \pi^\omega_s$, where $\pi_{s_0 s}$ is the path starting in initial state $s_0$ and ending in $s \in \mathcal{M}_i$ for some $0 < i \leq k$, and all states on the path $\pi^\omega_s$ belong to the maximal end component $\mathcal{M}_i$. Note that a state on path $\pi_{s_0 s}$ can be part of another maximal end component $\mathcal{M}_j$ (as in Example 4.6). Hence, it is not sufficient to only check whether eventually a MEC is reached, as done in the corresponding theorem for IMCs in [18]. Thus, the minimal LRA is obtained when the LRA in each MEC $\mathcal{M}_i$ is minimal and the combined LRA of all MECs is minimal according to their persistence under policy $D$.
Appendix F. Proof of Theorem 4.7

Proof. Let $\pi$ be an infinite path in the MDP $\mathrm{ssp}_{\mathrm{lra}}(\mathcal{M})$ such that $\pi[i_Q]$ is the first visit of a state in $Q$ along $\pi$, i.e., for all $j < i_Q$, $\pi[j] \notin Q$ and $\pi[i_Q] \in Q$. Similarly, we define $i_q$ for a single state $q$. We define the random variable $C_Q : \mathit{Paths} \to \mathbb{R}_{\geq 0}$ by $C_Q(\pi) = g(\pi[i_Q])$. Note that $D \in \mathit{GM}$.
Observe that in step $(*)$ we use the transformation from case (ii), $s \in \mathit{PS} \setminus G$. By the law of total probability, we split time-bounded reachability into two parts: first we compute the probability to reach the set of Markovian states from $s$ by taking only probabilistic transitions in zero time, and then we quantify the probability to reach some goal state in $G$ from Markovian states inside interval $I$. Therefore: