Thermodynamic graph-rewriting

We develop a new thermodynamic approach to stochastic graph-rewriting. The ingredients are a finite set of reversible graph-rewriting rules called generating rules, a finite set of connected graphs P called energy patterns and an energy cost function. The idea is that the generators define the qualitative dynamics, by showing which transformations are possible, while the energy patterns and cost function specify the long-term probability $pi$ of any reachable graph. Given the generators and energy patterns, we construct a finite set of rules which (i) has the same qualitative transition system as the generators; and (ii) when equipped with suitable rates, defines a continuous-time Markov chain of which $pi$ is the unique fixed point. The construction relies on the use of site graphs and a technique of `growth policy' for quantitative rule refinement which is of independent interest. This division of labour between the qualitative and long-term quantitative aspects of the dynamics leads to intuitive and concise descriptions for realistic models (see the examples in S4 and S5). It also guarantees thermodynamical consistency (AKA detailed balance), otherwise known to be undecidable, which is important for some applications. Finally, it leads to parsimonious parameterizations of models, again an important point in some applications.


Introduction
Along with Petri nets, communicating finite state machines, and process algebras, models of concurrent systems based on graphs and graph transformations (GTS) have long been investigated as means to describe, verify and synthesize distributed systems [14].Beyond their visual aspect, which is often useful in modelling situations, there is a lot to like about GTSs: there are category-theoretic frameworks to express them and encapsulate their syntax; and the existence of a strong meta-theory [21] is a reassurance that methodologies developed in specific cases can be 'ported' to other variants.
Graph-rewriting rules are convenient for writing compact models and modifying them [8], and lend themselves naturally to probabilistic extensions [18,20].However, for all their flexibility, even rules can only do so much.We ask in this paper "what if we did not have to write the rules?".This is where we take a page from the book of classical statistical mechanics.In such models, which often involve graph-like structures, as in the Ising model, the dynamics is not described upfront.Instead, the system of interest is equipped with an 'energy lansdcape' which specifies its long run behaviour, be it deterministic as in classical mechanics, or probabilistic in statistical physics.The dynamics just follows from the energy data.In the eye of a computer scientist, this use of energy looks like a latent syntax.(This is especially true in the application of these ideas to molecular dynamics.) The broad aim of this paper is to make this syntax explicit by introducing energy patterns and costs from which the total energy of a state of the system can be computed; and to define a procedure whereby, indeed, the dynamics described as probabilistic graphrewriting rules can be derived from these energy data.Descriptively, this takes us to an entirely new level of conciseness (as in the example shown in §4).It also guarantees thermodynamical consistency by construction, a property known otherwise to be undecidable [11].This property provides a predictable equilibrium state for the system which can be used as a guide when writing models by giving an intuitive understanding of favoured states and the (possibly overlapping) subgraphs within them.But perhaps the nicest byproduct of this approach is the fact that the methodology leads to parsimonious parameterizations.The parameter space which usually scales as the number of rules (which in turn has at best a logarithmic impact on the cost of a simulation event [5]), will now scale as the number of energy patterns provided in the specification.
The particular GTSs we consider form a reversible subset of the Kappa site-graph stochastic rewriting language.Kappa is used for the simulation and analysis of combinatorial dynamical systems as typically found in cellular signalling networks [22,26] and has been predicted to "become one of the mainstream modelling tools of systems biology within the coming decade" [1].Similar graph formalisms where nodes have a controlled valence/degree have been considered e.g. the BNG language [15,19], Kissinger and Dixon's quantum proof language [13], and Kirchner et al. chemical calculi [3].Site-graph rewriting has found recently a 'home' both in the single-pushout GTS tradition [9] and the double-pushout one [6,17].This makes one hopeful that the thermodynamic methodology we propose can cross over to other fields where quantitative GTSs can be used, e.g. in the modelling of adaptive networks [16].While our scalable energy-based parameterization is particularly important in biological applications where parameters often need to be inferred, one can imagine it to be useful in other modelling situations with uncertainty.
Outline: We start by introducing a running example that will help us illustrate the concepts presented throughout the paper.Then we proceed with the definition and relevant properties of the specific GTS we use, namely a simple reversible fragment of Kappa.Next, we introduce growth policies (adapted from Ref. [25]), a tool which allows one to replace a rule with an orthogonal set of refined rules while preserving the quantitative semantics.We use this tool with a specific policy which refines a rule into finitely many rules, each of which has a definite energy balance with respect to a given set of energy patterns.This leads to our main theorem which guarantees that the stochastic dynamics of the obtained refined rule set converges to an equilibrium distribution parametrized by the cost of each energy pattern.Throughout, the presentation is set in category-theoretic terms and mostly self-contained.A substantial example concludes the paper.
1.1.Running example: Assembling triangles.The following example will be used to show the intrincacies of our rule generation mechanism.We consider graphs with three types of nodes where each can only bind nodes of a different type, and at most one thereof.As generators, we consider simple binding and unbinding rules subject to this constraint (see details in §2.4): These rules, given an unbounded supply of the three types of nodes, can create chains (of any length) and closed cycles (of length some multiple of three) as shown in Fig. 1.Given the simplicity of the graphs that can be formed, we can describe them using a linear textual notation where numbers are used to represent the three types of nodes, then chains are simply written as words (e.g.2312) while for cycles we indicate half-edges at both ends, e.g.•123• is the triangle and •123123• is the hexagon.We will use that shorthand notation in the sequel.
Our goal here is to be able to favour the formation at equilibrium of certain structures like the triangle by assigning a significantly negative energy to them (the convention is that a lower energy means a higher probability).

Site graph rewriting
2.1.Site graphs and homomorphisms.A site graph G consists of finite sets of agents (nodes) and sites (connection slots), A G and S G , a partial function σ G : S G A G , and a symmetric edge relation E G on S G .The pair S G , E G form an undirected graph; sites not in the domain of E G are said to be free.The role of the additional map σ G is to assign sites to agents; sites not in the domain of σ G are said to be dangling, and will be used to represent half-edges (see below).Usually one also endows agents and/or sites with states (as we do in the example treated in §4); our main construction in §3 carries over readily to these.
One says G is realizable iff (i) sites have at most one incident edge; (ii) no dangling site is free; and, (iii) edges have at most one dangling site.Note that point (i) implies that no site has an edge to itself.
Site graphs and homomorphisms form a category SG with the natural 'tiered' composition; embeddings form a subcategory; if in addition, we restrict objects to be realizable, we get the subcategory rSGe of realizable site graphs and embeddings.
Returning to our chains-and-cycles example, we can use sites as a means to enforce on nodes our earlier constraint where a node can connect to at most one node of each other type.Here we present a possible encoding of a simple chain of 3 nodes: Note that this site graph is realizable and the l x and r z sites are free.To complete the encoding of our example we need types for nodes which we now introduce.

The category of site graphs over
The third condition of local injectivity means that every agent of G has at most one copy of each site of its corresponding agent in C; C is called the contact graph.
Hereafter, we work in the (comma) category rSGe C whose objects are contact maps over C, and arrows are embeddings such that the associated triangle commutes in SG.We write Υ C (h, h ) for the set of such embeddings between h, h contact maps over C; we also write | | for the domain functor from rSGe C to rSGe which forgets types.In particular, if h : G → C is a contact map, we write |h| for its domain G.
Note that in rSGe, an embedding h : G → G may map a dangling site s of G to any site s of G (provided h S is injective).In particular, s does not have itself to be dangling.This means that dangling sites can be used as an "any site" wild card when matching G.In the typed case, i.e. in rSGe C , the contact map h : G → C tells us which agent in C the dangling site belongs to because σ C is total, and this must be respected by h.Dangling sites can now be used as "binding types" to express the property of being bound to the site s of an agent of a given type.
The contact graph C is fixed and plays the role of a type: it specifies the kinds of agents that exist, the sites that they may possess, and which of the |S C |(|S C | + 1)/2 possible edge types are actually valid.It also gives canonical names to the types of agents and their sites.
In our running example we use the triangle C = •123• as contact graph: ), (l 2 , r 1 ), (r 2 , l 3 ), (l 3 , r 2 ), (r 3 , l 1 ), (l 1 , r 3 )} This contact graph constrains agents to bind only agents of a different type.The local injectivity condition on contact maps restricts the number of sites available for each node to two.Our simple chain of three nodes from the previous section ( §2.1) can thus be typed using a contact map h : We now extend the shorthand notation introduced in §1.1 to cover the case when h S is not locally surjective.Specifically, when a node does not mention its two allowed sites, we write '?' for the missing site.A missing site can be introduced by an embedding and can be free or bound.Sites do not explicitly show in this notation but they are implicitly positioned at the left (l) and right (r) of each agent.Binding types are denoted by an exponent.Thus, for instance, in 3 12? we have two agents u, v of type 1, 2, respectively, where u is bound to a dangling site of type r 3 on its l u site and bound to v's l v site on its r u site.Similarly, v is bound to u on its l v site and does not have any r site.Using this extended notation we can write down our three rules in text: ?1 + 2? ?12?, ?2 + 3? ?23?, and ?3 + 1? ?31?.Recall that the goal of this example is to control the type of cycles which one finds at equilibrium.To this effect, we introduce a set of energy patterns P consisting of one contact map for each edge type (i.e.?12?, ?23?, and ?31?), and one for the triangle •123•.As we will see, a careful choice of energy costs for each pattern will indeed lead to a system producing almost only triangles in the long run (see Fig. 4 in Appendix A).
2.3.Multi-sums in rSGe C .The category SG has all pull-backs, constructed from those in Set; it is easy to see that they restrict to rSGe C .SG also has all push-outs and all sums, but these do not in general restrict to rSGe C .(Just as push-outs and sums in Set do not restrict to the subcategory of injective functions.)Figure 1: An example of a mixture of chains and 3n-gons which our three rules can reach starting with 12 agents of each type.
However, rSGe C has multi-sums meaning that, for all pairs of site graphs of type C, h 1 : G 1 → C and h 2 : G 2 → C, there exists a family of co-spans θ i 1 : h 1 → s i ← h 2 : θ i 2 , such that any co-span γ 1 : h 1 → h ← h 2 : γ 2 factors through exactly one of the family and does so uniquely.In outline, given a co-span γ 1 : h 1 → h ← h 2 : γ 2 , we first take its pull-back (which is guaranteed to remain within rSGe C ) and then the push-out of that: the fact that the original co-span exists implies that this push-out also remains within rSGe C .The multi-sum is then a choice of push-out for each (isomorphism class of) pull-back.
The idea is that the pairs θ i 1 , θ i 2 enumerate all minimal ways in which one can glue h 1 and h 2 , that is to say all the minimal gluings of G 1 and G 2 that respect C.There are finitely many, all of which factor through the standard sum in the larger slice category SG C .
The notion of multi-sum dates back to Ref. [12]; we will call them minimal gluings in rSGe C according to their intuition in this concrete context, and use them in §3.3 to construct balanced rules.
To illustrate this idea, let us consider, in the context of our example, the minimal gluings of D 1 = ?123? and D 2 = ?231?.Computing them is a matter of computing all pull-backs.One such pull-back is D 1 ← ?23? → D 2 , which gives us the minimal gluing ?1231?.On the other hand, the span D 1 ← ?3? → D 2 is not a pull-back since for all cospans that can close the square, the span factors through D 1 ← ?23? → D 2 .This is a consequence of ?3? not being a maximal overlap of D 1 and D 2 : whenever 3 is contained in a possible overlap of D 1 and D 2 , 2 will also be there.
All minimal gluings of D 1 and D 2 are displayed in the following diagram with their respective pullbacks.The left square has the empty graph as pullback and ?123? + ?231? as pushout, the central square has ?23? and ?1231?, and the right square has ?23? + ?1? and •123•.Each square uses arrows of different colour and style.
For example, the unbinding rule g 12 =?12?→ ?1 + 2? may be represented as: with only the edge structures In words, a mixture is a fully-specified site graph with respect to the type C.
(2.1)An embedding ψ : r L → h induces a rewrite of the mixture h by modifying the edge structure of the image of ψ from that of r L to that of r R .The result of rewriting is a new mixture h , where |h | has the same agents and sites as |h|, and an embedding ψ : r R → h .This type of rewriting can be formalized using double push-out rewriting [6], but with the simple rules considered here, there is no need.The inverse of r, defined as r := (r R , r L ) is also a valid rule; by rewriting h with r via ψ , we recover h and ψ.
Given a finite set of rules G over C, we define a labelled transition system L G on mixtures over C: a transition from a mixture h is a rewriting step determined and labelled by an event (r, ψ), as in diagram (2.1), with r in G and ψ in Υ C (r, h).We suppose hereafter that G is closed under rule inversion, i.e.G = G .Hence, every (r, ψ)-transition has an inverse (r , ψ ), and L G is symmetric.

CTMC semantics. It is not difficult to see that for any rule r, |Υ
where d(r) is the number of connected components in r L .Hence, L G has finite out-degree, bounded by |G| • |A |h| | d for some d.Also, as agents are preserved by rules, the (strongly) connected components of L G are finite.Given a rate map k from G to R + , we can therefore equip L G with the structure of a time-homogeneous irreducible continuous-time Markov chain (CTMC), simply by assigning rate k(r) to an event of the form (r, ψ).We write L k G for the CTMC thus obtained.
A finite time-homogeneous CTMC M has detailed balance for a probability distribution π on M's state space iff, for all states x and y, π(x) • q(x, y) = π(y) • q(y, x) where q(x, y) is M's transition rate from x to y.This implies, assuming M is irreducible, that π is the unique fixed point of the action of M to which the probabilistic state of M converges, regardless of the initial state.
In our case, the probability π(x) will be proportional to e −E(x) , where E(x) is the energy of the system as defined by a set of patterns and their associated costs.

2.6.
Extensions and rule refinement.We say an embedding φ : s → s is a prefix of there is some embedding θ : s → s such that θφ = φ and write ψ ≤ φ for this.We refer to a prefix of an epi φ : s → s as an extension of s.In the category of extensions of s, a morphism between objects φ : s → s and φ : s → s is an embedding θ : s → s such that the triangle commutes.If θ is an iso we write φ ∼ = s φ .
Epis of rSGe C are characterized as follows [25]: suppose s : G → C and s : G → C are contact maps then φ : s → s of rSGe C is an epi iff every connected component of G contains at least one agent in the image of φ A .Rule application preserves epis and in fact also prefixes of epis: Lemma 1.Let r = (r L , r R ) be a rule and φ : r L → x be an embedding with r L , r R , and x contact maps in rSGe C .The embedding φ : r R → x that results from applying r to φ is a prefix of an epi iff φ is.
Proof.This amounts to proving that ψ ≥ φ is an epi if ψ ≥ φ is.For this it is enough to consider the case where the rule adds or deletes exactly one edge.Rules that modify more than one edge at a time can be decomposed as sequences of deletions and insertions of edges.Given that each deletion and insertion preserves the property, the sequence will preserve it as well.The case of adding an edge is easy, as the image of ψ has fewer connected components to "touch".The case where r deletes an edge can introduce new connected components, however in this case both ends u, v of the deleted edge must be in r L , so whether the deletion disconnects or not the codomain of ψ, the components of ψ (u) and ψ (v) will have a pre-image, namely u and v.
It follows that the category of extensions of r L and r R are isomorphic.Indeed, because we have chosen to restrict to reversible rules, the full categories under r L and r R are evidently isomorphic, and by the lemma above this correspondence preserves being prefix-of-an-epi; if φ is an extension of r L , we will write φ for the corresponding extension of r R .
A family of epis φ i : s → t i uniquely decomposes s, or is a refinement of s, if, for all mixtures h and embeddings ψ : s → h, there exists a unique i and ψ such that ψ = ψ φ i .This is the basic requirement for a reasonable notion of rule refinement: it guarantees that the LHS s of a given rule splits into a non-overlapping and exhaustive collection of more specific cases t i .
In the next section, we will be constructing specific such decompositions in order to produce families of subrules that are compatible with energy patterns, i.e. each subrule should produce and consume the same number of energy patterns in each application, regardless of the particular embedding ψ ∈ Υ C (r L , h) that the subrule is applied to.
To find such unique decompositions we first recall the growth policy method [25], which works by detailing which agents and sites should be added to s.Specifically, a growth policy Γ for s is a family of functions Γ φ , indexed by all extensions φ : s → t, where each agent in |t| is allocated a subset of the set of sites belonging to the agent t A (u) it is mapped to in C.An agent in |t| may cover some, or all, of these sites or even completely extraneous sites: if the former, i.e. for all u in A |t| , t S (σ −1 |t| (u)) ⊆ Γ φ (u), we say that φ is immature; if for all us, the inclusion is an equality and φ is also an epi, we say that φ is mature; otherwise φ is said to be overgrown.The functions Γ φ must satisfy, for all extensions φ and φ ≥ φ, the faithfulness property, Γ φ = Γ φ • ψ A with ψ such that ψφ = φ ; so a site requested by φ must be requested by any further extension.Additionally, this property forces Γ to eagerly ask for all sites that will be eventually requested at any given agent in the codomain of φ.If φ is not overgrown then no φ ≤ φ is overgrown either.Also, note that the union of two growth policies is itself a growth policy.
Given an s and a growth policy Γ for s, we define Γ (s) by choosing one representative per ∼ = s -isomorphism class of the set of all extensions of s which are mature according to Γ .
The following theorem guarantees that factorizations through Γ (s) are unique when they exist, but not that they necessarily do exist.In the next section, we will construct a specific growth policy of interest for which this property of exhaustivity of the decomposition can be proved by hand.As such, it fulfils our desired criteria of providing an exhaustive collection of mutually exclusive sub-cases.
Theorem 1.Let s and x be contact maps and Γ a growth policy for s.If an embedding ψ : s → x in Υ C (s, x) can be decomposed in two ways as γ 1 φ 1 and γ 2 φ 2 with φ i : s → t i in Γ (s) and γ i : where φ 1 and φ 2 are mature extensions of s according to Γ and φ 1 = φ 2 .As shown in diagram (2.2), we have an inner square formed by the pull-back π 1 , π 2 , and the minimal gluing θ 1 , θ 2 of γ 1 and γ 2 .Also θ 1 and θ 2 are epis, as every connected component of m has a pre-image in t 1 or t 2 and so also in s, since the φ i s are epis, and so also in the other of t 2 and t 1 .
If θ 1 , θ 2 are not both isomorphisms then there must be a pair u, z, consisting of a node in m with pre-images u 1 , u 2 in t 1 , t 2 and a site z of u, such that z has no preimage in exactly one of θ 1 , θ 2 .Say it is θ 2 .Since φ 1 is not overgrown, z ∈ Γ φ 1 (u 1 ) and, by faithfulness, z ∈ Γ φ ((u 1 , u 2 )), where (u 1 , u 2 ) is the pull-back pre-image of u 1 and u 2 .So again, by faithfulness, z ∈ Γ φ 2 (u 2 ) which contradicts our original assumption.So θ 1 and θ 2 are isos.It follows that φ 1 = φ 2 as there is only one representative per ∼ = s -isomorphism class in Γ (s).Finally, γ 1 = γ 2 because φ 1 is an epi.
Given a rule r and an extension φ : r L → t of r L , we write r φ for the refined rule associated to φ; that is, r φ is the pair (t, t ) with t the codomain of the extension φ corresponding to φ.Given Γ a growth policy for r L , we write Γ (r) for the family of rules obtained by refining r according to Γ ; that is Γ (r) is the family of rules r φ , for φ ranging in Γ (r L ).
An example of growth policy is the ground policy which assigns all possible sites to all agents.In this case, Γ (s) is simply the set, possibly infinite, of all epis of s into mixtures, considered up to ∼ = s ; and Γ (r), the ground refinement of r, contains all refinements of r along those epis, which therefore directly manipulate mixtures.It is easy to see that the ground refinement for our running example is infinite, since each of the three rules can trigger the extension of a chain of any length.

Rule generation
We fix a finite set G of generator rules; and a finite set P of connected contact maps in rSGe C ; these are our energy patterns.
A canonical set of generator rules can be derived from the contact graph C by constructing a pair of binding/unbinding rules (and a full set of state changes if we were to use site graphs with states as we do in applications) for each edge in E C .This set is maximally general in that each generator rule asks for the least possible context to trigger a binding or unbinding event.This is the set which we have chosen to use in our running example.In the next example which we present in §4, we will consider a strict subset of this canonical generating set.But, in general, there is no prescription regarding which rules one decides to incorporate to G.
The goal is now to refine G into a new rule set G P where each refined rule is P-balanced, which means that, however applied, it consumes or produces a fixed amount of each c in P. The construction proceeds in two steps.Firstly, we characterize balanced refinements; secondly, we define a specific growth policy with balanced mature extensions.Using Th. 1, we show that these mature extensions obtain a proper finite refinement of G.
Note that ground extensions of g are trivially balanced but, in general, the ground refinement is impractically large or even infinite; ours will always be finite.
3.1.Consumption and production of patterns.Consider c in P and a rule r.For an r-event ψ to consume an instance γ of c in a mixture h, γ S , and ψ S must have images which intersect on at least one site which is modified by r (by adding an edge if it was free, or by deleting or changing an edge it was part of).This is the case iff the associated minimal gluing (γ , ψ ) (obtained by restricting the co-span to the union of its images in h) has the same property.Likewise, for an r-event to produce an instance of c, the associated minimal gluing between c and r R must have a modified intersection.We call such minimal gluings relevant; they are the ones which underlie events that can affect the instances of c.
This notion can be illustrated by looking at the minimal gluings of the left-hand side of the unbinding rule ?12? → ?1 + 2? and energy pattern ?1231? (supposing for a moment that we had ?1231? in P): ∅ Of these, only the central square is a relevant minimal gluing since the image of the edge consumed by the rule is contained in the image of the energy pattern in the minimal gluing.
3.2.P-balanced extensions.Given g in G with LHS g L and φ : g L → t an extension of g L , we say that φ is P-left-balanced iff, for all relevant minimal gluings γ : c → m ← t : θ with c ∈ P, θ is an isomorphism.This means that the image of c under γ is contained in t.
Symmetrically, one says that φ is P-right-balanced iff φ is a P-left-balanced extension of r .An extension φ is P-balanced iff it is P-left-and P-right-balanced.We say that a balanced extension φ is prime iff it is minimally so, i.e. any prefix of φ that is P-balanced is isomorphic to φ (as an extension).Prime extensions are epis since erasing an 'untouched' connected component in the codomain preserves balance.
If φ is a P-balanced extension of g, the refined rule g φ has a balance vector in Z P , written ∆φ, where, for each c ∈ P, ∆φ(c) is the number of copies of c produced by any g φ -event, which is also the difference between the number of embeddings of c in the RHS and the LHS of g φ .In other words, for any mixture h, balance of φ guarantees ∆φ , well-chosen applications of g φ will result in different and contextdependent ∆φ(c).Thus, the notion of balanced extension characterizes the property that we want.
To illustrate these definitions consider the unbinding rule g 12 =?12?→ ?1 + 2? with the trivial extension φ = 1 r L (the identity arrow of the left-hand side of rule g 12 ).Define P 1 = {?12?, ?23?, ?31?} and P 2 = {•123•}.On the one hand, φ is P 1 -balanced.Of all possible gluings of patterns in P 1 with φ's codomain, only the trivial one of ?12? with itself is relevant to g 12 .On the other hand, φ is not P 2 -balanced, because there is a gluing between •123• and ?12? mapping the 12 edge to itself in the triangle.This gluing is relevant since applying g 12 breaks the triangle; and clearly θ : ?12? → •123• is not an iso.Failure to be balanced reflects in the fact that different applications of g 12 incur different values of ∆φ(•123•): namely 1 or 0 depending on whether the broken edge belongs to a triangle.
The ideas introduced now are discussed in a detailed example in §3.5.

Absorb or Avoid.
We have just seen how the property of being P-balanced characterizes, among all extensions φ of g L , those with an unambiguous energy balance.The next step is to define a growth policy with a set of mature extensions which are balanced and form a unique decomposition of g L .If we return to diagram (3.1) in the case where θ is not an isomorphism, the natural idea is to request the addition to the codomain of φ of these sites which are in m and belong to nodes in the image of θ A .In this way, we force φ to grow further and make a decision about whether it wants to absorb the image of c in m, or would rather avoid this by growing in a way that is incompatible with the γ, θ gluing.Some care is needed to ensure faithfulness, i.e.Γ φ = Γ φ φ φ A , since relevant minimal gluings on φ can disappear along a further extension φ so that a site that was requested at φ may no longer be so after at φ φ.To address this, we add site requests from all relevant minimal gluings in the past of an extension.
2) Given g in G we define a growth policy Γ for g L .Suppose φ : g L → t is an extension of g L .We set Γ φ to request a site s in where φ 1 : g L → t 1 , and there is a relevant minimal gluing γ : c → m ← t 1 : θ, with c in P, and some ) such that u = φ 2,A (u 1 ) and s = m S (s 1 ).We refer to the extension φ 2 : t 1 → t as a rewind of φ; we say that the request of s at u originates from u 1 .The first clause simply ensures that all sites already covered in g L are asked for; the second one adds in sites which appear by gluing at some point between g L and t, and implements the absorb-or-avoid constraint explained beforehand.
Symmetrically, we define a growth policy Γ for g R by applying the same definition to the reverse generator g .Since extensions of g L and g R are isomorphic, we can, with a slight abuse of notation, define Γ P := Γ ∪ Γ .
Theorem 2. The above Γ P is indeed a growth policy for g L ; the induced refined family of rules Γ P (g) is exhaustive, non-empty, P-balanced, and finite.
Proof.We take the same notations as in diagram (3.2).Growth policy: Clearly, Γ P φ 1 (u 1 ) ⊆ Γ P φ (u) as every request for a site in t 1 will propagate to t by definition.To prove the other direction, we need to verify that the requests generated by rewinds do not depend on the choice of factorization.So, wlog, consider gluings on left extensions of g, and let an alternative factorization of φ through t 2 be given which gives rise to a site request in u originating from some u 2 in t 2 : Consider p the pull-back of the two rewinds (ie the lower co-span); by construction it contains a pre-image for u 1 and u 2 , say (u 1 , u 2 ).The relevant minimal gluing of c and t 2 that makes the site request restricts to another (minimal) gluing of c and p.This new gluing is still obviously relevant (as it contains the same overlap with the original g L ); as such, the same site request is made by the pre-image (u 1 , u 2 ) agent in p which then propagates to u 1 in t 1 as required.
Surjectivity: Let φ : g L → x be an embedding of g L into a mixture.We can restrict the co-domain of φ to be the connected closure y of the image of φ in x, resulting in an epi φ y : g L → y.Let us further restrict y by removing: 1) all unneeded agents, i.e. those that have no sites requested by the growth policy, 2) all sites not requested by the growth policy except those bound to sites that are requested and, finally, 3) all connected components that no longer have a pre-image in g L ; call the result z.The non-requested sites left in z act as binding types (see §2.2).We thus obtain an epi φ z : g L → z which is mature with respect to Γ P since, by construction, it has all the sites requested by φ y and so, by faithfulness, all those requested by φ z .
Non-empty: Clause (i) guarantees that we request at least the sites in g which implies that g is not overgrown.
Balanced : If φ ∈ Γ P (g) is not balanced then there must be some relevant minimal gluing inducing a further site request, hence φ cannot be mature.
Finite: A request for a site a at some node in an extension φ : g L → t, or φ : g R → t, originates from a relevant minimal gluing of some c in P with a prefix φ 1 of φ.Because this gluing is relevant, it must be that a is at a distance from the image of g L in the codomain of φ 1 which is at most δ(c), the diameter of c (else c would not intersect the image of g L ).The same bound holds in the codomain of φ, as distances can only contract by further extension.Therefore any site requested in t has a distance to the image φ(g L ) which is bounded by max c∈P δ(c).If φ is not overgrown, this sets a bound on the diameter of t.Hence there are finitely many mature extensions.
Therefore, given G and P, we obtain a finite P-balanced rule set which refines G exhaustively, by setting G P := ∪g∈G Γ P (g) (disjoint sum).To every refinement g φ , corresponds an inverse refinement g φ ; hence, G P = G P is closed under inversion like G.

3.4.
Other options considered.As said earlier, there are other ways to generate a balanced rule set.Beyond the ground extensions, one idea is to use primes, i.e. minimal balanced extensions.Clearly, prime extensions factorize all balanced ones -including the ones we generate by using the absorb-or-avoid growth policy presented above.The problem is that this factorization is not unique: prime extensions do not define in general a valid refinement as distinct sub-cases may overlap.
Another idea is to obtain balanced rules by gluing P directly onto a generator rule g, in all possible maximal combinations, rather than on all extensions of g as in the above growth policy.This always reveals enough of the context in which g is applied so as to get P-balanced refinements because, by definition, no additional form can be glued further.The problem with this approach is the opposite to that with primes: it does not in general build enough rules; so the refinement is valid but not exhaustive.
3.5.Triangles in detail.We now discuss the concepts introduced in this section using the example of the unbinding rule g 12 = ?12? → ?1 + 2?.Of the four patterns in P, we need only consider the triangle T = •123• as the others all trivially glue or trivially don't.We apply our growth policy to the extension φ 1,? : ?12? → 12? whereby we reveal a new free site in 1.The relevant minimal gluing T → T ← ?12? disappears in this extension, as the new site is bound in T .However, by rewinding back to ?12?, our growth policy requires us to reveal additionally the second site of 2; as such, φ 1,? is not mature despite being balanced (indeed minimally so).
Intuitively, the growth policy forces us to reveal additional sites of ?12? because an instance of g 12 may or may not consume a triangle and so has an ambiguous energy balance.The definition of the growth policy makes it explicit that we have to reveal both of the hidden sites of ?12? and so there are four sub-cases to consider: 12, 3 12, 12 3 and 3 12 3 (using the notation introduced earlier for binding types).The first three cases all avoid the triangle, thereby having definite energy balance, and are moreover mature with respect to the growth policy; however, the case of 3 12 3 requires further analysis as it covers both the case of a triangle and of any chain of length at least 4. T ?3123?
Consider then the case of the extension to ?3123?where the two binding types of 3 12 3 have been embodied in two distinct 3 agents.Any rewind of interest has to go back far enough to glue T ; there are two maximal such extensions: φ ?3,? : ?12? → ?312? and φ ?,3? : ?12? → ?123?.The former requests the hidden site of the lefthand 3 agent of ?3123?; the latter (diagram not shown) requests the hidden site of the right-hand 3 agent.As such, ?3123? is not mature with respect to the growth policy and we must expand it into its four sub-cases: 3123, 2 3123, 3123 1 , 2 3123 1 .
If were to glue only on ?12?, not on all extensions thereof, we would only generate the rule T → 231, i.e. the case where the bond happens to belong to a triangle.Clearly, this is a valid refinement but is far from covering all cases.Note also that the extension φ 1,2 : ?12? → 12 is not a minimal balanced extension as it factors through either φ 1,? : ?12? → 12? or φ ?,2 : ?12? → ?12 which are both minimal.This illustrates the problem with prime extensions discussed earlier: the candidate refinement {12?, ?12} is not valid precisely because it leads to an ambiguous decomposition of the case of φ 1,2 and will therefore generate rates that are different for instances of g 12 with the same balance vector ∆φ.
That said, the output of the growth policy could be either optimized by hand or indeed subjected to the procedure of rule compression which would automatically perform this optimization for us.A working Kappa model where the generators of this example have been fully expanded according to the growth policy and then manually compressed can be found in Appendix A.
3.6.Rates.To equip G P with rates, we suppose given a P-indexed real-valued vector of energy costs , and a rate map k : G P → R + such that, for all g φ in G P : with ∆φ in Z P , the balance vector of the refined rule g φ with respect to P, a well-defined quantity by Th. 2.
We write P(x) for the P-indexed vector which maps c to |Υ C (c, x)|, and define the energy E(x) of x as • P(x), i.e. the sum over all c ∈ P of the product of the number of instances of c in x and the energy cost of c.We also write L G (x) for the finite (strongly) connected component of x in L G , and define a probability distribution (in Boltzmann format) π x on L G (x) by: We can now prove our main theorem.
Theorem 3. Let G, P, G P , k, and π x be defined as above; then (i) L G P and L G are isomorphic as symmetric LTSs; and (ii) for any mixture x, the irreducible time-homogeneous continuous-time Markov chain L k G P has detailed balance for, and converges to, π x on L G P (x).
Proof.Both L G and L G P offer transitions from a mixture x: the former are labelled by pairs (g, ψ) with g in G, ψ in Υ C (g L , x); the latter by pairs (g φ , γ) with g φ the refinement of g along a mature extension φ : g L → t, and γ in Υ C (t, x).Steps in the latter can be mapped to steps in the former by transforming labels as follows: (g φ , γ) → (g, γφ).As G P refines G exhaustively (Th.2), this correspondence is a bijection, which establishes the first claim.
(Pedantically, there is a full and faithful functor between the two corresponding free categories which is the identity on objects -incidentally, this bijection is readily seen to respect the symmetries on labels.) Since we have multiple rules in L G P , each of which can be applied in several ways, there can be more than one transition from x to the same y -each uniquely described by a (g φ , γ) label.Each such (g φ , γ) has an inverse, (g φ , γ ), where: g is the rule inverse to g; φ corresponds to φ in the isomorphism between the categories of extensions of x and y, with φ A = φ A ; and γ is the embedding corresponding to γ, also with γ A = γ A .One can easily verify that φ is an epi, and that φ is also mature.Hence (g φ , γ ) determines a valid transition in L G P which is inverse to (g φ , γ), and we have a bijective correspondence between transitions from x to y and those from y to x.
Consider a pair e, e of such corresponding events due to g φ and g φ ; because e is a transition from x to y, and φ is P-balanced (Th.2), we have P(y) = P(x) + ∆φ, and hence • ∆φ = • (P(y) − P(x)); so, by (3.3), the rates of e, e are such that: and by summing this equation over all pairs, we obtain detailed balance for the probability local to the component L G P (x) = L G P (y), defined above as π x = π y , since: q(y, x) e − •P(y) = q(x, y) e − •P(x) The convergence statement follows by standard results on finite CTMCs.Note that the subset of the state space which is reachable from x in L G , namely L G (x), is finite; hence, the partition function Z(x) := z∈L G (x) e −E(z) which figures in the denominator of π x is also finite.In the presence of rules which increase the number of agents, the components L G (x) can be infinite and Z(x) may diverge.For (mass action stochastic) Petri nets, convergence is guaranteed if detailed balance holds, but it is not true in general for Kappa [7,11].
Another point worth making is that the result holds symbolically -regardless of the energy costs .So can be seen as a set of parameters, an ideal support for machine learning techniques if one were contemplating fitting a model to data.

3.7.
A linear kinetic model.We now ask: what of the actual rates of L k G P ?Among all possible choices which accord with (3.3), we delineate a tractable subset whose size grows quadratically in |P|.This is a useful log-linear heuristic, which is common in machine learning, but has no particular claim to validity.
Suppose we have, for every generating rule g in G, a constant c g ∈ R, and a matrix A g of dimension |P| × |P|.Subject to the constraints that c g = c g , and A g + A g = I, we can define a log-affine rate map which satisfies (3.3) by: The kinetic model expressed in (3.5) requires of the order of |P| 2 × |G| parameters.In practice, one needs even fewer parameters, as only those energy patterns that are relevant to a given g, i.e. have non-zero balance for at least one rule in Γ P (g), need to be considered when building A g .Typically, for larger models, this will be a far smaller number than |P|.This relative parsimony is compounded by the fact that the number of independent parameters will be often lower, because the ∆φ family often has low rank.It is to be compared with the total number of choices which is far greater as it is of the order of the number of refinements, that is to say g∈G |Γ P (g)|.
If we set c g = c g = 0, A g = 0, A g = I, we get: k(g φ ) = e − •∆φ , k(g φ ) = 1.As • ∆φ is the difference of energy between the target and source in any application g φ , this choice amounts to being exponentially reluctant to climb up the energy gradient.This is a continuous-time version of the celebrated Metropolis algorithm [24].
Another particular case, completely symmetric and which can be a reasonable choice to begin with, is obtained for with C g = e cg .As ∆φ = −∆φ, this is indeed a symmetric definition.
Finally, it is fun to draw a comparison between the ascription given in (3.5) and the Arrhenius rate law.This law posits a dependency of the rate constant k of a reaction of the form log k = c − E a /kT , where c is a constant (defining the basic time scale of the reaction), E a is the so-called activation energy of the reaction and T is the temperature.In our case, we are not concerned with the effect of T on the (logarithm of the) rate but with the effect of consuming and producing various energy patterns in P at the locus of the instance of the generator rule g.In this view of things, (3.5) posits that the 'activation energy' of φ depends linearly on the cost of the various patterns and the balance of φ.
3.8.Energy functions do not need to be linear.We now return to a key assumption made in the preceding section and consider a more general situation where the energy function E is no longer asked to be linear.For reasons to become clear shortly, we still assume the much weaker property that E can be factored as v • P( ) for some finite set of patterns P.
Note that if we see multi-sets as equipped with the usual point-wise partial order, P( ) is evidently functorial.
As an example, consider a divalent agent with contact graph X(a 1 , b 1 ) -where the shared superscript symbolizes an edge between sites a and b-with which one can form only chains and cycles, and a single generator g which can create/delete the unique edge type.Write c 3 for a cycle of length 3 (a triangle), and t 2 for an open chain of length 2. We define the quadratic energy function Applying g forward to a chain t 2 in a site graph of the form x = t 2 + x will create a new copy of c 3 , and give the following energy balance: Therefore, detailed balance forces the log-ratio Kg of the backward/forward rates assigned to an edge creation to depend on x.This is unlike the case of linear potentials where this ratio is independent of x.However, this extension g φ of g (where it is applied to a chain t 2 ) is balanced with respect to P. This means, as we have seen, that the stoichiometric P-vector ∆φ associated to g φ -where each component ∆φ(c) is defined as the difference of |Υ C (c, y)| − |Υ C (c, x)| for a g-transition from x to y-is the same for all x, y.In the example ∆φ has only one component, which is constant ∆φ(c 3 ) = 1 because the binding of the two free extremes of an open chain t 2 can only produce one triangle, regardless of the context in which the refined rule g φ is applied.
In general, one can visualize the situation as follows.
x g G G and detailed balance amounts to asking for Kg = v(P(x) + ∆φ(x)) − v(P(x)).If v happens to be linear then this is the usual condition Kg = v(∆φ(x)).If v is not linear, the constraint does not seem very helpful as a priori one has to know x to compute the right hand side.
But by the assumption (3.6) opening the paragraph, ∆φ factors through P, hence: Kg = v(P(x) + σ g (P(x))) − v(P(x)) =: where the second equation defining ψ g uniquely as a real-valued function from P-multisets.
With this rewriting, it is plain that the constraint depends not on full knowledge of x, but only on P(x).Equivalently, we see that Kg factors through P( ) just like E. In the example, Kg = 2|Υ C (c 3 , x)| + 1, and ψ g (n) = 2n + 1.This is good enough to define rates for a balanced g φ .For example, by analogy with the earlier linear kinetic model, we can choose log-rates (seen as real-valued multi-set functions) as follows: kg = α g − β g ψ g (3.7) with α g , β g real-valued functions on P-multisets such that α g = α g and β g + β g = 1.This assignment solves the above constraint as, clearly, ψ g + ψ g = 0.
From the simulation point of view, this added generality requires two things: (i) that rates can be made to depend explicitly on observables; (ii) that the internal state of the simulation be extended to incorporate P(x).Both possibilities are already generically available in the current version of the main Kappa simulator KaSim [4].A slight modification of the engine (not implemented) could obtain direct updates to P(x) as, by assumption, applying g φ leads to a constant +∆φ update; and the same holds for propagating these updates to the rates of the rules which depend on them, e.g. as in (3.7).Thus, the complexity properties of the simulation algorithm are preserved [10].

Allosteric ring
We now put our methodology to use on a realistic example of a bacterial flagellar engine.In this section, we will use the traditional syntax of Kappa to denote site graphs: subscripts for states and shared superscripts for edges between sites, e.g.A(x 1 0 ), B(y 1 ).Unlike the mathematical definitions of §2, agent and site types are indicated as explicit labels.Again, we use KaSim (the standard Kappa engine) for the simulation shown below.
4.1.The model.The engine can rotate clockwise or anti-clockwise at high angular velocities, and this will decide whether the bacterium tumbles or swims forward.One can build a simple model of the switch between the two modes [2].The engine is seen as a ring of n identical components, P , with two possible conformations, 0 and 1.(In reality, each of the n = 34 component protomers is itself a tiny complex made of different subcomponents, but the model ignores this.)A ring homogeneously in state 0 (1) rotates (anti-) clockwise and induces tumbling (straight motion).Importantly, neighbouring P s on the ring prefer to have matching conformations.States of the ring with many mismatches thus incur high penalties.In the absence of any Y molecule binding a P , its favoured conformation is 0; conversely, in the presence of a Y, P favours 1. (Y stands for a small diffusible protein named CheY.)To bind, Y has to be activated by an external signal.Hence the switch can be triggered by a sudden activation of Y which then binds the ring and induces a change of regime.The sharper the transition between the two regimes the better.
As each of the P s can be in four states, the ring has on the order of 10 20 non-isomorphic configurations which precludes any reaction-based (e.g.Petri nets) approach to the dynamics where each global state is considered as one chemical species.At this stage, we could apply the rule-based approach, or, better, we can obtain the rules indirectly by applying the methodology of §3.This is what we now do informally.
First, we define our contact graph with two agent types: P (x, y, f 0,1 , s) with domains x, y to form the ring, s to bind its signal Y , and f a placeholder for P 's conformation; Y(s u,p ) with two internal states to represent activity.

Motif Cost
P (f i , x 1 ), P (y 1 , f j ) Second, we capture the informal statements in the discussion above by defining the energy patterns and associated costs.Note that the various motifs overlap.Following §3, we associate to each ring configuration x the occurrence vector P(x) and total energy • P(x).For example, a ring of size n uniformly in state 0 and with no bound Y s has total energy n( P P 00 + P 0 ).This, in turn, defines the equilibrium distribution of the ring, namely x has probability proportional to exp(− • P(x)).(The convention is that the lower the energy, the likelier the state.)P P 00 , P P 11 < P P 10 , P P 01 (4.1) In order to complete our energy landscape, we need to pick energy costs which reward or penalize local configurations as discussed above: the role of (4.1) is to align the internal states of neighbours on the ringan Ising penalty term for mismatching neighbours which will "spread conformation"; (4.2) makes 0 the favoured state, while (4.3), which kicks in only in the presence of Y, makes 1 the favoured state.
The next step is to create the dynamics.The naive rule b for PY binding: b := P (s), Y (s p ) ↔ P (s 1 ), Y (s 1 p ) has a ∆E which is ambiguous as it will be either P Y 0 or P Y 1 depending on its instances; hence, we have no hope of assigning rates to this rule that satisfy detailed balance -unless 1 , which contradicts (4.3).To get a definite balance, one needs to refine this rule: b 0 := P (f 0 , s), Y (s p ) ↔ P (f 0 , s 1 ), Y (s 1 p ) b 1 := P (f 1 , s), Y (s p ) ↔ P (f 1 , s 1 ), Y (s 1 p ) Now each rule b i specifies enough of the context in which it applies to have a definite energy balance P Y i .Following the same intuition of revealing (just) enough context, we obtain a balanced rule set for conformational changes: f ij := P (f i , y 1 ), P (x 1 , f 0 , y 2 , s), P (x 2 , f j ) ↔ P (f i , y 1 ), P (x 1 , f 1 , y 2 , s), P (x 2 , f j ) f ij := P (f i , y 1 ), P (x 1 , f 0 , y 2 , s ), P (x 2 , f j ) ↔ P (f i , y 1 ), P (x 1 , f 1 , y 2 , s ), P (x 2 , f j ) The first (second) group of rules represents the changes in the absence (presence) of a Y bound to the middle P undergoing a change of conformation.(The fact that P 's site s is bound is indicated by the underscore exponent.) These f -rules have respective balance: As we have ten reversible rules, and only eight energy patterns, there must be linear dependencies between the various balances.Indeed, in this case, it is easy to see that the family of vector balances has rank six.Thermodynamic consistency induces relationships between rates; a well-established fact in the case of reaction networks (e.g.see Ref. [7]).With the rules in place, the final step is to choose rates which satisfy detailed balance.This guarantees that the obtained rule set converges to the equilibrium specified by the choice of the energy cost vector.Convergence will happen whatever is, i.e. symbolically.If, in addition, follows (4.1-4.3),one can see in Fig. 2 that the ring (i) undergoes sharp Note that there is a design choice here.In effect, we are saying that we are not interested in forming/breaking the bonds between the P s in the ring.If we wanted to incorporate also the ring assembly in the model, we would have to add P (x), P (y) ↔ P (x 1 ), P (y 1 ) among our generator set G. This would generate many more refined rules, as we will see.Recall that our patterns fall in three subgroups: P (f i , x 1 ), P (y 1 , f j ); P (f i ); and P (f i , s 1 ), Y (s 1 ).
Consider the extensions of b: clearly only the last pattern can glue relevantly on it; the corresponding (unique) site request is for P to reveal f and its internal state.This gives the first two rules b 0 , b 1 .
Consider now the more interesting extensions of f : the second pattern type glues relevantly but does not generate any site request; the third one asks P to reveal its site s, resulting in two possible extensions (s means that s is bound): P (f 0 , s) ↔ P (f 1 , s) P (f 0 , s ) ↔ P (f 1 , s ) These extensions are not mature yet, as one can glue relevantly patterns of the first type on both sides of P , inducing a further request for revealing P 's sites x and y.
If we are in the component of an initial state where P s are arranged in a ring, then we know that the neighbours on both sides exist and are P s; this gives the final refinement of the above into the rules f ij , f ij described earlier.If, on the other hand, we do not know that, we also have to add several rules where one or both of x, y are free, corresponding to open P -chains.This demonstrates the sensitivity of the obtained rule set to the initial choice of generators.
Hence, the rules we generated by hand are indeed the ones we would generate using our general refinement strategy of §3.
We can visualize the obtained simulations by extracting snapshots before, after and during the injection of active Y s, as in Fig. 3. Again we see few mismatches in both regime because of the Ising interaction expressed by the P P energy costs.The choice of rates made in Ref. [2] for the f -generator is the symmetric version of (3.5).

Conclusion
We have presented a new 'energy-oriented' methodology for the development of site graph rewriting models based on a set P of energy patterns; these patterns use a graphical syntax which allows us to specify the energy landscape.Rewrite rules are implicitly defined by P and generator rules G.The resulting rule set G P is guaranteed to be thermodynamically correct and to converge to the probability distribution described by the energy landscape, given suitable rates.The construction is entirely parametric in the energy costs , and modular in G.This means that in a modelling context, one can sweep over various values for without having to rebuild the model, and compositionally add new rule components to G.Both features are clearly useful.We expect our construction to provide a broad and uniform language to describe and analyse models of interacting biomolecules and similar systems of a quantitative fine-grained and distributed nature.
There are no specific conditions bearing on this construction other than that energy patterns should be local.It would be interesting to investigate whether suitable constraints on patterns and generator rules can obtain optimized generated rule sets.Another interesting extension would be to deal with non-local forms of energies expressing long-range interactions, where the metric is read off the graph itself.In practice, there will be many more rules generated, and beyond the descriptive aspects, simulations will need new ideas to be feasible.A ray of hope comes from the log-affine kinetic model presented in §3, as rules can be partitioned by energy balances for faster selection.
Finally, as said in the introduction, there is a growing body of literature which turns a theoretical eye to site graph rewriting [17,13,18,6], and it is tempting to ask whether our derivation can be replayed in more abstract settings; in particular, it would be very interesting to investigate its integration with the abstract framework for rule-based modelling developed in [23].all agents are used to build triangles.Instead, when (T ) = −2.5only less than 5% is used.Interestingly, in both cases the set of states that minimize the energy function is the same, namely those states that maximize the amount of triangles.So then why is it that in the latter case there are so few triangles?The reason is entropic: although the probability of being in a state with few triangles is small, there are many such states and together they outweigh the probability of being in the few states were the energy is minimized.By further decreasing the energy of those few states we compensate for this mass effect, until at (T ) = −10, order wins, and the effect is not noticeable anymore.

Figure 2 : 4 . 2 .
Figure 2: The simulation steps up the amount of active Y at t = 100, and down again at t = 200; this sends the entire ring into an homogeneously 1 conformation, and back to 0. The number of mismatches (lowest curve) stays low, even during transitions.

Figure 3 :
Figure 3: Snapshots of the ring configuration are taken at time 500, 1500, and 2500.Solid (green) circles indicate conformation 1, hollow ones conformation 0; a dot in the centre indicates a bound (hence active) Y .At times 500, 2500, no Y is bound (because they are all inactive) and the ring is globally in state 0, up to tiny fluctuations; at time 1500, it is globally in state 1 as a consequence of the binding of Y s.

Figure 4 :
Figure 4: Trajectories for T = •123• when (T ) varies (this value is set by changing the value of parameter 't' in the KaSim file).