Continuous Positional Payoffs

What payoffs are positionally determined for deterministic two-player antagonistic games on finite directed graphs? In this paper, we study this question for payoffs that are continuous. The main reason why continuous positionally determined payoffs are interesting is that they include the multi-discounted payoffs. We show that for continuous payoffs, positional determinacy is equivalent to a simple property called prefix-monotonicity. We give three proofs of this fact, using three major techniques for establishing positional determinacy -- the inductive technique, the fixed point technique and the strategy improvement technique. A combination of these approaches provides a better understanding of the structure of continuous positionally determined payoffs, as well as some algorithmic results.


Introduction
We study two-player turn-based games on finite directed graphs. In these games, two players called Max and Min travel over nodes of a given graph along its edges for infinitely many turns. In each turn, one of the players decides where to go next, and which of the two depends on a predetermined partition of the nodes between the players.
The game is of infinite duration. As a result of the game, we get an infinite path in our graph. Each infinite path is mapped to a real number called its reward, according to some payoff function (or, for brevity, a payoff). The larger the reward, the happier Max is; Min, on the contrary, wants to minimize the reward.
We consider only payoffs that are defined through edge labels. Namely, we first fix some finite set A of labels. Then we label edges of our game graph by elements of A. After this, any bounded function φ : A ω → R can be viewed as a payoff in our graph. Namely, it takes an infinite path, considers an infinite word over A "written" on this path, and applies φ to this infinite word.
Fix a strategy of one of the players (that is, an instruction how to play in all possible developments of the game). If this is a strategy of Max, then its value is the infimum of the rewards over all plays consistent with it; symmetrically, the value of a strategy of Min is the supremum of the rewards over the plays consistent with it.

In [GZ04], Gimbert and Zielonka briefly mention another interesting additional property, namely, continuity. They observe that the multi-discounted payoffs are continuous (they utilize this in showing that the multi-discounted payoffs are fairly mixing). In this paper, we study continuous positionally determined payoffs in more detail. A payoff is continuous if its range converges to just a single point as more and more initial letters of its input (which is an infinite word over the set of labels) are getting fixed. This contrasts with prefix-independent payoffs (such as the parity and the mean payoffs), for which any initial finite segment is irrelevant. Thus, continuity serves as a natural property which separates the multi-discounted payoffs from other classical positionally determined payoffs. This is our main motivation to study continuous positionally determined payoffs in general, besides the general importance of the notion of continuity.
We show that for continuous payoffs, positional determinacy is equivalent to a simple property which we call prefix-monotonicity. A payoff φ is prefix-monotone if there are no two infinite words α and β and no two finite words x and y such that φ(xα) > φ(xβ) and φ(yα) < φ(yβ).
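To make the definition concrete, here is a small numerical sketch (our own illustration, not part of the paper's formal development): for a discounted payoff with a single hypothetical discount factor and hypothetical weights, the sign of φ(xα) − φ(xβ) is the same for every finite prefix x, exactly as prefix-monotonicity requires.

```python
# A numerical sketch of prefix-monotonicity (our own toy payoff, not the
# paper's): a discounted payoff with one discount factor and hypothetical
# weights.  We check that the sign of phi(x alpha) - phi(x beta) does not
# depend on the finite prefix x.
import itertools

def discounted_payoff(word_iter, weight, lam, terms=200):
    """phi(a1 a2 a3 ...) = sum_n lam^(n-1) * weight(a_n), truncated."""
    total, factor = 0.0, 1.0
    for _, a in zip(range(terms), word_iter):
        total += factor * weight[a]
        factor *= lam
    return total

def periodic(prefix, cycle):
    """The infinite word `prefix` followed by `cycle` repeated forever."""
    yield from prefix
    while True:
        yield from cycle

weight, lam = {"a": 0.0, "b": 1.0}, 0.5
phi = lambda pre, cyc: discounted_payoff(periodic(pre, cyc), weight, lam)

# alpha = (ab)^omega, beta = b^omega; vary the finite prefix x.
signs = set()
for n in range(4):
    for x in itertools.product("ab", repeat=n):
        d = phi(list(x), ["a", "b"]) - phi(list(x), ["b"])
        signs.add(d > 0)
print(signs)  # only one sign occurs: the prefix never flips the comparison
```

Here the comparison is always decided the same way because prefixing rescales the difference of payoffs by a non-negative factor, as the paper later shows for all multi-discounted payoffs.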
A proof of the fact that any continuous positionally determined payoff is prefix-monotone can be found in Section 3. We give three different proofs of the opposite direction of our main result, using three major techniques of establishing positional determinacy:
• An inductive argument. Here we use a sufficient condition of Gimbert and Zielonka [GZ04], which is proved by induction on the number of edges of a game graph. This type of argument goes back to a paper of Ehrenfeucht and Mycielski [EM79], where they provide an inductive proof of the positional determinacy of Mean Payoff Games. This argument can be found in Section 4.
• A fixed point argument. Then we give a proof which uses a fixed point approach due to Shapley [Sha53]. Shapley's technique is a standard way of establishing positional determinacy of Discounted Games. In this argument, one derives positional determinacy from the existence of a solution to a certain system of equations (sometimes called Bellman's equations). In turn, to establish the existence of a solution, one uses Banach's fixed point theorem. This argument can be found in Section 5.
• A strategy improvement argument. For Discounted Games, the existence of a solution to Bellman's equations can also be proved by strategy improvement. This technique goes back to Howard [How60]; for its thorough treatment (as well as for its applications to other payoffs) we refer the reader to [Fea10]. We generalize it to arbitrary continuous positionally determined payoffs. This argument can be found in Section 7.
The simplest way to obtain our main result is via the inductive argument (at the cost of appealing without a proof to the sufficient condition of Gimbert and Zielonka). In turn, the two other proofs give the following additional results:
• Using the fixed point approach, in Section 6 we give an explicit description of the set of continuous positionally determined payoffs. Namely, it turns out that all continuous positionally determined payoffs are, in a sense, non-affine multi-discounted payoffs. We use this to give an example of a positionally determined payoff which does not reduce to multi-discounted payoffs in an "algorithmic sense".
• Using the strategy improvement approach, in Section 8 we show that a problem of finding a pair of optimal positional strategies is solvable in randomized subexponential time for any continuous positionally determined payoff.
We also believe that our paper makes a useful addition to these approaches from a technical viewpoint. For example, the main problem for the fixed point approach is to identify a metric with which one can carry out the same "contracting argument" as in the case of multi-discounted payoffs. To solve it, we obtain a result of independent interest about compositions of continuous functions. As for the strategy improvement approach, our main contribution is a generalization of such well-established tools as "modified costs" and a "potential transformation lemma" [HMZ13, Lemma 3.6].

Finally, we study continuous payoffs that are positional in stochastic games. Namely, in Section 10 we show that any continuous payoff which is positional in Markov Decision Processes is multi-discounted. On the other hand, it is classical that multi-discounted games are positional even in two-player stochastic games. Using it, we disprove the following conjecture of Gimbert [Gim07]: "Any payoff function which is positional for the class of non-stochastic one-player games is positional for the class of Markov decision processes".

Preliminaries
2.1. Notation. We denote the function composition by •. For two sets A and B by B A we denote the set of all functions from A to B. We write C = A ⊔ B for three sets A, B, C if A and B are disjoint and C = A ∪ B.
Take any set A. By A * we denote the set of all finite words over the alphabet A. By A + we denote the set of all non-empty finite words over the alphabet A. Finally, by A ω we denote the set of all infinite words over the alphabet A. For w ∈ A * , we let |w| be the length of w. For α ∈ A ω we define |α| = ∞.
For u ∈ A * and v ∈ A * ∪ A ω we let uv denote the concatenation of u and v. We call u ∈ A * a prefix of v ∈ A * ∪ A ω if for some w ∈ A * ∪ A ω we have v = uw. For u ∈ A * , by uA ω we denote the set {uα | α ∈ A ω }. Alternatively, uA ω is the set of all β ∈ A ω such that u is a prefix of β.
For u ∈ A * and k ∈ N we define u^k ∈ A * as the word obtained by repeating u exactly k times (with u^0 being the empty word). In turn, if u ∈ A + , we let u ω ∈ A ω be the infinite word obtained by repeating u infinitely many times. We call α ∈ A ω ultimately periodic if α = uv ω for some u ∈ A * , v ∈ A + .

2.2. Deterministic infinite duration games on finite directed graphs.
Definition 2.1. Let A be a finite set. A tuple G = ⟨V, V Max , V Min , E⟩ is called an A-labeled game graph if V is a finite set, V = V Max ⊔ V Min , E ⊆ V × A × V, and for every s ∈ V there is at least one edge of the form (s, a, t) ∈ E. Elements of V are called nodes of G. Nodes from V Max (resp., V Min ) are called Max's nodes (resp., Min's nodes). Elements of E are called edges of G. For an edge e = (s, a, t) ∈ E we define source(e) = s, lab(e) = a, target(e) = t. We imagine e as an arrow from source(e) to target(e) with the label lab(e).
We will apply the function lab not only to individual edges, but also to arbitrary finite or infinite sequences of edges. Namely, given a sequence of edges, we first apply lab to its elements, and then concatenate the resulting letters from A in the same order as in the sequence. We will get a word over A of the same length as the initial sequence of edges.
The out-degree of a node v ∈ V is the number of e ∈ E with source(e) = v. The last requirement in the definition of an A-labeled game graph means that the out-degree of every node must be positive.
A path in G is a non-empty (finite or infinite) sequence of edges of G with a property that target(e) = source(e ′ ) for any two consecutive edges e and e ′ from the sequence. For a path p, we define source(p) = source(e), where e is the first edge of p. For a finite path p, we define target(p) = target(e ′ ), where e ′ is the last edge of p.
For technical convenience, we also consider 0-length paths. Namely, for every node s ∈ V we consider a 0-length path λ s , for which we define source(λ s ) = target(λ s ) = s. Hence, there are |V | different 0-length paths.
If p and q are two paths of positive length and p is finite, then we can consider their concatenation pq. This will be a path if and only if target(p) = source(q). Now, if p = λ s is a 0-length path, then λ s q is a path if and only if source(q) = s. In this case, λ s q = q. Similarly, if q = λ s is a 0-length path, then pλ s is a path if and only if target(p) = s. In this case, pλ s = p.
Fix a finite set A and an A-labeled game graph G = ⟨V, V Max , V Min , E⟩. Consider the following infinite-duration game (IDG for short) which is played over G. Players are called Max and Min. Positions of the game are finite paths in G (informally, these are possible finite developments of the game). Possible starting positions are paths of length 0. Positions from where Max (resp., Min) is the one to move are finite paths p with target(p) ∈ V Max (resp., target(p) ∈ V Min ).
The set of moves available at a position p is the set {e ∈ E | source(e) = target(p)} of edges that come out of the endpoint of p. A move e from a position p leads to a position pe.
A Max's strategy σ in a game graph G is a mapping, assigning to every position p with target(p) ∈ V Max some move available at p. Similarly, a Min's strategy τ in a game graph G is a mapping, assigning to every position p with target(p) ∈ V Min some move available at p.
Let P = e 1 e 2 e 3 . . . be an infinite path in G. We say that P is consistent with a Max's strategy σ if for every finite prefix p of P with target(p) ∈ V Max it holds that σ(p) is the next edge of P after p. For s ∈ V and for a Max's strategy σ we let Cons(s, σ) be a set of all infinite paths in G that start in s and are consistent with σ. We use a similar terminology and notation for strategies of Min.
Given a Max's strategy σ, a Min's strategy τ and s ∈ V , the play of σ and τ from s is an infinite path P σ,τ s which can be obtained as follows. First, set p = λ s . Then repeat the following infinitely many times. If target(p) ∈ V Max , extend it by the edge σ(p). Similarly, if target(p) ∈ V Min , extend it by the edge τ (p). The resulting infinite path will be P σ,τ s . It is not hard to see that P σ,τ s is a unique element of the intersection Cons(s, σ) ∩ Cons(s, τ ).
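The construction of P σ,τ s is easy to mechanize. The following sketch uses our own, hypothetical graph encoding (edges as triples (source, label, target), positional strategies as dictionaries) and computes a finite prefix of the play of two positional strategies.

```python
# Sketch (our own encoding, not the paper's): the play of two positional
# strategies from a starting node.  Edges are triples (source, label, target);
# a positional strategy maps each node of its player to an outgoing edge.

def play(start, sigma, tau, v_max, steps):
    """Return the first `steps` edges of the play P^{sigma,tau}_start."""
    path, v = [], start
    for _ in range(steps):
        e = sigma[v] if v in v_max else tau[v]
        assert e[0] == v, "a strategy must pick an edge out of the current node"
        path.append(e)
        v = e[2]  # move to target(e)
    return path

# A toy 2-node graph: Max owns node 0, Min owns node 1.
v_max = {0}
sigma = {0: (0, "a", 1)}          # Max's positional strategy
tau   = {1: (1, "b", 0)}          # Min's positional strategy

prefix = play(0, sigma, tau, v_max, 4)
labels = "".join(e[1] for e in prefix)
print(labels)  # the play alternates between the two nodes: "abab"
```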
A Max's strategy σ in an A-labeled game graph G = ⟨V, V Max , V Min , E⟩ is called positional if σ(p) = σ(q) for all finite paths p and q in G with target(p) = target(q) ∈ V Max . For a positional strategy σ of Max and for u ∈ V Max , we let σ(u) be the move of σ from any position whose endpoint is u. That is, we can view a positional Max's strategy σ as a function σ : V Max → E. Obviously, this function satisfies source(σ(u)) = u for all u ∈ V Max . We define Min's positional strategies analogously.
We call an edge e ∈ E consistent with a Max's positional strategy σ if source(e) ∈ V Max =⇒ e = σ(source(e)). We denote the set of edges that are consistent with σ by E σ . If τ is a Min's positional strategy, then we say that an edge e ∈ E is consistent with τ if source(e) ∈ V Min =⇒ e = τ (source(e)). The set of edges that are consistent with a Min's positional strategy τ is denoted by E τ .
Fix a finite set A and an A-labeled game graph G = ⟨V, V Max , V Min , E⟩. Take any bounded function φ : A ω → R, to which we will refer as a payoff ("bounded" means that φ(A ω ) ⊆ [−C, C] for some C > 0). Given a Max's strategy σ in G, its value in a node s ∈ V (w.r.t. φ) is defined as follows:

Val[σ](s) = inf φ • lab(Cons(s, σ)) = inf{φ(lab(P)) | P ∈ Cons(s, σ)}.

That is, we first take all infinite paths from s that are consistent with σ. Then we consider all infinite words over A that are "written" on these paths. The set of these words is lab(Cons(s, σ)). Finally, we take the infimum of our payoff over this set.
Similarly, if τ is a Min's strategy in G, then the value of τ in a node s ∈ V (w.r.t. φ) is the following quantity:

Val[τ](s) = sup φ • lab(Cons(s, τ)) = sup{φ(lab(P)) | P ∈ Cons(s, τ)}.

Observe that for any Max's strategy σ, for any Min's strategy τ and for any s ∈ V we have:

Val[σ](s) ≤ φ(lab(P σ,τ s )) ≤ Val[τ](s),

since P σ,τ s belongs to both Cons(s, σ) and Cons(s, τ). A Max's strategy σ is called (uniformly) optimal if Val[σ](s) ≥ Val[σ′](s) for every Max's strategy σ′ and every s ∈ V; a Min's strategy τ is (uniformly) optimal if Val[τ](s) ≤ Val[τ′](s) for every Min's strategy τ′ and every s ∈ V.

Remark 2.2. "Uniformity" here refers to the fact that a strategy is optimal irrespectively of the starting node. One, of course, could consider strategies that are optimal for some nodes but not for the others. However, this kind of optimality is out of the scope of this paper. Thus, from now on, we write "optimal strategies" instead of "uniformly optimal strategies".
A pair (σ, τ ) of a Max's strategy σ and a Min's strategy τ is called an equilibrium if Val[σ](s) = Val[τ](s) for every s ∈ V. It is easy to see that any strategy appearing in an equilibrium is optimal. On the other hand, if at least one equilibrium exists, then the following holds: the Cartesian product of the set of optimal strategies of Max and the set of optimal strategies of Min is the set of equilibria. We say that φ is determined if in every A-labeled game graph there exists an equilibrium with respect to φ. We say that φ is positionally determined if every A-labeled game graph contains an equilibrium (w.r.t. φ) consisting of two positional strategies.
Lemma 2.3. Let A be a finite set, let φ : A ω → R be a payoff and let g : φ(A ω ) → R be a bounded non-decreasing function. Then any equilibrium w.r.t. φ is also an equilibrium w.r.t. g • φ.

Proof. Let (σ, τ) be an equilibrium w.r.t. φ, where σ is a Max's strategy and τ is a Min's strategy. Our goal is to show that (σ, τ) is also an equilibrium w.r.t. g • φ.
By definition, the values of σ and τ w.r.t. φ coincide. That is, for every node s, we have:

inf φ • lab(Cons(s, σ)) = sup φ • lab(Cons(s, τ)).

We have to derive from this that the values of σ and τ w.r.t. g • φ also coincide. That is, we have to show that:

inf g • φ • lab(Cons(s, σ)) = sup g • φ • lab(Cons(s, τ))   (2.1)

for every node s. The sets φ • lab(Cons(s, σ)) and φ • lab(Cons(s, τ)) have a common element φ • lab(P σ,τ s ). Since the infimum of the first set equals the supremum of the second set, their common element φ • lab(P σ,τ s ) must be the minimum of the first set and the maximum of the second set. Due to the fact that the function g is non-decreasing, we have that g(φ • lab(P σ,τ s )) is the minimum of g • φ • lab(Cons(s, σ)) and the maximum of g • φ • lab(Cons(s, τ)). This implies (2.1).
Corollary 2.4. If A is a finite set, φ : A ω → R is a positionally determined payoff and g : φ(A ω ) → R is a bounded non-decreasing function, then g • φ is a positionally determined payoff.
2.3. Continuous payoffs. For a finite set A, we consider the set A ω as a topological space. Namely, we take the discrete topology on A and the corresponding product topology on A ω . In this product topology, open sets are exactly the sets of the form

⋃_{u ∈ S} uA ω , where S ⊆ A * .

When we say that a payoff φ : A ω → R is continuous we always mean continuity with respect to this product topology (and with respect to the standard topology on R). The following proposition gives a convenient way to establish continuity of payoffs.
Proposition 2.5. Let A be a finite set. A payoff φ : A ω → R is continuous if and only if for any α ∈ A ω and for any infinite sequence {β n } ∞ n=1 of elements of A ω the following holds. If for all n ≥ 1 it holds that α and β n have the same prefixes of length n, then lim n→∞ φ(β n ) exists and equals φ(α).
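Proposition 2.5 can be illustrated numerically on a toy discounted payoff (our own hypothetical example; the names phi, alpha, beta are ours): as β n agrees with α on longer and longer prefixes, φ(β n ) approaches φ(α).

```python
# Sketch illustrating Proposition 2.5 on a toy discounted payoff (our own
# example): if beta_k agrees with alpha on its first k letters, then
# phi(beta_k) -> phi(alpha).

def phi(word, terms=60, lam=0.5, weights=None):
    """Truncated phi(a1 a2 ...) = sum_n lam^(n-1) * w(a_n); `word` maps
    positions 1, 2, ... to letters."""
    w = weights or {"a": 0.0, "b": 1.0}
    return sum(lam ** (n - 1) * w[word(n)] for n in range(1, terms + 1))

alpha = lambda n: "b"                         # alpha = b^omega

def beta(k):
    return lambda n: "b" if n <= k else "a"   # agrees with alpha on first k letters

gaps = [abs(phi(beta(k)) - phi(alpha)) for k in (1, 5, 10)]
print(gaps)  # the gap shrinks geometrically, like lam^k
```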
Proof. Assume first that φ is continuous. If α and β n have the same prefix of length n for every n ≥ 1, then β n converges to α in the product topology, so by continuity lim n→∞ φ(β n ) exists and equals φ(α).

Let us now establish the other direction of the proposition. It is enough to show that for any x, y ∈ R with x < y the set φ −1 ((x, y)) is open. Take any α ∈ φ −1 ((x, y)). Let us show that there exists n(α) such that all β ∈ A ω that coincide with α in the first n(α) elements belong to φ −1 ((x, y)). Indeed, otherwise for any n there exists β n , coinciding with α in the first n elements, such that β n ∉ φ −1 ((x, y)). Now, the limit lim n→∞ φ(β n ) must exist and must be equal to φ(α). But φ(α) ∈ (x, y) and all φ(β n ) are not in this interval, a contradiction. Now, for α ∈ φ −1 ((x, y)) let u α ∈ A n(α) be the n(α)-length prefix of α. Observe that

φ −1 ((x, y)) = ⋃_{α ∈ φ −1 ((x, y))} u α A ω .

So the set φ −1 ((x, y)) is open, as required.
For a finite set A, the space A ω is compact by Tychonoff's theorem. This has the following consequence which is important for this paper: if φ : A ω → R is a continuous payoff, then φ(A ω ) is a compact subset of R.

2.4. MDPs. This subsection concerns stochastic games, but we deal with them only in Section 10. So, for the rest of our results, one can skip this subsection.
In fact, we will need only one-player stochastic games, also known as Markov Decision Processes (MDPs). We will follow a formalization of Gimbert [Gim07]. We use the following notation. Let A be a finite set. By S bor A we mean the Borel σ-algebra on A ω (generated by the product topology from the previous subsection). By ∆(S) we denote the set of all probability distributions over a finite set S.

Definition 2.6. Let A be a finite set. An A-labeled MDP is a tuple M = ⟨S, Act, lab⟩, where S is a finite set of states, Act ⊆ S × ∆(S) is a finite set of actions such that for every s ∈ S there exists P ∈ ∆(S) such that (s, P ) ∈ Act, and lab : Act × S → A is a labeling function.
Given an A-labeled MDP M = ⟨S, Act, lab⟩, we imagine that there is a single player called Max traveling over the states of M. When Max is in a state s ∈ S, he considers the set of all actions of M whose first coordinate is s (by definition, this set is non-empty for every s ∈ S). He chooses one such action (s, P ). Then Max samples his next location according to P . This continues for infinitely many turns.
For e = (s, P ) ∈ Act, we define source(e) = s and Dist[e] = P . The set T = Act × S, which is the domain of the function lab, is called the set of transitions of M. Informally, transitions describe what happens in one turn. Namely, a transition (e, s) ∈ Act × S means that in the beginning of a turn, Max was in the state source(e), then he took the action e, and this led him to the state s.
Consistent sequences of transitions are called histories. Namely, a non-empty sequence h = (e 1 , s 1 )(e 2 , s 2 )(e 3 , s 3 ) . . . ∈ T + ∪ T ω is called a history if for every 2 ≤ i ≤ |h|, we have s i−1 = source(e i ). We set source(h) = source(e 1 ) and, if h is finite, target(h) = s |h| . We also map each state s ∈ S to a 0-length history λ s with source(λ s ) = target(λ s ) = s. These histories correspond to |S| possible starting positions of Max.
A strategy σ of Max is a mapping, which to every finite history h assigns an action σ(h) ∈ Act such that target(h) = source(σ(h)). Informally, σ(h) is the action which, according to σ, Max takes after h.
Given s ∈ S, a strategy σ defines a function P σ s : T * → [0, 1]. Informally, P σ s (h) is the probability that we will see a history h if Max starts in s and plays according to σ. It can be defined inductively.
First, we set P σ s (empty word) = 1. The empty word here corresponds to the initial history λ s . Now, given (e, s 1 ) ∈ T , we set P σ s ((e, s 1 )) = 0 if e ≠ σ(λ s ) and P σ s ((e, s 1 )) = Dist[e](s 1 ) if e = σ(λ s ). That is, if e is not the action played by Max according to σ in the starting position, then the probability of the transition (e, s 1 ) is 0. Otherwise, the probability of (e, s 1 ) is the probability that the action e = σ(λ s ) brings us to s 1 . In general, once P σ s (h) is defined for a finite history h, we set, for (e, t) ∈ T , P σ s (h(e, t)) = P σ s (h) · Dist[e](t) if e = σ(h), and P σ s (h(e, t)) = 0 otherwise.
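The inductive definition of P σ s can be sketched as follows (our own encoding, not the paper's; for simplicity the strategy is positional, so it depends only on the current state).

```python
# Sketch (our own encoding) of the inductive definition of P^sigma_s.  An
# action is (state, dist) with dist a dict of next-state probabilities; a
# transition is (action, next_state).  For simplicity sigma is positional.

def history_probability(s, sigma, transitions):
    """P^sigma_s(h) for a finite sequence h of transitions."""
    p, cur = 1.0, s
    for (e, t) in transitions:
        if e != sigma[cur]:    # sigma does not play e here: probability 0
            return 0.0
        p *= e[1].get(t, 0.0)  # Dist[e](t): chance that action e leads to t
        cur = t
    return p

# Toy MDP with two states.
a0 = (0, {0: 0.5, 1: 0.5})
a1 = (1, {0: 1.0})
sigma = {0: a0, 1: a1}

p = history_probability(0, sigma, [(a0, 1), (a1, 0), (a0, 0)])
print(p)  # 0.5 * 1.0 * 0.5 = 0.25
```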
Any sequence of transitions h ∈ T * ∪ T ω can be mapped to a word lab(h) over the set of labels by setting:

lab((e 1 , s 1 )(e 2 , s 2 )(e 3 , s 3 ) . . .) = lab((e 1 , s 1 )) lab((e 2 , s 2 )) lab((e 3 , s 3 )) . . .

Now, fix a payoff function φ : A ω → R. It maps any infinite history H ∈ T ω to its reward, defined as φ(lab(H)). Max wants a strategy which maximizes the expected value of the reward. That is, Max wants to attain

sup E[φ(lab(H))], where H is the random infinite history generated by σ from s,   (2.2)

over his strategies σ, for all s ∈ S. This expectation is well defined if φ • lab is bounded and measurable with respect to S bor T . Since lab : T ω → A ω is continuous, it is well-defined if φ : A ω → R is bounded and measurable with respect to S bor A . For brevity, we will abbreviate the expectation in (2.2) by Val[σ](s).

Definition 2.7. Let A be a finite set and φ : A ω → R be a bounded measurable payoff.
We say that a strategy σ in an A-labeled MDP M is optimal if for any strategy σ ′ and for any state s of M we have Val[σ](s) ≥ Val[σ ′ ](s). We say that a strategy σ in an A-labeled MDP M is positional if for any two finite histories h 1 and h 2 in M we have target(h 1 ) = target(h 2 ) =⇒ σ(h 1 ) = σ(h 2 ).
We say that φ is positionally determined in MDPs if every A-labeled MDP has an optimal positional strategy w.r.t. φ.
In the paper, we will use this definition only for continuous φ; such payoffs are all, of course, bounded (by compactness of A ω ) and measurable.

Statement of the Main Result and its "Only If" Part
Our main result establishes a simple property which is equivalent to positional determinacy for continuous payoffs.
Definition 3.1. Let A be a finite set. A payoff φ : A ω → R is called prefix-monotone if there are no two infinite words α, β ∈ A ω and no two finite words x, y ∈ A * such that φ(xα) > φ(xβ) and φ(yα) < φ(yβ).

(One can note that prefix-independence trivially implies prefix-monotonicity. On the other hand, no prefix-independent payoff is continuous, unless it takes just 1 value.)

Theorem 3.2. Let A be a finite set and φ : A ω → R be a continuous payoff. Then φ is positionally determined if and only if φ is prefix-monotone.
The fact that any continuous positionally determined payoff must be prefix-monotone is proved below in this section. Three different proofs of the "if" part of Theorem 3.2 are given in, respectively, Sections 4, 5 and 7. As an illustration of our result, we first give a formal definition of multi-discounted payoffs and show that they are continuous and prefix-monotone.

Definition 3.3. Let A be a finite set. A payoff φ : A ω → R is called multi-discounted if there exist a discount function λ : A → [0, 1) and a weight function w : A → R such that

φ(a 1 a 2 a 3 . . .) = Σ ∞ n=1 λ(a 1 ) · . . . · λ(a n−1 ) · w(a n )   (3.1)

for all a 1 a 2 a 3 . . . ∈ A ω .
Proposition 3.4. All multi-discounted payoffs are continuous and prefix-monotone.
Proof. Let A be a finite set and φ : A ω → R be a multi-discounted payoff, defined by λ : A → [0, 1) and w : A → R. Take any W > 0 such that λ(a) < 1 − 1/W and |w(a)| < W for every a ∈ A.
Let us first show that φ is continuous. Take any α, β ∈ A ω that coincide in the first n elements. It is sufficient to bound the difference |φ(α) − φ(β)| by some quantity which depends only on n and tends to 0 as n → ∞. First, observe that the absolute value of φ never exceeds W · 1/(1 − (1 − 1/W)) = W 2 . Now, let u = a 1 a 2 . . . a n ∈ A n be the first n letters of α and β. Then α = uα ′ , β = uβ ′ for some α ′ , β ′ ∈ A ω . It is not hard to derive from (3.1) that:

φ(uα ′ ) − φ(uβ ′ ) = λ(a 1 ) · . . . · λ(a n ) · (φ(α ′ ) − φ(β ′ )).   (3.2)

This means that the difference |φ(α) − φ(β)| is bounded by (1 − 1/W) n · 2W 2 . This quantity tends to 0 as n → ∞. Hence, φ is continuous.

Equation (3.2) also implies that φ is prefix-monotone. Indeed, it gives that for any u ∈ A * and β, γ ∈ A ω there exists λ ≥ 0 such that φ(uβ) − φ(uγ) = λ · (φ(β) − φ(γ)). This equality gives us that:

φ(β) ≥ φ(γ) =⇒ φ(uβ) ≥ φ(uγ) for every u ∈ A * .

Hence, there are no u, v ∈ A * and β, γ ∈ A ω such that φ(uβ) > φ(uγ) and φ(vβ) < φ(vγ).

Proof of the "only if" part of Theorem 3.2. Assume that φ is not prefix-monotone. Then for some u, v ∈ A * and α, β ∈ A ω we have

φ(uα) > φ(uβ) and φ(vα) < φ(vβ).   (3.3)

First, notice that by the continuity of φ we may assume that α and β are ultimately periodic. Indeed, take any a ∈ A and for every n ∈ N, define α n , β n ∈ A ω as the words obtained from α and β, respectively, by replacing everything after the first n letters with a ω . By continuity of φ, we have:

lim n→∞ φ(uα n ) = φ(uα), lim n→∞ φ(uβ n ) = φ(uβ), lim n→∞ φ(vα n ) = φ(vα), lim n→∞ φ(vβ n ) = φ(vβ).

These equations imply that if u, v, α, β violate prefix-monotonicity, then so do u, v, α n , β n for some n ∈ N (and α n , β n are ultimately periodic for every n). Thus, we assume from now on that α, β are ultimately periodic. Then α = p(q) ω and β = w(r) ω for some p, q, w, r ∈ A * . Consider an A-labeled game graph from Figure 1. There are two positional strategies of Max in this game graph, one which goes along p from c, and the other which goes along w from c. The first one is not optimal when the game starts in b, and the second one is not optimal when the game starts in a (because of (3.3)). So φ is not positionally determined in this game graph.
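On ultimately periodic words, (3.1) can be evaluated in closed form, since consecutive periods of the cycle are discounted by the same fixed factor. A sketch with hypothetical discounts and weights:

```python
# Sketch (our notation): evaluating the multi-discounted payoff (3.1) exactly
# on an ultimately periodic word u v^omega, using that each period of v
# contributes the same discounted amount, scaled geometrically.

def phi_periodic(u, v, lam, w):
    """phi(u v^omega) for discounts lam: A -> [0,1) and weights w: A -> R."""
    def finite_part(word):
        # (sum of discounted weights over `word`, product of its discounts)
        s, prod = 0.0, 1.0
        for a in word:
            s += prod * w[a]
            prod *= lam[a]
        return s, prod
    su, pu = finite_part(u)
    sv, pv = finite_part(v)
    return su + pu * sv / (1.0 - pv)   # geometric series over the periods

lam = {"a": 0.5, "b": 0.5}
w = {"a": 0.0, "b": 1.0}

# b^omega: 1 + 0.5 + 0.25 + ... = 2
print(phi_periodic("", "b", lam, w))   # 2.0
# Prefixing by "a" rescales the tail by lam("a"), as in equation (3.2).
print(phi_periodic("a", "b", lam, w))  # 0 + 0.5 * 2 = 1.0
```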
Remark 3.5. In this argument, it is crucial that our definition of positional determinacy is "uniform". That is, we require that some positional strategy is optimal for all the nodes. Allowing each starting node to have its own optimal positional strategy gives us a weaker, "non-uniform" version of positional determinacy. It is not clear whether non-uniform positional determinacy implies prefix-monotonicity for continuous payoffs. At the same time, we are not even aware of a payoff which is positional in the non-uniform sense, but not in the uniform sense.

Inductive Argument
In this section, we show that any continuous prefix-monotone payoff is positionally determined, using the following sufficient condition due to Gimbert and Zielonka.

Proposition 4.1 [GZ04]. Let A be a finite set and let φ : A ω → R be a payoff satisfying the following three conditions:
(a) for all a ∈ A and β, γ ∈ A ω , if φ(β) ≤ φ(γ), then φ(aβ) ≤ φ(aγ);
(b) for all u ∈ A + and α ∈ A ω , min{φ(u ω ), φ(α)} ≤ φ(uα) ≤ max{φ(u ω ), φ(α)};
(c) for every sequence x 0 , x 1 , x 2 , . . . of non-empty finite words, min{φ(x 0 x 2 x 4 . . .), φ(x 1 x 3 x 5 . . .)} ≤ φ(x 0 x 1 x 2 . . .) ≤ max{φ(x 0 x 2 x 4 . . .), φ(x 1 x 3 x 5 . . .)}.
Then φ is positionally determined.

We observe that in the case of continuous payoffs, one can get rid of the conditions (b) and (c) in this proposition. A weaker version of this statement was proved in the on-line version of [GZ04]. Namely, it was shown there that one can get rid of the condition (c) for continuous payoffs.

Lemma 4.2. Let A be a finite set. Any continuous payoff φ : A ω → R satisfying the condition (a) of Proposition 4.1 also satisfies the conditions (b) and (c).

Proof. Take any finite set A and any continuous payoff φ : A ω → R satisfying the condition (a) of Proposition 4.1. We first show that φ satisfies the condition (b) of this proposition. We will only show that φ(uα) ≤ max{φ(u ω ), φ(α)}; the other inequality from this condition can be proved similarly. If φ(uα) ≤ φ(α), then we are done. Assume now that φ(uα) > φ(α). By repeatedly applying (a), we obtain φ(u i+1 α) ≥ φ(u i α) for every i ∈ N. In particular, for every i ≥ 1 we get that φ(u i α) ≥ φ(uα). By continuity of φ, we have that lim i→∞ φ(u i α) = φ(u ω ). Hence, φ(u ω ) ≥ φ(uα).

Now we show that φ satisfies the condition (c) of Proposition 4.1. We will only show that φ(x 0 x 1 x 2 . . .) ≤ max{φ(x 0 x 2 x 4 . . .), φ(x 1 x 3 x 5 . . .)}; the other inequality from this condition has the same proof. Namely, we will show that if φ(x 0 x 1 x 2 . . .) > φ(x 1 x 3 x 5 . . .), then φ(x 0 x 2 x 4 . . .) ≥ φ(x 0 x 1 x 2 . . .). Note that this claim is stronger than we need.
First, we show that φ(x n x n+1 x n+2 . . .) ≤ φ(x n+1 x n+2 x n+3 . . .) for every n ≥ 0. This can be easily proved by induction on n. Let us start with the induction base. By the condition (b), which is already established for φ, we have φ(x 0 x 1 x 2 . . .) ≤ max{φ((x 0 ) ω ), φ(x 1 x 2 x 3 . . .)}.

Let us now perform the induction step. Assume that it is already proved that φ(x n x n+1 x n+2 . . .) ≤ φ(x n+1 x n+2 x n+3 . . .). Then, by the same argument as in the induction base, we get φ(x n+1 x n+2 x n+3 . . .) ≤ φ(x n+2 x n+3 x n+4 . . .).

We will now prove that φ(x 0 x 2 . . . x 2n x 2n+1 x 2n+2 x 2n+3 . . .) ≥ φ(x 0 x 1 x 2 . . .) for every n ≥ 0. For n = 0, the left-hand side and the right-hand side coincide. Then we show that φ(x 0 x 2 . . . x 2n x 2n+1 x 2n+2 . . .) ≤ φ(x 0 x 2 . . . x 2n x 2n+2 x 2n+3 . . .). Indeed, we have φ(x 2n+1 x 2n+2 x 2n+3 . . .) ≤ φ(x 2n+2 x 2n+3 x 2n+4 . . .). It remains to apply (a) by appending x 0 x 2 . . . x 2n to both sides.
Thus, we have established that φ(x 0 x 2 . . . x 2n x 2n+1 x 2n+2 . . .) ≥ φ(x 0 x 1 x 2 . . .) for every n ≥ 0. By continuity of φ, the left-hand side of this inequality converges to φ(x 0 x 2 x 4 . . .) as n → ∞. Hence, we get that φ(x 0 x 2 x 4 . . .) ≥ φ(x 0 x 1 x 2 . . .), as required.

Thus, to establish that some continuous payoff is positionally determined, it is enough to demonstrate that this payoff satisfies the condition (a) of Proposition 4.1. Let us now reformulate this condition using the following definition.

Definition 4.3. A payoff φ : A ω → R is called shift-deterministic if for every a ∈ A and β, γ ∈ A ω we have φ(β) = φ(γ) =⇒ φ(aβ) = φ(aγ).

Claim 4.4. A payoff φ : A ω → R satisfies the condition (a) of Proposition 4.1 if and only if φ is prefix-monotone and shift-deterministic.

Proof. Assume first that φ satisfies the condition (a) of Proposition 4.1. It is shift-deterministic, because φ(β) ≤ φ(γ) and φ(γ) ≤ φ(β) give φ(aβ) ≤ φ(aγ) and φ(aγ) ≤ φ(aβ), for every a ∈ A, β, γ ∈ A ω . In turn, assume for contradiction that φ is not prefix-monotone. Then φ(uβ) > φ(uγ) and φ(vβ) < φ(vγ) for some u, v ∈ A * and β, γ ∈ A ω . Due to the contraposition to the condition (a) of Proposition 4.1 (applied letter by letter), the first inequality gives φ(β) > φ(γ), while the second one gives φ(β) < φ(γ), a contradiction.

Now, assume that φ is prefix-monotone and shift-deterministic. Take any u ∈ A * and β, γ ∈ A ω with φ(β) ≤ φ(γ); we show that φ(uβ) ≤ φ(uγ). If φ(β) = φ(γ), then φ(uβ) = φ(uγ) by shift-determinism, applied letter by letter. If φ(β) < φ(γ), then φ(uβ) ≤ φ(uγ), since otherwise u and the empty word would witness a violation of prefix-monotonicity. In particular, taking u of length 1, we get the condition (a).

The above discussion gives the following sufficient condition for positional determinacy.
Proposition 4.5. Let A be a finite set. Any continuous prefix-monotone shift-deterministic payoff φ : A ω → R is positionally determined.
Still, some argument is needed for continuous prefix-monotone payoffs that are not shift-deterministic. To tie up loose ends, we prove the following: Proposition 4.6. Let A be a finite set and let φ : A ω → R be a continuous prefix-monotone payoff. Then φ = g • ψ for some continuous prefix-monotone shift-deterministic payoff ψ : A ω → R and for some continuous 3 non-decreasing function g : ψ(A ω ) → R (note that since g is defined on a compact and is continuous, it is also bounded).
Due to Corollary 2.4, this proposition means that all continuous prefix-monotone payoffs are positionally determined. In fact, we do not need continuity of g here, but it will be useful later. Thus, once we establish Proposition 4.6, our first proof of Theorem 3.2 will be finished.
Proof of Proposition 4.6. Define a payoff ψ : A ω → R as follows:

ψ(α) = Σ w∈A * φ(wα)/(2|A|) |w| .   (4.1)

First, why is ψ well-defined, i.e., why does this series converge?
Since φ is bounded, we have |φ(β)| ≤ W for all β ∈ A ω , for some W > 0, which means that (4.1) is bounded by the following absolutely converging series:

Σ ∞ n=0 Σ w∈A n W/(2|A|) n = Σ ∞ n=0 W/2 n = 2W.

We shall show that ψ is continuous, prefix-monotone and shift-deterministic, and that φ = g • ψ for some continuous non-decreasing g : ψ(A ω ) → R.

Why is ψ continuous? We will use Proposition 2.5. Consider any α ∈ A ω and any infinite sequence {β n } n∈N of elements of A ω such that for all n, the words α and β n have the same prefix of length n. We have to show that ψ(β n ) converges to ψ(α) as n → ∞. By definition:

ψ(β n ) = Σ w∈A * φ(wβ n )/(2|A|) |w| .

For m ∈ N, define:

S m n = Σ w∈A * , |w|≤m φ(wβ n )/(2|A|) |w| .

By continuity of φ, we have for every m ∈ N that:

lim n→∞ S m n = Σ w∈A * , |w|≤m φ(wα)/(2|A|) |w|

(the sum is finite, which means that we can interchange it with the limit). On the other hand, we can bound the difference between ψ(β n ) and S m n as follows:

|ψ(β n ) − S m n | ≤ Σ k>m W/2 k = W/2 m ,

and the same bound holds for ψ(α) and the corresponding truncated sum for α. Thus, for every m we obtain:

lim sup n→∞ |ψ(β n ) − ψ(α)| ≤ 2W/2 m .

Since m can be arbitrarily large, we obtain lim sup n→∞ |ψ(β n ) − ψ(α)| = 0, as required.

Why is ψ prefix-monotone? Take any β, γ ∈ A ω . We have to show that either ψ(uβ) ≥ ψ(uγ) for all u ∈ A * or ψ(uβ) ≤ ψ(uγ) for all u ∈ A * . By prefix-monotonicity of φ, either φ(wβ) ≥ φ(wγ) for all w ∈ A * , or φ(wβ) ≤ φ(wγ) for all w ∈ A * . In the first case, for every u ∈ A * we have ψ(uβ) − ψ(uγ) = Σ w∈A * (φ(wuβ) − φ(wuγ))/(2|A|) |w| ≥ 0; the second case is symmetric.
Why is ψ shift-deterministic? Take any a ∈ A and β, γ ∈ A ω with ψ(β) = ψ(γ). We have to show that ψ(aβ) = ψ(aγ). Indeed, assume that ψ(β) = ψ(γ). Then

0 = ψ(β) − ψ(γ) = Σ w∈A * (φ(wβ) − φ(wγ))/(2|A|) |w| .

If this series contains a non-zero term, then it must contain a positive term and a negative term. But this contradicts prefix-monotonicity of φ. So all the terms in this series must be 0. That is, we have φ(wβ) − φ(wγ) = 0 for every w ∈ A * . Therefore, ψ(aβ) − ψ(aγ) = Σ w∈A * (φ(waβ) − φ(waγ))/(2|A|) |w| = 0, since wa ∈ A * for every w ∈ A * .

Why is there a non-decreasing g : ψ(A ω ) → R with φ = g • ψ? It is enough to show that φ(α) > φ(β) implies ψ(α) > ψ(β) for all α, β ∈ A ω (so that φ(α) is determined by ψ(α), and the resulting g is non-decreasing). Indeed, if φ(α) > φ(β), then we also have φ(wα) ≥ φ(wβ) for every w ∈ A * , by prefix-monotonicity of φ. Now, by definition,

ψ(α) − ψ(β) = Σ w∈A * (φ(wα) − φ(wβ))/(2|A|) |w| .

All the terms in this series are non-negative, and the term corresponding to the empty w is strictly positive. So we have ψ(α) > ψ(β), as required.
Finally, we show that any g : ψ(A ω ) → R such that φ = g • ψ must be continuous. For that, we show that |g(x) − g(y)| ≤ |x − y| for all x, y ∈ ψ(A ω ). Take any α, β ∈ A ω with x = ψ(α) and y = ψ(β). By prefix-monotonicity of φ we have that either φ(wα) ≥ φ(wβ) for all w ∈ A * or φ(wα) ≤ φ(wβ) for all w ∈ A * . Up to swapping x and y, we may assume that the first option holds. Then

x − y = ψ(α) − ψ(β) = Σ w∈A * (φ(wα) − φ(wβ))/(2|A|) |w| .

On the left here we have x − y, and on the right all the terms are non-negative, with the term corresponding to the empty w being φ(α) − φ(β) = g(x) − g(y). Hence 0 ≤ g(x) − g(y) ≤ x − y, as required.
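The series defining ψ can be approximated by truncating at prefixes of length at most m. The sketch below is our own illustration; it assumes the weighting (2|A|)^(−|w|) with weight 1 on the empty word, and any summable positive weighting of this kind behaves the same way.

```python
# Truncating the series for psi at prefixes of length <= m (our own
# illustration; the weighting (2|A|)^(-|w|) is an assumption of this sketch).
from itertools import product

def psi_truncated(alphabet, phi_with_prefix, m):
    """Sum of phi(w alpha) / (2|A|)^|w| over all w with |w| <= m, where
    phi_with_prefix(w) returns phi(w alpha) for one fixed infinite alpha."""
    total = 0.0
    for n in range(m + 1):
        coeff = (2 * len(alphabet)) ** (-n)
        for w in product(alphabet, repeat=n):
            total += coeff * phi_with_prefix("".join(w))
    return total

# Toy phi: a single-discount payoff; alpha = b^omega, so phi(b^omega) = 2.
lam, weight = 0.5, {"a": 0.0, "b": 1.0}
def phi_w_then_b_omega(w):
    s, prod = 0.0, 1.0
    for a in w:
        s += prod * weight[a]
        prod *= lam
    return s + prod * 2.0

approx = psi_truncated("ab", phi_w_then_b_omega, 8)
print(approx)  # approaches the value of the full series as m grows
```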

Fixed point argument
Here we present a way of establishing positional determinacy of continuous prefix-monotone shift-deterministic payoffs (Proposition 4.5) via a fixed point argument. Together with Proposition 4.6, this constitutes our second proof of Theorem 3.2.
Obviously, for any shift-deterministic payoff φ : A ω → R and for any a ∈ A there exists a unique function s[a, φ] : φ(A ω ) → φ(A ω ) such that φ(aβ) = s[a, φ](φ(β)) for every β ∈ A ω .

Remark 5.1. Sometimes, when φ is clear from the context, we will simply write s[a] instead of s[a, φ].
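For a multi-discounted payoff, the function s[a, φ] can be written down explicitly: by (3.1), φ(aβ) = w(a) + λ(a) · φ(β), so s[a] is the affine map x ↦ w(a) + λ(a) · x. A sketch with hypothetical discounts and weights, which also evaluates φ(v ω ) as the fixed point of s[v 1 ] • . . . • s[v k ]:

```python
# For a multi-discounted payoff, s[a] is affine: s[a](x) = w(a) + lam(a) * x.
# The discounts and weights below are hypothetical.

lam = {"a": 0.5, "b": 0.25}
w = {"a": 1.0, "b": 0.0}

def s(a):
    return lambda x: w[a] + lam[a] * x

def phi_of_cycle(v, iters=200):
    """phi(v^omega), computed as the fixed point of s[v_1] o ... o s[v_k]."""
    x = 0.0
    for _ in range(iters):
        for a in reversed(v):   # the innermost map s[v_k] is applied first
            x = s(a)(x)
    return x

x1 = phi_of_cycle("ab")          # phi((ab)^omega)
x2 = s("a")(phi_of_cycle("ba"))  # phi(a (ba)^omega): the same infinite word
print(x1, x2)  # both equal 8/7 = 1.1428...
```

The agreement of x1 and x2 is exactly the defining identity φ(aβ) = s[a](φ(β)).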
Claim 5.2. Let A be a finite set and φ : A ω → R be a shift-deterministic payoff. Then φ is prefix-monotone if and only if the function s[a, φ] is non-decreasing for every a ∈ A.

Proof. A statement that s[a, φ] is non-decreasing for every a ∈ A is equivalent to the condition (a) of Proposition 4.1. In turn, by Claim 4.4, this condition is equivalent to a statement that φ is prefix-monotone and shift-deterministic.
We use this notation to introduce so-called Bellman's equations, playing a key role in our fixed point argument.
Definition 5.3. Let A be a finite set, φ : A ω → R be a shift-deterministic payoff and G = ⟨V, V Max , V Min , E⟩ be an A-labeled game graph.
The following equations in x ∈ φ(A ω ) V are called Bellman's equations for φ in G:

x v = max{s[lab(e), φ](x target(e) ) | e ∈ E, source(e) = v} for v ∈ V Max ,   (5.1)
x v = min{s[lab(e), φ](x target(e) ) | e ∈ E, source(e) = v} for v ∈ V Min .   (5.2)

The most important step of our argument is to show the existence of a solution to Bellman's equations.
Proposition 5.4. For any finite set A, for any continuous prefix-monotone shift-deterministic payoff φ : A ω → R and for any A-labeled game graph G there exists a solution to Bellman's equations for φ in G.
This proposition requires some additional work. We first discuss why it implies that all continuous prefix-monotone shift-deterministic payoffs are positionally determined. Assume that we are given a solution x to (5.1-5.2). How can one extract an equilibrium of positional strategies from it? For that, we take any pair of positional strategies that use only x-tight edges. Here an edge e is called x-tight if x_{source(e)} = s[lab(e), φ](x_{target(e)}). Note that each node must have an out-going x-tight edge (this will be any edge on which the maximum/minimum in (5.1-5.2) is attained for this node). So clearly each player has at least one positional strategy which only uses x-tight edges. It remains to show that for continuous prefix-monotone shift-deterministic φ, any two such strategies of the players form an equilibrium.
Lemma 5.5. Let φ be a continuous prefix-monotone shift-deterministic payoff, let x* be a solution to Bellman's equations for φ in G, and let σ* and τ* be positional strategies of Max and Min, respectively, that use only x*-tight edges. Then (σ*, τ*) is an equilibrium.
Proof. For brevity, we will omit φ in the notation s[a, φ]. We will also write s[a1 a2 . . . an] for the composition s[a1] ∘ s[a2] ∘ · · · ∘ s[an], for n ∈ N and a1 a2 . . . an ∈ A^n. In particular, s[empty string] will denote the identity function.
It is enough to show that
• (a) for any v ∈ V and for any P ∈ Cons(v, σ*) we have φ ∘ lab(P) ≥ x*_v;
• (b) for any v ∈ V and for any P ∈ Cons(v, τ*) we have φ ∘ lab(P) ≤ x*_v.
Indeed, (a) and (b) give Val[σ*](v) ≥ x*_v ≥ Val[τ*](v) for every v ∈ V, and this by definition means that (σ*, τ*) is an equilibrium.
Let us prove (a); the argument for (b) is symmetric. Fix v ∈ V and P = e1 e2 e3 . . . ∈ Cons(v, σ*). Let v0 = v, let v_n = target(e_n), and set T_n = s[lab(e1 e2 . . . en)](x*_{v_n}). Since φ is continuous and shift-deterministic, we have lim_{n→∞} T_n = φ ∘ lab(P), while T_0 = x*_v. So (a) is equivalent to the statement that lim_{n→∞} T_n ≥ T_0. To show this statement, we demonstrate that T_{n+1} ≥ T_n for every n. Indeed, assume first that v_n ∈ V_Max. Then, since P is consistent with σ*, we have e_{n+1} = σ*(v_n). In particular, e_{n+1} is x*-tight, by the conditions of the lemma. This gives us that s[lab(e_{n+1})](x*_{v_{n+1}}) = x*_{v_n}. After applying the function s[lab(e1 e2 . . . en)] to this equality, we obtain T_{n+1} = T_n. Now assume that v_n ∈ V_Min. Then, by (5.2), we have s[lab(e_{n+1})](x*_{v_{n+1}}) ≥ x*_{v_n}. The function s[lab(e1 e2 . . . en)] is composed of non-decreasing functions due to Claim 5.2. Hence, after applying this function to the left-hand and the right-hand sides of the inequality s[lab(e_{n+1})](x*_{v_{n+1}}) ≥ x*_{v_n}, we obtain T_{n+1} ≥ T_n.
We now proceed to details of our proof of Proposition 5.4. Consider a function T : φ(A^ω)^V → φ(A^ω)^V, mapping x ∈ φ(A^ω)^V to the vector of the right-hand sides of (5.1-5.2). We should argue that T has a fixed point. For that, we will construct a continuous metric D : φ(A^ω)^V × φ(A^ω)^V → [0, +∞) with respect to which T is contracting. More precisely, D(Tx, Ty) will always be smaller than D(x, y) as long as x and y are distinct. Due to the compactness of the domain of T, this will prove that T has a fixed point. Now, to construct such D, we show that for continuous shift-deterministic φ there must be a continuous metric d : φ(A^ω) × φ(A^ω) → [0, +∞) such that all functions s[a, φ], a ∈ A are d-contracting. Once we have such d, we let D(x, y) be the maximum of d(x_v, y_v) over v ∈ V. Checking that T is contracting with respect to such D will be rather straightforward. The main technical challenge is to prove the existence of d. We do so via the following general fact about compositions of continuous functions.
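For the special case of a multi-discounted payoff, where s[a](x) = w(a) + λ(a)·x, the operator T described above is already contracting in the usual metric, and its fixed point can be found by plain iteration. The following sketch (with made-up graph data, rewards and discounts) illustrates this:

```python
# Sketch (made-up data): fixed-point iteration for the operator T in the
# special case of a multi-discounted payoff, where s[a](x) = w(a) + lam(a) * x.

edges = [                      # (source, target, label)
    ("u", "v", "a"), ("u", "u", "b"),
    ("v", "u", "c"), ("v", "v", "b"),
]
w = {"a": 1.0, "b": 0.0, "c": 2.0}      # rewards of the labels
lam = {"a": 0.5, "b": 0.9, "c": 0.5}    # discounts of the labels
v_max = {"u"}                           # nodes of Max; the rest belong to Min
nodes = {"u", "v"}

def s(label, x):
    """The shift function: phi(a . beta) = w(a) + lam(a) * phi(beta)."""
    return w[label] + lam[label] * x

def bellman(x):
    """One application of T: max over out-edges at Max nodes, min at Min nodes."""
    y = {}
    for n in nodes:
        vals = [s(l, x[t]) for (src, t, l) in edges if src == n]
        y[n] = max(vals) if n in v_max else min(vals)
    return y

x = {n: 0.0 for n in nodes}
for _ in range(200):       # contraction: the iterates converge geometrically
    x = bellman(x)
print(x)                   # fixed point: x_u = 1.0, x_v = 0.0
```

Once the fixed point is found, an equilibrium of positional strategies is obtained by picking, at every node, an edge that is tight for it, exactly as described above.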
Theorem 5.6. Let K ⊆ R be a compact set, m ≥ 1 be a natural number and f1, . . . , fm : K → K be m continuous functions. Then the following two conditions are equivalent:
• (a) for every a1 a2 a3 . . . ∈ {1, 2, . . . , m}^ω we have lim_{n→∞} diam f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(K) = 0;
• (b) there exists a continuous metric d : K × K → [0, +∞) such that f1, . . . , fm are all d-contracting (that is, d(f_i(x), f_i(y)) < d(x, y) for all distinct x, y ∈ K and all i).
If f1, . . . , fm are non-decreasing, then these two conditions are equivalent to the following condition:
• (c) there exists a continuous metric d : K × K → [0, +∞) such that, first, f1, . . . , fm are all d-contracting, and second, for all x, s, t, y ∈ K we have x ≤ s ≤ t ≤ y =⇒ d(s, t) ≤ d(x, y).
We postpone the proof of this result to the end of this section.
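For intuition, here is a small numeric illustration (with made-up affine functions) of the condition (a): for non-decreasing contractions on K = [0, 1], the image of K under f[a1 . . . an] is the interval between the images of the endpoints 0 and 1, and its diameter shrinks geometrically:

```python
# Illustration (made-up functions): condition (a) of Theorem 5.6 for three
# non-decreasing affine contractions on K = [0, 1]. For monotone f_i the image
# of an interval is the interval between the images of its endpoints.

fs = {
    1: lambda x: 0.5 * x,            # f1
    2: lambda x: 0.25 * x + 0.5,     # f2
    3: lambda x: 0.5 * x + 0.5,      # f3
}

def image_diam(word):
    """Diameter of f[a1] o f[a2] o ... o f[an] (K) for K = [0, 1]."""
    lo, hi = 0.0, 1.0
    for a in reversed(word):         # innermost function is applied first
        lo, hi = fs[a](lo), fs[a](hi)
    return hi - lo

diams = [image_diam([1, 2, 3] * k) for k in range(5)]
print(diams)                         # shrinks geometrically towards 0
```

For these functions the ordinary metric d(x, y) = |x − y| already witnesses the conditions (b) and (c); the point of Theorem 5.6 is that some continuous metric with these properties exists whenever (a) holds.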
To derive Proposition 5.4 from this theorem, we first show that it is applicable to functions s[a, φ], a ∈ A for continuous shift-deterministic φ.
Proposition 5.7. Let A be a finite set and φ : A ω → R be a continuous shift-deterministic payoff. Then the functions s[a, φ], a ∈ A are continuous and satisfy the condition (a) of Theorem 5.6 for K = φ(A ω ).
Proof. We use the same abbreviations with respect to the notation s[a, φ] as in the proof of Lemma 5.5.
Let us first demonstrate that s[a] is continuous for every a ∈ A. Consider any x ∈ φ(A ω ) and any infinite sequence {x n ∈ φ(A ω )} n∈N such that lim n→∞ x n = x. We shall show that lim n→∞ s[a](x n ) = s[a](x). It is enough to show that s[a](x) is the only limit point of the sequence {s[a](x n )} n∈N . In other words, w.l.o.g we may assume that the limit lim n→∞ s[a](x n ) exists, and our goal is to show that it equals s[a](x).
Let β n ∈ A ω be such that x n = φ(β n ). Due to the compactness of A ω , there exists β ∈ A ω such that any open set S ⊆ A ω , containing β, also contains β n for infinitely many n. For every k ∈ N there exists n k ≥ k such that the first k letters of β n k and β coincide. Indeed, consider a word u ∈ A k , consisting of the first k letters of β. An open set S = uA ω contains β. Hence, there are infinitely many n such that β n ∈ uA ω , or, equivalently, such that β n starts with u. In particular, there exists such n which is at least as large as k.
Due to continuity of φ, we have that lim_{k→∞} φ(β_{n_k}) = φ(β). On the other hand, lim_{k→∞} φ(β_{n_k}) = lim_{k→∞} x_{n_k} = x. Hence, φ(β) = x. Using the continuity of φ again, we get lim_{k→∞} φ(aβ_{n_k}) = φ(aβ), that is, lim_{k→∞} s[a](x_{n_k}) = s[a](x). Since the limit lim_{n→∞} s[a](x_n) exists, it must equal s[a](x), as required. It remains to verify the condition (a) of Theorem 5.6 for K = φ(A^ω). Observe that s[a1] ∘ s[a2] ∘ · · · ∘ s[an](φ(A^ω)) = φ(a1 a2 . . . an A^ω) for every n ∈ N and a1 a2 . . . an ∈ A^n. Thus, it is enough to establish that lim_{n→∞} diam φ(a1 a2 . . . an A^ω) = 0 for any a1 a2 a3 . . . ∈ A^ω. This is a simple consequence of the continuity of φ. Indeed, assume for contradiction that for some ε > 0 and some a1 a2 a3 . . . ∈ A^ω we have diam φ(a1 a2 . . . an A^ω) > ε for infinitely many n. Then for infinitely many n there exist β_n, γ_n ∈ a1 a2 . . . an A^ω with |φ(β_n) − φ(γ_n)| ≥ ε. At the same time, by continuity of φ, both φ(β_n) and φ(γ_n) must converge to φ(a1 a2 a3 . . .), a contradiction.
We finally derive Proposition 5.4 from Theorem 5.6 and Proposition 5.7. This will finish our second proof of the fact that all continuous prefix-monotone payoffs are positionally determined.
Proof of Proposition 5.4. We use the same abbreviations with respect to the notation s[a, φ] as in the proof of Lemma 5.5.
Proof of Proposition 5.4. We use the same abbreviations with respect to the notation s[a, φ] as in the proof of Lemma 5.5. Set K = φ(A^ω), and define a mapping T : K^V → K^V sending x ∈ K^V to the vector of the right-hand sides of (5.1-5.2). Recall that K is a compact set (because A^ω is compact and φ is continuous). It is enough to show that T has a fixed point. By Proposition 5.7, the functions s[a], a ∈ A are continuous (which means that T is also continuous) and satisfy the item (a) of Theorem 5.6. By Claim 5.2, the functions s[a], a ∈ A are non-decreasing. Hence, these functions satisfy the item (c) of Theorem 5.6. That is, there exists a continuous metric d : K × K → [0, +∞) such that, first, the function s[a] is d-contracting for every a ∈ A, and second, for every x, s, t, y ∈ K we have x ≤ s ≤ t ≤ y =⇒ d(s, t) ≤ d(x, y). Define a metric D : K^V × K^V → [0, +∞) by D(x, y) = max_{v∈V} d(x_v, y_v). It is enough to show D(T(x), T(y)) < D(x, y) for all x, y ∈ K^V, x ≠ y. Indeed, assume that this inequality is already established. Consider a point x* ∈ K^V minimizing D(x, T(x)). Such x* exists because D(x, T(x)) is continuous and K^V is a compact set. If x* ≠ T(x*), then D(T(x*), T ∘ T(x*)) < D(x*, T(x*)), a contradiction. Now, take any x, y ∈ K^V, x ≠ y. Let u ∈ V be such that D(T(x), T(y)) = d(T(x)_u, T(y)_u). Assume w.l.o.g. that u ∈ V_Max. Also, up to swapping x and y, we may assume that T(x)_u ≤ T(y)_u. Let e be an edge on which the maximum in (5.1) is attained for T(y)_u, so that T(y)_u = s[lab(e)](y_{target(e)}); by definition of T we also have T(x)_u ≥ s[lab(e)](x_{target(e)}). If x_{target(e)} = y_{target(e)}, this forces T(x)_u = T(y)_u and hence D(T(x), T(y)) = 0 < D(x, y). Otherwise we have s[lab(e)](x_{target(e)}) ≤ T(x)_u ≤ T(y)_u = s[lab(e)](y_{target(e)}). Since for any x, s, t, y ∈ K it holds that x ≤ s ≤ t ≤ y =⇒ d(s, t) ≤ d(x, y), we get: d(T(x)_u, T(y)_u) ≤ d(s[lab(e)](x_{target(e)}), s[lab(e)](y_{target(e)})) < d(x_{target(e)}, y_{target(e)}) ≤ D(x, y). We finish this section with the missing proof of Theorem 5.6.
Proof of Theorem 5.6. For the sake of readability, we will use the following notation. First, we will denote f_i by f[i]. Moreover, we will abbreviate f[a1 a2 . . . an] = f[a1] ∘ f[a2] ∘ · · · ∘ f[an] for n ∈ N, a1 a2 . . . an ∈ {1, 2, . . . , m}^n. In particular, f[empty word] will denote the identity function.
Lemma 5.8. The condition (a) of Theorem 5.6 is equivalent to the following condition: for every ε > 0 there are only finitely many w ∈ {1, 2, . . . , m}* such that diam f[w](K) > ε.
Proof. Assume for contradiction that the condition (a) holds but for some ε > 0 there are infinitely many w ∈ {1, 2, . . . , m}* with diam f[w](K) > ε. If diam f[w](K) > ε, then also diam f[u](K) > ε for every prefix u of w, because f[w](K) ⊆ f[u](K). So the set of such w is an infinite prefix-closed set of words over a finite alphabet, and by König's lemma it contains all finite prefixes of some a1 a2 a3 . . . ∈ {1, 2, . . . , m}^ω. Then diam f[a1 a2 . . . an](K) > ε for every n. This is a contradiction with the condition (a) of Theorem 5.6. The opposite direction of the lemma is obvious.
This already implies that f[i] is d-contracting for every i ∈ {1, 2, . . . , m}. Indeed, take any x, y ∈ K. If x = y, there is nothing to prove. Otherwise the quantity d(x, y) is positive, and comparing (5.5) for the pair (f[i](x), f[i](y)) with (5.5) for the pair (x, y) gives d(f[i](x), f[i](y)) < d(x, y). It remains to show that d is continuous. Consider any (x0, y0) ∈ K × K. We have to show that for any ε > 0 there exists δ > 0 such that |d(x, y) − d(x0, y0)| ≤ ε for all (x, y) ∈ K × K with |x − x0| + |y − y0| ≤ δ. By Lemma 5.8, there exists n ∈ N such that for all w ∈ {1, 2, . . . , m}* with |w| ≥ n we have: diam f[w](K) ≤ ε/6. In particular, this means that all terms in (5.5) corresponding to w ∈ {1, 2, . . . , m}* with |w| ≥ n are at most ε/3. Hence, for every (x, y) ∈ K × K we have that d(x, y) is (ε/3)-close to d_n(x, y), where d_n(x, y) is defined as in (5.5), but with only the words w of length smaller than n taken into account. Now, notice that the function d_n is continuous (as a combination of finitely many continuous functions). Hence there exists δ > 0 such that for all (x, y) ∈ K × K with |x − x0| + |y − y0| ≤ δ we have |d_n(x, y) − d_n(x0, y0)| ≤ ε/3. Obviously, for all such (x, y) we also have |d(x, y) − d(x0, y0)| ≤ ε. Proof of (b) =⇒ (a). We show that for every ε > 0 there exists n ∈ N such that for all w ∈ {1, 2, . . . , m}* with |w| ≥ n it holds that diam f[w](K) ≤ ε. Obviously, this implies (a). Define T = {(x, y) ∈ K × K : |x − y| ≥ ε}. Note that T is a compact set. The function d(x, y)/|x − y| is continuous on T. Hence, there exists z = min_{(x,y)∈T} d(x, y)/|x − y|. Observe that z > 0. Indeed, for some (x, y) ∈ T we have z = d(x, y)/|x − y|. By definition of T, we have |x − y| ≥ ε. Hence x ≠ y, so d(x, y) is positive, as well as z.
Next, define S = {(x, y) ∈ K × K : d(x, y) ≥ zε}. Again, S is a compact set. Consider the function h(x, y) = max_{i∈{1,...,m}} d(f_i(x), f_i(y))/d(x, y). The function h is continuous on S (we never have 0 in its denominator on S). Hence there exists λ = max_{(x,y)∈S} h(x, y). The function h is non-negative, so λ ≥ 0. Let us show that λ < 1. Indeed, for some (x, y) ∈ S we have λ = h(x, y). By definition of h, for some i ∈ {1, 2, . . . , m} we have λ = d(f_i(x), f_i(y))/d(x, y). Since (x, y) ∈ S, the quantity d(x, y) is positive, so x ≠ y, and since f_i is d-contracting, d(f_i(x), f_i(y)) < d(x, y), giving λ < 1. Define D = sup_{x,y∈K} d(x, y). If D = 0, then K consists of a single point, which means that the condition (a) trivially holds. From now on we assume that D > 0. Take any n ∈ N such that λ^n < zε/D.
We claim that for any w ∈ {1, 2, . . . , m}* with |w| ≥ n we have diam f[w](K) ≤ ε. We only have to show this for w of length exactly n. This is because if w′ is of length at least n, then f[w′](K) ⊆ f[w](K), and hence diam f[w′](K) ≤ diam f[w](K), where w is the prefix of w′ of length n. So take any w = a1 a2 . . . an ∈ {1, 2, . . . , m}^n and any x, y ∈ K. Let us first establish that d(f[w](x), f[w](y)) < zε. Define w_{≥i} = a_i a_{i+1} . . . a_n for i = 1, . . . , n, and let w_{≥n+1} be the empty string. Set F_i = d(f[w_{≥i}](x), f[w_{≥i}](y)), and note that, since f[a_i] is d-contracting, for every i = 1, . . . , n we have F_i ≤ F_{i+1}. In fact, if F_{i+1} ≥ zε, then, by definition of λ, it holds that F_i ≤ λF_{i+1}. Recall that F_{n+1} = d(x, y) ≤ D. Hence, either F_i < zε for some i ∈ {2, . . . , n+1}, and then F_1 ≤ F_i < zε, or F_1 ≤ λ^n F_{n+1} ≤ λ^n D < zε, by the choice of n. In both cases F_1 = d(f[w](x), f[w](y)) < zε. Finally, if |f[w](x) − f[w](y)| ≥ ε, then (f[w](x), f[w](y)) ∈ T, and hence d(f[w](x), f[w](y)) ≥ z·|f[w](x) − f[w](y)| ≥ zε, a contradiction. Thus |f[w](x) − f[w](y)| < ε for all x, y ∈ K, that is, diam f[w](K) ≤ ε.

The Structure of Continuous Positional Payoffs
In this section, we give an explicit description of the set of continuous positionally determined payoffs, see Theorem 6.3 below. Then, in Proposition 6.5, we use our description to give an alternative definition of the class of multi-discounted payoffs. Finally, in Proposition 6.7, we give an example of a continuous positionally determined payoff which, in a quite strong sense, does not "reduce" to multi-discounted payoffs. We start with some terminology. Let K ⊆ R be a compact set. We call a family of non-decreasing functions f1, . . . , fm : K → K a non-decreasing contracting base if there exists a continuous metric d : K × K → [0, +∞) satisfying the condition (c) of Theorem 5.6 for f1, . . . , fm.
Claim 6.1. Let f1, . . . , fm be a non-decreasing contracting base on K. Then for every a1 a2 a3 . . . ∈ {1, 2, . . . , m}^ω the intersection of the sets f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(K) over n ≥ 1 consists of a single point; moreover, for every x ∈ K, the point f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(x) converges to this single point as n → ∞.
Proof. This intersection is non-empty due to Cantor's intersection theorem. To show that this intersection contains just one point, observe that f1, . . . , fm satisfy the item (b) of Theorem 5.6 by definition. Hence, they also satisfy the item (a) of this theorem. This means that the diameter of this intersection is 0.
As for the second claim, note that the distance between the unique element of our intersection and the point f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(x) does not exceed diam f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(K), and the latter converges to 0 as n → ∞ by the item (a) of Theorem 5.6.
The payoff ψ : {1, 2, . . . , m}^ω → R induced by f1, . . . , fm is defined by letting ψ(a1 a2 a3 . . .) be the unique element of the intersection from Claim 6.1; equivalently, by Claim 6.1, for every x ∈ K we have

ψ(a1 a2 a3 . . .) = lim_{n→∞} f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(x). (6.1)

Claim 6.2. The payoff ψ induced by f1, . . . , fm is continuous and shift-deterministic.
Proof. Let us first establish (6.1). Take any x ∈ K. By Claim 6.1, the point f_{a1} ∘ · · · ∘ f_{an}(x) converges to the unique element of the intersection, which is ψ(a1 a2 a3 . . .). In particular, by continuity of f_i, we get ψ(i a1 a2 a3 . . .) = f_i(ψ(a1 a2 a3 . . .)). This immediately implies that ψ is shift-deterministic. To show that ψ is continuous, we use Proposition 2.5. Take any α = a1 a2 a3 . . . ∈ {1, 2, . . . , m}^ω and any infinite sequence {β_n}_{n≥1} of elements of {1, 2, . . . , m}^ω such that α and β_n have the same prefixes of length n, for every n ≥ 1. We have to show that ψ(β_n) → ψ(α) as n → ∞. By (6.1), both ψ(β_n) and ψ(α) belong to the set f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(K). Hence, the difference between ψ(β_n) and ψ(α) does not exceed the diameter of this set. But by the item (a) of Theorem 5.6, the diameter of this set converges to 0 as n → ∞.
Theorem 6.3. Let A = {1, 2, . . . , m}. A payoff φ : A^ω → R is continuous and positionally determined if and only if it can be obtained as follows: take a compact set K ⊆ R, a non-decreasing contracting base f1, . . . , fm on K, the payoff ψ induced by f1, . . . , fm, and a continuous non-decreasing function g : K → R, and set φ = g ∘ ψ.
Proof. Assume first that φ : A^ω → R is continuous and positionally determined. Then φ is prefix-monotone by Theorem 3.2. By Proposition 4.6, there is a continuous prefix-monotone shift-deterministic payoff ψ : A^ω → R and a continuous non-decreasing function g : ψ(A^ω) → R such that φ = g ∘ ψ. Set K = ψ(A^ω). Note that K is compact due to the continuity of ψ. Define f_i = s[i, ψ]. By Claim 5.2, the functions f1, f2, . . . , fm are non-decreasing. By Proposition 5.7, the functions f1, f2, . . . , fm are continuous and satisfy the item (a) of Theorem 5.6. Hence, they form a non-decreasing contracting base with respect to some continuous metric d : K × K → [0, +∞). It remains to show that ψ coincides with the payoff induced by f1, . . . , fm. For that, take any x ∈ K = ψ(A^ω). By the second part of Claim 6.1, it is sufficient to show that ψ(a1 a2 a3 . . .) = lim_{n→∞} f_{a1} ∘ · · · ∘ f_{an}(x). Take β ∈ A^ω with ψ(β) = x. Since f_i = s[i, ψ], we have f_{a1} ∘ · · · ∘ f_{an}(x) = ψ(a1 a2 . . . an β). This quantity converges to ψ(a1 a2 a3 . . .) as n → ∞ due to the continuity of ψ.
In turn, assume that φ was obtained in this way. By Theorem 3.2, we only have to show that φ is continuous and prefix-monotone. First, by Claim 6.2, we have that ψ is continuous. Since φ = g ∘ ψ and g is continuous, we have that φ is continuous as well. Second, ψ is shift-deterministic with s[i, ψ] = f_i non-decreasing, so ψ is prefix-monotone by Claim 5.2; since g is non-decreasing, φ = g ∘ ψ is prefix-monotone too.
Remark 6.4. Recall that we did not use the continuity of g from Proposition 4.6 in the inductive argument, but we do use it in the proof of Theorem 6.3.
Next, we characterize the class of multi-discounted payoffs, using the language of Theorem 6.3.
We also construct a continuous positionally determined payoff which does not "reduce" to the multi-discounted ones, in a sense of the following definition.
Definition 6.6. Let A be a finite set, φ, ψ : A^ω → R be two payoffs, and G be an A-labeled game graph. We say that φ positionally reduces to ψ inside G if any pair of positional strategies in G which is an equilibrium for ψ is also an equilibrium for φ. This definition has an algorithmic motivation. Namely, note that finding a positional equilibrium for ψ in G is at least as hard as for φ, provided that φ reduces to ψ inside G. There are classical reductions from Parity to Mean Payoff games [Jur] and from Mean Payoff to Discounted games [ZP96] that work in exactly this way. See also [Gim] for a reduction from Priority Mean Payoff games to Multi-Discounted games. As far as we know, our next proposition provides the first example of a positionally determined payoff which does not reduce to multi-discounted payoffs in this sense.
Proposition 6.7. There exist a finite set A, a continuous positionally determined payoff φ : A ω → R and an A-labeled game graph G such that there exists no multi-discounted payoff to which φ reduces inside G.
Proof. It is sufficient to establish the following lemma.
Lemma 6.8. There exist a finite set A, a continuous positionally determined payoff φ : A^ω → R and three pairs (α1, β1), (α2, β2), (α3, β3) ∈ A^ω × A^ω of ultimately periodic infinite words such that φ(α_i) > φ(β_i) for every i = 1, 2, 3, while for every multi-discounted payoff ψ : A^ω → R there exists i ∈ {1, 2, 3} with ψ(β_i) ≥ ψ(α_i). Indeed, assume that this lemma is proved. Consider the game graph from Figure 2, consisting of three pairs of "lassos". The only optimal positional strategy of Max there w.r.t. φ is to go to the left from v1, v2 and v3. On the other hand, any multi-discounted payoff has an optimal positional strategy which for some i ∈ {1, 2, 3} goes to the right from v_i. Hence, there is no multi-discounted payoff to which φ positionally reduces inside the game graph from Figure 2.
Figure 2: All nodes are owned by Max. For every i = 1, 2, 3, the node v i has two lassos L i and R i starting at it, one going to the left, and the other going to the right. We label their edges in such a way that lab(L i ) = α i and lab(R i ) = β i . This is possible because α 1 , α 2 , α 3 and β 1 , β 2 , β 3 are ultimately periodic.
To finish the proof of Lemma 6.8, we construct a continuous positionally determined payoff φ : {1, 2, 3}^ω → R with the properties required there. For that, we use Theorem 6.3. Namely, we set K = [0, 1] and d(x, y) = |x − y|. Next, we let f1(x) = x/2 and f3(x) = x/2 + 1/2. These two functions are clearly d-contracting. Finally, we let f2 : [0, 1] → [0, 1] be a piece-wise linear function with suitably chosen break-points; its slope is always in [0, 1), so f2 is also d-contracting. So f1, f2, f3 is a non-decreasing contracting base. Let φ : {1, 2, 3}^ω → R be the payoff induced by f1, f2, f3, that is, φ(a1 a2 a3 . . .) = lim_{n→∞} f_{a1} ∘ f_{a2} ∘ · · · ∘ f_{an}(1). (Of course, 1 here can be changed to any point of [0, 1], but 1 is the most convenient for computations below.) By Theorem 6.3, we have that φ is a continuous positionally determined payoff. Now, it is easy to see that φ(3^ω) = 1.
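The evaluation of a payoff induced by a contracting base on ultimately periodic words can be sketched as follows. The functions f1(x) = x/2 and f3(x) = x/2 + 1/2 are as in the construction above; since the break-points of f2 are not reproduced here, the code uses a hypothetical monotone contracting stand-in for it:

```python
# Sketch of evaluating the induced payoff phi on ultimately periodic words.
# f1 and f3 are as in the construction above; the exact piecewise-linear f2 is
# not reproduced here, so a hypothetical monotone contracting stand-in is used.

f = {
    "1": lambda x: x / 2,
    "2": lambda x: x / 4 + 3 / 8,    # stand-in for the piecewise-linear f2
    "3": lambda x: x / 2 + 1 / 2,
}

def phi(prefix, period, eps=1e-12):
    """phi(prefix . period^omega): the single point of the nested intersection
    of the sets f[a1] o f[a2] o ... o f[an] (K), K = [0, 1], as n grows."""
    lo, hi = 0.0, 1.0
    word = prefix
    while hi - lo > eps:
        word += period               # take a longer finite prefix of the word
        lo, hi = 0.0, 1.0
        for a in reversed(word):     # innermost function is applied first
            lo, hi = f[a](lo), f[a](hi)
    return (lo + hi) / 2

print(phi("", "3"))                  # close to 1.0, the fixed point of f3
print(phi("", "1"))                  # close to 0.0, the fixed point of f1
```

Since each function has slope at most 1/2 here, the enclosing interval shrinks geometrically and the loop terminates after a few dozen period repetitions.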

Strategy improvement argument
Here we establish the existence of a solution to Bellman's equations (Proposition 5.4) via strategy improvement. This will yield our third proof of Theorem 3.2. We start with an observation that the vector of values of a positional strategy always gives a solution to the restriction of Bellman's equations to edges that are consistent with this strategy.
Lemma 7.1. Let σ be a positional strategy of Max and let E_σ ⊆ E be the set of edges consistent with σ. Then for every u ∈ V we have Val[σ](u) = min { s[lab(e), φ](Val[σ](target(e))) : e ∈ E_σ, source(e) = u }. (If u ∈ V_Max, the minimum is over a single edge e = σ(u). If u ∈ V_Min, the minimum is over all edges that start at u.)
Remark 7.2. Technically, Bellman's equations are over x ∈ φ(A^ω)^V. So we have to argue that Val[σ](u) ∈ φ(A^ω) for every u ∈ V. This holds because Val[σ](u) is the infimum of a subset of φ(A^ω) and, since φ is continuous, φ(A^ω) is compact and hence closed.
Proof of Lemma 7.1. For brevity, we will denote C_u = Cons(u, σ). By definition, Val[σ](u) is the infimum of the image of φ ∘ lab on the set C_u. Now, the set C_u is exactly the set of infinite paths that start at u and consist only of edges from E_σ. So C_u is the union of the sets e C_{target(e)} over the edges e ∈ E_σ with source(e) = u. The infimum over a union of finitely many sets is the minimum of the infimums over these sets. So we get Val[σ](u) = min over these edges e of inf φ ∘ lab(e C_{target(e)}). It is sufficient to show that:

inf φ ∘ lab(e C_{target(e)}) = s[lab(e)](Val[σ](target(e))). (7.1)

For any a ∈ A and S ⊆ A^ω, by definition of s[a], we have inf φ(aS) = inf s[a](φ(S)) = s[a](inf φ(S)), where the last equality uses the fact that s[a] is non-decreasing and continuous. After applying this to a = lab(e) and S = lab(C_{target(e)}), we obtain (7.1). Next, take a positional strategy σ of Max. If the vector {Val[σ](u)}_{u∈V} happens to be a solution to the Bellman's equations, then we are done. Otherwise, by Lemma 7.1, there must exist an edge e ∈ E with source(e) ∈ V_Max such that Val[σ](source(e)) < s[lab(e), φ](Val[σ](target(e))). We call edges satisfying this property σ-violating. We show that switching σ to any σ-violating edge gives us a positional strategy which improves σ.
Lemma 7.3. Let σ be a positional strategy of Max, let e′ be a σ-violating edge, and let σ′ be obtained from σ by switching to e′ (that is, σ′(source(e′)) = e′, and σ′ agrees with σ at all other nodes). Then Val[σ′](u) ≥ Val[σ](u) for every u ∈ V, and Val[σ′](source(e′)) > Val[σ](source(e′)).
Proof. For x ∈ φ(A^ω)^V, let the modified cost of an edge e ∈ E with respect to x be the quantity R_x(e) = s[lab(e), φ](x_{target(e)}) − x_{source(e)}. We need the following "potential transformation lemma" (its analog for discounted payoffs is well-known, see, e.g., [HMZ13, Lemma 3.6]).
We apply this lemma to the vector g = {Val[σ](u)} u∈V . Note that by Lemma 7.1 we have R g (e) ≥ 0 for every e ∈ E σ . In turn, since e ′ is σ-violating, we have R g (e ′ ) > 0.
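For a concrete picture, here is a sketch (with made-up data) of detecting σ-violating edges in the special case of a discounted payoff, s[a](x) = r + λ·x, on a graph where all nodes belong to Max:

```python
# Sketch (made-up data): detecting sigma-violating edges for a discounted
# payoff, s[a](x) = r + lam * x, in a graph where all nodes belong to Max.
# A positional strategy sigma picks one out-going edge per node; its value at
# a node is the payoff of the unique lasso-shaped play from that node.

lam = 0.5
edges = {                  # node -> list of (target, reward)
    "u": [("v", 0.0), ("u", 1.0)],
    "v": [("u", 5.0)],
}

def value(sigma, start):
    """Discounted payoff of the unique sigma-play from start."""
    rewards, seen, node = [], {}, start
    while node not in seen:
        seen[node] = len(rewards)
        node, r = edges[node][sigma[node]]
        rewards.append(r)
    i = seen[node]                           # the cycle starts at position i
    head, cycle = rewards[:i], rewards[i:]
    cyc = sum(r * lam**j for j, r in enumerate(cycle)) / (1 - lam**len(cycle))
    return sum(r * lam**j for j, r in enumerate(head)) + lam**i * cyc

def violating_edges(sigma):
    """Edges e with Val[sigma](source(e)) < r(e) + lam * Val[sigma](target(e))."""
    val = {n: value(sigma, n) for n in edges}
    return [(n, k) for n in edges for k, (t, r) in enumerate(edges[n])
            if val[n] < r + lam * val[t] - 1e-9]

sigma = {"u": 1, "v": 0}         # from u take the self-loop; suboptimal
print(violating_edges(sigma))    # [('u', 0)]: switching at u improves sigma
```

Switching σ to the reported edge and re-evaluating shows the value vector weakly increasing everywhere and strictly increasing at the switched node, in line with the lemma.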
Let us first show that Val[σ′](u) ≥ Val[σ](u) = g_u for every u ∈ V. In other words, we will demonstrate that φ ∘ lab(P) ≥ g_u for any infinite path P = e1 e2 e3 . . . ∈ Cons(u, σ′). Indeed, by Lemma 7.4 we can write:

φ ∘ lab(P) − g_u = Σ_{n≥1} λ_n R_g(e_n), (7.2)

for some λ_n ∈ [0, +∞) with λ_1 = 1. All edges of P are from E_σ ∪ {e′}. Hence, all terms in this series are non-negative, and so is the left-hand side.
It remains to show the strict inequality for u = source(e′). The first edge of any P ∈ Cons(u, σ′) is e′. So the first term in (7.2) for any such P equals R_g(e′). All the other terms, as we have discussed, are non-negative. Hence, φ ∘ lab(P) ≥ R_g(e′) + Val[σ](u) for any P ∈ Cons(u, σ′). Since R_g(e′) is strictly positive, we get that Val[σ′](u) ≥ R_g(e′) + Val[σ](u) > Val[σ](u).

Subexponential-time Algorithm
In this subsection, we discuss implications of our strategy improvement argument for the strategy synthesis problem. The strategy synthesis for a positionally determined payoff φ is the algorithmic problem of finding an equilibrium (with respect to φ) of two positional strategies in a given game graph. It is classical that the strategy synthesis for parity, mean and multi-discounted payoffs admits a randomized algorithm which is subexponential in the number of nodes [Hal07, BV05]. We obtain the same subexponential bound for all continuous positionally determined payoffs. For that, we use the framework of recursively local-global functions due to Björklund and Vorobyov [BV05].
Let us start with an observation that for continuous positionally determined shift-deterministic payoffs, a non-optimal positional strategy can always be improved by changing it in a single node.
Proposition 8.1. Let φ be a continuous positionally determined shift-deterministic payoff and let σ be a positional strategy of Max which is not optimal. Then there exists a positional strategy σ′ of Max, obtained from σ by changing it in a single node, such that Val[σ′](u) ≥ Val[σ](u) for every u ∈ V, with at least one of these inequalities being strict.
Proof. Since σ is not optimal, the vector Val[σ] is not a solution to Bellman's equations (otherwise σ would be optimal by Lemma 5.5). So we can take σ′ as in Lemma 7.3, obtained by switching σ to some σ-violating edge. The argument for positional strategies of Min is similar.
It is instructive to visualize this proposition by imagining the set of positional strategies of one of the players (say, Max) as a hypercube. Namely, in this hypercube there will be as many dimensions as there are nodes of Max. A coordinate, corresponding to a node u ∈ V_Max, will take values in the set of edges that start at u. Obviously, vertices of such a hypercube are in a one-to-one correspondence with positional strategies of Max. Let us call two vertices neighbors of each other if they differ in exactly one coordinate. Now, Proposition 8.1 means the following: any vertex σ maximizing Σ_{u∈V} Val[σ](u) over its neighbors also maximizes this quantity over the whole hypercube.
So the optimization problem of maximizing Σ_{u∈V} Val[σ](u) (equivalently, finding an optimal positional strategy of Max) has the following remarkable feature: all its local maxima are also global. For positional strategies of Min the same holds for the minima. Optimization problems with this feature have been studied in numerous works, starting from the classical area of convex optimization.
Observe that in our case this local-global property is recursive; i.e., it holds for any restriction to a subcube of our hypercube. Indeed, subcubes correspond to subgraphs of our initial game graph, and for any subgraph we still have Proposition 8.1. Björklund and Vorobyov [BV05] noticed that a similar phenomenon occurs for all classical positionally determined payoffs. In turn, they showed that any optimization problem on a hypercube with this recursive local-global property admits a randomized algorithm which is subexponential in the dimension of a hypercube. In our case, this yields a randomized algorithm for the strategy synthesis problem which is subexponential in the number of nodes of a game graph.
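The local-global phenomenon itself is easy to experiment with on a toy example. The function below (made-up data, not the Björklund-Vorobyov algorithm itself) is coordinate-wise separable, hence recursively local-global, and naive single-switch hill climbing reaches a global maximum:

```python
# Toy illustration (made-up data, not the Bjorklund-Vorobyov algorithm): a
# coordinate-wise separable f on a "hypercube" of strategies is recursively
# local-global, so single-switch hill climbing finds a global maximum.

import itertools

domains = [2, 3, 2]                   # out-degrees of three hypothetical nodes
gains = [[0, 5], [1, 4, 2], [3, 3]]   # made-up per-coordinate contributions

def f(sigma):
    return sum(g[c] for g, c in zip(gains, sigma))

def hill_climb(sigma):
    """Switch one coordinate at a time while this improves f."""
    improved = True
    while improved:
        improved = False
        for i, size in enumerate(domains):
            for c in range(size):
                cand = sigma[:i] + (c,) + sigma[i + 1:]
                if f(cand) > f(sigma):
                    sigma, improved = cand, True
    return sigma

best = max(itertools.product(*map(range, domains)), key=f)
local = hill_climb((0, 0, 0))
print(local, f(local), f(best))       # the local search attains f(best) = 12
```

Of course, the function σ ↦ Σ_u Val[σ](u) from the text is not separable; the point of Proposition 8.1 is that it nevertheless enjoys the same local-global property.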
Still, this only applies to continuous payoffs that are shift-deterministic (as we have Proposition 8.1 only for shift-deterministic payoffs). One more issue is that we did not specify how our payoffs are represented. We overcome these difficulties in the following result. Its proof is given in Section 9.
Theorem 8.2. Let A be a finite set and φ : A^ω → R be a continuous positionally determined payoff. Consider an oracle which, for given u, v, a, b ∈ A*, tells whether there exists w ∈ A* such that φ(wu(v)^ω) > φ(wa(b)^ω). There exists a randomized algorithm which solves the strategy synthesis problem for φ with this oracle in expected e^{O(log m + √(n log m))} time for game graphs with n nodes and m edges. In particular, every call to the oracle in the algorithm is for u, v, a, b ∈ A* of length O(n), and the expected number of the calls is e^{O(log m + √(n log m))}.

So to deal with the issue of representation we assume a suitable oracle access to φ. Still, the oracle from Theorem 8.2 might look unmotivated. Here it is instructive to recall that all continuous positionally determined φ must be prefix-monotone. For prefix-monotone φ, the formula ∃w ∈ A* φ(wα) > φ(wβ) defines a total preorder on A^ω, and our oracle just compares ultimately periodic infinite words according to this preorder. In fact, it is easy to see that the formula ∃w ∈ A* φ(wα) > φ(wβ) defines a total preorder on A^ω if and only if φ is prefix-monotone. This indicates a fundamental role of this preorder for prefix-monotone φ and justifies the use of the corresponding oracle in Theorem 8.2. Let us note that ∃w ∈ A* φ(wα) > φ(wβ) ⇐⇒ φ(α) > φ(β) if φ is additionally shift-deterministic.

9. Proof of Theorem 8.2

First, in Subsection 9.1, we demonstrate that w.l.o.g. we may assume that φ is shift-deterministic (so that we can use Proposition 8.1) and that we are given an oracle which simply compares values of φ on ultimately periodic infinite words. Then, in Subsection 9.2, we expose a framework of recursively local-global functions due to Björklund and Vorobyov. Finally, in Subsection 9.3, we use this framework to show Theorem 8.2 under the assumptions of Subsection 9.1.

9.1. Reducing to shift-deterministic payoffs. It is sufficient to establish Theorem 8.2 with the following assumptions.
To justify this, it is enough to show the following lemma.
Indeed, let φ be an arbitrary continuous positionally determined payoff and ψ be as in Lemma 9.3. By Proposition 2.3, an equilibrium for ψ is also an equilibrium for φ = g • ψ. So to solve the strategy synthesis for φ, it is enough to do so for ψ. Clearly, ψ satisfies Assumption 9.1. Finally, note that the oracle from Assumption 9.2 for ψ simply coincides on every input with the oracle we are given for φ in Theorem 8.2.
To demonstrate this, we have to recall the construction of ψ. By (4.1), ψ(α) and ψ(β) are given by series whose terms are proportional to φ(wα) and φ(wβ), respectively, over w ∈ A*. If φ(wα) ≤ φ(wβ) for all w ∈ A*, then, comparing these series term by term, we clearly get ψ(α) ≤ ψ(β). This is exactly the contrapositive of the implication that we have to prove.
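As an illustration of the comparison oracle, here is how it can be instantiated (hypothetically) for an ordinary discounted payoff, which is shift-deterministic and prefix-monotone; the quantifier over w then disappears, and the comparison reduces to a closed-form computation on ultimately periodic words:

```python
# Sketch: the comparison oracle instantiated for an ordinary discounted payoff
# phi(r0 r1 r2 ...) = sum_i lam^i * r_i (illustrative; labels are the rewards
# themselves). Since this phi is shift-deterministic and prefix-monotone, the
# quantifier over w disappears, and phi has a closed form on periodic words.

lam = 0.75

def phi_up(prefix, period):
    """phi(prefix . period^omega) in closed form."""
    head = sum(r * lam**i for i, r in enumerate(prefix))
    cyc = sum(r * lam**i for i, r in enumerate(period)) / (1 - lam**len(period))
    return head + lam**len(prefix) * cyc

def oracle(u, v, a, b):
    """Is there w in A* with phi(w u v^omega) > phi(w a b^omega)?"""
    return phi_up(u, v) > phi_up(a, b)

print(oracle([1, 0], [2], [0], [3]))     # False: 5.5 is not greater than 9.0
```

For a general prefix-monotone payoff no such closed form is available, which is exactly why the oracle is taken as a primitive in Theorem 8.2.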
9.2. The framework of recursively local-global functions. Let S be a structure and f be a function from the set of vertices of S to R; we are interested in finding a global maximum of f. In [BV05], Björklund and Vorobyov obtained the following result.
Theorem 9.4 [BV05, Theorem 5.1]. Let S = {S_i}_{i=1}^d be a d-dimensional structure and f : S_1 × · · · × S_d → R be a recursively local-global function. Consider an oracle which, given two vertices σ1 and σ2 of S that are neighbors of each other, compares f(σ1) and f(σ2). There is a randomized algorithm which finds a global maximum of f with this oracle using an expected e^{O(log m + √(d log m))} number of comparisons, where m = |S_1| + · · · + |S_d|.

9.3. Deriving Theorem 8.2 with Assumptions 9.1 and 9.2. Let G = ⟨V, V_Max, V_Min, E⟩ be an A-labeled game graph in which we want to solve the strategy synthesis. We will only show how to find an optimal positional strategy of Max; the argument for Min is similar.
The structure S has one coordinate for every node of Max, and the coordinate corresponding to a node u ∈ V_Max ranges over the edges that start at u. Obviously, we may identify vertices of S with positional strategies of Max. Define f(σ) = Σ_{u∈V} Val[σ](u).

Lemma 9.5. Any global maximum of f is an optimal positional strategy of Max.
Proof. Let σ be a global maximum of f and σ* be any uniformly optimal positional strategy of Max. By uniform optimality of σ*, we have Val[σ*](u) ≥ Val[σ](u) for every u ∈ V. On the other hand, σ maximizes the sum of the values (over all positional strategies of Max), so we must have Val[σ*](u) = Val[σ](u) for every u ∈ V. This means that σ is also optimal.
Lemma 9.6. The function f is recursively local-global.

Proof.
The fact that f is local-global is a simple consequence of Proposition 8.1 (note that by Assumption 9.1, our payoff satisfies the requirements of this proposition). Indeed, a strategy σ which is not a global maximum of f cannot be optimal. Then take σ′ as in Proposition 8.1. It is a neighbor of σ with f(σ′) > f(σ), so σ is not a local maximum either.
To show that f is recursively local-global, it is sufficient to note that substructures of S correspond to subgraphs of G, and for these subgraphs we also have Proposition 8.1.
Due to these two lemmas, if we run the algorithm from Theorem 9.4, we get an optimal positional strategy of Max using an expected e^{O(log m + √(d log m))} number of comparisons. Note that d does not exceed the number of nodes of G and m does not exceed the number of edges, so Theorem 8.2 follows.
Still, the algorithm from Theorem 9.4 requires an oracle which, given any two vertices σ1 and σ2 of S that are neighbors of each other, compares f(σ1) and f(σ2). In our case, this oracle, given two positional strategies σ1, σ2 of Max that differ from each other in exactly one node, compares the sums of their values, Σ_{u∈V} Val[σ1](u) and Σ_{u∈V} Val[σ2](u). We have to perform this comparison using the oracle from Assumption 9.2.
Assume that the node where σ1 and σ2 differ is v. Let G1 (respectively, G2) be the game graph obtained from G by deleting all edges that are not consistent with σ1 (resp., σ2). Next, let G_{1,2} be the game graph consisting of all edges that appear either in G1 or in G2.
Observe that in G_{1,2}, strategies σ1, σ2 are the only two positional strategies of Max (indeed, all nodes of Max except v have exactly one out-going edge in G_{1,2}, and v has exactly two). One of these strategies must be optimal in G_{1,2}. So either Val[σ1](u) ≥ Val[σ2](u) for every u ∈ V, or Val[σ2](u) ≥ Val[σ1](u) for every u ∈ V. Thus, our task reduces to the task of comparing Val[σ1](u) and Val[σ2](u) for u ∈ V.
Assume first that our game graph G is one-player. This means that for one of the players it holds that all nodes of this player have out-degree 1. In our case, this must be Min, because Max has two distinct positional strategies σ 1 and σ 2 . In particular, there is exactly one strategy τ of Min in G, and this strategy is positional (even if there are no nodes controlled by Min, we assume that Min has a unique empty strategy τ ). Hence, Val[σ 1 ](u) = φ•lab P σ 1 ,τ u and Val[σ 2 ](u) = φ • lab P σ 2 ,τ u . It remains to compare the value of φ on lab P σ 1 ,τ u and on lab P σ 2 ,τ u using the oracle from Assumption 9.2. These two infinite words are written over some lassos in G, so we can decompose them as lab P σ 1 ,τ u = u(v) ω and lab P σ 2 ,τ u = a(b) ω in polynomial time.
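The lasso decomposition used here can be sketched as follows (illustrative graph data): in a graph where every node has exactly one chosen out-going edge, the play from any node reaches a cycle, and its label sequence splits into a finite prefix and a periodic part:

```python
# Sketch (illustrative data) of the lasso decomposition used above: when every
# node has exactly one chosen out-going edge, the play from any node is a
# "lasso", and its label sequence decomposes as u . (v)^omega.

succ = {                # node -> (next node, edge label)
    "s": ("t", "a"),
    "t": ("p", "b"),
    "p": ("q", "c"),
    "q": ("p", "d"),
}

def lasso_labels(start):
    """Return (prefix, cycle): lab(play from start) = prefix . cycle^omega."""
    labels, index, node = [], {}, start
    while node not in index:
        index[node] = len(labels)
        node, lab = succ[node]
        labels.append(lab)
    i = index[node]                  # first repeated node: the cycle entry
    return labels[:i], labels[i:]

print(lasso_labels("s"))             # (['a', 'b'], ['c', 'd'])
```

Both the prefix and the cycle have length at most the number of nodes, which is why the oracle calls in Theorem 8.2 involve words of length O(n).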
Theorem 8.2 is already proved for one-player game graphs. Hence, at the cost of increasing the expected running time by a factor of e^{O(log m + √(n log m))}, we may assume that we also have an oracle which can solve the strategy synthesis for φ in one-player game graphs. Then we can find an optimal positional strategy τ1 of Min in G1 and an optimal positional strategy τ2 of Min in G2. Indeed, in these two graphs all nodes of Max have exactly one out-going edge. Observe that τ1 is an optimal response to σ1 and τ2 is an optimal response to σ2, so we have Val[σ1](u) = φ ∘ lab(P^{σ1,τ1}_u) and Val[σ2](u) = φ ∘ lab(P^{σ2,τ2}_u). It remains to compare the value of φ on lab(P^{σ1,τ1}_u) and lab(P^{σ2,τ2}_u). We can do this via the oracle from Assumption 9.2.

Multi-discounted Payoffs and MDPs
In this section, we establish the following result.
Theorem 10.1. Let A be a finite set and φ : A^ω → R be a continuous payoff. Then φ is positionally determined in MDPs if and only if φ is multi-discounted.
This theorem disproves the following conjecture of Gimbert [Gim07]: "Any payoff function which is positional for the class of non-stochastic one-player games is positional for the class of Markov decision processes". Indeed, by Proposition 6.7, there exists a continuous positionally determined payoff which is not multi-discounted. By Theorem 10.1, this payoff is not positionally determined in MDPs.
Proposition 10.3. If a continuous payoff is positionally determined in MDPs, then this payoff is prefix-monotone.
We also show that these two necessary conditions imply that φ is multi-discounted.
Then φ is a multi-discounted payoff. Note that Proposition 10.3 is already proved. Indeed, in Section 3 we have shown that for any continuous payoff φ which is not prefix-monotone, there exists a game graph where φ is not positionally determined. This game graph had the following feature: all its nodes were controlled by Max. Thus, this game graph is a deterministic MDP, which means that any continuous payoff which is not prefix-monotone is not positionally determined in MDPs.
To finish our proof of Theorem 10.1, it remains to prove Propositions 10.2 and 10.4.
10.1. Proof of Proposition 10.2. Assume for contradiction that such a, β, γ, δ, (p 1 , p 2 , p 3 ) and (q 1 , q 2 , q 3 ) exist. By the continuity of φ, we may assume that β, γ and δ are ultimately periodic. We construct an A-labeled MDP M where φ has no optimal positional strategy.
To define M, consider an A-labeled game graph from Figure 3. In this graph there are exactly 3 infinite paths ("lassos") P 1 , P 2 , P 3 that start at v. We label their edges in such a way that lab(P 1 ) = β, lab(P 2 ) = γ, lab(P 3 ) = δ. This is possible because β, γ and δ are ultimately periodic.
Next, we turn this graph into an MDP (formally, nodes of the graph will be states of the MDP). There will be two actions available at the node v. Both will be distributed on the set of successors of v. One gives a probability p i to the successor which leads to the lasso P i , for i = 1, 2, 3. The other gives a probability q i to the successor which leads to the lasso P i , for i = 1, 2, 3. For each node different from v there will be only one action with the source in this node, leading with probability 1 to its unique successor.
It remains to define the labeling function of M. Fix a transition; it is over some edge of the graph from Figure 3. We define the label of this transition to be the label of this edge. This concludes the description of M.
From the second and the third coordinate we conclude that (λ, w) must be a solution to (10.1), so λ = λ(a), w = w(a). Now, looking at the first coordinate, we obtain that φ(aβ) = λ(a)φ(β) + w(a), as required.
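The identity φ(aβ) = λ(a)φ(β) + w(a) determines a multi-discounted payoff on ultimately periodic words: composing the affine maps along one period gives an equation x = Lx + W with L < 1, whose unique solution is the value of the periodic tail. A sketch with made-up discounts and rewards:

```python
# The identity phi(a beta) = lam(a) * phi(beta) + w(a) pins down a
# multi-discounted payoff on ultimately periodic words: composing the affine
# maps x -> lam(c) * x + w(c) along one period gives x = L * x + W with L < 1,
# whose unique solution is the value of the periodic tail. Made-up data below.

lam = {"a": 0.5, "b": 0.8}     # per-letter discounts, all strictly below 1
w = {"a": 1.0, "b": -1.0}      # per-letter rewards

def phi_periodic(period):
    """phi(period^omega): the fixed point of the composed affine map."""
    L, W = 1.0, 0.0
    for c in reversed(period):         # compose from the innermost letter out
        L, W = lam[c] * L, lam[c] * W + w[c]
    return W / (1 - L)

def phi_up(prefix, period):
    """phi(prefix . period^omega), obtained by unrolling the identity."""
    x = phi_periodic(period)
    for c in reversed(prefix):
        x = lam[c] * x + w[c]
    return x

print(phi_periodic("b"), phi_up("a", "b"))    # -5.0 and -1.5 (up to rounding)
```

This is exactly the mechanism behind the uniqueness argument above: once λ(a) and w(a) are pinned down by (10.1), the value of φ on every ultimately periodic word is forced.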