Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes

Krishnendu Chatterjee; Zuzana Křetínská; Jan Křetínský

doi:10.23638/LMCS-13(2:15)2017

lmcs:3757 - Logical Methods in Computer Science, July 3, 2017, Volume 13, Issue 2 - https://doi.org/10.23638/LMCS-13(2:15)2017

We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives. There exist two different views: (i) the expectation semantics, where the goal is to optimize the expected mean-payoff objective, and (ii) the satisfaction semantics, where the goal is to maximize the probability of runs such that the mean-payoff value stays above a given vector. We consider optimization with respect to both objectives at once, thus unifying the existing semantics. Precisely, the goal is to optimize the expectation while ensuring the satisfaction constraint. Our problem captures the notion of optimization with respect to strategies that are risk-averse (i.e., that ensure a certain probabilistic guarantee). Our main results are as follows. First, we present algorithms for the decision problems that are always polynomial in the size of the MDP. We also show that an approximation of the Pareto curve can be computed in time polynomial in the size of the MDP and in the approximation factor, but exponential in the number of dimensions. Second, we present a complete characterization of the strategy complexity (in terms of memory bounds and randomization) required to solve our problem.
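The difference between the two semantics, and the combined problem of optimizing the expectation under a satisfaction constraint, can be illustrated on a toy example. The sketch below is not taken from the article: the MDP, the names `OUTCOMES`, `evaluate` and `best_mix`, and all numbers are hypothetical, and the brute-force grid search over randomized first moves merely stands in for the article's polynomial-time algorithms. It assumes a single reward dimension and an MDP whose first action leads to absorbing states, so that the mean payoff of a run equals the reward of the absorbing state it reaches.

```python
from fractions import Fraction as F

# Hypothetical toy MDP (an assumption for illustration only).
# Action "safe" leads to an absorbing state with reward 1/2 per step.
# Action "risky" leads, with probability 1/2 each, to absorbing states
# with reward 1 or reward 1/5 per step.
# Each entry is (probability, mean payoff of the absorbing state reached).
OUTCOMES = {
    "safe": [(F(1), F(1, 2))],
    "risky": [(F(1, 2), F(1)), (F(1, 2), F(1, 5))],
}


def evaluate(p_risky, threshold):
    """Return (expected mean payoff, P[mean payoff >= threshold]) for the
    strategy that plays "risky" with probability p_risky, else "safe"."""
    weights = {"safe": 1 - p_risky, "risky": p_risky}
    expectation = F(0)
    satisfaction = F(0)
    for action, weight in weights.items():
        for prob, payoff in OUTCOMES[action]:
            expectation += weight * prob * payoff
            if payoff >= threshold:
                satisfaction += weight * prob
    return expectation, satisfaction


def best_mix(threshold, min_prob, steps=100):
    """Grid search over p_risky: maximize the expectation subject to
    P[mean payoff >= threshold] >= min_prob. Returns (p, exp, sat) or None."""
    best = None
    for i in range(steps + 1):
        p = F(i, steps)
        expectation, satisfaction = evaluate(p, threshold)
        if satisfaction >= min_prob and (best is None or expectation > best[1]):
            best = (p, expectation, satisfaction)
    return best


if __name__ == "__main__":
    print(evaluate(F(0), F(2, 5)))      # always "safe"
    print(evaluate(F(1), F(2, 5)))      # always "risky"
    print(best_mix(F(2, 5), F(4, 5)))   # best mix under the constraint
```

In this toy instance, always playing "risky" gives the higher expectation but meets the threshold 2/5 only with probability 1/2, while always playing "safe" meets it with probability 1 at a lower expectation. Requiring satisfaction probability at least 4/5 forces a randomized mix of the two, which hints at why randomization appears in the strategy-complexity results mentioned in the abstract.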

Source: arXiv.org:1502.00611

Volume: Volume 13, Issue 2

Published on: July 3, 2017

Accepted on: July 3, 2017

Submitted on: July 3, 2017

Keywords: Computer Science - Logic in Computer Science

Licence: arXiv.org - Non-exclusive license to distribute

Funding:

Source : OpenAIRE Graph

International IST Postdoctoral Fellowship Programme; Funder: European Commission; Code: 291734
Quantitative Reactive Modeling; Funder: European Commission; Code: 267989
Quantitative Graph Games: Theory and Applications; Funder: European Commission; Code: 279307
Modern Graph Algorithmic Techniques in Formal Verification; Funder: Austrian Science Fund (FWF); Code: P 23499
Formal methods for the design and analysis of complex systems; Funder: Austrian Science Fund (FWF); Code: Z 211
