Abstract

This paper is concerned with the properties of the value-iteration operator which arises in undiscounted Markov decision problems. We give both necessary and sufficient conditions for this operator to reduce to a contraction operator, in which case it is easy to show that the value-iteration method exhibits a uniform geometric convergence rate. As necessary conditions we obtain a number of important characterizations of the chain and periodicity structures of the problem, and as sufficient conditions, we give a general "scrambling-type" recurrency condition, which encompasses a number of important special cases. Next, we show that a data transformation turns every unichained undiscounted Markov Renewal Program into an equivalent undiscounted Markov decision problem, in which the value-iteration operator is contracting, because it satisfies this "scrambling-type" condition. We exploit this contraction property in order to obtain lower and upper bounds as well as variational characterizations for the fixed point of the optimality equation and a test for eliminating suboptimal actions.

Authors
Awi Federgruen, Paul Schweitzer, and H. C. Tijms
Format
Journal Article
Publication Date
Journal
Journal of Mathematical Analysis and Applications

Full Citation

Federgruen, Awi, Paul Schweitzer, and H. C. Tijms
. “Contraction mappings underlying undiscounted Markov decision problems.”
Journal of Mathematical Analysis and Applications
vol.
65
, (October 01, 1978):
711
-
730
.