Title: Approximating ergodic average reward continuous-time controlled Markov chains
Author (repository record): Lorenzo Magán, José María
Issued: January 2010 | Deposited: 2024-10-04
Type: journal article
Language: English
Access: restricted access
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)
ISSN: 0018-9286 | eISSN: 1558-2523
DOI: 10.1109/TAC.2009.2033848
Handle: https://hdl.handle.net/20.500.14352/108643

Citation: T. Prieto-Rumeau and J. M. Lorenzo, "Approximating Ergodic Average Reward Continuous-Time Controlled Markov Chains," IEEE Transactions on Automatic Control, vol. 55, no. 1, pp. 201-207, Jan. 2010, doi: 10.1109/TAC.2009.2033848.

Keywords: Convergence; Optimal control; State-space methods; Statistics; Operations research; Process control; Adaptive control; Terminology; Approximation of control problems; Ergodic Markov decision processes (MDPs); Policy iteration algorithm
Subjects: Statistics (Estadística); 1209 Statistics (UNESCO classification)

Abstract: We study the approximation of an ergodic average-reward continuous-time denumerable-state Markov decision process (MDP) by means of a sequence of MDPs. Our results include the convergence of the corresponding optimal policies and optimal gains. For a controlled upwardly skip-free process, we present computational results that illustrate the convergence theorems.
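To make the abstract's theme concrete, here is a minimal sketch of the generic approach it names: approximate a denumerable-state continuous-time MDP by a finite MDP (truncating the state space and uniformizing the generator) and solve it with average-reward policy iteration. This is not the paper's construction; the truncation level, transition rates, reward, and action set below are all hypothetical, chosen only to give a runnable upwardly skip-free example.

```python
import numpy as np

# Hypothetical setup: truncate the denumerable state space to {0, ..., N-1}
# and control the service rate of a birth-death (upwardly skip-free) chain.
N = 50                      # truncation level (an assumption, not from the paper)
actions = [0.5, 1.0, 2.0]   # hypothetical service-rate controls

def generator_row(s, a):
    """Transition rates out of state s under action a (jumps up by at most one)."""
    q = np.zeros(N)
    if s + 1 < N:
        q[s + 1] = 1.0            # constant arrival rate (illustrative)
    if s > 0:
        q[s - 1] = a * s          # controlled, state-dependent service rate
    q[s] = -q.sum()
    return q

def reward(s, a):
    return -s - 0.1 * a           # hypothetical holding + control cost

# Uniformization: P = I + Q / Lam turns the generator into a discrete-time
# transition matrix with the same average-reward optimal policy.
Lam = 1.0 + max(actions) * N      # >= every total exit rate

def transition_row(s, a):
    return np.eye(N)[s] + generator_row(s, a) / Lam

def evaluate(policy):
    """Solve the average-reward Poisson equation h + g*1 = r + P h for a
    fixed policy, with the normalization h[0] = 0."""
    P = np.array([transition_row(s, policy[s]) for s in range(N)])
    r = np.array([reward(s, policy[s]) for s in range(N)])
    A = np.eye(N) - P
    A[:, 0] = 1.0                 # column 0 multiplies h[0]=0, so reuse it for g
    x = np.linalg.solve(A, r)
    return x[0], np.concatenate(([0.0], x[1:]))   # gain g, bias h

# Policy iteration: evaluate, then improve greedily against the bias h.
policy = np.full(N, actions[0])
while True:
    g, h = evaluate(policy)
    improved = np.array([
        max(actions, key=lambda a: reward(s, a) + transition_row(s, a) @ h)
        for s in range(N)
    ])
    if np.array_equal(improved, policy):
        break
    policy = improved

print(f"approximate optimal gain g* ~ {g:.4f}")
```

Increasing the truncation level N yields the kind of sequence of finite MDPs whose gains and policies the paper proves converge to those of the original denumerable-state problem; the specific convergence conditions are given in the article itself.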