Convergence Properties of a Computational Learning Model for Unknown Markov Chains

Andreas A. Malikopoulos

Source: Journal of Dynamic Systems, Measurement, and Control, 2009, volume 131, issue 4, page 41011

DOI: 10.1115/1.3117202

Publisher: The American Society of Mechanical Engineers (ASME)

Abstract: The increasing complexity of engineering systems has motivated continuing research on computational learning methods aimed at building autonomous intelligent systems that can learn to improve their performance over time while interacting with their environment. These systems need not only to sense their environment, but also to integrate information from the environment into all decision making. The evolution of such systems is modeled as an unknown controlled Markov chain. In previous research, the predictive optimal decision-making (POD) model was developed, aiming to learn in real time the unknown transition probabilities and associated costs over a varying finite time horizon. In this paper, the convergence of the POD to the stationary distribution of a Markov chain is proven, thus establishing the POD as a robust model for making autonomous intelligent systems. The paper provides the conditions under which the POD is valid, together with an interpretation of its underlying structure.
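
The two notions the abstract relies on, learning unknown transition probabilities from observed transitions and the stationary distribution of a Markov chain, can be illustrated with a minimal sketch. This is not the POD model from the paper: it assumes a plain, uncontrolled two-state chain with no costs, and the matrix TRUE_P, the function names, and the sample size are all hypothetical choices made for illustration only.

```python
import random


def simulate_chain(P, start, steps, rng):
    """Sample a state path from a chain with row-stochastic matrix P."""
    state = start
    path = [state]
    for _ in range(steps):
        # Draw the next state using the current state's row as weights.
        state = rng.choices(range(len(P)), weights=P[state])[0]
        path.append(state)
    return path


def estimate_transitions(path, n_states):
    """Estimate transition probabilities by counting observed transitions."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    estimate = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row if a state was never visited.
        estimate.append([c / total if total else 1.0 / n_states for c in row])
    return estimate


def stationary_distribution(P, iterations=1000):
    """Approximate pi with pi = pi P by repeated multiplication (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical "unknown" chain; the learner only ever sees the sampled path.
TRUE_P = [[0.9, 0.1], [0.5, 0.5]]
rng = random.Random(0)
path = simulate_chain(TRUE_P, start=0, steps=20000, rng=rng)

P_hat = estimate_transitions(path, n_states=2)
pi_true = stationary_distribution(TRUE_P)
pi_hat = stationary_distribution(P_hat)
```

For this assumed chain the exact stationary distribution is (5/6, 1/6), and with enough observed transitions pi_hat should land close to it. Power iteration is used here only because it needs no external libraries; it converges for chains like this one that are irreducible and aperiodic.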

URI

http://yetl.yabesh.ir/yetl1/handle/yetl/140201

Collections

Journal of Dynamic Systems, Measurement, and Control

contributor author	Andreas A. Malikopoulos
date accessioned	2017-05-09T00:32:10Z
date available	2017-05-09T00:32:10Z
date copyright	July, 2009
date issued	2009
identifier issn	0022-0434
identifier other	JDSMAA-26497#041011_1.pdf
identifier uri	http://yetl.yabesh.ir/yetl/handle/yetl/140201
publisher	The American Society of Mechanical Engineers (ASME)
title	Convergence Properties of a Computational Learning Model for Unknown Markov Chains
type	Journal Paper
journal volume	131
journal issue	4
journal title	Journal of Dynamic Systems, Measurement, and Control
identifier doi	10.1115/1.3117202
journal firstpage	41011
identifier eissn	1528-9028
tree	Journal of Dynamic Systems, Measurement, and Control; 2009; volume 131; issue 4
contenttype	Fulltext