Author: Sertan Girgin
Publisher: Springer
ISSN: 0885-6125
Source: Machine Learning, Vol.81, Iss.3, 2010-12, pp. : 283-331
Abstract
This paper proposes a novel approach to discovering options in the form of stochastic conditionally terminating sequences, and shows how such sequences can be integrated into the reinforcement learning framework to improve learning performance. The method stores histories of possibly optimal policies and constructs a specialized tree structure during learning. The tree makes it easy to identify frequently used action sequences, together with the states visited while those sequences are executed. It is updated continuously and used to run the corresponding options implicitly. The effectiveness of the method is demonstrated empirically through extensive experiments on domains with varied properties.
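The core idea sketched in the abstract — mining stored action histories for frequently recurring sequences via a tree — can be illustrated with a small trie over action suffixes. This is only a loose, hypothetical sketch of the general technique, not the paper's actual data structure or API; all names (`SequenceTree`, `add_history`, `frequent_sequences`) are illustrative.

```python
# Illustrative sketch (not the paper's implementation): a trie over
# action sequences with visit counts. Inserting every suffix of each
# stored history makes common subsequences accumulate high counts,
# so frequently used action sequences can be read off the tree.

class Node:
    def __init__(self):
        self.children = {}  # action -> Node
        self.count = 0      # number of histories passing through this node

class SequenceTree:
    def __init__(self):
        self.root = Node()

    def add_history(self, actions):
        """Insert every suffix of an action history into the trie."""
        for start in range(len(actions)):
            node = self.root
            for a in actions[start:]:
                node = node.children.setdefault(a, Node())
                node.count += 1

    def frequent_sequences(self, min_count=2, min_length=2):
        """Return (sequence, count) pairs seen at least min_count times."""
        results = []

        def walk(node, prefix):
            if len(prefix) >= min_length and node.count >= min_count:
                results.append((tuple(prefix), node.count))
            for a, child in node.children.items():
                if child.count >= min_count:
                    walk(child, prefix + [a])

        walk(self.root, [])
        return results

# Two hypothetical episode histories sharing a common tail.
tree = SequenceTree()
tree.add_history(["up", "up", "right"])
tree.add_history(["left", "up", "up", "right"])
freq = tree.frequent_sequences()
```

Here `freq` contains the recurring subsequences, e.g. `("up", "up", "right")` with count 2, which in the paper's setting would be a candidate for an option that terminates conditionally on the visited states.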
Related content
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
By Lin L-J.
Machine Learning, Vol. 8, Iss. 3-4, 1992-05
Reinforcement learning using Voronoi space division
By Aung Kathy
Artificial Life and Robotics, Vol. 15, Iss. 3, 2010-09
REINFORCEMENT LEARNING FOR POMDP USING STATE CLASSIFICATION
By Dung Le Tien
Applied Artificial Intelligence, Vol. 22, Iss. 7-8, 2008-08