
Near-Optimal Reinforcement Learning in Polynomial Time

Author: Kearns M.

Publisher: Springer Publishing Company

ISSN: 0885-6125

Source: Machine Learning, Vol. 49, Iss. 2-3, 2002-11, pp. 209-232



Abstract

We present new algorithms for reinforcement learning and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decision processes. After observing that the number of actions required to approach the optimal return is lower bounded by the mixing time T of the optimal policy (in the undiscounted case) or by the horizon time T (in the discounted case), we then give algorithms requiring a number of actions and total computation time that are only polynomial in T and the number of states and actions, for both the undiscounted and discounted cases. An interesting aspect of our algorithms is their explicit handling of the Exploration-Exploitation trade-off.
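The horizon time T mentioned for the discounted case can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes rewards bounded in [0, r_max] and uses the standard geometric-tail bound, under which the discounted reward collected after step T is at most gamma^T * r_max / (1 - gamma). The function names `horizon_time` and `discounted_tail` and the parameter `r_max` are hypothetical, chosen only for this illustration.

```python
import math


def discounted_tail(gamma: float, t: int, r_max: float = 1.0) -> float:
    """Upper bound on the discounted reward collected after step t.

    Assumes every per-step reward lies in [0, r_max] (an assumption of
    this sketch), so the tail is at most gamma**t * r_max / (1 - gamma).
    """
    return gamma ** t * r_max / (1.0 - gamma)


def horizon_time(gamma: float, epsilon: float, r_max: float = 1.0) -> int:
    """Smallest integer T whose discounted tail bound is at most epsilon."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if epsilon <= 0.0 or r_max <= 0.0:
        raise ValueError("epsilon and r_max must be positive")

    total = r_max / (1.0 - gamma)  # bound on the full discounted return
    if total <= epsilon:
        return 0

    # Solve gamma**T * total <= epsilon for T, then round up.
    t = math.ceil(math.log(total / epsilon) / math.log(1.0 / gamma))

    # Guard against floating-point rounding in the logarithms.
    while discounted_tail(gamma, t, r_max) > epsilon:
        t += 1
    return t


if __name__ == "__main__":
    for gamma in (0.5, 0.9, 0.99):
        print(gamma, horizon_time(gamma, epsilon=0.01))
```

Under these assumptions the horizon grows roughly like 1 / (1 - gamma) times a logarithmic factor, which is why a learner cannot be expected to approach the optimal discounted return in far fewer than T actions.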

Related content

Near-optimal solution to an employee assignment problem with seniority

By Hojati Mehran

Annals of Operations Research, Vol. 181, Iss. 1, 2010-12, pp. 539-557 (19)

Springer Publishing Company


Near-optimal channel reservation for cellular phone system

By Ku Cheng-Yuan, Huang Shi-Ming, Yen David C., Chen Yi-Wen

International Journal of Electronic Business, Vol. 2, Iss. 3, 2004-09, pp. 244-254 (11)

Inderscience Publishers


Optimal control of ship unloaders using reinforcement learning

By Scardua L.A., Da Cruz J.J., Reali Costa A.H.

Advanced Engineering Informatics, Vol. 16, Iss. 3, 2002-07, pp. 217-227 (11)


Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains

By Bernstein Andrey

Machine Learning, Vol. 81, Iss. 3, 2010-12, pp. 359-397 (39)

Springer Publishing Company


Optimal and Near-Optimal Algorithms for k-Item Broadcast

Journal of Parallel and Distributed Computing, Vol. 57, Iss. 2, 1999-05, pp. 121-139 (19)
