컨텐츠 시작

학술대회/행사

초록검색

제출번호(No.) 0175
분류(Section) Special Session
분과(Session) (SS-18) Mathematics and AI (SS-18)
발표시간(Time) 20th-C-10:00 -- 10:30
영문제목
(Title(Eng.))
Accelerating Value Iteration with anchoring
저자(Author(s))
Jongmin Lee1, Ernest K. Ryu1
Seoul National University1
초록본문(Abstract) Value Iteration (VI) is foundational to the theory and practice of modern reinforcement learning, and it is known to converge at a O(γk)-rate, where γ is the discount factor. Surprisingly, however, the optimal rate for the VI setup was not known, and finding a general acceleration mechanism has been an open problem. In this paper, we present the first accelerated VI for both the Bellman consistency and optimality operators. Our method, called Anc-VI, is based on an \emph{anchoring} mechanism (distinct from Nesterov's acceleration), and it reduces the Bellman error faster than standard VI. In particular, Anc-VI exhibits a O(1/k)-rate for γ≈1 or even γ=1, while standard VI has rate O(1) for γ≥1−1/k, where k is the iteration count. We also provide a complexity lower bound matching the upper bound up to a constant factor of 4, thereby establishing optimality of the accelerated rate of Anc-VI. Finally, we show that the anchoring mechanism provides the same benefit in the approximate VI and Gauss--Seidel VI setups as well.
분류기호
(MSC number(s))
68T01
키워드(Keyword(s)) Value Iteration, anchoring, complexity lower bound
강연 형태
(Language of Session (Talk))
English