컨텐츠 시작
학술대회/행사
초록검색
제출번호(No.) | 0175 |
---|---|
분류(Section) | Special Session |
분과(Session) | (SS-18) Mathematics and AI (SS-18) |
발표시간(Time) | 20th-C-10:00 -- 10:30 |
영문제목 (Title(Eng.)) |
Accelerating Value Iteration with anchoring |
저자(Author(s)) |
Jongmin Lee1, Ernest K. Ryu1 Seoul National University1 |
초록본문(Abstract) | Value Iteration (VI) is foundational to the theory and practice of modern reinforcement learning, and it is known to converge at a O(γk)-rate, where γ is the discount factor. Surprisingly, however, the optimal rate for the VI setup was not known, and finding a general acceleration mechanism has been an open problem. In this paper, we present the first accelerated VI for both the Bellman consistency and optimality operators. Our method, called Anc-VI, is based on an \emph{anchoring} mechanism (distinct from Nesterov's acceleration), and it reduces the Bellman error faster than standard VI. In particular, Anc-VI exhibits a O(1/k)-rate for γ≈1 or even γ=1, while standard VI has rate O(1) for γ≥1−1/k, where k is the iteration count. We also provide a complexity lower bound matching the upper bound up to a constant factor of 4, thereby establishing optimality of the accelerated rate of Anc-VI. Finally, we show that the anchoring mechanism provides the same benefit in the approximate VI and Gauss--Seidel VI setups as well. |
분류기호 (MSC number(s)) |
68T01 |
키워드(Keyword(s)) | Value Iteration, anchoring, complexity lower bound |
강연 형태 (Language of Session (Talk)) |
English |