| No. | 0150 |
|---|---|
| Section | Special Session |
| Session | (SS-19) Optimization and Machine Learning |
| Time | 19th-B, 13:00–13:30 |
| Title (Eng.) | Non-ergodic linear convergence property of the delayed gradient descent under strong convexity and the Polyak-{\L}ojasiewicz condition |
| Author(s) | Hyung Jun Choi (Korea University of Technology and Education), Woocheol Choi (Sungkyunkwan University), Jinmyoung Seok (Seoul National University) |
| Abstract | In this work, we establish a linear convergence estimate for gradient descent involving a delay $\tau\in\mathbb{N}$ when the cost function is $\mu$-strongly convex and $L$-smooth. This result improves upon the well-known estimates of Arjevani-Shamir-Srebro (2020) and Stich-Karimireddy (2020) in that it is non-ergodic and holds under a weaker constraint on the cost function. Moreover, the admissible range of the learning rate $\eta$ is extended from $\eta\leq 1/(10L\tau)$ to $\eta\leq 1/(4L\tau)$ for $\tau =1$ and $\eta\leq 3/(10L\tau)$ for $\tau \geq 2$, where $L >0$ is the Lipschitz constant of the gradient of the cost function. We further show linear convergence of the cost function under the Polyak-{\L}ojasiewicz\,(PL) condition, for which the admissible learning rate improves to $\eta\leq 9/(10L\tau)$ for large delay $\tau$. The proof framework also extends to stochastic gradient descent with time-varying delay under the PL condition. Finally, numerical experiments are provided to confirm the analyzed results. |
| MSC number(s) | 90C25, 68Q25 |
| Keyword(s) | Convex programming, gradient descent method, time delay |
| Language of Session (Talk) | Korean |
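As an illustrative sketch (not the authors' experiments or code), the delayed update studied in the abstract, $x_{k+1} = x_k - \eta\,\nabla f(x_{k-\tau})$, can be tried on a strongly convex quadratic $f(x) = \tfrac12 x^\top A x$, using the abstract's step-size bound $\eta \leq 3/(10L\tau)$ for $\tau \geq 2$. All names and problem data below are hypothetical choices for demonstration.

```python
import numpy as np

def delayed_gd(A, x0, tau, eta, iters):
    """Gradient descent that uses the gradient evaluated tau steps in the past:
    x_{k+1} = x_k - eta * A @ x_{k-tau}, with x_j = x_0 for j <= 0."""
    history = [x0.copy()] * (tau + 1)  # pad so x_{k-tau} exists for small k
    x = x0.copy()
    for _ in range(iters):
        grad = A @ history[-(tau + 1)]  # gradient of f at the delayed iterate
        x = x - eta * grad
        history.append(x)
    return x

rng = np.random.default_rng(0)
# Random symmetric positive definite A with eigenvalues in [mu, L]
Q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
mu, L = 0.5, 5.0
A = Q @ np.diag(np.linspace(mu, L, 20)) @ Q.T

tau = 2
eta = 3.0 / (10.0 * L * tau)  # step size from the abstract's bound for tau >= 2
x = delayed_gd(A, rng.standard_normal(20), tau, eta, iters=500)
print(np.linalg.norm(x))  # small: linear convergence toward the minimizer x* = 0
```

With $\eta L \tau = 0.3$ the delayed recursion stays stable and the distance to the minimizer shrinks geometrically, consistent with the linear (non-ergodic) rate the abstract describes.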