| No. | 0150 |
|---|---|
| Section | Special Session |
| Session | (SS-19) Optimization and Machine Learning |
| Time | 19th-B, 13:00–13:30 |
| Title (Eng.) | Non-ergodic linear convergence property of the delayed gradient descent under strong convexity and the Polyak-{\L}ojasiewicz condition |
| Author(s) | Hyung Jun Choi (Korea University of Technology and Education), Woocheol Choi (Sungkyunkwan University), Jinmyoung Seok (Seoul National University) |
| Abstract | In this work, we establish a linear convergence estimate for gradient descent involving a delay $\tau\in\mathbb{N}$ when the cost function is $\mu$-strongly convex and $L$-smooth. This result improves upon the well-known estimates of Arjevani-Shamir-Srebro (2020) and Stich-Karimireddy (2020) in that it is non-ergodic and holds under a weaker constraint on the cost function. Moreover, the admissible range of the learning rate $\eta$ is extended from $\eta\leq 1/(10L\tau)$ to $\eta\leq 1/(4L\tau)$ for $\tau =1$ and $\eta\leq 3/(10L\tau)$ for $\tau \geq 2$, where $L >0$ is the Lipschitz constant of the gradient of the cost function. We further show linear convergence of the cost function under the Polyak-{\L}ojasiewicz\,(PL) condition, for which the admissible learning rate improves to $\eta\leq 9/(10L\tau)$ for large delay $\tau$. The proof framework also extends to stochastic gradient descent with time-varying delay under the PL condition. Finally, numerical experiments are provided to confirm the analyzed results. |
| MSC number(s) | 90C25, 68Q25 |
| Keyword(s) | Convex programming, gradient descent method, time delay |
| Language of Session (Talk) | Korean |
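As an illustrative sketch (not the authors' experiments or code), the delayed update studied in the abstract, $x_{k+1} = x_k - \eta\,\nabla f(x_{k-\tau})$, can be tried on a strongly convex quadratic $f(x) = \tfrac12 x^\top A x$, using the abstract's step-size bound $\eta \leq 3/(10L\tau)$ for $\tau \geq 2$. All names and problem data below are hypothetical choices for demonstration.

```python
import numpy as np

def delayed_gd(A, x0, tau, eta, iters):
    """Gradient descent that uses the gradient evaluated tau steps in the past:
    x_{k+1} = x_k - eta * A @ x_{k-tau}, with x_j = x_0 for j <= 0."""
    history = [x0.copy()] * (tau + 1)  # pad so x_{k-tau} exists for small k
    x = x0.copy()
    for _ in range(iters):
        grad = A @ history[-(tau + 1)]  # gradient of f at the delayed iterate
        x = x - eta * grad
        history.append(x)
    return x

rng = np.random.default_rng(0)
# Random symmetric positive definite A with eigenvalues in [mu, L]
Q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
mu, L = 0.5, 5.0
A = Q @ np.diag(np.linspace(mu, L, 20)) @ Q.T

tau = 2
eta = 3.0 / (10.0 * L * tau)  # step size from the abstract's bound for tau >= 2
x = delayed_gd(A, rng.standard_normal(20), tau, eta, iters=500)
print(np.linalg.norm(x))  # small: linear convergence toward the minimizer x* = 0
```

With $\eta L \tau = 0.3$ the delayed recursion stays stable and the distance to the minimizer shrinks geometrically, consistent with the linear (non-ergodic) rate the abstract describes.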