FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity
Authors
Zhou, Yangfan; Huang, Kaizhu; Cheng, Cheng; Wang, Xuguang; Hussain, Amir; Liu, Xin
Abstract
AdaBelief, one of the current best optimizers, demonstrates superior generalization ability over the popular Adam algorithm by viewing the exponential moving average (EMA) of observed gradients as a prediction of the next gradient and adapting the step size according to the "belief" in that prediction. AdaBelief is also theoretically appealing in that it enjoys a data-dependent O(√T) regret bound when objective functions are convex, where T is the time horizon. It remains an open problem, however, whether this convergence rate can be further improved without sacrificing generalization ability. To this end, we make the first attempt in this work and design a novel optimization algorithm, FastAdaBelief, that exploits the strong convexity of the objective to achieve an even faster convergence rate. In particular, by adjusting the step size so that it better accounts for strong convexity and suppresses fluctuation, the proposed FastAdaBelief demonstrates excellent generalization ability and superior convergence. As an important theoretical contribution, we prove that FastAdaBelief attains a data-dependent O(log T) regret bound, which is substantially lower than that of AdaBelief in strongly convex cases. On the empirical side, we validate the theoretical analysis with extensive experiments in both strongly convex and nonconvex settings using three popular baseline models. The results are very encouraging: FastAdaBelief converges the fastest among all mainstream algorithms compared while maintaining excellent generalization ability, in both the strongly convex and the nonconvex cases. FastAdaBelief is, thus, posited as a new benchmark model for the research community.
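The record does not reproduce the update rule, so the snippet below is only a minimal sketch of the idea the abstract describes: an AdaBelief-style "belief" accumulator (an EMA of the squared difference between the observed gradient and its EMA prediction) combined with a 1/t-decaying base step size, the standard device for obtaining O(log T) regret under strong convexity. The function name `fast_adabelief_sketch`, the `alpha / t` schedule, and all hyperparameter defaults are illustrative assumptions, not the paper's exact FastAdaBelief update.

```python
import numpy as np

def fast_adabelief_sketch(grad_fn, theta, steps=1000, alpha=0.1,
                          beta1=0.9, beta2=0.999, eps=1e-8):
    """Belief-based adaptive update with a 1/t-decaying base step size.

    Illustrative only: the schedule and accumulator are assumptions meant
    to convey how strong convexity can be exploited, not the exact
    FastAdaBelief algorithm from the paper.
    """
    m = np.zeros_like(theta)   # EMA of observed gradients (prediction of next gradient)
    s = np.zeros_like(theta)   # EMA of squared prediction error ("belief" term)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        s = beta2 * s + (1 - beta2) * (g - m) ** 2
        m_hat = m / (1 - beta1 ** t)      # bias-corrected first moment
        s_hat = s / (1 - beta2 ** t)      # bias-corrected belief term
        step = alpha / t                  # 1/t decay, typical for strongly convex analyses
        theta = theta - step * m_hat / (np.sqrt(s_hat) + eps)
    return theta

# Usage: a strongly convex quadratic f(x) = ||x - 3||^2 / 2, gradient x - 3.
theta_hat = fast_adabelief_sketch(lambda x: x - 3.0, theta=np.zeros(5))
```

Under strong convexity, the shrinking base step keeps later iterates from fluctuating around the optimum, which is the intuition the abstract points to for the improved regret bound.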
Citation
Zhou, Y., Huang, K., Cheng, C., Wang, X., Hussain, A., & Liu, X. (in press). FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity. IEEE Transactions on Neural Networks and Learning Systems, https://doi.org/10.1109/tnnls.2022.3143554
| Journal Article Type | Article |
| --- | --- |
| Acceptance Date | Jan 12, 2022 |
| Online Publication Date | Mar 10, 2022 |
| Deposit Date | Jul 8, 2022 |
| Publicly Available Date | Jul 8, 2022 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Print ISSN | 2162-237X |
| Electronic ISSN | 2162-2388 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Peer Reviewed | Peer Reviewed |
| DOI | https://doi.org/10.1109/tnnls.2022.3143554 |
| Keywords | Adaptive Learning Rate, Stochastic Gradient Descent, Online Learning, Optimization Algorithm, Strong Convexity |
| Public URL | http://researchrepository.napier.ac.uk/Output/2885371 |
Files
FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity (accepted version, PDF, 4.2 MB)