FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity
Authors
Zhou, Yangfan; Huang, Kaizhu; Cheng, Cheng; Wang, Xuguang; Hussain, Amir; Liu, Xin
Abstract
AdaBelief, one of the current best optimizers, demonstrates superior generalization ability over the popular Adam algorithm by viewing the exponential moving average of observed gradients as a prediction of the gradient at the next time step. AdaBelief is theoretically appealing in that it has a data-dependent O(√T) regret bound when objective functions are convex, where T is the time horizon. It remains an open problem, however, whether the convergence rate can be further improved without sacrificing generalization ability. To this end, we make a first attempt in this work and design a novel optimization algorithm called FastAdaBelief that exploits the strong convexity of the objective to achieve an even faster convergence rate. In particular, by adjusting the step size to account for strong convexity and suppress fluctuation, the proposed FastAdaBelief demonstrates excellent generalization ability and superior convergence. As an important theoretical contribution, we prove that FastAdaBelief attains a data-dependent O(log T) regret bound, which is substantially lower than AdaBelief's O(√T) bound in the strongly convex case. On the empirical side, we validate our theoretical analysis with extensive experiments in strongly convex and nonconvex scenarios using three popular baseline models. The experimental results are very encouraging: FastAdaBelief converges fastest among all mainstream algorithms while maintaining excellent generalization ability, in both strongly convex and nonconvex cases. FastAdaBelief is thus posited as a new benchmark for the research community.
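The abstract does not spell out the update rule, but the key idea can be illustrated with a short sketch. In online convex optimization, a λ-strongly convex objective admits O(log T) regret when the step size decays like 1/(λt), since Σ_{t=1}^{T} 1/t = O(log T). The sketch below combines an AdaBelief-style belief term (the exponential moving average of the squared deviation between the gradient and its EMA prediction) with such a decay. All names, the 1/t schedule, and the hyperparameters here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def fast_adabelief_sketch(grad_fn, theta, n_steps=1000, alpha=0.1,
                          beta1=0.9, beta2=0.999, eps=1e-8):
    """Illustrative AdaBelief-style loop with a 1/t step-size decay.

    A minimal sketch only: the 1/t decay is the standard device for
    O(log T) regret under strong convexity, not necessarily the exact
    mechanism of the paper's FastAdaBelief.
    """
    m = np.zeros_like(theta)  # EMA of gradients (prediction of the next gradient)
    s = np.zeros_like(theta)  # EMA of (g - m)^2: the "belief" in the observation
    for t in range(1, n_steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        s = beta2 * s + (1 - beta2) * (g - m) ** 2 + eps
        m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
        s_hat = s / (1 - beta2 ** t)   # bias-corrected belief term
        step = alpha / t               # decayed step size exploiting strong convexity
        theta = theta - step * m_hat / (np.sqrt(s_hat) + eps)
    return theta

# Example: minimize the strongly convex f(x) = ||x - 3||^2
theta = fast_adabelief_sketch(lambda x: 2.0 * (x - 3.0), np.zeros(5))
```

For comparison, AdaBelief's convex analysis uses a step size decaying like 1/√t, which yields the O(√T) regret bound; the faster 1/t-style decay is what a logarithmic bound hinges on in the strongly convex setting.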
| Journal Article Type | Article |
| --- | --- |
| Acceptance Date | Jan 12, 2022 |
| Online Publication Date | Mar 10, 2022 |
| Publication Date | Sep 2023 |
| Deposit Date | Jul 8, 2022 |
| Publicly Available Date | Jul 8, 2022 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Print ISSN | 2162-237X |
| Electronic ISSN | 2162-2388 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Peer Reviewed | Peer Reviewed |
| Volume | 34 |
| Issue | 9 |
| Pages | 6515-6529 |
| DOI | https://doi.org/10.1109/tnnls.2022.3143554 |
| Keywords | Adaptive Learning Rate, Stochastic Gradient Descent, Online Learning, Optimization Algorithm, Strong Convexity |
| Public URL | http://researchrepository.napier.ac.uk/Output/2885371 |
Files
FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity (accepted version; PDF, 4.2 MB)
You might also like
- Applications of Deep Learning and Reinforcement Learning to Biological Data (2018), Journal Article
- Guided Policy Search for Sequential Multitask Learning (2018), Journal Article
- Learning Latent Features With Infinite Nonnegative Binary Matrix Trifactorization (2018), Journal Article
- Cross-modality interactive attention network for multispectral pedestrian detection (2018), Journal Article