Randomized block-coordinate adaptive algorithms for nonconvex optimization problems
Zhou, Yangfan; Huang, Kaizhu; Li, Jiang; Cheng, Cheng; Wang, Xuguang; Hussain, Amir; Liu, Xin
Prof Amir Hussain A.Hussain@napier.ac.uk
Nonconvex optimization problems have long been a central focus in deep learning, where many fast momentum-based adaptive algorithms are applied. However, computing the full gradient of a high-dimensional parameter vector in such tasks becomes prohibitive. To reduce the computational cost of optimizers for the nonconvex problems typically seen in deep learning, this work proposes a randomized block-coordinate adaptive optimization algorithm, named RAda, which randomly picks a block from the full coordinates of the parameter vector and then sparsely computes the gradient on that block. We prove that, in nonconvex cases, RAda converges to an ε-accurate solution with a stochastic first-order complexity bounded in terms of the upper bound on the gradient's square. Experiments on public datasets, including CIFAR-10, CIFAR-100, and Penn TreeBank, verify that RAda outperforms the compared algorithms in terms of computational cost.
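The abstract's core idea, updating only a randomly chosen block of coordinates with an adaptive (momentum-based) rule, can be illustrated with a minimal sketch. This is not the paper's actual RAda algorithm (whose update rule and complexity constants are not given here); it is a hypothetical randomized block-coordinate variant of an Adam-style update, where `grad_fn`, `block_size`, and all hyperparameter values are illustrative assumptions:

```python
import numpy as np

def block_adaptive_step(w, grad_fn, m, v, t, block_size=2, lr=0.01,
                        beta1=0.9, beta2=0.999, eps=1e-8, rng=None):
    """One hypothetical randomized block-coordinate Adam-style step.

    A random block of coordinates is drawn, the gradient is computed
    sparsely on that block only, and only those coordinates (and their
    moment estimates) are updated.
    """
    rng = rng or np.random.default_rng()
    # Randomly pick a block from the full coordinates of w.
    idx = rng.choice(len(w), size=block_size, replace=False)
    # Sparse gradient: evaluated only on the chosen block.
    g = grad_fn(w, idx)
    # Adam-style first and second moment updates, restricted to the block.
    m[idx] = beta1 * m[idx] + (1 - beta1) * g
    v[idx] = beta2 * v[idx] + (1 - beta2) * g ** 2
    m_hat = m[idx] / (1 - beta1 ** t)   # bias correction
    v_hat = v[idx] / (1 - beta2 ** t)
    w[idx] -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

For example, minimizing f(w) = ||w||²/2 (whose block gradient is simply `w[idx]`) with repeated calls drives ||w|| toward zero while each step touches only `block_size` coordinates, which is where the per-iteration saving over a full-gradient adaptive method comes from.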
Zhou, Y., Huang, K., Li, J., Cheng, C., Wang, X., Hussain, A., & Liu, X. (2023). Randomized block-coordinate adaptive algorithms for nonconvex optimization problems. Engineering Applications of Artificial Intelligence, 121, Article 105968. https://doi.org/10.1016/j.engappai.2023.105968
|Journal Article Type||Article|
|Acceptance Date||Feb 5, 2023|
|Online Publication Date||Feb 11, 2023|
|Deposit Date||Feb 13, 2023|
|Publicly Available Date||Feb 12, 2024|
|Journal||Engineering Applications of Artificial Intelligence|
|Peer Reviewed||Peer Reviewed|
This file is under embargo until Feb 12, 2024 due to copyright reasons.